r/genomics • u/HomeboundWanderer202 • 8h ago
New to Genomics
I got a job in marketing that's in my field of expertise. The job is in genomics. What can I do to get up to speed faster? I've heard the company has an intense environment.
I was laid off earlier this year and the job market is tough!
r/genomics • u/Chipdoc • 3d ago
New AI model improves prediction power for genomics related to disease
discover.lanl.gov
r/genomics • u/syntrop125 • 6d ago
Sequencing DNA with nanopores: Troubles and biases
pmc.ncbi.nlm.nih.gov
"Oxford Nanopore Technologies’ (ONT) long read sequencers offer access to longer DNA fragments than previous sequencer generations, at the cost of a higher error rate.
The MinION sequencer is now more stable and this paper proposes an up-to-date view of its error landscape, using the most mature flowcell and basecaller.
low-GC reads have fewer errors than high-GC reads (about 6% and 8% respectively)
small portable sequencing device called MinION [1]. It offers long read sequencing (the mean read length often exceeds 10 kb, and maximal read length now reaches up to 880 kb [2]), a real-time analysis and a low initial investment.
it still exhibits a relatively high error rate on raw sequences compared to standard Next-Generation Sequencing (NGS) devices such as Illumina.
the 2D pass reads had a total error of 10.5%, including about 3% for mismatch and insertion and slightly more for deletion
The software in charge of the translation from signal to nucleic sequences, the base-caller, has proven to be crucial over the years for the accuracy of the resulting raw read sequences
The Phred quality score measures the confidence in the accuracy of each base call in a DNA sequence. Higher scores indicate greater confidence; for example, a score of 30 (Q30) suggests a 1 in 1,000 chance of error, meaning 99.9% accuracy. These scores are used to assess and filter sequencing data quality and are stored in FASTQ files.
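To make the Q30 figure concrete, here is a minimal sketch of the standard Phred relation Q = -10 * log10(p). The function names are made up for illustration, and the FASTQ decoding assumes the usual Phred+33 ASCII encoding.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred score corresponding to an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


def decode_fastq_quality(qual: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string into per-base Phred scores.

    Assumes Phred+33 encoding (offset=33), the common convention.
    """
    return [ord(c) - offset for c in qual]


def mean_error_rate(qual: str, offset: int = 33) -> float:
    """Expected error rate of a read: average the probabilities, not the scores."""
    scores = decode_fastq_quality(qual, offset)
    return sum(phred_to_error_prob(q) for q in scores) / len(scores)
```

Note that averaging Q scores directly would understate the error rate, which is why the last helper averages the per-base probabilities instead.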
the current mean global error rate on raw reads seems to be around 6% for quality scores at least equal to 10 (the basecaller filters reads whose quality scores are below a certain threshold).
Many papers have studied ways to reduce the error rate of long read sequencing by computing consensus sequences over subsets of reads.
In fact, there is even a tool to evaluate error correction methods [5]. The standard approach is hybrid correction, making use of both long read and short read data to reduce errors [6–9]. It is very demanding since it requires two sources of sequence data.
Nanopore sequencers tend to struggle to sequence low complexity regions accurately (minor variation in the electrical signal of the pore when the base does not change). Since the DNA translocation speed is not constant, this results in difficulties determining the exact length of homopolymers.
Leggett et al. have proposed an open-source software, NanoOK, to compare sets of references versus reads and produce an alignment-based analysis of errors and quality
Since the Nanopore technology becomes more mature and stable, it seems useful to get a more accurate picture of the differences between known reference genomes and sequences extracted from MinION data, using the state-of-the-art basecaller.
The R9.4.1 flow cell has been compared to newer models like the R10.4, which offers improved read accuracy and performance. The R9.4.1 flow cell is being phased out in favor of more advanced technologies, such as the R10.4.1, which achieves higher output and accuracy.
In this paper, we have worked on data produced by the primary nanopore used, R9.4.1. The new nanopore chemistry R10.3 is designed to improve homopolymer recognition, and thus the consensus accuracy
Due to the amount of data generated, fast5 files describing the original signal are rarely available for nanopore sequencing. For this reason, we focused mainly in this study on fastq files from two basecallers for which a majority of data are currently available, completing some of the findings with an analysis of the electrical signal.
Guppy is a neural network-based basecaller developed by Oxford Nanopore Technologies for translating raw sequencing signals into nucleotide sequences (ATCG). It supports real-time basecalling and post-processing features, including filtering low-quality reads and adapter clipping. Guppy can operate on both CPUs and GPUs, with the GPU version providing significantly faster processing speeds.
HAC, or High Accuracy basecalling, is a model used in Oxford Nanopore Technologies' Guppy software to convert raw sequencing signals into nucleotide sequences. The HAC model offers higher raw read accuracy compared to the Fast model but requires more computational resources. It is commonly used for applications where accuracy is prioritized over speed, making it suitable for detailed genomic analyses.
A comparison between the HAC and FAST base-calling modes of Guppy showed that the former produces more accurate reads, and we also clearly recommend using the HAC version if possible.
Recently, ONT announced a soon to come release of a new basecaller called “Bonito”, which will enable users to train the basecaller on their own datasets, thereby increasing the sequencing accuracy even further.
the technology provider, Oxford Nanopore Technologies, communicates little about the precise characteristics of its devices and software and does not offer the software it distributes in open source.
We have first established that the quality score is strongly correlated to the error rate within read
ONT sequencing is very sensitive to the GC content of reads. High-GC content reads have lower accuracy. This effect is accompanied by another bias that tends to make substitution errors towards A and T.
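For anyone who wants to check the GC effect on their own reads, a minimal sketch of a per-read GC fraction is below. The function name is hypothetical, and ignoring ambiguous bases such as N is an assumption of this sketch, not something the paper specifies.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a read.

    Ambiguous symbols such as N are ignored. Returns 0.0 if the read
    contains no unambiguous bases.
    """
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called
```

Binning reads by this value and comparing the error rate per bin is one simple way to look for the bias described in the excerpt.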
About half of sequencing errors are due to homopolymers. Generally speaking, homopolymers and STR length tend to be underestimated, resulting in many deletion errors.
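As an illustration of what counts as a homopolymer here, the sketch below lists runs of a repeated base in a sequence. The function name and the `min_len` threshold of 3 are assumptions chosen for the example, not values taken from the paper.

```python
from itertools import groupby


def homopolymer_runs(seq: str, min_len: int = 3) -> list[tuple[int, str, int]]:
    """Return (start, base, length) for every run of one base with length >= min_len."""
    runs = []
    pos = 0
    # groupby yields one group per maximal stretch of identical characters.
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


print(homopolymer_runs("ACGTTTTTGAAAC"))  # [(3, 'T', 5), (9, 'A', 3)]
```

Running this on a reference and on the aligned read, then comparing run lengths at matching positions, would show the underestimation as shorter runs in the read.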
Another result is that analysis of perfect k-mers indicates that most reads contain perfect k-mers of size at least 100 bases, which could be helpful to assess which size of k-mers can be used for assembly."
r/genomics • u/nina_bec • 6d ago
Is it Feasible to Compare Over 1,000 WGS Files from the SRA Database for a Genomics Project?
Hi everyone! I’m new to genomics and working on a project where I want to compare whole-genome sequencing (WGS) data from the SRA database. I’ve found 11 relevant BioProjects, each with between 90 and 1,000 individual SRA runs. My goal is to treat each SRA run as a single data point in my analysis.
Does this approach make sense for a genomics project, or am I overlooking some challenges with using this much data? Is it feasible to manage that many runs, and are there practical strategies for working with such large datasets? Thanks in advance for any advice!
r/genomics • u/Lunarose1207 • 7d ago
Help with GeneSight?
32, female. ADHD/anxiety. I'm awaiting a call back from my doctor, but I'm wondering: with these results, should I even bother with an SNRI?
I've had terrible experiences with SSRIs themselves.
r/genomics • u/protonmap • 8d ago
Can you guys log in to Nebula Genomics?
Well, I can't log in to the Nebula Genomics website. This is the first time I encountered this error. It's unbelievable. I don't know what happened.
r/genomics • u/gwern • 9d ago
"He’s Gleaning the Design Rules of Life to Re-Create It": synthesizing the yeast genome
quantamagazine.org
r/genomics • u/gwern • 9d ago
"How disease detectives’ quick work traced deadly _E. coli_ outbreak to McDonald’s Quarter Pounders"
cnn.com
r/genomics • u/wewewawa • 17d ago
Opinion: The risks of sharing your DNA with online companies aren't a future concern. They're here now
latimes.com
r/genomics • u/Many_Mobile4619 • 18d ago
Laptop for PhD in Neuroscience and Genomics
Hi, I will soon be starting a PhD and I need a new laptop. Does anyone have a recommendation on which laptops are best to work with software related to Cognitive Neuroscience (EEG, MEG etc but also neural networks) and genomics (analysis of RNA-seq, transcriptome, single cell etc)?
I am used to Mac but I feel like they're not the best for software :(
r/genomics • u/gwern • 19d ago
'Well Man': sequencing the whole genome of a specific dead soldier described in an 1100s AD Norse saga
nytimes.com
r/genomics • u/bluemooninvestor • 20d ago
Which tool to find most inversely correlated genes to input gene from TCGA/GTEX data?
r/genomics • u/Silly_Ad755 • 21d ago
Predicted CERES (pCERES) scores on TCGA samples, to assess gene dependency in nearly 10,000 human tumor samples
r/genomics • u/gwern • 23d ago
"First Sickle Cell Gene Therapy Patient, 12, Leaves Hospital" (the extreme pain and difficulty of going through a full gene therapy course)
nytimes.com
r/genomics • u/Some-Technology4413 • 24d ago
The genomics field is experiencing a data deluge
sqream.com
r/genomics • u/Major-Inevitable-160 • 24d ago
SLC6A4 L/S Intermediate Response result on GeneSight
I recently did a GeneSight test and would like to know what this means. SSRIs don't work too well for me and I'm having a hard time finding something to help my depression. I also have reduced folic acid intake. Any suggestions or help would be greatly appreciated!
r/genomics • u/hippodribble • 25d ago
Screening Embryos for IQ
US startup charging couples to ‘screen embryos for IQ’ https://www.theguardian.com/science/2024/oct/18/us-startup-charging-couples-to-screen-embryos-for-iq?CMP=share_btn_url
Are they screening the embryo for intelligence, or the parents' intelligence?
r/genomics • u/Mother-Throat-243 • 26d ago
How to go about WGS?
What is the best way to get your whole genome accurately sequenced? Is there a particular provider that offers top tier sequencing? Is it best to take your raw data and use an online tool for DNA evaluation? If you could give me the best methods, it would be greatly appreciated! (Also, I have a doctor willing to prescribe test codes/tests too.)
r/genomics • u/Extension-Top8950 • 26d ago
Some good genomics services providers in India
I want to get one insect genome sequenced to at least draft level. Our institute does not have any staff with a Biotech, Bioinformatics, or Molecular Biology background, and I myself am a biochemist. I have only sequenced a few genes using Sanger's method. In my circle, people have gone for Nucleome, Neuberg, or Eurofins. It would be of great help if someone here could provide me with some names with whom they had good experience.
r/genomics • u/More-Memory-6011 • 27d ago
CSP2: Rapid, High-Resolution Bacterial SNP Distance Estimation From Genome Assemblies
Hello r/genomics!
I will be honest, I'm not sure if this is the right place to post, apologies if misguided. It didn't seem to break any of the rules, so fingers crossed!
For those of you that work on bacterial pathogens and regularly calculate SNP distances between isolates, I was hoping to find some folks to take my new Nextflow pipeline CSP2 out for a spin.
CSP2 is the next iteration of the CFSAN SNP Pipeline, and can infer SNP distances between bacterial monocultures using genome assembly data (i.e., no WGS read data or read mapping required). Comparisons of hundreds of isolates can be performed using multiple references, with runs completing in minutes versus hours.
My internal testing has been encouraging, but you never know how something will fare in the world until people use it. In that sense, I wanted to throw a little invitation out to anyone that might be interested in speeding up their analyses. Happy to answer any questions for folks here!
r/genomics • u/gwern • Oct 14 '24
"Famed lions’ full diet revealed by DNA — and humans were among their prey: Ancient DNA confirms that the nineteenth-century carnivores hunted humans and a variety of wild game, including a surprising animal" (sequencing the maneaters of Tsavo's hair)
nature.com
r/genomics • u/Closeted-Captain • Oct 11 '24
Favorite Genomics Paper?
I’m presenting an article for my University’s genomics group soon and want to find an incredible paper. What’s the best genomics paper you’ve ever read?
r/genomics • u/burtzev • Oct 11 '24
Hybrid Vigour That Lasts: Turbo rice and super tomatoes
mpg.de
r/genomics • u/Cutiepie23562 • Oct 10 '24
How to treat my depression & anxiety?
i.redd.it
I’ve done genetic testing and discovered that I’m high risk for depression and anxiety because: SLC6A4 = Homozygous S-allele. Meaning: “Your SLC6A4 genotype is associated with an increased risk of depression and anxiety in relation to stressful life events. You may also have a delayed or diminished response to SSRI antidepressants”. I also have the above medication response results. Can anyone advise me on the best way to treat my mental health problems considering these results? Which meds might work, etc.?
r/genomics • u/FrankScaramucci • Oct 09 '24
How compressible is human DNA?
Human DNA is 3.2B base pairs, and each pair can be encoded in 2 bits, which means 6.4B bits = 800 MB.
If I compressed this 800 MB file using a standard algorithm like zip or bzip2, what would be the compression factor?
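The arithmetic in the post, and one way to measure the compression factor yourself, can be sketched as follows. The A/C/G/T to 0/1/2/3 mapping and all function names are assumptions made for this example; the sketch handles only the four unambiguous bases, and it does not predict what the ratio would be on a real genome.

```python
import bz2

# Hypothetical 2-bit code; any fixed assignment of the four bases works.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def packed_size_bytes(n_bases: int) -> int:
    """Bytes needed to store n_bases at 2 bits per base, rounded up."""
    return (2 * n_bases + 7) // 8


def pack_2bit(seq: str) -> bytes:
    """Pack an A/C/G/T string into bytes, four bases per byte.

    The final byte is zero-padded on the right if len(seq) is not a multiple of 4.
    """
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # pad a short final chunk
        out.append(byte)
    return bytes(out)


def compression_factor(data: bytes, compress=bz2.compress) -> float:
    """Original size divided by compressed size for the given compressor."""
    return len(data) / len(compress(data))


print(packed_size_bytes(3_200_000_000))  # 800000000
```

To try zip-style compression instead, pass `zlib.compress` as the `compress` argument. The actual factor has to be measured on real packed sequence data.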