Questions tagged [rna-seq]

RNA-Seq (short for "RNA sequencing") is a technique that uses next-generation sequencing (NGS) to reveal the presence and quantity of RNA in a biological sample at a given moment, capturing the continuously changing cellular transcriptome.

193 questions
5
votes
1 answer

Snakemake --forceall --dag results in mysterious Error: : syntax error in line 1 near 'File' from Graphviz

My attempts to construct a DAG or rulegraph from an RNA-seq pipeline using Snakemake result in an error message from Graphviz: "Error: : syntax error in line 1 near 'File'". The error can be corrected by commenting out two print commands with no visible…
Manninm
  • 151
  • 1
  • 7
4
votes
1 answer

Optimal number of threads for GNU parallel

I think I have a fairly basic question. I just discovered the GNU parallel package and I think my workflow can really benefit from it! I am using a loop which loops through my read files and generates the desired output. The command that is…
3
votes
2 answers

rowSums if at least 1 group (set of columns) has greater than N counts in all replicates

I am working with RNA-seq data. I want to filter out genes where there are fewer than N counts in both replicates in at least one of my treatment groups. My data is in a DESeq object, and the count data is structured like this, where each row is the…
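
The filter described in this question can be sketched in plain Python (not R or DESeq2, which the asker uses). This is only an illustration under one reading of the title: a gene is kept when at least one group has more than N counts in every replicate. The gene names, group names, column indices, and threshold below are hypothetical.

```python
def keep_gene(counts, groups, n):
    """Return True if, in at least one group, every replicate has more than n counts."""
    return any(all(counts[i] > n for i in cols) for cols in groups.values())

# Hypothetical layout: group name -> column indices of its replicates.
groups = {"control": [0, 1], "treated": [2, 3]}

# Hypothetical count matrix: gene -> counts per column.
matrix = {
    "geneA": [0, 1, 12, 15],   # treated passes in both replicates
    "geneB": [3, 20, 2, 30],   # no group passes in all replicates
    "geneC": [11, 14, 0, 0],   # control passes in both replicates
}

n = 10  # assumed threshold
kept = [gene for gene, counts in matrix.items() if keep_gene(counts, groups, n)]
print(kept)  # ['geneA', 'geneC']
```

Swapping `any` for `all` would instead require every group to pass, which is a stricter filter.
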
3
votes
4 answers

Receiving "TypeError: __init__() got an unexpected keyword argument 'basey'" In this tutorial

I've been trying to run through this tutorial (https://bedapub.github.io/besca/tutorials/scRNAseq_tutorial.html) for the past day and constantly get an error after running this portion: bc.pl.kp_genes(adata, min_genes=min_genes, ax = ax1) The error…
SorenFlying
  • 33
  • 1
  • 3
3
votes
1 answer

Snakemake, RNA-seq : How can I execute one subpart of a pipeline or another subpart based on the characteristics of the sample that is analysed?

I am using Snakemake to design an RNA-seq data analysis pipeline. While I've managed to do that, I want to make my pipeline as adaptable as possible and able to deal with single-end (SE) data or paired-end (PE) data within the same run…
athiebaut
  • 153
  • 9
3
votes
0 answers

Correcting RNA-seq dataset for known batch effect

I'm analyzing an RNA-seq dataset where a human cell line has been exposed to multiple chemical compounds at multiple doses. When running QC I noticed a batch effect due to the different plates on which the cells were treated (not a…
Bithorax
  • 43
  • 4
2
votes
1 answer

Remove part of FASTA file heading annotated genome using python

I wanted to remove part of the headings/annotations for a FASTA genome file so I could maintain only the locus tags and the protein description. Eg. Convert: lcl|CP000438.1_cds_ABJ14958.1_2 [gene=dnaN] [locus_tag=PA14_00020] [protein=DNA polymerase…
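
A minimal sketch of this kind of header rewrite, using only the standard library. The output layout (locus tag followed by protein description) is an assumption, since the excerpt is cut off before the desired result. The example record is hypothetical and the function names are illustrative.

```python
import re

# Matches "[key=value]" pairs in a FASTA header line.
TAG_PATTERN = re.compile(r"\[(\w+)=([^\]]*)\]")

def simplify_header(line, keep=("locus_tag", "protein")):
    """Keep only the chosen bracketed fields; leave the header unchanged if none are found."""
    fields = dict(TAG_PATTERN.findall(line))
    kept = [fields[key] for key in keep if key in fields]
    return ">" + " ".join(kept) if kept else line

def simplify_fasta(lines):
    """Yield lines with headers simplified and sequence lines untouched."""
    for line in lines:
        line = line.rstrip("\n")
        yield simplify_header(line) if line.startswith(">") else line

# Hypothetical record, shaped like the header in the question.
example = [
    ">lcl|SEQ1_cds_1 [gene=geneA] [locus_tag=TAG_0001] [protein=example protein]",
    "ATGAAATTT",
]
print("\n".join(simplify_fasta(example)))
```

For a real file, the same generator can be fed an open file handle and its output written line by line to a new file.
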
2
votes
1 answer

Trying to decompress file with gzip but error: "name 'd' is not defined"

I'm pretty new to bioinformatics and I'm trying to decompress a fastq.gz file to convert it into .bam (I'm trying to later analyze this transcriptomic data with DESeq2). I'm in the very beginning of decompressing the file using Jupyter notebooks and…
2
votes
1 answer

Snakemake expand+zip function unexpected behavior

I am trying to use Snakemake to process calls to the rnaQUAST tool with multiple inputs delineated by two sets of different, but paired keywords. I do not want all combinations of these keywords, only specific combinations. It is my understanding…
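
The difference between "all combinations" and "specific paired combinations" that this question turns on can be illustrated in plain Python. This is not Snakemake code, and the keyword names, values, and path template are hypothetical.

```python
from itertools import product

# Hypothetical paired keyword lists: element i of one belongs with element i of the other.
assemblers = ["asmA", "asmB"]
samples = ["s1", "s2"]
template = "results/{assembler}/{sample}.txt"

# Cartesian product: every assembler with every sample (2 x 2 = 4 paths).
all_combinations = [
    template.format(assembler=a, sample=s) for a, s in product(assemblers, samples)
]

# Positional pairing: only the intended pairs (2 paths).
paired = [
    template.format(assembler=a, sample=s) for a, s in zip(assemblers, samples)
]

print(all_combinations)
print(paired)
```

Positional pairing only gives the intended result when both lists have the same length and the same order.
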
2
votes
4 answers

How to create a regex expression to get a substring between 2 pipes

I have a dataset that I'm trying to work with where I need to get the text between two pipe delimiters. The length of the text is variable so I can't use length to get it. This is the string: ENST00000000233.10|ENSG00000004059.11|OTTHUMG000 I want…
Ben Tanner
  • 31
  • 3
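
A minimal sketch of the extraction asked about here, assuming the wanted text is whatever sits between the first and second pipe. The input string is copied from the excerpt, truncation included, and the function name is illustrative.

```python
import re

# String taken from the question excerpt (it is truncated there as well).
header = "ENST00000000233.10|ENSG00000004059.11|OTTHUMG000"

def between_first_two_pipes(text):
    """Return the text between the first and second '|' in text, or None if absent."""
    match = re.search(r"\|([^|]*)\|", text)
    return match.group(1) if match else None

print(between_first_two_pipes(header))  # ENSG00000004059.11
```

When the field position is fixed, `header.split("|")[1]` gives the same result without a regular expression.
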
2
votes
2 answers

Is Nextflow really inconsistent or am I doing something wrong using nf-core/rnaseq?

I want to preface this with I am very new to Nextflow, and if I don't include a key to debugging I am sorry please just let me know. ==================================== Case 1: I tried to run this command: nextflow run nf-core/rnaseq --aligner…
2
votes
1 answer

Pull out genes/observations from cutree_rows groups in pheatmap

How can you pull out the genes/observations from the row groups generated from cutree_rows = 3 in pheatmap? Would it be obj$tree_row$...? obj <- pheatmap(mat, annotation_col = anno, fontsize_row = 10, show_colnames = F, show_rownames = F, cutree_cols =…
Ecg
  • 908
  • 1
  • 10
  • 28
2
votes
2 answers

Add black outline for different geom_point shapes on DESeq2 PCA

I am running a PCA with the DESeq2 package and would like to obtain a black outline on the shapes, which are already based on an observation. The round ones work, but the other shapes do not. Examples such as Make stat_ellipse {ggplot2} outline…
Ecg
  • 908
  • 1
  • 10
  • 28
2
votes
2 answers

Subsetting anndata on the basis of louvain clusters

I want to subset anndata on the basis of clusters, but I am not able to understand how to do it. I am running the scVelo pipeline, and in it I ran the tl.louvain function to cluster cells on the basis of louvain. I got around 32 clusters, of which cluster 2 and 4…
sidrah maryam
  • 45
  • 2
  • 8
2
votes
2 answers

How can I read a 63 GB .csv file into RStudio from the Allen Brain Map using R?

Using RStudio, I am trying to read in the Gene_expression_matrix.csv file from the Brain Allen Institute, and the file is too large, even for computers with large amounts of RAM (I have access to and have tried it on a laptop with 64 GB RAM and a…
Val
  • 59
  • 6