Questions tagged [blast]

BLAST (Basic Local Alignment Search Tool) is a program for comparing biological sequence information.

Given a query sequence, it finds similar (although not necessarily identical) biological sequences in a large set of candidates. BLAST searches tolerate sequence mismatches, deletions, and insertions. The current NCBI implementation, BLAST+, is written in C++ and released into the public domain; the algorithm was first published in 1990.
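As a rough illustration of how such a search can tolerate mismatches, here is a toy seed-and-extend sketch. It assumes exact k-mer seeds and ungapped extension with a simple match/mismatch score; the function name `seed_and_extend` and all of its parameters are hypothetical. Real BLAST additionally uses neighborhood words, gapped extension, and alignment statistics, none of which appear here.

```python
from collections import defaultdict


def seed_and_extend(query, subject, k=4, match=1, mismatch=-1, x_drop=3):
    """Toy ungapped local search: exact k-mer seeds, then X-drop extension.

    Returns (score, query_start, subject_start, length) tuples, best first.
    """
    # Index the start position of every k-mer in the subject.
    index = defaultdict(list)
    for j in range(len(subject) - k + 1):
        index[subject[j:j + k]].append(j)

    def extend(i, j, step):
        # Walk away from the seed in one direction, remembering the best
        # score seen; give up once the score drops x_drop below that best.
        score = best = best_len = n = 0
        while 0 <= i < len(query) and 0 <= j < len(subject):
            score += match if query[i] == subject[j] else mismatch
            n += 1
            if score > best:
                best, best_len = score, n
            elif best - score >= x_drop:
                break
            i += step
            j += step
        return best, best_len

    hits = set()
    for i in range(len(query) - k + 1):
        for j in index.get(query[i:i + k], []):
            right, rlen = extend(i + k, j + k, 1)
            left, llen = extend(i - 1, j - 1, -1)
            score = k * match + left + right
            hits.add((score, i - llen, j - llen, k + llen + rlen))
    return sorted(hits, reverse=True)
```

Calling `seed_and_extend("ACGTAGCATG", "ACGTTGCATG")` reports a single ungapped hit that spans all ten positions despite the one mismatch, because the extension is allowed to absorb a small score drop before giving up.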


241 questions
7 votes, 2 answers

Python implementation of BLAST alignment algorithm?

Is anyone aware of a pure Python implementation of BLAST alignment? I am trying to study this algorithm...
234523458
5 votes, 2 answers

Is it possible to pass a string variable to a BLAST search instead of a file?

I'm writing a Python script and want to pass the query sequence information into blastn as a string variable rather than a FASTA format file if possible. I used Biopython's SeqIO to store several transcript names as keys and their sequences as the…
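A minimal sketch of one way to approach this, assuming BLAST+ is installed and that blastn reads the query from standard input when `-query` is `-`. The helper names `to_fasta`, `blastn_command`, and `run_blastn` are hypothetical, and the database name is whatever local database you have built.

```python
import subprocess


def to_fasta(records):
    """Render a {name: sequence} mapping as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records.items())


def blastn_command(db, outfmt="6"):
    # "-query -" asks blastn to read the query from standard input
    # (assumption: this matches the installed BLAST+ version).
    return ["blastn", "-query", "-", "-db", db, "-outfmt", outfmt]


def run_blastn(records, db):
    """Run blastn on in-memory sequences; requires blastn on PATH."""
    result = subprocess.run(
        blastn_command(db),
        input=to_fasta(records),  # FASTA text is piped in, no temp file
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if blastn fails
    )
    return result.stdout
```

Piping the FASTA text through `input=` avoids writing a temporary file; if the installed version does not accept stdin this way, a `tempfile.NamedTemporaryFile` is the usual fallback.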
4 votes, 2 answers

BLAST Database error: No alias or index file found for nucleotide database

I am trying to run blastn, and then standalone SIFT. However, I am having database configuration issues, as I am getting the following: arron@arron-Ideapad-Z570 ~/Phd/programs/sift4.0.3b $ blastn -query test/lacI.fasta -db db/swissprot/ BLAST…
brucezepplin
4 votes, 1 answer

Biopython NCBIWWW.qblast test file -hangs on

When I try to run a test file provided by Biopython for an NCBIWWW.qblast online search, it just hangs and never responds. The same happens when I run any script of my own that includes NCBIWWW.qblast: it just gets to this…
3 votes, 1 answer

Making Blast database from FASTA in Python

How can I do this? I use Biopython and have already read the manual. Of course I can make a BLAST database from FASTA using "makeblastdb" in standalone NCBI BLAST+, but I want the whole process in one program. It seems there are two possible solutions. Find a function…
3 votes, 4 answers

Filtering a fasta file with sequences that match a certain string in another file

With BLAST I have obtained a file with two tab-separated columns, one with species names and the other with a gene name (the name of the most similar gene in a reference database). My goal is to find in the first file all the species names for which…
MarcD
3 votes, 2 answers

Filtering a dataframe of BLAST sequences to get within each cluster the maximum pident_x

I have a problem, I need to parse the following dataframe: cluster_name qseqid sseqid pident_x qstart qend sstar send 2 1 seq1_0035_0035 seq13_0042_0035 0.73 42 133 46 189 3 1 seq1_0035_0035 seq13_0042_0035 0.73 …
Grendel
3 votes, 0 answers

Error running BLAST when database is on external hard drive

I have two Macs: a Desktop with a 3 TB hard drive, and a laptop with a 512 GB SSD and an attached 8 TB USB-C external hard drive. I have the same BLAST database set up on both; for the laptop, the database is on the external hard drive. When I run a…
brt381
3 votes, 3 answers

Iterate through files in a directory, create output files, linux

I am trying to iterate through every file in a specific directory (called sequences), and perform two functions on each file. I know that the functions (the 'blastp' and 'cat' lines) work, since I can run them on individual files. Ordinarily I would…
lynkyra
3 votes, 2 answers

blast could not create a unit counts container

I built a local BLAST database. However, when I run the blastn command I get this error message: T0…
Hamid_UMB
3 votes, 0 answers

Python script skips writing trimmed DNA sequences to files

Edit 2/18: I figured out the issue. It's not the code directly, although someone has pointed out this sample I have put up is not the way I should have put it up. I apologize! The issue is the blastx results. They were not meeting the threshold set…
BrianW
3 votes, 3 answers

NamedTemporaryFile exists, but external program can't access it

This is a follow-up of sorts to this question about using NamedTemporaryFile(). I have a function that creates and writes to a temporary file. I then want to use that file in a different function, which calls a terminal command that uses that file…
kevbonham
3 votes, 2 answers

Nested Quotes in Perl System()

I'm trying to modify a perl script. Here is the part I am trying to modify: Original: system ("tblastn -db $BLASTDB -query $TMP/prot$$.fa \\ -word_size 6 -max_target_seqs 5 -seg yes -num_threads $THREADS -lcase_masking \\ …
Blaze
3 votes, 2 answers

Python: Running Multidimensional Scaling with Incomplete Pairwise Dissimilarity Matrix in HDF5 format

I am working with large datasets of protein-protein similarities generated in NCBI BLAST. I have stored the results in large pairwise matrices (25,000 x 25,000) and I am using multidimensional scaling (MDS) to visualize the data. These matrices…
3 votes, 1 answer

custom blast db with NcbiblastxCommandline

This is the first time I have used BLAST inside Biopython, and I'm having a problem. I created a custom BLAST database from a FASTA file containing 20 sequences using: os.system('makeblastdb -in newtest.fasta -dbtype nucl -out newtest.db') and…
ifreak