    import pandas as pd
    import numpy as np
    from Bio import *
    from Bio import SeqIO
    import time
    import h5py


    def vectorizeSequence(seq):
        # The order of the letters is not arbitrary.
        # Flip the matrix up-down and left-right for the reverse complement.
        ltrdict = {'a':[1,0,0,0],'c':[0,1,0,0],'g':[0,0,1,0],'t':[0,0,0,1], 'n':[0,0,0,0]}
        return np.array([ltrdict[x] for x in seq])


    starttime = time.time()
    fasta_sequences = SeqIO.parse(open("contigs.fasta"),'fasta')

    #fasta_sequences = str(seq.seq)

    #GC(fasta_sequences)

    with h5py.File('genomeEncoded.h5', 'w') as hf:
        for fasta in fasta_sequences:
            # get the fasta files.
            name, sequence = fasta.id, fasta.seq.tostring()  # HERE APPEARS ERROR
            # Write the chromosome name
            new_file.write(name)
            # encoding scheme
            data = vectorizeSequence(sequence.lower())
            print(name + " is one hot encoded!")
            # write to hdf5
            hf.create_dataset(name, data=data)
            print(name + " is written to dataset")


    endtime = time.time()
    print("Encoding is done in " + str(endtime - starttime))

    Traceback (most recent call last):
      File "FASTA_ENCODING4ML.py", line 30, in <module>
        name, sequence = fasta.id, fasta.seq.tostring()
    AttributeError: 'Seq' object has no attribute 'tostring'

player777
  • 3
    It means that you cannot call `tostring` on your `seq` object because that object doesn't have that method. You'd have to go and find out what the correct way is to turn `seq` into string. Maybe try `str(fasta.seq)` instead. – cadolphs Jan 23 '21 at 15:48
  • Does this answer your question? [Does Python have a toString() equivalent, and can I convert a db.Model element to String?](https://stackoverflow.com/questions/16768302/does-python-have-a-tostring-equivalent-and-can-i-convert-a-db-model-element-t) – JosefZ Jan 23 '21 at 15:51
  • 2
    @JosefZ That's not relevant. This question is about biopython, which used to have a `.tostring()` method, now uses `str()` instead – Chris_Rands Jan 23 '21 at 18:03

1 Answer


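To convert a Biopython `Seq` object to a string, use `str()`. Older Biopython releases had a `.tostring()` method on `Seq`, but it has since been removed, which is why the call raises `AttributeError`.

For example:

    >>> from Bio.Seq import Seq
    >>> str(Seq('ATCGTGC'))
    'ATCGTGC'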
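
Applied to the loop in the question, the change is limited to the line that builds `sequence`. The following is a minimal sketch rather than a drop-in replacement. The names `vectorize_sequence`, `encode_fasta` and `LETTER_TO_ROW` are my own, the `uint8` dtype is an assumption made to keep the datasets small, and the `new_file.write(name)` call is left out because `new_file` is never defined in the question.

```python
import numpy as np

# Column order a, c, g, t; 'n' (unknown base) maps to an all-zero row.
LETTER_TO_ROW = {
    "a": [1, 0, 0, 0],
    "c": [0, 1, 0, 0],
    "g": [0, 0, 1, 0],
    "t": [0, 0, 0, 1],
    "n": [0, 0, 0, 0],
}


def vectorize_sequence(seq):
    """One-hot encode a plain string of a/c/g/t/n (case-insensitive)."""
    # dtype=np.uint8 is an assumption to save space; the question used the default.
    return np.array([LETTER_TO_ROW[base] for base in seq.lower()], dtype=np.uint8)


def encode_fasta(fasta_path, h5_path):
    """Write one one-hot encoded dataset per FASTA record into an HDF5 file."""
    # Imported here so vectorize_sequence can be used without these packages.
    import h5py
    from Bio import SeqIO

    with h5py.File(h5_path, "w") as hf:
        for record in SeqIO.parse(fasta_path, "fasta"):
            # str() takes the place of the removed Seq.tostring() method.
            sequence = str(record.seq)
            hf.create_dataset(record.id, data=vectorize_sequence(sequence))
            print(record.id + " is written to dataset")
```

Calling `encode_fasta("contigs.fasta", "genomeEncoded.h5")` should then write the same datasets the question's script was aiming for. Note that the lookup raises `KeyError` for any character outside a, c, g, t and n, so sequences with other IUPAC ambiguity codes would need extra dictionary entries.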