-1

I have a FASTA file with three defined elements in the "description" line.

The first element, dato[0], is the one that must satisfy a condition, and the third element, dato[2], is the one whose values I want to sum. The FASTA description lines look like this:

PIN4 HOIAQKS02C4SWQ 1761
PIN1 HOIAQKS02D3JZ3 572

I want to sum the dato[2] values of the records where dato[0] == PIN1 into one total, and those of the records where dato[0] == PIN4 into another.

I am using the following code:

from Bio import SeqIO

secuencias=SeqIO.parse("/Users/imac/Desktop/Pruebas_UniFrac/otu1_alpin1+4.fasta", "fasta")

PIN_records=list(SeqIO.parse("/Users/imac/Desktop/Pruebas_UniFrac/otu1_alpin1+4.fasta", "fasta")

archivo1=open("/Users/imac/Desktop/Pruebas_UniFrac/pruebaalpin1+4_fin.fasta", "w")
archivo2=open("/Users/imac/Desktop/Pruebas_UniFrac/pruebaalpin1+4_seqsotus.fasta", "w")
archivo3=open("/Users/imac/Desktop/Pruebas_UniFrac/pruebaalpin1+4_sumas.fasta", "w")

x = 0
y = x+1
for linea in secuencias:
    dato = linea.description.split(" ")
    seqs = str(linea.seq)

    if dato[0] != "PIN1":
        if dato[0] != "PIN4":
            if dato[0] == "consensus":
               archivo1.write("hacia arriba OTU" + str(y) + "\n" + "x" + "\n" + "x" + "\n")
               archivo2.write(">" + "OTU" + str(y) + "\n" + seqs + "\n")
               archivo3.write("fin del OTU" + "\n")
               y = y+1
        else:
         archivo1.write(str(dato[0]) + "," + str(dato[2]) + "\n")
         #num = int(dato[2])
         #archivo3.write("PIN4=" + str(sum(dato[2])) + "\n")
         #archivo3.write("PIN4=%d\n" % sum(dato[2]))
         archivo3.write("PIN4={}\n".format(sum(dato[2])))
    else:
     archivo1.write(str(dato[0]) + "," + str(dato[2]) + "\n")
     #num = int(dato[2])
     #archivo3.write("PIN1=" + str(sum(dato[2])) + "\n")
     #archivo3.write("PIN1=%d\n" % sum(dato[2]))
     archivo3.write("PIN1={}\n".format(sum(dato[2])))

archivo1.close()
archivo2.close()
archivo3.close()

And when I do that, I get this error message:

TypeError: unsupported operand type(s) for +: 'int' and 'str'

How can I compute these sums?

After following the later comments, I have changed my code, but I still can't get it to work properly and I do not know how to fix it.

With this code, I get the following error:

File "./lectura_msaout_pruebaalpin1+4_final.py", line 16
    archivo1=open("/Users/imac/Desktop/Pruebas_UniFrac/pruebaalpin1+4_fin.fasta", "w")
           ^
SyntaxError: invalid syntax 
Karl Knechtel
Ma_fermar
  • Did you try [searching](https://www.google.com/search?q=TypeError%3A+unsupported+operand+type(s)+for+%2B%3A+'int'+and+'str') for your error message? – David Cain Jul 02 '13 at 16:50
  • I did, but the solutions I found weren't suitable for my problem, I think. – Ma_fermar Jul 04 '13 at 12:39
  • Some future advice: Please be more explicit about problems than "it doesn't work." It's hard to divine exactly what your issue is. Also, you should make an effort to [code in English](http://www.codinghorror.com/blog/2009/03/the-ugly-american-programmer.html). Spanish has a lot of cognates with English, but you're losing a lot of your potential audience when you code in another language (also, it's confusing to see English-based syntax with Spanish varnames). – David Cain Jul 04 '13 at 14:10
  • When I said "It still doesn't work", I meant that I still have the same problem as before: "TypeError: unsupported operand type(s) for +: 'int' and 'str'". Thanks a lot for your advice. – Ma_fermar Jul 05 '13 at 07:33

2 Answers

0

Your code has two main issues.

  1. You're trying to call sum() on string data.
  2. You're trying to format a numeric value as a string.

Fixing summation

You want to sum an iterable of numeric values, as summing is undefined for string values. You can convert string values to an integer by calling int() on each value (use the map() function to do this).

Example:

>>> sum(["1", "2", "3"])
TypeError: unsupported operand type(s) for +: 'int' and 'str'
>>> sum([1, 2, 3])
6
>>> list(map(int, ["1", "2", "3"]))
[1, 2, 3]
>>> sum(map(int, ["1", "2", "3"]))
6

Application to your code

Do you really want to sum the single digits of dato[2]? It'd look like this:

>>> dato = ['PIN4', 'HOIAQKS02C4SWQ', '1761']
>>> sum(map(int, dato[2]))  # 1 + 7 + 6 + 1
15
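
If the goal is instead to total the whole third field across all records that share the same first field, the conversion has to be applied to the field as a whole, once per record. Below is a minimal sketch of that idea. It assumes a hard-coded list of description lines (taken from the question and its comments) in place of the real FASTA file, and the names `descriptions` and `totals` are made up for illustration:

```python
from collections import defaultdict

# Hypothetical stand-in for the record.description values read from the FASTA file
descriptions = [
    "PIN4 HOIAQKS02C4SWQ 1761",
    "PIN1 HOIAQKS02D3JZ3 572",
    "PIN4 HOIAQKS02D3JZ3 572",
]

# One running total per label found in the first field
totals = defaultdict(int)
for description in descriptions:
    dato = description.split(" ")
    # int() converts the whole field ("1761" -> 1761), not its single digits
    totals[dato[0]] += int(dato[2])

for label, total in totals.items():
    print("{}={}".format(label, total))  # PIN4=2333, then PIN1=572
```

With real data, the loop would run over `SeqIO.parse(...)` and read `record.description` instead of the hard-coded list.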

Fixing the string formatting

You can't append an integer to a string (see Python String and Integer concatenation).

The solution is to convert the integer to a string before concatenating, or to format the integer within a string. In your case, the options look like this:

  1. Convert to string:

    archivo3.write("PIN1=" + str(dato_2_sum) + "\n")
    
  2. Use string formatting:

    archivo3.write("PIN1=%d\n" % dato_2_sum)
    
  3. Use new-style formatting (str.format):

    archivo3.write("PIN1={}\n".format(dato_2_sum))
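
Separately, the SyntaxError shown in the edited question is most likely caused by the `PIN_records=list(SeqIO.parse(...)` line, which opens two parentheses but closes only one. Python keeps reading onto the following lines looking for the missing ")", so older versions report the error at the next statement (the `archivo1=open(...)` line), while newer versions point at the unclosed "(" itself. The sketch below reproduces this with a hypothetical two-line snippet and the built-in `compile()`; the snippet text and the helper name `compiles` are assumptions made for illustration:

```python
# Hypothetical snippet mirroring the question: the first line is missing
# the final ")" that closes list(...).
broken = 'records = list(sorted("cab")\nout = 1\n'
# Same snippet with the parenthesis balanced.
fixed = 'records = list(sorted("cab"))\nout = 1\n'


def compiles(source):
    """Return True if the source text is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles(broken))  # False
print(compiles(fixed))   # True
```

Adding the missing ")" at the end of the `PIN_records` line should make that particular error go away.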
    
David Cain
  • I still have the same problem, with any of the three options. I will copy the complete code in an answer. – Ma_fermar Jul 03 '13 at 09:00
  • I have edited the post with the whole code I am using. It still doesn't work, even after trying the three suggestions to convert the integer to a string or format the integer within a string. – Ma_fermar Jul 04 '13 at 07:01
  • I don't actually want to sum the single digits of dato[2], but to sum all the dato[2] values from the different FASTA records (that is, the entries introduced by '>') that satisfy a condition. For example: >"PIN4" "HOIAQKS02C4SWQ" "1761" and >"PIN4" "HOIAQKS02D3JZ3" "572" both satisfy dato[0] == "PIN4", so their dato[2] values can be summed: 1761 + 572 = 2333. Thanks a lot, once more. – Ma_fermar Jul 05 '13 at 07:46
  • Iterate over all sequences, building a list of records (say, `PIN_records`) that satisfies your requirement, then sum index 2 for each record. – David Cain Jul 06 '13 at 17:55
  • As you can see, I am very new to Python, and I do not know how to do that. I have tried to build the list in several ways and did not get anything that works properly. I got different errors, from syntax errors to unexpected indents. I am going to edit my question with the new code so you can see where I am making mistakes. Thanks a lot; I am quite lost with this, and if you could help me work it out it would save me a lot of time. Thank you very much. – Ma_fermar Jul 08 '13 at 09:30
0

In the end I fixed my problem by creating counters outside the for loop and accumulating the totals manually instead of calling sum(), converting between str and int where needed. My almost-finished complete code is the following:

#!/usr/bin/python

from Bio import SeqIO

sequences = SeqIO.parse("/Users/imac/Desktop/Pruebas_UniFrac/otu1_alpin1+4.fasta", "fasta")

file1 = open("/Users/imac/Desktop/Pruebas_UniFrac/pruebaalpin1+4_fin.fasta", "w")
file2 = open("/Users/imac/Desktop/Pruebas_UniFrac/pruebaalpin1+4_seqsotus.fasta", "w")
file3 = open("/Users/imac/Desktop/Pruebas_UniFrac/pruebaalpin1+4_sumas.fasta", "w")

# Running totals for the current OTU; reset after each consensus record
numTotalPin1 = 0
numTotalPin4 = 0

# OTU counter, starting at 1
y = 1

for line in sequences:
    data = line.description.split(" ")
    seqs = str(line.seq)

    if data[0] == "PIN1":
        file1.write(data[0] + "," + data[2] + "\n")
        # data[2] is a string, so convert it before adding
        numTotalPin1 = numTotalPin1 + int(data[2])
    elif data[0] == "PIN4":
        file1.write(data[0] + "," + data[2] + "\n")
        numTotalPin4 = numTotalPin4 + int(data[2])
    elif data[0] == "consensus":
        # A consensus record closes the current OTU: write its totals, then reset
        file1.write("upstream OTU" + str(y) + "\n" + "x" + "\n" + "x" + "\n")
        file2.write(">" + "OTU" + str(y) + "\n" + seqs + "\n")
        file3.write("OTU" + str(y) + "\n")
        file3.write("PIN1=" + str(numTotalPin1) + "\n")
        file3.write("PIN4=" + str(numTotalPin4) + "\n")
        file3.write("end of OTU" + str(y) + "\n")
        y = y + 1
        numTotalPin1 = 0
        numTotalPin4 = 0

file1.close()
file2.close()
file3.close()

I hope that someone can find this code helpful. Thanks for your help.
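
The counter logic can also be tried without Biopython or any files. The following is a minimal sketch, not the script itself: it assumes the description lines are already available as plain strings and that each "consensus" line comes after the PIN1/PIN4 lines of its OTU. The function name `sum_per_otu` and the sample values in `example` are made up for illustration:

```python
def sum_per_otu(descriptions):
    """Return one (otu_number, pin1_total, pin4_total) tuple per consensus line."""
    results = []
    totals = {"PIN1": 0, "PIN4": 0}
    otu = 1
    for description in descriptions:
        data = description.split(" ")
        if data[0] in totals:
            # Convert the third field to int before accumulating
            totals[data[0]] += int(data[2])
        elif data[0] == "consensus":
            # A consensus line closes the current OTU: record totals, then reset
            results.append((otu, totals["PIN1"], totals["PIN4"]))
            totals = {"PIN1": 0, "PIN4": 0}
            otu += 1
    return results


# Hypothetical input: two OTUs, each terminated by a consensus line
example = [
    "PIN4 HOIAQKS02C4SWQ 1761",
    "PIN4 HOIAQKS02D3JZ3 572",
    "PIN1 HOIAQKS02D3JZ3 572",
    "consensus",
    "PIN1 HOIAQKS02C4SWQ 10",
    "consensus",
]

print(sum_per_otu(example))  # [(1, 572, 2333), (2, 10, 0)]
```

Keeping the totals in a dictionary keyed by label means a further label would only need one more dictionary entry rather than another counter variable and branch.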

Ma_fermar