
I've also tried using newString.strip('\n') in addition to the calls already in the code, but it doesn't do anything. I am inputting a .fasta file, which shouldn't be a problem. Thanks in advance.

def createLists(fil3):
    f = open(fil3, "r")
    text = f.read()

    listOfSpecies = []
    listOfSequences = []

    i = 0
    check = 0

    while (check != -1):
        startIndex = text.find(">",i)
        endIndex = text.find("\n",i)
        listOfSpecies.append(text[startIndex+1:endIndex])

        if(text.find(">",endIndex) != -1):
            i = text.find(">",endIndex)
            newString = text[endIndex+1: i]
            newString.strip()
            newString.splitlines()
            listOfSequences.append(newString)

        else:
            newString = text[endIndex+1:]
            newString.strip()
            newString.strip('\n')
            listOfSequences.append(newString)
            return (listOfSpecies,listOfSequences)


def cluster(fil3):
    print createLists(fil3)


cluster("ProteinSequencesAligned.fasta")
Alberto Does

2 Answers


Strings are immutable:

In [1]: s = 'lala\n'

In [2]: s.strip()
Out[2]: 'lala'

In [3]: s
Out[3]: 'lala\n'

In [4]: s = s.strip()

In [5]: s
Out[5]: 'lala'

So just do:

new_string = text[end_index+1:].strip()

And please follow PEP 8. Also, you could rewrite your loop just using a for loop over the lines. Python files support direct iteration:

In [6]: with open('download.py') as fobj:
   ...:     for line in fobj:
   ...:         print line

And if you don't use the with statement, make sure you close the file with the close() method at the end of your function.
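As a rough sketch of the line-by-line approach, the parsing could look like the following. The function name `parse_fasta` and its return shape are assumptions made for illustration; they are not from the question. It accepts any iterable of lines, so an open file object works as well as a list of strings.

```python
def parse_fasta(lines):
    """Return (names, sequences) from an iterable of FASTA lines.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    names = []
    sequences = []
    chunks = []  # pieces of the sequence currently being read

    for line in lines:
        # strip() returns a new string, so the result has to be rebound
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if names:
                sequences.append("".join(chunks))
            names.append(line[1:])
            chunks = []
        elif names:
            # Sequence data belonging to the most recent header
            chunks.append(line)

    # Close the last record
    if names:
        sequences.append("".join(chunks))
    return names, sequences
```

With a real file this would be called as `with open(path) as fobj: names, sequences = parse_fasta(fobj)`, which also takes care of closing the file.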

rubik
  • Edited to give a couple of suggestions. – rubik May 06 '12 at 07:04
  • Thank you, I just replaced the strip with splitlines. Also is the PEP 8 comment in reference to the naming of newString rather than new_string? – Alberto Does May 06 '12 at 07:16
  • @AlbertoDoes: It's general advice: mostly for variable naming, but also for spacing (for example around operators) and conditionals (`if cond or cond2` is better than `if(cond) or (cond2)`). – rubik May 06 '12 at 07:20
  • Great, thanks for the advice. I didn't realize the splitline method made new additions to the list. I found a way around the problem by using new_string = text[endIndex+1:].replace('\n', '') – Alberto Does May 06 '12 at 07:26

In the end, I found the best solution to be new_string = text[endIndex+1:].replace('\n', ''), which removes every newline in the sequence rather than only those at the ends.
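To illustrate why `replace` behaves differently from `strip` here, a small comparison on a made-up multi-line sequence (the `record` value is invented sample data, not taken from the real file):

```python
# Invented sample: one sequence wrapped over three lines
record = "ACGT\nTTGA\nCCAA\n"

# strip() only removes whitespace at the two ends of the string
stripped = record.strip()

# replace() removes every newline, including the interior ones
joined = record.replace("\n", "")

# splitlines() plus join gives the same result here and also copes
# with Windows-style "\r\n" line endings
also_joined = "".join(record.splitlines())
```

Here `stripped` still contains the two interior newlines, while `joined` and `also_joined` are both the single string 'ACGTTTGACCAA'.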

Alberto Does