I have a file that contains a string, and I want to search for a substring within a specific range. This is my string:

newick;
((raccoon,bear),((sea_lion, seal),((monkey,cat), weasel)),dog);

And this is my code for it:

def removeNewick(tree):
    for x in tree:
        new_set = x.replace('newick;', '')
    print(new_set)


filepath = "C:\\Users\\msi gaming\\Desktop\\small_tree.tre"
tree = open(filepath)
removeNewick(tree)

But I know for sure that if the string 'newick' appears, it will be within the first 10 characters of the string. How do I edit my for loop so that it only looks at the first ten characters?

FHTMitchell
  • 1
    you could test if you find the pattern in the sliced string, but I think it would take more time to slice the string than to perform the replace on the whole string. Are you looking for performance? – Jean-François Fabre May 30 '18 at 14:13
  • 1
    BTW how was this tree file generated, why do you have to parse it now to remove the newick word? This seems odd. (For those who don't know newick is a file format for storing trees https://en.wikipedia.org/wiki/Newick_format) – Chris_Rands May 30 '18 at 14:16
  • 1
    It's a small dataset and i usually use the ETE3 library to read it but that fails when the word Newick is at the beginning. –  May 30 '18 at 14:17
  • what tool *generates* the tree with the word `newick;`? – Chris_Rands May 30 '18 at 14:20
  • 1
    "it would be in the first 10 characters of the string" do you mean "of the file" ? – Jean-François Fabre May 30 '18 at 14:21

1 Answer

OK, so tree is a file object; iterating over it yields one line at a time:

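def remove_newick(tree):
    # Iterating over a file object yields one line at a time.
    for x in tree:
        if x.startswith('newick;'):
            # Drop the header line and print an empty line in its place.
            print('')
        else:
            # Each line keeps its trailing newline, so suppress print's own.
            print(x, end='')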

str.startswith() compares only the leading characters of the string against the prefix, so it never scans the rest of the line. It is the idiomatic way to check whether a string starts with a given substring.
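
If the marker may sit anywhere within the first ten characters rather than exactly at the start of the line, one option is to limit the search to that window. The following is only a minimal sketch: it assumes the marker is the literal text newick; from the question, and strip_marker is a hypothetical helper name, not something from the original code.

```python
MARKER = 'newick;'  # assumed literal header text, taken from the question
WINDOW = 10         # the question's "first 10 characters"


def strip_marker(line, marker=MARKER, window=WINDOW):
    """Remove marker from line only if it lies entirely within line[:window]."""
    # str.find(sub, start, end) restricts the search to line[start:end],
    # so a match must fit completely inside the first `window` characters.
    idx = line.find(marker, 0, window)
    if idx == -1:
        # Marker absent from the window: return the line unchanged.
        return line
    # Cut the marker out and keep everything around it.
    return line[:idx] + line[idx + len(marker):]
```

A line such as '  newick;((a,b),c);' loses the marker, while a line where 'newick;' only appears later than the window is returned untouched.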


For the record, don't do

tree = open(filepath)
remove_newick(tree)

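Leaving a file open is risky: the handle is not guaranteed to be closed promptly, and it stays open if an exception interrupts the code. Instead, use a with block, which closes the file automatically: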

with open(filepath) as tree:
    remove_newick(tree)
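
If the "first 10 characters" refers to the whole file rather than to each line, a variation is to read the file once inside a with block and clean only the leading window. This is a sketch under that assumption; read_tree and its parameters are hypothetical names, and the demo uses a temporary file holding the data from the question.

```python
import os
import tempfile


def read_tree(path, marker='newick;', window=10):
    """Return the file's text, dropping marker if it lies within the first `window` characters."""
    with open(path) as fh:  # the file is closed when this block exits
        text = fh.read()
    # Only the leading window is searched; the rest of the text is left alone.
    head, tail = text[:window], text[window:]
    return (head.replace(marker, '', 1) + tail).strip()


# Demo with a throwaway file containing the sample from the question.
sample = 'newick;\n((raccoon,bear),((sea_lion, seal),((monkey,cat), weasel)),dog);\n'
with tempfile.NamedTemporaryFile('w', suffix='.tre', delete=False) as tmp:
    tmp.write(sample)
try:
    cleaned = read_tree(tmp.name)
    print(cleaned)
finally:
    os.remove(tmp.name)
```

The returned string holds only the tree data, so it can be handed to whichever parser is in use.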
FHTMitchell
  • 1
    but OP stated that "if this string 'newick' is going to appear, then it would be in the first 10 characters of the string", not necessarily at the start... – Jean-François Fabre May 30 '18 at 14:19
  • glad it worked. make sure to mark this as correct so we don't get answer spam and such – L_Church May 30 '18 at 14:19
  • 1
    This is incorrect. This removes "newick" from the start of *any* line in the file, not just from the first 10 characters in the file. – Aran-Fey May 30 '18 at 14:20
  • 1
    @Aran-Fey what? "But i know for sure that if this string 'newick' is going to appear, then it would be in the first 10 characters of the string". he means each line string. – FHTMitchell May 30 '18 at 14:21
  • There are optional parameters which give indexes for the start and end of the search anyway. https://www.tutorialspoint.com/python/string_startswith.htm "str" "beg" "end" – L_Church May 30 '18 at 14:21
  • even so, it is not correct: your code checks the start of each line. – Jean-François Fabre May 30 '18 at 14:22
  • Even if the OP *did* mean that, your code doesn't do that. It removes the text from the start of the line, not from the first 10 characters in the line. – Aran-Fey May 30 '18 at 14:22
  • 2
    Based on their actual application (feeding the tree to ETE3) I'm pretty sure removing all examples of `'newick;'` *is* the appropriate behaviour, the tool only needs the actual tree data contained within the parentheses – Chris_Rands May 30 '18 at 14:23
  • @Chris_Rands Pretty much my interpretation. OP wants to skip lines which start with `"newick;"`. End of story. – FHTMitchell May 30 '18 at 14:23
  • 1
    ok, then it's a duplicate: "how to remove a starting pattern in each line of a file". This question sucks (and the answer doesn't bring anything new) – Jean-François Fabre May 30 '18 at 14:27
  • 1
    god jean that's so mean imma have to write a blog post about your behaviour of telling us a question sucked. <3 – L_Church May 30 '18 at 14:28
  • 1
    the _question_ sucks. Not targeting any user. I'll read your blog post. – Jean-François Fabre May 30 '18 at 14:29
  • whoosh...? i can't tell! :c – L_Church May 30 '18 at 14:30