Thanks for answering my questions. I just want to clarify what I was asking.
I'm trying to split the following string using str.split('+'). I'm reading the strings in from a text file:

['\A'+'ABBOTT\s|\s'+'ABBOTT\s|\s'+'ABBOTT$|\A'+'ABBOTT LABORATORIES\s|\s'+'ABBOTT LABORATORIES\s|\s'+'ABBOTT LABORATORIES$']

The desired result would be:

['\A'+'ABBOTT\s|\s', 'ABBOTT\s|\s', 'ABBOTT$|\A', 'ABBOTT LABORATORIES\s|\s', 'ABBOTT LABORATORIES\s|\s', 'ABBOTT LABORATORIES$']

But instead I'm getting: ["'\A'", "'ABBOTT\s|\s'", "'ABBOTT\s|\s'", "'ABBOTT$|\A'", "'ABBOTT LABORATORIES\s|\s'", "'ABBOTT LABORATORIES\s|\s'", "'ABBOTT LABORATORIES$'"]

I can't get the \ to stay as a single backslash after splitting. Thanks again!

Heather m
  • That's just its representation. Try printing it. Take a look at this question: [How to replace a double backslash with a single backslash in python?](http://stackoverflow.com/questions/6752485/how-to-replace-a-double-backslash-with-a-single-backslash-in-python) – Nate Dec 18 '11 at 13:11
  • Some things to note: if possible, provide working source code to make your problem reproducible for others; let the number of problems per question approach one where possible; don't change the main problem of your question, and submit a new one when there are follow-up problems. – moooeeeep Dec 18 '11 at 20:22

1 Answer

If you print the list directly, the first entry will look as if its value is ['\\A'. That doubled backslash is also how the value should be written as a string literal in your script.

That is because print in that context shows the representation of each element: it wraps the string inside "" and escapes characters such as \ (turning them into \\ in the printed string).

"['\\A'+'ABBOTT\\s|\\s'+'ABBOTT\\s|\\s'+'ABBOTT$|\\A'+'ABBOTT LABORATORIES\\s|\\s'+'ABBOTT LABORATORIES\\s|\\s'+'ABBOTT LABORATORIES$']"

The "real" value of your string isn't like that.

Python 2.7.2 (default, Aug 22 2011, 13:53:27) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin

>>> print "['\\A'+'ABBOTT\\s|\\s'+'ABBOTT LABORATORIES$']".split ('+')
["['\\A'", "'ABBOTT\\s|\\s'", "'ABBOTT LABORATORIES$']"]

>>> print "['\\A'+'ABBOTT\\s|\\s'+'ABBOTT LABORATORIES$']".split ('+')[0]
['\A'
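
The same point can be checked without relying on how the console displays a list. The following is a minimal sketch in Python 3 (the session above is Python 2); the sample text is shortened from the question and the variable names are only illustrative. It compares the repr of the first piece with the piece itself.

```python
# Shortened sample of the text from the question. The raw string keeps
# every backslash exactly as it would appear in the text file.
text = r"'\A'+'ABBOTT\s|\s'+'ABBOTT LABORATORIES$'"

parts = text.split('+')
first = parts[0]

# repr() is what is shown when a list is displayed: the string is quoted
# and each backslash is written as two.
shown = repr(first)

print(shown)                # the escaped representation
print(first)                # the actual characters
print(first.count('\\'))    # number of backslashes really in the string
```

Counting the backslashes, or checking len(first), is a quick way to confirm that the split did not add anything to the data.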
Filip Roséen - refp
  • Hi, thanks for your answer, I just tried that, but I get ["'\\\\A'", "'ABBOTT\\\\s|\\\\s'", "'ABBOTT\\\\s|\\\\s'", "'ABBOTT$|\\\\A'", "'ABBOTT LABORATORIES\\\\s|\\\\s'", "'ABBOTT LABORATORIES\\\\s|\\\\s'", "'ABBOTT LABORATORIES$'"] I think the problem is that I'm reading in the line from a text file, and I'm partitioning on a variable that has the string stored. thanks again!!! – Heather m Dec 18 '11 at 14:27
  • Hi again, I think if there is a way to remove the "" that is currently wrapping my imported text, the split should work fine. Any advice??? thanks!!!! – Heather m Dec 18 '11 at 14:50
  • @Heatherm You could use `'string'` instead of `""` but then you will have to escape the `'` present in your string. – Filip Roséen - refp Dec 18 '11 at 17:26
  • @Heatherm Also, please mark the answer as accepted since your original question has been answered. This will mark the problem as solved, thanks. – Filip Roséen - refp Dec 18 '11 at 17:27
  • Hi, thanks again for your answer. I think I was not clear enough in my earlier questions (sorry, this is my first time posting and I was trying to simplify my question). I import the following line from a .txt file: (line='\A'+'ABBOTT\s|\s'+'ABBOTT\s|\s'+'ABBOTT$|\A'+'ABBOTT LABORATORIES\s|\s'+'ABBOTT LABORATORIES\s|\s'+'ABBOTT LABORATORIES$') I want the following two statements to produce the same results: re.compile(line, re.I+re.M); re.compile('\A'+'ABBOTT\s|\s'+'ABBOTT\s|\s'+'ABBOTT$|\A'+'ABBOTT LABORATORIES\s|\s'+'ABBOTT LABORATORIES\s|\s'+'ABBOTT LABORATORIES$', re.I+re.M); – Heather m Dec 18 '11 at 18:39
  • I'm getting very different results. Thanks again, @moooeeeep I hope you don't mind looking at this again :) – Heather m Dec 18 '11 at 18:43
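
One likely reason the two re.compile calls differ: in a script, the quotes and + signs are Python syntax and disappear when the expression is evaluated, but in a line read from a file they are ordinary characters and end up inside the pattern. Below is a rough sketch in Python 3 of one way to rebuild the pattern, assuming the file line looks exactly like the expression quoted in the comment above. The helper name pattern_from_line is hypothetical and not part of any library.

```python
import re

# Assumed content of the line as read from the .txt file. Here the quotes
# and plus signs are plain characters, not Python syntax.
line = (r"'\A'+'ABBOTT\s|\s'+'ABBOTT\s|\s'+'ABBOTT$|\A'"
        r"+'ABBOTT LABORATORIES\s|\s'+'ABBOTT LABORATORIES\s|\s'"
        r"+'ABBOTT LABORATORIES$'")


def pattern_from_line(raw):
    """Join the quoted pieces of a line such as 'a'+'b' into one string.

    Hypothetical helper: it assumes single-quoted pieces joined by '+'
    and no '+' between quotes inside the pieces themselves.
    """
    joined = "".join(raw.strip().split("'+'"))
    # Drop the one remaining quote at each end.
    if joined.startswith("'") and joined.endswith("'"):
        joined = joined[1:-1]
    return joined


pattern = pattern_from_line(line)

# The same pattern written directly in a script, using raw strings.
expected = (r'\A' + r'ABBOTT\s|\s' + r'ABBOTT\s|\s' + r'ABBOTT$|\A'
            + r'ABBOTT LABORATORIES\s|\s' + r'ABBOTT LABORATORIES\s|\s'
            + r'ABBOTT LABORATORIES$')

compiled = re.compile(pattern, re.I | re.M)
print(pattern == expected)
print(bool(compiled.search("Abbott Laboratories")))
```

If the text file is under your control, storing the finished pattern on the line (without the quotes and plus signs) avoids the rebuilding step entirely.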