14

You create a raw string from a string literal this way:

test_file=open(r'c:\Python27\test.txt','r')

How do you create a raw variable from a string variable, such as

path = 'c:\Python27\test.txt'

test_file=open(rpath,'r')

Because I have a file path:

file_path = "C:\Users\b_zz\Desktop\my_file"

When I do:

data_list = open(os.path.expandvars(file_path),"r").readlines()

I get:

Traceback (most recent call last):
  File "<pyshell#32>", line 1, in <module>
    scheduled_data_list = open(os.path.expandvars(file_path),"r").readlines()
IOError: [Errno 22] invalid mode ('r') or filename: 'C:\\Users\x08_zz\\Desktop\\my_file'
alwbtc

3 Answers

11

There is no such thing as a "raw string" once the string has been created in the process. The "" and r"" notations exist only in the source code itself.

That means "\x01" will create a string consisting of one byte, 0x01, but r"\x01" will create a string consisting of 4 bytes: '0x5c', '0x78', '0x30', '0x31' (assuming we're talking about Python 2 and ignoring encodings for a while).
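To make the difference concrete, here is a minimal sketch comparing the two spellings. The variable names are only illustrative, and the snippet is written so that it behaves the same on Python 2 and Python 3:

```python
# Two literals that look alike in source but produce different strings.
cooked = "\x01"   # the parser processes the escape: one character, code 1
raw = r"\x01"     # raw literal: backslash, 'x', '0', '1'

print(len(cooked))                   # 1
print(len(raw))                      # 4
print([hex(ord(c)) for c in raw])    # ['0x5c', '0x78', '0x30', '0x31']

# A raw literal is only another spelling of an ordinary string.
path_raw = r'c:\Python27\test.txt'
path_doubled = 'c:\\Python27\\test.txt'
print(path_raw == path_doubled)      # True
```

Once the objects exist, nothing distinguishes a string that came from a raw literal from one that came from a literal with doubled backslashes.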

You mentioned in the comment that you're taking the string from the user (either GUI or console input will work the same here). In that case string character escapes will not be processed, so there's nothing you have to do about it. You can check it easily like this (or whatever the Windows equivalent is, I only speak *nix):

% cat > test <<EOF                                             
heredoc> \x41
heredoc> EOF
% < test python -c "import sys; print sys.stdin.read()"
\x41
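The same check can be done without a shell. The sketch below is an illustration rather than part of the original session: it writes the four characters \x41 to a temporary file and reads them back, which should also work on Windows.

```python
import os
import tempfile

# Write the four characters  \ x 4 1  to a temporary file.
fd, tmp_name = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write(r"\x41")

# Read them back: data that arrives at run time is not parsed for escapes.
with open(tmp_name) as f:
    data = f.read()
os.remove(tmp_name)

print(data)        # \x41
print(len(data))   # 4
```

If escapes were processed on input, `data` would be the single character 'A' instead of four characters.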
viraptor
7

My solution for converting a string to a raw string (it handles only these sequences: '\a', '\b', '\f', '\n', '\r', '\t', '\v'. A list of all escape sequences is here):

def str_to_raw(s):
    # Map control-character codes back to the two-character escape text that produced them.
    raw_map = {8:r'\b', 7:r'\a', 12:r'\f', 10:r'\n', 13:r'\r', 9:r'\t', 11:r'\v'}
    return r''.join(i if ord(i) > 32 else raw_map.get(ord(i), i) for i in s)

Demo:

>>> file_path = "C:\Users\b_zz\Desktop\fy_file"
>>> file_path
'C:\\Users\x08_zz\\Desktop\x0cy_file'
>>> str_to_raw(file_path)
'C:\\Users\\b_zz\\Desktop\\fy_file'
ndpu
  • But I get the path string from a GUI input. How do I add "r" to the beginning? – alwbtc Feb 06 '14 at 14:36
  • 1
    What the user is asking is: how can I take an unknown string and make sure the path doesn't get binary-represented (a "raw" string rather than an interpreted string)? – Torxed Feb 06 '14 at 14:38
  • 8
    In memory there are no raw strings. A raw string is just a helper for source code. If you get the string via (GUI)input everything is OK. – Matthias Feb 06 '14 at 14:41
  • 1
    @alwbtc **How** do you get the path string from the user? If you get a \b character in there, I don't think you got what you wanted from them. – Travis Griggs Feb 06 '14 at 15:47
  • 1
    You'll get exactly what the user provided. No string transformation/unescaping happens when you just look at/use a value. The "\b" part is not related to the string itself, but rather it's an artifact of source code parsing. – viraptor Feb 06 '14 at 17:06
  • @viraptor For the \b case: I search for 8 in the string and return r'\b' in its place. And r'\b' is actually two characters, '\' and 'b', in a raw string. Test it. – ndpu Feb 06 '14 at 17:13
  • @ndpu Yes, because that's what you put in the source. Basically, what your first line says is: construct a string that contains a 0x08 byte after "C:\Users". If file_path comes from the GUI, this will not happen. Also, `str_to_raw` is incorrect: even if it were needed, it would fail on "C:\x2345", for example. – viraptor Feb 06 '14 at 17:18
  • @viraptor The author of the question says that he got this from input; why shouldn't I trust him? And my solution just works, I don't understand you. It doesn't fail for "C:\0x2345". – ndpu Feb 06 '14 at 17:26
  • @ndpu `In [2]: str_to_raw("C:\x2345")`, `Out[2]: 'C:#45'`: this is not what you want to get. I don't trust the author because he doesn't understand the difference between the source and in-memory representations of strings ("How do you create a raw variable from a string variable"). Nothing personal, just trying to explain what really happens. There's a small chance that the GUI really does such a conversion, but that would be a bug in the GUI implementation (and, as shown in this example, it's not fixable in the code receiving the value). – viraptor Feb 06 '14 at 17:31
  • @viraptor Maybe it's a bug, OK. As for the failure: it works for me in the Python 2.7 console: `>>> str_to_raw("C:\0x2345") 'C:\x00x2345'` – ndpu Feb 06 '14 at 17:36
  • I edited the comment with `\x2345` instead of `\0x2345`, but the original is still incorrect. Do you see how the function appended the `x00` part that didn't exist there in the first place? – viraptor Feb 06 '14 at 17:37
  • @viraptor But "C:\x2345" is exactly what you get from my function: `>>> "C:\x2345" 'C:#45'`. It is not a failure but a feature. – ndpu Feb 06 '14 at 17:41
  • We'll have to disagree then :) I think it's a bug if you want to reverse some transformation, but fail to do it for all cases. (which is impossible to do here) Also this function won't help the question author since the issue begins in some other place in the code - he shouldn't just patch the place where the issue is first seen. – viraptor Feb 06 '14 at 17:50
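The disagreement in the comments above comes down to whether escape processing can be undone after the fact. The sketch below copies `str_to_raw` from the answer so that it is self-contained; the sample paths are made up for illustration and avoid `\U`, which Python 3 rejects in a non-raw literal.

```python
def str_to_raw(s):
    # Copied from the answer above.
    raw_map = {8: r'\b', 7: r'\a', 12: r'\f', 10: r'\n', 13: r'\r', 9: r'\t', 11: r'\v'}
    return r''.join(i if ord(i) > 32 else raw_map.get(ord(i), i) for i in s)

# Escapes that produce one of the mapped control characters are restored.
restored = str_to_raw("C:\temp\b_zz")
print(restored == r"C:\temp\b_zz")     # True

# An escape that produces a printable character leaves no trace to restore.
hex_case = str_to_raw("C:\x2345")
print(hex_case)                        # C:#45

# Two different source spellings yield the same in-memory string,
# so no function can tell which one was typed.
same_string = ("C:\x2345" == "C:#45")
print(same_string)                     # True
```

The function therefore repairs only the seven listed sequences; anything else written in a non-raw literal has already been changed irreversibly by the time the string exists.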
0

The solution by ndpu works for me.

I could not resist the temptation to enhance it (making it compatible with ancient Python 2 versions and hoping to speed it up):

_dRawMap = {8:r'\b', 7:r'\a', 12:r'\f', 10:r'\n', 13:r'\r', 9:r'\t', 11:r'\v'}

def getRawGotStr(s):
    # Replace each control character listed in _dRawMap with its two-character escape text.
    return r''.join( [ _dRawMap.get( ord(c), c ) for c in s ] )

I did a careful time trial, and it turns out that the original code by ndpu is a little faster. List comprehensions are fast, but in this trial the generator expression was faster.

Rick Graves