tconst ordering nconst category job Characters
tt0069049 10.0 nm0613657 editor \N \N
tt0069049 1.0 nm0001379 actor \N| ["Jake Hannaford"]

I have the above dataset in a CSV file. I've loaded it with pandas, and I'm trying to replace the \N values with math.nan (using the math and pandas libraries). Here is what I have:

import math
import pandas as pd

cast_data = pd.read_csv("cast.csv")
cast_data.replace(cast_data.replace(r"^\N*$", math.nan, regex=True))

Here's the error I'm getting. Can someone explain why it happens? Thanks.

raise source.error("missing {")
re.error: missing { at position 3
blacklion

1 Answer

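In a regular expression, as in a string literal, \N starts the Unicode named-character escape \N{name}, so the regex parser expects a { right after it and reports "missing {" when it finds something else. For example: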

>>> '\N{tilde}'
'~'

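So to match a literal backslash followed by N, you just need to add another backslash there: \\N. With the raw string from the question, the pattern becomes r"^\\N*$".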
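
A minimal sketch of the fix applied to a frame like the one in the question. The DataFrame is built inline (an assumption, so the example runs without cast.csv), the pattern is tightened to r"^\\N$" so it matches only cells that are exactly \N, and `cleaned` is an illustrative name. Note that replace returns a new DataFrame rather than modifying cast_data, so the result is assigned.

```python
import math

import pandas as pd

# Hypothetical inline copy of the two rows shown in the question.
cast_data = pd.DataFrame(
    {
        "tconst": ["tt0069049", "tt0069049"],
        "ordering": [10.0, 1.0],
        "nconst": ["nm0613657", "nm0001379"],
        "category": ["editor", "actor"],
        "job": ["\\N", "\\N"],
        "characters": ["\\N", '["Jake Hannaford"]'],
    }
)

# In the raw string, \\ is the regex for one literal backslash, so the
# pattern matches a cell consisting of exactly the two characters \N.
cleaned = cast_data.replace(r"^\\N$", math.nan, regex=True)

# Show which cells are now missing.
print(cleaned.isna())
```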

This is documented for the re module:

Most of the standard escapes supported by Python string literals are also accepted by the regular expression parser:

\a      \b      \f      \n
\N      \r      \t      \u
\U      \v      \x      \\

...

Changed in version 3.8: The '\N{name}' escape sequence has been added. As in string literals, it expands to the named Unicode character (e.g. '\N{EM DASH}').
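
The behaviour described in that note can be checked with the re module alone. This is a small sketch assuming Python 3.8 or later; the variable names are illustrative.

```python
import re

# A bare \N is an incomplete named escape, so compiling the question's
# pattern raises re.error.
try:
    re.compile(r"^\N*$")
    error_message = None
except re.error as exc:
    error_message = str(exc)
print(error_message)

# A complete named escape compiles and matches that one character.
dash_match = re.fullmatch(r"\N{EM DASH}", "\u2014")

# Escaping the backslash matches the two literal characters \ and N.
literal_match = re.fullmatch(r"\\N", "\\N")

print(dash_match is not None, literal_match is not None)
```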

wjandrea
  • Thanks, that solved my problem, but now only the \N values in the last row of the dataset are converted to math.nan; the rest still show up as \N. Any idea why? – blacklion Jul 08 '22 at 03:51
  • @blacklion I'm not sure. Please ask a new question and include a minimal reproducible example. For specifics, see [How to make good reproducible pandas examples](/q/20109391/4518341). Also, please don't forget to [upvote answers you find useful and accept the best one](/help/someone-answers) :) – wjandrea Jul 08 '22 at 04:07
  • @blacklion I see you've asked a new question already. BTW, the title is almost the same, which is confusing, so you might want to give the two questions different titles to distinguish them. For example, you could rename this one "Trying to replace values in a dataframe with regex, but getting 're.error: missing {'". For tips on writing a good title, check out How to Ask. – wjandrea Jul 08 '22 at 04:12