21

I'm not a Python developer, but I'm using a Python script to convert SQLite to MySQL

The suggested script gets close, but no cigar, as they say.

The line giving me a problem is:

line = re.sub(r"([^'])'t'(.)", r"\1THIS_IS_TRUE\2", line)

...along with the equivalent line for false ('f'), of course.

The problem I'm seeing is that only the first occurrence of 't' in any given line is replaced.

So, input to the script,

INSERT INTO "cars" VALUES(56,'Bugatti Veyron','BUG 1',32,'t','t','2011-12-14 18:39:16.556916','2011-12-15 11:25:03.675058','81');

...gives...

INSERT INTO "cars" VALUES(56,'Bugatti Veyron','BUG 1',32,THIS_IS_TRUE,'t','2011-12-14 18:39:16.556916','2011-12-15 11:25:03.675058','81');

I mentioned I'm not a Python developer, but I have tried to fix this myself. According to the documentation, I understand that re.sub should replace all occurrences of 't'.

I'd appreciate a hint as to why I'm only seeing the first occurrence replaced, thanks.

Community
  • 1
  • 1
Snips
  • 6,575
  • 7
  • 40
  • 64

4 Answers4

12

The two substitutions you'd want in your example overlap - the comma between your two instances of 't' will be matched by (.) in the first case, so ([^']) in the second case never gets a chance to match it. This slightly modified version might help:

line = re.sub(r"(?<!')'t'(?=.)", r"THIS_IS_TRUE", line)

This version uses lookahead and lookbehind syntax, described here.

Zero Piraeus
  • 56,143
  • 27
  • 150
  • 160
3

How about

line = line.replace("'t'", "THIS_IS_TRUE").replace("'f'", "THIS_IS_FALSE")

without using re. This replaces all occurrences of 't' and 'f'. Just make sure that no car is named t.

eumiro
  • 207,213
  • 34
  • 299
  • 261
  • Thanks, but I need something a bit more robust than just blanket replacement of 't' (and 'f'). – Snips Nov 13 '12 at 15:33
2

The first match you see is ,'t',. Python proceeds starting with the next character, which is ' (before the second t), subsequently, it cannot match the ([^']) part and skips the second 't'.

In other words, subsequent matches to be replaced cannot overlap.

Alexander Pavlov
  • 31,598
  • 5
  • 67
  • 93
1

using re.sub(r"\bt\b","THIS_IS_TRUE",line):

In [21]: strs="""INSERT INTO "cars" VALUES(56,'Bugatti Veyron','BUG 1',32,'t','t','2011-12-14 18:39:16.556916','2011-12-15 11:25:03.675058','81');"""

In [22]: print re.sub(r"\bt\b","THIS_IS_TRUE",strs)

INSERT INTO "cars" VALUES(56,'Bugatti Veyron','BUG 1',32,'THIS_IS_TRUE','THIS_IS_TRUE','2011-12-14 18:39:16.556916','2011-12-15 11:25:03.675058','81');
Ashwini Chaudhary
  • 244,495
  • 58
  • 464
  • 504