
I need to find every part of a large text string that matches this pattern:

"\t\t" + number (between 1-999) + "\t\t" 

and then replace each occurrence with:

TEXT+"\t\t"+same number+"\t\t" 

So, the end result is:

'TEXT\t\t24\t\tblah blah blahTEXT\t\t56\t\t'... and so on...

The numbers range from 1 to 999, so the pattern needs some kind of wildcard.

Please can somebody show me how to do it? Thanks!

Kjuly
Jonty

1 Answer


You'll want to use Python's re library, and in particular the re.sub function:

import re  # re is Python's regex library
SAMPLE_TEXT = "\t\t45\t\tbsadfd\t\t839\t\tds532\t\t0\t\t"  # Test text to run the regex on

# Run the regex using re.sub (for substitute).
# re.sub takes three arguments here: the regex pattern,
# a function that returns the replacement text (a plain string also works),
# and the text you're running the regex on.

# The regex looks for substrings of the form:
# Two tabs ("\t\t"), followed by one to three digits 0-9 ("[0-9]{1,3}"),
# followed by two more tabs.
# Note: this also matches 0 and zero-padded values such as 007.

# The lambda function takes in a match object x,
# and returns the full text of that object (x.group(0))
# with "TEXT" prepended.
output = re.sub("\t\t[0-9]{1,3}\t\t",
                lambda x: "TEXT" + x.group(0),
                SAMPLE_TEXT)

print(output)  # Print the resulting string (this form runs on Python 2 and 3).
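If the match should be limited to 1-999, as the question describes, a slightly stricter pattern is one option. The sketch below is a variant and not part of the original answer: it assumes numbers are written without leading zeros, and it uses a replacement string with `\g<0>` (the whole match) in place of the lambda. The names `pattern`, `sample_text` and `result` are only illustrative.

```python
import re

# Same sample input as above; "\t" here is a real tab character.
sample_text = "\t\t45\t\tbsadfd\t\t839\t\tds532\t\t0\t\t"

# Two tabs, a number from 1 to 999 (assumed to have no leading zeros),
# then two more tabs. The regex engine reads \t in a raw string as a tab.
pattern = re.compile(r"\t\t[1-9][0-9]{0,2}\t\t")

# In a replacement string, \g<0> stands for the entire matched text,
# so this prepends "TEXT" to every match.
result = pattern.sub(r"TEXT\g<0>", sample_text)

print(repr(result))
```

With this pattern, the `\t\t0\t\t` at the end of the sample is left alone, while `45` and `839` are still prefixed with TEXT.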
Greg Edelston
  • Excellent! I don't know how it works - but it works! The part I don't understand is: lambda x: "TEXT" + x.group(0), is lambda part of the language? I need to research it but I don't know what "chapter" heading this comes under in a Python book... anyway, much thanks for a great answer! – Jonty Apr 17 '15 at 19:53
  • I've added comments to the code, but a lambda function is part of Python. You can find the documentation for it [here](https://docs.python.org/2/reference/expressions.html#lambda). It creates an anonymous function that takes in a variable x, and returns the value after the colon. Equivalently, I could have written: `def foo(x): return "TEXT" + x.group(0)`, and used `foo` as the second argument of `re.sub`. – Greg Edelston Apr 17 '15 at 20:02
  • Thanks for commenting the solution - it's so elegant, a "one line" multipurpose text processing tool that you'd never find in any "cookbook" :-) Thanks for sharing so generously. Much appreciated. – Jonty Apr 20 '15 at 18:26
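For anyone else puzzled by the lambda, here is a rough sketch of the named-function form mentioned in the comments. The function name `prepend_text` is just a placeholder (the comment used `foo`), and the sample string is the one from the question.

```python
import re

def prepend_text(match):
    """Return the matched text with "TEXT" in front of it."""
    # match.group(0) is the full substring the regex matched.
    return "TEXT" + match.group(0)

# Example input taken from the question.
sample_text = "\t\t24\t\tblah blah blah\t\t56\t\t"

# Passing the function itself (no parentheses) as the second argument;
# re.sub calls it once for every match it finds.
output = re.sub("\t\t[0-9]{1,3}\t\t", prepend_text, sample_text)

print(repr(output))
```

This should behave the same as the lambda version; the lambda is just a way to write the function inline without naming it.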