I have a very long string, and this is just part of it:

s = 'States AL Date 2011 01 03 YES States MD Date 2009 08 09 NO'

How can I split it into substrings so that the output is:

'States AL Date 2011 01 03 YES', 'States MD Date 2009 08 09 NO'...

Each substring starts with the keyword "States" and has a fixed length of 30. Thanks!

Barmar
Lisa

1 Answer

You can use a regular expression search for this.

import re
s = 'States AL Date 2011 01 03 YES States MD Date 2009 08 09 NO'
# Raw string (r'...') so that \d reaches the regex engine unchanged
results = re.findall(r'States [A-Z]{2} Date \d{4} \d{2} \d{2} (?:YES|NO)', s)

If you just want substrings of length 30 starting at each occurrence of 'States':

import re
s = 'States AL Date 2011 01 03 YES States MD Date 2009 08 09 NO'
results = [s[m.start():m.start()+30] for m in re.finditer('States', s)]
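
The fixed-length slicing above can be wrapped in a small helper, shown here as a minimal sketch. The function name `fixed_width_records` and its `keyword` and `width` parameters are hypothetical names chosen for illustration. It assumes the keyword never appears inside a record's own data, and it strips trailing whitespace because a 30-character slice of the sample string includes the separating space.

```python
import re

def fixed_width_records(text, keyword="States", width=30):
    """Return a slice of up to `width` characters starting at each `keyword`.

    Hypothetical helper: trailing whitespace is stripped so that the
    separator between records is not included in the result.
    """
    return [
        text[m.start():m.start() + width].rstrip()
        for m in re.finditer(re.escape(keyword), text)
    ]

s = 'States AL Date 2011 01 03 YES States MD Date 2009 08 09 NO'
records = fixed_width_records(s)
print(records)
# ['States AL Date 2011 01 03 YES', 'States MD Date 2009 08 09 NO']
```

Because this relies only on the keyword and the length, it does not care what a record ends with, so endings such as 'T0' or 'B6' should work the same way. The last slice may be shorter than `width` if the string ends early.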
Peter Lang
  • Thanks. What if they do not always end with 'YES' or 'NO'? They might end with 'T0' or 'B6'. Only the length is fixed (30). – Lisa May 28 '21 at 19:09
  • `[s[m.start():m.start()+30] for m in re.finditer('States', s)]`, but I also edited the answer. – Peter Lang May 28 '21 at 19:12