In this data:

[‘23 2312 dfr tr 133’,
 ‘2344 fdeed’,
 ‘der3212fr342 96’]

I would like a function that returns the values made up of a certain number of digits in a row. Spaces and other text don't matter, as long as there are exactly that many digits in a row (no more, no less). For example:


2 numbers in a row:
[‘23’,’’,’96’]

3 numbers in a row:
[‘133’,’’,’342’]

4 numbers in a row:
[‘2312’,’2344’,’3212’]

Thank you

yatu
tedioustortoise
  • How are the data stored? In a text file as such? Or are they stored in any predefined structure, like an array, or a list for example? – swiss_knight Apr 04 '20 at 09:32
  • are you looking for something like [this](https://stackoverflow.com/a/24878232/10197418)? – FObersteiner Apr 04 '20 at 09:34
  • Imported to form a list. No external file. The list is a list of strings, like one at the top of the question. Thank you. – tedioustortoise Apr 04 '20 at 09:34
  • Why the empty strings in the middle of your first two outputs? Also, please provide valid data, your quotes are not normal quotes and will cause a syntax error. – Thierry Lathuille Apr 04 '20 at 09:35
  • I wonder what will happen if the first string in the list is "23 2312 dfr tr 13" and the number is 2. Will it return `['23','','96']`, `['23','13','','96']`, `[['23','13'],'','96']`, or something else? – jizhihaoSAMA Apr 04 '20 at 09:54

1 Answer

One way is to use re.findall with a pattern that matches only runs of exactly n contiguous digits:

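import re

l = ['23 2312 dfr tr 133',
     '2344 fdeed',
     'der3212fr342 96']

def length_n_digits(l, n):
    # For each string, collect every run of exactly n digits;
    # a string with no such run contributes a single ''.
    return [s for i in l for s in
            re.findall(rf'(?<!\d)\d{{{n}}}(?!\d)', i) or ['']]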

Note that the doubled braces '{{' and '}}' in the f-string only escape literal braces; the inner {n} is the part that gets interpolated, so with n = 2 the pattern becomes (?<!\d)\d{2}(?!\d). The lookarounds (?<!\d) and (?!\d) ensure that a run of n digits matches only when it is not immediately preceded or followed by another digit.
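To make the two points above concrete, here is a small sketch that prints the pattern the f-string builds and compares it with a plain \d{2}. The variable names `pattern` and `text` are only for illustration; the string is the third item of the question's data.

```python
import re

n = 2
# Doubled braces give literal braces; the inner {n} is interpolated.
pattern = rf'(?<!\d)\d{{{n}}}(?!\d)'
print(pattern)  # (?<!\d)\d{2}(?!\d)

text = 'der3212fr342 96'

# Without lookarounds, \d{2} also matches pieces of longer digit runs.
print(re.findall(r'\d{2}', text))  # ['32', '12', '34', '96']

# With lookarounds, only runs of exactly two digits are kept.
print(re.findall(pattern, text))   # ['96']
```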


length_n_digits(l, 2)
# ['23', '', '96']

length_n_digits(l, 3)
# ['133', '', '342']

length_n_digits(l, 4)
# ['2312', '2344', '3212']
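Regarding the comment about a string that contains more than one match: the function above flattens the results, so with a first string of '23 2312 dfr tr 13' and n = 2 it would return ['23', '13', '', '96']. If one result per input string is preferred, a possible variant (a sketch only; the name `length_n_digits_grouped` is hypothetical and not part of the answer above) returns a list of matches for each string:

```python
import re

def length_n_digits_grouped(strings, n):
    # Hypothetical variant: one list of matches per input string,
    # so the output always lines up with the input.
    pattern = re.compile(rf'(?<!\d)\d{{{n}}}(?!\d)')
    return [pattern.findall(s) for s in strings]

# Data from the comment: the first string now ends in a two-digit run.
data = ['23 2312 dfr tr 13',
        '2344 fdeed',
        'der3212fr342 96']

print(length_n_digits_grouped(data, 2))
# [['23', '13'], [], ['96']]
```

An empty list then marks a string with no match, instead of the '' placeholder used in the flat version.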
yatu