I have a list of strings:

list_of_strings

They look like this:

'/folder1/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file'

I want to extract the leading part of each string, /folder1/folder2/folder3/folder4/folder5/exp-*, and collect those parts in a new list.

I was thinking of something like the following, but I can't find the right expression for the extraction:

list_of_stringparts = []

for string in sorted(list_of_strings):
    part= string.split('/')[7]  # or whatever returns the first part of my string
    list_of_stringparts.append(part)

Does anyone have an idea? Do I need a regex?

kylieCatt
Annamarie

3 Answers

You are using indexing (subscription), which extracts a single element (the eighth). To get the first seven elements, you need a slice [N:M:S], like this:

>>> l = '/folder1/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file'
>>> l.split('/')[:7]
['', 'folder1', 'folder2', 'folder3', 'folder4', 'folder5', 'exp-*']

Here N is omitted (it defaults to 0) and S, the step, is omitted as well (it defaults to 1), so you get the elements at indices 0 through 6 of the split result; the stop index 7 itself is excluded.

To reassemble the string, use join():

>>> '/'.join(l.split('/')[:7])
'/folder1/folder2/folder3/folder4/folder5/exp-*'
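
Applied to the whole list from the question, the slice-and-join approach might look like the sketch below. The helper name leading_part and its depth parameter are made up for illustration; the default of 7 matches the example path, where the empty string before the leading slash counts as one piece.

```python
def leading_part(path, depth=7):
    """Return the first `depth` '/'-separated pieces of `path`, rejoined."""
    # For a path starting with '/', split('/') yields '' as the first piece,
    # so depth=7 keeps '' plus six path components.
    return '/'.join(path.split('/')[:depth])


list_of_strings = [
    '/folder1/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file',
]

list_of_stringparts = [leading_part(s) for s in sorted(list_of_strings)]
print(list_of_stringparts)
# ['/folder1/folder2/folder3/folder4/folder5/exp-*']
```

This only works if the part you want always sits at the same depth in every path.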
myaut

I would do it like this:

>>> s = '/folder1/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file'
>>> s.split('/')[:7]
['', 'folder1', 'folder2', 'folder3', 'folder4', 'folder5', 'exp-*']
>>> '/'.join(s.split('/')[:7])
'/folder1/folder2/folder3/folder4/folder5/exp-*'

Or, using re.match:

>>> import re
>>> s = '/folder1/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file'
>>> re.match(r'.*?\*', s).group()
'/folder1/folder2/folder3/folder4/folder5/exp-*'
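
One caveat with the regex variant: re.match returns None when a string contains no *, so calling .group() directly would raise an AttributeError. A minimal sketch that guards against this, assuming you want None for such strings (the names FIRST_STAR and prefix_through_star are only illustrative):

```python
import re

# Non-greedy match from the start of the string up to the first '*'.
FIRST_STAR = re.compile(r'.*?\*')


def prefix_through_star(path):
    """Return the text up to and including the first '*', or None if absent."""
    match = FIRST_STAR.match(path)
    return match.group() if match else None


paths = [
    '/folder1/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file',
    '/folder/blah/pow',
]
prefixes = [prefix_through_star(p) for p in paths]
print(prefixes)
# ['/folder1/folder2/folder3/folder4/folder5/exp-*', None]
```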
Avinash Raj

Your example suggests that you want to partition the strings at the first * character. This can be done with str.partition():

list_of_stringparts = []

list_of_strings = ['/folder1/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file', '/folder1/exp-*/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file', '/folder/blah/pow']
for s in sorted(list_of_strings):
    head, sep, tail = s.partition('*')
    list_of_stringparts.append(head + sep)

>>> list_of_stringparts
['/folder/blah/pow', '/folder1/exp-*', '/folder1/folder2/folder3/folder4/folder5/exp-*']

Or this equivalent list comprehension:

list_of_stringparts = [''.join(s.partition('*')[:2]) for s in sorted(list_of_strings)]

Both versions keep any string that does not contain a * unchanged; I'm not sure from your question whether that is desired.
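
If strings without a * should be dropped instead, one possible variation is to check the separator that partition() returns, which is empty when nothing was found. This is a sketch reusing the sample list from above, not something the question asks for explicitly:

```python
list_of_strings = [
    '/folder1/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file',
    '/folder1/exp-*/folder2/folder3/folder4/folder5/exp-*/exp-*/otherfolder/file',
    '/folder/blah/pow',
]

# partition('*') returns (head, '*', tail) when a '*' is found and
# (whole_string, '', '') otherwise, so an empty sep means "no '*' here".
list_of_stringparts = [
    head + sep
    for head, sep, _tail in (s.partition('*') for s in sorted(list_of_strings))
    if sep
]
print(list_of_stringparts)
# ['/folder1/exp-*', '/folder1/folder2/folder3/folder4/folder5/exp-*']
```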

mhawke