1

I would like to split a series whose values each contain multiple file paths run together, with the '.jar' extension acting as the delimiter. How can I split the values into different rows so that the '.jar' delimiter is not lost?

Ex: 1 /opt/abc/defg/first.jar/opt/dce/efg/second.jar/opt/xyz/prs/third.jar

Expected result: 1 /opt/abc/defg/first.jar

2 /opt/dce/efg/second.jar

3 /opt/xyz/prs/third.jar

Thanks

Manowar
  • Does this answer your question? [In Python, how do I split a string and keep the separators?](https://stackoverflow.com/questions/2136556/in-python-how-do-i-split-a-string-and-keep-the-separators) – AR5HAM Jan 03 '22 at 19:30
  • This question has already been answered [here](https://stackoverflow.com/questions/7866128/python-split-without-removing-the-delimiter) and [here](https://stackoverflow.com/questions/2136556/in-python-how-do-i-split-a-string-and-keep-the-separators) – AR5HAM Jan 03 '22 at 19:31

3 Answers

3

Try str.split with a positive lookbehind assertion

>>> df['path'].str.split(r'(?<=\.jar)').str[:-1].explode()

0    /opt/abc/defg/first.jar
0    /opt/dce/efg/second.jar
0     /opt/xyz/prs/third.jar
Name: path, dtype: object
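
To see why the `.str[:-1]` step is there, here is a minimal sketch of the same lookbehind split using only the standard `re` module (assuming Python 3.7 or later, where `re.split` splits on empty matches). The variable names are illustrative, not from the question.

```python
import re

value = "/opt/abc/defg/first.jar/opt/dce/efg/second.jar/opt/xyz/prs/third.jar"

# (?<=\.jar) matches the empty position right after each ".jar",
# so nothing is consumed and the extension stays on each piece.
pieces = re.split(r"(?<=\.jar)", value)
print(pieces)
# ['/opt/abc/defg/first.jar', '/opt/dce/efg/second.jar', '/opt/xyz/prs/third.jar', '']

# The string ends in ".jar", so the last split leaves an empty string.
# Filtering empties (rather than slicing off the last item) also copes
# with input that does not end in ".jar".
paths = [p for p in pieces if p]
print(paths)
```

The trailing empty string is what `.str[:-1]` removes in the pandas version; if a value might not end in `.jar`, filtering out empty strings may be the safer choice.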
Corralien
3

You can use .str.extractall, using the pattern '(.*?\.jar)'

import pandas as pd

s = pd.Series(['/opt/abc/defg/first.jar/opt/dce/efg/second.jar/opt/xyz/prs/third.jar'])
s.str.extractall(r'(.*?\.jar)')

                               0
  match                         
0 0      /opt/abc/defg/first.jar
  1      /opt/dce/efg/second.jar
  2       /opt/xyz/prs/third.jar
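
If a flat result numbered 1, 2, 3 (as in the question's expected output) is wanted, one possible follow-up is sketched below. The names `matches` and `paths` are illustrative, and the 1-based index is an assumption about the desired layout.

```python
import pandas as pd

s = pd.Series(["/opt/abc/defg/first.jar/opt/dce/efg/second.jar/opt/xyz/prs/third.jar"])

# extractall returns one row per match; the single unnamed capture
# group becomes a column labelled 0.
matches = s.str.extractall(r"(.*?\.jar)")

# Keep that column and replace the (row, match) MultiIndex
# with a simple 1-based index.
paths = matches[0].reset_index(drop=True)
paths.index = paths.index + 1
print(paths)
```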
ALollz
1

You can add ".jar" after the split.

value = "/opt/abc/defg/first.jar/opt/dce/efg/second.jar/opt/xyz/prs/third.jar"
results = [i + ".jar" for i in value.split(".jar") if i != ""]
print(results)

Output:

['/opt/abc/defg/first.jar', '/opt/dce/efg/second.jar', '/opt/xyz/prs/third.jar']
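
One caveat: the list comprehension above appends ".jar" to every non-empty piece, including any trailing text that was never followed by ".jar". A minimal sketch of a variant that only restores the suffix where it was actually present follows; `split_keep_suffix` is a hypothetical helper name, not a library function.

```python
def split_keep_suffix(value, suffix=".jar"):
    """Split value after each occurrence of suffix, keeping the suffix."""
    pieces = value.split(suffix)
    # Every piece except the last was followed by the suffix in the input.
    results = [piece + suffix for piece in pieces[:-1]]
    # The last piece is whatever came after the final suffix;
    # keep it unchanged, and only if it is non-empty.
    if pieces[-1]:
        results.append(pieces[-1])
    return results


value = "/opt/abc/defg/first.jar/opt/dce/efg/second.jar/opt/xyz/prs/third.jar"
print(split_keep_suffix(value))
# ['/opt/abc/defg/first.jar', '/opt/dce/efg/second.jar', '/opt/xyz/prs/third.jar']

print(split_keep_suffix("/a/b.jar/c/d.txt"))
# ['/a/b.jar', '/c/d.txt']
```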
arshovon