I have a dataframe column which could look something like this:

s = pd.Series(["a0a1a3", "b1b3", "c1c1c3c3"], index=["A", "B", "C"])

I can use the str.find method to find, in each cell, the indices I want:

s.str.find('1').values
array([3, 1, 1])
s.str.find('3').values
array([5, 3, 5])

However, I cannot work out how to use these functions to slice the strings in that column. For example:

s.str[s.str.find('1').values:s.str.find('3').values].values

gives

array([ nan,  nan,  nan])

What is the right way to combine these functions?

Delosari

1 Answer

Is that what you want?

In [87]: s.str.split('1').str[0]
Out[87]:
A    a
B    b
C    c
dtype: object

In [88]: s.str.split('1').str[1]
Out[88]:
A    a2
B    b2
C    c2
dtype: object

or

In [89]: s.str.split('1', expand=True)
Out[89]:
   0   1
A  a  a2
B  b  b2
C  c  c2

You will find a lot of useful examples on the official pandas documentation site.

UPDATE:

In [203]: s = pd.Series(["a1a2", "b1b2", "c1c2", "aaaaaa1XX"], index=["A", "B", "C", "D"])

In [204]: s
Out[204]:
A         a1a2
B         b1b2
C         c1c2
D    aaaaaa1XX
dtype: object

In [205]: s.str.split('1', expand=True)
Out[205]:
        0   1
A       a  a2
B       b  b2
C       c  c2
D  aaaaaa  XX
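When a cell contains the separator more than once, a plain split produces one extra column per occurrence. A minimal sketch, assuming you only want to cut at the first '1': the `n` parameter of `str.split` limits the number of splits. The series below is the one from the question; the variable name `parts` is only illustrative.

```python
import pandas as pd

s = pd.Series(["a0a1a3", "b1b3", "c1c1c3c3"], index=["A", "B", "C"])

# n=1 stops after the first '1', so every row yields exactly two pieces
parts = s.str.split("1", n=1, expand=True)
print(parts)
```

Without `n=1`, row C would be split at both '1' characters and the result would gain a third column.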

UPDATE2:

In [224]: s
Out[224]:
A      a0a1a3
B        b1b3
C    c1c1c3c3
dtype: object

In [225]: s.str.extract(r'1(.*?)3', expand=False)
Out[225]:
A      a
B      b
C    c1c
dtype: object

NOTE: please always post both the source and the desired data sets, otherwise we have to guess what you are trying to achieve...

MaxU - stand with Ukraine
  • Thank you very much for the reply but it is not exactly that: In this case you know the "1" string is in the same index for all the cells. How would you do it if it wasn't? – Delosari May 09 '17 at 19:40
  • @Delosari, it will work as well - see updated answer ;-) – MaxU - stand with Ukraine May 09 '17 at 19:43
  • Thanks again. I can manage with what you have given me, but I wonder if there is another way. Your method splits the strings in the column each time to get the parts you want, whereas I wanted to use the ".find" method to get the indices I need to slice the strings. Is there no way to use the indices from ".find" in a ".str[idxInidial:idxFinal]" structure? – Delosari May 09 '17 at 19:52
  • @Delosari, most probably it's possible, but it's not an idiomatic way to do that and it will look ugly. The third rule of the Zen of Python says `"Simple is better than complex."` – MaxU - stand with Ukraine May 09 '17 at 19:54
  • I have updated the question for a better description – Delosari May 09 '17 at 19:55