Cutting a string after the last occurrence of a certain character

Question

Can you please help me with the following issue? I have a column called "names" in a pandas DataFrame that contains links to webpages. I need to create a column called "total categories" that contains the part of each link that appears after the last "/" character. Example:

names
https://www1.abc.com/aaa/72566-finance
https://www1.abc.com/aaa1/725-z2
https://www1.abc.com/aaa2/75-z3

total categories
72566-finance
725-z2
75-z3

I tried this code:

def find_index(x):
    return x.rindex('/')

data_pd['total categories'] = data_pd['names'].apply(find_index)

I receive the following error:

AttributeError: 'float' object has no attribute 'rindex'

OK, so what have you tried so far? Does `split("/")[-1]` not work for you? — MattDMo, Jun 17 '22 at 17:52
Does this answer your question? [How to get everything after last slash in a URL?](https://stackoverflow.com/questions/7253803/how-to-get-everything-after-last-slash-in-a-url) — quartzic, Jun 17 '22 at 17:56

Philip Ciunkiewicz · Answer 1 · 2022-06-17T18:01:41.547

1

If you have these set up as columns in a pandas DataFrame, you can do the following:

df['total categories'] = df['names'].str.split('/').str[-1]

This will split the string based on the passed delimiter, '/', and then take the last element of the resulting splits.
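As a rough sketch of why this also avoids the `AttributeError` from the question: that error usually means the column contains missing values (NaN is a float), and the `.str` accessor passes those through as NaN instead of calling a string method on them. The sample frame below, including its missing row, is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three links from the question plus one missing value.
df = pd.DataFrame({
    "names": [
        "https://www1.abc.com/aaa/72566-finance",
        "https://www1.abc.com/aaa1/725-z2",
        np.nan,
        "https://www1.abc.com/aaa2/75-z3",
    ]
})

# Split each string on "/" and keep the last piece; missing rows stay missing.
df["total categories"] = df["names"].str.split("/").str[-1]

result = df["total categories"].tolist()
print(result)
```

Rows with a link get the text after the last "/", and the missing row remains NaN rather than raising.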

edited Jun 17 '22 at 18:01

answered Jun 17 '22 at 17:57


mozway · Accepted Answer · 2022-06-17T18:06:16.060

1

Use str.extract with the r'/([^/]+)$' regex:

df['total categories'] = df['names'].str.extract(r'/([^/]+)$')

output:

                                    names total categories
0  https://www1.abc.com/aaa/72566-finance    72566-finance
1        https://www1.abc.com/aaa1/725-z2           725-z2
2         https://www1.abc.com/aaa2/75-z3            75-z3

regex description:

/       # match a literal /
(       # start capturing
[^/]+   # one or more non-/ characters
)       # end capturing
$       # end of string
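A small sketch of how this pattern behaves on edge cases, assuming a hypothetical Series that also contains a link with a trailing slash. Passing `expand=False` makes `str.extract` return a Series rather than a one-column DataFrame, which can be more convenient when only one group is captured.

```python
import pandas as pd

# Hypothetical sample data; the second link ends with "/" on purpose.
names = pd.Series([
    "https://www1.abc.com/aaa/72566-finance",
    "https://www1.abc.com/aaa/",
    "https://www1.abc.com/aaa2/75-z3",
])

# expand=False returns a Series because the pattern has a single capture group.
last_part = names.str.extract(r"/([^/]+)$", expand=False)

print(last_part.tolist())
```

Because the pattern requires at least one non-"/" character before the end of the string, a link ending in "/" does not match and yields NaN. The split-based approach would give an empty string for that row instead.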

edited Jun 17 '22 at 18:06

answered Jun 17 '22 at 17:58


Thank you for your answer! And is there a way to get what comes after "integer-"? – Alberto Alvarez Jun 17 '22 at 18:10

yes sure, use `r'\d-([^-]+)$'` – mozway Jun 17 '22 at 18:15
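
For illustration, the pattern from this comment could be applied the same way as in the answer. This is only a sketch; the column name `after_integer` is made up here and is not from the thread.

```python
import pandas as pd

df = pd.DataFrame({
    "names": [
        "https://www1.abc.com/aaa/72566-finance",
        "https://www1.abc.com/aaa1/725-z2",
        "https://www1.abc.com/aaa2/75-z3",
    ]
})

# \d-       a digit followed by a hyphen
# ([^-]+)$  capture everything after that hyphen, up to the end of the string
# "after_integer" is a hypothetical column name used only for this example.
df["after_integer"] = df["names"].str.extract(r"\d-([^-]+)$", expand=False)

after = df["after_integer"].tolist()
print(after)
```

Rows where no digit directly precedes the last hyphen would not match and would come back as NaN.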
