
I am trying to loop through the "Team" column and, where a value meets a certain condition, return a slice of the team name with the leading number and "-" removed.

    Team                Player
0   1-Miami Heat        Jimmy Butler
1   2-Boston Celtics    Jason Tatum
2   3-Houston Rockets   James Harden

I am currently using:

def slice(x):
    for elm in x:
        if elm[0] == '1' or '2':
            return elm[2:]

NBA['Team'] = NBA['Team'].apply(slice)

This is returning an empty value for each team.

I would like to return this:
    Team                Player
0   Miami Heat          Jimmy Butler
1   Boston Celtics      Jason Tatum
2   3-Houston Rockets   James Harden
  • try `df['Team'].str.replace('^(1-|2-)', '', regex=True)`, using regex alternation (`|`) – Umar.H Jul 03 '20 at 16:15
  • Does this answer your question? [pandas replace multiple values one column](https://stackoverflow.com/questions/22100130/pandas-replace-multiple-values-one-column) – Umar.H Jul 03 '20 at 16:16

2 Answers


Try this. It removes the number and the '-' from every team:

def slice(x):
    # Keep only the text after the last '-'
    return x.split('-')[-1]

NBA['Team'] = NBA['Team'].apply(slice)

The output will be:

    Team              Player
0   Miami Heat        Jimmy Butler
1   Boston Celtics    Jason Tatum
2   Houston Rockets   James Harden
TechWithVP
  • You have the `str` accessor to simplify string operations – yatu Jul 03 '20 at 16:31
  • While this solution does slice the number and '-', it doesn't use a condition. I need the slice to only happen where the number is 1 or 2 and not 3 – fearthespier Jul 03 '20 at 22:29

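Your function is a bit off. First, the comparison needs an operand on both sides of the `or`: `elm[0] == '1' or '2'` is evaluated as `(elm[0] == '1') or '2'`, and the non-empty string '2' is always truthy. Second, you don't want to loop inside the function: `apply` already passes in one string at a time, so the `for` loop walks through the characters of that string, and `elm[2:]` of a single character is always empty. Finally, you need to return the original value when the condition isn't met, otherwise the function returns `None`.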
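
To see these problems concretely, here is a small sketch that replays the function from the question on a single value, the way `apply` calls it. The name `original_slice` is only an illustrative stand-in for that function.

```python
def original_slice(x):
    # x is ONE string such as '3-Houston Rockets', so this loop walks its characters
    for elm in x:
        # Parsed as (elm[0] == '1') or ('2'); the non-empty string '2' is always truthy
        if elm[0] == '1' or '2':
            # elm is a single character, so elm[2:] is always an empty string
            return elm[2:]

result = original_slice('3-Houston Rockets')
always_true = bool('x' == '1' or '2')

print(repr(result))   # ''
print(always_true)    # True
```

The function returns on the very first character, whatever that character is, which is why every team comes back empty.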

I'd also make it more robust. What if a team number is 11 or 12? A fixed slice would turn 12-Chicago Bulls into -Chicago Bulls. So instead of slicing at a fixed index, split at the '-' (see the final solution at the end).

So adjust the function:

def slice(x):
    if x[0] == '1' or x[0] == '2':
        return x[2:]
    else:
        return x

There are other ways to do it too:

def slice(x):
    if x.startswith('1') or x.startswith('2'):
        return x[2:]
    else:
        return x

Or put the prefixes in a list and pass them to startswith as a tuple:

def slice(x,check_list=['1','2']):
    # str.startswith accepts a tuple of prefixes, so convert the list first
    if x.startswith(tuple(check_list)):
        return x[2:]
    else:
        return x

A more robust version:

import pandas as pd

df = pd.DataFrame({'Team':['1-Miami Heat','2-Boston Celtics','3-Houston Rockets','15-Chicago Bulls'],
                   'Player':['Jimmy Butler','Jason Tatum','James Harden', 'Zach LaVine']})

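def slice(x,check_list=['1','2']):
    # Split only at the first '-' so hyphenated team names stay intact
    val, _, team = x.partition('-')
    # Compare the whole prefix, so '15' does not match '1'
    if val in check_list:
        return team
    else:
        return x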

df['Team'] = df['Team'].apply(slice)
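
As the comments suggest, the `str` accessor can do the same thing without `apply`. The following is a minimal sketch of that alternative, not part of the original answer, assuming the prefixes to strip are exactly `1-` and `2-` at the start of the value; it rebuilds the same sample frame so it runs on its own.

```python
import pandas as pd

df = pd.DataFrame({'Team': ['1-Miami Heat', '2-Boston Celtics',
                            '3-Houston Rockets', '15-Chicago Bulls'],
                   'Player': ['Jimmy Butler', 'Jason Tatum',
                              'James Harden', 'Zach LaVine']})

# ^ anchors the match at the start, (?:1|2) accepts only the prefixes 1 or 2,
# and the literal '-' must follow immediately, so '15-' is left alone
df['Team'] = df['Team'].str.replace(r'^(?:1|2)-', '', regex=True)

teams = df['Team'].tolist()
print(teams)
```

Anchoring the pattern with `^` matters here: without it, a `1-` or `2-` appearing later in a value would be removed as well.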
chitown88