
I have a dataframe to which I want to add a new column whose value in each row depends on the value of another column in that row.

I keep getting TypeError: string indices must be integers, not str.

Here is my dataframe, df. All column values are strings:

ID      Key
_1      A
_2      B, C
_3      A
_4      D, E
_5      B, C

My expected output is

ID      Key      Name
_1      A        n0, n1
_2      B, C     n2
_3      A        n3
_4      D, E     n4
_5      B, C     n5, n6

Here is what I did:

df[df['ID'].str.contains('1')]['Name'] = 'n0, n1'

That gave me the TypeError above.

Note that the ID matching is a substring match, which is intentional.

I also tried numpy.where, following this link, but that gave me the same error.

What is the correct way to set a new column's value based on a subset of another column's values? I will later repeat this for every ID (here, 1 to 5), so all rows end up covered.

Atihska

1 Answer

The following worked for me:

df.loc[df['ID'].str.contains('1'), 'Name'] = 'n0, n1'

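Basically, you need to use .loc[row_index, col_index] = val to modify an existing dataframe in a single indexing step.

Using df[row_index][col_index] = val is chained indexing: df[row_index] can return a temporary copy, so I believe the assignment lands on that copy and never reaches df. pandas usually flags this with a SettingWithCopyWarning.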
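As a quick check of the .loc form, here is a minimal sketch on the sample data from the question. The variable name mask is only for illustration, and the dataframe is rebuilt by hand here, so treat it as an approximation of the real data.

```python
import pandas as pd

# Sample data copied from the question; every value is a string.
df = pd.DataFrame({
    "ID": ["_1", "_2", "_3", "_4", "_5"],
    "Key": ["A", "B, C", "A", "D, E", "B, C"],
})

# Boolean Series: True for rows whose ID contains the substring "1".
mask = df["ID"].str.contains("1")

# One .loc call with the row selector and the column label together.
# "Name" does not exist yet; .loc adds it and leaves other rows as NaN.
df.loc[mask, "Name"] = "n0, n1"

print(df)
```

Only the row with ID _1 gets a value; the remaining rows stay missing until you assign them.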

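You don't strictly have to define the column first, since .loc creates 'Name' if it is missing and leaves unmatched rows as NaN. If you prefer to define it up front, give the empty Series an explicit dtype:

df['Name'] = pd.Series(dtype='object')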
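Since the question mentions doing this for every ID, one possible way to cover them all is to loop over a mapping from substring to name and reuse the same .loc assignment. The dict name_by_substring is a hypothetical helper built from the expected output in the question, and regex=False is my assumption that the patterns are plain substrings rather than regular expressions.

```python
import pandas as pd

# Sample data copied from the question; every value is a string.
df = pd.DataFrame({
    "ID": ["_1", "_2", "_3", "_4", "_5"],
    "Key": ["A", "B, C", "A", "D, E", "B, C"],
})

# Hypothetical mapping: substring to look for in ID -> value for Name.
name_by_substring = {
    "1": "n0, n1",
    "2": "n2",
    "3": "n3",
    "4": "n4",
    "5": "n5, n6",
}

# Start with an empty object column so unmatched rows are clearly missing.
df["Name"] = pd.Series(dtype="object")

for substring, name in name_by_substring.items():
    # regex=False treats the pattern as a literal substring.
    matches = df["ID"].str.contains(substring, regex=False)
    df.loc[matches, "Name"] = name

print(df)
```

Because this is a substring match, an ID such as _12 would match both "1" and "2", and the later assignment in the loop would win, so order the mapping accordingly.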
G. Larkham