I import a CSV as a DataFrame using:

import numpy as np
import pandas as pd

df = pd.read_csv("test.csv")

Then I'm trying to do a simple replace based on IDs:
df.loc[df.ID == 103, ['fname', 'lname']] = 'Michael', 'Johnson'

I get the following error:

AttributeError: 'list' object has no attribute 'loc'

Note, when I do print pd.__version__ I get 0.12.0, so it's not a problem (at least as far as I understand) with having a pre-0.11 version. Any ideas?

Parseltongue
  • That syntax works fine for me - if you provide a reproducible example then it would be easier to help, since it may depend on an issue with the data in the csv file. – Peter Fine Oct 09 '13 at 10:44
  • Yes also works for me, a sample of your csv may be helpful. – C Mars Oct 09 '13 at 11:20

3 Answers

To pick up from the comment: "I was doing this:"

df = [df.hc== 2]

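What you create there is a Python list wrapped around a "mask": the mask itself is a boolean Series that says which rows fulfilled your condition, but the square brackets turn df into a list, which is why df.loc raises the AttributeError.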

To filter your dataframe on your condition you want to do this:

df = df[df.hc == 2]

A bit more explicit is this:

mask = df.hc == 2
df = df[mask]
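To make the difference concrete, here is a minimal sketch. The sample data and the names wrapped, filtered and error_message are made up for illustration; only the column name hc comes from the comment quoted above.

```python
import pandas as pd

# Hypothetical data; only the column name "hc" is taken from the quoted comment.
df = pd.DataFrame({"hc": [1, 2, 2], "fname": ["Ann", "Bob", "Cy"]})

wrapped = [df.hc == 2]   # a plain Python list holding one boolean Series
mask = df.hc == 2        # the boolean Series itself
filtered = df[mask]      # a DataFrame with only the rows where hc == 2

# A list has no .loc attribute, which reproduces the error from the question.
try:
    wrapped.loc
except AttributeError as exc:
    error_message = str(exc)

print(type(wrapped), type(filtered))
print(error_message)
```

Checking type(df) right after such a line is usually enough to spot the problem.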

If you want to keep the entire dataframe and only replace specific values, there are methods such as replace (see "Python pandas equivalent for replace"). Another method, which performs well, is to create a separate DataFrame with the from/to values as columns and use pd.merge to combine it into the existing DataFrame. Using your mask to set values is also possible:

df.loc[mask, 'fname'] = 'Johnson'

But for a larger set of replacements you would want to use one of the two other methods, or use apply with a lambda function (for value transformations). Last but not least: you can use .fillna('bla') to quickly fill in NA values.
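As a rough sketch of the alternatives mentioned above (replace, a from/to table combined with pd.merge, and fillna), something like the following should work. The data, the lookup table and the column name fname_new are invented for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"ID": [101, 103, 105], "fname": ["Ann", "Mike", None]})

# 1) replace with a from/to mapping on one column
replaced = df["fname"].replace({"Mike": "Michael"})

# 2) a separate from/to DataFrame combined with pd.merge
lookup = pd.DataFrame({"fname": ["Mike"], "fname_new": ["Michael"]})
merged = pd.merge(df, lookup, on="fname", how="left")
# Take the new value where the lookup matched, otherwise keep the old one.
merged["fname"] = merged["fname_new"].fillna(merged["fname"])
merged = merged.drop(columns="fname_new")

# 3) fillna to fill in missing values
filled = df["fname"].fillna("unknown")
```

The merge variant pays off mainly when the from/to table is large or already lives in its own file; for a handful of values, replace is simpler.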

Carst

The traceback indicates to you that df is a list and not a DataFrame as expected in your line of code.

It means that between df = pd.read_csv("test.csv") and df.loc[df.ID == 103, ['fname', 'lname']] = 'Michael', 'Johnson' you have other lines of code that assign a list object to df. Review that part of your code to find the bug.
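One way to track this down is to check the type of df just before the failing line. The sketch below builds a small stand-in frame instead of reading test.csv, whose contents are unknown, so the data is hypothetical; it also passes the new values as a list rather than a bare tuple.

```python
import pandas as pd

# Stand-in for pd.read_csv("test.csv"); the rows here are hypothetical.
df = pd.DataFrame({
    "ID": [101, 103],
    "fname": ["Ann", "Mike"],
    "lname": ["Lee", "Jonson"],
})

# If df was rebound to a list somewhere in between, this fails and shows the type.
assert isinstance(df, pd.DataFrame), type(df)

# With a real DataFrame, the assignment from the question goes through.
df.loc[df.ID == 103, ["fname", "lname"]] = ["Michael", "Johnson"]
```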

Zeugma

@Boud's answer is correct. Assignment with .loc works fine if the length of the right-hand side matches the number of elements being replaced:

In [55]: from pandas import DataFrame

In [56]: df = DataFrame(dict(A=[1,2,3], B=[4,5,6], C=[7,8,9]))

In [57]: df
Out[57]: 
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9

In [58]: df.loc[1,['A','B']] = -1,-2

In [59]: df
Out[59]: 
   A  B  C
0  1  4  7
1 -1 -2  8
2  3  6  9
Jeff