>>> df = pd.DataFrame({'num_legs': [4, 2], 'num_wings': [0, 2]},
...                   index=['dog', 'hawk'])
>>> df
      num_legs  num_wings
dog          4          0
hawk         2          2
>>> for row in df.itertuples():
...     print(row)
...
Pandas(Index='dog', num_legs=4, num_wings=0)
Pandas(Index='hawk', num_legs=2, num_wings=2)

I am parsing an Excel sheet with pandas.DataFrame.itertuples, which yields one row on each iteration. Consider the rows returned by the loop shown above.

From each row, for example Pandas(Index='dog', num_legs=4, num_wings=0), I would like to access the values by column name, such as num_legs. However, when I use the name as a key I get the exception below.

TypeError: tuple indices must be integers, not str

Could someone explain how to retrieve the data from each row using the column headers directly?

meW
Krishna Oza

3 Answers

I hit the same error when indexing the row with a column name held in a variable.

v = 'num_legs'
for row in df.itertuples():
    print(row[v])

TypeError: tuple indices must be integers or slices, not str

To keep using df.itertuples() when the attribute name is held in a variable, look it up with getattr:

v = 'num_legs'
for row in df.itertuples():
    print(getattr(row, v))
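As a small sketch building on the snippet above (not part of the original answer): each row from itertuples() is a namedtuple, so besides getattr it can be converted with `_asdict()` when string keys are preferred. The `legs` list is only there for illustration.

```python
import pandas as pd

df = pd.DataFrame({'num_legs': [4, 2], 'num_wings': [0, 2]},
                  index=['dog', 'hawk'])

v = 'num_legs'
legs = []  # hypothetical accumulator, used only to show the result
for row in df.itertuples():
    # row is a namedtuple, so a field can be looked up by name with getattr
    by_attr = getattr(row, v)
    # _asdict() returns a mapping keyed by field name, so string keys work
    by_key = row._asdict()[v]
    assert by_attr == by_key
    legs.append(by_attr)

print(legs)  # [4, 2]
```

Plain `row[v]` still fails because tuples only accept integer or slice indices.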

As a final note, df.itertuples() is generally faster than df.iterrows().
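If you want to check the speed difference on your own data, a rough timing sketch could look like the following. The frame size and repeat count are arbitrary assumptions, and the numbers will vary by machine and pandas version, so no expected timings are shown.

```python
import timeit

import pandas as pd

# Hypothetical larger frame so the difference is measurable
df = pd.DataFrame({'num_legs': [4, 2] * 500, 'num_wings': [0, 2] * 500})


def with_itertuples():
    # Each row is a namedtuple; columns are read as attributes
    return [row.num_legs for row in df.itertuples()]


def with_iterrows():
    # Each row is a Series built per iteration; columns are read by key
    return [row['num_legs'] for _, row in df.iterrows()]


# Both approaches should produce the same values
assert with_itertuples() == with_iterrows()

t_tuples = timeit.timeit(with_itertuples, number=5)
t_rows = timeit.timeit(with_iterrows, number=5)
print(f"itertuples: {t_tuples:.4f}s, iterrows: {t_rows:.4f}s")
```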

Mohit Musaddi

Access the column as an attribute of the row:

for row in df.itertuples():
    print(row.num_legs)
    # print(row.num_wings)  # other column values

# Output
4
2
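One caveat worth sketching, since it often comes up with headers read from Excel or CSV files: attribute access only works for column names that are valid Python identifiers. As far as I know, itertuples() renames other columns to positional names. The column name 'num legs' below is a made-up example.

```python
import pandas as pd

# 'num legs' (with a space) is a hypothetical header that is not a valid identifier
df = pd.DataFrame({'num legs': [4, 2], 'num_wings': [0, 2]},
                  index=['dog', 'hawk'])

rows = list(df.itertuples())
# The invalid name is replaced by a positional one
print(rows[0]._fields)  # ('Index', '_1', 'num_wings')
print(rows[0]._1)       # 4

# Renaming the columns first restores readable attribute access
clean = df.rename(columns=lambda c: c.replace(' ', '_'))
clean_legs = [row.num_legs for row in clean.itertuples()]
print(clean_legs)       # [4, 2]
```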
meW
  • accepting this since I was using itertuples to iterate over data frames. – Krishna Oza Feb 19 '19 at 12:35
  • I tried the same when reading a CSV with `read_csv`; however, the first row after the comments in the CSV is not treated as column names, and I get an exception when using `row["columnHeader"]`. – Krishna Oza Feb 22 '19 at 06:35
  • That's a separate question, which you should raise, but as a hint, play with the `header` argument. – meW Feb 22 '19 at 06:36
  • I tried the `header` argument; unfortunately, the CSV has extra column data apart from the column header, so parsing fails when `header` is used. – Krishna Oza Feb 22 '19 at 07:16
  • @darth_coder Then I suggest you should ask a separate question, by listing only this problem with proper explanation. – meW Feb 22 '19 at 07:18

You could use iterrows(), which yields each row as a Series that can be indexed by column name:

for u, row in df.iterrows():
    print(u)
    print(row)
    print(row['num_legs'])

Output:

dog
num_legs     4
num_wings    0
Name: dog, dtype: int64
4
hawk
num_legs     2
num_wings    2
Name: hawk, dtype: int64
2
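One thing to keep in mind with iterrows(), sketched below: because each row becomes a single Series, columns of different dtypes are upcast to a common dtype, whereas itertuples() keeps each value's own type. The `weight_kg` column is a hypothetical addition to show the effect.

```python
import pandas as pd

# weight_kg is a hypothetical float column added for illustration
df = pd.DataFrame({'num_legs': [4, 2], 'weight_kg': [30.5, 1.1]},
                  index=['dog', 'hawk'])

# iterrows: the row Series mixes int and float, so it is upcast to float64
_, first = next(df.iterrows())
print(first.dtype)
print(first['num_legs'])  # the integer 4 comes back as a float

# itertuples: each field keeps the type of its own column
first_tuple = next(df.itertuples())
print(first_tuple.num_legs)
```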
Mohamed Thasin ah
  • This answer is also correct and I would now use `iterrows` while coding rather than `itertuples` since the way data is accessed mimics array index operator. – Krishna Oza Feb 19 '19 at 12:36