
Say I have a dataframe df

import pandas as pd
df = pd.DataFrame()

and I have the following tuple and value:

column_and_row = ('bar', 'foo')
value = 56

How can I most easily add this tuple to my dataframe so that:

df['bar']['foo'] 

returns 56?

What if I have a list of such tuples and list of values? e.g.

columns_and_rows = [A, B, C, ...]
values = [5, 10, 15]

where A, B and C are tuples of columns and rows (similar to column_and_row).

Along the same lines, how would this be done with a Series? E.g.:

import pandas as pd
srs = pd.Series()

and I want to add one item to it with index 'foo' and value 2 so that:

srs['foo'] 

returns 2?

Note: I know that none of these are efficient ways of creating dataframes or series, but I need a solution that allows me to grow my structures organically in this way when I have no other choice.

Amelio Vazquez-Reina

1 Answer


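For a Series, you can do it with append (removed in pandas 2.0, where pd.concat does the same job), but you have to create a Series from your value first:

>>> import pandas as pd
>>> x = pd.Series([1, 2, 3], index=["A", "B", "C"])
>>> print(x)
A    1
B    2
C    3
dtype: int64
>>> print(pd.concat([x, pd.Series([8, 9], index=["foo", "bar"])]))
A      1
B      2
C      3
foo    8
bar    9
dtype: int64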
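For the single-item case in the question (index 'foo', value 2), plain label assignment may be enough. The following is a minimal sketch, assuming a reasonably recent pandas; the explicit dtype and the labels 'bar' and 'baz' are illustrative choices, not part of the question.

```python
import pandas as pd

# Start from an empty Series; an explicit dtype avoids relying on the default.
srs = pd.Series(dtype="int64")

# Assigning to a label that does not exist yet adds it
# (pandas calls this "setting with enlargement").
srs["foo"] = 2

# To add several items at once, build a second Series and concatenate.
# The labels "bar" and "baz" are made up for illustration.
more = pd.Series([8, 9], index=["bar", "baz"])
srs = pd.concat([srs, more])

print(srs["foo"])
```

Note that pd.concat returns a new Series, so the result has to be assigned back to srs.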

For a DataFrame, you can also use concat (or append, in pandas versions before 2.0), but it doesn't make sense to do this for a single cell only. DataFrames are tabular, so you can only add a whole row or a whole column. The documentation has plenty of examples and there are other questions about this.

Edit: Apparently you actually can set a single value with df.set_value('newRow', 'newCol', newVal). However, if that row/column doesn't already exist, this will actually create an entire new row and/or column, with the rest of the values in the created row/column filled with NaN. Note that in this case a new object will be returned, so you'd have to do df = df.set_value('newRow', 'newCol', newVal) to modify the original. (set_value was deprecated in pandas 0.21 and removed in 1.0; in current versions, df.loc['newRow', 'newCol'] = newVal enlarges the frame in place in the same way.)
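As a rough sketch of the cell-by-cell approach in a pandas version without set_value, label-based assignment through .loc can create the missing row and column. The tuples in columns_and_rows below are made up for illustration; the question leaves A, B and C unspecified.

```python
import pandas as pd

df = pd.DataFrame()

column_and_row = ("bar", "foo")
value = 56

col, row = column_and_row
# .loc takes (row label, column label). Labels that do not exist yet are
# created, and any other cells in the new row/column are filled with NaN.
df.loc[row, col] = value

# The same pattern handles a list of (column, row) tuples with matching values.
# These example tuples are hypothetical.
columns_and_rows = [("bar", "baz"), ("qux", "foo"), ("qux", "baz")]
values = [5, 10, 15]
for (col, row), val in zip(columns_and_rows, values):
    df.loc[row, col] = val

print(df["bar"]["foo"])
```

Because new cells start out as NaN, the columns will typically end up with a float dtype even when every value assigned is an integer.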

However, no matter how you do it, this is going to be inefficient. Pandas data structures are based on NumPy and are fundamentally reliant on knowing the size of the array ahead of time. You can add rows and columns, but every time you do so, an entirely new data structure is created, so if you do this a lot, it will be slower than using ordinary Python lists/dicts.
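If the values arrive one at a time but the DataFrame is only needed at the end, one option is to grow a plain dict and convert once. This is a sketch under that assumption; the name cells and the example tuples are hypothetical.

```python
import pandas as pd

# Hypothetical (column, row) tuples and their values.
columns_and_rows = [("bar", "foo"), ("bar", "baz"), ("qux", "foo")]
values = [56, 5, 10]

# Grow a nested dict of the form {column: {row: value}}.
# Inserting into a dict does not copy the existing data.
cells = {}
for (col, row), val in zip(columns_and_rows, values):
    cells.setdefault(col, {})[row] = val

# Build the DataFrame once at the end; cells never assigned become NaN.
df = pd.DataFrame(cells)

print(df["bar"]["foo"])
```

With a nested dict, pandas treats the outer keys as column labels and the inner keys as row labels, which matches the df['bar']['foo'] access pattern from the question.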

BrenBarn