I would like to insert a row into an empty DataFrame. However, this fails when the DataFrame has a predefined index and the row's elements include a tuple or list, raising the error:

ValueError: setting an array element with a sequence.

The example code is as follows:

import pandas as pd

df = pd.DataFrame(columns=['idx1', 'idx2', 'col1', 'col2', 'col3'])
df.set_index(['idx1', 'idx2'], inplace=True)
df.loc[(1, 2), :] = [3, 4, (5, 6)]  # this assignment raises the ValueError
print(df)
Nick ODell
Christian

2 Answers

From the list alone, it is not clear to pandas that each element corresponds to a value in a different column. You can first convert the list to a Series indexed by the DataFrame's columns:

df = pd.DataFrame(columns=['idx1', 'idx2', 'col1', 'col2', 'col3'])
df.set_index(['idx1', 'idx2'], inplace=True)
df.loc[(1,2),:] = pd.Series([3,4,(5,6)], index=df.columns)
print(df)
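
To see why the labelled Series removes the ambiguity, here is a minimal sketch (the names `columns` and `row` are illustrative and not from the original post). It only inspects the Series itself: the tuple is kept as a single object-dtype element under the label col3 instead of being unpacked.

```python
import pandas as pd

# Hypothetical names for illustration; the labels mirror the non-index columns above.
columns = ['col1', 'col2', 'col3']
row = pd.Series([3, 4, (5, 6)], index=columns)

print(row.dtype)    # object
print(row['col3'])  # (5, 6)
```

Because every value carries an explicit column label, the assignment no longer has to guess how the nested sequence maps onto the columns.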
gofvonx
  • Great, thank you very much. Exactly what I need! However, I still feel that this is a workaround and should probably be handled by loc directly? – Christian Oct 06 '18 at 18:24
  • @Christian I agree that it looks like a workaround, since you do not assign your original variable but need to convert it first. You can, however, also view it as just explicitly stating where you want your values to go. (*Explicit is better than implicit.*) Not sure if this also holds from a performance viewpoint. – gofvonx Oct 06 '18 at 18:34

I tried something like this: insert a dummy element first, then overwrite that column using apply.

import pandas as pd

def with_return(row):
    # Return the sequence to store in col3 for this row.
    return [5, 6]

df = pd.DataFrame(columns=['idx1', 'idx2', 'col1', 'col2', 'col3'])
df.set_index(['idx1', 'idx2'], inplace=True)
df.loc[(1, 2), :] = [3, 4, 5]  # 5 is a dummy element for col3
df['col3'] = df.apply(with_return, axis=1)  # overwrite the dummy row by row
print(df)

Or simply use a Series:

df.loc[(1,2),:] = pd.Series([3,4,(5,6)], index=df.columns)

This still does not directly insert a tuple as an element into an empty DataFrame; it is just another way to do it. Still, loc should be able to handle it.
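
Another possible route, not taken from either answer and offered only as a sketch: if the row is known up front, the one-row DataFrame can be built directly with its MultiIndex instead of being enlarged through loc. The name `index` is an illustrative choice. Wrapping the tuple in a one-element list marks it as a single cell value for col3.

```python
import pandas as pd

# Illustrative alternative: construct the row together with its two-level index.
index = pd.MultiIndex.from_tuples([(1, 2)], names=['idx1', 'idx2'])
df = pd.DataFrame(
    {'col1': [3], 'col2': [4], 'col3': [(5, 6)]},  # each list holds one cell per row
    index=index,
)
print(df)
```

This sidesteps the assignment path entirely, so it does not address whether loc itself should accept the plain list.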