Problem
Pandas seems to support using df.loc to assign a dictionary to a row entry, like the following:
df = pd.DataFrame(columns = ['a','b','c'])
entry = {'a':'test', 'b':1, 'c':float(2)}
df.loc[0] = entry
As expected, Pandas inserts the dictionary values into the corresponding columns, matched by dictionary key. Printing the DataFrame gives:
a b c
0 test 1 2.0
However, if you overwrite the same entry by running df.loc[0] = entry a second time, Pandas assigns the dictionary keys instead of the dictionary values. Printing the DataFrame now gives:
a b c
0 a b c
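One observation that may or may not be related: iterating over a Python dictionary yields its keys rather than its values, which matches what ends up in the row. The sketch below only illustrates that language behavior; it does not show what Pandas does internally.

```python
entry = {'a': 'test', 'b': 1, 'c': float(2)}

# Iterating over a dict (or converting it to a list) yields the keys.
keys_from_iteration = list(entry)

# The values have to be requested explicitly.
values_from_dict = list(entry.values())

print(keys_from_iteration)  # ['a', 'b', 'c']
print(values_from_dict)     # ['test', 1, 2.0]
```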
Question
Why does this happen?
Specifically, why does this only happen on the second assignment? All subsequent assignments revert to the original result, containing (almost) the expected values:
a b c
0 test 1 2
I say almost because the dtype on c is actually object instead of float for all subsequent results.
As far as I can tell, this happens whenever the dictionary contains both a string and a float. It does not occur with only a string and an integer, or only an integer and a float.
Example Code
import pandas as pd
df = pd.DataFrame(columns = ['a','b','c'])
print(f'empty df:\n{df}\n\n')
entry = {'a':'test', 'b':1, 'c':float(2)}
print(f'dictionary to be entered:\n{entry}\n\n')
df.loc[0] = entry
print(f'df after entry:\n{df}\n\n')
df.loc[0] = entry
print(f'df after second entry:\n{df}\n\n')
df.loc[0] = entry
print(f'df after third entry:\n{df}\n\n')
df.loc[0] = entry
print(f'df after fourth entry:\n{df}\n\n')
This gives the following printout:
empty df:
Empty DataFrame
Columns: [a, b, c]
Index: []
dictionary to be entered:
{'a': 'test', 'b': 1, 'c': 2.0}
df after entry:
a b c
0 test 1 2.0
df after second entry:
a b c
0 a b c
df after third entry:
a b c
0 test 1 2
df after fourth entry:
a b c
0 test 1 2
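For comparison, here is a minimal sketch of an assignment that does not depend on how df.loc handles a raw dictionary: the dictionary is first converted to a pd.Series indexed by the DataFrame's columns. The helper name assign_row is hypothetical, and this is only a possible workaround under that assumption, not an explanation of the behavior above.

```python
import pandas as pd

def assign_row(df, label, entry):
    """Write `entry` into row `label`, matching dictionary keys to column names.

    `assign_row` is a hypothetical helper, not part of pandas.
    """
    # Building the Series with index=df.columns orders the values by column
    # name; keys missing from `entry` become NaN.
    df.loc[label] = pd.Series(entry, index=df.columns)
    return df

df = pd.DataFrame(columns=['a', 'b', 'c'])
entry = {'a': 'test', 'b': 1, 'c': float(2)}

# Assign the same row several times, as in the example above.
for _ in range(4):
    assign_row(df, 0, entry)

print(df)
print(df.dtypes)
```

The dtypes are printed rather than assumed, since the resulting column types may differ between Pandas versions.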