In Python pandas I have two dataframes:
df_aaa:
date data otherdata symbol
2015/1/1 11 12 aaa
2015/2/1 21 22 aaa
2015/3/1 31 31 aaa
df_all:
2015/1/1 31 31 bbb
Currently both are indexed by date.
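For anyone who wants to reproduce this, the two frames can be built roughly as follows. This is only a sketch under two assumptions: that df_all has the same columns as df_aaa (its header is not shown above), and that the dates are year/month/day.

```python
import pandas as pd

# Assumption: dates are written year/month/day.
DATE_FORMAT = "%Y/%m/%d"

df_aaa = pd.DataFrame(
    {
        "date": pd.to_datetime(["2015/1/1", "2015/2/1", "2015/3/1"], format=DATE_FORMAT),
        "data": [11, 21, 31],
        "otherdata": [12, 22, 31],
        "symbol": ["aaa", "aaa", "aaa"],
    }
).set_index("date")

# Assumption: df_all uses the same column names as df_aaa.
df_all = pd.DataFrame(
    {
        "date": pd.to_datetime(["2015/1/1"], format=DATE_FORMAT),
        "data": [31],
        "otherdata": [31],
        "symbol": ["bbb"],
    }
).set_index("date")

print(df_aaa)
print(df_all)
```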
I want to append df_aaa to df_all and give the result a composite index of symbol and date. How do I do that?
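To make the goal concrete, here is one possible sketch of the result I am after. It uses pd.concat rather than DataFrame.append, because DataFrame.append was removed in pandas 2.0. The make_frame helper is hypothetical (only there to rebuild the sample data), and the year/month/day date format and df_all's column names are assumptions.

```python
import pandas as pd

def make_frame(rows):
    """Hypothetical helper: build a date-indexed frame from sample rows."""
    df = pd.DataFrame(rows, columns=["date", "data", "otherdata", "symbol"])
    # Assumption: dates are year/month/day.
    df["date"] = pd.to_datetime(df["date"], format="%Y/%m/%d")
    return df.set_index("date")

df_aaa = make_frame([
    ("2015/1/1", 11, 12, "aaa"),
    ("2015/2/1", 21, 22, "aaa"),
    ("2015/3/1", 31, 31, "aaa"),
])
df_all = make_frame([("2015/1/1", 31, 31, "bbb")])

combined = (
    pd.concat([df_all, df_aaa])     # stack the rows of both frames
    .reset_index()                  # move the date index back into a column
    .set_index(["symbol", "date"])  # build the composite (symbol, date) index
    .sort_index()                   # optional: keep lookups tidy and fast
)
print(combined)
```

A row can then be looked up with a (symbol, date) tuple, for example combined.loc[("aaa", pd.Timestamp("2015-02-01"))].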
Basically, the following are all parts of one question: how do I set a multi-index and use it when appending? Can I do it with a different column order? Do I need to refresh anything? Specifically:
I'm not sure whether a multi-index is a single index made up of multiple 'columns' (or rows), or the ability to have more than one index (any of which could span multiple columns or rows). Or are both correct?
Must I first set the index of both dataframes to a multi-index for the append to work? (Otherwise I'll have duplicate index values for different symbols.)
Do I have to "drop" the existing index before creating the new one?
Is there such a thing as a dataframe with data but no index?
Must a (single) index be of unique values?
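The duplicate-date situation behind these questions can be reproduced with the sketch below. As far as I can tell, pandas accepts repeated index values, and reset_index() leaves a default integer RangeIndex rather than no index at all. The sample frames are rebuilt here under the same assumptions as before (year/month/day dates, df_all sharing df_aaa's column names).

```python
import pandas as pd

# Assumption: dates are year/month/day; df_all shares df_aaa's columns.
df_aaa = pd.DataFrame(
    {"data": [11, 21, 31], "otherdata": [12, 22, 31], "symbol": ["aaa"] * 3},
    index=pd.to_datetime(["2015/1/1", "2015/2/1", "2015/3/1"], format="%Y/%m/%d"),
)
df_all = pd.DataFrame(
    {"data": [31], "otherdata": [31], "symbol": ["bbb"]},
    index=pd.to_datetime(["2015/1/1"], format="%Y/%m/%d"),
)
df_aaa.index.name = "date"
df_all.index.name = "date"

# Stacking the frames keeps the date index, so 2015-01-01 now appears twice.
stacked = pd.concat([df_all, df_aaa])
print(stacked.index.is_unique)

# Selecting a repeated label returns every matching row.
same_day = stacked.loc[pd.Timestamp("2015-01-01")]
print(same_day)

# reset_index() moves the dates into a column and leaves a RangeIndex behind.
flat = stacked.reset_index()
print(type(flat.index).__name__)
```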
When do I use each of the following dataframe methods: set_index(), reindex(), reset_index(), set_level, reset_level?
And what is the default behavior when I give these methods an array? The docs are daunting, and I can't make heads or tails of them. Some good examples would help.
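As a rough illustration (a sketch, not an authoritative answer), this is how three of those methods appear to differ on a small frame. It covers only set_index(), reset_index() and reindex(), which are the names from the list that I am sure exist as DataFrame methods. The sample values and the year/month/day date format are assumptions.

```python
import pandas as pd

df = pd.DataFrame(
    {
        # Assumption: dates are year/month/day.
        "date": pd.to_datetime(["2015/1/1", "2015/2/1", "2015/3/1"], format="%Y/%m/%d"),
        "data": [11, 21, 31],
        "symbol": ["aaa", "aaa", "aaa"],
    }
)

# set_index: turn an existing column into the index (the column is dropped by default).
by_date = df.set_index("date")

# reset_index: the reverse; the index becomes an ordinary column again.
back = by_date.reset_index()

# reindex: keep the same index column, but conform the rows to a new set of labels.
# Labels missing from the original index produce rows filled with NaN.
wanted = pd.to_datetime(["2015-02-01", "2015-04-01"])
realigned = by_date.reindex(wanted)

print(by_date)
print(back)
print(realigned)
```

So set_index() and reset_index() change which data serves as the index, while reindex() changes which rows are present for a given index.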
Do I have to add anything (like axis=1) when setting the index?
How do I set the index to be the data in a column? (And why does using ['symbol', 'date'] as a parameter sometimes give me a new column with those two values, instead of setting the index from the existing columns with those two names?)
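One thing that may be relevant here: set_index() looks its keys up among the columns, so when date is already the index (as in my frames), asking for ['symbol', 'date'] raises a KeyError. The sketch below shows that, plus one way to add symbol to the existing date index with append=True, with no axis argument involved. The sample data and the year/month/day date format are assumptions.

```python
import pandas as pd

df_aaa = pd.DataFrame(
    {"data": [11, 21, 31], "otherdata": [12, 22, 31], "symbol": ["aaa"] * 3},
    # Assumption: dates are year/month/day.
    index=pd.to_datetime(["2015/1/1", "2015/2/1", "2015/3/1"], format="%Y/%m/%d"),
)
df_aaa.index.name = "date"

# 'date' is the index here, not a column, so it cannot be used as a key.
try:
    df_aaa.set_index(["symbol", "date"])
    raised = False
except KeyError:
    raised = True
print("KeyError raised:", raised)

# append=True keeps the existing date index and adds symbol as a second level.
indexed = df_aaa.set_index("symbol", append=True)

# The levels come out as (date, symbol); reorder them to (symbol, date).
indexed = indexed.reorder_levels(["symbol", "date"])
print(indexed)
```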
After I append, and assuming the old index is correct, do I need to 'update' the index (perhaps using reindex)? Or, since I told the dataframe that the index is in a certain column, is my data already correctly indexed?
And since my dataframes (will) have indices on the same column names, can I append df_aaa to df_all even if df_all was originally defined with its columns in a different order (say ['symbol', 'date', 'data', 'otherdata'], with symbol as the first column)?
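A small experiment along these lines suggests that column order does not matter, since pd.concat matches columns by name. This is a sketch under assumptions: df_all uses the column order quoted above, the dates are year/month/day, and the swapped copy of df_aaa is a made-up variant added only to test a differing order of the data columns. No separate reindex step is used after the concat; sort_index() is optional.

```python
import pandas as pd

FMT = "%Y/%m/%d"  # Assumption: dates are year/month/day.

df_aaa = pd.DataFrame(
    {
        "date": pd.to_datetime(["2015/1/1", "2015/2/1", "2015/3/1"], format=FMT),
        "data": [11, 21, 31],
        "otherdata": [12, 22, 31],
        "symbol": ["aaa", "aaa", "aaa"],
    }
)

# df_all defined with symbol first, as in the question.
df_all = pd.DataFrame(
    {
        "symbol": ["bbb"],
        "date": pd.to_datetime(["2015/1/1"], format=FMT),
        "data": [31],
        "otherdata": [31],
    }
)

keys = ["symbol", "date"]
all_indexed = df_all.set_index(keys)
aaa_indexed = df_aaa.set_index(keys)

# Hypothetical variant: the same aaa rows with the data columns swapped.
aaa_swapped = aaa_indexed[["otherdata", "data"]]

# Columns are aligned by name, so the swapped order still lands correctly.
combined = pd.concat([all_indexed, aaa_swapped]).sort_index()
print(combined)
```

Note that both frames are given the same level order, (symbol, date), before concatenating.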