I am trying to concatenate two data frames in a Jupyter notebook, and to do this I first set the index of each data frame to the date. When I do this, it looks like I get a multilevel index, with the "3_Mo" column label on a separate level from "Date". Do I have to change this somehow?

Once both data frames are formatted and their indexes have the same number of levels, I should be able to just use pd.concat([df1, df2]), right?

Data Frame 1:

       Date     3_Mo
0   1990-01-02  7.83
1   1990-01-03  7.89
2   1990-01-04  7.84
3   1990-01-05  7.79
4   1990-01-08  7.79

Data Frame 2:

       Date      3_MO
0   1990-01-02  8.375
1   1990-01-03  8.375
2   1990-01-04  8.375
3   1990-01-05  8.375
4   1990-01-08  8.375

Data Frame 1 After set_index('Date'):

            3_Mo
    Date    
1990-01-02  7.83
1990-01-03  7.89
1990-01-04  7.84
1990-01-05  7.79
1990-01-08  7.79

The goal format: How do I get this?

              MDW     ORD
2000-01-01  4384.0  22474.0
2000-02-01  4185.0  21607.0
2000-03-01  4671.0  24535.0
2000-04-01  4419.0  23108.0
2000-05-01  4552.0  23292.0

I would like to end up with just one row index and one column index.

Picture of the Data Frames

marc_s
  • Can you provide samples of both input dataframes or create some toy dataframe with the problem and show the expected output from this data? – Scott Boston Aug 16 '18 at 18:30
  • @Ethan, it's easier for the community to reproduce your problem and find a solution if you provide a [minimal, complete and verifiable example](https://stackoverflow.com/help/mcve). You can find a great guide on how to do that specifically for pandas [here](https://stackoverflow.com/a/32536193/7480990). Make sure you show us your approach, what your input is and what your desired output would look like. – tobsecret Aug 16 '18 at 18:46
  • How do I copy the output of a Jupyter notebook into a code data frame to post in this question? –  Aug 16 '18 at 18:49
  • Given that your DataFrame is called `df`, you can use `df.to_dict()` to get a dictionary representation of it. Then in your code example you would copy the output of that, so that it looks like this: `df = pd.DataFrame({'your':'copied', 'dict': 'goes here'})` – tobsecret Aug 16 '18 at 18:51
  • Your `Data Frame 1 After set_index('Date'):` doesn't have a multi-index, the index just has a name. – tobsecret Aug 16 '18 at 19:36
  • Oh, I understand what you are saying now. –  Aug 16 '18 at 19:51
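
To illustrate the comment above about the index only having a name, here is a minimal sketch built from the first rows shown in the question. The variable names `df1`, `df2` and `combined` are assumptions, and the frames are typed in by hand rather than taken from the asker's notebook. It checks that `set_index('Date')` yields a single-level index, then places the two frames side by side with `pd.concat(..., axis=1)`, which seems to be what the goal format calls for.

```python
import pandas as pd

# Toy data copied from the first rows in the question (names are assumptions).
dates = ["1990-01-02", "1990-01-03", "1990-01-04", "1990-01-05", "1990-01-08"]
df1 = pd.DataFrame({"Date": dates, "3_Mo": [7.83, 7.89, 7.84, 7.79, 7.79]})
df2 = pd.DataFrame({"Date": dates, "3_MO": [8.375] * 5})

# Parse the dates and use them as the row index.
df1["Date"] = pd.to_datetime(df1["Date"])
df2["Date"] = pd.to_datetime(df2["Date"])
df1 = df1.set_index("Date")
df2 = df2.set_index("Date")

# The index carries the name "Date" but has only one level.
print(df1.index.nlevels)                     # 1
print(isinstance(df1.index, pd.MultiIndex))  # False

# axis=1 aligns rows on the shared date index and puts the columns side by side.
combined = pd.concat([df1, df2], axis=1)

# Optional: clear the index name so no "Date" label is printed above the rows.
combined.index.name = None
print(combined)
```

Note that the default `pd.concat([df1, df2])` stacks rows (axis=0); because the two column labels differ in case ("3_Mo" vs "3_MO"), that would produce two half-empty columns rather than the goal layout.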

2 Answers

0

To get rid of a MultiIndex, call the .reset_index() method on your DataFrame; it moves the index levels back into ordinary columns.
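
A small sketch of what that does, using a two-row toy frame (the names `df`, `flat` and `unnamed` are made up for illustration). It also shows an alternative that may be closer to the goal format in the question: keeping the dates as the index and only clearing the index name.

```python
import pandas as pd

# Toy frame with "Date" as a named, single-level index.
df = pd.DataFrame(
    {"Date": ["1990-01-02", "1990-01-03"], "3_Mo": [7.83, 7.89]}
).set_index("Date")

# reset_index() returns a new frame: "Date" becomes a regular column again
# and the rows get a default 0..n-1 integer index.
flat = df.reset_index()
print(list(flat.columns))  # ['Date', '3_Mo']

# Alternative: keep the dates as the index and only hide the "Date" label.
unnamed = df.copy()
unnamed.index.name = None
print(unnamed)
```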

tobsecret
  • Just curious, when I go to plot Data Frame 1 after set_index, it returns an error saying "Empty Data Frame: no numeric data to plot". The numeric data is the 3_Mo yield though, so why is this happening? –  Aug 16 '18 at 20:07
  • What's the dtype of 3_mo? – tobsecret Aug 16 '18 at 20:13
  • It says that it is a string, which I do not understand because to me it looks like floating point data. Also, when I try to reassign the dtype to float, it won't let me. –  Aug 16 '18 at 20:14
  • That's probably because you have some row for which you have a string in there. – tobsecret Aug 16 '18 at 20:19
  • Is there a way to use a pandas .loc for searching for string data types? –  Aug 16 '18 at 20:29
  • https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.isdecimal.html#pandas.Series.str.isdecimal `df['3_Mo'].loc[~df['3_Mo'].str.isdecimal()]` – tobsecret Aug 16 '18 at 20:44
  • Using this syntax it returned: Name: 3_MO, Length: 7307, dtype: object (a series of length 7300, out of the 7500 total). These returned values still look like floating point numbers though so why are they being classified as strings? –  Aug 16 '18 at 21:19
  • This goes a bit beyond the scope of this question, I would suggest asking a new question. Also, again - it's much easier for anyone to help you if they have more information, so I encourage you to find a minimal, complete and verifiable example. – tobsecret Aug 16 '18 at 22:01
  • I was about to ask a new question on this topic so I could explain it completely, but it seems that I have been banned from asking questions based on some dumb questions I asked on the platform a year ago when I was new to this site. I have gone back to try to edit/fix those questions based on the guidelines you linked to. In the meantime, an upvote on my good questions, or recommendations on how to fix any you still believe are bad, might be helpful. Cheers. –  Aug 17 '18 at 19:13
  • In the meantime it would be useful if you could add the output to this question, which you can still edit, right? – tobsecret Aug 17 '18 at 20:24
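
A possible explanation for the result in the comments above: `str.isdecimal()` is only true when every character is a decimal digit, so a string such as "7.83" fails the check because of the ".". That would make the `~...isdecimal()` filter return nearly every row. The sketch below assumes the column holds float-like strings plus some non-numeric placeholder; the value "N/A" is hypothetical, since the real data was never posted. It uses `pd.to_numeric(..., errors="coerce")` to convert what can be parsed and to locate the rows that cannot.

```python
import pandas as pd

# Hypothetical column: float-like strings plus one placeholder value.
s = pd.Series(["7.83", "7.89", "N/A", "7.79"], name="3_Mo")

# isdecimal() requires every character to be a decimal digit, so "7.83" fails.
print(s.str.isdecimal().tolist())  # [False, False, False, False]

# Convert what can be parsed; anything unparseable becomes NaN.
numeric = pd.to_numeric(s, errors="coerce")

# The rows that failed to parse are the ones worth inspecting.
bad = s[numeric.isna()]
print(bad.tolist())  # ['N/A']
```

Once the offending values are understood, assigning `numeric` back to the column should give a numeric dtype that plotting can use.
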
0

You can use df.reset_index(inplace=True) to make the change in the existing dataframe itself.
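
A short sketch of the in-place form on a one-row toy frame (the names `df` and `result` are assumptions). One thing to watch for is that the call returns None when `inplace=True`, so its result should not be assigned back to `df`.

```python
import pandas as pd

# Toy frame with "Date" as the index.
df = pd.DataFrame({"Date": ["1990-01-02"], "3_Mo": [7.83]}).set_index("Date")

# With inplace=True the call modifies df itself and returns None.
result = df.reset_index(inplace=True)
print(result)            # None
print(list(df.columns))  # ['Date', '3_Mo']
```

Writing `df = df.reset_index()` without `inplace` gives the same end state.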

Harikrishna