pd.__version__
'0.15.2'

I have a pandas DataFrame with a three-level MultiIndex. When I concatenated two dataframes, the lowest index level was turned into floats where it should be strings.

I tried to replace the .0 with nothing using

idx=str(dfmaster_stats.index.levels[2]).replace('.0', '')

and assigning it to the dataframe, but I get this error

TypeError: 'FrozenList' does not support mutable operations.

I looked at other questions and figured out that a MultiIndex can't be changed in place, so I tried to reindex the dataframe. I followed this question, but neither solution works.

Pandas: Modify a particular level of Multiindex

It definitely does not look right. What am I doing wrong?

I also tried set_levels, but I am not sure of the syntax.

dfmaster_stats.index.set_levels(dfmaster_stats.index.levels[2](idx), level =2)

gives me this error

TypeError: 'Index' object is not callable
  • Like this? https://stackoverflow.com/questions/34417970/pandas-convert-index-type-in-multiindex-dataframe – Evan Oct 10 '19 at 14:30
  • @evan I think I also tried that `dfmaster_stats.index = dfmaster_stats.index.set_levels([idx.levels[:2], idx.levels[2].astype(str)].replace('.0', ''))` gets me `AttributeError: 'str' object has no attribute 'levels'` – davidhmpham Oct 10 '19 at 14:42
  • I'm going to try to recreate my entire multiindex using pivot tables instead of groupby and see if that makes a difference. – davidhmpham Oct 10 '19 at 15:11

1 Answer


It may be easier, as has been mentioned in other posts, to just reset your index, change dtypes, and set a new index.

import numpy as np
import pandas as pd

np.random.seed(0)
tuples = list(zip(*[['bar', 'bar', 'baz', 'baz',
                     'foo', 'foo', 'qux', 'qux'],
                      [1.0, 2.0, 1.0, 2.0,
                       1.0, 2.0, 1.0, 2.0]]))

idx = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(8, 2), index=idx, columns=['A', 'B'])

print(df)
print(df.index.get_level_values("second").dtype)

Output:

                         A         B
first second                    
bar   1.0     1.764052  0.400157
      2.0     0.978738  2.240893
baz   1.0     1.867558 -0.977278
      2.0     0.950088 -0.151357
foo   1.0    -0.103219  0.410599
      2.0     0.144044  1.454274
qux   1.0     0.761038  0.121675
      2.0     0.443863  0.333674
float64

Now, reset the index, change dtype, and set new index.

df = df.reset_index()
df["second"] = df["second"].astype(int).astype(str)
df = df.set_index(["first", "second"])

print(df)
print(df.index.get_level_values("second").dtype)

Output:

                     A         B
first second                    
bar   1       1.764052  0.400157
      2       0.978738  2.240893
baz   1       1.867558 -0.977278
      2       0.950088 -0.151357
foo   1      -0.103219  0.410599
      2       0.144044  1.454274
qux   1       0.761038  0.121675
      2       0.443863  0.333674
object

In general, I have found manipulating multiindexes (indices?) to be sometimes worth the bother, and sometimes not. Changing levels gets verbose. If you're dedicated to the cause, this works:

idx0 = df.index.levels[0]
# Escape the dot and anchor at the end so only a trailing ".0" is removed
idx1 = df.index.levels[1].astype(str).str.replace(r'\.0$', '', regex=True)

df.index = df.index.set_levels([idx0, idx1])
print(df.index.levels[1].dtype)

Output:

object
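
As a side note on the errors in the question: calling str() on an Index appears to be the root of the trouble, since it returns the printed representation of the whole object as one string rather than converting each label. The sketch below uses a made-up level (the names `level`, `as_repr`, `as_labels`, and `cleaned` are only for illustration) to contrast that with astype(str), which converts label by label.

```python
import pandas as pd

# Hypothetical float level, standing in for index.levels[2] from the question
level = pd.Index([1.0, 2.0, 10.0])

# str() gives the repr of the whole Index as a single Python string,
# so it has no .levels attribute and cannot be assigned back as labels
as_repr = str(level)
print(type(as_repr))

# astype(str) converts each label individually and returns an Index
as_labels = level.astype(str)

# Strip only a trailing ".0" from each label (regex is anchored at the end)
cleaned = as_labels.str.replace(r"\.0$", "", regex=True)
print(list(cleaned))
```

The `regex` keyword assumes a reasonably recent pandas; it is not available in very old releases such as the 0.15.2 mentioned in the question.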

If you provide sample code to create your dataframe, I can extend it to 3 levels, or you can work it out. :)
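
In the meantime, here is a minimal sketch of how the set_levels approach might look with three levels. The data and the names `idx3`, `df3`, and `new_third` are invented for illustration and are not the asker's actual dataframe; it also assumes every value in the third level is a whole number, so the cast to int loses nothing.

```python
import numpy as np
import pandas as pd

# Hypothetical three-level index whose lowest level holds floats
idx3 = pd.MultiIndex.from_product(
    [["bar", "baz"], ["x", "y"], [1.0, 2.0]],
    names=["first", "second", "third"],
)
df3 = pd.DataFrame(np.arange(16).reshape(8, 2), index=idx3, columns=["A", "B"])

# Convert only the third level: float -> int -> str
new_third = df3.index.levels[2].astype(int).astype(str)

# set_levels returns a new MultiIndex; assign it back to the frame
df3.index = df3.index.set_levels(new_third, level=2)

print(list(df3.index.levels[2]))
print(df3.index[0])
```

Passing `level=2` means the other two levels do not have to be repeated. The new labels replace the old ones by position, so they should be kept in the same order as the existing level values.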

Evan
  • Note that the `levels` attribute of a pandas MultiIndex seems to sort the individual levels lexicographically (as can be seen on the screenshots of the OP). This will reorder the index labels for a given level without maintaining the MultiIndex tuples. (This also occurs when converting to `str` with `astype(str)` ). – onietosi Aug 12 '20 at 16:04
  • pandas devs: Please let us do `df.set_index(df.index.astype(types_dict))` – Attila the Fun Jul 18 '23 at 19:02