
Say I want to compute the relative complement df2 - df1 between two MultiIndex dataframes. Assuming that they have the same indexing schema, based on what I saw in this answer from Andy Hayden, I could do the following:

diff_indices = df2.index - df1.index

And then either:

  1. df2.reindex(diff_indices, inplace=True)

    or

  2. df2 = df2.loc[diff_indices]

What would be the difference between 1. and 2. above? What is the difference between df.reindex and df.loc?

Amelio Vazquez-Reina

1 Answer


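Both approaches return a new Series/DataFrame (reindex does not take an inplace argument, so its result has to be assigned as well) and, provided every requested label exists in the index, select the same rows.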
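A minimal sketch of that equivalence, on made-up data (the frames and the column name x are assumptions for illustration only). It uses Index.difference rather than the - operator from the question, since in current pandas - on an index is, as far as I know, no longer a set difference:

```python
import pandas as pd

# Hypothetical example frames sharing the same two-level index schema.
idx1 = pd.MultiIndex.from_tuples([("a", 1), ("a", 2)])
idx2 = pd.MultiIndex.from_tuples([("a", 1), ("a", 2), ("b", 1), ("b", 2)])
df1 = pd.DataFrame({"x": [10, 20]}, index=idx1)
df2 = pd.DataFrame({"x": [10, 20, 30, 40]}, index=idx2)

# Labels present in df2 but not in df1.
diff_indices = df2.index.difference(df1.index)

# Both calls build a new DataFrame; df2 itself is left untouched.
via_loc = df2.loc[diff_indices]
via_reindex = df2.reindex(diff_indices)

print(via_loc)
print(via_reindex)
print(len(df2))  # still the original number of rows
```

Since every label in diff_indices comes from df2.index, both results contain the same rows here.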

The reason for the seeming redundancy is that loc is syntactically limited (you can pass only a single argument to __getitem__), whereas reindex is a method and can therefore take various optional parameters. (docs)
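To illustrate what those optional parameters buy you, here is a small sketch using fill_value, one of the documented reindex parameters. The frame and labels are invented for the example, and the KeyError behaviour of loc is what I would expect from recent pandas versions (1.0 and later), not something stated in the question:

```python
import pandas as pd

# Hypothetical frame; "c" below is deliberately absent from its index.
df = pd.DataFrame({"x": [1.0, 2.0]}, index=["a", "b"])
labels = ["a", "c"]

# reindex tolerates missing labels: the new row is NaN by default,
# or takes the value passed through the optional fill_value parameter.
with_nan = df.reindex(labels)
with_zero = df.reindex(labels, fill_value=0.0)

# loc has no room for such options; with a missing label it is
# expected to raise KeyError in recent pandas versions.
try:
    df.loc[labels]
    loc_raised = False
except KeyError:
    loc_raised = True

print(with_nan)
print(with_zero)
print(loc_raised)
```

So the two only behave alike when all requested labels exist, which is the case for diff_indices in the question.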

David Nehme
shx2