I am trying to run a query on a data frame (countries that received at least one gold medal) and store the result in a new variable. A portion of the dataset is shown below.

A portion of the dataset

This is the query I have run

only_gold = df.where(df['Gold'] > 0)
only_gold.head()

Here is the error generated

I have also tried the query

only_gold = df[df['Gold'] > 0]
only_gold.head()

But this raises the same error.

Peter Tutervai
  • You have duplicate columns named `Gold` in your data. Does this answer your question? [What does `ValueError: cannot reindex from a duplicate axis` mean?](https://stackoverflow.com/questions/27236275/what-does-valueerror-cannot-reindex-from-a-duplicate-axis-mean) – My Work Jun 09 '20 at 13:16

1 Answer

This problem arises when multiple columns share the same name. As I pointed out in the comment, the linked post should answer your question. One way to deal with it is to rename the duplicated columns; another, taken from the answers by @tuomastik and Parseltongue in that post, is to drop the duplicates:

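# drop rows whose index label is duplicated (keeps the first occurrence)
df = df[~df.index.duplicated()]

# or drop columns whose name is duplicated (keeps the first occurrence)
df = df.loc[:, ~df.columns.duplicated()]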
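
To make the column case concrete, here is a minimal sketch. The medal table is made up for illustration (it is not the asker's data), and I am assuming the duplicate `Gold` columns come from something like separate summer and winter totals.

```python
import pandas as pd

# Hypothetical medal table in which "Gold" and "Silver" each appear twice
df = pd.DataFrame(
    [[1, 0, 2, 1], [0, 1, 0, 0], [3, 2, 0, 1]],
    index=["A", "B", "C"],
    columns=["Gold", "Silver", "Gold", "Silver"],
)

# With duplicate labels, df["Gold"] is a DataFrame (two columns), not a Series,
# which is why the boolean filter in the question does not behave as expected
gold_is_frame = isinstance(df["Gold"], pd.DataFrame)

# Keep only the first occurrence of each column name
deduped = df.loc[:, ~df.columns.duplicated()]

# The original filter now works because "Gold" is a single column
only_gold = deduped[deduped["Gold"] > 0]
print(only_gold)
```

Note that dropping duplicated columns throws away the later columns' data. If those columns hold different values that you need, renaming them to unique names is probably the safer fix.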

If your data was built by concatenating other data frames with pd.concat (often the case), pass ignore_index=True so that the result gets a fresh, unique index instead of repeating the original row labels.
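
A small sketch of the index case, again with made-up frames (`a` and `b` are hypothetical, not from the question), assuming the data is stacked row-wise with `pd.concat`:

```python
import pandas as pd

# Two hypothetical frames that both use the default 0..n-1 index
a = pd.DataFrame({"Gold": [1, 0]})
b = pd.DataFrame({"Gold": [0, 2]})

# Plain concatenation keeps each frame's labels, so the index is 0, 1, 0, 1
stacked = pd.concat([a, b])
has_dupes = stacked.index.duplicated().any()

# ignore_index=True discards the old labels and renumbers the rows 0..3
clean = pd.concat([a, b], ignore_index=True)

only_gold = clean[clean["Gold"] > 0]
print(only_gold)
```

This only addresses duplicated row labels; duplicated column names still need renaming or dropping.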

My Work