I'm trying to make a df from only the unique values of another df. This is my idea:

columns = df.columns
df_uniquevalues = pd.DataFrame()

for i in range(len(columns)):
    df_uniquevalues[columns[i]] = df[columns[i]].unique()
    i += 1

My idea is to extract the unique values of each column with a for loop. I hope that makes sense. But it fails with the error "Length of values does not match length of index".

Do you have a better idea? Or is there a way to add the columns without running into the index problem?

Thank you so much!

Moriah

2 Answers

I'm not really sure what you mean by "unique values", as there are several things you could mean: should the whole row be unique, or only a single value?

Anyway, pandas' drop_duplicates does what you are asking for by removing duplicate rows. Another option is numpy.unique, which takes a NumPy array (a DataFrame also works) and returns only the unique values.
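
As a rough illustration of the difference between the two, here is a minimal sketch. The sample frame and its column names are made up for the example and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, made up for illustration.
df = pd.DataFrame({"a": [1, 1, 2, 2], "b": ["x", "x", "x", "y"]})

# drop_duplicates removes rows that repeat an earlier row in full,
# so the result is still a rectangular DataFrame.
unique_rows = df.drop_duplicates()

# numpy.unique is applied here to a single column as an array.
# It returns the distinct values in sorted order.
unique_a = np.unique(df["a"].to_numpy())
```

Note that the first call works on whole rows while the second works on one column at a time, so which one fits depends on what "unique" should mean.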

Roim
  • Thank you Roim! I was just referring to the unique values of every column. But with both of these answers I'm getting there. Thank you again! – Moriah Sep 23 '20 at 20:59

df_uniquevalues = df.drop_duplicates()

The drop_duplicates documentation discusses the various argument options.

In general, looping isn't the way to go with pandas. There is almost always a vectorized version of the operation you are looking for.
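
If the goal is the unique values of each column rather than unique rows, the columns will usually end up with different lengths, which is what triggers the length error in the question. One possible sketch, assuming it is acceptable to pad the shorter columns with NaN, wraps each result in a Series so pandas can align them. The sample data is made up for the example.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration.
df = pd.DataFrame({"a": [1, 1, 2, 2], "b": ["x", "y", "z", "x"]})

# Wrapping each array of unique values in a Series lets pandas align
# columns of different lengths and pad the shorter ones with NaN.
# This iterates over column names only; the per-column work is done
# by Series.unique.
df_uniquevalues = pd.DataFrame(
    {col: pd.Series(df[col].unique()) for col in df.columns}
)
```

The padding means an integer column with fewer unique values than the longest column will be converted to float, since NaN is a float value.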

noah