How to make a df of unique values from another df columns?

Question

I'm trying to make a df containing only the unique values from another df. This is my idea:

columns = df.columns
df_uniquevalues = pd.DataFrame()

for i in range(len(columns)):
    df_uniquevalues[columns[i]] = df[columns[i]].unique()
    i += 1

My idea is to extract the unique values of each column with a for loop. I hope that makes sense. But I get this error: "Length of values does not match length of index".

Do you have a better idea? Or is there a way to add the columns without the index problem?

Thank you so much!

maybe you could use `drop_duplicates()` on each series – rhug123 Sep 23 '20 at 20:37

score 0 · Answer 1 · answered Sep 23 '20 at 20:40


Not really sure what you mean by "unique values", as there are several possible readings: should the whole row be unique, or only the values within a single column?

Anyway, pandas' `drop_duplicates` covers the common cases: called on a DataFrame it drops duplicate rows, and called on a single Series it drops duplicate values. Another option is `numpy.unique`, which accepts a NumPy array (a DataFrame also works) and returns the sorted unique values.
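To make the difference between these readings concrete, here is a minimal sketch. The sample data and variable names are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not from the question
df = pd.DataFrame({"a": [1, 1, 2, 2], "b": ["x", "x", "x", "y"]})

# Reading 1: unique rows. Only the repeated (1, "x") row is dropped.
rows_unique = df.drop_duplicates()

# Reading 2: unique values of a single column.
col_a_unique = np.unique(df["a"].to_numpy())        # sorted NumPy array
col_b_unique = df["b"].drop_duplicates().tolist()   # keeps order of first appearance

print(rows_unique)
print(col_a_unique, col_b_unique)
```

Note that the two per-column approaches differ in ordering: `numpy.unique` sorts its result, while `drop_duplicates` keeps the order in which values first appear.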


Roim


Thank you Roim! I was just referring to the unique values of every column. But with both of these answers I'm getting there. Thank you again! – Moriah Sep 23 '20 at 20:59

score 0 · Answer 2 · answered Sep 23 '20 at 20:41


df_uniquevalues = df.drop_duplicates()

The `drop_duplicates` documentation discusses the various argument options.

In general, looping isn't the way to go with pandas. There is almost always a vectorized version of the operation you are looking for.
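If what you want is the unique values of every column (as clarified in the comments), note that `df.drop_duplicates()` works on whole rows, so it is not quite the same thing. The original loop fails because each column has a different number of unique values, while a DataFrame column must match the length of the existing index. One possible sketch, using made-up sample data, is to wrap each result in a `pd.Series` so that pandas aligns them on the index and pads the shorter columns with NaN. This still iterates over the columns, so treat it as a workaround for the length mismatch rather than a vectorized operation.

```python
import pandas as pd

# Hypothetical sample data, not from the question
df = pd.DataFrame({"a": [1, 1, 2, 2], "b": ["x", "y", "z", "x"]})

# Each column's unique values become a Series of its own length.
# Building a DataFrame from a dict of Series aligns them on the index,
# so shorter columns are padded with NaN instead of raising an error.
df_uniquevalues = pd.DataFrame(
    {col: pd.Series(df[col].unique()) for col in df.columns}
)

print(df_uniquevalues)
```

Because of the NaN padding, an integer column that ends up shorter than the others is converted to float. If that matters, a plain dict of arrays (`{col: df[col].unique() for col in df.columns}`) may be a better container than a DataFrame.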


noah

