
I have a DataFrame (B) with two columns of 500 words each. I am trying to create a new list (C) that contains only the unique words found in (B), so that each word in (B) appears exactly once in (C).

However, I get the following error, which I cannot resolve: "ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."

Any suggestions? Here is my code:

import pandas as pd
q = "questions.tsv"
data = pd.read_csv(q, usecols = [3, 4], nrows = 9, header=0, sep="\t")
first_words = []
for word in data:
    first_words.append(data.applymap(lambda x: x.split()[0]))
unique_words = []
for w in first_words:
    if w not in unique_words:
        unique_words.append(w)
print(unique_words)

   Column 1  Column 2
0  What      What
1  What      What
2  How       How
3  Why       Find
4  Which     Which
5  Should    What
6  How       What
7  When      When

I would expect to get a list (C) like this:
What
How
Why
Find
Which
Should
When
twhale
  • How about some data and expected output? – cs95 Sep 15 '17 at 16:16
  • From Pandas FAQ [Using If/Truth Statements with pandas](https://pandas.pydata.org/pandas-docs/stable/gotchas.html#using-if-truth-statements-with-pandas) – wwii Sep 15 '17 at 16:24
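
For what it's worth, the error seems to come from the loop in the question: `data.applymap(...)` returns a whole DataFrame, so `first_words` ends up holding DataFrames rather than words, and `w not in unique_words` then compares DataFrames, which has no single truth value. Below is a minimal sketch of one possible way to get the expected list. It assumes every cell is a non-empty string, and the small `data` frame is a hypothetical stand-in built from the sample table rather than the real `questions.tsv`.

```python
import pandas as pd

# Hypothetical stand-in for the two columns read from questions.tsv,
# copied from the sample table in the question.
data = pd.DataFrame({
    "Column 1": ["What", "What", "How", "Why", "Which", "Should", "How", "When"],
    "Column 2": ["What", "What", "How", "Find", "Which", "What", "What", "When"],
})

# Take the first whitespace-separated word of every cell, one column at a time.
first_words = data.apply(lambda col: col.str.split().str[0])

# Flatten the result row by row into a 1-D array of plain strings.
flat = first_words.to_numpy().ravel()

# dict.fromkeys keeps the first occurrence of each word and preserves order.
unique_words = list(dict.fromkeys(flat))

print(unique_words)
```

If the order of the words does not matter, `set(flat)` would also work; `dict.fromkeys` is used here only because the expected list in the question appears to follow first-appearance order.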
