This question is related to this helpful answer here

The situation is the same. I have a dataframe:

print(df)
#      A      B  C   D
# 0  foo    one  0   0
# 1  bar    one  1   2
# 2  foo    two  2   4
# 3  bar  three  3   6
# 4  foo    two  4   8
# 5  bar    two  5  10
# 6  foo    one  6  12
# 7  foo  three  7  14

print(df.loc[df['A'] == 'foo'])

Which should give:

     A      B  C   D
0  foo    one  0   0
2  foo    two  2   4
4  foo    two  4   8
6  foo    one  6  12
7  foo  three  7  14

But when I run the same kind of filter on my own data, I get an empty dataframe back. The column I am working with has dtype object and looks like this:

   ColumnA         ColumnB
    117700          []
    467390          []
    467391          []
    467392      ['AF']
    467393    ['AAPL']

I have tried the following commands. All yielded the same results:

df[[ColumnB==['[AAPL]'] for ColumnB in df.ColumnA]]

df[df["ColumnB"] == "AAPL"]

df.query("ColumnB== 'AAPL'")
innit

1 Answer

It seems that the values in ColumnB are lists, not strings. Comparing a list such as ['AAPL'] to the string 'AAPL' is always False, which is why every filter returns an empty dataframe.

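You can check whether the values are lists by looking at the type of the first one (.iloc[0] selects by position, so it works whatever the index labels are):

print(type(df['ColumnB'].iloc[0]))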
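
If that check prints str instead of list (which can happen, for example, when the column was read from a CSV file), the values are text that only looks like a list. Here is a minimal sketch of one way to handle that case, assuming every string is a valid Python list literal. The sample frame and the names `parsed` and `result` are made up for illustration:

```python
import ast

import pandas as pd

# Hypothetical sample mirroring the question, with ColumnB stored as strings
df = pd.DataFrame({
    "ColumnA": [117700, 467390, 467391, 467392, 467393],
    "ColumnB": ["[]", "[]", "[]", "['AF']", "['AAPL']"],
})

# Confirm what is actually stored in the column
print(type(df["ColumnB"].iloc[0]))

# Turn each string into a real list; literal_eval only parses literals,
# it does not execute arbitrary code
parsed = df["ColumnB"].apply(ast.literal_eval)

# Keep the rows whose parsed list contains 'AAPL'
result = df.loc[parsed.apply(lambda tickers: "AAPL" in tickers)]
print(result)
```

If some strings are not valid literals, `ast.literal_eval` raises an error, so this only fits columns that are consistently formatted.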

If the type is list, you can subset the dataframe like this:

import numpy as np  # needed for np.nan
df.loc[df['ColumnB'].apply(lambda x: x[0] if len(x) > 0 else np.nan) == 'AAPL']

The idea is to use apply to pull the first element out of each row's list, so the values being compared are strings (or NaN for empty lists) rather than lists. Note that this only looks at the first element of each list.
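
If a list can hold more than one value, a membership test may be a better fit than taking the first element. Here is a small sketch under that assumption. The sample data, including the extra two-element row, and the names `mask` and `result` are hypothetical:

```python
import pandas as pd

# Hypothetical sample: ColumnB holds real Python lists, one with two tickers
df = pd.DataFrame({
    "ColumnA": [117700, 467390, 467391, 467392, 467393, 467394],
    "ColumnB": [[], [], [], ["AF"], ["AAPL"], ["MSFT", "AAPL"]],
})

# True where 'AAPL' appears anywhere in the row's list
mask = df["ColumnB"].apply(lambda tickers: "AAPL" in tickers)

result = df.loc[mask]
print(result)
```

The first-element approach would miss the last row here, because 'AAPL' is the second item in its list.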

Hope this helps!

Crystal L