I am writing a function which takes a pandas df, column name and a list of values and gives the filtered df. This function uses df.query() internally.

In one specific case, I have a dataframe with a column that contains both integers and strings. My function should filter this df on a list whose elements are all integers. At the moment I get an empty df, because a string never compares equal to an int, even when the value in the dataframe and the value in the lookup list represent the same number, e.g. '345' and 345.
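A minimal reproduction of the behaviour described above. The column name `col` and the sample values are made up for illustration:

```python
import pandas as pd

# Hypothetical data: the value we want is stored as the string "345".
df = pd.DataFrame({"col": [1, "345", "abc"]})
lookup = [345]  # lookup list holds integers only

# `in` inside query() is evaluated like Series.isin, which compares
# values without converting between str and int.
result = df.query("col in @lookup")
print(result.empty)
```

Here the string "345" is not matched by the integer 345, so nothing is selected.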

What is a general way to handle this in pandas? I could coerce the list of integers to strings, but I would like to stay away from that because I want my function to handle non-integer values as well. I am not sure that coercing to strings would be safe in that case, e.g. for floats.

Swetabh

1 Answer

You have many options, but I think they can be summarized as follows. Without more context I can't tell which one makes the most sense for you.

  • Convert numeric strings to numbers

    • If you are afraid of issues with floats, convert only integers.
    • If you want to keep your data as is, store the converted values in a different column / object and use it just for filtering.
    • If you want to keep the data types in the filtered data, filter the converted data and use the filtered index to subset the original data.
  • Convert numbers to strings (same considerations as above)

  • Filter by both the numbers in the lookup list and their string representation.
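The three options above can be sketched roughly as follows. This is only an illustration under assumptions: the column name `col`, the sample data and the variable names are made up, and `isin` is used for the masks instead of `df.query()`.

```python
import pandas as pd

# Hypothetical mixed-type column and an integer lookup list.
df = pd.DataFrame({"col": [345, "345", "12.5", "abc", 7]})
lookup = [345, 7]

# Option 1: convert numeric strings to numbers in a separate Series,
# filter on that, and use the mask to subset the original data so the
# original values and types are kept. Non-numeric entries become NaN.
numeric = pd.to_numeric(df["col"], errors="coerce")
by_number = df[numeric.isin(lookup)]

# Option 2: compare string representations on both sides.
# Note that floats with different spellings ("12.5" vs "12.50")
# would not match this way.
lookup_str = [str(v) for v in lookup]
by_string = df[df["col"].astype(str).isin(lookup_str)]

# Option 3: match either the numbers or their string representation.
by_both = df[df["col"].isin(lookup) | df["col"].isin(lookup_str)]

print(by_number)
```

In this small example all three masks select the same rows (the integer 345, the string "345" and the integer 7), and the filtered frames keep the original mixed types because only the mask is computed on converted data.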

Stop harming Monica