
I have a huge data set in a CSV file listing different car models and the countries they are sold in. (image of CSV)

As you can see, the "Country" column has entries like: "Fiji, Japan, India", "Fiji, India", "Japan, India", "Japan", "Fiji, Japan".

I want to delete all rows for cars that are not sold in Japan. That means deleting rows like "Fiji, India", "India" and "Fiji", and keeping every row that includes Japan.

How do I go about doing this in Python?

Additionally, I want to write it so that if the data set later contains entries like "Japan, USA", "USA", "Mexico" or "Mexico, Japan", the code still automatically deletes the cars that are not sold in Japan.

I am extremely new to Python and I am learning to code by myself. Any help would be appreciated.

Thanks in advance!

  • Please see and implement: [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – BeRT2me Jul 19 '22 at 01:24
  • Pandas string match tutorial: https://davidhamann.de/2017/06/26/pandas-select-elements-by-string/ – Nick ODell Jul 19 '22 at 01:25

1 Answer


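You can use Series.str.contains to keep only the rows whose "Country" value contains "Japan". With case=False the match ignores case, and because it looks at the text itself, new countries such as USA or Mexico need no code changes: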

out = df[df['Country'].str.contains('japan', case=False)]
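For a self-contained version, here is a minimal sketch. It assumes a "Model" column and uses a small inline sample in place of the real CSV (both are illustrative assumptions, as is the helper name sold_in). Instead of a substring search it splits each "Country" value on commas and compares whole country names, and it treats an empty "Country" cell as "not sold in Japan" rather than raising an error.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real CSV file;
# the "Model" column and its values are assumptions for illustration.
csv_text = """Model,Country
A,"Fiji, Japan, India"
B,"Fiji, India"
C,"Japan, India"
D,Japan
E,"Fiji, Japan"
F,"Mexico, Japan"
G,USA
H,
"""

df = pd.read_csv(io.StringIO(csv_text))


def sold_in(frame, country, column="Country"):
    """Return the rows whose comma-separated `column` lists `country`."""
    # Empty cells become "" so they simply never match.
    names = frame[column].fillna("").str.lower().str.split(",")
    # Compare whole, whitespace-stripped names, ignoring case.
    mask = names.apply(
        lambda parts: country.lower() in [p.strip() for p in parts]
    )
    return frame[mask]


out = sold_in(df, "Japan")
print(out)
```

With a real file, the df line would become pd.read_csv("cars.csv") (file name assumed), and out.to_csv("cars_japan.csv", index=False) would save the filtered rows. Matching whole names is slower than str.contains on very large files, but it avoids accidental hits on any country name that merely contains the search text.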
Ynjxsjmh