
I want to drop rows with any NaN values in my dataset. The dimensions of my dataset are 9733 × 123.

Here is the code I use to drop rows with any NaN value:

dataset.dropna(how='all')
print(dataset)

Even after running this, I am unable to clean the data in my CSV file. Can you help?

SanSolo
ojj
  • As with a huge number of pandas methods, it does not work in-place. Either reassign the result back: `dataset = dataset.dropna(how='all')` or pass the `inplace` argument: `dataset.dropna(how='all', inplace=True)` – roganjosh Dec 19 '18 at 02:17
  • How does this have anything to do with a decision tree? – Jab Dec 19 '18 at 02:18
  • @Jaba I am cleaning the data first. I mentioned the decision tree so that, if I have to do something different in that case, others can tell me. – ojj Dec 19 '18 at 02:50
  • @roganjosh the `inplace` option doesn't work. Is there anything else I can do? – ojj Dec 19 '18 at 02:50
  • Please paste a sample of the data. There are many possible reasons for this. For example, the values may not actually be NaN, in which case you can specify which values should be treated as `NaN`. You can even remove `na` values when loading the data. A sample of the data would help in answering this. – Abhishek Dujari Dec 19 '18 at 03:28
  • @AbhishekDujari the data has NA values. How do I specify to drop NA and not NaN? – ojj Dec 19 '18 at 03:48
  • OK, there are a couple of things I can't tell from your question. But this is a good answer to your questions, assuming you use read_csv() or a similar import method: https://stackoverflow.com/questions/12514590/reading-file-with-missing-values-in-python-pandas The idea is that within the dataframe the values should be of a certain type, and then they will be interpreted as NaN or INF or zero. Otherwise you will end up with mixed data types. – Abhishek Dujari Dec 20 '18 at 10:44
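
To illustrate the suggestion in the comments about declaring missing-value markers at load time, here is a minimal sketch. The inline CSV text and the markers `"missing"` and `"?"` are made up for the example (the asker's actual data was never posted), and an in-memory buffer stands in for a real file. `na_values` is a real `read_csv` parameter; whether it is needed depends on what the file actually contains.

```python
import io

import pandas as pd

# Hypothetical CSV content; "missing" and "?" are assumed placeholder markers.
csv_text = "x,y\n1,2\n3,missing\n?,6\n"

# Tell pandas which extra strings should be parsed as NaN while loading.
df = pd.read_csv(io.StringIO(csv_text), na_values=["missing", "?"])

# dropna() returns a new frame; the default how='any' drops every row
# that contains at least one NaN.
clean = df.dropna()

# The source file is never modified by dropna(); write the result out
# explicitly (to a buffer here, to a file path in practice).
out = io.StringIO()
clean.to_csv(out, index=False)

print(df.isna().sum().sum(), len(clean))
```

Note that cleaning the dataframe never touches the CSV on disk, so the cleaned frame has to be saved with `to_csv` if the file itself should change.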

1 Answer


From the documentation: if you want to drop rows “with any NaN values”, use how='any' (the default value); and to modify the dataframe instead of making an edited copy, use inplace=True. All together:

dataset.dropna(inplace=True)
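
As a small sketch of the difference (the toy frame below is invented for illustration, not the asker's data): `how='all'` only drops rows in which every value is NaN, `how='any'` drops rows with at least one NaN, and without `inplace=True` or reassignment the original frame is left unchanged.

```python
import numpy as np
import pandas as pd

# Toy data: row 0 is complete, row 1 has one NaN, row 2 is all NaN.
df = pd.DataFrame({
    "a": [1.0, np.nan, np.nan],
    "b": [2.0, 5.0, np.nan],
})

all_dropped = df.dropna(how="all")  # removes only the all-NaN row
any_dropped = df.dropna(how="any")  # removes every row containing a NaN

# Neither call above modified df itself.
original_len = len(df)
print(original_len, len(all_dropped), len(any_dropped))

# Reassigning (or passing inplace=True) is what makes the change stick.
df = df.dropna(how="any")
print(len(df))
```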
CrepeGoat
  • As a general suggestion, if you're using a function from pandas (or any library), and it doesn't do what you think it should, take a look at the documentation. Especially with core Python, NumPy and pandas, the documentation is very thorough and can answer a lot of questions. – CrepeGoat Dec 19 '18 at 03:38
  • Why the downvote? – CrepeGoat Dec 20 '18 at 16:29