
isnull() does not find empty strings

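import pandas as pd
from pymysql import connect  # assumption: the original does not show which driver provides connect()

conn = connect(host='localhost', port=3306, user='root', password='root', database='spiderdata', charset='utf8')
df = pd.read_sql('select * from beikedata_community1', con=conn)
df
df.subway.isnull()  # flags NULL/NaN only; empty strings come back as False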

**I want to use `isnull()` to find missing values, but it does not treat empty strings as missing. What can I do? Thanks very much!**
yanchen heng
  • I want to use `isnull()` to find missing values, but it does not treat empty strings as missing. What can I do? Thanks very much. – yanchen heng Apr 17 '21 at 12:45
  • `df.replace(r'', np.NaN)`? Does [Replacing blank values (white space) with NaN in pandas](https://stackoverflow.com/questions/13445241/replacing-blank-values-white-space-with-nan-in-pandas) help you? – Ynjxsjmh Apr 17 '21 at 12:46
  • @Ynjxsjmh Wow! Amazing! Thanks very much! – yanchen heng Apr 17 '21 at 12:50
  • @Ynjxsjmh Excuse me! Your suggestion helped me replace the empty strings, but I found that I still can't use missingno to find the missing values. Do you have any good ideas about this? – yanchen heng Apr 17 '21 at 13:12
  • Sorry, I have never used that library before. Maybe you could replace `np.NaN` in `df.replace(r'', np.NaN)` with some value that `missingno` can find. – Ynjxsjmh Apr 17 '21 at 13:48
  • Okay! Still, thanks for your help! You are very nice! – yanchen heng Apr 17 '21 at 13:58
  • @Anurag Dabas @Golden Lion Neither is the answer I wanted, but thanks for your help. – yanchen heng Apr 18 '21 at 03:53
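
For readers following the comments above, here is a minimal sketch of the replace-then-check idea. The toy DataFrame is hypothetical (it only stands in for the `beikedata_community1` table), and the regex `^\s*$` is an assumption that whitespace-only cells should also count as missing. It uses `np.nan`, since the `np.NaN` alias is no longer available in NumPy 2.0.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the table read from MySQL.
df = pd.DataFrame({
    "subway": ["Line 1", "", "   ", None],
    "price": [100, 200, 300, 400],
})

# isnull() only flags real missing values (None/NaN), not empty strings.
missing_before = df["subway"].isnull().sum()
print(missing_before)  # 1

# Turn empty or whitespace-only strings into NaN; replace() returns a new
# DataFrame, so the result has to be kept in a variable.
cleaned = df.replace(r"^\s*$", np.nan, regex=True)

missing_after = cleaned["subway"].isnull().sum()
print(missing_after)  # 3
```

Once the cells hold real NaN values rather than strings, tools that rely on pandas' own null checks should be able to see them.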

1 Answer


You can use `print(df.replace(r' ', 'NaN'))`. This replaces the empty cells with NaN.

  • But 'NaN' is also a string. I just want to replace it with something that missingno can detect, like None in Python or NULL in MySQL. – yanchen heng Apr 17 '21 at 13:46
  • Can you please provide an example of what your initial and final results would look like? – Anirudh Apr 17 '21 at 17:12
  • @Anirudh I just want to find the missing values in my dataframe conveniently with a library like missingno, but I have no way to deal with the empty strings in the raw data. – yanchen heng Apr 18 '21 at 04:02
  • Can you share a before and after version of the dataframe? I am still unable to comprehend. – Anirudh Apr 18 '21 at 07:51
  • Post the desired output. I gave you an answer that all data science preprocessing uses: dd.fillna(0). You can also use bfill to backfill missing values. – Golden Lion Apr 18 '21 at 09:41
  • [link](https://github.com/hengyanchen/beike_data/blob/main/%E8%B4%9D%E5%A3%B3%E7%BD%91%E6%95%B0%E6%8D%AE%E6%8E%A2%E7%B4%A2%E6%80%A7%E5%88%86%E6%9E%90.ipynb) **This is my Jupyter notebook. The null values in subway can't be found by missingno or df.isnull(), and I also can't use dd.fillna(0) to replace them.** – yanchen heng Apr 18 '21 at 12:39
  • By the way, sometimes we need to replace missing values with the median, don't we? – yanchen heng Apr 18 '21 at 12:44
  • I think I got what you are trying to say. You can replace an empty/missing cell with a string like 'NaN'. Then if you use df.dropna on your dataframe or if you count the NaN/missing values, you get the answer. There are a lot of techniques to play around with missing data and one among them is replacing with median – Anirudh Apr 18 '21 at 15:31
  • @Golden Lion @Anirudh You are right. I think I made a stupid mistake: I didn't use inplace=True, which is why I thought your approach didn't work. Hahaha, my fault! Thank you again. – yanchen heng Apr 19 '21 at 08:28
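
The last few comments touch on two points that a short sketch may make clearer: `replace` returns a new DataFrame unless the result is assigned back (or `inplace=True` is passed), and numeric gaps can be filled with the median. The data below is hypothetical and the `price` column is an assumption added only to illustrate a median fill.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one empty string in "subway", one NaN in "price".
df = pd.DataFrame({
    "subway": ["Line 1", "", "Line 2"],
    "price": [100.0, np.nan, 300.0],
})

# The result is discarded here, so df itself is unchanged.
df.replace("", np.nan)
unchanged_count = df["subway"].isnull().sum()
print(unchanged_count)  # 0

# Assigning the result back keeps the change.
df = df.replace("", np.nan)
changed_count = df["subway"].isnull().sum()
print(changed_count)  # 1

# Fill the numeric gap with the column median (200.0 for this toy data).
df["price"] = df["price"].fillna(df["price"].median())
```

Assigning back is used here instead of `inplace=True`; both keep the change, and the choice is a matter of style.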