
I have a CSV file that gets mailed to me every day, and I want to write a script to clean up the data before I push it into a database. At the bottom of the CSV file there are two empty rows (rows 73 & 74 in the image) and two rows with some junk data in them (rows 75 & 76 in the image), and I need to delete these rows.

[Screenshot of the CSV: rows 73–74 are empty, rows 75–76 contain junk data]

To identify the first empty row, it might be helpful to know that Column A will always have data in it until the first empty row (row 73 in the image).

Can you help me figure out how to identify these rows and delete the data in them?

Abhay
  • df.dropna() drops rows containing NaNs; if every row you want to drop has NaNs in it, you can use that (see the short illustration after these comments). This also might help: https://stackoverflow.com/questions/13851535/delete-rows-from-a-pandas-dataframe-based-on-a-conditional-expression-involving – Petronella Dec 19 '19 at 15:12
  • [Please don't post images of code/data (or links to them)](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question) – anky Dec 19 '19 at 15:12
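A quick illustration of the dropna suggestion, on toy data rather than the real file: df.dropna() drops any row that contains a NaN, while df.dropna(how='all') drops only rows that are entirely empty; neither removes the junk rows, which is why the answer below takes a different approach.

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['as', 'bf', np.nan, np.nan],
                   'B': [1, 2, 3, np.nan]})

print(df.dropna())           # rows with any NaN removed -> keeps rows 0 and 1
print(df.dropna(how='all'))  # only fully empty rows removed -> keeps rows 0, 1 and 2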

1 Answer


You can check for missing values with Series.isna, create a cumulative sum with Series.cumsum, and keep only the rows where it equals 0 by boolean indexing. This solution also works if there are no missing values in the first column.

import numpy as np
import pandas as pd

df = pd.DataFrame({'A':['as','bf', np.nan, 'vd', 'ss'],
                   'B':[1,2,3,4,5]})

print (df)
     A  B
0   as  1
1   bf  2
2  NaN  3
3   vd  4
4   ss  5

# keep only the rows before the first missing value in column A
df = df[df['A'].isna().cumsum() == 0]
print (df)
    A  B
0  as  1
1  bf  2
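
Applied to the emailed file, the same mask also removes the junk rows, because the cumulative sum stays above zero once the first empty cell in column A appears. A minimal sketch of the full clean-up, assuming hypothetical names for the file, database and table (report.csv, reports.db, daily_report):

import pandas as pd
from sqlalchemy import create_engine

# read the emailed file; column A is the first column
df = pd.read_csv('report.csv')

# keep only the rows before the first empty cell in the first column;
# the blank rows and the junk rows after them are dropped together
df = df[df.iloc[:, 0].isna().cumsum() == 0]

# push the cleaned data into a database table
engine = create_engine('sqlite:///reports.db')
df.to_sql('daily_report', engine, if_exists='append', index=False)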
jezrael