Every time I clean my dataframe, the changes are not saved when I write the modified dataframe to a new csv. Is something wrong with my code?

import pandas as pd

housing = pd.read_csv(csv_path)

# Modifying the data frame (removing any strings)

headers = ['Sold Price', 'Longitude', 'Latitude', 'Land Size', 'Total Bedrooms', 'Total Bathrooms', 'Parking Spaces']

for header_index in range(len(headers)):
    for index in housing.index:
        row = housing.at[index, headers[header_index]]
        if row is not int or row is not float:
            row = ''

housing.to_csv('propertyupdated.csv')

After calling housing.to_csv('propertyupdated.csv'), I went to the directory and checked the csv file. It had the same contents as the original file; my modifications were not saved into the new csv file. But I believe I have changed the dataframe in Python.

CountDOOKU
  • Probably you are looking in the wrong place. [What exactly is current working directory?](https://stackoverflow.com/questions/45591428/what-exactly-is-current-working-directory/66860904) – tripleee Jul 16 '22 at 08:31
  • @tripleee Hmm, I don't know about that everything else seems okay? Using ```replace() ```works, I could see my changes in the saved csv file. Maybe something is wrong with my modifications code? I just wanted to make sure every data in my csv is an integer or float – CountDOOKU Jul 16 '22 at 09:52

1 Answer

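First, there is a problem in the part of your code that checks the data types: the condition `row is not int or row is not float` is always true. `is not` compares identity, and a cell value is never the type object `int` or `float` itself. In addition, `row = ''` only rebinds the local variable `row`; it never writes anything back into `housing`, so the dataframe you save is unchanged.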
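To see why the check never filters anything, here is a minimal sketch on a made-up two-row frame (the values are hypothetical, not taken from the question's csv). It also shows that `row = ''` only rebinds a local name, while assigning through `.at` actually changes the dataframe. Using `numbers.Number` in the `isinstance` check is my own choice here, so that numpy scalar types are accepted as well.

```python
import numbers

import pandas as pd

# Hypothetical sample data standing in for the question's csv.
housing = pd.DataFrame({"Land Size": [450, "unknown"]})

value = housing.at[0, "Land Size"]

# `is not` compares identity; a cell value is never the type object `int`
# or `float` itself, so this condition is True for every cell.
always_true = value is not int or value is not float

# isinstance() checks the type of the value instead.
is_number = isinstance(value, numbers.Number)

# Rebinding a local name leaves the dataframe untouched...
row = housing.at[1, "Land Size"]
row = ""
unchanged = housing.at[1, "Land Size"]

# ...whereas assigning through .at writes the change into the frame.
housing.at[1, "Land Size"] = ""
changed = housing.at[1, "Land Size"]
```

After this runs, `unchanged` still holds the original string, and only the `.at` assignment has altered what `to_csv` would write.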

You can think of a DataFrame as something like a dict of columns, each backed by a numpy array (Ami Tavory). Each column's dtype is an instance of <class 'numpy.dtype'>, which means that column types should be checked in a different way (see the code).

I'm not quite sure what you want to do, but in one comment you said "I just wanted to make sure every data in my csv is an integer or float". If that's the case, you can try removing the columns that are not of type np.int64 or np.float64. Below is just a sample df with data that, after the changes, could be written to a .csv file.

import pandas as pd
import numpy as np

headers = ['ID', 'SIGN', 'BASE_1', 'BASE_2', 'BASE_3', 'VARIABLE', 'ONE_TENTH']
data = [[101, 'A', 10, 100, 1000, 'X', 0.1], [201, 'B', 20, 200, 2000, 'Y', 0.2], [301, 'C', 30, 300, 3000, 'Z', 0.3]]

df = pd.DataFrame(data, columns = headers)

# Drop every column whose dtype is not 64-bit int or 64-bit float
for col in df.columns:
    if df[col].dtype not in (np.int64, np.float64):
        df = df.drop(columns=[col])
Result = '''     
d f     B e f o r e
    ID SIGN  BASE_1  BASE_2  BASE_3 VARIABLE  ONE_TENTH
0  101    A      10     100    1000        X        0.1
1  201    B      20     200    2000        Y        0.2
2  301    C      30     300    3000        Z        0.3

d f     A f t e r
    ID  BASE_1  BASE_2  BASE_3  ONE_TENTH
0  101      10     100    1000        0.1
1  201      20     200    2000        0.2
2  301      30     300    3000        0.3
'''
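The same column filter can probably be written without an explicit loop. This is a sketch using `DataFrame.select_dtypes` on the same sample data; note that `include='number'` keeps every numeric column, which is slightly broader than checking for np.int64 and np.float64 only (it would also keep 32-bit integers, for example). The names `numeric_df` and `csv_text` are just illustrative.

```python
import pandas as pd

headers = ['ID', 'SIGN', 'BASE_1', 'BASE_2', 'BASE_3', 'VARIABLE', 'ONE_TENTH']
data = [[101, 'A', 10, 100, 1000, 'X', 0.1],
        [201, 'B', 20, 200, 2000, 'Y', 0.2],
        [301, 'C', 30, 300, 3000, 'Z', 0.3]]
df = pd.DataFrame(data, columns=headers)

# Keep only the numeric columns; select_dtypes returns a new DataFrame
# and leaves df itself unchanged.
numeric_df = df.select_dtypes(include='number')

# With no path given, to_csv returns the csv text as a string;
# index=False omits the row-index column.
csv_text = numeric_df.to_csv(index=False)
```

Passing a file name such as 'propertyupdated.csv' as the first argument of `to_csv` writes the same content to disk instead of returning it.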

... and if you really want to work row by row and replace some of the data with an empty string (which is still a string), then you can find a way to manipulate rows in "How can we change data type of a dataframe row in pandas?" (accepted answer). Regards...
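For the cell-by-cell case, one possible approach (my assumption about what the question is after, not something taken from the linked answer) is to let pandas coerce each column with `pd.to_numeric(..., errors='coerce')`, which turns values that cannot be parsed as numbers into NaN. The sample frame below is made up; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
housing = pd.DataFrame({
    'Sold Price': [500000, 'n/a', 720000],
    'Land Size': ['450', 'unknown', 600.5],
})

headers = ['Sold Price', 'Land Size']

for header in headers:
    # Entries that cannot be parsed as numbers become NaN. The result is
    # assigned back to the column, so the dataframe itself is modified.
    housing[header] = pd.to_numeric(housing[header], errors='coerce')

# With no path given, to_csv returns the csv text as a string.
csv_text = housing.to_csv(index=False)
```

Note that a numeric-looking string such as '450' is converted to a number rather than blanked, which may or may not be what you want. By default `to_csv` writes NaN as an empty field, so the saved file ends up with blanks where the non-numeric values were.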

d r