
I have several .csv files with house data from four different districts:

  1. Duckez.csv: [image not included]

  2. coordinate.csv: [image not included]

The result I want is to gather the features into a NumPy array X_train so that a model can be trained to predict the house price for an unknown given latitude and longitude (coordinate). The X_train that I want contains these features: Bedrooms, Gardens, Latitude, Longitude. The latitude and longitude depend on the name of the .csv file; for example, after merging, all the rows from Duckez.csv will carry the latitude and longitude of Duckez.

There are 4 similar .csv files: Duckez.csv, Vim.csv, Hustla.csv, Zedrim.csv

desertnaut
mintorii

1 Answer

  1. Read your 'coordinate.csv' into a dataframe called coords with pd.read_csv

  2. For the house records (houses), we'll want one large dataframe. Declare it with an extra column 'District'. ('District' should be a categorical)

    • write a for-loop for district in ['Duckez', 'Vim', 'Hustla', 'Zedrim']: that does the following
    • read the 'Duckez.csv' in with pd.read_csv and set 'District' = 'Duckez'..., etc.
    • you might want to pd.concat([dataframes], axis=0). You could do all this in one line with a list comprehension, if you wanted to be fancy.
    • in your read_csv command, you can specify the dtype of each column
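  3. Then merge houses and coords on the 'District' column, e.g. houses.merge(coords, on='District') or pd.merge(houses, coords, on='District'). (There is no pd.join; DataFrame.join exists but joins on the index by default.) This gives you one final dataframe df or recs or whatever.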

  4. Finally, munge all the int, float, string and categorical data from recs into a strictly numerical (int/float-only) array X_train for training ML. There are tons of examples, beginner blogs and boilerplate out there; look for them. This is also the usual place to do any NA-filling or transforms (e.g. normalizing), and to compute any derived features (e.g. Bathroom_to_Bedroom_Ratio, SqFt_per_Bedrooms, etc.)
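To make steps 1-3 concrete, here is a rough sketch, not a definitive implementation. It assumes each district file sits in one directory as <district>.csv, and that coordinate.csv has the columns District, Latitude and Longitude. Those column names are a guess, since the files were only posted as images; the helper names load_houses and add_coordinates are made up for illustration.

```python
from pathlib import Path

import pandas as pd

DISTRICTS = ["Duckez", "Vim", "Hustla", "Zedrim"]


def load_houses(data_dir, districts=DISTRICTS):
    """Read <district>.csv for each district and stack the rows into one dataframe."""
    frames = [
        # Tag every row with the district it came from (taken from the file name).
        pd.read_csv(Path(data_dir) / f"{name}.csv").assign(District=name)
        for name in districts
    ]
    return pd.concat(frames, axis=0, ignore_index=True)


def add_coordinates(houses, coords):
    """Attach the district's Latitude/Longitude to every house row."""
    # ASSUMPTION: coords has one row per district with the columns
    # 'District', 'Latitude', 'Longitude'. Rename to match your real file.
    recs = houses.merge(coords, on="District", how="left", validate="many_to_one")
    # Store the district label as a categorical, as suggested in step 2.
    recs["District"] = recs["District"].astype("category")
    return recs
```

Usage would look something like recs = add_coordinates(load_houses("data"), pd.read_csv("data/coordinate.csv")), where "data" is a placeholder directory. The validate="many_to_one" argument makes pandas raise an error if coordinate.csv lists a district twice, which would otherwise silently duplicate house rows.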

Now try to write your own code to do this, it's not hard. (On SO, you're supposed to post your own code attempt, not just a spec and ask for code.) Edit your own code attempt into your question, people are much more likely to answer questions showing the OP's code.

Ok, roll your sleeves up and start writing the code. Refer to blogs and SO when stuck.
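If step 4 is where you get stuck, something like the following could be a starting point. It is only a sketch: the 'Price' column name is an assumption (the question only says "house price"), the median NA-fill is one option among many, and the three rows of data are made up purely so the snippet runs.

```python
import numpy as np
import pandas as pd

FEATURES = ["Bedrooms", "Gardens", "Latitude", "Longitude"]


def to_training_arrays(recs, target="Price"):
    """Turn the merged dataframe into numeric-only NumPy arrays."""
    # Force every feature column to numeric; unparseable values become NaN.
    X = recs[FEATURES].apply(pd.to_numeric, errors="coerce")
    # Simple NA-filling: replace missing values with the column median.
    X = X.fillna(X.median())
    X_train = X.to_numpy(dtype=float)
    # ASSUMPTION: the target column is called 'Price'.
    y_train = recs[target].to_numpy(dtype=float)
    return X_train, y_train


# Made-up toy rows standing in for the merged dataframe from step 3.
recs = pd.DataFrame(
    {
        "District": ["Duckez", "Duckez", "Vim"],
        "Bedrooms": [2, np.nan, 4],
        "Gardens": [1, 0, 1],
        "Latitude": [10.5, 10.5, 11.2],
        "Longitude": [106.7, 106.7, 107.1],
        "Price": [100.0, 150.0, 300.0],
    }
)
X_train, y_train = to_training_arrays(recs)
print(X_train.shape, y_train.shape)
```

Normalizing and derived features would slot in between the fillna and to_numpy lines. Note that with only four districts, Latitude and Longitude take just four distinct values, so they act more like a district label than a continuous location.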

smci