Given the following data from a CSV file, I want to draw a regression plot in Python (Matplotlib/seaborn) of the mean price of 2-bedroom properties over time.

I have managed to get the mean with groupby. However, after reading solutions on Stack Overflow and trying them, I keep running into one data-related problem after another. Most of the errors complain either that a value cannot be converted (for example to a string) or that a column is not in the index.

     Bedrooms  Price      Date
0    2.0       NaN        3/9/2016
1              1480000.0  3/12/2016
2    2.0       1035000.0  4/2/2016
3    3.0       NaN        4/2/2016
4    3.0       1465000.0  4/2/2016

# Load the Bedrooms, Price and Date columns from the CSV into a dataframe df
%matplotlib inline
import pandas as pd 
import datetime
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv('justtesting.csv', nrows=50, usecols=['Price','Date','Bedrooms']) 
df = df.dropna(axis=0)
df['Date'] = pd.to_datetime(df.Date)
df.sort_values("Date", axis = 0, ascending = True, inplace = True)
df2 = df[df['Bedrooms'] == 2].groupby(["Date"]).agg(['sum'])

df2.head()
df2.info()
sns.set()
g=sns.lmplot(x="Date", y="Price", data=df2, lowess=True)
letsintegreat

1 Answer

# Assume the question's data is already loaded (Bedrooms, Price and Date columns)
%matplotlib inline
import pandas as pd 
import datetime
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

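df = x.copy()  # x is assumed to be the DataFrame read from the CSV in the question
df = df.dropna(axis=0)  # drop rows with missing values; the axis must be passed by keyword in recent pandas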
df.sort_values("Date", axis = 0, ascending = True, inplace = True)
df2 = df[df['Bedrooms'] == 2].groupby(["Date", 'Bedrooms'], as_index=False).sum()
df2.head()
df2.info()
sns.set()
g=sns.lmplot(x='Date', y="Price", data=df2, lowess=True)

By default, groupby turns the grouped-by columns into the index. Passing as_index=False fixes that by keeping them as regular columns. However, seaborn's lmplot requires numeric (float) values, so a datetime Date column cannot be used as x directly. More info can be found on this question
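One possible way to handle the float requirement is sketched below. It is an illustration, not a definitive fix: it takes the mean per date (as the question asks) and stores each date as a day ordinal in an extra column so the regression has a numeric x. The sample rows, the helper names mean_price_by_date and plot_mean_price, the DateNum column, and the month/day/year date format are all assumptions made for this example. It uses sns.regplot with a plain linear fit instead of lmplot with lowess=True, which additionally needs statsmodels.

```python
import datetime as dt
import pandas as pd

# Illustrative rows in the question's layout (made-up values, not the real CSV).
sample = pd.DataFrame({
    "Bedrooms": [2.0, 2.0, 2.0, 3.0, 2.0],
    "Price": [1000000.0, 1200000.0, float("nan"), 1465000.0, 1035000.0],
    "Date": ["3/9/2016", "3/9/2016", "3/12/2016", "4/2/2016", "4/2/2016"],
})

def mean_price_by_date(df, bedrooms=2):
    """Hypothetical helper: mean Price per Date for one bedroom count."""
    out = df.dropna(subset=["Price", "Bedrooms"]).copy()
    # Assumption: dates are month/day/year; change the format if they are day-first.
    out["Date"] = pd.to_datetime(out["Date"], format="%m/%d/%Y")
    out = (out[out["Bedrooms"] == bedrooms]
           .groupby("Date", as_index=False)["Price"].mean())
    # Regression needs numbers on the x axis, so keep each date as a day ordinal.
    out["DateNum"] = out["Date"].map(lambda d: d.toordinal()).astype(float)
    return out

def plot_mean_price(df2):
    """Hypothetical helper: linear fit of mean price against the numeric date."""
    import seaborn as sns
    from matplotlib.ticker import FuncFormatter
    ax = sns.regplot(x="DateNum", y="Price", data=df2)
    # Show the ordinals as readable dates on the axis.
    ax.xaxis.set_major_formatter(
        FuncFormatter(lambda v, _: dt.date.fromordinal(int(v)).isoformat()))
    ax.set_xlabel("Date")
    return ax

df2 = mean_price_by_date(sample)
```

Calling plot_mean_price(df2) in a notebook should then draw the scatter with a fitted line. With the real file, pass the DataFrame from pd.read_csv instead of sample.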

Suraj Motaparthy