I am trying to pre-process my data in R so that missing values are replaced using the "attribute mean for all samples belonging to the same class as the given tuple".
Missing or out-of-range values have already been coded as -1 by the data provider. I want to replace those values according to the data mining principle quoted above. The column that decides the class is "Accident severity": for each tuple with a missing attribute value, I want to fill in the mean of that attribute over all samples that have the same level of accident severity as that tuple.
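To make the goal concrete, here is a minimal sketch on a toy data frame for a single column. The column name `Speed_limit` and the values are made up for illustration and are not taken from my data set; the approach (recode -1 to NA, then use `ave()` to get the per-class mean) is just one way I imagine this could work.

```r
# Toy data: Speed_limit is a hypothetical column, -1 marks a missing value.
df <- data.frame(
  Accident_severity = c(1, 1, 1, 2, 2, 2),
  Speed_limit       = c(30, -1, 50, 60, 70, -1)
)

# Recode -1 as NA so it is left out of the mean.
x <- df$Speed_limit
x[x %in% -1] <- NA

# ave() applies FUN within each severity level and returns a vector
# aligned with the original rows.
group_mean <- ave(x, df$Accident_severity,
                  FUN = function(v) mean(v, na.rm = TRUE))

# Fill only the missing entries with the mean of their own class.
x[is.na(x)] <- group_mean[is.na(x)]
df$Speed_limit <- x
```

In this toy example the missing value in severity level 1 becomes 40 (the mean of 30 and 50) and the one in level 2 becomes 65 (the mean of 60 and 70).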
As there are multiple columns with missing values, I guess I will have to repeat the task for each column, one at a time. What R command should I use?
There are mostly two data types in my data frame: factor for the Date and Time columns, and integer for most of the other columns.
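Since only the integer columns can sensibly be averaged, I assume the factor columns should be skipped. Below is a rough sketch of repeating the replacement over every numeric column. The helper `impute_by_class` and all column names other than `Accident_severity` are hypothetical, invented here for illustration.

```r
# Toy data frame; Date, Speed_limit and Number_of_vehicles are hypothetical names.
df <- data.frame(
  Accident_severity  = c(1L, 1L, 2L, 2L),
  Date               = factor(c("01/01/2012", "02/01/2012",
                                "03/01/2012", "04/01/2012")),
  Speed_limit        = c(30L, -1L, 60L, 70L),
  Number_of_vehicles = c(-1L, 2L, 4L, -1L)
)

# Hypothetical helper: replace missing_code in every numeric column
# with the mean of that column within the same class.
impute_by_class <- function(data, class_col, missing_code = -1) {
  # Numeric/integer columns only, excluding the class column itself;
  # factor columns such as Date are left untouched.
  cols <- setdiff(names(data)[sapply(data, is.numeric)], class_col)
  for (col in cols) {
    x <- data[[col]]
    x[x %in% missing_code] <- NA
    group_mean <- ave(x, data[[class_col]],
                      FUN = function(v) mean(v, na.rm = TRUE))
    x[is.na(x)] <- group_mean[is.na(x)]
    data[[col]] <- x
  }
  data
}

imputed <- impute_by_class(df, "Accident_severity")
```

Two caveats with this sketch: the filled columns become numeric rather than integer, because a mean is generally not a whole number, and a class in which every value of a column is missing ends up with NaN in that column.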
Is there a way I can upload a subset of the data set here on Stack Overflow?
Here is the link to the reproducible data set: https://drive.google.com/file/d/0B3cafW7J7xSfSkRTYWRWMHhaU2c/edit?usp=sharing
Update 2: Now that the data set is available, please help me replace every "-1" in any of the columns with the mean of that column over all tuples that have the same value of the attribute "Accident_severity" as the tuple with the missing value.
Update 3: Please ignore the columns "X2_roadclass" and "X2_Road_type", as they are mostly blank and I am dropping them. Thanks.