I have split my dataframe according to a range of sub-intervals of one column of continuous data:
Data1 <- read.csv(file.choose(), header = TRUE)
# Order (ascending) by group size
Group.order <- order(Data1$GroupN)
# Assign label to data frame ordered by group
Data1.group.order <- Data1[Group.order, ]
# Set the break points of the sub-intervals we wish to split the ordered data into
# (named 'breaks' to avoid masking the base function range())
breaks <- seq(0, 300, by = 75)
# Use split() to divide the ordered data, with cut() binning the numeric
# vector GroupN at the values in 'breaks'
Split.Data1 <- split(Data1.group.order, cut(Data1.group.order$GroupN, breaks))
With the data split, I now need to find the mean of one column within each subset of the data frame, but despite a lot of effort I'm struggling.
I have been able to find the means of multiple columns across the whole split data frame using lapply, but not the mean of one column on its own.
Any help would be appreciated.
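For reference, one possible way to get the per-subset mean of a single column is sketched below. It is only a sketch under assumptions: the stand-in data frame, the column name x and the name breaks are hypothetical and not taken from my real data; only GroupN and the 0 to 300 by 75 intervals come from the code above.

```r
# Hypothetical stand-in for the real CSV: GroupN is the continuous column
# used for splitting, x is the (assumed) column whose mean is wanted.
Data1 <- data.frame(GroupN = seq(5, 295, by = 10), x = 1:30)

# Break points of the sub-intervals: (0,75], (75,150], (150,225], (225,300]
breaks <- seq(0, 300, by = 75)

# Split the data frame by interval, as in the question
Split.Data1 <- split(Data1, cut(Data1$GroupN, breaks))

# Mean of the single column x within each subset
means.split <- sapply(Split.Data1, function(d) mean(d$x))

# Equivalent result without splitting the data frame first
means.tapply <- tapply(Data1$x, cut(Data1$GroupN, breaks), mean)

print(means.split)
```

sapply() returns a named numeric vector with one entry per interval, whereas lapply() would return the same values as a list. The tapply() form skips the intermediate split entirely.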
EDIT: I am an R newbie, so what I really want to do is look at the distribution of variable x for each subset of the data frame, i.e. x-axis = 0-75, 75-150, 150-225, 225-300, y-axis = variable x. My plan was to split the data, find the mean value of variable x for each subset of the data frame, then plot variable x against the intervals I split the data frame by. However, I'm sure there's a better way of doing this!
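A rough sketch of the kind of plot described in the edit, assuming a box plot per interval is an acceptable way to show the distribution. The stand-in data, the column name x and the helper column interval are hypothetical; only GroupN and the interval boundaries come from the code above.

```r
# Hypothetical stand-in for the real CSV
Data1 <- data.frame(GroupN = seq(5, 295, by = 10), x = 1:30)

# Label every row with the interval its GroupN value falls into
breaks <- seq(0, 300, by = 75)
Data1$interval <- cut(Data1$GroupN, breaks)

# One box per interval: x-axis = GroupN interval, y-axis = variable x
bp <- boxplot(x ~ interval, data = Data1,
              xlab = "GroupN interval", ylab = "x")

# Optional: overlay the per-interval means as points on the same plot
means <- tapply(Data1$x, Data1$interval, mean)
points(seq_along(means), means, pch = 19)
```

This avoids split() altogether, because the formula interface groups the rows by the factor that cut() produces.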