Manipulating data.frames

Question

I have a sample survey sheet with demographic data. One of the columns is country (a factor) and another is annual income. I need to calculate the average income for each country and store the result in a new data.frame with the country and its corresponding mean. It should be simple, but I am lost. The data looks something like this:

Country  Income($) Education ... ... ...
1. USA    90000      Phd
2. UK     94000      Undergrad
3. USA    94000      Highschool
4. UK     87000      Phd
5. Russia 77000      Undergrad
6. Norway 60000      Masters
7. Korea  90000      Phd
8. USA    110000     Masters
.
.

I need a final result like:

USA   UK    Russia ...
98000 90000 75000

Thank You.

The downvote is not from me, but please [read this](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) and edit your post; as it stands, it will likely be closed. – user1317221_G, Feb 16 '13 at 19:16
@user1317221_G, does it look better now, if that's what you meant? – 700resu, Feb 16 '13 at 19:28
The answer to this question is in almost every R tutorial I've seen. Take the time to go through one of them completely and you'll save yourself an immense amount of time in the long run. – N8TRO, Feb 16 '13 at 19:47
@NathanG is right. I would take some time to google and familiarise yourself with `ddply` and `aggregate` in particular; there are a lot of great blogs about them, and they are commonly used tools. – user1317221_G, Feb 16 '13 at 19:49

user1317221_G · Accepted Answer · 2013-02-16T20:50:44.463

5

Example data:

dat <- read.table(text="Country  Income Education 
 USA    90000      Phd
 UK     94000      Undergrad
 USA    94000      Highschool
 UK     87000      Phd
 Russia 77000      Undergrad
 Norway 60000      Masters
 Korea  90000      Phd
 USA    110000     Masters",header=TRUE)

You can do what you want with plyr. If your data is called dat:

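library(plyr)
# One row per country; the unnamed mean ends up in a column called V1
newdf <- ddply(dat, .(Country), function(x) mean(x$Income))

# Alternative that names the mean column Income:
# newdf <- ddply(dat, .(Country), function(x) data.frame(Income = mean(x$Income)))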

Or with aggregate from base R:

 newdf <- aggregate(Income ~ Country, data = dat, FUN = mean)

For the output you show at the end, maybe tapply?

tapply(dat$Income, dat$Country, mean)
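As a rough sketch of how the `tapply` result relates to the data.frame the question asks for: `tapply` gives back a named array with one mean per country, which can be turned into a two-column data.frame afterwards. The names `means` and `result_df` are only illustrative and not part of the original answer.

```r
# Same sample data as in the answer above
dat <- read.table(text = "Country Income Education
USA 90000 Phd
UK 94000 Undergrad
USA 94000 Highschool
UK 87000 Phd
Russia 77000 Undergrad
Norway 60000 Masters
Korea 90000 Phd
USA 110000 Masters", header = TRUE)

# tapply returns a named 1-d array: one mean income per country
means <- tapply(dat$Income, dat$Country, mean)

# Look up a single country by its name
usa_mean <- means["USA"]

# Rebuild a two-column data.frame (result_df is a hypothetical name)
result_df <- data.frame(Country = names(means),
                        Income  = as.vector(means))
```

If a data.frame is the goal from the start, `aggregate` gets there in one step; `tapply` is mainly handy when a compact named vector is enough.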

edited Feb 16 '13 at 20:50

answered Feb 16 '13 at 19:33

user1317221_G


Thanks. I have a question though. I tried sorting the result with `newdf <- newdf[order(Income), ]`, but it does not seem to work. It says object "Income" not found. Does newdf have a different structure? I also tried `newdf <- newdf[, order(Income)]`. – 700resu Feb 16 '13 at 20:29
I think you probably want to do something like this: `newdf[with(newdf, order(Income)), ]`. Check [this post](http://stackoverflow.com/a/1296745/1317221) as well. I also added an extra `ddply` line to the answer to help you get a `newdf` with the mean column called `Income`. – user1317221_G Feb 16 '13 at 20:48
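To illustrate the sorting point from the comments: `order(Income)` fails because `Income` is a column inside `newdf`, not a variable in the workspace, so it has to be reached through the data.frame. The sketch below assumes `newdf` was built with the `aggregate` call from the answer (so the mean column is called `Income`); `sorted_df`, `sorted_with` and `sorted_desc` are illustrative names only.

```r
# Same sample data as in the answer above
dat <- read.table(text = "Country Income Education
USA 90000 Phd
UK 94000 Undergrad
USA 94000 Highschool
UK 87000 Phd
Russia 77000 Undergrad
Norway 60000 Masters
Korea 90000 Phd
USA 110000 Masters", header = TRUE)

newdf <- aggregate(Income ~ Country, data = dat, FUN = mean)

# Refer to the column through the data.frame, then reorder the rows
sorted_df <- newdf[order(newdf$Income), ]

# Equivalent form using with(), as suggested in the comment
sorted_with <- newdf[with(newdf, order(Income)), ]

# Highest mean first: negate the numeric column
sorted_desc <- newdf[order(-newdf$Income), ]
```

Note that the row index goes before the comma; `newdf[, order(Income)]` would try to reorder columns instead of rows.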
