I uploaded a document (.csv) to the project. There are 9 columns (9 variables). The task says that one variable (wages) must be summarised by group (mean, median, kurtosis, standard deviation), where the groups are defined by gender (variable gender) and by whether the person is married (variable marital status). For example, given this data:

wages   gender  status  ............
5000      M       NO
3000      M       Yes
4500      W       NO
2000      M       NO
3500      W       Yes
6500      M       NO
8000      W       NO
.
.
.
.

then the average wage per group should be, for example, for men with status NO: (5000 + 2000 + 6500) / 3 = 4500

wages   gender  status
4500      M       NO

What methods can I use to do this?

1 Answer

If I understand you correctly, you are looking for the average wage for each unique combination of gender and status? If so, you could use the dplyr package. For example, suppose your uploaded data is `df`:

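library(dplyr)   # group_by() and summarise()
library(e1071)   # kurtosis(); run install.packages("e1071") first if needed

# Sample data matching the example in the question
df <- data.frame(wages=c(5000, 3000, 4500, 2000, 3500, 6500, 8000),
                 gender=c('M', 'M', 'W', 'M', 'W', 'M', 'W'),
                 status=c('NO', 'YES', 'NO', 'NO', 'YES', 'NO', 'NO'))

# One row of summary statistics per gender/status combination
df_out <- df %>%
  group_by(gender, status) %>%
  summarise(avg_wage = mean(wages),
            median_val = median(wages),
            st_dev = sd(wages),
            kurt = kurtosis(wages))

df_out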

# A tibble: 4 x 6
# Groups:   gender [?]
  gender status avg_wage median_val   st_dev      kurt
  <fctr> <fctr>    <dbl>      <dbl>    <dbl>     <dbl>
1      M     NO     4500       5000 2291.288 -2.333333
2      M    YES     3000       3000       NA       NaN
3      W     NO     6250       6250 2474.874 -2.750000
4      W    YES     3500       3500       NA       NaN
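
The NA and NaN entries in the output above come from the groups that contain a single row: in R, `sd()` of one value is NA. A minimal sketch, assuming the same sample `df` as above, that adds the group size with dplyr's `n()` so such groups are easy to spot (the names `df_n` and `n_obs` are just illustrative choices, not part of the original answer):

```r
library(dplyr)

# Same sample data as in the answer above
df <- data.frame(wages = c(5000, 3000, 4500, 2000, 3500, 6500, 8000),
                 gender = c("M", "M", "W", "M", "W", "M", "W"),
                 status = c("NO", "YES", "NO", "NO", "YES", "NO", "NO"))

# Count the rows in each gender/status group alongside the statistics
df_n <- df %>%
  group_by(gender, status) %>%
  summarise(n_obs = n(),
            avg_wage = mean(wages),
            st_dev = sd(wages)) %>%
  ungroup()

# Groups with fewer than two observations have no defined standard deviation
single_obs <- filter(df_n, n_obs < 2)
single_obs
```

With the full 9-column CSV, the groups will probably be larger and these gaps may disappear, but checking `n_obs` first shows whether a missing value reflects too little data rather than a mistake in the code.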
D.sen
  • Thanks. And what about the median, kurtosis, and standard deviation? Can you help? – Dany worner Mar 25 '18 at 17:33
  • You would put all the grouped summary statistics inside the `summarise` call. See the updated answer. FYI, you'll need to install a package for the kurtosis measurement; for me it was `e1071`. – D.sen Mar 25 '18 at 17:42
  • If either answer helped you solve your problem, please upvote and accept that answer. – D.sen Mar 25 '18 at 22:00