I need to generate a bar plot of total absentee hours by month from a CSV data set. I've tried a lot of variations, but this is what I currently have:

df <- read.csv("Absenteeism_at_work.csv", sep = ";", header = TRUE)

tabledata <- table(df$Absenteeism.time.in.hours, df$Month.of.absence)

barplot(tabledata[, -1], main = "Absent Hours by Month",
    xlab = "Month",
    ylab = "Total Hours Absent",
    col = "Red")

The bar plot currently being generated

However, I believe this is only giving me the frequency by month. I need to put the sum of df$Absenteeism.time.in.hours on the y axis, without using ggplot. Any advice on how to do that would be appreciated.

Data set for reference

camille
Potato Joe
  • Please add a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610). That way you can help others to help you! – dario Feb 10 '20 at 15:56
  • `table` gives frequencies. If you want sums, you need to calculate those instead; those sorts of tasks are covered pretty extensively on SO, such as [here](https://stackoverflow.com/q/1660124/5325862) and [here](https://stackoverflow.com/q/9847054/5325862) – camille Feb 10 '20 at 16:10

1 Answer

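You should use tapply (table apply) instead of table. table only counts how many rows fall into each combination of values, which is why you are seeing frequencies. tapply splits the first vector ("absenteeism in hours") into groups defined by the second (the month) and then applies a function to each group; in this case, we want the sum of each group.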

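# read.csv() replaces spaces in column names with dots by default, so the
# columns are Absenteeism.time.in.hours and Month.of.absence, as in the question.
tabledata <- tapply(as.numeric(as.character(df$Absenteeism.time.in.hours)),
                    as.numeric(as.character(df$Month.of.absence)),
                    sum)

# [-1] drops the first group, mirroring tabledata[,-1] in the question's code
barplot(tabledata[-1], main="Absent Hours by Month",
    xlab="Month",
    ylab="Total Hours Absent",
    col="Red")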
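
To see the difference between counting and summing without needing the CSV, here is a minimal sketch on a made-up data frame. The `toy` data and its values are assumptions for illustration only; the column names simply mirror those in the question. It also shows base R's `aggregate` as one possible alternative to `tapply` if you would rather get the sums back as a data frame.

```r
# Toy data standing in for the real CSV (values are invented for illustration)
toy <- data.frame(
  Month.of.absence          = c(1, 1, 2, 2, 2, 3),
  Absenteeism.time.in.hours = c(4, 8, 2, 2, 16, 8)
)

# table() counts rows per month: 2, 3 and 1 rows for months 1, 2 and 3
counts <- table(toy$Month.of.absence)

# tapply() sums the hours within each month: 12, 20 and 8
totals <- tapply(toy$Absenteeism.time.in.hours, toy$Month.of.absence, sum)

# aggregate() computes the same sums but returns a data frame
totals_df <- aggregate(Absenteeism.time.in.hours ~ Month.of.absence,
                       data = toy, FUN = sum)

# The named vector from tapply() can be passed straight to barplot();
# the names (month numbers) become the bar labels
barplot(totals, main = "Absent Hours by Month",
        xlab = "Month", ylab = "Total Hours Absent", col = "red")
```

With the `aggregate` result you would plot `totals_df$Absenteeism.time.in.hours` and pass `names.arg = totals_df$Month.of.absence` to label the bars.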

Allan Cameron