I have quarterly unemployment data in the format below. My data frame runs up to 2018, but I will show only two years as an example.

Year       Unemployement

1997Q3       1914 
1997Q4       1697 
1998Q1       1702 
1998Q2       1645 
1998Q3       1742 
1998Q4       1605

What code can I use to tidy the Year column and get the output below? Mainly, I want the unemployment number per year, calculated as the mean of that year's quarterly values, for 1997 and 1998 (and for the other years in my data frame). In the final version I would like a single Unemployement value per year, which should be the average of all available quarters.

Year       Unemployement

1997         1805.50

1998         1673.50 

Thank you!

Chris
  • If the year and quarter information is combined in one column, you could try `aggregate(Unemployement~substr(Year, 1, 4), df, mean)`. – Ronak Shah May 06 '19 at 11:53
  • @RonakShah, I really do not understand how I can calculate the mean for each year and put the result in the Unemployement column. – Chris May 06 '19 at 18:07
  • Did you try what I suggested above? What output did you get? If it didn't work, I would suggest updating your post with `dput(data)`. – Ronak Shah May 06 '19 at 23:11

1 Answer

##Data entry

library(tidyverse)

df<- tribble(
~Year,~Quarter,~Unemployement,
1997,"Q3",1914,
1997,"Q4",1697,
1998,"Q1",1702,
1998,"Q2",1645,
1998,"Q3",1742,
1998,"Q4",1605
)


##Solution

df%>%
group_by(Year)%>%
summarise(mean_year = mean(Unemployement))


# A tibble: 2 x 2
   Year mean_year
  <dbl>     <dbl>
1  1997     1806.
2  1998     1674.

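## 2nd Version (first separate the Year-column)
## Assumes df holds the original data, where Year contains values such as "1997Q3"

df%>%
  # sep = 4 splits after the 4th character; the default separator finds nothing to split on in "1997Q3"
  separate(Year, c("Year", "Quarter"), sep = 4)%>%
  group_by(Year)%>%
  summarise(mean_year = mean(Unemployement))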
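
For the layout shown in the question, where Year holds combined values such as `1997Q3`, the whole approach can be sketched end to end as below. This is a minimal sketch rather than a definitive solution: the names `unemp` and `yearly` are hypothetical, the six rows are the sample values from the question, and it assumes every Year value starts with a four-digit year.

```r
library(dplyr)
library(tidyr)

# Sample rows from the question; `unemp` is a hypothetical name
# standing in for the real data frame.
unemp <- data.frame(
  Year = c("1997Q3", "1997Q4", "1998Q1", "1998Q2", "1998Q3", "1998Q4"),
  Unemployement = c(1914, 1697, 1702, 1645, 1742, 1605),
  stringsAsFactors = FALSE
)

yearly <- unemp %>%
  # A numeric sep is a character position: "1997Q3" becomes "1997" and "Q3"
  separate(Year, into = c("Year", "Quarter"), sep = 4) %>%
  group_by(Year) %>%
  # One row per year, holding the mean of that year's quarters
  summarise(Unemployement = mean(Unemployement))

print(yearly)
```

With the sample rows this gives 1805.5 for 1997 and 1673.5 for 1998, matching the values requested in the question. Note that 1997 is averaged over only the two quarters present in the data.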
TobKel
  • Thank you very much for the suggestion. But can you please show how to create the three columns (Year, Quarter and Unemployement) from my data frame, rather than typing in the years and the unemployment values manually? Thank you! – Chris May 06 '19 at 12:50
  • Do you want to separate the "Year" column into two columns ("Year" and "Quarter")? – TobKel May 06 '19 at 13:06
  • Yes, I would like to do that. It would then be easier for me to use your solution, as my table is very long :) – Chris May 06 '19 at 13:15
  • It's easy: you have to insert `separate(Year, c("Year", "Quarter"))%>%` into the second line of this code – TobKel May 06 '19 at 13:28
  • I've updated the code in the answer for you. See: `## 2nd Version` – TobKel May 06 '19 at 13:29
  • I have used the separate function, but something goes wrong when I try to calculate the mean and put the value into the Unemployement column. The problem occurs in group_by. It shows: Error in UseMethod("group_by_") : no applicable method for 'group_by_' applied to an object of class "function" – Chris May 06 '19 at 16:29
  • And then it tells me: Error in group_by(Year) : object 'Year' not found – Chris May 06 '19 at 16:31
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/192920/discussion-between-chris-and-tobkel). – Chris May 06 '19 at 17:24
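
A note on the two errors quoted in the comments above. A plausible cause of the first, assuming no data frame was ever assigned to the name `df` in that session, is that `df` then refers to the base R function `stats::df` (the density of the F distribution), so `group_by()` receives a function instead of a data frame. The second message is what `group_by(Year)` typically gives when it is run on its own, for example when the `%>%` at the end of the previous line is missing. The sketch below reproduces the first situation only; `result` is a hypothetical name.

```r
library(dplyr)

# With no data frame assigned to `df`, the name resolves to stats::df,
# the density function of the F distribution.
print(is.function(stats::df))

# Piping a function into group_by() fails, because group_by() has no
# method for objects of class "function". tryCatch() captures the error
# instead of stopping the script.
result <- tryCatch(
  stats::df %>% group_by(Year),
  error = function(e) e
)
print(inherits(result, "error"))
```

If this is what happened, checking `class(df)` before running the pipeline, or storing the data under a name that does not clash with a built-in function, should show whether the data frame was actually loaded.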