I am working with panel data and I am looking to summarize information for each entity (ID) in different periods of time.
I am fairly new to R, but I am certain this can be done rather easily.
My data is formatted like this:
mydata <- structure(list(company_id = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3,
3, 3, 3, 3, 3), A = c(3, 3, 3, 3, 4, 4, 4, 4, 4, 6, 6, 6, 6,
6, 6, 6), B = c(3, 6, 9, 12, 4, 8, 12, 16, 20, 6, 12, 18, 24,
30, 36, 42)), .Names = c("company_id", "A", "B"), row.names = c(NA,
-16L), class = c("tbl_df", "tbl", "data.frame"))
# A tibble: 16 x 2
company_id A
<dbl> <dbl>
1 1 3
2 1 3
3 1 3
4 1 3
5 2 4
6 2 4
7 2 4
8 2 4
9 2 4
10 3 6
11 3 6
12 3 6
13 3 6
14 3 6
15 3 6
16 3 6
My desired output would be to add a column B such that:
╔════════════╦═══╦════╗
║ company_id ║ A ║ B ║
╠════════════╬═══╬════╣
║ 1 ║ 3 ║ 3 ║
║ 1 ║ 3 ║ 6 ║
║ 1 ║ 3 ║ 9 ║
║ 1 ║ 3 ║ 12 ║
║ 2 ║ 4 ║ 4 ║
║ 2 ║ 4 ║ 8 ║
║ 2 ║ 4 ║ 12 ║
║ 2 ║ 4 ║ 16 ║
║ 2 ║ 4 ║ 20 ║
║ 3 ║ 6 ║ 6 ║
║ 3 ║ 6 ║ 12 ║
║ 3 ║ 6 ║ 18 ║
║ 3 ║ 6 ║ 24 ║
║ 3 ║ 6 ║ 30 ║
║ 3 ║ 6 ║ 36 ║
║ 3 ║ 6 ║ 42 ║
╚════════════╩═══╩════╝
What is the most efficient way of doing this?