i have the following data:
ID | A | B |
---|---|---|
1 | 4 | 2 |
2 | 5 | 3 |
The result I want is:
ID | C |
---|---|
1 | 4 |
1 | 2 |
2 | 5 |
2 | 3 |
i have the following data:
ID | A | B |
---|---|---|
1 | 4 | 2 |
2 | 5 | 3 |
The result I want is:
ID | C |
---|---|
1 | 4 |
1 | 2 |
2 | 5 |
2 | 3 |
The cdata package has transformation functions that pivot and unpivot with multiple variables. What I like about the package is the control table idea that visually arranges the data in the form you want to see.
In this case, starting with a data frame df
,
ID <- c(1, 2)
A <- c(4, 5)
B <- c(2, 3)
df <- data.table(ID, A, B)
# check the data
df[]
#> ID A B
#> 1 1 4 2
#> 2 2 5 3
Build a cdata control table. This is the basic element that allows you to unambiguously assign which data values go where. In this case, I use a new variable from
to identify the columns from which values are taken and C
is the new column with the desired values.
# build a control table
from <- c("col_A", "col_B")
C <- c("A", "B")
control_table <- data.table(from, C)
# examine the result
control_table[]
#> from C
#> 1: col_A A
#> 2: col_B B
With the control table ready, I can use rowrecs_to_blocks()
from cdata to transform the data from row record (wide) form to block record (longer) form. The ID
variable is repeated as many times as necessary to complete the block records.
# transform to block form
DT <- cdata::rowrecs_to_blocks(
wideTable = df,
controlTable = control_table,
columnsToCopy = c("ID"))
setDT(DT)
# examine the result
DT[]
#> ID from C
#> 1: 1 col_A 4
#> 2: 1 col_B 2
#> 3: 2 col_A 5
#> 4: 2 col_B 3
You can omit the from
column, though by keeping it you can always recover the original data frame if you need to.
# omit the from column
DT[, from := NULL]
DT
#> ID C
#> 1: 1 4
#> 2: 1 2
#> 3: 2 5
#> 4: 2 3
Using data.table
where melt
is the dplyr
's equivalent of pivot_longer
dt <- data.table(ID = 1:2, A = 4:5, B = 2:3)
dt <- melt(dt, measure.vars = c("A", "B"), value.name = "C")
dt[, variable := NULL]
setorder(dt, ID) # to get the same order as your desired output
dt
# ID C
# 1: 1 4
# 2: 1 2
# 3: 2 5
# 4: 2 3
Just use by
.
DT[, .(C = c(A, B)), by = ID]
# ID C
# <int> <int>
# 1: 1 4
# 2: 1 2
# 3: 2 5
# 4: 2 3
Reproducible data
DT = data.table(ID = 1:2, A = 4:5, B = 2:3)
I used this command to answer my question :
df1 %>% pivot_longer(cols = c(A,B), values_to = "C") %>%
select(-name)
thanks to the previous comments by Martin Gal and Gnueghoidune