0

i have the following data:

ID A B
1 4 2
2 5 3

The result I want is:

ID C
1 4
1 2
2 5
2 3
chay
  • 45
  • 6
  • Does this answer your question? [Reshaping data.frame from wide to long format](https://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format) – Martin Gal Mar 22 '22 at 22:58
  • Take a look at the `tidyr` package and its `pivot_longer()` function: `df1 %>% pivot_longer(-ID, values_to = "C") %>% select(-name)` should do the trick. – Martin Gal Mar 22 '22 at 22:59
  • `melt(df, 'ID')` – Onyambu Mar 22 '22 at 23:12
  • basically I have other columns in the input data, how can I specify the columns that I want to melt in this command *df1 %>% pivot_longer(-ID, values_to = "C") %>% select(-name)*? – chay Mar 22 '22 at 23:16
  • Instead of deselecting columns you do not want to `pivot_longer()` (here: `-ID`, you can select them with `cols=c()`: `df1 %>% pivot_longer(cols = c(A,B), values_to = "C") %>% select(-name)` – Gnueghoidune Mar 22 '22 at 23:21

4 Answers4

0

The cdata package has transformation functions that pivot and unpivot with multiple variables. What I like about the package is the control table idea that visually arranges the data in the form you want to see.

In this case, starting with a data frame df,

ID <- c(1, 2)
A <- c(4, 5)
B <- c(2, 3)
df <- data.table(ID, A, B)


# check the data 
df[]

#>   ID A B
#> 1  1 4 2
#> 2  2 5 3

Build a cdata control table. This is the basic element that allows you to unambiguously assign which data values go where. In this case, I use a new variable from to identify the columns from which values are taken and C is the new column with the desired values.

# build a control table
from <- c("col_A", "col_B")
C    <- c("A", "B")
control_table <- data.table(from, C)

# examine the result
control_table[]
#>     from C
#> 1: col_A A
#> 2: col_B B

With the control table ready, I can use rowrecs_to_blocks() from cdata to transform the data from row record (wide) form to block record (longer) form. The ID variable is repeated as many times as necessary to complete the block records.

# transform to block form
DT <- cdata::rowrecs_to_blocks(
  wideTable     = df, 
  controlTable  = control_table,
  columnsToCopy = c("ID"))
setDT(DT)

# examine the result
DT[]
#>    ID  from C
#> 1:  1 col_A 4
#> 2:  1 col_B 2
#> 3:  2 col_A 5
#> 4:  2 col_B 3

You can omit the from column, though by keeping it you can always recover the original data frame if you need to.

# omit the from column
DT[, from := NULL]
DT
#>    ID C
#> 1:  1 4
#> 2:  1 2
#> 3:  2 5
#> 4:  2 3
0

Using data.table where melt is the dplyr's equivalent of pivot_longer

dt <- data.table(ID = 1:2, A = 4:5, B = 2:3)

dt <- melt(dt, measure.vars = c("A", "B"), value.name = "C")
dt[, variable := NULL]
setorder(dt, ID) # to get the same order as your desired output

dt
#    ID C
# 1:  1 4
# 2:  1 2
# 3:  2 5
# 4:  2 3
Merijn van Tilborg
  • 5,452
  • 1
  • 7
  • 22
0

Just use by.

DT[, .(C = c(A, B)), by = ID]
#       ID     C
#    <int> <int>
# 1:     1     4
# 2:     1     2
# 3:     2     5
# 4:     2     3

Reproducible data

DT = data.table(ID = 1:2, A = 4:5, B = 2:3)
s_baldur
  • 29,441
  • 4
  • 36
  • 69
0

I used this command to answer my question :

df1 %>% pivot_longer(cols = c(A,B),  values_to = "C") %>% 
        select(-name) 

thanks to the previous comments by Martin Gal and Gnueghoidune

Hansel Palencia
  • 1,006
  • 9
  • 17
chay
  • 45
  • 6