
Is there a function, or some code, that turns the values in a data frame into columns?

This is how I create the data frame:

number <- c("no.1","no.2","no.3","no.4","no.5","no.6","no.7","no.8","no.9","no.10")

tp1 <- c("car","car","bicycle","car","walk","walk","bus","subway","subway","subway")

tp2 <- c("bicycle",NA,"bus",NA,"subway",NA,"walk",NA,NA,NA)

tp3 <- c("walk",NA,"subway",NA,NA,"bus",NA,NA,NA,NA)

tp4 <- c("bus","walk",NA,NA,NA,NA,NA,NA,NA,NA)

tp5 <- c("subway",NA,NA,NA,NA,NA,NA,NA,NA,NA)

transport <- data.frame(number,tp1,tp2,tp3,tp4,tp5)

I want to make a new data frame like the one shown in the screenshot.

Please give me some advice :)

[Screenshot: the data frame I want to make]


2 Answers


You can reshape the data to long format, add a dummy column, and then reshape it back to wide format.

library(dplyr)
library(tidyr)

transport %>%
  pivot_longer(cols = -number, values_drop_na = TRUE) %>%
  mutate(n = 'yes') %>%
  select(-name) %>%
  pivot_wider(names_from = value, values_from = n, names_prefix = 'use_')


#  number use_car use_bicycle use_walk use_bus use_subway
#   <chr>  <chr>   <chr>       <chr>    <chr>   <chr>     
# 1 no.1   yes     yes         yes      yes     yes       
# 2 no.2   yes     NA          yes      NA      NA        
# 3 no.3   NA      yes         NA       yes     yes       
# 4 no.4   yes     NA          NA       NA      NA        
# 5 no.5   NA      NA          yes      NA      yes       
# 6 no.6   NA      NA          yes      yes     NA        
# 7 no.7   NA      NA          yes      yes     NA        
# 8 no.8   NA      NA          NA       NA      yes       
# 9 no.9   NA      NA          NA       NA      yes       
#10 no.10  NA      NA          NA       NA      yes    
Ronak Shah
  • Thanks for the advice :) This was example data. When I apply it to my real data I get this message: Values are not uniquely identified; output will contain list-cols. * Use `values_fn = list` to suppress this warning. * Use `values_fn = length` to identify where the duplicates arise * Use `values_fn = {summary_fun}` to summarise duplicates. How can I solve it? – plusICON Jun 19 '20 at 13:25
  • @plusICON You should provide example data that is similar to your real data so that we can work on the same data as you. Here are some things you can try: 1) Keep only the required columns and remove the rest using `select` (the same as `select(-name)` in the answer). 2) If the error still persists, try to create a unique column with `group_by(name) %>% mutate(row = row_number())`. Here is a reference post: https://stackoverflow.com/questions/58837773/pivot-wider-issue-values-in-values-from-are-not-uniquely-identified-output-w – Ronak Shah Jun 19 '20 at 13:37
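A minimal sketch of one way to avoid that message, assuming it arises because the same mode appears more than once for one person. The `dup` data frame below is made up for illustration and is not the asker's real data. Dropping the repeated rows with `distinct()` before widening leaves one value per cell:

```r
library(dplyr)
library(tidyr)

# Hypothetical data: person no.1 lists "car" twice
dup <- data.frame(
  number = c("no.1", "no.2"),
  tp1    = c("car", "walk"),
  tp2    = c("car", NA)
)

res <- dup %>%
  pivot_longer(cols = -number, values_drop_na = TRUE) %>%
  select(-name) %>%
  distinct() %>%            # keep one row per (number, value) pair
  mutate(n = "yes") %>%
  pivot_wider(names_from = value, values_from = n, names_prefix = "use_")

res
```

If the duplicates in the real data mean something else (for example repeated `number` values for different people), a unique row identifier as suggested in the comment above may be the better fit.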

The "Values are not uniquely identified" error is not a problem in data.table, although I guess it has been solved in tidyr 1.1.0.

Anyway, here is a data.table solution:


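library(data.table)

transport <- as.data.table(transport)

# Reshape to long format: one row per (number, transport mode), dropping NAs
transport_long <- melt(transport,
                       id.vars = "number",
                       measure.vars = patterns("tp"),
                       variable.name = "tp",
                       value.name = "transport_mode",
                       na.rm = TRUE)

# Prefix each mode with "use" and add a constant "yes" column (by reference)
transport_long[, c("transport_mode", "yes") := .(paste0("use", transport_mode), "yes")]

# Drop the "tp" column (column 2) and reshape back to wide format
dcast(transport_long[, -2], number ~ transport_mode, drop = FALSE, value.var = "yes")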


number usebicycle usebus usecar usesubway usewalk
 1:   no.1        yes    yes    yes       yes     yes
 2:  no.10       <NA>   <NA>   <NA>       yes    <NA>
 3:   no.2       <NA>   <NA>    yes      <NA>     yes
 4:   no.3        yes    yes   <NA>       yes    <NA>
 5:   no.4       <NA>   <NA>    yes      <NA>    <NA>
 6:   no.5       <NA>   <NA>   <NA>       yes     yes
 7:   no.6       <NA>    yes   <NA>      <NA>     yes
 8:   no.7       <NA>    yes   <NA>      <NA>     yes
 [ reached getOption("max.print") -- omitted 2 rows ]
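If the real data repeats a mode within a row, one option is to deduplicate the long table before calling `dcast()`, so that every cell still holds a single "yes". This is only a sketch on made-up data (`dup`, `long` and `wide` are illustrative names), not a statement about what the asker's real data looks like:

```r
library(data.table)

# Hypothetical data: no.1 lists "car" twice
dup <- data.table(
  number = c("no.1", "no.2"),
  tp1    = c("car", "walk"),
  tp2    = c("car", NA)
)

long <- melt(dup, id.vars = "number", measure.vars = patterns("tp"),
             variable.name = "tp", value.name = "transport_mode", na.rm = TRUE)
long[, c("transport_mode", "yes") := .(paste0("use", transport_mode), "yes")]

# Keep only the columns needed, drop repeated (number, mode) rows, then cast
wide <- dcast(unique(long[, .(number, transport_mode, yes)]),
              number ~ transport_mode, value.var = "yes")
wide
```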


  
Eyayaw