Mutating columns to get Unique Values, transpose another column and add the percentage of those unique values

Question

I have a data set that I've subset which looks like this:

Item	Code	Percentage
10000	123	0.2
10001	134	0.98
10001	152	0.02
10002	123	0.68
10003	123	1
10002	178	0.32
10004	189	1

I want to find a way to transpose in a way where I only retain unique values from column A, Column B is dispersed into different columns according to unique values and the Percentage populates in those values. Please see the example of the data I'm looking to finalize with:

Item	123	134	152	178	189
10000	0.2	0	0	0	0
10001	0	0.98	0.02	0	0
10002	0.68	0	0	0.3	0
10003	1	0	0	0	0
10004	0	0	0	0	1

I am currently using a format the follows this "skeleton":

       df <-df %>%
       group_by(Item) %>%
       mutate(n = row_number()) %>%
       spread(Code, Percentage)

Following this structure I still get values in Column A repeating (not unique). I did load library(plyr) library(dplyr) library(tidyr) in that order. Reason why I mentioned is I read somewhere if you switch the order it works, but ended up messing up the results.

If you need more information please let me know. Thank you!

plyr has been retired for a few years now. You can do this with tidyr, which also has updated syntax from `spread` to `pivot_wider`. There are many SO questions on this same task; I'll dig up a few to flag — camille, Feb 25 '21 at 17:11
Does this answer your question? [Why does pivot\_wider either read single values as duplicates or create a wide-and-long tibble (without merging rows)?](https://stackoverflow.com/questions/64627034/why-does-pivot-wider-either-read-single-values-as-duplicates-or-create-a-wide-an) — camille, Feb 25 '21 at 17:39
Hey All! thank you so much for your help! Both options worked out for me easily! Really appreciate it! :) — guirgo__, Feb 25 '21 at 20:28

score 1 · Accepted Answer · answered Feb 25 '21 at 17:11

using tidyverse

library(tidyverse)

df <- read.table(text = "Item   Code    Percentage
10000   123 0.2
10001   134 0.98
10001   152 0.02
10002   123 0.68
10003   123 1
10002   178 0.32
10004   189 1", header = T)

pivot_wider(
  data = df, 
  id_cols = Item, 
  names_from = Code, 
  values_from = Percentage, 
  values_fill = 0
)
#> # A tibble: 5 x 6
#>    Item `123` `134` `152` `178` `189`
#>   <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 10000  0.2   0     0     0        0
#> 2 10001  0     0.98  0.02  0        0
#> 3 10002  0.68  0     0     0.32     0
#> 4 10003  1     0     0     0        0
#> 5 10004  0     0     0     0        1

^{Created on 2021-02-25 by the reprex package (v1.0.0)}

using data.table

library(data.table)
setDT(df)
dcast(data = df, formula = Item ~ Code, value.var = "Percentage", fill = 0)
#>     Item  123  134  152  178 189
#> 1: 10000 0.20 0.00 0.00 0.00   0
#> 2: 10001 0.00 0.98 0.02 0.00   0
#> 3: 10002 0.68 0.00 0.00 0.32   0
#> 4: 10003 1.00 0.00 0.00 0.00   0
#> 5: 10004 0.00 0.00 0.00 0.00   1

^{Created on 2021-02-25 by the reprex package (v1.0.0)}

score 1 · Answer 2 · answered Feb 25 '21 at 17:11

The issue is that it needs to do both grouping by 'Code' and 'Item'

library(dplyr)
library(tidyr)
df %>%
   group_by(Code, Item) %>%
   mutate(n = row_number()) %>%
   ungroup %>%
   spread(Code, Percentage, fill = 0) %>%
   select(-n)

-output

# A tibble: 5 x 6
#   Item `123` `134` `152` `178` `189`
#  <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 10000  0.2   0     0     0        0
#2 10001  0     0.98  0.02  0        0
#3 10002  0.68  0     0     0.32     0
#4 10003  1     0     0     0        0
#5 10004  0     0     0     0        1

Mutating columns to get Unique Values, transpose another column and add the percentage of those unique values

2 Answers2