I am quite new to R, and this post might be a duplicate of https://stackoverflow.com/questions/58837773/pivot-wider-issue-values-in-values-from-are-not-uniquely-identified-output-wpost, which however does not resolve my doubts (I am not entirely sure it is exactly the same problem).
My tibble looks like this:
library(tidyverse)  # assumed: provides tibble(), %>%, pivot_longer(), pivot_wider(), map()

tibs <- tibble(
  Brand     = c("Brand1", "Brand2", "Brand1", "Brand2", "Brand3", "Brand4", "Brand3", "Brand4"),
  Category  = c("Cat1", "Cat1", "Cat1", "Cat1", "Cat2", "Cat2", "Cat2", "Cat2"),
  share_1   = c(0.2, 0.8, 0.21, 0.79, 0.5, 0.5, NA, NA),
  share_2   = c(0.3, 0.7, 0.3, 0.7, NA, NA, 0.6, 0.4),
  share_3   = c(NA, NA, 0.21, 0.79, 0.6, 0.4, NA, NA),
  mktsize_1 = c(100, 100, 200, 200, 100, 100, NA, NA),
  mktsize_2 = c(200, 200, NA, NA, NA, NA, 200, 200),
  mktsize_3 = c(NA, NA, 300, 300, 300, 300, NA, NA),
  Type      = c("Q", "Q", "P", "P", "Q", "Q", "P", "P")
)
The output that I want is exactly the Bobby tibble below:
Bobby <- tibs %>%
  pivot_longer(cols = share_1:mktsize_3,
               names_to = c(".value", "year"),
               names_sep = "_") %>%
  pivot_wider(names_from = Type,
              values_from = c(share, mktsize))
Problem: when I run similar code (the same idea) on the real dataset, I get the following warning:
pivoted <- renamed %>%
  pivot_longer(cols = c(Share_2012:Share_2021, Unit_2012:Unit_2021),
               names_to = c(".value", "Year"),
               names_sep = "_") %>%
  rename(Market_Size = Unit) %>%
  pivot_wider(names_from = Currency_Conversion,
              values_from = c(Share, Market_Size))
Warning message:
Values from `Market_Size` and `Share` are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = {summary_fun}` to summarise duplicates.
* Use the following dplyr code to identify duplicates.
{data} %>%
dplyr::group_by(Geography, Industry, Category, Subcategory, NACE_mapping, Hierarchy_level, Data_Type, GBO_BI,
Current_Constant, Measure, Year, Currency_Conversion) %>%
dplyr::summarise(n = dplyr::n(), .groups = "drop") %>%
dplyr::filter(n > 1L)
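For comparison, here is a minimal sketch of the same duplicate check applied to the toy tibble, assuming the grouping columns are the id columns left after pivot_longer (Brand, Category, year) plus the names_from column (Type). The names long and dups are only illustrative. On the toy data it finds no duplicated groups, which would be consistent with Bobby pivoting without a warning:

```r
library(dplyr)
library(tidyr)

# Toy data from the question
tibs <- tibble(
  Brand     = c("Brand1", "Brand2", "Brand1", "Brand2", "Brand3", "Brand4", "Brand3", "Brand4"),
  Category  = c("Cat1", "Cat1", "Cat1", "Cat1", "Cat2", "Cat2", "Cat2", "Cat2"),
  share_1   = c(0.2, 0.8, 0.21, 0.79, 0.5, 0.5, NA, NA),
  share_2   = c(0.3, 0.7, 0.3, 0.7, NA, NA, 0.6, 0.4),
  share_3   = c(NA, NA, 0.21, 0.79, 0.6, 0.4, NA, NA),
  mktsize_1 = c(100, 100, 200, 200, 100, 100, NA, NA),
  mktsize_2 = c(200, 200, NA, NA, NA, NA, 200, 200),
  mktsize_3 = c(NA, NA, 300, 300, 300, 300, NA, NA),
  Type      = c("Q", "Q", "P", "P", "Q", "Q", "P", "P")
)

# Long format: one row per Brand/Category/Type/year
long <- tibs %>%
  pivot_longer(cols = share_1:mktsize_3,
               names_to = c(".value", "year"),
               names_sep = "_")

# Same pattern as the check printed in the warning:
# count rows per key combination and keep those that occur more than once
dups <- long %>%
  group_by(Brand, Category, year, Type) %>%
  summarise(n = n(), .groups = "drop") %>%
  filter(n > 1L)

nrow(dups)  # number of duplicated key combinations
```

Running the check from the warning on the real data in the same way (after the pivot_longer step) should list the exact key combinations that occur more than once.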
Is it because I have several 100 values in the market shares? And, most importantly, is this problematic? Some entries in the resulting list-columns are NULL, which I could replace with missing values as follows:
pivoted <- renamed %>%
  pivot_longer(cols = c(Share_2012:Share_2021, Unit_2012:Unit_2021),
               names_to = c(".value", "Year"),
               names_sep = "_") %>%
  rename(Market_Size = Unit) %>%
  pivot_wider(names_from = Currency_Conversion,
              values_from = c(Share, Market_Size),
              values_fn = list) %>%
  select(-c(Share_NA, Market_Size_NA)) %>%
  mutate(across(where(is.list), map, `%||%`, NA))
However, I do not understand why the tibble Bobby does not give me this type of problem, whereas the real one does...
I tried to read through the warning and check the post above, but I am still a bit confused about what I should do (I am not even sure I should do anything!)
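To probe the difference, here is a rough sketch that tries to reproduce the situation on a reduced version of the toy data. The repeated row is a hypothetical addition (it is not in my real data as far as I know), and toy, toy_dup, long_dup, dups and wide_dup are illustrative names. The idea is that once the same Brand/Category/Type combination occurs twice, a single output cell has to hold two values, so pivot_wider falls back to list-columns:

```r
library(dplyr)
library(tidyr)

# Reduced toy data: only Brand1/Brand2 and the share columns
toy <- tibble(
  Brand    = c("Brand1", "Brand2", "Brand1", "Brand2"),
  Category = "Cat1",
  share_1  = c(0.2, 0.8, 0.21, 0.79),
  share_2  = c(0.3, 0.7, 0.3, 0.7),
  Type     = c("Q", "Q", "P", "P")
)

# Hypothetical: repeat the first row so that Brand1/Cat1/Q occurs twice
toy_dup <- bind_rows(toy, toy[1, ])

long_dup <- toy_dup %>%
  pivot_longer(cols = share_1:share_2,
               names_to = c(".value", "year"),
               names_sep = "_")

# The duplicate check now returns one group per year with n == 2
dups <- long_dup %>%
  group_by(Brand, Category, year, Type) %>%
  summarise(n = n(), .groups = "drop") %>%
  filter(n > 1L)

# values_fn = list makes the list-columns explicit instead of implicit
wide_dup <- long_dup %>%
  pivot_wider(names_from = Type,
              values_from = share,
              values_fn = list)

dups
wide_dup
```

If the duplicates in the real data turned out to be exact copies of each other, calling distinct() before pivot_wider might be enough; if they carry different values, a summary function in values_fn (or an extra identifying column) would presumably be needed instead.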