1

Here is the original df:

    area sector      item
1   East      A      <NA>
2  South      A     Baidu
3  South      A   Tencent
4   West      A      <NA>
5  North      A      <NA>
6   East      B Microsoft
7   East      B    Google
8   East      B  Facebook
9  South      B      <NA>
10  West      B      <NA>
11 North      B      <NA>
12  East      C      <NA>
13 South      C      <NA>
14  West      C      <NA>
15 North      C   Alibaba
16  East      D      <NA>
17 South      D      <NA>
18  West      D    Amazon
19 North      D      <NA>
20  East      E      <NA>
21 South      E      <NA>
22  West      E      <NA>
23 North      E      <NA>

How can I transform the above df to the following one? Some cells in the transformed df have multiple items from the original df.

  Sector                     East            South     West     North
1 A                          <NA> "Baidu, Tencent"     <NA>      <NA>
2 B "Microsoft, Google, Facebook"             <NA>     <NA>      <NA>
3 C                          <NA>             <NA>     <NA> "Alibaba"
4 D                          <NA>             <NA> "Amazon"      <NA>
5 E                          <NA>             <NA>     <NA>      <NA>
David Arenburg
  • 91,361
  • 17
  • 137
  • 196
billyi
  • 27
  • 5

3 Answers3

2

A quick solution could be to use the toString function while trasnforming from long to wide using the reshape2 package

reshape2::dcast(df, sector ~ area, toString)
#Using item as value column: use value.var to override.
#   sector                        East   North          South   West
# 1      A                        <NA>    <NA> Baidu, Tencent   <NA>
# 2      B Microsoft, Google, Facebook    <NA>           <NA>   <NA>
# 3      C                        <NA> Alibaba           <NA>   <NA>
# 4      D                        <NA>    <NA>           <NA> Amazon
# 5      E                        <NA>    <NA>           <NA>   <NA>

This is almost a dupe of this but most of the solutions there won't work for this case- but this can still give you some ideas.

Community
  • 1
  • 1
David Arenburg
  • 91,361
  • 17
  • 137
  • 196
2

And just for fun, here is a base solution:

reshape(aggregate(item ~ area + sector, data = df, paste, collapse = ","), 
    idvar = "sector", timevar = "area", direction = "wide")
   sector                 item.East item.North    item.South item.West
1       A                      <NA>       <NA> Baidu,Tencent      <NA>
5       B Microsoft,Google,Facebook       <NA>          <NA>      <NA>
9       C                      <NA>    Alibaba          <NA>      <NA>
13      D                      <NA>       <NA>          <NA>    Amazon
17      E                      <NA>       <NA>          <NA>      <NA>
DatamineR
  • 10,428
  • 3
  • 25
  • 45
1

Here is an option with dplyr/tidyr

library(dplyr)
library(tidyr)
df1 %>%
   group_by(area, sector) %>% 
   summarise(item = toString(item)) %>% 
   spread(area, item)
akrun
  • 874,273
  • 37
  • 540
  • 662