How to search certain value in a data frame, then return all matching value in the same row

Question

My original data looks like this :

what I want to get is this:

I tried to write it in the for loop, but it did not work. The pivot_wider command also did not work. Any suggestion?

Could you post a [reproducible example please? Here some suggestions: https://stackoverflow.com/help/minimal-reproducible-example — geomicrobio, Jul 01 '21 at 00:23
This is a reshape long to wide, which you can do in multiple ways once you add a time variable, see all the options here - https://stackoverflow.com/questions/11322801/transpose-reshape-dataframe-without-timevar-from-long-to-wide-format — thelatemail, Jul 01 '21 at 01:21
Images are not the right way to share data/code. Add them in a reproducible format which is easier to copy. Read about [how to give a reproducible example](http://stackoverflow.com/questions/5963269). — Ronak Shah, Jul 01 '21 at 03:09

score 1 · Answer 1 · edited Jul 06 '21 at 10:24

1

Two reshaping options.

tidyverse -

library(dplyr)
library(tidyr)

df %>%
  group_by(source) %>%
  mutate(row = row_number()) %>%
  ungroup() %>%
  pivot_wider(names_from = row, values_from = c(location, value))

#  source location_1 location_2 location_3 location_4 value_1 value_2 value_3 value_4
#  <chr>  <chr>      <chr>      <chr>      <chr>        <dbl>   <dbl>   <dbl>   <dbl>
#1 a      ww         rr         de         NA              12      32      22      NA
#2 b      rr         tt         ef         ef              32      45      56      57
#3 c      er         eg         NA         NA              50      33      NA      NA

data.table -

library(data.table)
dcast(setDT(df), source~rowid(source), value.var = c('location', 'value'))

edited Jul 06 '21 at 10:24

rjen

1,938
1
7
19

answered Jul 01 '21 at 03:08

Ronak Shah

377,200
20
156
213

Thank you for the answer. I just have one more question. What does that "ungroup" do? – susan123 Jul 01 '21 at 23:05
@susan123 `ungroup` does not change anything in the dataframe as such but it removes the grouping attribute which was created by `group_by(source)`. If you remove `ungroup` from the answer the code will still work. It is just a good practice to `ungroup` whenever we group the data. – Ronak Shah Jul 01 '21 at 23:25

score 0 · Answer 2 · answered Jul 01 '21 at 00:44

Using library(data.table), we can do

rbindlist(lapply(unique(dt$source), 
                 function(x) dt[source==x, as.list(c(x, unlist(.SD[,-1])))]), 
          fill = TRUE)

#    V1 value1 value2 value3 location1 location2 location3 value4 location4
# 1:  a     12     32     22        ww        rr        de   <NA>      <NA>
# 2:  b     32     45     56        rr        tt        ef     57        ef
# 3:  c     50     33   <NA>        er        eg      <NA>   <NA>      <NA>

The data:

dt = data.table(source = c('a','a','a','b','b','b','b','c','c'),
                value =  c(12,32,22,32,45,56,57,50,33),
                location= c('ww','rr','de','rr','tt','ef','ef','er','eg'))

rjen · Answer 3 · 2021-07-01T07:09:52.720

A tidyverse option. coalesce_by_column is by Tung.

library(dplyr)
library(tidyr)
library(stringr)

coalesce_by_column <- function(df) {
  return(dplyr::coalesce(!!! as.list(df)))}

df %>%
  group_by(source) %>%
  mutate(valPivot = str_c('value', row_number()),
         locPivot = str_c('location', row_number()),
         across(everything(), ~ as.character(.))) %>%
  pivot_wider(names_from = valPivot, values_from = value) %>%
  pivot_wider(names_from = locPivot, values_from = location) %>%
  summarise(across(everything(), coalesce_by_column)) %>%
  mutate(across(starts_with('V'), ~ as.integer(.))) %>%
  select(source, colnames(.)[order(str_extract(colnames(.), '.$'))][-length(colnames(.))])

# # A tibble: 3 x 9
#   source value1 location1 value2 location2 value3 location3 value4 location4
#   <chr>   <int> <chr>      <int> <chr>      <int> <chr>      <int> <chr>    
# 1 a          12 ww            32 rr            22 de            NA NA       
# 2 b          32 rr            45 tt            56 ef            57 ef       
# 3 c          50 er            33 eg            NA NA            NA NA

Data

df <- tibble(source = c('a','a','a','b','b','b','b','c','c'),
             value =  c(12,32,22,32,45,56,57,50,33),
             location= c('ww','rr','de','rr','tt','ef','ef','er','eg'))

How to search certain value in a data frame, then return all matching value in the same row

3 Answers3