I have data like df1:

    score range
1   8.00  001-009.99
2   8.50  001-009.99
3   8.51  001-009.99
4   8.52  001-009.99
5   79.20 050-079.99
6   79.90 050-079.99
7   80.00 080-082.99
8   81.00 080-082.99
9   81.10 080-082.99
10  81.11 080-082.99
11  81.12 080-082.99
12  90.00 090-092.99

I need to make a new column named `range list`: for each row, it should list every df1$score that falls in the same df1$range as that row. All values need to be in character form. How do I achieve this in R? Many thanks.

    score   range list
1   8.00    8.00,8.50,8.51,8.52 
2   8.50    8.00,8.50,8.51,8.52    
3   8.51    8.00,8.50,8.51,8.52  
4   8.52    8.00,8.50,8.51,8.52  
5   79.20   79.20,79.90
6   79.90   79.20,79.90
7   80.00   80.00,81.00,81.10,81.11,81.12
8   81.00   80.00,81.00,81.10,81.11,81.12
9   81.10   80.00,81.00,81.10,81.11,81.12
10  81.11   80.00,81.00,81.10,81.11,81.12
11  81.12   80.00,81.00,81.10,81.11,81.12
12  90.00   90.00
Alegría

2 Answers

We could do it this way:

The trick is to take advantage of the behaviour of parse_number (my favorite function), which extracts only the first number from a string. Group by that number and then use toString:

library(dplyr)
library(readr)

df %>% 
  group_by(x = parse_number(range)) %>% 
  mutate(`range list` = toString(score), .keep="used") %>% 
  ungroup() %>% 
  select(-x)
  score `range list`              
   <dbl> <chr>                     
 1  8    8, 8.5, 8.51, 8.52        
 2  8.5  8, 8.5, 8.51, 8.52        
 3  8.51 8, 8.5, 8.51, 8.52        
 4  8.52 8, 8.5, 8.51, 8.52        
 5 79.2  79.2, 79.9                
 6 79.9  79.2, 79.9                
 7 80    80, 81, 81.1, 81.11, 81.12
 8 81    80, 81, 81.1, 81.11, 81.12
 9 81.1  80, 81, 81.1, 81.11, 81.12
10 81.1  80, 81, 81.1, 81.11, 81.12
11 81.1  80, 81, 81.1, 81.11, 81.12
12 90    90  
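
Note that the output above drops trailing zeros (8 instead of 8.00) because score is numeric. If the two-decimal character form from the question is needed, one option is to format the scores before collapsing them. This is only a sketch: it assumes score is numeric, uses a shortened copy of the question's data, and groups on range directly instead of parse_number(range).

```r
library(dplyr)

# Shortened copy of the question's data (first two ranges only)
df <- data.frame(
  score = c(8.00, 8.50, 8.51, 8.52, 79.20, 79.90),
  range = c(rep("001-009.99", 4), rep("050-079.99", 2))
)

out <- df %>%
  # Convert to character with exactly two decimals, e.g. 8 -> "8.00"
  mutate(score = sprintf("%.2f", score)) %>%
  group_by(range) %>%
  # Collapse all scores of the group into one comma-separated string
  mutate(`range list` = paste(score, collapse = ",")) %>%
  ungroup() %>%
  select(-range)

print(out)
```

Both columns of out are then character, so the trailing zeros survive.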
TarJae

Please check the code below.

code

library(tidyverse)
library(data.table)

# rleid() gives each consecutive run of identical range values its own id
df2 <- df %>%
  mutate(id = rleid(range)) %>%
  group_by(id) %>%
  mutate(`range list` = paste0(score, collapse = ",")) %>%
  ungroup() %>%
  select(-id, -range)

output

# A tibble: 12 × 2
   score `range list`                 
   <chr> <chr>                        
 1 8.00  8.00,8.50,8.51,8.52          
 2 8.50  8.00,8.50,8.51,8.52          
 3 8.51  8.00,8.50,8.51,8.52          
 4 8.52  8.00,8.50,8.51,8.52          
 5 79.20 79.20,79.90                  
 6 79.90 79.20,79.90                  
 7 80.00 80.00,81.00,81.10,81.11,81.12
 8 81.00 80.00,81.00,81.10,81.11,81.12
 9 81.10 80.00,81.00,81.10,81.11,81.12
10 81.11 80.00,81.00,81.10,81.11,81.12
11 81.12 80.00,81.00,81.10,81.11,81.12
12 90.00 90.00                        
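
For comparison, the same grouping can probably be done without any packages using base R's ave(). This is a minimal sketch, not part of the answer above. It assumes score is already stored as character, as in the output shown, and rebuilds df1 from the question's data.

```r
# Data from the question, with score kept as character to preserve "8.00" etc.
df1 <- data.frame(
  score = c("8.00", "8.50", "8.51", "8.52", "79.20", "79.90",
            "80.00", "81.00", "81.10", "81.11", "81.12", "90.00"),
  range = rep(c("001-009.99", "050-079.99", "080-082.99", "090-092.99"),
              times = c(4, 2, 5, 1)),
  stringsAsFactors = FALSE
)

# ave() applies FUN within each range group and recycles the single
# collapsed string back onto every row of that group
df1$`range list` <- ave(df1$score, df1$range,
                        FUN = function(x) paste(x, collapse = ","))

print(df1[, c("score", "range list")])
```

Unlike rleid(), ave() groups by value rather than by consecutive runs, so the two approaches only agree when identical ranges are adjacent, as they are here.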

jkatam