
player column

I am using tidyr in RStudio and trying to split the data in the 'player' column (attached below) into four columns: 'number', 'name', 'position' and 'school'. I tried the separate() function but can't get the number to split off, and I can't use str_sub() because some numbers have two digits. Does anyone know how to separate this column into the appropriate four columns?

Matt
    Please don't show data as images. – Martin Gal Jun 16 '20 at 21:28
    It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Pictures of data do not count as "reproducible" since we can't copy/paste the data for testing. – MrFlick Jun 16 '20 at 21:38

1 Answer


One method uses a series of separate() calls.

# Example data
df <- data.frame(
  player = c('11Vita VeaDT | Washington',
             '16Clelin FerrellEDGE | Clemson',
             "17K'Lavon ChaissonEdge | LSU",
             '15Cody FordOT | Oklahoma',
             '20Derrius GuiceRB',
             '1Joe BurrowQB | LSU')) 

The steps are:

  1. Separate the school using |.
  2. Separate the number at the boundary between digits and letters.
  3. Separate the position where a lowercase letter is followed by a capital, working from the end of the string.
  4. Clean up by trimming any extra white space around the text.
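library(tidyr)
library(dplyr)
df %>%
  # 1. school: fill = 'right' gives NA (without a warning) when there is no '|'
  separate(player, into = c('player', 'school'), sep = '\\|', fill = 'right') %>%
  # 2. number: split where a digit is followed by a letter
  separate(player, into = c('number', 'player'), sep = '(?<=[0-9])(?=[A-Za-z])') %>%
  # 3. position: split where a lowercase letter is followed by a capital
  separate(player, into = c('name', 'position'), sep = '(?<=[a-z])(?=[A-Z])') %>%
  mutate_if(is.character, trimws)  # 4. trim leftover white space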
# Results
   number         name position      school
1  11         Vita Vea       DT  Washington
2  16   Clelin Ferrell     EDGE     Clemson
3  17 K'Lavon Chaisson     Edge         LSU
4  15        Cody Ford       OT    Oklahoma
5  20    Derrius Guice       RB        <NA>
6   1       Joe Burrow       QB         LSU
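
One caveat with step 3: a name with an internal capital would offer more than one lowercase-to-capital split point. As a rough sketch of one way to anchor the position at the end of the string, the last two separate() calls could be swapped for a single tidyr::extract() call with a greedy regex. This is an alternative, not part of the method above. The third row below is made up for illustration (hypothetical name and school), and `df2`, `step1` and `res` are just illustrative names.

```r
library(tidyr)

# Two rows from the example data plus one made-up row (hypothetical name and
# school) whose surname contains an internal capital letter.
df2 <- data.frame(
  player = c("11Vita VeaDT | Washington",
             "20Derrius GuiceRB",
             "99Sam McExampleTE | Nowhere State"),
  stringsAsFactors = FALSE
)

# Step 1: split off the school; fill = "right" pads rows without a "|" with NA.
step1 <- separate(df2, player, into = c("player", "school"),
                  sep = "\\|", fill = "right")

# Steps 2-3 in one pass: leading digits, then a greedy name that must end in a
# lowercase letter, then the position (a capital followed by letters) at the end.
res <- tidyr::extract(step1, player,
                      into = c("number", "name", "position"),
                      regex = "^(\\d+)(.*[a-z])([A-Z][A-Za-z]*)\\s*$")

# Step 4: trim the space left after the "|".
res$school <- trimws(res$school)
res
```

Because `.*` is greedy, the name runs up to the last lowercase-to-capital boundary, so the made-up row should come out with name "Sam McExample" and position "TE". This still assumes every position starts with a capital and every name ends in a lowercase letter, as in the example data.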
nniloc