
I recently started learning R and have run into a problem. My data has a column that holds the players' heights in feet'inches format, and I want to create a new column with the height in centimeters. To do this I used the strsplit function as below (df is the height column):

l <- strsplit(df,"'",fixed = T)
print(l)
[[1]]

[1] "5" "7"

[[2]]

[1] "6" "2"

[[3]]

[1] "5" "9"

[[4]]

[1] "6" "4"

[[5]]

[1] "5"  "11"

[[6]]

[1] "5" "8"

I am stuck here because I don't know how to obtain the required values after splitting the field.

I tried the code below, but it gives the following error:

p_pos <- grep("'",df)
l[[p_pos]][1]

Error in l[[p_pos]] : recursive indexing failed at level 2

I expected the above code to print the first element of each entry in the list:

5 6 5 6 5 5

>dput(head(df, 10))
c("5'7", "6'2", "5'9", "6'4", "5'11", "5'8")
Rizwan Nawab

2 Answers

One way to do this is to create a data frame with a column of feet and a column of inches. The separate function in the tidyr package handles this well - see this answer by its creator.

> library(dplyr)
> library(tidyr)
> df = data.frame(height = c("5'7", "6'2", "5'9", "6'4", "5'11", "5'8"))
> df %>% separate(height, c('feet', 'inches'), "'", convert = TRUE) %>% 
+     mutate(cm = (12*feet + inches)*2.54)
  feet inches     cm
1    5      7 170.18
2    6      2 187.96
3    5      9 175.26
4    6      4 193.04
5    5     11 180.34
6    5      8 172.72

The separate call splits the height column into feet and inches columns (convert = TRUE turns them into numbers); the mutate call then converts the total to centimeters.
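Since the question asks for a new column, it may be useful to keep the original height strings next to the result. A minimal sketch, assuming tidyr is installed and using separate's remove argument; the data frame name players and the variable out are made up for illustration:

```r
library(tidyr)

# Hypothetical data frame holding the sample heights from the question
players <- data.frame(
  height = c("5'7", "6'2", "5'9", "6'4", "5'11", "5'8"),
  stringsAsFactors = FALSE
)

# remove = FALSE keeps the original `height` column next to the new ones;
# convert = TRUE turns the split strings into numbers
out <- separate(players, height, into = c("feet", "inches"),
                sep = "'", remove = FALSE, convert = TRUE)

# 12 inches per foot, 2.54 cm per inch
out$cm <- (12 * out$feet + out$inches) * 2.54
out
```

The result should have the columns height, feet, inches and cm, so the feet and inches helpers can be dropped afterwards if only cm is needed.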

Michael Lugo

This will give you a vector with the heights in centimeters.

sapply applies a function to every element of your list: it converts the two strings to numbers, multiplies feet by 30.48 and inches by 2.54, and sums the result.

l = list()
l[[1]] = c("5","7")
l[[2]] = c("6","2")
l[[3]] = c("5","9")
l[[4]] = c("6","4")
l[[5]] = c("5","11")
l[[6]] = c("5","8")

sapply(l,function(x) sum(as.numeric(x)*c(30.48,2.54)))
[1] 170.18 187.96 175.26 193.04 180.34 172.72
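The same call can be run directly on the output of strsplit, so the list does not have to be typed by hand. The sketch below also shows one way to get the first element of every entry: `[[` selects a single element, and a vector index such as l[[p_pos]] is treated as recursive indexing, which is why the attempt in the question fails. The names heights, parts, feet and players are made up for illustration, and the code assumes every entry contains both a feet and an inches part.

```r
# Sample heights from the question
heights <- c("5'7", "6'2", "5'9", "6'4", "5'11", "5'8")

# One list element per height, each a character vector c(feet, inches)
parts <- strsplit(heights, "'", fixed = TRUE)

# Apply `[` to each element to pick its first entry (the feet)
feet <- sapply(parts, `[`, 1)
feet
# [1] "5" "6" "5" "6" "5" "5"

# Convert every entry to centimeters: 30.48 cm per foot, 2.54 cm per inch
cm <- sapply(parts, function(x) sum(as.numeric(x) * c(30.48, 2.54)))

# Store the result as a new column next to the original strings
players <- data.frame(height = heights, cm = cm, stringsAsFactors = FALSE)
players
```

An entry without an inches part (for example "6'") would be recycled against both conversion factors and give a wrong value, so such rows would need separate handling.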
Fino
  • Thanks! My actual column has many rows. Can you please show how I can do the assignment part (l[[1]] = c("5","7")) for all of them using a loop? Thanks again. – Rizwan Nawab Feb 14 '19 at 15:10
  • I only did that part to reproduce the list you printed. I imagine that `l <- strsplit(df,"'",fixed = T)` should return the values for all rows in this same form. – Fino Feb 14 '19 at 15:20