
I am trying to create a new column from an existing column using pattern matching. The existing column is a user agent field such as

"Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B367 Safari/531.21.10"

I want to create a new column that uses pattern matching to identify what the device is:

- if user_agent like '%iPad%' and user_agent like '%WebKit%', then the device is iPad
- if user_agent like '%Android%' and user_agent not like '%Mobile%', then the device is android
- if user_agent like '%Silk%' and user_agent like '%WebKit%', then the device is kindle
- if user_agent like '%Playbook%', then the device is Other

I want to use the mutate function in dplyr to create the new column, but I need help with how to structure the regular expressions,

i.e. mutate(data, device = ....)

sunny

1 Answer


Something like this?

x <- c("Mozilla/5.0 (iPad; stuff AppleWebKit more stuff",
        "Android",
        "stuff Silk more stuff and WebKit",
        "stuff Playbook more stuff", 
        "unknown")

y <- ifelse(grepl("iPad", x) & grepl("WebKit", x), "iPad", 
        ifelse(grepl("Android", x) & !grepl("Mobile", x), "android", 
                ifelse(grepl("Silk", x) & grepl("WebKit", x), "kindle", 
                        ifelse(grepl("Playbook", x), "other", 
                                "don't know")
                )
        )
)

data.frame(x, y)
                                                x          y
1 Mozilla/5.0 (iPad; stuff AppleWebKit more stuff       iPad
2                                         Android    android
3                stuff Silk more stuff and WebKit     kindle
4                       stuff Playbook more stuff      other
5                                         unknown don't know

EDIT

Or perhaps this is easier:

device <- rep(NA_character_, length(x))

device[grepl("iPad", x) & grepl("WebKit", x)] <-  "iPad"
device[grepl("Android", x) & !grepl("Mobile", x)] <-  "android"
device[grepl("Silk", x) & grepl("WebKit", x)] <-  "kindle"
device[grepl("Playbook", x)] <-  "other"

data.frame(x, device)

                                                x  device
1 Mozilla/5.0 (iPad; stuff AppleWebKit more stuff    iPad
2                                         Android android
3                stuff Silk more stuff and WebKit  kindle
4                       stuff Playbook more stuff   other
5                                         unknown    <NA>
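
Since the question asks about mutate, here is a rough sketch of the same rules using dplyr. It assumes a dplyr version that provides case_when, and the data frame name `data` and column name `user_agent` are taken from the question rather than from real data. Like the nested ifelse above, case_when uses the first rule that matches, whereas the sequential assignments let a later rule overwrite an earlier one, so the two approaches can differ when a string matches more than one rule.

```r
library(dplyr)

# Toy data; `data` and `user_agent` are assumed names from the question
data <- data.frame(
  user_agent = c("Mozilla/5.0 (iPad; stuff AppleWebKit more stuff",
                 "Android",
                 "stuff Silk more stuff and WebKit",
                 "stuff Playbook more stuff",
                 "unknown"),
  stringsAsFactors = FALSE
)

# Rules are checked top to bottom; the first TRUE condition wins
result <- mutate(
  data,
  device = case_when(
    grepl("iPad", user_agent) & grepl("WebKit", user_agent)     ~ "iPad",
    grepl("Android", user_agent) & !grepl("Mobile", user_agent) ~ "android",
    grepl("Silk", user_agent) & grepl("WebKit", user_agent)     ~ "kindle",
    grepl("Playbook", user_agent)                               ~ "other",
    TRUE                                                        ~ NA_character_
  )
)

result
```

Rows that match none of the rules get NA here; replace NA_character_ with a label such as "don't know" if a default string is preferred.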
Jeff
  • Thanks for the help, Jeff. I'm new to grepl, but it seems to combine multiple conditions, unlike grep. Do you folks know how I can post sample data sets on Stack Overflow? Every time I try, they come out as single lines rather than a data frame. – sunny Apr 10 '15 at 17:29
  • `grepl` returns a logical vector so it can be useful for what you are trying to do (I think). Usually best to get a small sample of your data and simply copy/paste the output of `dput(sampleData)` – Jeff Apr 10 '15 at 17:51
  • How can this be modified to search for "Kit", instead of "WebKit", and then return "kindle"? – derelict Feb 05 '16 at 20:44
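
One possible way to handle the "Kit" case from the last comment, as a sketch only: grepl matches substrings, so swapping the pattern "WebKit" for "Kit" in the kindle rule should be enough. The example strings below are made up for illustration, and matching stays case-sensitive unless ignore.case = TRUE is passed.

```r
# Made-up example strings for illustration
x <- c("stuff Silk more stuff and WebKit",
       "stuff Silk and SomeKit",
       "stuff Silk only")

device <- rep(NA_character_, length(x))

# "Kit" matches any string containing it, including "WebKit"
device[grepl("Silk", x) & grepl("Kit", x)] <- "kindle"

data.frame(x, device)
```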