
I am dealing with a data frame as shown in this image, but with 380 rows in total

Not sure if this will help but let's say I am working on the dataframe:

df <- data.frame(c(-10:-1),c(-5:4),c(1:10))

and I would like to extract any rows that contain the number "-5" in either the first or the second column.

In the shared image, I want to extract the rows that contain "Arsenal" in either the "HomeTeam" or the "AwayTeam" column, but I do not know how to do so.

This is my attempt using grep()

However it shows the message below:

"Error: Can't subset columns that don't exist. x The locations 12, 39, 45, 78, 98, etc. don't exist. i There are only 7 columns."

where the mentioned locations are exactly the rows I need...

I wanted to try other filtering tools such as dplyr, but I couldn't understand how they work, and I am not even sure they fit what I want to do.

oguz ismail
William
  • Welcome to Stack Overflow. Please [make this question reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) by including code and example data in a plain text format - for example the output from `dput(yourdata)`. We cannot copy/paste data from images. – neilfws Nov 10 '20 at 03:37
  • Try league1819[grepl('Arsenal', league1819$HomeTeam)|grepl('Arsenal', league1819$AwayTeam), ] – Karthik S Nov 10 '20 at 03:40

2 Answers


Using your df <- data.frame(c(-10:-1),c(-5:4),c(1:10)) example, and since you're (potentially) already using tidyverse, it is possible to achieve what you want using the code:

if(!require(tidyverse)) install.packages('tidyverse'); library(tidyverse) #to load the package, just in case you haven't already!
df <- data.frame(c(-10:-1),c(-5:4),c(1:10))
colnames(df) <- c("col1", "col2", "col3")
df %>% filter(col1 == -5 | col2 == -5) # compare against the number -5 rather than the string "-5"

or if you want rows with -5 in both columns, you can use:

df %>% filter(col1 == -5 & col2 == -5)

instead. For your leagues question, I'd do:

sample_Arsenal <- league1819 %>% filter(HomeTeam %in% "Arsenal" | AwayTeam %in% "Arsenal")
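For anyone who wants to check the toy example without installing tidyverse, here is a minimal base R sketch of the same "either column" and "both columns" filters. The column names `col1`, `col2`, `col3` follow the answer above; the variable names `either` and `both` are only illustrative.

```r
# Toy data frame from the question, with the column names used in the answer
df <- data.frame(col1 = -10:-1, col2 = -5:4, col3 = 1:10)

# Rows where -5 appears in col1 OR col2 (logical vector goes before the comma)
either <- df[df$col1 == -5 | df$col2 == -5, ]

# Rows where -5 appears in col1 AND col2 (none in this toy data)
both <- df[df$col1 == -5 & df$col2 == -5, ]

print(either)
print(nrow(both))
```

The logical vector sits before the comma because it selects rows; leaving the column slot empty keeps all columns.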

You can use grepl :

sampleArsenal <- subset(league1819, grepl('Arsenal', HomeTeam) | 
                                    grepl('Arsenal', AwayTeam))

Or if you want to try dplyr :

library(dplyr)
library(stringr)

league1819 %>% 
   filter(str_detect(HomeTeam, 'Arsenal') | str_detect(AwayTeam, 'Arsenal'))
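
As for the error in the question: the original `grep()` attempt was not included, so this is only a guess, but a message about columns that don't exist is what one would expect if the row positions returned by `grep()` were used without a trailing comma. The sketch below uses a small made-up stand-in for `league1819` (the data frame `league` and its contents are hypothetical) to show the difference.

```r
# Hypothetical stand-in for league1819; the real data was only shared as an image
league <- data.frame(
  HomeTeam = c("Arsenal", "Chelsea", "Everton", "Fulham"),
  AwayTeam = c("Burnley", "Watford", "Wolves", "Arsenal"),
  stringsAsFactors = FALSE
)

# grep() returns the positions (here: row numbers) of the matching elements
idx <- sort(union(grep("Arsenal", league$HomeTeam),
                  grep("Arsenal", league$AwayTeam)))

# Without a comma the positions are treated as column numbers, which fails
# here because the data frame has only 2 columns; the message is captured
# instead of stopping the script
no_comma <- tryCatch(league[idx], error = function(e) conditionMessage(e))

# With a trailing comma the same positions select rows
arsenal_rows <- league[idx, ]
print(arsenal_rows)
```

`grepl()` avoids the issue altogether because it returns one TRUE/FALSE per row, which is why it combines cleanly with `|` inside `subset()` or `filter()`.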
Ronak Shah
  • Omg this worked perfectly. Thank you so much sir! Sorry for asking the question in a not-reproducible way as I am extremely new to R... And not sure how to generate a similar case. Really much appreciated! – William Nov 10 '20 at 03:57