
Hi, I have two data frames that share three columns: home, visitor and Date.

I would like to extract the rows of the italy data frame that match newChamps under these conditions:

newItaly$home == newChamps$home | newItaly$visitor == newChamps$visitor & newItaly$Date >newChamps$Date 

newItaly and newChamps do not have the same number of rows.

Update:

I am still not able to get the correct results. This is the code:

# devtools is not attached yet at this point, so call install_github through its namespace
devtools::install_github('jalapic/engsoccerdata')
LoadLibraries <- function(){
  library(stringr)
  library(plyr)
  library(devtools)
  library(engsoccerdata)
}

ChampsData <- function(){
  filteredChamps <- champs[champs$hcountry == "ITA" | champs$vcountry == "ITA", ]
  finalChamps <- subset(filteredChamps, select = -c(round, leg, FT, HT, aet, pens, FTagg_home, FTagg_visitor, aethgoal, aetvgoal, tothgoal, totvgoal, totagg_home, totagg_visitor, tiewinner) )
  finalChamps$Date <- as.Date(finalChamps$Date, "%y/%m/%d")
  finalChamps[,"Results"] <- NA
  finalChamps$Results[finalChamps$hcountry == 'ITA' & finalChamps$hgoal > finalChamps$vgoal] <- "WIN"
  finalChamps$Results[finalChamps$hcountry == 'ITA' & finalChamps$hgoal < finalChamps$vgoal] <- "LOSS"
  finalChamps$Results[finalChamps$vcountry == 'ITA' & finalChamps$vgoal > finalChamps$hgoal] <- "WIN"
  finalChamps$Results[finalChamps$vcountry == 'ITA' & finalChamps$vgoal < finalChamps$hgoal] <- "LOSS"
  finalChamps$Results[finalChamps$vgoal == finalChamps$hgoal] <- "DRAW"
  finalChamps<-  finalChamps[order(finalChamps$Date),] 
  return(finalChamps)
}

ItalyData <- function(){
  amendedItaly<- subset(italy, italy$Season>1954 & italy$Season<2016)
  amendedItaly<-  amendedItaly[order(amendedItaly$Date),] 
  amendedItaly$Date <- as.Date(amendedItaly$Date, "%y/%m/%d")
  finalItaly <- subset(amendedItaly, select = -c(FT, tier) )
  finalItaly[,"Results"] <- NA
  finalItaly$Results <- ifelse(finalItaly$hgoal < finalItaly$vgoal, finalItaly$visitor, finalItaly$home)
  finalItaly$Results[finalItaly$hgoal == finalItaly$vgoal] <- "DRAW"
  return(finalItaly)
}



LoadLibraries()
newChamps <- ChampsData()
newItaly <- ItalyData()
t<- newItaly[which(newItaly$home %in% unique(newChamps$home) | newItaly$visitor %in% unique(newChamps$visitor) & newItaly$Date > newChamps$Date),] 

Basically, I am trying to find teams that played a Champions League game midweek and an Italian league game at the end of the same week. For example: Milan played on 2/5/2018 (Champions League) and then again on 6/5/2018 (Italian league).

user9737581
  • @griffinevo Your suggestion will only do pairwise comparisons: first row to first row, second row to second row, etc. I can't be completely sure without an example from OP, but I think that is much more restrictive than what they are looking for. It's actually basically the same code as what OP has but with `df1[]` wrapped around it. – Gregor Thomas May 03 '18 at 20:41
  • Please provide a minimal reproducible example with sample input and desired output. [See this link for tips on doing that](https://stackoverflow.com/q/5963269/903061): built-in data, simulated data, or data shared with `dput()` all work well. – Gregor Thomas May 03 '18 at 20:44

1 Answer


I think you are looking to do something like this:

newItaly[which(newItaly$home %in% unique(newChamps$home) | 
               newItaly$visitor %in% unique(newChamps$visitor) & 
               newItaly$Date > max(newChamps$Date) ),] 
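
One thing worth checking in a condition like this: in R, `&` binds more tightly than `|`, so the expression is read as "home matches, OR (visitor matches AND the date is later)". The sketch below uses made-up logical vectors (not values from the engsoccerdata tables) to show how the two groupings differ; which one is wanted depends on the intended condition.

```r
# Toy logical vectors (hypothetical, one element per row of a data frame)
home_match    <- c(TRUE,  TRUE,  FALSE, FALSE)
visitor_match <- c(FALSE, TRUE,  TRUE,  FALSE)
later_date    <- c(FALSE, FALSE, TRUE,  TRUE)

# How R parses the unparenthesised form: & is evaluated before |
as_written <- home_match | visitor_match & later_date

# Explicit grouping, if the date test should apply to both team tests
grouped <- (home_match | visitor_match) & later_date

print(as_written)
print(grouped)
```

If the date restriction is meant to apply to both the home and the visitor test, adding the parentheses shown in `grouped` makes that explicit.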

EDIT

The `which` is optional; you can subset directly with the logical vector:

newItaly[newItaly$home %in% unique(newChamps$home) | 
         newItaly$visitor %in% unique(newChamps$visitor) & 
         newItaly$Date > max(newChamps$Date),] 
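
If the actual goal is the one described in the question (a league game played a few days after a Champions League game by the same team), comparing against `max(newChamps$Date)` will probably not be enough. One possible approach is to join the two tables on team and then filter on the gap between the dates. This is only a sketch under assumptions: the toy rows, the helper names (`cl_games`, `lg_games`, `row_id`, `matched`) and the 1 to 7 day window are all made up for illustration.

```r
# Toy stand-ins for newChamps and newItaly (made-up rows, same column names)
champs_toy <- data.frame(
  home    = c("AC Milan", "Real Madrid"),
  visitor = c("Barcelona", "Juventus"),
  Date    = as.Date(c("2018-05-02", "2018-04-11")),
  stringsAsFactors = FALSE
)
italy_toy <- data.frame(
  home    = c("AC Milan", "Juventus", "Napoli"),
  visitor = c("Verona", "Sampdoria", "Torino"),
  Date    = as.Date(c("2018-05-06", "2018-04-15", "2018-05-06")),
  stringsAsFactors = FALSE
)

# One row per (team, Champions League date), whether the team was home or away
cl_games <- rbind(
  data.frame(team = champs_toy$home,    cl_date = champs_toy$Date,
             stringsAsFactors = FALSE),
  data.frame(team = champs_toy$visitor, cl_date = champs_toy$Date,
             stringsAsFactors = FALSE)
)

# Same for league games, keeping a row id so the original rows can be recovered
italy_toy$row_id <- seq_len(nrow(italy_toy))
lg_games <- rbind(
  data.frame(team = italy_toy$home,    row_id = italy_toy$row_id,
             lg_date = italy_toy$Date, stringsAsFactors = FALSE),
  data.frame(team = italy_toy$visitor, row_id = italy_toy$row_id,
             lg_date = italy_toy$Date, stringsAsFactors = FALSE)
)

# Join on team, then keep league games played 1 to 7 days after a CL game
matched <- merge(lg_games, cl_games, by = "team")
gap     <- as.numeric(matched$lg_date - matched$cl_date)  # difference in days
hits    <- matched[gap > 0 & gap <= 7, ]

# League rows where either team had a CL game earlier in the same week
result <- italy_toy[italy_toy$row_id %in% hits$row_id, ]
print(result)
```

Because the join pairs every league game of a team with every Champions League game of that team, the two tables do not need to have the same number of rows. On the full data the intermediate table can get large, so it may be worth restricting both tables to the seasons of interest first.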
DJack
  • It works if you use exactly the same code as mine: `newItaly$Date > max(newChamps$Date)`. – DJack May 04 '18 at 07:05
  • You cannot meaningfully compare two vectors of different lengths element by element, which is why I arbitrarily used the `max` condition: it compares a vector with a single value. I'm not sure it's what you really want, though; you need to think about it. – DJack May 04 '18 at 07:08
  • It works on my machine using your script + my code. Watch out for typos. – DJack May 04 '18 at 07:57