I'm working on a web scraping project. Each webpage has a table with 101 rows. The code below is my initial attempt to pull values from one column into an empty vector whenever a condition on a different column is met.

The append call is not working, and I'm not sure what's going on. Could someone help guide me in the right direction, and explain to me why that particular way works? I appreciate any thoughts or ideas.

wl_record <- c()

dateof <- c()

con_1 <- str_detect(misawa$X4, "Mitsuharu Misawa defeats")

con_2 <- str_detect(misawa$X4, "defeats Misuharu Misawa")

for(i in misawa$X4){
  if (str_detect(i, "Mitsuharu Misawa defeats") == TRUE) {
    dateof[con_1] <- misawa$X2
    wl_record[con_1] <- "win"
  } else if(str_detect(i, "defeats Mitsuharu Misawa") == TRUE) {
    append(dateof[con_1], misawa$X2) 
    append(wl_record[con_1], misawa$X2) <- "loss"
  }
}

EDIT: An excerpt from the misawa data frame is below. The columns are X1 = record number, X2 = date, X3 = blank, X4 = match information. The exact string matters for the detection, as I only want to pull singles match data: "defeat" indicates a tag match, while "defeats" indicates a singles match:

X1    X2       X3   X4
901 10.12.2000     Daisuke Ikeda & Mitsuharu Misawa defeat Kenta Kobashi & Takeshi Rikio (21:55)NOAH The Final Navigation - Tag 7 - Event @ Act City Hamamatsu in Hamamatsu, Shizuoka, Japan
902 08.12.2000      Takao Omori & Yoshihiro Takayama defeat Mitsuharu Misawa & Yoshinari Ogawa (18:41)NOAH The Final Navigation - Tag 6 - Event @ Miyagi Sports Center in Sendai, Miyagi, Japan
903 07.12.2000       Mitsuharu Misawa & Yoshinari Ogawa defeat Jun Akiyama & Takeshi Morishima (15:34)NOAH The Final Navigation - Tag 5 - Event @ Odate Citizen Gymnasium in Odate, Japan

The result of the for loop is a vector equal in length to the original data frame, with NA for the rows that do not meet the condition. The goal is to keep appending to each vector until I have all of the data, then combine dateof and wl_record into a data frame and remove the NAs from it.

wnettles
    Hi wnettles, welcome to Stack Overflow. It is not clear what you are trying to do with your code. It will be much easier to help if you provide at least a sample of your data with `dput(misawa[1:20,])` as well as your expected output. You can [edit] your question and paste the output. Please surround the output with three backticks (```) for better formatting. See [How to make a reproducible example](https://stackoverflow.com/questions/5963269/) for more info. – Ian Campbell Jul 04 '20 at 17:38

1 Answer

Seems it should be:

append(dateof[con_1], misawa$X2)
append(wl_record[con_1], misawa$X2) <- "loss"

not:

append(dateof[con_1], misawaX2)
append(wl_record[con_1], misawaX2) <- "loss"

Dave2e
  • Thanks for your answer! That was a typo on my part. I fixed that part and ran it again, but same issue, the appends are not happening. – wnettles Jul 04 '20 at 23:23
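
One likely reason the appends never show up: `append()` in R returns a new vector and does not modify its argument, so a bare `append(dateof[con_1], misawa$X2)` computes a result and throws it away. The form `append(...) <- "loss"` is also not valid R, because `append()` has no replacement version. The sketch below illustrates both points and then shows a vectorized alternative that avoids the loop and the NA padding altogether. It makes several assumptions: the three-row `misawa` data frame is a hypothetical stand-in (the first two rows are invented placeholders, not real match records), `dateof_demo` and `singles` are illustrative names, and base R's `grepl(..., fixed = TRUE)` is swapped in for `str_detect()` so the example runs without extra packages.

```r
# Hypothetical stand-in for the scraped table; only the third row's text
# comes from the excerpt in the question, the first two are placeholders.
misawa <- data.frame(
  X1 = c(1, 2, 3),
  X2 = c("10.12.2000", "08.12.2000", "07.12.2000"),
  X4 = c(
    "Mitsuharu Misawa defeats Wrestler A (20:00)",
    "Wrestler B defeats Mitsuharu Misawa (25:10)",
    "Mitsuharu Misawa & Yoshinari Ogawa defeat Jun Akiyama & Takeshi Morishima (15:34)"
  ),
  stringsAsFactors = FALSE
)

# 1) append() returns a new vector; the original is left untouched.
dateof_demo <- c()
append(dateof_demo, misawa$X2[1])                  # result is discarded
length(dateof_demo)                                # still 0
dateof_demo <- append(dateof_demo, misawa$X2[1])   # keep the result by assigning it
length(dateof_demo)                                # now 1

# 2) Vectorized alternative: build logical masks once, then subset.
#    grepl(fixed = TRUE) plays the role of str_detect() here.
win  <- grepl("Mitsuharu Misawa defeats", misawa$X4, fixed = TRUE)
loss <- grepl("defeats Mitsuharu Misawa", misawa$X4, fixed = TRUE)
keep <- win | loss

# Only singles matches survive, so there are no NA rows to remove afterwards.
singles <- data.frame(
  dateof    = misawa$X2[keep],
  wl_record = ifelse(win[keep], "win", "loss"),
  stringsAsFactors = FALSE
)
print(singles)
```

If the loop is kept instead, each append would need to be written as `dateof <- append(dateof, value)` with a single date per iteration rather than the whole `misawa$X2` column. It may also be worth checking the spelling in `con_2`, which reads "Misuharu" in the question and so would not match any row if it were used.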