I am collecting data from a webpage and I keep getting the error below. I think it might have to do with the CSS patterns, but I can't seem to find a set that is general enough to collect the data from the website.

Error in rbindlist(DATA, fill = TRUE) : Column 2 of item 42 is length 15 inconsistent with column 6 which is length 17. Only length-1 columns are recycled.

Here is the code that I used with the CSS patterns.


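library(Rcrawler)    # Rcrawler(); creates INDEX and DATA in the global environment
library(dplyr)       # select(), rename(), left_join(), %>%
library(data.table)  # rbindlist()

# Names for the extracted fields, one per CSS pattern in ssdfacts
pattern_names <- c("Username.Status","Topic.Title","Post.Date","Post","Reply.Number")

# Only follow links to topic pages and to pages of board 12
topic_filter <- c("topic\\=\\d+\\.\\d+", "board\\=12\\.\\d+$")

# ssdfacts (the vector of CSS patterns) is defined elsewhere and not shown here
Rcrawler("https://ssdfacts.com/forum/index.php?board=12.0",
         no_cores = 4, no_conn = 4, MaxDepth = 4, RequestsDelay = 0.1,
         # dataUrlfilter = topic_data_filter,
         crawlUrlfilter = topic_filter,
         ExtractCSSPat = ssdfacts,
         PatternsNames = pattern_names,
         ManyPerPattern = TRUE,
         saveOnDisk = FALSE)

# Map each crawled page ID to its URL
url_list <- rename(select(INDEX, Id, Url), PageID = Id)
url_list$PageID <- as.numeric(url_list$PageID)

# This is the call that raises the error quoted above
ssdfacts_data <- rbindlist(DATA, fill = TRUE) %>%
  left_join(url_list, by = "PageID")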
  • You need to post a reproducible example of the data that produces this error. – s_baldur Mar 05 '20 at 14:12
  • Welcome! You may have a look at https://stackoverflow.com/questions/5963269/ . In the meantime, does this illustrate the issue? `consistent_column_length <- list( a = 1:3, b = 4:6 ) ; inconsistent_column_length <- list( a = 1:5, c = 1:7 ) ; DATA <- list( consistent_column_length, inconsistent_column_length ) ; data.table::rbindlist( DATA, fill = TRUE )` – Aurèle Mar 05 '20 at 16:15
  • If yes, the issue is essentially that we are manipulating a list of ragged arrays, and there is no standard way to convert a ragged array into a data frame; it really depends on what the data means. – Aurèle Mar 05 '20 at 16:17
  • I have edited it, but I am not sure whether that is a reproducible example. – Ngun Tial Mar 06 '20 at 16:27
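
The ragged-list point raised in the comments can be sketched as follows. This is only an illustration, not a confirmed fix for this crawl: the `DATA` list below is a small hypothetical stand-in for what the crawler returns, and `is_ragged` and `pad_item` are helper names made up for this example. It first finds the items whose columns disagree in length, then pads the shorter columns with `NA` so that `rbindlist()` accepts them. Padding assumes the values line up from the top of each page, which may well be wrong for real forum pages (for example, when one post lacks a field), so the better fix may be a CSS pattern that selects one container per post.

```r
library(data.table)

# Hypothetical stand-in for the crawler's DATA list: item 2 is ragged
# (its columns have different lengths), as in the reported error.
DATA <- list(
  list(PageID = 1, Username.Status = c("a", "b"), Post = c("x", "y")),
  list(PageID = 2, Username.Status = c("c", "d", "e"), Post = c("z", "w"))
)

# TRUE when the non-scalar columns of an item disagree in length.
is_ragged <- function(item) {
  n <- lengths(item)
  length(unique(n[n > 1])) > 1
}
ragged_items <- which(vapply(DATA, is_ragged, logical(1)))

# Pad every column of an item with NA up to the longest column.
# Assumption: values align from the top, which may not hold for real pages.
pad_item <- function(item) {
  n_max <- max(lengths(item))
  lapply(item, function(col) {
    if (length(col) == 1L) return(rep(col, n_max))  # recycle scalars such as PageID
    length(col) <- n_max                            # extends the vector with NA
    col
  })
}

padded <- rbindlist(lapply(DATA, pad_item), fill = TRUE)
```

Inspecting `DATA[ragged_items]` shows which pages break the pattern and which field is short, which is usually the quickest way to see what the CSS patterns are missing.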
