arguments imply differing number of rows in R

Question

This code was working fine, but although I didn't make any changes to this code block, it started to throw an "arguments imply differing number of rows" error.

When I run this code block, it doesn't show any error message:

pos.words <- scan('C:/Users/samet/Desktop/twitter_steam_text_mining/pozitifkelimeler.txt', what = 'character', comment.char = ';', skipNul = TRUE)
neg.words <- scan('C:/Users/samet/Desktop/twitter_steam_text_mining/negatifkelimeler.txt', what = 'character', comment.char = ';', skipNul = TRUE)

    score.sentiment <- function(sentences, pos.words, neg.words, .progress='none')
      
    {
      require(plyr)
      require(stringr)
      
      scores <- laply(sentences, function(sentence, pos.words, neg.words)
        
      {
        
        # clean up sentences with R's regex-driven global substitute, gsub() function:
        sentence <- gsub('https://','',sentence)
        sentence <- gsub('http://','',sentence)
        sentence <- gsub('[^[:graph:]]', ' ',sentence)
        sentence <- gsub('[[:punct:]]', '', sentence)
        sentence <- gsub('[[:cntrl:]]', '', sentence)
        sentence <- gsub('\\d+', '', sentence)
        sentence <- str_replace_all(sentence,"[^[:graph:]]", " ")
        # and convert to lower case:
        sentence <- tolower(sentence)
        
        # split into words. str_split is in the stringr package
        word.list <- str_split(sentence, '\\s+')
        # sometimes a list() is one level of hierarchy too much
        words <- unlist(word.list)
        
        # compare our words to the dictionaries of positive & negative terms
        pos.matches <- match(words, pos.words)
        neg.matches <- match(words, neg.words)
        
        # match() returns the position of the matched term or NA
        # we just want a TRUE/FALSE:
        pos.matches <- !is.na(pos.matches)
        neg.matches <- !is.na(neg.matches)
        
        # TRUE/FALSE will be treated as 1/0 by sum():
        score <- sum(pos.matches) - sum(neg.matches)
        
        return(score)
      }, pos.words, neg.words, .progress=.progress )
      
      scores.df <- data.frame(score=scores, text=sentences)
      return(scores.df)
    }

But when I run the following code afterward, I get the error message `Error in data.frame(score = scores, text = sentences) : arguments imply differing number of rows: 17, 7552`:

analysis <- score.sentiment(tweet_clean, pos.words, neg.words)

My dataset had 17 columns before and still has 17 columns, but the code stopped working. I can't find the problem.

Some sample data:

pos.words <- c("zenginlik", "zerafet", "zevk", "zevkle", "zevkli")

neg.words <- c("zindan", "ziyan", "zor", "zoraki", "zorba")

tweet_clean <- 
structure(list(...1 = c("1", "2"), text = c(" steam e  liradan girmis biraz ucuzlarsa alacagim ama  ayi bulur tahminen ", 
" hollanda kvl hotel buhar odasi  netherlands kvl hotel steam room yurt disi projelerimizde avrupa da  ülkeyi r "
), favorited = c("FALSE", "FALSE"), favoriteCount = c(0, 0), 
    replyToSN = c("socalledevelopr", NA), created = structure(c(1642422257, 
    1642421795), class = c("POSIXct", "POSIXt"), tzone = "UTC"), 
    truncated = c("FALSE", "TRUE"), replyToSID = c("1482169113778737154", 
    NA), id = c("1483052554925879298", "1483050615282491392"), 
    replyToUID = c("1476703946907525120", NA), statusSource = c("<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>", 
    "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>"
    ), screenName = c("NihiluX89", "teknohavuz"), retweetCount = c(0, 
    0), isRetweet = c("FALSE", "FALSE"), retweeted = c("FALSE", 
    "FALSE"), longitude = c(NA_character_, NA_character_), latitude = c(NA_character_, 
    NA_character_)), row.names = 1:2, class = "data.frame")

Here is my sessionInfo():

R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    
system code page: 1254

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] reshape2_1.4.4     readxl_1.3.1       xlsx_0.6.5         widyr_0.1.4        janeaustenr_0.1.5  forcats_0.5.1      tidyr_1.1.4       
 [8] tibble_3.1.6       tidyverse_1.3.1    magrittr_2.0.1     plotly_4.10.0      ggeasy_0.1.3       rtweet_0.7.0       networkD3_0.4     
[15] glue_1.6.0         igraph_1.2.11      lubridate_1.8.0    hms_1.1.1          stringi_1.7.6      readr_2.1.1        tidytext_0.3.2    
[22] stringr_1.4.0      ggplot2_3.3.5      wordcloud_2.6      RColorBrewer_1.1-2 openssl_1.4.6      SnowballC_0.7.0    RCurl_1.98-1.5    
[29] purrr_0.3.4        dplyr_1.0.7        plyr_1.8.6         tm_0.7-8           NLP_0.2-1          twitteR_1.1.9      ROAuth_0.9.6      

loaded via a namespace (and not attached):
 [1] bitops_1.0-7      fs_1.5.2          bit64_4.0.5       httr_1.4.2        tools_4.1.1       backports_1.4.1   utf8_1.2.2       
 [8] R6_2.5.1          DBI_1.1.2         lazyeval_0.2.2    colorspace_2.0-2  withr_2.4.3       tidyselect_1.1.1  curl_4.3.2       
[15] bit_4.0.4         compiler_4.1.1    cli_3.1.0         rvest_1.0.2       xml2_1.3.3        labeling_0.4.2    slam_0.1-49      
[22] scales_1.1.1      askpass_1.1       digest_0.6.29     pkgconfig_2.0.3   htmltools_0.5.2   dbplyr_2.1.1      fastmap_1.1.0    
[29] htmlwidgets_1.5.4 rlang_0.4.12      rstudioapi_0.13   farver_2.1.0      generics_0.1.1    jsonlite_1.7.2    qdapRegex_0.7.2  
[36] tokenizers_0.2.1  textclean_0.9.3   Matrix_1.3-4      Rcpp_1.0.7        munsell_0.5.0     fansi_0.5.0       lifecycle_1.0.1  
[43] grid_4.1.1        parallel_4.1.1    crayon_1.4.2      lattice_0.20-44   haven_2.4.3       xlsxjars_0.6.1    pillar_1.6.4     
[50] rjson_0.2.20      reprex_2.0.1      data.table_1.14.2 modelr_0.1.8      vctrs_0.3.8       tzdb_0.2.0        cellranger_1.1.0 
[57] gtable_0.3.0      assertthat_0.2.1  broom_0.7.11      viridisLite_0.4.0 rJava_1.0-6       ellipsis_0.3.2

You should include sample values of `pos.words` and `neg.words` (and any other missing bits) so that the code in your question will run and produce the same error for us. — user2554330, Jan 18 '22 at 19:35
@user2554330 these are from pos.words txt file: zenginlik, zerafet, zevk, zevkle, zevkli And these are from neg.words txt file: zorlu, zorluk, zorluklar, zulmeden — Samet Turgut, Jan 18 '22 at 19:46
FYI, you're using `require` incorrectly: it does not stop flow if the packages are not found. Either capture the return value from `require(.)` and do something with it, or use `library(.)` and have it error if the package is not available. See https://stackoverflow.com/a/51263513/3358272, https://yihui.org/en/2014/07/library-vs-require/, https://r-pkgs.org/namespace.html#search-path — r2evans, Jan 18 '22 at 20:00
@r2evans thanks for the info. I have `library(.)` calls before this code, so I wish that was the problem here. But you're right. — Samet Turgut, Jan 18 '22 at 20:16
What I had in mind is more what is described here: https://stackoverflow.com/a/5963610/2554330 — user2554330, Jan 18 '22 at 20:31
@user2554330 I edited the post, I hope this much is okay. If not, I can edit again. — Samet Turgut, Jan 18 '22 at 21:04
Now I can run your code, and I get an error something like yours, but I'm afraid I can't help you much: the code doesn't really make sense to me. Maybe you intended to run `score.sentiment(tweet_clean$text, pos.words, neg.words)` ? — user2554330, Jan 18 '22 at 21:23
@user2554330 I don't know what to say... It works. But it was working without the `$text` part before. I guess some package caused this. This little mistake made me really mad, and I'm so stupid that I did not even try adding a little `$text` to my code. I really appreciate it, mate. — Samet Turgut, Jan 18 '22 at 21:35
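
For anyone landing here with the same error: the 17 in the message matches the number of columns in `tweet_clean`, which suggests the scoring function ran once per column instead of once per tweet. Below is a minimal sketch of that behaviour. It assumes `laply()` treats a data frame like any other list and so iterates over its columns; base `sapply()` is used as a stand-in so the example needs no packages. The names `tweets` and `score_one` are made up for illustration and are not from the question.

```r
# Sample dictionaries taken from the question.
pos.words <- c("zenginlik", "zerafet", "zevk", "zevkle", "zevkli")
neg.words <- c("zindan", "ziyan", "zor", "zoraki", "zorba")

# Hypothetical stand-in for tweet_clean: 3 rows, 2 columns.
tweets <- data.frame(
  id   = c("1", "2", "3"),
  text = c("zevk zor", "zevkli", "zindan ziyan"),
  stringsAsFactors = FALSE
)

# Simplified scorer (illustrative, not the question's full cleaning pipeline):
# positive matches minus negative matches over all words it is given.
score_one <- function(piece, pos.words, neg.words) {
  words <- unlist(strsplit(tolower(piece), "\\s+"))
  sum(words %in% pos.words) - sum(words %in% neg.words)
}

# A data frame is a list of columns, so iterating over it visits each
# column once: one score per column, not one per row.
per_column <- unname(sapply(tweets, score_one, pos.words, neg.words))
length(per_column)   # 2, the number of columns

# Iterating over the text column gives one score per tweet.
per_row <- unname(sapply(tweets$text, score_one, pos.words, neg.words))
per_row              # 0 1 -2

# Pairing per-column scores with the whole data frame reproduces the
# "differing number of rows" error (2 vs 3 here, 17 vs 7552 in the question).
msg <- tryCatch(
  {
    data.frame(score = per_column, text = tweets)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Passing only the text column works.
result <- data.frame(score = per_row, text = tweets$text)
```

With the question's own function, the equivalent call is the one suggested in the comments: `score.sentiment(tweet_clean$text, pos.words, neg.words)`.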
