This code was working fine, but even though I didn't make any changes to this code block, it started throwing an "arguments imply differing number of rows" error.
When I run this code block on its own, it doesn't show any error message:
pos.words <- scan('C:/Users/samet/Desktop/twitter_steam_text_mining/pozitifkelimeler.txt', what = 'character', comment.char = ';', skipNul = TRUE)
neg.words <- scan('C:/Users/samet/Desktop/twitter_steam_text_mining/negatifkelimeler.txt', what = 'character', comment.char = ';', skipNul = TRUE)
score.sentiment <- function(sentences, pos.words, neg.words, .progress='none')
{
  require(plyr)
  require(stringr)
  scores <- laply(sentences, function(sentence, pos.words, neg.words)
  {
    # clean up sentences with R's regex-driven global substitute, gsub() function:
    sentence <- gsub('https://','',sentence)
    sentence <- gsub('http://','',sentence)
    sentence <- gsub('[^[:graph:]]', ' ',sentence)
    sentence <- gsub('[[:punct:]]', '', sentence)
    sentence <- gsub('[[:cntrl:]]', '', sentence)
    sentence <- gsub('\\d+', '', sentence)
    sentence <- str_replace_all(sentence,"[^[:graph:]]", " ")
    # and convert to lower case:
    sentence <- tolower(sentence)
    # split into words. str_split is in the stringr package
    word.list <- str_split(sentence, '\\s+')
    # sometimes a list() is one level of hierarchy too much
    words <- unlist(word.list)
    # compare our words to the dictionaries of positive & negative terms
    pos.matches <- match(words, pos.words)
    neg.matches <- match(words, neg.words)
    # match() returns the position of the matched term or NA
    # we just want a TRUE/FALSE:
    pos.matches <- !is.na(pos.matches)
    neg.matches <- !is.na(neg.matches)
    # TRUE/FALSE will be treated as 1/0 by sum():
    score <- sum(pos.matches) - sum(neg.matches)
    return(score)
  }, pos.words, neg.words, .progress=.progress )
  scores.df <- data.frame(score=scores, text=sentences)
  return(scores.df)
}
However, I get the error
Error in data.frame(score = scores, text = sentences) : arguments imply differing number of rows: 17, 7552
when I run this line afterward:
analysis <- score.sentiment(tweet_clean, pos.words, neg.words)
My dataset had 17 columns before and it still has 17 columns, but the code stopped working. I can't find the problem here.
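A minimal sketch for checking where the two lengths in the error might come from. It assumes, without confirmation from the post, that the 17 corresponds to a column count and the 7552 to a row count. The data frame `df` and its columns are hypothetical stand-ins, not the real `tweet_clean`. It only relies on the base R fact that a data frame is a list of columns, so `length()` and element-wise iteration work on columns, not rows.

```r
# Hypothetical stand-in for the real data: 3 columns, 2 rows.
df <- data.frame(
  id   = c("1", "2"),
  text = c("zevk zor", "zevkli"),
  n    = c(0, 0),
  stringsAsFactors = FALSE
)

# A data frame is a list of columns, so length() counts columns.
n_cols <- length(df)        # 3

# A single column is a vector with one element per row.
n_rows <- length(df$text)   # 2

# Element-wise iteration over a data frame visits each column once,
# so the result has one entry per column, not one per row.
per_element <- sapply(df, function(x) length(x))

print(n_cols)
print(n_rows)
print(per_element)
```

Running the same three checks on the actual object passed to score.sentiment would show whether its length matches the first number in the error message.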
Some sample data:
pos.words <- c("zenginlik", "zerafet", "zevk", "zevkle", "zevkli")
neg.words <- c("zindan", "ziyan", "zor", "zoraki", "zorba")
tweet_clean <-
structure(list(...1 = c("1", "2"), text = c(" steam e liradan girmis biraz ucuzlarsa alacagim ama ayi bulur tahminen ",
" hollanda kvl hotel buhar odasi netherlands kvl hotel steam room yurt disi projelerimizde avrupa da ülkeyi r "
), favorited = c("FALSE", "FALSE"), favoriteCount = c(0, 0),
replyToSN = c("socalledevelopr", NA), created = structure(c(1642422257,
1642421795), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
truncated = c("FALSE", "TRUE"), replyToSID = c("1482169113778737154",
NA), id = c("1483052554925879298", "1483050615282491392"),
replyToUID = c("1476703946907525120", NA), statusSource = c("<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>",
"<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>"
), screenName = c("NihiluX89", "teknohavuz"), retweetCount = c(0,
0), isRetweet = c("FALSE", "FALSE"), retweeted = c("FALSE",
"FALSE"), longitude = c(NA_character_, NA_character_), latitude = c(NA_character_,
NA_character_)), row.names = 1:2, class = "data.frame")
Here is my sessionInfo():
R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C LC_TIME=English_United Kingdom.1252
system code page: 1254
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] reshape2_1.4.4 readxl_1.3.1 xlsx_0.6.5 widyr_0.1.4 janeaustenr_0.1.5 forcats_0.5.1 tidyr_1.1.4
[8] tibble_3.1.6 tidyverse_1.3.1 magrittr_2.0.1 plotly_4.10.0 ggeasy_0.1.3 rtweet_0.7.0 networkD3_0.4
[15] glue_1.6.0 igraph_1.2.11 lubridate_1.8.0 hms_1.1.1 stringi_1.7.6 readr_2.1.1 tidytext_0.3.2
[22] stringr_1.4.0 ggplot2_3.3.5 wordcloud_2.6 RColorBrewer_1.1-2 openssl_1.4.6 SnowballC_0.7.0 RCurl_1.98-1.5
[29] purrr_0.3.4 dplyr_1.0.7 plyr_1.8.6 tm_0.7-8 NLP_0.2-1 twitteR_1.1.9 ROAuth_0.9.6
loaded via a namespace (and not attached):
[1] bitops_1.0-7 fs_1.5.2 bit64_4.0.5 httr_1.4.2 tools_4.1.1 backports_1.4.1 utf8_1.2.2
[8] R6_2.5.1 DBI_1.1.2 lazyeval_0.2.2 colorspace_2.0-2 withr_2.4.3 tidyselect_1.1.1 curl_4.3.2
[15] bit_4.0.4 compiler_4.1.1 cli_3.1.0 rvest_1.0.2 xml2_1.3.3 labeling_0.4.2 slam_0.1-49
[22] scales_1.1.1 askpass_1.1 digest_0.6.29 pkgconfig_2.0.3 htmltools_0.5.2 dbplyr_2.1.1 fastmap_1.1.0
[29] htmlwidgets_1.5.4 rlang_0.4.12 rstudioapi_0.13 farver_2.1.0 generics_0.1.1 jsonlite_1.7.2 qdapRegex_0.7.2
[36] tokenizers_0.2.1 textclean_0.9.3 Matrix_1.3-4 Rcpp_1.0.7 munsell_0.5.0 fansi_0.5.0 lifecycle_1.0.1
[43] grid_4.1.1 parallel_4.1.1 crayon_1.4.2 lattice_0.20-44 haven_2.4.3 xlsxjars_0.6.1 pillar_1.6.4
[50] rjson_0.2.20 reprex_2.0.1 data.table_1.14.2 modelr_0.1.8 vctrs_0.3.8 tzdb_0.2.0 cellranger_1.1.0
[57] gtable_0.3.0 assertthat_0.2.1 broom_0.7.11 viridisLite_0.4.0 rJava_1.0-6 ellipsis_0.3.2