It might seem a silly question, but how can I repeat the line below 152 times, once for each row of mydata? I would rather not use a for loop, since I expect it to be inefficient later with larger data sets:
reviews = as.vector(t(mydata)[,1])
mydata is a data.frame and reviews is a character vector; [,1] selects the first column of t(mydata), which is the first row of mydata.
The output could be a matrix or, in the worst case, a data.frame.
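For reference, a minimal sketch of what that single line does, on a made-up data frame. The name toy and its sentences are hypothetical stand-ins for mydata, and stringsAsFactors = TRUE is assumed so that the columns are factors like the real ones:

```r
# Hypothetical toy data: 3 paragraphs with up to 3 sentences each,
# padded with NA, stored as factors (an assumption mirroring mydata).
toy <- data.frame(
  V1 = c("I was bumped.", "First time flying.", "Great crew."),
  V2 = c("No reason was given.", "It went well.", NA),
  V3 = c("I spent 5 hours waiting.", NA, NA),
  stringsAsFactors = TRUE
)

# t() converts the data frame to a character matrix whose columns are
# the original rows, so column 1 holds the sentences of paragraph 1.
all_rows <- t(toy)
reviews  <- as.vector(all_rows[, 1])
reviews
```

Since t(toy) is already a character matrix with one column per original row, the matrix output mentioned above may not need any loop at all.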
I tried something like this, but it did not work:
testing = apply(mydata, 1, function(x) {as.vector(t(mydata)[, x])})
Error in t(mydata)[, x] : subscript out of bounds
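The error above can be reproduced on a small made-up data frame (toy below is hypothetical, not the real mydata). A likely explanation is that apply(mydata, 1, ...) passes the contents of each row to the function rather than the row's position, so x is a character vector of sentences and t(mydata)[, x] tries to look up columns by those strings:

```r
# Hypothetical toy data with factor columns (an assumption mirroring mydata).
toy <- data.frame(
  V1 = c("I was bumped.", "First time flying."),
  V2 = c("No reason was given.", "It went well."),
  stringsAsFactors = TRUE
)

# What apply() hands to the function: the row's values, not its index.
seen <- apply(toy, 1, function(x) class(x))
seen

# Using those strings as column subscripts fails; tryCatch() captures
# the message instead of stopping the script.
err <- tryCatch(
  apply(toy, 1, function(x) as.vector(t(toy)[, x])),
  error = function(e) conditionMessage(e)
)
err
```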
Thanks.
EDIT: Quick data sample:
> reviews = as.vector(t(mydata)[,1])
> class(reviews)
[1] "character"
> length(reviews)
[1] 14
> reviews
[1] "I was involuntarily"
[2] "I was in transit"
[3] "My initial flight"
[4] "That still left"
[5] "After disembarking"
[6] "customs and proceed to my gate."
[7] "I arrived"
[8] "When my boarding pass was scanned"
[9] "No reason was given for the bump."
[10] "The UA gate staff"
[11] "I boarded Air Canada."
[12] "After arriving"
[13] "I spent 5 hours"
[14] NA
mydata data.frame:
> class(mydata)
[1] "data.frame"
> length(mydata[,1])
[1] 152
> mydata[,1]
[1] I was involuntarily... .
[2] First time... .
...
...
152 Levels: First time . ...
I have about 30,000 of these paragraphs, but I want to start small, so for now there are only 152 paragraphs, each split into individual sentences and put into a data.frame. Each row of the data.frame holds 5-15 sentences.
I want to be able to access each row as an array, since I need to perform some action on each row of the data.frame.
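One possible way to reach every row without writing an explicit for loop is sketched below. The data frame toy is made up, and counting sentences is only a placeholder for the real per-row action, which is not shown here. Note that lapply still iterates internally, so this is about convenience rather than a guaranteed speed-up:

```r
# Hypothetical toy data: factor columns, rows padded with NA.
toy <- data.frame(
  V1 = c("I was bumped.", "First time flying.", "Great crew."),
  V2 = c("No reason was given.", "It went well.", NA),
  V3 = c("I spent 5 hours waiting.", NA, NA),
  stringsAsFactors = TRUE
)

# Convert once to a character matrix (factors become their labels).
m <- as.matrix(toy)

# One character vector per row, with the NA padding dropped.
rows <- lapply(seq_len(nrow(m)), function(i) {
  s <- m[i, ]
  unname(s[!is.na(s)])
})

# Placeholder per-row action: count the sentences in each paragraph.
n_sentences <- vapply(rows, length, integer(1))
n_sentences
```

Keeping the rows in a list rather than a matrix allows each paragraph to have a different number of sentences once the NA padding is removed.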
Packages used: plyr, sentiment (downloaded from here and installed manually)
EDIT 2:
dput(myData[1:6, 1:6])
structure(list(V1 = structure(c(70L, 41L, 94L, 114L, 47L, 49L),
.Label = c(" Air Canada",
"their service",
"hours for de-icing",
"have flown BA",
"my booking",
"If the video screen",
"Frankfurt flights",
"and another 150 lines of text data",