I have a folder with 60-some .csv files containing "scraped" tweets. Each file has 6 columns: one is simply the "position" in which they were downloaded (which I might as well get rid of), then one for the date, one for the time, and then the 3 important ones: the ticker (e.g. AAPL, MSFT...), the body of the tweet, and lastly a label that is available only for a small part of them. What I need to do is run a for loop to apply SVM and Naive Bayes to every file in order to predict the label for the non-labeled tweets. I've already run both algorithms on the training set; the output from this loop would be the test set. Thank you in advance.
- When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Ask a specific question that can be unambiguously answered. – MrFlick Feb 26 '18 at 21:48
- I wasn't able to find the question you all linked, but I seem to have found an answer there. Thank you all for the answers. – TheClutch01 Feb 27 '18 at 14:13
1 Answer
We can get a character vector of all relevant files with list.files:
fn <- list.files(pattern = "\\.csv$")
Then use a member of the *apply family to read in the data and apply a function; for example, something like this:
lst <- lapply(fn, function(file) {
    ## Read one file into a data frame
    df <- read.csv(file)
    ## Example: fit a linear model and return the lm object
    lm(y ~ x, data = df)
})
lst then contains a list of lm objects. If you replace lm with svm, the return objects will be a list of results from a support vector machine classification.
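For the specific task in the question (predicting the missing label in each file), the function passed to lapply could split each data frame into labelled and unlabelled rows, fit the classifier on the labelled part, and predict on the rest. Below is a minimal sketch using e1071::svm; the column names (ticker, body, label) and the toy features are assumptions about your data, and you would substitute whatever features you already built for your training set. e1071::naiveBayes can be swapped in the same way.

library(e1071)

fn <- list.files(pattern = "\\.csv$")

pred <- lapply(fn, function(file) {
    df <- read.csv(file, stringsAsFactors = FALSE)

    ## Toy features derived from the tweet text (placeholders -- replace
    ## with the features you used when training on the labelled data)
    df$n_char  <- nchar(df$body)
    df$n_words <- lengths(strsplit(df$body, "\\s+"))

    ## Split into labelled (training) and unlabelled (test) tweets;
    ## a missing label is assumed to be NA or an empty string here
    has_label  <- !is.na(df$label) & df$label != ""
    labelled   <- df[has_label, ]
    unlabelled <- df[!has_label, ]

    ## Fit the SVM on the labelled rows ...
    fit <- svm(factor(label) ~ n_char + n_words, data = labelled)

    ## ... and predict the label for the unlabelled rows
    unlabelled$label <- predict(fit, newdata = unlabelled)
    unlabelled
})

pred is then a list of data frames, one per file, holding the predicted labels for the previously unlabelled tweets.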

Maurits Evers