
I have a folder with 60-some .csv files containing "scraped" tweets. Each file has 6 columns: one is simply the "position" in which they were downloaded (which I might as well drop), then one for the date, one for the time, and then the 3 important ones: the ticker (e.g. AAPL, MSFT...), the body of the tweet, and lastly a label that is available for only a small part of them. What I need is a for loop that applies svm and NaiveBayes to every file so as to predict the label for the non-labeled tweets. I have already run both algorithms on the training set; the output from this loop would be the test set. Thank you in advance.

  • When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Ask a specific question that can be unambiguously answered. – MrFlick Feb 26 '18 at 21:48
  • I wasn't able to find the question you all linked, but I seem to have found an answer there. Thank you all for the answers. – TheClutch01 Feb 27 '18 at 14:13

1 Answer


We can get a character vector of all relevant files with list.files:

fn <- list.files(pattern = "\\.csv$")
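
If the files sit in a subfolder rather than in the working directory, you can point list.files at that folder and ask for full paths so read.csv can find them; the folder name "tweets" below is just a placeholder for your actual directory:

fn <- list.files(path = "tweets", pattern = "\\.csv$", full.names = TRUE)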

Then use a member of the *apply family to read in the data and apply a function; for example, something like this:

lst <- lapply(fn, function(x) {
    df <- read.csv(x)
    ## Example: fit a linear model and return the lm object
    lm(y ~ x, data = df)
})

lst then contains a list of lm objects. If you replace lm with svm, lst will instead be a list of fitted support vector machine models.
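For the tweet data specifically, a minimal sketch of that loop might look like the following. This assumes the e1071 package (which provides svm and naiveBayes), assumed column names label and body for the partial labels and the tweet text, and a hypothetical build_features() step standing in for whatever text-to-feature conversion you already used on the training set; adjust all of these to match your data.

library(e1071)   # provides svm() and naiveBayes()

lst <- lapply(fn, function(f) {
    df <- read.csv(f, stringsAsFactors = FALSE)

    ## Split each file into labeled rows (training) and unlabeled rows (test);
    ## "label" and "body" are assumed column names.
    labeled   <- df[!is.na(df$label) & df$label != "", ]
    unlabeled <- df[is.na(df$label) | df$label == "", ]

    ## build_features() is a hypothetical placeholder for whatever step
    ## turns the raw tweet text into numeric features (e.g. a
    ## document-term matrix); svm()/naiveBayes() cannot use raw text directly.
    x_train <- build_features(labeled$body)
    x_new   <- build_features(unlabeled$body)

    ## Fit the classifier and predict the missing labels;
    ## swap svm() for naiveBayes() to run the second model.
    fit <- svm(x = x_train, y = factor(labeled$label))
    unlabeled$label <- predict(fit, x_new)

    unlabeled   # the test set for this file, with predicted labels
})

The result is a list with one predicted test set per file; if you want them in a single data frame you can combine them afterwards with do.call(rbind, lst).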

Maurits Evers