
I can read data from a URL as a list of baskets, rather than as a data frame, like this:

url <- "http://www.salemmarafi.com/wp-content/uploads/2014/03/groceries.csv"
baskets <- strsplit(readLines(url), ",", fixed=TRUE)
> head(baskets,2)
[[1]]
[1] "citrus fruit"        "semi-finished bread" "margarine"           "ready soups"        

[[2]]
[1] "tropical fruit" "yogurt"         "coffee" 

But when I try to accomplish the same by loading data from a file the result is:

baskets2 <- strsplit(readLines("groceries.csv"), ",", fixed=TRUE)
> head(baskets2,2)
[[1]]
[1] "citrus fruit;semi-finished bread;margarine;ready soups;;;;;;;;;;;;;;;;;;;;;;;;;;;;"

[[2]]
[1] "tropical fruit;yogurt;coffee;;;;;;;;;;;;;;;;;;;;;;;;;;;;;"

And with ";" it's:

> baskets2 <- strsplit(readLines("groceries.csv"), ";", fixed=TRUE)
> head(baskets2,2)
[[1]]
 [1] "citrus fruit"        "semi-finished bread" "margarine"           "ready soups"        
 [5] ""                    ""                    ""                    ""                   
 [9] ""                    ""                    ""                    ""                   
etc

How can I load the data from a file on my computer (e.g. C:/mypath/groceries.csv) the same way it loads from the URL, i.e. without the empty items produced by the trailing ";;;;"?

*EDIT: This is a single CSV file, and I'm trying to avoid loading it into a data frame.

ElinaJ
  • Possible duplicate of [Read multiple CSV files into separate data frames](http://stackoverflow.com/questions/5319839/read-multiple-csv-files-into-separate-data-frames) – alki Mar 06 '16 at 04:30
  • Try `read.csv()`. Then turn it into a list using `as.list()`. No calls to `strsplit()` are needed. – Richard Border Mar 06 '16 at 05:24
  • @Chani I don't think it is a duplicate of that question, though it might be of others – Richard Border Mar 06 '16 at 05:25
  • The solution provided by @Buckminster gives a list with empty strings. An alternative is to first use `readLines` and then `lapply(lines, function(x){ unlist(strsplit(x, ",")) })` – jbkunst Mar 08 '16 at 14:37
  • As @jbkunst also provided an incorrect hint, here is a solution: `groc <- read.csv('groceries.csv', header = F)` `apply(groc, 1, function(x) unique(x)[unique(x)!=''])`. This can be optimized if you need to do this for bigger files but works pretty quick for your example. – Richard Border Mar 09 '16 at 18:21
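
A minimal sketch of one way to get the same list-of-baskets shape from the local file, building on the `";"` split from the question: split each line on `";"` and then drop the empty strings that the trailing `;;;;` padding leaves behind. The temporary file and its two sample lines are stand-ins for the real `groceries.csv` (an assumption made so the example runs on its own); with the actual file you would pass its path to `readLines()` instead.

```r
# Hypothetical stand-in for the local file: two lines in the
# semicolon-padded format shown in the question.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "citrus fruit;semi-finished bread;margarine;ready soups;;;;",
  "tropical fruit;yogurt;coffee;;;;;"
), path)

# Split each line on ";" (as a literal string, not a regex), then keep only
# the non-empty items, which removes the padding entries.
baskets2 <- lapply(
  strsplit(readLines(path), ";", fixed = TRUE),
  function(items) items[nzchar(items)]
)

head(baskets2, 2)
```

No data frame is created at any point: `strsplit()` already returns a list with one character vector per line, and `nzchar()` only filters each vector.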
