
I have data, for example:

foo
bar, john
bloggs
smith
william
jones, doug

I want to turn these into a flat list whose elements are foo, bar, john, bloggs, etc. I tried a flatmap from the purrr package, which gave me a useless mess of a data frame. I also tried building a list like so, which very helpfully gives me back the list I started with.

var_list = list()
i = 1
for (variable in variables_list) {
    split = strsplit(variable, ',')
    for (s in split) {
        var_list[[i]] = trimws(s)
        i = i + 1
    }
}

In Java, I could do something like this:

list.stream()
        .flatMap(s -> Stream.of(s.split(",")))
        .map(String::trim)
        .collect(Collectors.toList());

That accomplishes all of this in one statement. As a secondary question, since R bills itself as a functional language, is it possible to flatmap the data directly in a one-liner, as in Java?

ifly6

3 Answers


Most functions in R are vectorized, so you don't have to map explicitly. For example, you can do

trimws(unlist(strsplit(unlist(strsplit(x, "\n")), ",")))
# [1] "foo"     "bar"     "john"    "bloggs"  "smith"   "william" "jones"  
# [8] "doug" 

where

x<-"foo
bar, john
bloggs
smith
william
jones, doug"
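
If the data is already a character vector with one entry per line, the newline split is not needed. The following is a minimal sketch under that assumption (the question does not show how the data was loaded); the name `variables` is hypothetical.

```r
# Hypothetical input: one entry per element, some holding comma-separated values
variables <- c("foo", "bar, john", "bloggs", "smith", "william", "jones, doug")

# strsplit() is vectorized over its input and returns a list of character vectors;
# unlist() flattens that list and trimws() strips the surrounding whitespace
flat <- trimws(unlist(strsplit(variables, ",", fixed = TRUE)))

print(flat)
```

The nesting reads inside out: split, then flatten, then trim, which corresponds roughly to the `flatMap` and `map` steps of the Java stream.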
MrFlick
  • I don't know why, but this doesn't work. It tells me that `strsplit` must take a character argument – ifly6 Apr 11 '18 at 19:41
  • @ifly6 Did you try exactly what I wrote with the sample data here? You didn't make it clear exactly how you loaded your data into R, so I took a guess with my `x` value. Maybe you read it in as factors? When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Provide a `dput()` so we know exactly how your data is stored. – MrFlick Apr 11 '18 at 19:43
  • Just implemented the entire thing in `pandas`, took like 2 minutes. However, for reference, how would I coerce a list's elements to character? – ifly6 Apr 11 '18 at 19:54
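
On the coercion question in the last comment: `strsplit` rejects factors, so a conversion with `as.character` is one way around that error. This is a minimal sketch, assuming the data arrived as a factor or as a list of factors; the names `f` and `lst` are hypothetical and the actual shape of the asker's data is not known.

```r
# Hypothetical inputs: a factor and a list whose elements are factors
f <- factor(c("foo", "bar, john"))
lst <- list(factor("bloggs"), factor("jones, doug"))

# as.character() converts a factor to its labels, which strsplit() accepts
from_factor <- trimws(unlist(strsplit(as.character(f), ",")))

# For a list, convert each element first, then flatten and split
chars <- unlist(lapply(lst, as.character))
from_list <- trimws(unlist(strsplit(chars, ",")))

print(from_factor)
print(from_list)
```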

You can use unlist to flatten a list:

> x <- c("foo", "bar, john", "bloggs", "smith", "william", "jones, doug")
> x
[1] "foo"         "bar, john"   "bloggs"      "smith"       "william"     "jones, doug"
> unlist(strsplit(x, ","))
[1] "foo"     "bar"     " john"   "bloggs"  "smith"   "william" "jones"   " doug" 
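
Note that the result above keeps the space after each comma (" john", " doug"). A minimal sketch of two ways to drop it, using the same `x`; the names `a` and `b` are only for illustration.

```r
x <- c("foo", "bar, john", "bloggs", "smith", "william", "jones, doug")

# Option 1: strip leading and trailing whitespace after splitting
a <- trimws(unlist(strsplit(x, ",")))

# Option 2: let the split pattern (a regular expression) consume
# the whitespace around each comma
b <- unlist(strsplit(x, "\\s*,\\s*"))

print(a)
print(b)
```

Option 2 only removes whitespace next to a comma, so Option 1 is the safer choice if entries may also have stray spaces at the start or end.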
C. Braun

Not sure what flatmapping is, but if your data exists in a text file you could do something like this:

pth <- "/path/to/file.txt"
gsub(",","",scan(pth,""))
Read 8 items
[1] "foo"     "bar"     "john"    "bloggs"  "smith"   "william" "jones"   "doug"

scan reads the file as whitespace-separated items, and gsub then removes the commas. Note that this also splits any element that itself contains a space.

If you really do want your output in a list instead of a vector, then split it (here x is the character vector produced above):

split(x,seq_along(x))
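
As a small sketch of what that call produces, with a shortened hypothetical `x`: `split` returns a list whose names are the positions, while `as.list` is an alternative that returns an unnamed list.

```r
# Hypothetical stand-in for the character vector produced above
x <- c("foo", "bar", "john")

# split() groups by position, so each element lands in its own list slot,
# named "1", "2", "3"
named <- split(x, seq_along(x))

# as.list() gives one slot per element without names
plain <- as.list(x)

str(named)
str(plain)
```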
Brian Davis