I'm a beginning R user and have this list of 5 matrices:

[[1]]
     [,1] [,2]
[1,] ""   "EWR/MIA"

[[2]]
     [,1] [,2]
[1,] ""   "MIA/JFK"

[[3]]
     [,1] [,2]
[1,] ""   "FLR/BRU"
[2,] ""   "BRU/EVN"

[[4]]
     [,1] [,2]
[1,] ""   "FCO/JFK"
[2,] ""   "BOS/FCO"

[[5]]
     [,1] [,2]

This list was created by a str_match_all call that I ran on a data frame of 5 rows.

How do I create a new data frame that combines these results into 6 rows? Furthermore, I'd like to be able to split each result into two columns (e.g. EWR in column 1 and MIA in column 2).

Thanks!

EDIT: Here is my data frame:

> dput(Egencia.input)
structure(list(Domestic...International = structure(c(2L, 1L, 
1L, 2L, 2L), .Label = c("Domestic", "International"), class = "factor"), 
Ticketing.carrier = structure(c(3L, 2L, 3L, 1L, 1L), .Label = c("Air France", 
"American Airlines", "Delta"), class = "factor"), Routing = structure(c(1L, 
4L, 3L, 2L, 5L), .Label = c("EWR/MIA", "FCO/JFK_BOS/FCO", 
"FLR/BRU/EVN", "MIA/JFK", "New York (Penn S/New Carrollton,M"
), class = "factor")), row.names = c(NA, -5L), class = "data.frame")

And the code I'm using:

library(stringr)  # provides str_match_all()

Egencia.input <- read.csv("/Users/nliusont/Documents/NYU/R/test2.csv", header = TRUE)

city.pair.temp <- "(?=([A-Z]{3}/[A-Z]{3}))"

city.pairs <- str_match_all(Egencia.input$Routing, city.pair.temp)
nliusont
  • Please add your data.frame and the code you used to get this output. There might be a better way than fixing this afterwards. See [here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for how to do this. – phiver Jul 05 '18 at 15:22
  • @phiver Added data frame and existing code as requested. – nliusont Jul 05 '18 at 15:43

3 Answers

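We can use map_df from purrr (loaded with the tidyverse) to convert each matrix to a data frame, split the second column on "/", and row-bind the results: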

library(tidyverse)
map_df(lst, ~ 
          .x %>%
              as.data.frame %>%
              separate(V2, into = c("V2", "V3"), sep="/")) %>%
  select(-V1)

data

lst <- list(cbind("", "EWR/MIA"), cbind("", "MIA/JFK"), 
          cbind(c("", ""), c("FLR/BRU", "BRU/EVN")))
akrun

Since you have a list of matrices with the same number of columns, the easiest way is just to rbind them:

result = do.call(rbind, city.pairs)

The pattern do.call(f, list(a, b, c)) is an alternative way of writing f(a, b, c). Since you have a list, you need to use do.call rather than the direct call.
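
As a minimal sketch of that pattern, the block below rebuilds a stand-in for the list printed in the question (the matrices are typed in by hand here, which is an assumption about the exact str_match_all output) and stacks it with do.call. The zero-row fifth element contributes no rows, so the result has the 6 rows asked for.

```r
# Hand-built stand-in for the str_match_all() result shown in the question:
# a list of two-column character matrices, the last one with zero rows.
city.pairs <- list(
  cbind("", "EWR/MIA"),
  cbind("", "MIA/JFK"),
  cbind(c("", ""), c("FLR/BRU", "BRU/EVN")),
  cbind(c("", ""), c("FCO/JFK", "BOS/FCO")),
  matrix(character(0), nrow = 0, ncol = 2)
)

# Same as rbind(city.pairs[[1]], city.pairs[[2]], ..., city.pairs[[5]])
result <- do.call(rbind, city.pairs)

dim(result)    # one row per matched pair, two columns
result[, 2]    # the matched "AAA/BBB" strings
```

The first column only holds the empty full match of the lookahead, so result[, 2] is the part worth keeping.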

Furthermore, I'd like to be able to split each result into two columns

Adjust your regular expression then:

city.pair.temp <- "(?=([A-Z]{3})/([A-Z]{3}))"

This way, you get two separate match groups, before and after the slash.
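
Putting both steps together might look like the sketch below. It assumes stringr is installed; the routing vector is copied from the Routing labels in the question, and the names pairs and routes are only illustrative.

```r
library(stringr)

# Routing values taken from the question's data frame
routing <- c("EWR/MIA", "MIA/JFK", "FLR/BRU/EVN", "FCO/JFK_BOS/FCO",
             "New York (Penn S/New Carrollton,M")

# Two capture groups inside the lookahead: origin and destination
city.pair.temp <- "(?=([A-Z]{3})/([A-Z]{3}))"
city.pairs <- str_match_all(routing, city.pair.temp)

# Stack the per-row matrices; column 1 is the (empty) full match,
# columns 2 and 3 are the two capture groups
pairs <- do.call(rbind, city.pairs)
routes <- data.frame(from = pairs[, 2], to = pairs[, 3],
                     stringsAsFactors = FALSE)
routes
```

Rows without any airport pair, such as the train routing, simply drop out because their match matrix has zero rows.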

Konrad Rudolph

Based on your Egencia.input, I created a workflow that handles this in one go. It might look a bit complicated, but it is not that difficult.

First, I added an id to keep track of the records. Next, I replaced the _ in JFK_BOS with a /, as JFK is a stopover before going on to Boston from Rome. Then I used your regex to extract all the airport pairs, and in the last mutate step I removed the empty strings (the empty first column) from each matrix in the list. Once this is done, you can unnest the Routing list column, which acts as a sort of separate_rows. After this, you can split the column into from and to with separate.

library(purrr)
library(dplyr)
library(tidyr)

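# One capture group per "AAA/BBB" pair (the regex from the question);
# separate() below splits each pair on the "/"
city.pair.temp <- "(?=([A-Z]{3}/[A-Z]{3}))"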

Egencia.output <- Egencia.input %>% 
  mutate(id = row_number(),
         Routing = stringr::str_replace(Routing, "_", "/"),
         Routing = stringr::str_match_all(Routing, city.pair.temp),
         Routing = map(Routing, function(x) x[x != ""])) %>% 
  unnest(Routing) %>% 
  separate(Routing, into = c("from", "to"))

Egencia.output
  Domestic...International Ticketing.carrier id from  to
1            International             Delta  1  EWR MIA
2                 Domestic American Airlines  2  MIA JFK
3                 Domestic             Delta  3  FLR BRU
4                 Domestic             Delta  3  BRU EVN
5            International        Air France  4  FCO JFK
6            International        Air France  4  JFK BOS
7            International        Air France  4  BOS FCO

The entry with "New York (Penn S/New Carrollton,M" contains no airport pair, so it drops out of the result. If I'm not mistaken, it is a train journey from New York to New Carrollton, Maryland, which I suspect should be classified as domestic travel.

phiver