This is a continuation from my question earlier: Dplyr select_ and starts_with on multiple values in a variable list

I am collecting data from different sensors in various locations. The data output looks something like:

df <- data.frame(
  date = c(2011, 2012, 2013, 2014, 2015),
  "Sensor1 Temp" = c(15, 18, 15, 14, 19),
  "Sensor1 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor1a Temp" = c(15, 18, 15, 14, 19),
  "Sensor1a Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 Temp" = c(15, 18, 15, 14, 19),
  "Sensor2 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 DewPoint" = c(10, 11, 10, 9, 12),
  "Sensor2 Humidity" = c(90, 100, 90, 100, 80)
)

The problem is (I think) similar to: Using select_ and starts_with R or select columns based on multiple strings with dplyr

I want to search for sensors, for example by location, so I have a list to search through the data frame, and I also want to include the timestamp. But the search falls apart when I look for more than one sensor (or type of sensor, etc.). Is there a way of using dplyr (NSE or SE) to achieve this?

FindLocation = c("date", "Sensor1", "Sensor2")
df %>% select(matches(paste(FindLocation, collapse="|"))) # works but picks up "Sensor1a" and "DewPoint" and "Humidity" data from Sensor2 

Also I want to add mixed searches such as:

 FindLocation = c("Sensor1", "Sensor2") # without selecting "Sensor1a"
 FindSensor = c("Temp", "Pressure") # without selecting "DewPoint" or "Humidity"

I am hoping the select combines FindSensor with FindLocation and selects the Temp and Pressure data for Sensor1 and Sensor2 (without selecting Sensor1a), returning the data frame with the data and the column headings:

date, Sensor1 Temp, Sensor1 Pressure, Sensor2 Temp, Sensor2 Pressure

Many thanks again!

Bhav Shah

3 Answers

What about something like:

library(tidyverse)
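# data.frame() renames "Sensor1 Temp" to "Sensor1.Temp", so split each name on the dot
# and keep "date" plus every column whose location and sensor parts are both requested
which_col <- df %>% names() %>% strsplit("[.]") %>%
  map_lgl(function(x) x[1] == "date" || (x[1] %in% FindLocation && x[2] %in% FindSensor))
df[which_col]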

?

AaronP

Some functions from purrr are useful here. First, use cross2 to compute the Cartesian product of FindLocation and FindSensor, which gives a list of pairs. Then use map_chr to apply paste to each pair, joining the location and sensor strings with a dot (.). Finally, use the one_of helper to select the columns.

library(dplyr)  # select() and one_of()
library(purrr)  # cross2() and map_chr()

FindLocation = c("Sensor1", "Sensor2")
FindSensor = c("Temp", "Pressure")

columns = cross2(FindLocation, FindSensor) %>%
  map_chr(paste, collapse = ".")

df %>% select(date, one_of(columns))  # keep the date column as well
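
A note for newer package versions: cross2() was deprecated in purrr 1.0.0, and one_of() has been superseded by any_of() and all_of(). The sketch below is one possible way to express the same idea with base R's expand.grid() and any_of(). It rebuilds the question's df so it runs on its own; pairs, columns and result are illustrative names only.

```r
library(dplyr)

# Same data as in the question; data.frame() turns "Sensor1 Temp" into "Sensor1.Temp".
df <- data.frame(
  date = c(2011, 2012, 2013, 2014, 2015),
  "Sensor1 Temp" = c(15, 18, 15, 14, 19),
  "Sensor1 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor1a Temp" = c(15, 18, 15, 14, 19),
  "Sensor1a Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 Temp" = c(15, 18, 15, 14, 19),
  "Sensor2 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 DewPoint" = c(10, 11, 10, 9, 12),
  "Sensor2 Humidity" = c(90, 100, 90, 100, 80)
)

FindLocation <- c("Sensor1", "Sensor2")
FindSensor <- c("Temp", "Pressure")

# Every location/sensor pair; the first argument varies fastest,
# so the resulting names are grouped by location.
pairs <- expand.grid(sensor = FindSensor, location = FindLocation,
                     stringsAsFactors = FALSE)
columns <- paste(pairs$location, pairs$sensor, sep = ".")

# any_of() selects by exact name, so "Sensor1a.Temp" is never picked up.
result <- df %>% select(date, any_of(columns))
names(result)
```

Because the selection is by exact column name rather than by pattern, partial matches such as Sensor1a cannot slip in.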
Luiz Rodrigo

We can use

df %>% 
  select(matches(paste(c("date", outer(FindLocation, 
                FindSensor, paste, sep=".")), collapse="|")))
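
Since matches() treats its argument as an unanchored regular expression in which "." matches any character, a stricter variant may be safer on data with more varied column names. The following is only a sketch: select_sensors() is a hypothetical helper, not a dplyr function, and it assumes the location and sensor strings contain no regex metacharacters.

```r
library(dplyr)

# Same data as in the question; data.frame() turns "Sensor1 Temp" into "Sensor1.Temp".
df <- data.frame(
  date = c(2011, 2012, 2013, 2014, 2015),
  "Sensor1 Temp" = c(15, 18, 15, 14, 19),
  "Sensor1 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor1a Temp" = c(15, 18, 15, 14, 19),
  "Sensor1a Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 Temp" = c(15, 18, 15, 14, 19),
  "Sensor2 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 DewPoint" = c(10, 11, 10, 9, 12),
  "Sensor2 Humidity" = c(90, 100, 90, 100, 80)
)

FindLocation <- c("Sensor1", "Sensor2")
FindSensor <- c("Temp", "Pressure")

# Hypothetical helper: anchor the pattern with ^ and $ and escape the dot,
# so only whole names of the form <location>.<sensor> (or a kept column) match.
select_sensors <- function(data, locations, sensors, keep = "date") {
  # As a regex: ^(date|(Sensor1|Sensor2)\.(Temp|Pressure))$
  pattern <- paste0(
    "^(", paste(keep, collapse = "|"), "|(",
    paste(locations, collapse = "|"), ")\\.(",
    paste(sensors, collapse = "|"), "))$"
  )
  select(data, matches(pattern))
}

result <- select_sensors(df, FindLocation, FindSensor)
names(result)
```

The columns come back in the order they appear in df, and Sensor1a is excluded because a literal dot must follow the location name.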
akrun