Here are a couple of approaches:
1) dplyr/tidyr Convert df to long form using gather, then separate the generated "variable" column at the underscore into two columns. Finally, convert from long to wide based on the "variable" column (which contains the strings pressure and temperature) and the "value" column (which contains the numbers):
library(dplyr)
library(tidyr)
df %>%
  gather("variable", "value", -Test) %>%
  separate(variable, c("variable", "sensor"), sep = "_") %>%
  spread(variable, value)
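As an aside, gather and spread are superseded in current tidyr, and the same reshaping can probably be done in one step with pivot_longer. The following is only a sketch, assuming tidyr 1.0.0 or later is installed; the small df here is a made-up stand-in with the same naming scheme, not the data from the question. The result should hold the same values as the pipeline above, although the column order may differ.

```r
library(tidyr)

# Stand-in data frame with the same column naming scheme (values are arbitrary)
df <- data.frame(
  Test = 1:2,
  temperature_sensor1 = c(20, 21),
  temperature_sensor2 = c(22, 23),
  pressure_sensor1 = c(9, 10),
  pressure_sensor2 = c(11, 12)
)

# ".value" means the part of each name before the underscore becomes an
# output column (temperature, pressure); the part after the underscore
# is stored in the new sensor column
long <- pivot_longer(df, -Test,
  names_to = c(".value", "sensor"), names_sep = "_")
long
```

This gives one row per Test and sensor combination, with temperature and pressure as separate columns.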
2) Base R We can use reshape. No packages are needed. The line marked optional removes the row names; it can be omitted if they do not matter.
unames <- grep("_", names(df), value = TRUE)
varying <- split(unames, sub("_.*", "", unames))
sensors <- unique(sub(".*_", "", unames))
long <- reshape(df, direction = "long", varying = varying, v.names = names(varying),
  times = sensors, timevar = "sensor")
rownames(long) <- NULL # optional
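To see what the helper objects passed to reshape look like, here is a small illustration. The df below is a made-up stand-in with only two sensors, used purely to keep the output short; the three definitions are the same as above.

```r
# Stand-in data frame with the same column naming scheme (values are arbitrary)
df <- data.frame(
  Test = 1:2,
  temperature_sensor1 = c(20, 21),
  temperature_sensor2 = c(22, 23),
  pressure_sensor1 = c(9, 10),
  pressure_sensor2 = c(11, 12)
)

unames <- grep("_", names(df), value = TRUE)      # all columns containing "_"
varying <- split(unames, sub("_.*", "", unames))  # group names by their prefix
sensors <- unique(sub(".*_", "", unames))         # distinct suffixes

str(varying)
# List of 2
#  $ pressure   : chr [1:2] "pressure_sensor1" "pressure_sensor2"
#  $ temperature: chr [1:2] "temperature_sensor1" "temperature_sensor2"

sensors
# [1] "sensor1" "sensor2"
```

Note that split orders the groups alphabetically, so pressure comes before temperature regardless of the column order in df. Because v.names is taken from names(varying), the value columns are still labelled correctly.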
If df has fixed columns then we could simplify the above a bit by hard coding varying and sensors, using these definitions in place of the more complex but general code above:
varying <- list(pressure = 5:7, temperature = 2:4) # column positions in the df shown in the Note
sensors <- c("sensor1", "sensor2", "sensor3")
Note: Because random numbers were used, the seed must be set first to create df reproducibly. To be definite, we created df as shown below. Also note that in the question temperature_sensor1 was used on two columns, and we assumed that the second occurrence was intended to be temperature_sensor3.
set.seed(123)
df <- data.frame(
  Test = 1:10,
  temperature_sensor1 = rnorm(10, 25, 5),
  temperature_sensor2 = rnorm(10, 25, 5),
  temperature_sensor3 = rnorm(10, 25, 5),
  pressure_sensor1 = rnorm(10, 10, 2),
  pressure_sensor2 = rnorm(10, 10, 2),
  pressure_sensor3 = rnorm(10, 10, 2))