Given data like the following:

text = "
R_1700,R_350,R_2950,S_1700,S_350,S_2950
,,-98.2,,,14.15
-80,-82.3,-99,,-0.7,12.4
-77.55,-80.6,-97,,,14.5
-75.55,-80.85,-96.35,,,14.4
-80.8,-81.6,-94.3,,9.95,6
-80.8,-81.8,,,4.9,
-80.8,-81.85,,,8.2,
-73.8,-77.6,-98,,6.35,
-72.8,-76.7,-96.8,3.7,4.6,
-72.65,-81.7,-94.05,2.25,,
-72.95,-80.4,-94.6,1.7,,
-72.7,-81.7,-94.35,1.6,,
-76.05,-84.25,-95.65,3.65,,
-75.5,-84.65,-95.2,1.95,,
-74.65,-83.8,-94.6,2.6,,
-74.2,-83.95,-100.65,3.25,,
-66.8,-75.65,-97.25,,6.45,
-73.7,-77.7,-97.05,,6.8,
-97.8,-100.8,-116.9,,-5.3,
,-99.7,,,-1,
,-100.2,,,-1.3,
-93.3,-94.75,-103.7,,-4.25,
-94.6,-96.55,-105,,-6.7,
-96.4,-98.45,-110.1,,-6.9,
-96.4,-101.1,-110.7,,-7.65,
-94.95,-102,,,-7.2,
-94,-102.15,,,-9.35,
-91.8,-97,-110.3,,-5.3,
"
df1 = read.table(textConnection(text), sep=",", header = TRUE)

I need to plot regression lines for the column pairs below, with the R_... values on the x-axis and the S_... values on the y-axis:

  1. S_1700 vs. R_1700
  2. S_350 vs. R_350
  3. S_2950 vs. R_2950

For a single pair of variables, I could do something like this:

library(ggplot2)

ggplot(df1, aes(x=R_1700, y=S_1700)) +
  geom_point() + 
  geom_smooth(method=lm, se=FALSE, fullrange=TRUE)

I need help getting all three regression lines into a single plot, as in the example below. The three groups would be 1700, 350 and 2950.

Plot of multiple Linear regressions

user3206440

2 Answers

If you reorganize your data into a long format like the one below:

# with data.table package
library(data.table)
setDT(df1)
df2 <- melt(df1, measure.vars = patterns('R_', 'S_'))
df2[, variable := factor(variable, levels = 1:3,
    labels = tstrsplit(grep('R_', names(df1), value = TRUE), '_')[[2]])]
# > df2
#     variable  value1 value2
# 1:     1700      NA     NA
# 2:     1700  -80.00     NA
# 3:     1700  -77.55     NA
# 4:     1700  -75.55     NA
# 5:     1700  -80.80     NA
# 6:     1700  -80.80     NA
# 7:     1700  -80.80     NA
# 8:     1700  -73.80     NA
# 9:     1700  -72.80   3.70


# without data.table
tmp <- split.default(df1, f = sapply(strsplit(names(df1), '_'), `[`, 2))
tmp <- lapply(tmp, function(dtf){
    names(dtf) <- c('value1', 'value2')
    return(dtf)
})
df2 <- do.call(rbind, tmp)
df2$variable <- rep(names(tmp), each = nrow(df1))

you can then easily plot the data as desired:

ggplot(df2, aes(x = value1, y = value2, color = variable)) +
    geom_point() + 
    geom_smooth(method=lm, se=FALSE, fullrange=TRUE) +
    labs(x = 'R', y = 'S')

mt1022
  • Is it possible to get an answer without `data.table`? – user3206440 Jul 15 '20 at 06:43
  • @user3206440, of course. See the edited answer. It doesn't matter how you reformat the data, as long as the reorganized data are in a long format that is suitable for ggplot. For the current data with just a few columns, you can do it manually: subset the columns, rename them, rbind the pieces, and finally add a column for the grouping variable. But an automatic approach is more general and works for datasets with more columns. – mt1022 Jul 15 '20 at 07:11
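The manual route described in the comment above (subset columns, rename, rbind, add a grouping column) could look roughly like this in base R. This is only a minimal sketch: `small`, `make_long`, `groups` and `long` are hypothetical names introduced for illustration, and `small` is a three-row stand-in for `df1` built from the first rows of the question's data.

```r
# Three-row stand-in for df1 (first three data rows from the question)
small <- data.frame(
  R_1700 = c(NA, -80, -77.55),
  R_350  = c(NA, -82.3, -80.6),
  R_2950 = c(-98.2, -99, -97),
  S_1700 = rep(NA_real_, 3),
  S_350  = c(NA, -0.7, NA),
  S_2950 = c(14.15, 12.4, 14.5)
)

# Hypothetical helper: pick the R_/S_ pair for one group, rename, and tag it
make_long <- function(dtf, group) {
  out <- dtf[, paste0(c("R_", "S_"), group)]  # column subsetting
  names(out) <- c("value1", "value2")         # common column names
  out$variable <- group                       # grouping column
  out
}

groups <- c("1700", "350", "2950")
long <- do.call(rbind, lapply(groups, make_long, dtf = small))
```

The resulting `long` has the same three columns (`value1`, `value2`, `variable`) that the ggplot call in the answer expects, so it should be usable in place of `df2`.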

tidyverse solution

library(tidyverse)

df1 %>% 
  pivot_longer(everything()) %>% #wide to long data format
  separate(name, c("key","number"), sep = "_") %>% #Separate elements like R_1700 into 2 columns 
  group_by(number, key) %>% #Group the values according to number, key
  mutate(row = row_number()) %>% #For creating unique IDs 
  pivot_wider(names_from = key, values_from = value) %>% #Make separate columns for R and S
  ggplot(aes(x=R, y=S, color = number, shape = number)) +
  geom_point() + 
  geom_smooth(method=lm, se=FALSE, fullrange=TRUE)

UseR10085
  • Great. Would you also please add comments explaining the steps? Also, in `separate` you have used `sep = "_"`. What if, instead of `R_1700`, the name is a string pattern like `ABC_CA_BDEF_for_KEY_1700`, and instead of `S_1700` it is a string pattern like `ABC_CA_XYZX_for_KEY_1700`? Note that the two differ only in the third underscore-separated field (between the second and third underscores). – user3206440 Jul 15 '20 at 07:09
  • See the updated answer. For details about `separate` you can visit [this](https://tidyr.tidyverse.org/reference/separate.html). – UseR10085 Jul 15 '20 at 07:17
  • Also visit [this](https://stackoverflow.com/a/44424567/6123824) – UseR10085 Jul 15 '20 at 07:23
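For the longer column names raised in the comments, one possible approach is to split each name at its last underscore only, so that the trailing number becomes the group and everything before it becomes the key. The sketch below uses base R regular expressions; `nms` is a made-up vector based on the example names in the comment (the `_350` names are assumed by analogy), and `number` and `key` are illustrative variable names.

```r
# Hypothetical column names following the pattern from the comment
nms <- c("ABC_CA_BDEF_for_KEY_1700", "ABC_CA_XYZX_for_KEY_1700",
         "ABC_CA_BDEF_for_KEY_350",  "ABC_CA_XYZX_for_KEY_350")

# Greedy match up to the last underscore; what remains is the group number
number <- sub("^.*_", "", nms)

# Drop the last underscore and everything after it; what remains is the key
key <- sub("_[^_]*$", "", nms)

data.frame(name = nms, key = key, number = number)
```

`pivot_longer` also has a `names_pattern` argument that takes a regular expression, which may allow the same split to happen during the reshape instead of in a separate `separate` step.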