
I have a numerical dataset with 3 independent variables and 1 dependent variable.

For example, the variables are named a, b, c and d, where d is the dependent variable.

data sample

In my data set, d = f(a,b,c). I would like to plot variable d on the y axis and all the other variables on the x axis, preferably as a line plot.

Please advise.

Thanks

Ruchi
  • You want a 4D plot. Can you explain how to plot 4 axes? – Rui Barradas Jun 27 '19 at 11:51
  • Can you please help me understand whether this works? http://www.sthda.com/english/wiki/impressive-package-for-3d-and-4d-graph-r-software-and-data-visualization – Ruchi Jun 27 '19 at 12:00
  • That page is about 3D and 2D graphics, not 4D. – Rui Barradas Jun 27 '19 at 12:06
  • If one or more of a, b and c are categorical variables, you can facet your plots using `facet_wrap` or `facet_grid` from the `ggplot2` package. – fmarm Jun 27 '19 at 12:12
  • My data has only numerical values. Is there a way to visualize it by plotting all 3 independent variables on the x axis and the dependent variable on the y axis? – Ruchi Jun 27 '19 at 12:15
  • Welcome to SO! Please add some data to make your example reproducible, either by running `dput(head(your_data,20))` (the first 20 rows of your data) and posting the result if you can publish it, or by doing the same with fake data. Some attempts and mockups of the goal are also welcome. – s__ Jun 27 '19 at 12:34

2 Answers


A sample dataset would greatly improve your chances of success here. Please see How to make a great R reproducible example for how to create better questions in the future.

That said, here's a quick-and-dirty example that may help you get where you're going. First, the data. Copy this text and save it in your working directory as "test.csv". Note that the working directory is either the default one, the one you started the R script from, or the one you set with a setwd() command in your script.

a,b,c,d
10,8,5,1
8,3,6,2
7,4,4,3
6,6,5,4
5,4,6,5
7,7,4,6

Now some code to make it go:

library("reshape2")
library("ggplot2")

df <- read.csv("test.csv")

df2 <- melt(df, id.vars = "d")
ggplot(df2, aes(d, value, col = variable, group = variable))+
  geom_line()

There's lots you can do to make it pretty, but this at least demonstrates what I think you're trying to accomplish. The magic is in melting the data into columns that can be plotted (take a look at df2), then defining the multiple series in ggplot.
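
To make the melting step concrete, here is a minimal sketch that uses the same six rows as test.csv, typed inline so that it runs without the file (the inline data frame is only an illustration and is not needed if you read the CSV):

```r
library(reshape2)

# Same six rows as test.csv, typed inline so the sketch is self-contained
df <- data.frame(
  a = c(10, 8, 7, 6, 5, 7),
  b = c(8, 3, 4, 6, 4, 7),
  c = c(5, 6, 4, 5, 6, 4),
  d = 1:6
)

# Wide to long: d is kept as the identifier; a, b and c are stacked
# into a "variable" column (the name) and a "value" column (the number)
df2 <- melt(df, id.vars = "d")

str(df2)   # columns: d, variable, value
head(df2)
```

Each original row becomes three rows in df2, one per melted column, which is what lets ggplot draw one line per variable.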

Here's what your result should look like:

3-series plot example

DanM
  • Note I actually created a solution for 3 **dependent** variables and only 1 independent. I'm guessing from the text that's what you actually meant. I'm having a little trouble visualizing what the other would look like, but again, perhaps you can edit your question to clarify a little better if this is not your solution. – DanM Jun 27 '19 at 12:46
  • Thank you very much for your response. I have now added my data sample and some information to the question. I actually need a solution for 1 dependent variable and 3 independent variables, with the dependent variable on the y axis. – Ruchi Jun 27 '19 at 13:16

There are some different ways you could tackle your problem, if I understand you correctly.

The first one is that you try to code every independent variable as a graphical parameter:

library(tidyverse)
tibble(a = rnorm(50),
       b = rnorm(50),
       c = rnorm(50),
       d = rnorm(50)) %>%
  ggplot(aes(y = d, x = a, size = b, color = c)) +
  geom_line() +
  theme_minimal()


Since this method results in pretty messy plots, the second one is that you try to group some of your independent variables into groups of quantiles and try to plot those.

One way could be this:

library(tidyverse)
tibble(a = rnorm(50),
       b = rnorm(50),
       c = rnorm(50),
       d = rnorm(50)) %>%
  mutate(c = cut(c,breaks = c(-Inf,quantile(c))),
         b = cut(b,breaks = c(-Inf,quantile(b)))) %>%
  ggplot(aes(y = d, x = a,color = b, group = c)) +
  geom_line() +
  theme_minimal()


Or, since this one is still pretty messy, using facet_wrap:

tibble(a = rnorm(50),
       b = rnorm(50),
       c = rnorm(50),
       d = rnorm(50)) %>%
  mutate(c = cut(c,breaks = c(-Inf,quantile(c))),
         b = cut(b,breaks = c(-Inf,quantile(b)))) %>%
  ggplot(aes(y = d, x = a,color = b)) +
  geom_line() +
  geom_point() +
  facet_wrap(~c, drop = TRUE) +
  theme_minimal()
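
If you want to check what the binning does before plotting, the following base R sketch may help. It is only an illustration: the seed is arbitrary, and the second variant (quartile breaks with include.lowest = TRUE) is an alternative I am assuming you might prefer, not what the code above uses.

```r
set.seed(1)            # arbitrary seed, only so the sketch is repeatable
x <- rnorm(50)

# Breaks as in the code above: -Inf plus the five quantiles gives five bins,
# and the first bin, (-Inf, min(x)], holds only the smallest value
five_bins <- cut(x, breaks = c(-Inf, quantile(x)))
table(five_bins)

# Alternative: quartile breaks only, with include.lowest = TRUE so the
# minimum falls into the first bin instead of becoming NA
four_bins <- cut(x, breaks = quantile(x), include.lowest = TRUE)
table(four_bins)
```

With the alternative you get four groups of roughly equal size, so each facet or colour covers about a quarter of the data.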



One last way you could try is to melt your data:

library(tidyverse)
library(reshape2)
tibble(a = rnorm(50),
       b = rnorm(50),
       c = rnorm(50),
       d = rnorm(50)) %>%
  melt(id.vars = 'd') %>%
  ggplot(aes(y = d, x = value,color = variable)) +
  geom_line() +
  theme_minimal()


Or, a bit more tidy, using facet_wrap again:

library(tidyverse)
library(reshape2)
tibble(a = rnorm(50),
       b = rnorm(50),
       c = rnorm(50),
       d = rnorm(50)) %>%
  melt(id.vars = 'd') %>%
  ggplot(aes(y = d, x = value,color = variable)) +
  geom_line() +
  theme_minimal() +
  facet_wrap(~variable)
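
To use this on real data rather than simulated values, replace the tibble(...) part with your own data frame. Below is a rough sketch, assuming a data frame with columns a, b, c and d; my_data is a hypothetical name, filled here with the six sample rows from the other answer, and the axis labels are placeholders. Passing scales = "free_x" to facet_wrap is my own addition so that each independent variable keeps its own x range.

```r
library(ggplot2)
library(reshape2)

# Hypothetical stand-in for your own data; replace with your data frame
my_data <- data.frame(
  a = c(10, 8, 7, 6, 5, 7),
  b = c(8, 3, 4, 6, 4, 7),
  c = c(5, 6, 4, 5, 6, 4),
  d = 1:6
)

# Long format: one row per (d, independent variable) pair
long <- melt(my_data, id.vars = "d")

p <- ggplot(long, aes(x = value, y = d, color = variable)) +
  geom_line() +
  facet_wrap(~variable, scales = "free_x") +  # own x range per panel
  labs(x = "Value of independent variable", y = "d") +  # placeholder labels
  theme_minimal()

print(p)
```

The axes then show the values that are actually in the data frame; ranges such as -3 to 3 come from the rnorm() simulation, not from ggplot.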


Max Teflon
  • I love your variety of options @Max Teflon. Maybe a little overwhelming for the OP, but I may hang onto this for my own reference! – DanM Jun 27 '19 at 12:55
  • @Max Teflon Thank you very much for your response. Looks like this will help me to reproduce the solution in my dataset. Let me quickly try and get back here with results – Ruchi Jun 27 '19 at 13:08
  • @Max Teflon: It looks like I want a plot similar to this, so I tried your suggestion ("One last way you could try is to melt your data"), but I could not run the last command. Below is the error I see: Error: unexpected '=' in "tibble(Macedonia.probe.1$Number.of.buffering.events =" ..... Can you please help me? – Ruchi Jun 28 '19 at 10:44
  • The `tibble(a = rnorm(50), b = rnorm(50), c = rnorm(50), d = rnorm(50))` part is just there to simulate data so that there is something to display. Just replace it with your data.frame; you will then also need to change the column names to match those in your data. – Max Teflon Jun 28 '19 at 12:27
  • @Max Teflon: Thank you very much for your explanation. The command worked for my dataset. However, I am concerned about the values on the x and y axes, which range, for example, from -3 to 3 or from -2 to 2. Is it possible to show the real values on both the x and y axes? Also, is it possible to add x and y labels? Your help is much appreciated. – Ruchi Jul 01 '19 at 14:01
  • [This](https://github.com/rstudio/cheatsheets/blob/master/data-visualization-2.1.pdf) should have all the answers concerning specific graphical parameters you are searching for. – Max Teflon Jul 03 '19 at 11:55
  • @Max Teflon: Thank you very much. I managed to set the parameters for my graph. However, when I set my x and y limits, the data points in the plot do not look the same as in the data file. The y values in the dataset range from 3.6 to 4.0 and I would like to keep them that way in the plot, but in the plot they range from -2 to 2. How can I get the actual data points plotted in my graph in this case? – Ruchi Jul 04 '19 at 09:17