I am trying to create a scatter plot of several variables. How can I show the daily trend for each state? Please let me know if I need to provide anything else. Thank you very much! The data are shown below.

State Day1 Day2 Day3 Day4
CA    1    5     7    9
NY    10   8    20    90 
VT    4   6    9    10 
Rui Barradas

2 Answers

Base R

Use matplot to draw all the lines at once. The data must be transposed first, because matplot plots each column of its argument as a separate series.

matplot(t(df1[-1]), type = "l", lty = 1)
legend("topleft", legend = df1$State, col = 1:3, lty = 1)

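To make the effect of the transpose visible, here is a minimal sketch that assumes the same df1 as in the Data section. The matrix name m, the axis labels, and the choice of type = "b" (points joined by lines, closer to the scatter plot asked about) are illustrative choices, not part of the original code.

```r
# Same example data as in the Data section
df1 <- read.table(text = "
State Day1 Day2 Day3 Day4
CA    1    5     7    9
NY    10   8    20    90
VT    4   6    9    10
", header = TRUE)

# Drop the State column and transpose: days become rows, states become columns
m <- t(df1[-1])
colnames(m) <- df1$State
dim(m)  # 4 rows (days) by 3 columns (states)

# One series per column; type = "b" draws points joined by lines
matplot(m, type = "b", pch = 19, lty = 1,
        xlab = "Day", ylab = "Value", xaxt = "n")
# Label the x axis with the original column names (Day1 ... Day4)
axis(1, at = seq_len(nrow(m)), labels = rownames(m))
legend("topleft", legend = colnames(m),
       col = seq_len(ncol(m)), lty = 1, pch = 19)
```

Using seq_len(ncol(m)) for the legend colours keeps the legend in step with matplot's default colours as long as there are at most six series.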


Package ggplot2

With ggplot2, this type of problem usually comes down to reshaping the data: ggplot2 expects the long format, and the data are in wide format. See this post on how to reshape the data from wide to long format.

library(ggplot2)

df1 |>
  tidyr::pivot_longer(-State, names_to = "Day") |>
  dplyr::mutate(Day = as.integer(sub("[^[:digit:]]+", "", Day))) |>
  ggplot(aes(Day, value, color = State)) +
  geom_line()

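For readers who want to see what the long format looks like without installing tidyr, a base R sketch follows. It assumes the same df1 as in the Data section; the names day_cols and long are hypothetical, and the value column is named to match the pivot_longer default.

```r
# Same example data as in the Data section
df1 <- read.table(text = "
State Day1 Day2 Day3 Day4
CA    1    5     7    9
NY    10   8    20    90
VT    4   6    9    10
", header = TRUE)

# Columns holding the daily values
day_cols <- grep("^Day", names(df1), value = TRUE)

# Build the long table column by column: one row per State/Day pair
long <- data.frame(
  State = rep(df1$State, times = length(day_cols)),
  Day   = rep(as.integer(sub("^Day", "", day_cols)), each = nrow(df1)),
  value = unlist(df1[day_cols], use.names = FALSE)
)

head(long)
```

The resulting data frame has 12 rows (3 states times 4 days) and can be passed to ggplot(aes(Day, value, color = State)) in the same way as the piped version.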


Data

df1 <- read.table(text = "
State Day1 Day2 Day3 Day4
CA    1    5     7    9
NY    10   8    20    90 
VT    4   6    9    10 
", header = TRUE)
Rui Barradas

In Stata you need a long layout for this to work well -- just as in ggplot2 in R. After something like

clear 
input str2 State Day1 Day2 Day3 Day4
CA    1    5     7    9
NY    10   8    20    90 
VT    4   6    9    10 
end 

reshape long Day, i(State) j(Time)
rename Day Whatever 
encode State, gen(Where)
xtset Where Time 

you can check out tsline and xtline. But if your real data are the 50 states of the United States (plus DC? Puerto Rico? Guam?), then your data define 50 or more lines, and some strategy for keeping the graph readable may be needed.
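
One possible strategy, sketched here in R rather than Stata so that it reuses the data from the other answer, is to draw every state in grey and highlight a single state of interest on top. This is only an illustration under that assumption; the choice of "NY" and the names m and highlight are hypothetical.

```r
# Same example data as in the question
df1 <- read.table(text = "
State Day1 Day2 Day3 Day4
CA    1    5     7    9
NY    10   8    20    90
VT    4   6    9    10
", header = TRUE)

# Days as rows, states as columns
m <- t(df1[-1])
colnames(m) <- df1$State

highlight <- "NY"  # hypothetical state of interest
is_hl <- colnames(m) == highlight

# Draw all states as grey background lines
matplot(m, type = "l", lty = 1, col = "grey70",
        xlab = "Day", ylab = "Value")
# Redraw the highlighted state on top so it is not hidden by other lines
lines(seq_len(nrow(m)), m[, highlight], col = "red", lwd = 2)
legend("topleft", legend = c(highlight, "other states"),
       col = c("red", "grey70"), lty = 1, lwd = c(2, 1))
```

With many states, the same plot can be repeated once per highlighted state, or the states can be split into small groups plotted in separate panels.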

Nick Cox