I am a beginner with R so I appreciate your patience and help in advance!

I am trying to make a line graph using ggplot to display the changes in the y variable over years. When I input my data into ggplot this is what I am getting...

The line for Y variable doesn't reflect the changes in the data set

This is the code I used to make the graph...

ggplot(sqft2, aes(x = year, y = '100015', group = 1))+
  geom_line()

Here is the data that I am using...

   year 100015
1  1998   1504
2  1999   1504
3  2000   1504
4  2001   1504
5  2002    984
6  2003   1504
7  2004   1504
8  2005   1968
9  2006   1968
10 2007   1968
11 2008   1968
12 2009   1968
13 2010   1968
14 2011   1968
15 2012   1968
16 2013   1968
17 2014   1968
18 2015   1968
19 2016   1968
20 2017   1968
21 2018   1968
22 2019   1968
23 2020   1968
24 2021   1968


'data.frame':   24 obs. of  2 variables:
 $ year  : int  1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 ...
 $ 100015: num  1504 1504 1504 1504 984 ...

Any suggestions or help on why this is happening is greatly appreciated!

Gregor Thomas
Estefan
  • It looks like the `y` variable is being passed as a string. Replace the single quotes with backticks. – cazman Jan 28 '22 at 17:16
  • Also, I would advise against starting column names with numbers. – cazman Jan 28 '22 at 17:17
  • @cazman This solution worked! Thank you for your help!! I will try to avoid using numbers as column names next time. – Estefan Jan 28 '22 at 17:20
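
To make the point in the comments concrete, here is a minimal sketch comparing the quoted and the backticked mapping. It assumes the data frame is called sqft2 as in the question and rebuilds it from the printed values; the object names p_quoted and p_backtick are only illustrative. layer_data() is used to inspect the y values ggplot2 actually computes for the line.

```r
library(ggplot2)

# Rebuild the question's data (values copied from the printed data frame)
sqft2 <- data.frame(year = 1998:2021)
sqft2[["100015"]] <- c(rep(1504, 4), 984, 1504, 1504, rep(1968, 17))

# Single quotes: '100015' is a character constant, not a column reference
p_quoted <- ggplot(sqft2, aes(x = year, y = '100015', group = 1)) +
  geom_line()

# Backticks: `100015` refers to the column named 100015
p_backtick <- ggplot(sqft2, aes(x = year, y = `100015`, group = 1)) +
  geom_line()

# Inspect the y values each plot will draw
y_quoted <- layer_data(p_quoted)$y
y_backtick <- layer_data(p_backtick)$y

print(unique(y_quoted))    # one repeated value, hence the flat line
print(range(y_backtick))   # the real spread of the data
```

The quoted version maps every row to the same constant, which is why the original plot shows a flat line regardless of the data.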

2 Answers

Try this:

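A: year as numeric:

  1. Column names that start with a digit are non-syntactic in R and have to be wrapped in backticks every time they are used, so here we use rename() to change the name to X100015.
  2. arrange() sorts the rows by year (geom_line() already connects points in x order, so this mainly keeps the data tidy).
  3. group = 1 puts all observations in a single group, so they are joined by one line.

library(tidyverse)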
df %>% 
  ## use backticks for your current non-standard column name
  rename(X100015 = `100015`) %>% 
  arrange(year) %>% 
  ggplot(aes(x=year, y=X100015, group=1)) +
  geom_line()
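
As a side note on where the name X100015 comes from: base R's make.names() turns a name that starts with a digit into a syntactic one by prefixing an X, and data.frame() applies the same rule when check.names = TRUE (its default). The sketch below only illustrates that behaviour; the object names are made up for the example.

```r
# A name made only of digits is not a syntactic R name
raw_name <- "100015"

# make.names() converts it into a syntactic name
safe_name <- make.names(raw_name)
print(safe_name)

# data.frame() applies the same conversion by default (check.names = TRUE)
df_checked <- data.frame(`100015` = c(1504, 984))
print(names(df_checked))

# With check.names = FALSE the original, non-syntactic name is kept
df_raw <- data.frame(`100015` = c(1504, 984), check.names = FALSE)
print(names(df_raw))

# Existing non-syntactic names can be cleaned up in one step
df_clean <- df_raw
names(df_clean) <- make.names(names(df_clean))
print(names(df_clean))
```

Tibbles (as created by tribble() in the data section) keep non-syntactic names as they are, which is why the backticks are needed there.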

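B: year as factor:

  1. As above, rename() changes the non-syntactic column name to X100015.
  2. fct_inorder() from the forcats package (part of the tidyverse) orders the factor levels by their first appearance in the data, which keeps the years in row order.
  3. group = 1 is required here: with a factor on the x-axis, each level would otherwise form its own group and no line would be drawn.

library(tidyverse)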

df %>% 
  rename(X100015 = `100015`) %>%
  mutate(year = factor(year)) %>%
  ggplot(aes(x=fct_inorder(year), y=X100015, group=1)) +
  geom_line()

data:

df <- tribble(
  ~year,  ~`100015`,
1998,   1504,
1999,   1504,
2000,   1504,
2001,   1504,
2002,    984,
2003,   1504,
2004,   1504,
2005,   1968,
2006,   1968,
2007,   1968,
2008,   1968,
2009,   1968,
2010,   1968,
2011,   1968,
2012,   1968,
2013,   1968,
2014,   1968,
2015,   1968,
2016,   1968,
2017,   1968,
2018,   1968,
2019,   1968,
2020,   1968,
2021,   1968)
TarJae

Hi Estefan and welcome to Stack Overflow! In the future, please try to post a reproducible example with your question. These help respondents better understand and diagnose your issue.

Regarding your question, the issue is that the column name is non-syntactic (it consists only of digits), and in y = '100015' the single quotes make ggplot() treat the value as a literal string rather than a column name. One way around this is to write y = df$'100015' instead, although wrapping the name in backticks (y = `100015`), as suggested in the comments, is the more direct fix. It is generally best practice to avoid purely numeric column names for this reason. Alternatively, if you are not attached to the column name '100015', you can simply rename it with colnames(df)[2] <- "ResponseVar".

Here is a reproducible example:

library(ggplot2)
library(dplyr)

##Current approach##
df<-data.frame(year=c(1998:2021))
df$'100015'<-case_when(df$year %in% c(1998:2001, 2003, 2004) ~ 1504,
                       df$year == 2002 ~ 984,
                       TRUE ~ 1968)

ggplot(data = df, aes(x = year, y = df$'100015')) + geom_line() # recent ggplot2 versions warn that using `$` inside aes() is discouraged, but the plot is still correct


##Renaming approach##
colnames(df)[2]<-"ResponseVar"
ggplot(data = df, aes(x = year, y = ResponseVar)) + geom_line() # ggplot won't complain
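
If the column name has to stay a string (for example because it is stored in a variable), another option is the .data pronoun that ggplot2 supports inside aes(). This is a sketch of an alternative, not part of the approaches above; col_name is a hypothetical variable introduced for the example.

```r
library(ggplot2)

# Same data as in the question, with the non-syntactic column name kept
df <- data.frame(year = 1998:2021)
df[["100015"]] <- c(rep(1504, 4), 984, 1504, 1504, rep(1968, 17))

# Hypothetical variable holding the column name as a string
col_name <- "100015"

# .data[[...]] looks the column up by name inside the plot's data,
# so the string is treated as a column reference, not a constant
p <- ggplot(df, aes(x = year, y = .data[[col_name]])) +
  geom_line()

# The mapped y values match the column
y_mapped <- layer_data(p)$y
print(range(y_mapped))
```

Unlike df$'100015', this keeps the mapping tied to the data passed to ggplot(), so it continues to work if the data is filtered or faceted.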

Sean McKenzie