MrFlick's comment has already given you the solution, so here's an attempt at explaining where things went wrong.
TL;DR explanation
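The color = "some color" part is inside aes(), so ggplot treats it as a data value rather than a literal color. scale_color_discrete() then assigns its own default colors to those values and matches your labels to them in alphabetical order.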
Long explanation
Let's use a base data set for illustration, since not everyone has the VGAMdata package:
> df <- subset(airquality, Month == 5)[, c("Day", "Wind", "Temp")]
> head(df)
  Day Wind Temp
1   1  7.4   67
2   2  8.0   72
3   3 12.6   74
4   4 11.5   62
5   5 14.3   56
6   6 14.9   66
> summary(df)
      Day            Wind            Temp
 Min.   : 1.0   Min.   : 5.70   Min.   :56.00
 1st Qu.: 8.5   1st Qu.: 8.90   1st Qu.:60.00
 Median :16.0   Median :11.50   Median :66.00
 Mean   :16.0   Mean   :11.62   Mean   :65.55
 3rd Qu.:23.5   3rd Qu.:14.05   3rd Qu.:69.00
 Max.   :31.0   Max.   :20.10   Max.   :81.00
Plot:
library(ggplot2)

# Deliberately reproduces the problem: the color strings sit inside aes()
ggplot(df,
       aes(x = Day)) +
  geom_line(aes(y = Wind, color = "red")) +
  geom_line(aes(y = Temp, color = "blue")) +
  scale_color_discrete(name = "Variables",
                       labels = c("wind", "temperature")) +
  theme(legend.position = "bottom",
        legend.direction = "vertical")

Two questions come to mind:
- We know from the data that Temp's values (56-81) are all well above Wind's values (5.7-20.1). So why is the upper line (Temp's values) labelled "wind"?
- Why are pink / cyan shown in the plot when we specified red / blue?
When a color is specified inside aes(), it's interpreted as a variable value by default, rather than a color (scale_XX_identity overrides this). If I replaced the above with color = "xyz" / color = "abc", the chart would look exactly the same. Only the underlying legend keys would differ, and the labels argument overrides those here anyway.
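To check this claim rather than take my word for it, here's a minimal sketch that compares the colours ggplot actually computes. It uses layer_data() from ggplot2 to pull out the built data for a layer; the object names (p_named, p_dummy, p_literal, p_identity) are just mine for illustration.

```r
library(ggplot2)

df <- subset(airquality, Month == 5)[, c("Day", "Wind", "Temp")]

# Strings inside aes() are data values, so any two distinct strings
# with the same sort order end up with the same default colours.
p_named <- ggplot(df, aes(x = Day)) +
  geom_line(aes(y = Wind, color = "red")) +
  geom_line(aes(y = Temp, color = "blue"))

p_dummy <- ggplot(df, aes(x = Day)) +
  geom_line(aes(y = Wind, color = "xyz")) +
  geom_line(aes(y = Temp, color = "abc"))

# layer_data() returns the computed aesthetics for one layer (1 = Wind)
wind_named <- unique(layer_data(p_named, 1)$colour)
wind_dummy <- unique(layer_data(p_dummy, 1)$colour)
identical(wind_named, wind_dummy)  # same default colour in both plots

# Outside aes(), the string is taken literally as a colour
p_literal <- ggplot(df, aes(x = Day)) +
  geom_line(aes(y = Wind), color = "red") +
  geom_line(aes(y = Temp), color = "blue")
wind_literal <- unique(layer_data(p_literal, 1)$colour)

# An identity scale also makes ggplot use the mapped values as-is
p_identity <- p_named + scale_color_identity()
wind_identity <- unique(layer_data(p_identity, 1)$colour)
```

Note that scale_color_identity() drops the legend by default, so it isn't a drop-in fix if you still want one.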
Essentially, ggplot understands the code to say: "Line color takes on the variable value 'red' in the first geom_line and 'blue' in the second; map 'red' & 'blue' to the default color palette in alphabetical order. If a vector of labels is provided, map it to the alphabetically sorted vector c("blue", "red")."
So we have:
|Variable |Assigned.color.value |Mapped.color.value |Mapped.label |
|:--------|:--------------------|:------------------|:------------|
|Temp |blue |pink |wind |
|Wind |red |cyan |temperature |
(Side note: the default palette works out to pink and cyan if there are two values, red-green-blue if there are three, red-green-blue-purple if there are four, etc. See this question for more details on how this comes about.)
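If you'd like to see the default colours for yourself, the sketch below uses hue_pal() from the scales package, which (as far as I know) is the generator behind ggplot2's default discrete colour scale. The variable names are mine.

```r
library(scales)

# hue_pal() returns a function; calling it with n gives n evenly
# spaced hues, so the colours shift as the number of values changes.
two   <- hue_pal()(2)
three <- hue_pal()(3)
four  <- hue_pal()(4)

two    # "#F8766D" "#00BFC4", i.e. the pink and cyan seen in the plot
three
four

# show_col(four) draws the swatches if you prefer to look at them
```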
And if that's not confusing enough, this is what happens if you change both colors to the same value (again, it doesn't matter what that value is... could be "black", "white", "purple", or any other character string):
|Variable |Assigned.color.value |Mapped.color.value |Mapped.label |
|:--------|:--------------------|:------------------|:------------|
|Temp |black |pink |wind |
|Wind |black |pink |wind |
Since there's only one value given, the lines take on the first color in the default palette, and only the first value in the label vector is mapped to it. Hence your plot's lines all turned pink, and all but the first legend entry disappeared.
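A quick way to confirm the single-value case, again as a rough sketch using layer_data() (p_same is just an illustrative name):

```r
library(ggplot2)

df <- subset(airquality, Month == 5)[, c("Day", "Wind", "Temp")]

# Both layers map color to the same string, so the scale sees one level
p_same <- ggplot(df, aes(x = Day)) +
  geom_line(aes(y = Wind, color = "black")) +
  geom_line(aes(y = Temp, color = "black")) +
  scale_color_discrete(name = "Variables",
                       labels = c("wind", "temperature"))

wind_colour <- unique(layer_data(p_same, 1)$colour)
temp_colour <- unique(layer_data(p_same, 2)$colour)

# Both lines share the first default colour, not black
identical(wind_colour, temp_colour)
wind_colour
```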
Alternative solution
Generally, if I need to use several geom_line() lines & assign a different colour to each (rather than convert the original data into long format), I prefer to define each line's color using its intended label inside aes(), and specify the corresponding color manually using scale_XX_manual():
ggplot(df,
       aes(x = Day)) +
  geom_line(aes(y = Wind, color = "1: wind")) +
  geom_line(aes(y = Temp, color = "2: temp")) +
  scale_color_manual(name = "Variables",
                     values = c("1: wind" = "red", "2: temp" = "blue")) +
  theme(legend.position = "bottom",
        legend.direction = "vertical")
This has the advantage of keeping all the moving parts in one place (inside scale_XX_manual()), so that if I need to update colours later on, I only need to look at that one line. The use of a named vector for values also ensures that the colors & labels are always mapped correctly to one another.
In this case, if I want to change all the line colours to black, I'd just use values = c("1: wind" = "black", "2: temp" = "black") instead.
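For comparison, here's a rough sketch of the long-format route I mentioned skipping. It isn't what I'd reach for here, and the column names (variable, value) and the object names long / p_long are my own choices, built with base R only.

```r
library(ggplot2)

df <- subset(airquality, Month == 5)[, c("Day", "Wind", "Temp")]

# Stack Wind and Temp into one value column with a label column
long <- data.frame(
  Day      = rep(df$Day, times = 2),
  variable = rep(c("wind", "temp"), each = nrow(df)),
  value    = c(df$Wind, df$Temp)
)

# One geom_line() now draws both lines; color is a real data column
p_long <- ggplot(long, aes(x = Day, y = value, color = variable)) +
  geom_line() +
  scale_color_manual(name = "Variables",
                     values = c(wind = "red", temp = "blue")) +
  theme(legend.position = "bottom",
        legend.direction = "vertical")
p_long
```

The named vector in values plays the same role as before: each label is tied to its colour by name, not by position.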

For reference, this is what I got when I ran your original code:

The colours shown in the plot follow ggplot's default palette for 4 values, rather than any combination of black / blue / red / pink.
In addition, the initial color values were (in stated order) c("black", "blue", "red", "pink"), but the alphabetical order would have been c("black", "blue", "pink", "red"). The labels are mapped to this alphabetical order, which is why the purple line ("Mild thinness" according to the legend) is actually the line for "Normal range (lower bound)".
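The mismatch can be laid out in a few lines of R. This is only a sketch: the labels below are hypothetical stand-ins (I'm not reproducing your actual label vector), and hue_pal() from scales is assumed to give the same default colours as the plot.

```r
library(scales)

stated <- c("black", "blue", "red", "pink")  # order used in the code
labels <- paste("label", 1:4)                # hypothetical stand-ins

# ggplot sorts the values before assigning colours and labels
mapping <- data.frame(
  level          = sort(stated),
  default_colour = hue_pal()(4),
  label          = labels
)
mapping

# Position of each stated value after sorting: "red" was stated third
# but sorts fourth, so it picks up the fourth colour and fourth label.
match(stated, mapping$level)
```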