
I am trying to plot two variables (DATE and INT_RATE), using the content of a third variable, GRADE, as a filter. The problem is that I can't figure out how to use GRADE to filter the rows.

Below I provide a detailed sample of the starting data, as well as a sketch of the plot I'm trying to achieve. Thanks in advance.

STARTING DATA:

 | DATE  | INT_RATE | GRADE |
––––––––––––––––––––––––––––––
 | 1-jan | 5%       | A     | <-- A
 | 5-feb | 3%       | B     |
 | 9-feb | 2%       | D     |
 | 1-apr | 3%       | A     | <-- A
 | 5-jun | 5%       | A     | <-- A
 | 1-aug | 3%       | G     |
 | 1-sep | 2%       | E     |
 | 3-nov | 1%       | C     |
 | 8-dec | 8%       | A     | <-- A
 |   .   | .        | .     |
 |   .   | .        | .     |
 |   .   | .        | .     |

And this is the kind of graph I would like to achieve. It is a very basic one, apart from the filtering needed beforehand.

WANTED RESULT:

GRADE "A"

   INT_RATE
       |
       |
    8%-|                            •
       |                           /
       |                        /
       |                     /
    5%-|  •              •
       |   \            /
       |     \        /
       |       \     /
       |         \ /
    3%-|          •
       |
       |  
       |
       |
    ––––––––––––––––––––––––––––––––––-–––>
       |  ˆ       ˆ      ˆ           ˆ   DATE
       |1-jan   1-apr   5-jun      8-dec

EDIT 1:

Following the valuable help from @apax I managed to get a plot, but the result is not satisfying because of the odd way R displays it (I think it might be related to the fact that the dataset is very large, about 800k rows). Do you have any suggestions?

[Image: malformed graph]

By the way, this solved my problem:

plot(INT_RATE ~ DATE, data = filter(df, GRADE == "A"))

I am also uploading a PNG of the malformed chart. Thanks again to all.

scugn1zz0
  • I see you accepted an answer and edited your post to ask a new question. If you're interested in responses to the second question, it is best to ask in a new post rather than editing your current post. Note that in order to help as many users as possible, your post is given a single title, the scope of your problem is bounded, and you can only accept a single answer - i.e., when other users encounter the same kind of problem, they might come across this post. A question-answer buried in the comments is not going to achieve this. – CPak Apr 18 '18 at 18:37
  • @CPak thanks a lot for the suggestion. You have been very kind – scugn1zz0 Apr 19 '18 at 06:32

2 Answers


You could use ggplot2 and facet_wrap(...)

library(ggplot2)
ggplot(mtcars, aes(x=mpg, y=disp)) +
  geom_point() +
  facet_wrap(~cyl)

For your data:

ggplot(data, aes(x=DATE, y=INT_RATE)) +
  geom_line() +
  facet_wrap(~GRADE)

P.S. This gives a separate panel for every grade, not just "A", but that should not be a problem.
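If only one grade is wanted, as in the sketch in the question, the rows can be subset before they reach ggplot. Below is a minimal sketch of that idea. The sample data frame is hypothetical: it mirrors the question's table, assumes the year 2018, and assumes DATE is already a Date and INT_RATE is numeric rather than text like "5%".

```r
library(ggplot2)

# Hypothetical sample mirroring the question's table (year 2018 is assumed).
df <- data.frame(
  DATE     = as.Date(c("2018-01-01", "2018-02-05", "2018-02-09", "2018-04-01",
                       "2018-06-05", "2018-08-01", "2018-09-01", "2018-11-03",
                       "2018-12-08")),
  INT_RATE = c(5, 3, 2, 3, 5, 3, 2, 1, 8),  # percent, stored as plain numbers
  GRADE    = c("A", "B", "D", "A", "A", "G", "E", "C", "A")
)

# Keep only the rows for the grade of interest.
grade_a <- df[df$GRADE == "A", ]

# One line plus points, for the filtered rows only.
p <- ggplot(grade_a, aes(x = DATE, y = INT_RATE)) +
  geom_line() +
  geom_point() +
  labs(title = 'GRADE "A"', y = "INT_RATE (%)")
print(p)
```

Swapping "A" for another grade gives the corresponding single-grade plot; facet_wrap(~GRADE) remains the shorter route when all grades are wanted at once.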

anup
CPak

Here's a quick one-liner solution, assuming your data is stored in an object named df:

library(dplyr)  ## for filter() below

## plot() only honours `data` through its formula interface
plot(INT_RATE ~ DATE, data = filter(df, GRADE == "A"))
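For anyone who wants to try this without dplyr, here is a rough base R sketch of the same filter-then-plot idea. The data frame is a hypothetical stand-in for the question's table (year 2018 assumed, rows deliberately out of date order), and subset(), order() and the formula form of plot() are used in place of filter().

```r
# Hypothetical stand-in for the question's data, rows deliberately unsorted.
df <- data.frame(
  DATE     = as.Date(c("2018-12-08", "2018-02-05", "2018-01-01",
                       "2018-06-05", "2018-11-03", "2018-04-01")),
  INT_RATE = c(8, 3, 5, 5, 1, 3),            # percent, as plain numbers
  GRADE    = c("A", "B", "A", "A", "C", "A")
)

# Filter to one grade, then sort by date so the line is drawn left to right.
grade_a <- subset(df, GRADE == "A")
grade_a <- grade_a[order(grade_a$DATE), ]

# The formula interface looks DATE and INT_RATE up inside `data`.
plot(INT_RATE ~ DATE, data = grade_a, type = "b", main = 'GRADE "A"')
```

Sorting matters only when lines are drawn (type = "b" or "l"); with the default points, row order does not change the picture.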
apax
  • Or base R's `by`: `by(df, df$GRADE, function(d) plot(INT_RATE ~ DATE, data = d))` – Parfait Apr 18 '18 at 15:41
  • @apax your solution works just fine, except that DATE on the X axis is inexplicably not in order. Do you maybe know the reason? – scugn1zz0 Apr 18 '18 at 16:34
  • @scugn1zz0 my understanding of your question is that you're asking how to sort your data. Have a look at this relevant stack overflow [thread](https://stackoverflow.com/questions/1296646/how-to-sort-a-dataframe-by-columns) – apax Apr 18 '18 at 17:09
  • @apax thanks a lot, I managed to solve it. The problem was that R was not interpreting the date correctly, so I had to do the following: `df$DATE <- as.Date(gsub("^", "01-", df$DATE), format="%d-%b-%Y")` – scugn1zz0 Apr 18 '18 at 17:12
  • @apax I uploaded the graph I am obtaining, do you maybe know a way to improve it? – scugn1zz0 Apr 18 '18 at 17:22
  • @scugn1zz0 At this point you should probably re-post your graphing question to specifically ask about the formatting of your graph as this post's original question has been answered here. As a heads up, you may not get helpful responses without posting a minimal, reproducible example. So be prepared for that. – apax Apr 18 '18 at 18:35
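
The date fix mentioned in the comments above can be illustrated with a small sketch. The input strings are hypothetical: the "Mon-YYYY" shape is only inferred from the "01-" prefix and the "%d-%b-%Y" format used in that comment. Text dates sort alphabetically, which is one likely reason an axis appears out of order; converting to Date restores chronological order.

```r
# %b (abbreviated month name) is locale dependent; the C locale is assumed
# here so that English abbreviations such as "Jan" parse.
Sys.setlocale("LC_TIME", "C")

# Hypothetical raw values in an assumed "Mon-YYYY" shape.
raw <- c("Dec-2015", "Jan-2015", "Jun-2015")

# Prepend a day so as.Date() has a complete date to parse.
parsed <- as.Date(paste0("01-", raw), format = "%d-%b-%Y")

sort(raw)     # character sort: alphabetical by month abbreviation
sort(parsed)  # Date sort: chronological
```

paste0("01-", x) does the same job here as the gsub("^", "01-", x) call in the comment.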