I have a CSV file with the following lines:
10,130,A,100,1000
10,130,B,200,-200
10,130,C,300,1200
20,140,A,120,1050
20,140,B,220,-300
20,140,C,320,1250
30,120,A,145,1130
30,120,B,255,1000
30,120,C,355,1110
...
And so on, three lines for each increment by 10
in the first column. Each line contains two values in columns 4 and 5 for each type in column 3 (A, B and C). Values in columns 1 and 2 are the same for each set of three lines.
I read it with:
data <- read.csv("data_out.csv", header=FALSE, sep=",")
Each set of three lines read from the file contains 8 values that could be plotted as Y on a graph (example for the first three lines):
- Value in column 1 (10)
- Value in column 2 (130)
- Value in column 4 for type A (100)
- Value in column 5 for type A (1000)
- Value in column 4 for type B (200)
- Value in column 5 for type B (-200)
- Value in column 4 for type C (300)
- Value in column 5 for type C (1200)
They would be plotted for X = 10.
So the first 8 dots would have the following coordinates (X,Y):
(10,10); (10,130); (10,100); (10,1000); (10,200); (10,-200); (10,300); (10,1200)
The next three lines for 20 in column 1 would have coordinates:
(20,20); (20,140); (20,120); (20,1050); (20,220); (20,-300); (20,320); (20,1250)
And similarly for the third and any further set of three lines from the input file.
Dots representing each one of those 8 values from each set of three lines should be connected to form a line chart, similar to this one (but with 8 line charts, not 4 as on the example). So there would be 8 line charts on the same graph representing values for X=10, X=20, X=30, and so on.
Questions about the solution
I know how to plot one line, e.g. plot(data[,1],data[,4],type="l")
but how to plot multiple lines?
And how to ensure that the 0 for Y is in the correct place so that the negative values for C5 can be properly plotted as well?
Also, I know that there is the aggregate
function which could be used to group by the type (A, B, C), but I don't want to perform any summary or averaging, so I am probably looking for a filter (by the type) rather than aggregate?
I would probably also want to ensure that the amount of distinct values in C1 is the same as in C2 (purely for verification that the input data is fine).