I have a small tab-delimited file with the following data:

alg f1  prec    recall
rf  0.85891 0.808976    0.915413
svm 0.927857    0.988347    0.874345
knn 0.653483    0.611013    0.702298
nb  0.372421    0.253795    0.699256

I want to plot it like this:

[image: the desired line plot]

I am a complete newbie in R, so I currently enter my data the following way:

library(ggplot2)
library(plotly)

# performance of various algs
# use `=` (not `<-`) inside data.frame() so the columns get proper names
test <- data.frame(header = c("F-1", "Precision", "Recall"),
                   alg1 = c(0.66381,   0.523659,   0.906397),
                   alg2 = c(0.909586,  0.951798,   0.87096),
                   alg3 = c(0.402166,  0.282086,   0.700253),
                   alg4 = c(0.141439,  0.078692,   0.698064)
                  )

# plotting
ppl <- function() {
  ggplot(test, aes(header, colour = "alg", group = 4)) + 
    geom_line(aes(y = alg1, colour = "rf"), size=1) +
    geom_line(aes(y = alg2, colour = "svm"), size=1) +
    geom_line(aes(y = alg3, colour = "knn"), size=1) +
    geom_line(aes(y = alg4, colour = "nb"), size=1) +
    xlab("measures") +
    ylab("score") +
    labs(title = "") +
    theme(legend.justification = c(1, 1), legend.position = c(1, 1))
}

ppl()

So for each line in the plot I type the numbers in manually, although I know that I can do

data = read.table(file=file.choose(), sep="\t", header = TRUE)

And then somehow rearrange the data so that ggplot doesn't complain about "Aesthetics". Unfortunately, I don't know how. Is there a better, less tedious way to plot the table in this file?

minerals

2 Answers

Here is the solution for you:

library(ggplot2)
library(reshape2)

# performance of various algs
header <- c("F-1", "Precision", "Recall")
alg1 <- c(0.66381,   0.523659,   0.906397)
alg2 <- c(0.909586,  0.951798,   0.87096)
alg3 <- c(0.402166,  0.282086,   0.700253)
alg4 <- c(0.141439,  0.078692,   0.698064)
test <- data.frame(header,alg1,alg2,alg3,alg4)

test2 <- melt(test,id="header")

# plotting
ggplot(test2, aes(x=header,y=value,color=variable,group=variable)) + 
    geom_line(size=1) +
    xlab("measures") +
    ylab("score") +
    labs(title = "") +
    theme(legend.justification = c(1, 1), legend.position = c(1, 1)) +
    scale_x_discrete(labels = c("F-1", "Precision", "Recall"))

You first need to melt the data frame with the reshape2 package. This produces two new columns: use `value` as the y value, and `variable` as both the colour and the grouping variable.
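
To make the wide-to-long step concrete, here is a minimal sketch of the shape that `melt(test, id = "header")` produces. It is written in base R purely for illustration, so it runs without reshape2. The names `measure_cols` and `test_long` are made up for this example, and only two of the four algorithm columns are included to keep the output short. One difference from `melt`: here `variable` is a character column, whereas `melt` returns a factor.

```r
# Wide data frame: one column per algorithm (subset of the data above)
test <- data.frame(
  header = c("F-1", "Precision", "Recall"),
  alg1   = c(0.66381,  0.523659, 0.906397),
  alg2   = c(0.909586, 0.951798, 0.87096)
)

# Every column except the id column holds measurements
measure_cols <- setdiff(names(test), "header")

# Long format: one row per (header, algorithm) pair
test_long <- data.frame(
  header   = rep(test$header, times = length(measure_cols)),
  variable = rep(measure_cols, each = nrow(test)),
  value    = unlist(test[measure_cols], use.names = FALSE)
)

print(test_long)
```

Each row of `test_long` is a single point on the plot, which is why one `geom_line()` call with `group = variable` can draw every algorithm at once.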

Mal_a
  • I would still have to **manually** insert all the f1, prec, rec numbers into vector. Is there a way to just load this file and plot its data? – minerals Aug 17 '17 at 09:24

Try this:

library(ggplot2)
library(reshape)

# example data
df1 <- read.table(text = "
alg f1  prec    recall
rf  0.85891 0.808976    0.915413
svm 0.927857    0.988347    0.874345
knn 0.653483    0.611013    0.702298
nb  0.372421    0.253795    0.699256", header = TRUE)

# melt the data, wide-long
df1_melt <- melt(df1, id.vars = "alg")

# then plot
ggplot(df1_melt, aes(x = variable, y = value, colour = alg, group = alg)) +
  geom_line(size = 1) +
  # prettify
  scale_y_continuous(breaks = seq(0.25,0.75, 0.25), limits = c(0, 1)) +
  xlab("measures") +
  ylab("score") +
  labs(title = "") +
  theme(legend.justification = c(1, 1), legend.position = c(1, 1))
zx8754
  • This example is better because I can load the file, apply `melt` on the loaded data and plot it directly without any manual typing. – minerals Aug 17 '17 at 09:25
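
The workflow described in this comment (load the file, melt it, plot it) might look roughly like the sketch below. It assumes the file is tab-delimited with a header row, as in the question. The temporary file only stands in for the real one so that the example is self-contained, and the names `path`, `scores`, `scores_long` and `p` are made up for illustration. It uses reshape2 rather than reshape; both provide `melt`.

```r
library(ggplot2)
library(reshape2)

# Stand-in for the real file: write the question's table to a temporary
# tab-delimited file. With your own data, set `path` to your file instead.
path <- tempfile(fileext = ".tsv")
writeLines(c(
  "alg\tf1\tprec\trecall",
  "rf\t0.85891\t0.808976\t0.915413",
  "svm\t0.927857\t0.988347\t0.874345",
  "knn\t0.653483\t0.611013\t0.702298",
  "nb\t0.372421\t0.253795\t0.699256"
), path)

# read.delim() defaults to sep = "\t" and header = TRUE
scores <- read.delim(path)

# Wide to long: one row per (algorithm, measure) pair
scores_long <- melt(scores, id.vars = "alg")

# One line per algorithm across the three measures
p <- ggplot(scores_long,
            aes(x = variable, y = value, colour = alg, group = alg)) +
  geom_line() +
  xlab("measures") +
  ylab("score")
print(p)
```

Replacing `path` with `file.choose()` would give the interactive file picker used in the question.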