I have the following series of commands:
my_data = read.csv(file='r-stats.out', sep='\t', skip=1)
histsub = subset(my_data, my_data[,10] != "Invalid")
hist(as.numeric(histsub[,10]))
r-stats.out is a file with 10 columns. Column 10 (the one I am trying to plot) contains numbers ranging from -2000 to 10000, or the word "Invalid", which I try to filter out first. For some reason, my histogram only covers the range 0 to 2500 and ignores everything else. Why? What is happening? I ran
print(histsub)
and everything looks okay: those numbers are there in histsub, but not on the plot. Please help.
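One thing that might be worth checking is how R stored column 10. The sketch below is only a guess at the situation: it assumes the column was read in as a factor (the default for text columns in R versions before 4.0) because of the "Invalid" entries. The vector `col10` is a hypothetical stand-in built from a few of the values shown in this post, not the real file.

```r
# Hypothetical stand-in for column 10 of r-stats.out (assumption: stored as a factor).
col10 <- factor(c("-37", "4629", "Invalid", "7379", "1321"))

# Drop the "Invalid" entries, similar to what the subset() call does.
kept <- col10[col10 != "Invalid"]

# as.numeric() on a factor returns the internal level codes (small integers),
# not the numbers that print() displays.
codes <- as.numeric(kept)

# Converting to character first recovers the printed numbers.
values <- as.numeric(as.character(kept))

print(class(col10))  # shows how the column is stored
print(codes)         # level codes
print(values)        # the actual numbers
```

If `class(histsub[,10])` reports "factor" on the real data, then the histogram would be drawn from level codes rather than from the values themselves, which could explain a range that does not match the printed numbers.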
EDIT: Adding a few lines from the printed output of my_data and of histsub.
my_data:
38 629345 1 633201 0 -41 Invalid 0 g 0 -37
39 633201 0 628727 0 4496 323 0 g 0 4629
40 628727 0 631371 1 7835 202 0 g 0 Invalid
41 631371 1 625871 1 7317 112 0 g 0 7379
42 625871 1 633427 1 1351 348 0 g 0 1321
histsub:
38 629345 1 633201 0 -41 Invalid 0 g 0 -37
39 633201 0 628727 0 4496 323 0 g 0 4629
41 631371 1 625871 1 7317 112 0 g 0 7379
42 625871 1 633427 1 1351 348 0 g 0 1321