I'm having a problem where the numbers R gives me as output are far higher (by a factor of several hundred) than the input values.
My script is reading a tab-delimited text file with the following data (extract):
4800.000000000004 63.79541685299562
4808.000000000004 65.44888307144669
4816.000000000004 65.66174624010496
4824.000000000004 65.85413227845713
4832.000000000004 66.3271958214957
4840.000000000004 66.67304406065
4848.000000000004 66.90294325983125
4856.000000000004 67.16391462118467
4864.000000000004 67.3649619902818
4872.000000000004 67.47950644400306
4880.000000000004 67.53568545748826
4888.000000000004 67.5820448431992
4896.000000000004 67.70983887523283
4904.000000000004 67.84124194437604
4912.000000000004 67.78234409282649
4920.000000000004 67.17896344097808
4928.000000000004 65.16964351857043
This is loaded as intensityFile -- it's intensity data from an audio analysis program. The first column is time in milliseconds, the second is intensity in decibels.
From here, I grab all intensity data between two time values (taken from another file in a loop):
beginTime <- labelFile[i,1]
endTime <- labelFile[i,2]
...
#Read intensity file. Grab all intensity measurements >= begin time and <= end time
C <- subset(intensityFile, V1>=beginTime & V1<=endTime)
#Do the following calculations on the intensity, stored in the data table
maxIntense <- max(as.numeric(C$V2))
minIntense <- min(as.numeric(C$V2))
rangeIntense <- maxIntense - minIntense
meanIntense <- mean(as.numeric(C$V2))
stdevIntense <- sd(as.numeric(C$V2))
(I've left out defining "labelFile", which is where I get the time values.)
The problem is that after I do these operations, I get values like this:
maxIntense minIntense rangeIntense meanIntense
23242 19110 4132 21466.66667
24699 19851 4848 23384
22109 16905 5204 20892.28571
25442 13973 11469 20764.46154
26410 16347 10063 23433.18182
25452 13750 11702 20401.63636
27241 9788 17453 23040.41667
23795 19965 3830 22413.5
23528 19584 3944 22074.14286
27530 14302 13228 21571.91667
Which are obviously massively inflated. These are humans speaking, not planet-busting bombs. I have tried using as.double() rather than as.numeric() (I have to force a type, as the intensity for some reason gets read as a factor otherwise). What could be causing this weird inflation?
A note -- I do essentially the same operation to a file indicating pitch values, but no weird inflation (it's also tab-delimited text).
EDIT: Fixed thanks to joran's first comment below. The reason C$V2 was being read as a factor was that each file contained a number of "--undefined--" values. I manually deleted these before running the script in R and it worked. Apparently there is a duplicate, but I won't be needing that.
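For anyone who lands here with the same symptom, here is a minimal sketch of what I believe was going on, and a way to avoid hand-editing the files. The three-row sample data and the names raw, bad and good are made up for illustration; stringsAsFactors = TRUE is set explicitly because that was the default before R 4.0, which is an assumption about the setup that produced the numbers above.

```r
# Hypothetical miniature of the intensity file with one "--undefined--" entry.
raw <- "4800\t63.79\n4808\t--undefined--\n4816\t65.66\n"

# With strings read as factors (the default before R 4.0), a column that
# contains any non-numeric text becomes a factor instead of a numeric vector.
bad <- read.table(text = raw, sep = "\t", stringsAsFactors = TRUE)

# as.numeric() on a factor returns the internal integer level codes,
# not the decibel values printed in the file.
codes <- as.numeric(bad$V2)

# Going through character first recovers the real numbers;
# "--undefined--" becomes NA (with a coercion warning).
values <- suppressWarnings(as.numeric(as.character(bad$V2)))

# Declaring the placeholder as a missing-value marker keeps the column
# numeric from the start, so no coercion is needed afterwards.
good <- read.table(text = raw, sep = "\t", na.strings = "--undefined--")
maxIntense <- max(good$V2, na.rm = TRUE)
```

With na.strings set, max(), min(), mean() and sd() need na.rm = TRUE if an undefined frame can fall inside the selected time window. This would probably also explain why the pitch file behaved: if it had no non-numeric entries, its column was read as numeric in the first place.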