I have a data frame (df) that I want to subset according to the value of the column t. In my pipeline this is done in a loop, so that each iteration processes only the rows of the data frame with a given value of t. Here is part of the data frame:
df
        t       d      avrg        se       s_n
105 4.034 574.383  533.3125 15.842750 0.5742241
106 4.034 579.906  526.2601 16.520519 0.5666307
107 4.034 585.429  517.3978 16.603408 0.5570885
108 4.034 590.951  514.8851 16.378100 0.5543831
109 4.034 596.474  517.5682 16.031580 0.5572721
110 4.034 601.997  524.1770 16.301832 0.5643879
111 4.034 607.520  521.1787 16.773292 0.5611595
112 4.034 613.043  511.4275 17.079401 0.5506602
113 4.034 618.566  506.8916 16.757593 0.5457765
114 4.034 624.089  511.3979 17.165346 0.5506284
115 4.034 629.612  511.7480 17.175872 0.5510053
116 4.034 635.135  509.7872 17.862666 0.5488941
117 4.034 640.658  507.4556 19.080856 0.5463837
118 4.244   0.000  984.5679  1.842083 1.0600964
119 4.244   5.523 1040.4532  4.488659 1.1202687
120 4.244  11.046 1284.3719 24.832460 1.3828990
121 4.244  16.569 1503.8378 49.605517 1.6192007
122 4.244  22.092 1558.5444 49.223158 1.6781039
123 4.244  27.615 1631.0177 36.870109 1.7561368
124 4.244  33.137 1741.2543 30.006613 1.8748300
125 4.244  38.660 1872.4405 37.725207 2.0160797
To get the distinct values (levels) of t, I do this:
times <- as.numeric(levels(as.factor(df$t)))
Then, as I loop, I subset my data frame like this:
for (j in times) {
  df_t <- df[df$t == j, ]
  # ... everything I need to do with df_t goes here ...
}
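For comparison, the same per-group loop can also be sketched with split(), which hands back one sub-data-frame per value of t and so needs no == test inside the loop. This is only a minimal sketch on made-up data: `toy`, `groups` and `sums` are hypothetical names, and summing d stands in for the real processing.

```r
# Toy data (hypothetical), standing in for df.
toy <- data.frame(t = c(1.5, 1.5, 2.5), d = c(10, 20, 30))

# split() returns a named list with one data frame per distinct t.
groups <- split(toy, toy$t)

# Loop over the groups; the sum of d is a placeholder for the real work.
sums <- numeric(0)
for (name in names(groups)) {
  df_t <- groups[[name]]
  sums[name] <- sum(df_t$d)
}
print(sums)
```

The names of `groups` are the character form of each t value, so they are convenient as labels but should not be converted back to numbers for comparisons.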
I noticed that the subsetting works well for some values of t but not for others. For example:
> df_t <- df[df$t==times[1],]
> df_t
        t       d     avrg       se       s_n
105 4.034 574.383 533.3125 15.84275 0.5742241
106 4.034 579.906 526.2601 16.52052 0.5666307
107 4.034 585.429 517.3978 16.60341 0.5570885
108 4.034 590.951 514.8851 16.37810 0.5543831
109 4.034 596.474 517.5682 16.03158 0.5572721
110 4.034 601.997 524.1770 16.30183 0.5643879
111 4.034 607.520 521.1787 16.77329 0.5611595
112 4.034 613.043 511.4275 17.07940 0.5506602
113 4.034 618.566 506.8916 16.75759 0.5457765
114 4.034 624.089 511.3979 17.16535 0.5506284
115 4.034 629.612 511.7480 17.17587 0.5510053
116 4.034 635.135 509.7872 17.86267 0.5488941
117 4.034 640.658 507.4556 19.08086 0.5463837
> df_t1 <- df[df$t==times[2],]
> df_t1
[1] t d avrg se s_n
<0 rows> (or 0-length row.names)
Why does the subsetting work with some of my levels and not with others? When I check them manually, both values look correct, and my data frame clearly has those values in the t column:
> times[1]
[1] 4.034
> times[2]
[1] 4.244
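One further check that may help here, offered as a hedged sketch rather than a confirmed diagnosis: R prints doubles with 7 significant digits by default, so two values that print identically are not necessarily equal. The values `a` and `b` below are made up and are not taken from df; `sprintf("%.17g", ...)` is used to show more of the stored digits.

```r
# Hypothetical values, not taken from df: b is the result of arithmetic.
a <- 0.3
b <- 0.1 + 0.2

# Default printing rounds to 7 significant digits, so both display as 0.3.
print(a)
print(b)

# An exact comparison shows that they are different doubles.
same <- a == b
print(same)  # FALSE

# Formatting with 17 significant digits reveals the stored values.
a_full <- sprintf("%.17g", a)
b_full <- sprintf("%.17g", b)
print(c(a_full, b_full))
```

On the data in this post, the equivalent check would be to compare `sprintf("%.17g", times[2])` with `sprintf("%.17g", df$t[14])`, row 14 being the first row that prints as 4.244.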
I also tried other ways of subsetting, like this:
> subset.data.frame(df, df$t==times[2])
[1] t d avrg se s_n
<0 rows> (or 0-length row.names)
> subset.data.frame(df, df$t==times[1])
        t       d     avrg       se       s_n
105 4.034 574.383 533.3125 15.84275 0.5742241
106 4.034 579.906 526.2601 16.52052 0.5666307
107 4.034 585.429 517.3978 16.60341 0.5570885
108 4.034 590.951 514.8851 16.37810 0.5543831
109 4.034 596.474 517.5682 16.03158 0.5572721
110 4.034 601.997 524.1770 16.30183 0.5643879
111 4.034 607.520 521.1787 16.77329 0.5611595
112 4.034 613.043 511.4275 17.07940 0.5506602
113 4.034 618.566 506.8916 16.75759 0.5457765
114 4.034 624.089 511.3979 17.16535 0.5506284
115 4.034 629.612 511.7480 17.17587 0.5510053
116 4.034 635.135 509.7872 17.86267 0.5488941
117 4.034 640.658 507.4556 19.08086 0.5463837
But as you can see, the subsetting still works with one value and not with the other. Do you have any suggestions on how to solve this problem?
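If the mismatch comes from floating-point representation (an assumption; nothing in this post confirms it), one common workaround is to compare within a tolerance instead of using ==. Below is a minimal sketch on toy data. `toy`, `tol`, `exact` and `near` are illustrative names, and the tolerance value is an arbitrary choice.

```r
# Toy data (hypothetical): the first two t values come from arithmetic,
# so they are stored as 0.30000000000000004 rather than as 0.3.
toy <- data.frame(t = c(0.1 + 0.2, 0.1 + 0.2, 0.5), d = c(1, 2, 3))

# Same approach as in the question: the levels pass through character strings.
times <- as.numeric(levels(as.factor(toy$t)))

# Exact comparison finds no rows for the first level.
exact <- toy[toy$t == times[1], ]

# Comparison within a tolerance (arbitrary value) finds both rows.
tol <- 1e-8
near <- toy[abs(toy$t - times[1]) < tol, ]

print(nrow(exact))  # 0
print(nrow(near))   # 2
```

The tolerance has to be smaller than the gap between genuinely different values of t, otherwise neighbouring groups would be merged.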
UPDATE1
As suggested in the comments, using
times <- unique(df$t)
instead of my first method works well and seems to solve the problem for now.
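A possible reason why unique() behaves differently, sketched under the assumption that the values of t are not stored exactly as printed: unique() returns the stored doubles unchanged, whereas levels(as.factor(...)) converts them to text (15 significant digits) before as.numeric() parses them back. The vector `x` below is hypothetical.

```r
# Hypothetical vector: the first two values are stored as 0.30000000000000004.
x <- c(0.1 + 0.2, 0.1 + 0.2, 0.5)

# Round trip through factor levels: number -> text -> number.
via_factor <- as.numeric(levels(as.factor(x)))

# unique() keeps the doubles exactly as they are stored.
via_unique <- unique(x)

# Count how many elements of x match the first candidate exactly.
factor_hits <- sum(x == via_factor[1])
unique_hits <- sum(x == via_unique[1])

print(factor_hits)  # 0
print(unique_hits)  # 2
```

Because the values returned by unique() are the same doubles that sit in the column, an == comparison against them matches by construction.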
UPDATE2
Following some comments, here is a reproducible form of my df:
> dput(df)
structure(list(t = c(4.034, 4.034, 4.034, 4.034, 4.034, 4.034,
4.034, 4.034, 4.034, 4.034, 4.034, 4.034, 4.034, 4.244, 4.244,
4.244, 4.244, 4.244, 4.244, 4.244, 4.244), d = c(574.383, 579.906,
585.429, 590.951, 596.474, 601.997, 607.52, 613.043, 618.566,
624.089, 629.612, 635.135, 640.658, 0, 5.523, 11.046, 16.569,
22.092, 27.615, 33.137, 38.66), avrg = c(533.312475247525, 526.260069306931,
517.397752475248, 514.885089108911, 517.568217821782, 524.17702970297,
521.178702970297, 511.427475247525, 506.891643564356, 511.397861386139,
511.74796039604, 509.787158415842, 507.455584158416, 984.567900990099,
1040.45316831683, 1284.37189108911, 1503.83781188119, 1558.54437623762,
1631.01772277228, 1741.25434653465, 1872.44046534653), se = c(15.8427501449439,
16.5205192226773, 16.6034079506853, 16.3780996947454, 16.0315801572497,
16.3018319583687, 16.7732924683709, 17.0794011397917, 16.7575928861432,
17.1653457253679, 17.1758716618221, 17.8626655326283, 19.0808563725021,
1.84208337262486, 4.48865895211631, 24.8324597734051, 49.6055165744209,
49.2231582153052, 36.8701085501606, 30.0066129040664, 37.7252068402058
), s_n = c(0.574224096797455, 0.566630684643339, 0.557088519188004,
0.554383103659479, 0.557272061321729, 0.564387850300072, 0.561159515055948,
0.550660248318982, 0.545776462597893, 0.550628362710457, 0.551005318617323,
0.548894099025929, 0.546383664366651, 1.06009635986746, 1.12026871405828,
1.38289900076007, 1.61920065503164, 1.67810388524744, 1.75613682819789,
1.8748299558705, 2.01607966234353)), .Names = c("t", "d", "avrg",
"se", "s_n"), row.names = 105:125, class = "data.frame")