
I am trying to fill a tidy data.frame according to calculated indices. After creating a vector of indices from calculated values, I can print it and confirm that the calculated values look correct. However, when I use that vector to assign values to rows in the data.frame, the values are not filled in exactly the right rows: one additional row, just before the intended range, is filled as well. When I assign the same values using the `from:to` method instead, the values fill in the correct rows.

The index vector is calculated like this:

PostPulseIndices <- ((Alpha+StepDuration)/SampleInterval+1) : ((Alpha+StepDuration+0.120)/SampleInterval)

yields the correct results:

> head(PostPulseIndices)
[1] 2801 2802 2803 2804 2805 2806

But when used to fill rows in particular columns in a data.frame, the values are filled in starting one row early:

DataFrame$"A1 min"[PostPulseIndices] <- A1_min
DataFrame$"A2 min"[PostPulseIndices] <- A2_min
> DataFrame[2799:2802,]
     A1 min A2 min
2799     NA     NA
2800  0.001  8e-04
2801  0.001  8e-04
2802  0.001  8e-04

Why are A1 and A2 filling at row 2800 when they should be filling at row 2801, where the index vector starts, and how can I ensure that the right rows are filled going forward?

EDIT: Per @Gregory's request for additional code, here are the values used to calculate PostPulseIndices:

Alpha <- 0.02   
StepDuration <- 0.120
SampleInterval <- 5e-05 

And here is the code used to create the data.frame:

Sweeps <- 1
SweepDuration <- 0.3
x <- seq(SampleInterval, SweepDuration, by = SampleInterval) 
ColNames <- c("A1 min", "A2 min")
DataFrame <- setNames(data.frame(matrix(ncol = (length(ColNames)), nrow = (length(x) * Sweeps))), ColNames)
  • To clarify, when I assign the same values to the index values using `from:to` notation, the data.frame fills the specified column starting at 2801, as expected. – carbontaxfan Sep 10 '19 at 16:12

1 Answer


The minimal working example below does not reproduce what is happening to you. Could you provide a reproducible example that does?

# Small data.frame with an empty value column
df <- data.frame(letters = letters[1:10], value = NA)
# Rows to fill
indices <- 3:10
df$value[indices] <- 1
df

# letters value
# 1        a    NA
# 2        b    NA
# 3        c     1
# 4        d     1
# 5        e     1
# 6        f     1
# 7        g     1
# 8        h     1
# 9        i     1
# 10       j     1
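
With the values added in the question's edit, one plausible explanation is floating-point rounding: `(Alpha+StepDuration)/SampleInterval+1` may evaluate to a number just below 2801 rather than exactly 2801. R prints such a number as 2801, but numeric indices are truncated toward zero when used for subsetting, so the assignment would start at row 2800. The sketch below checks this and rounds each endpoint before building the sequence. The names `raw_start`, `raw_indices`, `from`, `to`, `filled_raw` and `filled_fixed` are made up for illustration, and a plain vector of length 6000 stands in for a data.frame column.

```r
# Values from the question
Alpha <- 0.02
StepDuration <- 0.120
SampleInterval <- 5e-05

# Original calculation: the start of the range may not be an exact integer
raw_start <- (Alpha + StepDuration) / SampleInterval + 1
print(raw_start)               # default printing shows 7 significant digits
print(raw_start, digits = 17)  # more digits reveal any fractional part
print(as.integer(raw_start))   # the row that indexing would actually use

raw_indices <- raw_start : ((Alpha + StepDuration + 0.120) / SampleInterval)

# Rounding each endpoint first gives whole-number indices
from <- round((Alpha + StepDuration) / SampleInterval) + 1
to   <- round((Alpha + StepDuration + 0.120) / SampleInterval)
PostPulseIndices <- from:to

# Compare where each version starts filling a plain vector
# (a stand-in for one data.frame column)
filled_raw <- rep(NA_real_, 6000)
filled_raw[raw_indices] <- 0.001
filled_fixed <- rep(NA_real_, 6000)
filled_fixed[PostPulseIndices] <- 0.001
first_raw   <- which(!is.na(filled_raw))[1]
first_fixed <- which(!is.na(filled_fixed))[1]
print(c(first_raw, first_fixed))
```

If `first_raw` comes out one less than `first_fixed`, the off-by-one comes from the index calculation rather than from the data.frame. Rounding is only appropriate here because the endpoints are meant to be whole sample numbers.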
Gregory
  • I tried to reproduce the problem by assigning values to the vector, but the problem did not reproduce. Should I edit my comment above to link to the code itself? – carbontaxfan Sep 10 '19 at 18:56
  • Rather than linking a huge dataset (much better than nothing), could you try reducing the data to what will minimally reproduce the problem? You could use `dplyr::sample_n` to get random rows (do some cause the problem but not others?) and `dplyr::select` to select only the relevant columns. Then call `dput(reduced_dataframe_name)` on the reduced dataframe and copy the output into a code block in your question. – Gregory Sep 10 '19 at 19:02
  • LOL. I don't think you quite got the idea of a minimal [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Try to create some code that we can run which will create the dataframe, do the operations you want, and produce results we can compare to what you expect. It shouldn't take 30,000 rows and 36 columns. – Gregory Sep 10 '19 at 20:03
  • @carbontaxfan That said, you have duplicate column names that could cause you trouble ("fit2 Tau2" is used 3 times). I ran your code (so far as I could put it together) on your dataframe it produced the result you expected, and did not reproduce the problem. – Gregory Sep 10 '19 at 20:03
  • Good catch on the column names. That was not what was causing this problem, but it would have caused me problems down the road. I removed extraneous columns and rows. Since it's the calculated example that's the problem, I left that as-is. – carbontaxfan Sep 10 '19 at 21:24
  • What's there is enough to reproduce the problem, which can be verified using `DataFrame[2799:2802,]`. – carbontaxfan Sep 12 '19 at 02:39