I have a dataframe that looks like this:
*VarName1* - *VarValue1*
*VarName2* - *VarValue2*
*Etc.*
In practice it looks something like this:
nmlVar - noFloat
Date-Batch - 2011020147
Weight - 10
Length - 5
Height - 8
Date-Batch - 2011020148
Weight - 10.3
Length - 6
Height - 8
Date-Batch - 2011020147
Weight - 10
Length - 5
Height - 8
I want to organise the data so that I can use it for analysis. I already found out how to transpose the rows into columns in this post: "Transposing rows into columns, then split them".
I used this code to transpose:
library(dplyr)
library(tidyr)
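DFP %>%
  # start a new sample number at every 'Date-Batch' row (the label used in the data above)
  mutate(sample = cumsum(nmlVar == 'Date-Batch')) %>%
  # one column per variable name, one row per sample
  spread(nmlVar, noFloat)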
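To make the grouping step concrete, here is a minimal base-R sketch of the sample shown above. It assumes that both columns are stored as character and that every record starts with a "Date-Batch" row. The name DFP comes from the code above, and the counter tests for 'Date-Batch' because that is the label that appears in the sample.

```r
# Sample data from the question in long format (character columns are an assumption)
DFP <- data.frame(
  nmlVar  = rep(c("Date-Batch", "Weight", "Length", "Height"), times = 3),
  noFloat = c("2011020147", "10",   "5", "8",
              "2011020148", "10.3", "6", "8",
              "2011020147", "10",   "5", "8"),
  stringsAsFactors = FALSE
)

# Every "Date-Batch" row opens a new record, so a running count of those rows
# labels the record each row belongs to
DFP$sample <- cumsum(DFP$nmlVar == "Date-Batch")

print(DFP)
```

With this counter, the four rows of each record share one sample number, which is what lets spread() place them on a single row.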
I want to do the same, but with the "Date-Batch" variable as the key variable in the code above. I need this because Date-Batch is the key used in another dataframe, and I want to merge the two.
The problem is that this Date-Batch variable does not always have unique values (compare the first and third occurrences). I am trying to find a function that deletes every repeated occurrence of the same Date-Batch value, so that only the first one is kept.
I tried to describe it in 'programming words':
FOR each Date-Batch value IN nmlVar:
    IF it is a duplicate:
        DELETE the second occurrence
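One possible reading of the pseudocode, as a rough base-R sketch rather than a definitive solution: number the records, collect each record's Date-Batch value, and drop every record whose value has already appeared. It assumes that a duplicate should remove the whole record (the Date-Batch row plus the Weight, Length and Height rows that follow it), which goes a step beyond what the pseudocode states. The names rec, keys, dup_rec and DFP_unique are made up for illustration.

```r
# Sample data from the question in long format (character columns are an assumption)
DFP <- data.frame(
  nmlVar  = rep(c("Date-Batch", "Weight", "Length", "Height"), times = 3),
  noFloat = c("2011020147", "10",   "5", "8",
              "2011020148", "10.3", "6", "8",
              "2011020147", "10",   "5", "8"),
  stringsAsFactors = FALSE
)

# Record number for every row: increases by one at each "Date-Batch" row
rec <- cumsum(DFP$nmlVar == "Date-Batch")

# The Date-Batch value of each record, in order of appearance
keys <- DFP$noFloat[DFP$nmlVar == "Date-Batch"]

# Records whose Date-Batch value was already seen in an earlier record
dup_rec <- which(duplicated(keys))

# Keep only the rows that belong to first occurrences
DFP_unique <- DFP[!(rec %in% dup_rec), ]

print(DFP_unique)
```

After this step each Date-Batch value occurs once, so it could serve as the key when transposing and merging. If the repeated records can differ in their other values, keeping only the first one silently discards data, so that is worth checking first.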
I don't know if this is the best way to do it, so suggestions for a different approach are welcome too.