R tidyr spread Error: Duplicate identifiers for rows

Question

I'm having an issue with the tidyr::spread() function in R.

Previously I ran the melt() function to remove NA values and shrink my data.

    NPP0 <- melt(NPP, variable.names("3", "13", "14", "15", "16", "24", "25", "26"), na.rm=T)

It worked fine and resulted in one column named 'variable', holding my 'variable.names' as above, and a 'value' column with the corresponding values.

      variable   value
    2        3 2688.00
    3        3 1432.00
    4       13 1336.00
    5       14 1152.00
    8       .. 1832.00

Now I want to go back to a wide layout, with one column per variable, named after its category.

Just checking:

    str(NPP0)
    'data.frame':   5783 obs. of  2 variables:
     $ variable: Factor w/ 8 levels "3","13","14",..: 1 1 1 1 1 1 1 1 1 1 ...
     $ value   : num  2688 1432 1336 1152 1832 ...

Then:

    NPP1 <-  spread(NPP0, key='variable', value='value', convert = T)

Gives:

    Error: Duplicate identifiers for rows (1, 2, 3,...)

I also tried the reshape2::dcast() function, but it gives something really strange:

    NPP1 <- dcast(NPP0, value ~ variable, value.var = 'value')

    Aggregation function missing: defaulting to length

           value  3 13 14 15 16 24 25 26
    1       0.16  0  0  0  0  0  1  0  0
    2       0.92  0  7  0  0  0  0  0  0
    3       1.00  0  2  0  0  0  0  0  0

Can anyone help with this?

What exactly do you want your output to look like? – camille May 08 '18 at 22:31

score 1 · Answer 1 · answered May 15 '18 at 03:37

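I solved it using this:

    library(reshape2)  # melt(); assumed, since the question uses reshape2
    library(dplyr)     # group_by(), mutate(), row_number(), %>%
    library(tidyr)     # spread()

    # Remove NA values
    NPP0 <- melt(NPP, variable.names("3", "13", "14", "15", "16", "24", "25", "26"), na.rm=T)

    # Number the rows within each variable so spread() has a unique row identifier
    NPP1 <- as.data.frame(NPP0 %>%
      group_by(variable) %>%
      mutate(id = row_number()) %>%
      spread(variable, value))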

Which gives:

    View(NPP1)

[Resulting dataframe][1]

  [1]: https://i.stack.imgur.com/kI1HD.png
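The per-group row index used above can be tried on a small data frame. This is only a sketch: the `long` data frame and its values are made up for illustration and stand in for `NPP0`, and the `ungroup()` and `select(-id)` steps are optional additions (the latter comes from the follow-up comment on the other answer).

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for NPP0: a factor key and a numeric value, no row identifier
long <- data.frame(
  variable = factor(c("3", "3", "13", "14", "14", "14"),
                    levels = c("3", "13", "14")),
  value    = c(2688, 1432, 1336, 1152, 1832, 900)
)

# Calling spread(long, variable, value) directly fails, because several rows
# share the same key and nothing else tells them apart.

wide <- long %>%
  group_by(variable) %>%
  mutate(id = row_number()) %>%  # 1, 2, ... within each variable
  ungroup() %>%
  spread(variable, value) %>%    # one row per id, one column per variable
  select(-id)                    # drop the helper column afterwards

print(wide)
```

Because the variables have different numbers of observations, the shorter columns are padded with NA at the bottom.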

Thank you for helping.

score 0 · Answer 2 · answered May 08 '18 at 21:45

Your data do not have any identifier for rows; that might be the reason.

    NPP0$samples <- rownames(NPP0)
    NPP1 <- spread(NPP0, key='variable', value='value', fill=0)

Try it; I hope it works.
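To see what this suggestion produces, here is a small sketch on made-up data (the `long` data frame is hypothetical and stands in for `NPP0`). Since the row names are unique per observation, the result keeps one row per original row, with the value in its own variable's column and the `fill` value everywhere else.

```r
library(tidyr)

# Hypothetical stand-in for NPP0
long <- data.frame(
  variable = factor(c("3", "3", "13"), levels = c("3", "13")),
  value    = c(2688, 1432, 1336)
)

long$samples <- rownames(long)  # "1", "2", "3": unique for every row
wide <- spread(long, key = "variable", value = "value", fill = 0)

print(wide)
```

This avoids the error, but it does not stack the values of each variable into a compact column; whether that is acceptable depends on the output you want.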

BJK

Hi BJK, I solved the problem using:

    NPP0 <- melt(NPP, variable.names("1", "2", "9", "10", "11", "12", "13", "20", "21", "22", "23"), na.rm=T)
    library(tidyr)
    NPP1 <- as.data.frame(NPP0 %>%
      group_by(variable) %>%
      mutate(id = row_number()) %>%
      spread(variable, value) %>%
      select(-id))

– RODRIGO NUNES May 11 '18 at 13:36
