There are other questions here about the same error, but I can't work out how to solve my problem from them. I have 5 data frames whose rows I want to combine into a single data frame using rbind, but it returns the error:

"Error in row.names<-.data.frame(*tmp*, value = value) : 'row.names' duplicated not allowed In addition: Warning message: non-unique values when setting 'row.names': ‘1’, ‘10’, ‘100’, ‘1000’, ‘10000’, ‘100000’, ‘1000000’, ‘1000001 [....]"

The data frames have the same columns but different numbers of rows. I thought rbind took the first column as row.names, so I tried putting a sequential id in the five data frames, but that doesn't work. I've also tried setting sequential row names across the data frames via row.names(), again with no success. I don't think merge is an option, because there are 5 data frames and successive merges would overwrite the preceding ones. I also created a new data frame containing only the ids and tried to join on it, but the resulting data frame doesn't include the columns of the joined df.

Here is an extract of df1:

  id    image     power     value pol class
1  1 tsx_sm_hh 0.1834515 -7.364787  hh    FR
2  2 tsx_sm_hh 0.1834515 -7.364787  hh    FR
3  3 tsx_sm_hh 0.1991938 -7.007242  hh    FR
4  4 tsx_sm_hh 0.1991938 -7.007242  hh    FR
5  5 tsx_sm_hh 0.2079365 -6.820693  hh    FR
6  6 tsx_sm_hh 0.2079365 -6.820693  hh    FR
[...]
1802124 1802124 tsx_sm_hh 0.1991938 -7.007242  hh    FR  

The four other data frames have the same structure, except that the 'id' columns contain no numbers duplicated among them. The 'pol' and 'image' columns are factors. Running all.pol <- rbind(df1,df2,df3,df4,df5) returns the duplicated row.names error above.

Any idea?

Thanks in advance

Jecogeo
  • I can't reproduce your error. Can you post the `str` of two of the data frames? Have you tried binding just two, and do you get the same error? Is `rbind(df1,df2,df3,df4,df5)` the exact code you use to produce these errors? – rawr Mar 31 '14 at 16:41
  • > str(forest) 'data.frame': 1802124 obs. of 6 variables: $ id : int 1 2 3 4 5 6 7 8 9 10 ... $ image: Factor w/ 9 levels "tsx_sm_hh","tsx_sm_hv",..: 1 1 1 1 1 1 1 1 1 1 ... $ power: num 0.183 0.183 0.199 0.199 0.208 ... $ sigma:'data.frame': 1802124 obs. of 1 variable: ..$ value: num -7.36 -7.36 -7.01 -7.01 -6.82 ... $ pol : Factor w/ 3 levels "hh","hv","vv": 1 1 1 1 1 1 1 1 1 1 ... $ class: chr "FR" "FR" "FR" "FR" ... – Jecogeo Mar 31 '14 at 19:42
  • > str(herb) 'data.frame': 1960371 obs. of 6 variables: $ id : int 1802125 1802126 1802127 1802128 1802129 1802130 1802131 1802132 1802133 1802134 ... $ image: Factor w/ 9 levels "tsx_sm_hh","tsx_sm_hv",..: 1 1 1 1 1 1 1 1 1 1 ... $ power: num 0.16 0.165 0.165 0.165 0.185 ... $ sigma:'data.frame': 1960371 obs. of 1 variable: ..$ value: num -7.96 -7.84 -7.84 -7.84 -7.32 ... $ pol : Factor w/ 3 levels "hh","hv","vv": 1 1 1 1 1 1 1 1 1 1 ... $ class: chr "HB" "HB" "HB" "HB" ... – Jecogeo Mar 31 '14 at 19:42
  • @rawr I tried binding only 2 data frames (in many combinations) and got the same error. The code I used is 'all.pol <- rbind(forest,herb)'. I'm not sure if that is enough for you. – Jecogeo Mar 31 '14 at 19:45
  • You seem to have some data frames inside your data frames. What is sigma, and how did you put it in forest and herb? You might fix it by doing something like `forest$sigma <- forest$sigma$value` – rawr Mar 31 '14 at 19:51
  • Bingo!! Nice! I'm a bit of a rookie at this, so I couldn't figure that issue out. The rbind now works. Some NAs were generated, but that's another problem that I will look into. Thanks a lot! (Sorry, I'm new to Stack Overflow too and don't know how to close this question...) – Jecogeo Mar 31 '14 at 20:01
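
For anyone landing here with the same symptom, the fix from the comments can be sketched on toy data. The values below are invented for illustration; only the names forest, herb, sigma and value come from the str output above. Whether the first rbind actually fails may depend on the R version, so it is wrapped in tryCatch rather than assumed to error.

```r
# Toy stand-ins for the real data (values are made up for illustration).
forest <- data.frame(id = 1:3, power = c(0.18, 0.18, 0.20))
herb   <- data.frame(id = 4:6, power = c(0.16, 0.165, 0.165))

# Nest a one-column data frame inside each, mimicking the 'sigma' column.
forest$sigma <- data.frame(value = c(-7.36, -7.36, -7.01))
herb$sigma   <- data.frame(value = c(-7.96, -7.84, -7.84))

# Check which columns are themselves data frames.
nested <- sapply(forest, is.data.frame)
print(nested)

# Binding with the nested column may raise the duplicated row.names error;
# capture the outcome instead of letting it stop the script.
attempt <- tryCatch(rbind(forest, herb),
                    error = function(e) conditionMessage(e))

# Flatten the nested column to a plain numeric vector, then bind.
forest$sigma <- forest$sigma$value
herb$sigma   <- herb$sigma$value
all.pol <- rbind(forest, herb)
str(all.pol)
```

After flattening, sigma is an ordinary numeric column and rbind stacks the rows as usual.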

2 Answers

I had the same error recently. In my case the problem turned out to be that one of the columns of the data frame was a list. After casting it to a basic type (e.g. numeric), rbind worked just fine.
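
A minimal sketch of that cast, on an invented data frame (the column name power and its values are only for illustration), assuming every element of the list column holds a single number:

```r
# Hypothetical data frame with a list column.
df <- data.frame(id = 1:3)
df$power <- list(0.18, 0.18, 0.20)
before <- class(df$power)   # the column is a list

# Cast the list column to a plain numeric vector.
# This assumes each list element has length 1.
df$power <- as.numeric(unlist(df$power))
after <- class(df$power)

print(c(before = before, after = after))
```

If some elements hold more than one value, unlist would return more values than there are rows, so the lengths are worth checking first.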

By the way, the row names are the "row numbers" shown to the left of the first variable. In your example they are 1, 2, 3, ... (the same as your id variable).

You can see them using rownames(df) and set them using rownames(df) <- name_vector (name_vector must have one element per row of df, and its elements must be unique).
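
A short sketch of reading and setting row names, using a made-up data frame (the names df and dup are only for illustration):

```r
df <- data.frame(id = 1:3, power = c(0.18, 0.18, 0.20))

# Automatic row names are returned as character strings.
print(rownames(df))

# Setting unique names, one per row, works.
rownames(df) <- c("a", "b", "c")

# Duplicated names are rejected; capture the error instead of stopping.
dup <- tryCatch({
  rownames(df) <- c("a", "a", "b")
  "ok"
}, error = function(e) conditionMessage(e))
print(dup)

# Assigning NULL resets the row names to the automatic sequence.
rownames(df) <- NULL
print(rownames(df))
```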

I had the same error.

My problem was that one of the columns in the data frames was itself a data frame, and I couldn't easily find the offending column.

data.table::rbindlist() helped me locate it:

library(data.table)
rbindlist(a)
# Error in rbindlist(a) : 
#   Column 25 of item 1 is length 2 inconsistent with column 1 which is length 16. Only length-1 columns are recycled.


class(a[[1]][, 25])  # "data.frame" <- this should obviously be converted to a column or removed

After removing the errant column, do.call(rbind, a) worked as expected.

stevec
  • This was my error too, but I found it without data.table by checking str(a) – SCallan Apr 22 '21 at 16:40
  • My problem as well - `lapply(a,class)` will help find the offending columns - anything with a type `list` or `data.frame` had to be addressed for me – slammaster Oct 04 '22 at 18:37
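
The base-R checks suggested in these comments can be sketched as follows. The list a and its contents are invented for illustration, and the helper logic relies on is.list() being TRUE for both list columns and data frame columns:

```r
# Hypothetical list of data frames, each with a nested data frame column.
a <- list(
  data.frame(id = 1:2, x = c(0.1, 0.2)),
  data.frame(id = 3:4, x = c(0.3, 0.4))
)
a[[1]]$nested <- data.frame(value = c(1, 2))
a[[2]]$nested <- data.frame(value = c(3, 4))

# For each data frame, list the columns that are lists or data frames.
offending <- lapply(a, function(d) {
  names(d)[vapply(d, is.list, logical(1))]
})
print(offending)

# Flatten the nested column in every data frame, then bind the rows.
a_fixed <- lapply(a, function(d) {
  d$nested <- d$nested$value
  d
})
combined <- do.call(rbind, a_fixed)
str(combined)
```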