I have a problem in R and hope someone can help me.

I have a data frame with 400 rows and 2448 columns. I just wanted to create a new data frame with the columns in a new order, so I wrote the following script:

Daten_ausgewaehlt_weniger_sort <- Daten_ausgewaehlt_weniger[,c("id", "p_sex", "p_age", "p_single", "p_mothertongue", "p_mothertongue_other", 
"p_courseofstudies_yes_no", "p_courseofstudies", "p_occupation")]

(The only difference is that my real script lists all 2448 columns in the c() vector, not just the 9 shown here.)

When I try to run this, I get this error message:

Error in [.data.frame(Daten_ausgewaehlt_weniger, , c("id", "p_sex", : undefined columns selected

I searched for misspelled column names with this call:

setdiff(names(Daten_ausgewaehlt_weniger), c("id", "p_sex", "p_age", "p_single", "p_mothertongue", "p_mothertongue_other", 
"p_courseofstudies_yes_no", "p_courseofstudies", "p_occupation"))

At first this found one wrong column name. I corrected it, and setdiff now returns "character(0)", so all column names should be spelled correctly. But I still get the same "undefined columns selected" error.

I don't know what the mistake is. I think what I want to do is really simple, but I can't find a solution. I would be very happy if anyone could help me or suggest something I could try.

Thanks in advance (and sorry for my English)!

esia_1
  • This thread may help: https://stackoverflow.com/questions/5620885/how-does-one-reorder-columns-in-a-data-frame – roarkz Jul 18 '17 at 19:52
  • Is there any *systematic* relationship between how the columns are arranged now and how you want them to be arranged in the future? Writing out 2448 columns is going to be fraught with error and also very long code. – Mark White Jul 18 '17 at 20:12
  • No, unfortunately there is no systematic relationship. I already have all 2448 column names written out (I did not really type them all; I used dput(colnames(Daten_ausgewaehlt_weniger)), which I found in another thread here, and then just rearranged the names). But now that I have them, it is really difficult to find any mistake in them. – esia_1 Jul 18 '17 at 20:42

1 Answer

This will probably help:

    library(data.table)

    # 1. Convert the data frame to a data.table (done by reference, no assignment needed)
    setDT(Daten_ausgewaehlt_weniger)

    # 2. Select the columns by name in the new order
    Daten_ausgewaehlt_weniger_sort <- Daten_ausgewaehlt_weniger[, c("id", "p_sex", "p_age", "p_single", "p_mothertongue", "p_mothertongue_other",
        "p_courseofstudies_yes_no", "p_courseofstudies", "p_occupation")]

    # An alternative, and probably easier, way is to select by column position,
    # i.e. by each column's index in the original Daten_ausgewaehlt_weniger
    Daten_ausgewaehlt_weniger_sort <- Daten_ausgewaehlt_weniger[, c(3, 5, 1, 14:19)]

    # 3. You can also use setcolorder(), which reorders the columns in place
    #    (in recent data.table versions, columns not listed follow the listed ones)
    setcolorder(Daten_ausgewaehlt_weniger, c(3, 5, 1, 14:19))

    # One way to see the current position of each column:
    as.list(names(Daten_ausgewaehlt_weniger))  # prints every column name next to its index
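
If the error persists, it may be worth checking the names in the other direction. `setdiff(names(df), wanted)` only reports columns of the data frame that are absent from the vector, whereas "undefined columns selected" is raised when the vector contains a name that the data frame lacks; `setdiff(wanted, names(df))` would show those. A minimal base-R sketch, assuming a toy data frame `df` and a hand-edited vector `wanted` (both hypothetical stand-ins for the real objects):

```r
# Toy stand-in for the real data frame (hypothetical columns).
df <- data.frame(id = 1:3, p_sex = c("f", "m", "f"), p_age = c(21, 34, 28))

# Hypothetical hand-edited name vector: one typo and one trailing space.
wanted <- c("p_age", "id", "p_sexx", "p_sex ")

# Names requested but absent from the data frame: these trigger the error.
missing_cols <- setdiff(wanted, names(df))

# Columns of the data frame that the vector leaves out
# (this is what the check in the question tests).
forgotten_cols <- setdiff(names(df), wanted)

# Names that appear more than once in the vector.
duplicated_cols <- wanted[duplicated(wanted)]

# Reorder only once no requested name is missing.
if (length(missing_cols) == 0) {
  df_sorted <- df[, wanted]
}
```

With 2448 names, printing `missing_cols` should narrow the search to the few entries that need fixing; stray whitespace from editing the `dput()` output is one possible cause that is hard to spot by eye.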
Daniel
  • You don't need to assign `dat <- setDT(dat)`, just calling `setDT(dat)` does this *by reference*. – juan Jul 18 '17 at 20:19