4

I have a data.frame:

     target_id sample1 sample10 sample100 sample101 sample102 sample103
1: ENST00000000233       9        0   3499.51         0         0         0
2: ENST00000000412       0        0      0.00         0         0         0
3: ENST00000000442       0        0      0.00         0         0         0
4: ENST00000001008       0        0      0.00         0         0         0
5: ENST00000001146       0        0      0.00         0         0         0
6: ENST00000002125       0        0      0.00         0         0         0

I would like to convert it to another data.frame in which the target_id column becomes the row names. Specifically, I want to perform clustering on the numerical data (the sample columns) and still be able to look up the corresponding transcript IDs (for example, ENST00000000233):

                sample1 sample10 sample100 sample101 sample102 sample103
ENST00000000233       9        0   3499.51         0         0         0
ENST00000000412       0        0      0.00         0         0         0
ENST00000000442       0        0      0.00         0         0         0
ENST00000001008       0        0      0.00         0         0         0
ENST00000001146       0        0      0.00         0         0         0
ENST00000002125       0        0      0.00         0         0         0

Is it possible to create such a data.frame in R?

M--
Olha Kholod

3 Answers

10

It can be achieved without defining a new variable:

df1 <- data.frame(df1[,-1], row.names = df1[,1])


#                 sample1 sample10 sample100 sample101 sample102 sample103 
# ENST00000000233       9        0   3499.51         0         0         0 
# ENST00000000412       0        0      0.00         0         0         0 
# ENST00000000442       0        0      0.00         0         0         0 
# ENST00000001008       0        0      0.00         0         0         0 
# ENST00000001146       0        0      0.00         0         0         0 
# ENST00000002125       0        0      0.00         0         0         0
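Two caveats about the one-liner above may be worth keeping in mind; this is a minimal sketch, not part of the original answer. With only one sample column, `df1[, -1]` drops to a plain vector, and `data.frame()` rewrites non-syntactic column names unless `check.names = FALSE` is passed. The toy `df1` below and its column name `sample 1` are made up for illustration.

```r
# Hypothetical toy input: one ID column and a single, non-syntactic sample column.
df1 <- data.frame(target_id = c("ENST00000000233", "ENST00000000412"),
                  `sample 1` = c(9, 0),
                  check.names = FALSE,
                  stringsAsFactors = FALSE)

# drop = FALSE keeps the remaining column as a data.frame (not a vector),
# and check.names = FALSE keeps the column name exactly as it was.
out <- data.frame(df1[, -1, drop = FALSE],
                  row.names = df1[, 1],
                  check.names = FALSE)

names(out)     # column name preserved
rownames(out)  # transcript IDs
```

With several syntactically valid sample columns, as in the question, the shorter form in the answer behaves the same way.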
M--
  • Thank you for your suggestion, but unfortunately I got the following error: `Error in df[, 1] : object of type 'closure' is not subsettable` – Olha Kholod Aug 05 '17 at 22:12
  • @OlhaKholod, `object of type 'closure'` means a function. You are using the name of a `base R` function, `df`, the density of the `F` distribution. Change the name of your `data.frame`. Since you've mentioned package `data.table`, you should also avoid `dt`, for the very same reason. – Rui Barradas Aug 06 '17 at 10:56
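The closure error described in the comments can be reproduced in a fresh session. The sketch below assumes no object named `df` has been created, so the name resolves to `stats::df`; the name `expr_df` is only an illustrative replacement.

```r
# With no user-defined `df`, the name refers to stats::df, the density
# function of the F distribution.
is.function(df)

# Subsetting a function raises the error quoted in the comment above;
# tryCatch captures the message instead of stopping the script.
msg <- tryCatch(df[, 1], error = function(e) conditionMessage(e))
msg

# A name that does not clash with an existing function avoids the problem.
expr_df <- data.frame(target_id = c("ENST00000000233", "ENST00000000412"),
                      sample1 = c(9, 0),
                      stringsAsFactors = FALSE)
expr_df[, 1]
```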
6

First your data example.

mydf <-
structure(list(target_id = c("ENST00000000233", "ENST00000000412", 
"ENST00000000442", "ENST00000001008", "ENST00000001146", "ENST00000002125"
), sample1 = c(9L, 0L, 0L, 0L, 0L, 0L), sample10 = c(0L, 0L, 
0L, 0L, 0L, 0L), sample100 = c(3499.51, 0, 0, 0, 0, 0), sample101 = c(0L, 
0L, 0L, 0L, 0L, 0L), sample102 = c(0L, 0L, 0L, 0L, 0L, 0L), sample103 = c(0L, 
0L, 0L, 0L, 0L, 0L)), .Names = c("target_id", "sample1", "sample10", 
"sample100", "sample101", "sample102", "sample103"), class = "data.frame", row.names = c("1:", 
"2:", "3:", "4:", "5:", "6:"))

Now the code.

result <- mydf[-1]
row.names(result) <- mydf$target_id
result
                sample1 sample10 sample100 sample101 sample102 sample103
ENST00000000233       9        0   3499.51         0         0         0
ENST00000000412       0        0      0.00         0         0         0
ENST00000000442       0        0      0.00         0         0         0
ENST00000001008       0        0      0.00         0         0         0
ENST00000001146       0        0      0.00         0         0         0
ENST00000002125       0        0      0.00         0         0         0

Simple, no?
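Since the question mentions clustering, here is a minimal sketch of how the row names carry through. The choice of `hclust()` with Euclidean distance and `k = 2` is an assumption made for illustration; the question does not say which clustering method is intended, and only two of the sample columns are reproduced to keep the example short.

```r
# Rebuild a small version of `result` with transcript IDs as row names.
ids <- c("ENST00000000233", "ENST00000000412", "ENST00000000442",
         "ENST00000001008", "ENST00000001146", "ENST00000002125")
result <- data.frame(sample1   = c(9, 0, 0, 0, 0, 0),
                     sample100 = c(3499.51, 0, 0, 0, 0, 0),
                     row.names = ids)

# dist() keeps the row names as labels, so hclust() and cutree() do too.
hc <- hclust(dist(result))
groups <- cutree(hc, k = 2)   # k = 2 is an arbitrary choice for this sketch

# The cluster assignments are a named vector keyed by transcript ID.
groups
names(groups)[groups == groups[["ENST00000000233"]]]
```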

Rui Barradas
  • Thank you for your answer! When I ran `row.names(result) <- mydf$target_id`, I got an error: `Error in row.names<-.data.frame(*tmp*, value = c("ENST00000000233", : invalid 'row.names' length` – Olha Kholod Aug 05 '17 at 22:28
  • I fixed this error. My data.frame also had the data.table class, so I converted it to a plain data.frame. – Olha Kholod Aug 05 '17 at 22:38
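A likely explanation for the `invalid 'row.names' length` error in the comments: on a data.table, `x[-1]` drops the first row, whereas on a plain data.frame it drops the first column. The sketch below assumes the data.table package is installed; the object names `dt`, `plain`, and `result` are illustrative.

```r
library(data.table)

ids <- c("ENST00000000233", "ENST00000000412", "ENST00000000442")
dt <- data.table(target_id = ids,
                 sample1   = c(9, 0, 0),
                 sample10  = c(0, 0, 0))

# On a data.table, a single negative index selects rows, not columns:
# one row is removed and all three columns remain.
dropped_row <- dt[-1]
dim(dropped_row)

# Converting to a plain data.frame first restores the behaviour the answer relies on.
plain <- as.data.frame(dt)
result <- plain[-1]                  # drops the target_id column
row.names(result) <- plain$target_id
result
```

`data.table::setDF()` would do the same conversion by reference, if modifying the object in place is acceptable.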
6

Here is an option using tidyverse

library(tidyverse)
df1 %>%
     remove_rownames() %>%
     column_to_rownames(var = 'target_id')
#                sample1 sample10 sample100 sample101 sample102 sample103
#ENST00000000233       9        0   3499.51         0         0         0
#ENST00000000412       0        0      0.00         0         0         0
#ENST00000000442       0        0      0.00         0         0         0
#ENST00000001008       0        0      0.00         0         0         0
#ENST00000001146       0        0      0.00         0         0         0
#ENST00000002125       0        0      0.00         0         0         0
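As a side note, `remove_rownames()` and `column_to_rownames()` are exported by the tibble package, so loading tibble alone should be enough. The sketch below assumes tibble is installed and uses a made-up two-row `mydf` whose existing row names mimic the `1:`, `2:` labels in the question; `remove_rownames()` clears them before the ID column is promoted.

```r
library(tibble)

# Hypothetical small input that already carries character row names.
mydf <- data.frame(target_id = c("ENST00000000233", "ENST00000000412"),
                   sample1   = c(9, 0),
                   row.names = c("1:", "2:"),
                   stringsAsFactors = FALSE)

has_rownames(mydf)                      # the frame starts with row names
cleared <- remove_rownames(mydf)
has_rownames(cleared)                   # row names are gone after clearing

out <- column_to_rownames(cleared, var = "target_id")
out
```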
akrun