
I have a data frame with dimensions 3695 x 20. The first column contains alphanumeric identifiers; the other 19 columns are all numeric. So `rownames(df)` gives the numbers 1-3695, `colnames(df)` gives the column names, and `df[,1]` gives the alphanumeric identifiers.

I would like to convert the data frame to a matrix, using column 1 of the existing data frame as the row names of the new matrix and keeping the data frame's column names as the matrix's column names.

I would also like to automate this process for use with data frames of similar but different dimensions. So, if the solution requires knowing the number of rows and/or columns, how can I get this information into the code without having to look it up by hand?

I have looked at `data.matrix` and reshape2 but cannot figure out how to do what I want.

Matthew
  • Wait, you said the first column contains identifiers, but also that `rownames(df)` gives identifiers. These are not the same thing. `rownames()` are not stored as a column; they are stored as an attribute. So what's the case with your data? Have you simply duplicated the information? You really should add a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) to make it clear what your input is and what your desired output is. You can use `dim()`, `nrow()`, or `ncol()` to get the dimensions of a data.frame if that's the only question. – MrFlick Sep 05 '14 at 21:56
  • Sorry, my mistake, rownames(df) is just a list of numbers; 1-3695. – Matthew Sep 05 '14 at 22:37
  • If there is a mistake in your question, you should edit it. Again, it's not fun to answer without a reproducible example because we just end up wasting a lot of time creating one and ours may or may not actually match your data. See [how to ask a good question](http://stackoverflow.com/help/how-to-ask). – MrFlick Sep 05 '14 at 22:47
  • For simplicity's sake, here is a simpler 4 x 4 version of my larger data frame: `structure(list(gene = c("AT1G01040", "AT1G01270", "AT1G01471", "AT1G01680"), log2.fold_change._Mer7_2.1_Mer7_2.2 = c(0, 0, 0, 0), log2.fold_change._Mer7_1.2_W29_S226A_1 = c(0, 0, -1.14, 0 ), log2.fold_change._Mer7_1.2_W29_1 = c(0, 0, 0, 0)), .Names = c("gene", "log2.fold_change._Mer7_2.1_Mer7_2.2", "log2.fold_change._Mer7_1.2_W29_S226A_1", "log2.fold_change._Mer7_1.2_W29_1"), row.names = c(NA, 4L), class = "data.frame")` – Matthew Sep 05 '14 at 22:48

2 Answers


With your sample data

X<-structure(list(gene = c("AT1G01040", "AT1G01270", "AT1G01471", "AT1G01680"), log2.fold_change._Mer7_2.1_Mer7_2.2 = c(0, 0, 0, 0), log2.fold_change._Mer7_1.2_W29_S226A_1 = c(0, 0, -1.14, 0 ), log2.fold_change._Mer7_1.2_W29_1 = c(0, 0, 0, 0)), .Names = c("gene", "log2.fold_change._Mer7_2.1_Mer7_2.2", "log2.fold_change._Mer7_1.2_W29_S226A_1", "log2.fold_change._Mer7_1.2_W29_1"), row.names = c(NA, 4L), class = "data.frame")

You can write a simple helper function to create a matrix and set the right names

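matrix.please <- function(x) {
    # Convert everything except the first column;
    # drop = FALSE keeps a data frame even if only one column remains
    m <- as.matrix(x[, -1, drop = FALSE])
    # Use the first column (the identifiers) as the row names
    rownames(m) <- x[[1]]
    m
}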

and you would use it like

M <- matrix.please(X)
str(M)
#  num [1:4, 1:3] 0 0 0 0 0 0 -1.14 0 0 0 ...
#  - attr(*, "dimnames")=List of 2
#   ..$ : chr [1:4] "AT1G01040" "AT1G01270" "AT1G01471" "AT1G01680"
#   ..$ : chr [1:3] "log2.fold_change._Mer7_2.1_Mer7_2.2" "log2.fold_change._Mer7_1.2_W29_S226A_1" "log2.fold_change._Mer7_1.2_W29_1"

So we have a 4 x 3 matrix with the correct row and column names.
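The helper never refers to the number of rows or columns, so the same call should work on data frames of other sizes. Here is a minimal sketch, assuming a made-up data frame `Y` (not from the question) with different dimensions. The helper is redefined so the block runs on its own, and `dim()` reports the size at run time:

```r
# Helper redefined here so this block is self-contained
matrix.please <- function(x) {
    m <- as.matrix(x[, -1, drop = FALSE])
    rownames(m) <- x[[1]]
    m
}

# Y is hypothetical illustration data: 5 rows, 1 id column, 2 numeric columns
Y <- data.frame(
    id = c("a1", "b2", "c3", "d4", "e5"),
    s1 = c(1.5, 0, -2, 0, 3),
    s2 = c(0, 0, 0.25, 1, 0),
    stringsAsFactors = FALSE
)

N <- matrix.please(Y)

dim(N)        # dimensions are queried, never typed in by hand
rownames(N)   # taken from the first column of Y
colnames(N)   # carried over from the remaining columns of Y
```

If the dimensions are needed elsewhere in a script, `nrow(N)` and `ncol(N)` return them individually.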

MrFlick

Using `tibble::column_to_rownames()` from the tidyverse and the native pipe (`|>`, which requires R 4.1 or later):

X<-structure(list(gene = c("AT1G01040", "AT1G01270", "AT1G01471", "AT1G01680"), log2.fold_change._Mer7_2.1_Mer7_2.2 = c(0, 0, 0, 0), log2.fold_change._Mer7_1.2_W29_S226A_1 = c(0, 0, -1.14, 0 ), log2.fold_change._Mer7_1.2_W29_1 = c(0, 0, 0, 0)), .Names = c("gene", "log2.fold_change._Mer7_2.1_Mer7_2.2", "log2.fold_change._Mer7_1.2_W29_S226A_1", "log2.fold_change._Mer7_1.2_W29_1"), row.names = c(NA, 4L), class = "data.frame")

M <- X |>
  tibble::column_to_rownames(var = "gene") |>
  as.matrix() 

str(M)
#>  num [1:4, 1:3] 0 0 0 0 0 0 -1.14 0 0 0 ...
#>  - attr(*, "dimnames")=List of 2
#>   ..$ : chr [1:4] "AT1G01040" "AT1G01270" "AT1G01471" "AT1G01680"
#>   ..$ : chr [1:3] "log2.fold_change._Mer7_2.1_Mer7_2.2" "log2.fold_change._Mer7_1.2_W29_S226A_1" "log2.fold_change._Mer7_1.2_W29_1"

Created on 2023-08-21 with reprex v2.0.2
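One thing to watch when reusing either approach on other data frames: `as.matrix()` returns a character matrix if any of the remaining columns is not numeric. Below is a rough base R sketch of a guard against that. The function name `numeric_matrix_from_df`, its `id_col` argument, and the data frame `Z` are all invented for illustration:

```r
# Hypothetical helper (name and arguments invented for this sketch).
# id_col is the integer position of the identifier column.
numeric_matrix_from_df <- function(x, id_col = 1L) {
    ids  <- as.character(x[[id_col]])
    vals <- x[, -id_col, drop = FALSE]

    # Refuse to convert if any value column is non-numeric, because
    # as.matrix() would otherwise silently produce a character matrix
    is_num <- vapply(vals, is.numeric, logical(1))
    if (!all(is_num)) {
        stop("non-numeric columns: ",
             paste(names(vals)[!is_num], collapse = ", "))
    }

    m <- as.matrix(vals)
    rownames(m) <- ids
    m
}

# Z is made-up data with one stray character column
Z <- data.frame(
    gene = c("g1", "g2"),
    fc   = c(0, -1.14),
    note = c("x", "y"),
    stringsAsFactors = FALSE
)

ok  <- numeric_matrix_from_df(Z[, c("gene", "fc")])   # numeric 2 x 1 matrix
bad <- tryCatch(numeric_matrix_from_df(Z),
                error = function(e) conditionMessage(e))
```

Here `bad` holds the error message rather than a matrix, which makes the offending column easy to spot in an automated run.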

JWilliman