How to decrease the dimensionality of dataset by creating a new column

Question

I'm looking to find a way to decrease the dimensionality of a dataset by creating a new column.

My original dataset is like this

Gene  ID1   ID2   ID3 ... ID1000
 
ACE    1     1     2 ...    0

GBA    0     1     2 ...    1

...

However, as you can tell, the dimensionality of the dataset is too high that I can't import that table into SQL database.

That's why I want to transform the dataset into the following form

Gene   ID    Genotype

ACE     1       1

ACE     2       1

ACE     3       2

...

ACE    1000     0

GBA.    0       0

GBA.    1       1

.....

Basically, I'm trying to create a new column called ID so that I can decrease the number of columns even though this will increase the number of rows...

What's the best way of doing this in R?

Sorry about the format, I don't know how to insert a table properly here....

score 1 · Answer 1 · answered Feb 22 '21 at 21:02

1

We can use

library(dplyr)
library(tidyr)
df1 %>%
   pivot_longer(cols = -Gene, names_to = 'ID', values_to = 'Genotype')

answered Feb 22 '21 at 21:02

akrun

874,273
37
540
662

score 1 · Answer 2 · answered Feb 22 '21 at 22:03

1

A base R option using reshape may help

reshape(
  setNames(df,gsub("ID","ID.",names(df))),
  direction = "long",
  idvar = "Gene",
  varying = -1
)

answered Feb 22 '21 at 22:03

ThomasIsCoding

96,636
9
24
81

How to decrease the dimensionality of dataset by creating a new column

2 Answers2