I am new to R and have been trying to adapt some published code to analyze gene expression and genetic mutation status in order to predict outcomes in breast cancer patients.
The original code was published in Nature Communications for acute myeloid leukemia data sets and can be downloaded from: http://www.nature.com/ncomms/2015/150109/ncomms6901/full/ncomms6901.html
I am following the code in Supplementary Data 4.
I am unable to replicate their analysis because I get an error when the downloaded data is combined into a single data.frame.
I am able to load all of my data from cBioportal using the following code:
library(cgdsr) ## provides CGDS(), getCancerStudies(), getCaseLists(), getProfileData()
mycgds <- CGDS("http://www.cbioportal.org/public-portal/")
brca_tcga <- getCancerStudies(mycgds)[15,1] ## 15 for BRCA
cases <- getCaseLists(mycgds,brca_tcga)[8,1] ## 8 for RNA expression z scores
g <- lapply(split(as.numeric(entrez), seq_along(entrez)%/%500), function(genes) getProfileData(mycgds,genes,getGeneticProfiles(mycgds,brca_tcga)[2,1],cases)) ## queries the genes in chunks of up to 500; "g" is a list with one data.frame per chunk
Then I try to run the following code:
g <- do.call("cbind", g)
which yields an error:
> g <- do.call("cbind", g)
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 173, 0
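For reference, the same kind of message can be reproduced with toy data when one element of the list has zero rows. The sketch below is only an illustration under that assumption: g_toy, the sample names, and the gene names are made up (not cBioPortal output), and I have not confirmed that an empty chunk is what actually happens with my real "g".

```r
# Toy stand-in for the list returned by lapply(): two chunks with 173 rows
# and one chunk with 0 rows. All names and values here are hypothetical.
samples <- paste0("sample", seq_len(173))
g_toy <- list(
  "0" = data.frame(GENE_A = rnorm(173), row.names = samples),
  "1" = data.frame(GENE_B = rnorm(173), row.names = samples),
  "2" = data.frame(GENE_C = numeric(0))  # an empty chunk
)

# Row count of every chunk; a 0 next to 173 is the mismatch cbind() reports.
n_rows <- sapply(g_toy, nrow)
print(n_rows)

# Combining the full list fails, so the call is wrapped in try().
res <- try(do.call("cbind", g_toy), silent = TRUE)
print(inherits(res, "try-error"))

# Dropping the empty chunks lets the remaining ones be combined.
g_ok <- g_toy[n_rows > 0]
g_combined <- do.call("cbind", g_ok)
print(dim(g_combined))
```

Running sapply(g, nrow) on the real list should show, in the same way, which chunk (if any) came back with no rows.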
I have tried to follow related threads, but some of them are over my head. I am not sure whether something went wrong when the data.frame was constructed, or where to begin fixing this issue. Any assistance would be appreciated, as would a pointer to a good document explaining what is going on.
I can print my data by calling g:
WDR38 WDR63 WDR86 ZBED9 ZCWPW2 ZNF283 ZNF300P1 ZNF418 ZNF600
TCGA.AB.2803.03 NA NA NA NA NA NA NA NA NA
TCGA.AB.2805.03 NA NA NA NA NA NA NA NA NA
TCGA.AB.2806.03 NA NA NA NA NA NA NA NA NA
TCGA.AB.2807.03 NA NA NA NA NA NA NA NA NA
TCGA.AB.2808.03 NA NA NA NA NA NA NA NA NA
That is only a small excerpt of the output, but I am unable to get past the next step of the code.
:-(
Thank you all for any assistance or education you may provide!