I'm trying to write a program in R using for loops and if statements.

I have a data set with 17 rows and 1091 columns (variables).

I want to compare the values in the 17th row and put the columns that share the same value into one data frame, so that I can process them afterwards. The algorithm I thought of has the following steps:

1- Take the column I want to compare and put it in a new data frame (Sub_data)
2- Compare the value in the 17th row of this column with the 17th-row values of all the other columns in the original data (All_data)

3- When the value of the column equals the value of another column (B), take column B and add it to the data frame (Sub_data)

4- After that, compare the variation of the variables in Sub_data (which contains the columns with the same 17th-row value), choose one of the columns that have the same variation, and eliminate the others
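
A minimal sketch of steps 1 to 3 without explicit loops, run on a made-up data frame rather than the real data. The names `toy`, `key`, `groups`, and `sub_list` are hypothetical, and the sketch assumes that "same value" means exact equality of the last-row entries; real values may need rounding first.

```r
# Toy data: 3 rows, 4 columns; the last row plays the role of the 17th row ("rt").
toy <- data.frame(
  A = c(1, 2, 10),
  B = c(3, 4, 20),
  C = c(5, 6, 10),
  D = c(7, 8, 20)
)

# Last-row value of every column, as a named numeric vector.
key <- unlist(toy[nrow(toy), ])

# Group the column names by that value.
groups <- split(names(toy), key)

# Build one sub data frame per distinct last-row value.
sub_list <- lapply(groups, function(cols) toy[, cols, drop = FALSE])

print(groups)
```

Each element of `sub_list` then holds the columns that share one last-row value, which is the input that step 4 would work on.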

Here are the rows and the first two columns of my data (All_data):

                         MT95T843               MT95T756
QC_G.F9_01_4768           70027.0213162601      95774.1359666849
QC_G.F9_01_4765           69578.1863357392      81479.2957458262
QC_G.F9_01_4762           69578.1863357392      87021.9542724389
QC_G.F9_01_4759           68231.1433794304      95558.7673782843
QC_G.F9_01_4756           64874.1293568862      96780.772452217
QC_G.F9_01_4753           63866.6577969569      91854.3530432699
CtrF01R5_G.D1_01_4757     66954.3879935821      128861.361627886
CtrF01R4_G.D5_01_4763     97352.5522885788      101353.25926633
CtrF01R3_G.C8_01_4754     61311.7857641721      7603.60895516428
CtrF01R2_G.D3_01_4760     85768.3611731878      109461.75444564
CtrF01R1_G.C9_01_4755     85302.8194715206      104253.845374077
BtiF01R5_G.D7_01_4766     61252.4254487766      115683.737549183
BtiF01R4_G.D6_01_4764     81873.9637852956      112164.142293011
BtiF01R3_G.D2_01_4758     84981.2191408476       0
BtiF01R2_G.D4_01_4761     36629.0246187626      124806.491006666
BtiF01R1_G.D8_01_4767      0                    109927.264246577
rt                        13.9018138671285      13.9058590777331
  • Code for the input data frame:
df1 <- data.frame(Name  = c("QC_G.F9_01_4768", "QC_G.F9_01_4765", "QC_G.F9_01_4762", "QC_G.F9_01_4759", "QC_G.F9_01_4756", "QC_G.F9_01_4753",
                            "CtrF01R5_G.D1_01_4757", "CtrF01R4_G.D5_01_4763", "CtrF01R3_G.C8_01_4754", "CtrF01R2_G.D3_01_4760", "CtrF01R1_G.C9_01_4755",
                            "BtiF01R5_G.D7_01_4766", "BtiF01R4_G.D6_01_4764", "BtiF01R3_G.D2_01_4758", "BtiF01R2_G.D4_01_4761", "BtiF01R1_G.D8_01_4767",
                            "rt"),
                  MT95T843 = c(70027.0213162601, 69578.1863357392, 69578.1863357392, 68231.1433794304, 64874.1293568862, 63866.6577969569, 66954.3879935821,
                               97352.5522885788, 61311.7857641721, 85768.3611731878, 85302.8194715206, 61252.4254487766, 81873.9637852956, 84981.2191408476,
                               36629.0246187626, 0, 13.9018138671285),
                  MT95T756 = c(95774.1359666849, 81479.2957458262, 87021.9542724389, 95558.7673782843, 96780.772452217, 91854.3530432699, 128861.361627886,
                               101353.25926633, 7603.60895516428, 109461.75444564, 104253.845374077, 115683.737549183, 112164.142293011, 0, 124806.491006666,
                               109927.264246577, 13.9058590777331))

                                          
df1
#>                     Name    MT95T843     MT95T756
#> 1        QC_G.F9_01_4768 70027.02132  95774.13597
#> 2        QC_G.F9_01_4765 69578.18634  81479.29575
#> 3        QC_G.F9_01_4762 69578.18634  87021.95427
#> 4        QC_G.F9_01_4759 68231.14338  95558.76738
#> 5        QC_G.F9_01_4756 64874.12936  96780.77245
#> 6        QC_G.F9_01_4753 63866.65780  91854.35304
#> 7  CtrF01R5_G.D1_01_4757 66954.38799 128861.36163
#> 8  CtrF01R4_G.D5_01_4763 97352.55229 101353.25927
#> 9  CtrF01R3_G.C8_01_4754 61311.78576   7603.60896
#> 10 CtrF01R2_G.D3_01_4760 85768.36117 109461.75445
#> 11 CtrF01R1_G.C9_01_4755 85302.81947 104253.84537
#> 12 BtiF01R5_G.D7_01_4766 61252.42545 115683.73755
#> 13 BtiF01R4_G.D6_01_4764 81873.96379 112164.14229
#> 14 BtiF01R3_G.D2_01_4758 84981.21914      0.00000
#> 15 BtiF01R2_G.D4_01_4761 36629.02462 124806.49101
#> 16 BtiF01R1_G.D8_01_4767     0.00000 109927.26425
#> 17                    rt    13.90181     13.90586

I'm stuck at the third step, where I get this error message:

Error in Sub_data[1, i] : subscript out of bounds
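
For reference, this is the message R reports when a matrix is indexed past its last column. The sketch below only illustrates that one case: the one-column matrix `m` and the index `i` are made up, and whether the real `Sub_data` was a matrix at that point is an assumption.

```r
# A one-column matrix standing in for a one-column Sub_data (assumption).
m <- matrix(c(1, 2, 3), ncol = 1)

i <- 2  # hypothetical column index past the last column

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  m[1, i],
  error = function(e) conditionMessage(e)
)
print(msg)

# A simple guard: only index when the column exists.
value <- if (i <= ncol(m)) m[1, i] else NA
```

Checking `ncol()` of the object just before the failing line is a quick way to see whether the index has run past the available columns.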

Here's the code I used:

library("readxl")
library("janitor")

All_data <- read_excel("DataMatrix_Excel.xlsx")
dim(All_data)
#> [1]   17 1091
for (i in 1:1091) {
  # Add column
  Sub_data <- cbind(All_data[, 1, drop = FALSE])
  for (j in 2:1091) {
    if (Sub_data[17, 1] == All_data[17, j]) {
      Sub_data <- cbind(Sub_data, All_data[, j, drop = FALSE])
      # I added this line just to see if my code works
      print(paste("The dim is", dim(Sub_data)))
    }
  }
}
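
A minimal sketch of how a nested loop for steps 1 to 3 might reference `i`, run on a made-up data frame instead of the Excel file. The names `toy`, `last`, `sub`, and `result` are hypothetical. It assumes the goal is one sub data frame per starting column, holding every other column whose last-row value is exactly equal.

```r
# Toy stand-in for All_data; the last row plays the role of row 17.
toy <- data.frame(A = c(1, 10), B = c(2, 20), C = c(3, 10), D = c(4, 20))
last <- nrow(toy)

result <- list()
for (i in seq_len(ncol(toy))) {
  # Start from the i-th column, not always from column 1.
  sub <- toy[, i, drop = FALSE]
  for (j in seq_len(ncol(toy))) {
    # Skip the column itself; append columns with an equal last-row value.
    if (j != i && toy[last, i] == toy[last, j]) {
      sub <- cbind(sub, toy[, j, drop = FALSE])
    }
  }
  # Store the result under the name of the starting column.
  result[[names(toy)[i]]] <- sub
}

print(names(result$A))
```

Using `seq_len(ncol(toy))` instead of a hard-coded `1:1091` keeps the loop bounds tied to the actual number of columns.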

Please tell me if you need any more information or clarification, and please let me know if you have any suggestions. Thank you very much.

Reda
  • Welcome to SO, Anouar! Please make this question *reproducible*. This includes sample code you've attempted (including listing non-base R packages, and any errors/warnings received), sample *unambiguous* data (e.g., `data.frame(x=...,y=...)` or the output from `dput(head(x))`), and intended output given that input. Refs: https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info. – r2evans Oct 22 '21 at 20:14
  • I edited my question; I hope it's reproducible now. Thank you. – Reda Oct 22 '21 at 20:21
  • It is not reproducible. Are you imagining that we have access to your `"DataMatrix_Excel.xlsx"` file or know what it looks like? Please read the links I gave, they have some good discussions on providing unambiguous, *sample* data to your questions. Thanks. – r2evans Oct 22 '21 at 20:23
  • Thank you, I added a screenshot and other information about the data. – Reda Oct 22 '21 at 20:31
  • The third of the links I provided says *"Do not embed pictures for data or code"*; it breaks screen-readers and it cannot be copied or searched (ref: https://meta.stackoverflow.com/a/285557 and https://xkcd.com/2116/). Please just include the code, console output, or data (e.g., `data.frame(...)` or the output from `dput(head(x))`) directly. (***Really***, please go through the links. Asking questions in a way that makes it easy for others to help *is to your advantage*. Make it easier and others are more likely to give it a try. Thanks!) – r2evans Oct 22 '21 at 20:34
  • I added an example of my data; I hope it's clear now. Thank you very much for your clarifications. – Reda Oct 22 '21 at 20:46
  • Your error code mentions `Sub_data_3[1,i]`; discarding the `_3`, none of *this* code subsets `Sub_data` by `i`-columns. – r2evans Oct 22 '21 at 20:56
  • I corrected it, thank you. – Reda Oct 22 '21 at 20:59
  • I'm just really confused. The error says that somewhere, somebody (you or otherwise) tried to subset `Sub_data` on the `i`th column, but in your code, you create `Sub_data` as a single-column frame, and you only reference it wholly from there on out. Perhaps related, but you have `for (i in 1:1091)` *and you never reference `i`*. – r2evans Oct 22 '21 at 22:46

0 Answers