I have tab-delimited text files. Each file has three columns: ProbeID, Avgsignalintenities, and Pvalue. Before further analysis, I want to ensure that the data in the ProbeID column are correct. The ProbeID column in each file contains approximately 47,315 values, so I'm concerned about performance. I've included a screenshot of a single file opened in Excel. Valid files should have only 47,234 ProbeIDs.
If you need more information, I can provide it right away.
Below is a minimal example in R. I have 4 files: file1 has length 10, while the others have length 7. I want to pass all of these files together into a function and check whether they all have the same length. If they do not, the function should return a message saying that a particular file (i.e. file1) is not of equal length.
file1 <- list(
  ProbeID = c(360450, 1690139, 5420594, 3060411, 450341, 5420324, 730162, 4200739, 1090156, 7050341),
  X1234Avgintensity = c(110.3703, 469.5097, 407.557, 123.9965, 2234.529, 190.7429, 110.072, 314.7892, 153.486, 160.4385),
  X1234Pvalue = c(0.8424522, 0.01054713, 0.01450231, 0.5800923, 0, 0.1437047, 0.8477257, 0.02900461, 0.286091, 0.2406065)
)
file2 <- list(
  ProbeID = c(360450, 1690139, 5420594, 3060411, 450341, 5420324, 730162),
  X3456Avgintensity = c(110.3703, 469.5097, 407.557, 123.9965, 2234.529, 190.7429, 110.072),
  X3456Pvalue = c(0.8424522, 0.01054713, 0.01450231, 0.5800923, 0, 0.1437047, 0.8477257)
)
file3 <- list(
  ProbeID = c(360450, 1690139, 5420594, 3060411, 450341, 5420324, 730162),
  X678Avgintensity = c(66.78696, 160.4022, 207.996, 80.48443, 1187.988, 91.58123, 85.80681),
  X678Pvalue = c(0.9538563, 0.02768622, 0.01450231, 0.6031641, 0, 0.313118, 0.444298)
)
file4 <- list(
  ProbeID = c(360450, 1690139, 5420594, 3060411, 450341, 5420324, 730162),
  X8701Avgintensity = c(83.57081, 141.5529, 238.9153, 98.10896, 1060.654, 97.65002, 83.88175),
  X8701Pvalue = c(0.814766, 0.03493738, 0.005273566, 0.3651945, 0, 0.3750824, 0.808174)
)
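
To make the desired behaviour concrete, here is a rough sketch of the kind of check I have in mind. The function name check_probe_lengths and its arguments (files, id_col, expected) are hypothetical, and it assumes each file has already been read into a list or data frame with a ProbeID element. When no expected length is given, it treats the most common length as the reference, which is only an assumption.

```r
# Hypothetical helper: report which inputs have a different number of ProbeIDs.
# `files` is assumed to be a list of lists/data frames, each with an `id_col` element.
check_probe_lengths <- function(files, id_col = "ProbeID", expected = NULL) {
  # Fall back to generic names so offending inputs can always be reported.
  if (is.null(names(files))) {
    names(files) <- paste0("file", seq_along(files))
  }

  # Number of ProbeIDs per input, as a named integer vector.
  lens <- vapply(files, function(f) length(f[[id_col]]), integer(1))

  # Reference length: the caller's expected value, else the most common length.
  if (is.null(expected)) {
    counts <- table(lens)
    ref_len <- as.integer(names(counts)[which.max(counts)])
  } else {
    ref_len <- as.integer(expected)
  }

  bad <- names(lens)[lens != ref_len]
  if (length(bad) == 0) {
    message("All files have ", ref_len, " ProbeIDs.")
  } else {
    message("Not of equal length (expected ", ref_len, "): ",
            paste0(bad, " (", lens[bad], ")", collapse = ", "))
  }

  # Return the names of the offending inputs invisibly for further use.
  invisible(bad)
}

# Toy stand-ins with the same lengths as the example (10, 7, 7, 7).
toy <- list(
  file1 = list(ProbeID = 1:10),
  file2 = list(ProbeID = 1:7),
  file3 = list(ProbeID = 1:7),
  file4 = list(ProbeID = 1:7)
)
bad <- check_probe_lengths(toy)
```

With the example data above, the call would be check_probe_lengths(list(file1 = file1, file2 = file2, file3 = file3, file4 = file4)). For the real files, passing expected = 47234 would compare every file against the valid ProbeID count instead of the most common length.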