I have a list of data tables stored in an object ddf
(a sample is shown below):
[[43]]
V1 V2 V3
1: b c a
2: b c a
3: b c a
4: b c a
5: b b a
6: b c a
7: b c a
[[44]]
V1 V2 V3
1: a c a
2: a c a
3: a c a
4: a c a
5: a c a
[[45]]
V1 V2 V3
1: a c b
2: a c b
3: a c b
4: a c b
5: a c b
6: a c b
7: a c b
8: a c b
9: a c b
... and so on up to [[100]]
I want to subset the list ddf
so that the result contains only those data tables which:
- have at least 9 rows, and
- have all of those rows identical.
I also want to store this subsetted output.
I have written some code for this below:
for(i in 1:100){
m=(as.numeric(nrow(df[[i]]))>= 9)
if(m == TRUE & df[[i]][1,] = df[[i]][2,] =
=df[[i]][3,] =df[[i]][4,] =df[[i]][5,] =df[[i]][6,]=
df[[i]][7,]=df[[i]][8,]=df[[i]][9,]){
print(df[[i]])
}}
Please tell me what's wrong and how I can generalize this to subset based on "n" identical rows.
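For reference, a minimal sketch of one way the loop could be rewritten. In R, `=` is assignment rather than comparison (comparison is `==`), equality tests cannot be chained, and the loop appears to refer to `df` where the list is called `ddf`. The sample list and the helper name `all_rows_same` below are hypothetical, invented for illustration. Plain data frames are used so the sketch runs without extra packages; `nrow()` and `unique()` behave the same way on data.table objects.

```r
# Hypothetical sample standing in for ddf (three small tables)
ddf <- list(
  data.frame(V1 = rep("b", 7),
             V2 = c("c", "c", "c", "c", "b", "c", "c"),
             V3 = rep("a", 7)),
  data.frame(V1 = rep("a", 5), V2 = rep("c", 5), V3 = rep("a", 5)),
  data.frame(V1 = rep("a", 9), V2 = rep("c", 9), V3 = rep("b", 9))
)

n <- 9  # minimum number of identical rows required

# TRUE when a table has at least n rows and every row is the same:
# unique() collapses duplicate rows, so one remaining row means all rows match
all_rows_same <- function(x, n) {
  nrow(x) >= n && nrow(unique(x)) == 1
}

# One logical flag per list element, then subset and store the result
keep <- vapply(ddf, all_rows_same, logical(1), n = n)
result <- ddf[keep]

print(result)
```

Changing `n` is enough to generalize the filter to any number of identical rows.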
[Follow-up Question]
Answer obtained from Main question:
> ddf[sapply(ddf, function(x) nrow(x) >= n & nrow(unique(x)) == 1)]
$`61`
V1 V2 V3
1: a c b
2: a c b
3: a c b
4: a c b
5: a c b
6: a c b
7: a c b
$`68`
V1 V2 V3
1: a c a
2: a c a
3: a c a
4: a c a
5: a c a
6: a c a
7: a c a
8: a c a
$`91`
V1 V2 V3
1: b c a
2: b c a
3: b c a
4: b c a
5: b c a
6: b c a
7: b c a
... and so on up to the last data.frame that meets the row-matching criteria (at least 9 identical rows)
There are only 2 types of elements in the list:
**[[.. ]]**
**Case 1.** >70% accuracy
**Case 2.** <70% accuracy
Notice that the output shown above in the follow-up question covers only
elements 61, 68 and 91; there is no output for the other data frames, which do not meet the row-matching criteria.
I need an output in which every element that does not meet the criteria is replaced by the string "bad output".
The final list should therefore be the same length as the input list.
By placing the results side by side using paste, I should be able to see the output for each element.
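A minimal sketch of one way to get a result of the same length, assuming that "bad output" should simply replace each table that fails the test. The sample list and the names `labelled` and `status` are hypothetical, and the final `paste()` call is only one guess at what "side-by-side" is meant to look like. Plain data frames are used so the sketch runs without extra packages; the same calls work on data.table objects.

```r
# Hypothetical sample standing in for ddf (three small tables)
ddf <- list(
  data.frame(V1 = rep("b", 7),
             V2 = c("c", "c", "c", "c", "b", "c", "c"),
             V3 = rep("a", 7)),
  data.frame(V1 = rep("a", 5), V2 = rep("c", 5), V3 = rep("a", 5)),
  data.frame(V1 = rep("a", 9), V2 = rep("c", 9), V3 = rep("b", 9))
)

n <- 9  # minimum number of identical rows required

# lapply() returns one element per input element, so the length is preserved:
# matching tables are kept as they are, the rest become the string "bad output"
labelled <- lapply(ddf, function(x) {
  if (nrow(x) >= n && nrow(unique(x)) == 1) x else "bad output"
})

# One short status string per element, for a side-by-side view
status <- vapply(labelled, function(x) {
  if (is.data.frame(x)) "ok" else x
}, character(1))

print(paste(seq_along(ddf), status))
# [1] "1 bad output" "2 bad output" "3 ok"
```

Using `lapply()` instead of subsetting with a logical vector is what keeps the non-matching positions in the result.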