I have to subset a dataframe based on the values in a specific row. That is, the condition "greater than 10" should be tested against that row, and every column that satisfies the condition in that row should be extracted.

Here is my data sample.

structure(list(`Copper ores and concentrates; copper mattes, cemen` = c(200.53, 
274.84, 1.37, 376.686907694609), `Fabrics, woven, of man-made fabrics` = c(4093.12, 
1184.47, 0.29, 342.762777758776), Copper = c(44.76, 91.45, 2.04, 
186.843219392315), Zinc = c(80.14, 110.73, 1.38, 152.996417519341
), `Waste, parings and scrap, of plastics` = c(590.3, 286.3, 
0.49, 138.857682534305), `Fixed vegetable fats & oils, crude, refined, fract.` = c(864.14, 
344.63, 0.4, 137.44281817761), `Sulphur and unroasted iron pyrites` = c(23.99, 
55.11, 2.3, 126.599087119633), `Radio-actives and associated materials` = c(48.59, 
76.67, 1.58, 120.977338958633), `Rails & railway track construction mat., iron, steel` = c(464.66, 
214.76, 0.46, 99.259367279301), `Iron ore and concentrates` = c(46.91, 
67.8, 1.45, 97.9927520784481), `Crude vegetable materials, n.e.s.` = c(164.46, 
123.26, 0.75, 92.3812939316551), `Other plastics, in primary forms` = c(187.76, 
124.21, 0.66, 82.169386983383), `Crude animal materials, n.e.s.` = c(43.08, 
56.52, 1.31, 74.1529805013928), `Pig iron & spiegeleisen, sponge iron, powder & granu` = c(17.17, 
33.03, 1.92, 63.5399475829936), `Ores and concentrates of base metals, n.e.s.` = c(15.7, 
27.6, 1.76, 48.5197452229299), `Furskins, tanned or dressed, excluding those of 8483` = c(178.49, 
75.12, 0.42, 31.6152972155303), `Metalworking machinery (excludingmachine-tools) & parts` = c(179.18, 
71.69, 0.4, 28.6832018082375)), row.names = c("SD", "Mean", "INTENSITY", 
"INTENSITY2"), class = "data.frame")

I want the dataframe limited to the columns whose value in the row named INTENSITY2 is greater than 10.

I tried `tf4[, tf4[,"INTENSITY2" > 10, ]]`, but it does not work.

ambrish dhaka
  • Possible duplicate of [Filter data.frame rows by a logical condition](https://stackoverflow.com/questions/1686569/filter-data-frame-rows-by-a-logical-condition) – Matt Summersgill Jun 19 '19 at 18:40
  • This is different; I just checked the page. – ambrish dhaka Jun 19 '19 at 18:41
  • Try this: `df[, df["INTENSITY2", ] > 10, FALSE]`. The only difference from your code is in the filtering: you made a typo and were filtering by column values, not by row values. – Pavel Filatov Jun 19 '19 at 18:45
  • It gives an error: `Error in `[.data.frame`(tf4, , tf4["INTENSITY2", ] > 10, FALSE) : undefined columns selected` – ambrish dhaka Jun 19 '19 at 18:49
  • That's weird. Your code should produce such an error. Does `tf4["INTENSITY2", ] > 10` produce a logical vector? All right, there is another solution: `dplyr::select_if(tf4, tf4["INTENSITY2", ] > 10)`. – Pavel Filatov Jun 19 '19 at 19:00
  • Thanks, I would rather stick to simple ideas as my skills are elementary. Transposing the matrix is one great idea that I got here from the discussion and then working on columns is easy. – ambrish dhaka Jun 19 '19 at 19:03
  • Yes! Using tidy data is much easier. I'll provide another snippet of code so you can choose =) `library(tidyverse); tf4 %>% rownames_to_column("tmp") %>% gather(variable, val, -tmp) %>% spread(tmp, val) ` – Pavel Filatov Jun 19 '19 at 19:07
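
The row-based filtering discussed in the comments can be sketched as follows. In the original attempt, `"INTENSITY2" > 10` compares a string with a number before any indexing happens, so it yields a single logical value rather than a per-column test. The sketch below is only an illustration under stated assumptions: `tf4` here is a small stand-in in which `Copper` and `Zinc` are copied from the sample data, while `Hypothetical` is an invented column added so that at least one column fails the test (every column in the posted sample has INTENSITY2 above 10).

```r
# Small stand-in for tf4: Copper and Zinc are copied from the sample data;
# "Hypothetical" is an invented column whose INTENSITY2 value is below 10.
tf4 <- data.frame(
  Copper = c(44.76, 91.45, 2.04, 186.843219392315),
  Zinc = c(80.14, 110.73, 1.38, 152.996417519341),
  Hypothetical = c(5, 4, 0.5, 2.5),
  row.names = c("SD", "Mean", "INTENSITY", "INTENSITY2")
)

# Pull the row out as a plain numeric vector, then compare element-wise.
# which() turns the logical test into column positions and skips any NA.
keep <- which(unlist(tf4["INTENSITY2", ]) > 10)

# drop = FALSE keeps the result a data frame even if only one column passes.
result <- tf4[, keep, drop = FALSE]
print(names(result))
```

Wrapping the comparison in `which()` is a design choice, not something the thread prescribes: a missing value in the INTENSITY2 row would otherwise put an `NA` into the column index, which is one possible (unconfirmed) cause of the "undefined columns selected" error reported above.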

1 Answer

This works as well.

tf4[, unname(apply(tf4['INTENSITY2', ], 1, function(x) which(x > 10)))]
spazznolo
  • apply returns the positions of the matching columns, labelled with the column names; unname removes those names. The 1 means the function is applied row-wise rather than column-wise (which would be 2). – spazznolo Jun 19 '19 at 18:41
  • Though I have understood it and found it exactly suitable, I wonder why working with rows is so much more difficult than working with columns. – ambrish dhaka Jun 19 '19 at 18:43
  • Variables as columns and observations as rows is the standard way to set up a dataframe, and most work goes into improving operations on that layout. However, in this operation we are treating the rows (SD, Mean, etc.) as the variables and the columns (materials) as the observations, which is not an ideal setup. – spazznolo Jun 19 '19 at 18:47
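
The transposing idea raised in the comments can be sketched in base R as below. This is a minimal illustration, not the method anyone in the thread posted: `tf4` is again a small stand-in (`Copper` and `Zinc` copied from the sample, `Hypothetical` invented so that one entry fails the test), and the names `long` and `filtered` are introduced here only for the example. It assumes every column is numeric, as in the posted sample, so `t()` produces a numeric matrix.

```r
# Small stand-in for tf4: Copper and Zinc are copied from the sample data;
# "Hypothetical" is an invented column whose INTENSITY2 value is below 10.
tf4 <- data.frame(
  Copper = c(44.76, 91.45, 2.04, 186.843219392315),
  Zinc = c(80.14, 110.73, 1.38, 152.996417519341),
  Hypothetical = c(5, 4, 0.5, 2.5),
  row.names = c("SD", "Mean", "INTENSITY", "INTENSITY2")
)

# Transpose so that materials become rows and SD, Mean, etc. become columns.
long <- as.data.frame(t(tf4))

# Ordinary row filtering on a column now expresses the condition directly.
filtered <- long[long$INTENSITY2 > 10, , drop = FALSE]
print(rownames(filtered))
```

With the data in this orientation the condition reads like any other column filter, which is the point made above about variables as columns. Applying `t()` to `filtered` would give back the original orientation as a matrix if that layout is still needed.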