
I have a data frame of the form

> geneRows[1:3,]
      Probe/gene       logFC       CI.L       CI.R  AveExpr          t   P.Value adj.P.Val
17656  220307_at -0.09017596 -0.4395575 0.25920561 6.104288 -0.5992736 0.5662047         1
37517  220307_at  0.08704844 -0.2613434 0.43544028 6.104288  0.5801327 0.5784053         1
57376  220307_at -0.03501474 -0.1267764 0.05674688 6.152467 -0.7816350 0.4409881         1
              B  gene      GSE             Group1                   Group2  shape color
17656 -5.639256 CD244  GSE2461               Male                   Female x-open black
37517 -5.978691 CD244  GSE2461 ulcerative colitis irritable bowel syndrome x-open black
57376 -5.141940 CD244 GSE27887   nonlesional skin            lesional skin x-open black

I want to subset this so that I can get at the CI.R column when the CI.L column is a certain value. For example, I've tried

geneRows$CI.R[geneRows$CI.L == -0.4395575]

but I get back numeric(0), meaning an empty vector. However, when I try it on a sample dataset, something like

mtcars$mpg[mtcars$carb==8]

it works just fine. They are the same data types and everything, so what's the issue here?

Background

I am creating lines to be added to a plotly plot.

line <- list(
  type = "line",
  line = list(color = "grey"),
  width = 0.2,
  xref = "x",
  yref = "y"
)

lines <- list()
for (i in geneRows$CI.L) {
  line[["x0"]] <- i
  line[["x1"]] <- #here

  lines <- c(lines, list(line))
}

They need to be drawn from CI.L to CI.R for each line. I am trying to get the end point x1 from the table via the start point.

Kyle Weise
  • can you check the class of each column? is it numeric ? – cderv Jul 10 '17 at 20:32
  • Yes both `CI.L` and `CI.R` are numeric – Kyle Weise Jul 10 '17 at 20:34
  • Try `dplyr::filter`: `filter(geneRows, CI.L == -0.4395575)` – Mako212 Jul 10 '17 at 20:36
  • And is it possible that the CI.L column has more than 7 digits after the decimal point? – cderv Jul 10 '17 at 20:37
  • And you can wrap in `select()` to subset your columns – Mako212 Jul 10 '17 at 20:37
  • It's a bad idea to ever test a number with decimal places with `==`. For example: `0.1 + 0.05 == 0.15` is FALSE. Is that really what you need in this case? – MrFlick Jul 10 '17 at 20:38
  • I'm going through `geneRows$CI.L` and want to get the value from `geneRows$CI.R` that is associated with that row. However, how I should handle the comparison is unclear. – Kyle Weise Jul 10 '17 at 20:40
  • @KyleWeise Then you should take a step back and describe the problem you are really trying to solve. You can directly extract those two columns; there shouldn't be a need to look one up from the other. Seems like a case of the [XY Problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) – MrFlick Jul 10 '17 at 20:43
  • @MrFlick added some background..hope this helps. – Kyle Weise Jul 10 '17 at 20:50
  • @KyleWeise You should be using `Map` or `mapply` to iterate both vectors simultaneously to build such a data structure. – MrFlick Jul 10 '17 at 20:58
  • I don't see how that allows me to get the value from `CI.R`, which was my original question – Kyle Weise Jul 10 '17 at 21:06
  • @KyleWeise It would look like this: `lines<-Map(function(x0,x1) list(x0=x0, x1=x1), geneRows$CI.L, geneRows$CI.R)` No `for()` loop. – MrFlick Jul 10 '17 at 21:09
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/148830/discussion-between-kyle-weise-and-mrflick). – Kyle Weise Jul 10 '17 at 22:17
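The `Map` suggestion from the comments can be sketched as follows. This is only an illustration under assumptions: the small `geneRows` data frame is a hypothetical stand-in holding the three `CI.L`/`CI.R` pairs shown in the question, and `y0`/`y1` are left out because the question does not set them either.

```r
# Hypothetical stand-in for the two relevant columns of geneRows.
geneRows <- data.frame(
  CI.L = c(-0.4395575, -0.2613434, -0.1267764),
  CI.R = c(0.25920561, 0.43544028, 0.05674688)
)

# Template shared by every line, copied from the question.
line <- list(
  type = "line",
  line = list(color = "grey"),
  width = 0.2,
  xref = "x",
  yref = "y"
)

# Walk CI.L and CI.R in parallel. Each call modifies a local copy of the
# template, so no lookup of CI.R by the value of CI.L is needed.
lines <- Map(
  function(x0, x1) {
    line[["x0"]] <- x0
    line[["x1"]] <- x1
    line
  },
  geneRows$CI.L, geneRows$CI.R
)
```

Because both columns are read row by row, the floating-point comparison never comes up: `lines[[1]]` holds the first row's `CI.L` as `x0` and the same row's `CI.R` as `x1`.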

1 Answer


Your numbers might have more precision than is being printed. For example:

> -0.4395575 == -0.4395575
[1] TRUE
> -0.4395575 == -0.4395575001
[1] FALSE

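You could instead use all.equal, which by default has a tolerance of about 1.5e-8 (sqrt(.Machine$double.eps)) but can be adjusted. Note that it returns a character string rather than FALSE when the values differ, so wrap it in isTRUE() when using it as a condition.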

> all.equal(-0.4395575, -0.4395575001)
[1] TRUE
> all.equal(-0.4395575, -0.4395575001, tolerance = 1e-10)
[1] "Mean relative difference: 2.275015e-10"
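Since all.equal compares its two arguments as a whole rather than element by element, it cannot be used directly as a subsetting index. A rough sketch of a vectorized alternative is to compare the absolute difference against a tolerance. The data frame below is a hypothetical stand-in whose first CI.L value carries extra digits beyond the 7 that are printed by default, and the tolerance of 1e-6 is an assumed value that should be chosen to suit the data.

```r
# Hypothetical stand-in for geneRows; the first CI.L value has digits
# beyond what the default 7-significant-digit printing shows.
geneRows <- data.frame(
  CI.L = c(-0.43955748123, -0.2613434, -0.1267764),
  CI.R = c(0.25920561, 0.43544028, 0.05674688)
)

target <- -0.4395575

# Exact comparison: no stored value is bit-for-bit equal to the target.
exact <- geneRows$CI.R[geneRows$CI.L == target]

# Element-wise comparison within an absolute tolerance (assumed value).
tol <- 1e-6
near <- geneRows$CI.R[abs(geneRows$CI.L - target) < tol]

# Print more digits than the default to see what is actually stored.
print(geneRows$CI.L, digits = 15)
```

Here `exact` is empty, while `near` picks up the CI.R value of the first row. dplyr also provides `near()` for the same kind of element-wise comparison inside `filter()`.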
Eric Watt