The vector `a` prints as all ones and is numeric, yet some of its elements are not equal to 1. I created it with `colSums` on a data frame whose columns I had normalized.

> taxa_abundance = 
+   taxa_counts %>%
+   ### remember to avoid non-numeric columns
+   mutate_if(is.numeric, funs(./sum(.))) %>%
+   ### remember to avoid non-numeric columns
+   mutate(rowmean = apply(select_if(.,is.numeric), 1, mean))
>   # mutate(rowmean = apply(select(., sample_names), 1, mean))
> 
> ### Can do a sanity check:
> a = taxa_abundance %>% 
+   select_if(is.numeric) %>% 
+   select(-rowmean) %>%
+   colSums() %>%
+   as.numeric() %>% c()
> 
> print(a)
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [39] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [77] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[115] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
> 
> print(class(a))
[1] "numeric"
> 
> print(typeof(a))
[1] "double"
> 
> print(any(a != 1))
[1] TRUE

> a != 1
  [1] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE
 [14]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
 [27] FALSE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
 [40]  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE
 [53]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
 [66]  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
 [79]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE  TRUE
 [92]  TRUE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE
[105]  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
[118]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[131]  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE

Is there a sensible and clean way to do the operation I'm trying to do?

abalter
  • I think it is related to https://stackoverflow.com/questions/9508518/why-are-these-numbers-not-equal. In one of the steps you take a `mean`, and the printed value may not be exactly equal to 1. Just take the difference `a - 1` to check. – akrun May 21 '19 at 18:17
  • Surprising things can happen with floating point numbers. What exactly do you need to do? Are you trying to test for numbers that are basically 1? As the duplicate suggests, you can use `all.equal()` rather than `==` to do more robust comparisons with floating point numbers. – MrFlick May 21 '19 at 18:34
  • @akrun, I've excluded the mean from the final step as it should NOT sum to one. Just the normalized columns. – abalter May 21 '19 at 18:48
  • @MrFlick -- that's a good suggestion. – abalter May 21 '19 at 18:48
  • @MrFlick -- I tried `isTRUE(all.equal(a,1))` and got `FALSE`. I gave normalizing the columns my best shot--dividing by the colsums. What else can I do? – abalter May 21 '19 at 18:53
  • A different answer on the duplicate post from MrFlick had a solution that did work for me. I used `dplyr`'s `near` function, which reported all elements of `a` as equal to 1. Perhaps it makes sense since I was using dplyr to do the normalizing work. – abalter May 21 '19 at 18:56
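
The checks discussed in these comments can be sketched in base R. This is only an illustration on made-up data: the matrix `counts` and the names `abund` and `tol` are assumptions, not taken from the question. It shows that normalized column sums can differ from 1 by rounding error, that `all.equal(a, 1)` does not return `TRUE` when `a` has more than one element (the lengths differ, which may be why the `isTRUE(all.equal(a, 1))` attempt gave `FALSE`), and that a tolerance comparison treats the sums as equal to 1. The tolerance used here, `sqrt(.Machine$double.eps)`, is the default that `dplyr::near` uses.

```r
# Hypothetical stand-in for the taxa counts: 50 taxa (rows) x 10 samples (columns)
set.seed(1)
counts <- matrix(runif(50 * 10), nrow = 50)

# Normalize each column by its sum, then recompute the column sums
abund <- sweep(counts, 2, colSums(counts), "/")
a <- colSums(abund)

# Exact comparison can be fooled by rounding error; inspect the differences instead
print(a - 1)
print(max(abs(a - 1)))

# all.equal() also compares lengths, so a length-10 vector never matches a scalar
print(isTRUE(all.equal(a, 1)))                   # FALSE: lengths differ
print(isTRUE(all.equal(a, rep(1, length(a)))))   # TRUE: equal within tolerance

# Element-wise tolerance check, same idea as dplyr::near(a, 1)
tol <- sqrt(.Machine$double.eps)
print(all(abs(a - 1) < tol))
```

If dplyr is already loaded, `all(near(a, 1))` expresses the last check more compactly.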
