0

I have a data frame in R that I want to analyse. I want to count how many times a specific number occurs in one of its columns. For example, I want to know the frequency of the number 0.9998558, using

sum(deviation_multiple_regression_3cell_types_all_spots_all_intersection_genes_exclude_50_10dec_rowSums_not_0_for_moran_scaled[,3]== 0.9998558)

However, it seems that the displayed value is not the full stored value (it must be 0.9998558xxxxx), because the command above returns 0, whereas the correct count should be 3468. How can I match that number without knowing all of its decimal digits, so that I get the correct answer? Please see the screenshot below.
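The symptom can be reproduced with a few made-up values. This is only a sketch: the vector `v` is hypothetical and stands in for the real column. R prints 7 significant digits by default, so the stored value and the printed value can differ:

```r
# Hypothetical values standing in for the real column (not the original data)
v <- c(0.99985581234, 0.99985581234, 0.5)

print(v[1])               # printed with the default 7 significant digits
print(v[1], digits = 15)  # shows more of the stored value

# Exact comparison against the printed value finds no match,
# because no stored element equals the literal 0.9998558
n_exact <- sum(v == 0.9998558)
```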

dataframe

MK Huda
  • 1
    You need to round to a specific precision. Read this: https://stackoverflow.com/a/9508558/1412059 – Roland Dec 19 '22 at 11:59
  • 1
    You could set `x` to be one of the values that you know is correct (`x <- df[a,b]`), and then count how many values equal `x`. – Andrew Gustar Dec 19 '22 at 12:01
  • @Roland Rounding actually has its own issues, a better solution would be to replace the equality check with a check that uses a relative floating point tolerance. (Unfortunately the `all.equal` function, which implements this, uses a fairly hard to understand method of computing the relative tolerance.) – Konrad Rudolph Dec 19 '22 at 12:20
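The first two comments can be sketched as follows, assuming a hypothetical vector `v` in place of the real column. The rounding variant here scales by 1e7 and compares whole numbers, which is one possible reading of "round to a specific precision" and not necessarily what the linked answer does:

```r
# Hypothetical data standing in for the real column
v <- c(0.99985581234, 0.99985581234, 0.5, 0.9998559)

# Take the reference value from the data itself, so that the comparison
# uses the full stored value instead of the printed one
ref <- v[1]
n_same <- sum(v == ref)

# Rounding variant: scale to 7 decimals, round to a whole number,
# then compare whole numbers
n_round <- sum(round(v * 1e7) == 9998558)
```

Note that rounding matches values that round to 0.9998558, which is not quite the same set as values that start with 0.9998558.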

2 Answers

-1

The code below gives the number of occurrences in the column.

x <- 0.9998558
length(which(df$a==x))
-2

If you are looking for numbers starting with 0.9998558, I think you can do it in two different ways: working with the data as numeric or as character. Let x be your variable:

Data as character

This counts exactly what you are looking for:

  sum(substr(as.character(x),1,9)=="0.9998558")

Data as numeric

This counts all values that differ from the reference value by less than 1e-7; that may include values that do not start exactly with 0.9998558:

  sum(abs(x-0.9998558)<1e-7)

You can also "truncate" the numbers in your vector and compare them with the number you want. Here, we write 10^7 because 7 is the number of decimals you want to compare.

sum(trunc(x*10^7)/10^7 == 0.9998558)
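To see how the three approaches differ, here is a small sketch on made-up values (the vector `x` below is hypothetical). The second element is within 1e-7 of the reference but does not start with 0.9998558, so only the tolerance check counts it:

```r
# Hypothetical values: one match, one near miss, one unrelated value
x <- c(0.99985581234, 0.99985579, 0.5)

# Character comparison: the first 9 characters must be "0.9998558"
n_char <- sum(substr(as.character(x), 1, 9) == "0.9998558")

# Absolute tolerance: also counts 0.99985579, which is only 1e-8 away
n_abs <- sum(abs(x - 0.9998558) < 1e-7)

# Truncation to 7 decimals, then comparison
n_trunc <- sum(trunc(x * 10^7) / 10^7 == 0.9998558)
```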
R18
  • 1
    Sorry, that first method is a *terrible* way of doing this and is not a good recommendation, hence the downvote. The second way is *okay*, but using a relative tolerance is almost always more appropriate than using an absolute tolerance. `all.equal` implements this. – Konrad Rudolph Dec 19 '22 at 12:21
  • 1
    The first option is a "practical" way of getting what @MK Huda is asking for. You can also truncate the numbers and compare them, but in the end it is the same. As for the relative tolerance, I think it is not necessary for this purpose, because it takes more operations than you really need. – R18 Dec 20 '22 at 06:57
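For reference, the relative tolerance check discussed in these comments might look like the sketch below. The vector `x` and the tolerance `rel_tol` are assumptions chosen for illustration. `all.equal` compares whole objects and does not return an element-wise result, so it is applied to one element at a time here:

```r
# Hypothetical data and tolerance, chosen only for illustration
x <- c(0.99985581234, 0.99985581234, 0.5, 0.9998559)
ref <- 0.9998558
rel_tol <- 1e-8

# Manual relative check, vectorised over the whole column
n_rel <- sum(abs(x - ref) <= rel_tol * abs(ref))

# The same idea with all.equal, wrapped in isTRUE because all.equal
# returns a character description (not FALSE) when the values differ
n_all_equal <- sum(vapply(
  x,
  function(v) isTRUE(all.equal(v, ref, tolerance = rel_tol)),
  logical(1)
))
```

The manual version is shorter and vectorised; the `all.equal` version is slower on long columns because it calls the function once per element.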