
I have a left_join that behaves strangely: key values between -1.0 and 1.0 aren't matching.

Here's the structure of the left table:

> str(standings)    
'data.frame':   30 obs. of  9 variables:
 $ team     : chr  "Cleveland Cavaliers" "Toronto Raptors" "Miami Heat" "Atlanta Hawks" ...
 $ w        : chr  "57" "56" "48" "48" ...
 $ l        : chr  "25" "26" "34" "34" ...
 $ w/l%     : chr  ".695" ".683" ".585" ".585" ...
 $ conf     : chr  "east" "east" "east" "east" ...
 $ conf_rank: int  1 2 3 4 5 6 7 8 9 10 ...
 $ tm_pts   : num  104 103 100 103 106 ...
 $ op_pts   : num  98.3 98.2 98.4 99.2 102.5 ...
 $ pt_diff  : num  6 4.5 1.6 3.6 3.2 2.7 1.7 0.6 -1.5 -0.5 ...

And here's the right table:

> str(prob)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   4515 obs. of  6 variables:
 $ conf_rank: int  1 1 1 1 1 1 1 1 1 1 ...
 $ pt_seq   : num  -15 -14.9 -14.8 -14.7 -14.6 -14.5 -14.4 -14.3 -14.2 -14.1 ...
 $ pop_mean : num  6.79 6.79 6.79 6.79 6.79 ...
 $ pop_sd   : num  1.88 1.88 1.88 1.88 1.88 ...
 $ zscore   : num  -18.6 -18.5 -18.4 -18.3 -18.2 ...
 $ prob     : num  2.79e-31 5.17e-31 9.57e-31 1.77e-30 3.25e-30 ...
 - attr(*, "spec")=List of 2
  ..$ cols   :List of 6
  .. ..$ conf_rank: list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ pt_seq   : list()
  .. .. ..- attr(*, "class")= chr  "collector_double" "collector"
  .. ..$ pop_mean : list()
  .. .. ..- attr(*, "class")= chr  "collector_double" "collector"
  .. ..$ pop_sd   : list()
  .. .. ..- attr(*, "class")= chr  "collector_double" "collector"
  .. ..$ zscore   : list()
  .. .. ..- attr(*, "class")= chr  "collector_double" "collector"
  .. ..$ prob     : list()
  .. .. ..- attr(*, "class")= chr  "collector_double" "collector"
  ..$ default: list()
  .. ..- attr(*, "class")= chr  "collector_guess" "collector"
  ..- attr(*, "class")= chr "col_spec"

Here are the 4 rows from the standings data frame that don't join with prob on standings$pt_diff:

> standings %>%  anti_join(prob, by = c("pt_diff" = "pt_seq"))
                    team  w  l w/l% conf conf_rank tm_pts op_pts pt_diff
1 Portland Trail Blazers 44 38 .537 west         5  105.1  104.3     0.8
2        Detroit Pistons 44 38 .537 east         8  102.0  101.4     0.6
3       Dallas Mavericks 42 40 .512 west         6  102.3  102.6    -0.3
4        Houston Rockets 41 41 .500 west         8  106.5  106.4     0.1

Any idea why the only values in standings$pt_diff that fail to match are the ones between -1 and 1? I think I might be missing something, but I'm not sure what.

The full repo is here and the file this is in here. Thanks!

  • floating point issues, i.e. FAQ 7.31? See if multiplying by 10 and rounding to integer helps ... ? – Ben Bolker Oct 20 '16 at 03:05
  • https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f – MFR Oct 20 '16 at 03:58
  • Thanks, all. Didn't realize I stepped into this floating point issue. Somehow I've never come across it until this little project. – Jay Oct 20 '16 at 18:21
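
For anyone landing here later, this is a minimal sketch of the workaround suggested in the comments, in base R. It assumes pt_diff was produced by arithmetic such as tm_pts - op_pts, which the question does not confirm. The one-row `left` table, the three-row `right` table, and the `row_id` and `pt_key` columns are hypothetical stand-ins, not the real data.

```r
# Hypothetical stand-ins for the two tables. The left value mimics the
# Detroit Pistons row, assuming pt_diff was computed as tm_pts - op_pts.
left  <- data.frame(team = "Detroit Pistons", pt_diff = 102.0 - 101.4)
right <- data.frame(pt_seq = c(0.5, 0.6, 0.7), row_id = 1:3)

# Exact comparison fails: 102.0 - 101.4 is stored as roughly
# 0.5999999999999943, which is not the same double as the literal 0.6.
exact_match <- left$pt_diff %in% right$pt_seq

# A tolerance-based comparison treats the two values as equal.
close_enough <- isTRUE(all.equal(left$pt_diff, 0.6))

# Workaround: scale to tenths and round, so both sides share an integer key.
left$pt_key  <- as.integer(round(left$pt_diff * 10))
right$pt_key <- as.integer(round(right$pt_seq * 10))

# all.x = TRUE keeps every row of left, like a left join.
joined <- merge(left, right, by = "pt_key", all.x = TRUE)

print(exact_match)   # FALSE
print(close_enough)  # TRUE
print(joined)
```

The same integer key column could be added to standings and prob and then passed as the by argument to left_join or anti_join in place of the raw doubles. The factor of 10 only works because the values here have one decimal place.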
