Roughly speaking, bootstrap sampling is just sampling with replacement, which naturally leads to some samples of the original dataset being left out, while others are present more than once.
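As a quick illustration (a minimal sketch with NumPy; the dataset size of 10 is just an arbitrary choice for the demo):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10
    indices = np.arange(n)

    # Bootstrap sample: draw n indices *with* replacement
    boot = rng.choice(indices, size=n, replace=True)

    # Some indices appear more than once, others not at all ("out-of-bag")
    in_bag, counts = np.unique(boot, return_counts=True)
    oob = np.setdiff1d(indices, in_bag)

    print("bootstrap sample:   ", boot)
    print("duplicated indices: ", in_bag[counts > 1])
    print("out-of-bag indices: ", oob)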
I thought that random forest was already a technique using bootstrap
You are right that the original RF algorithm as proposed by Breiman indeed incorporates bootstrap sampling by default (this is actually inherited from bagging, which RF builds upon).
Nevertheless, implementations like the scikit-learn one understandably prefer to leave available the option of not using bootstrap sampling (i.e. sampling with replacement) and using the whole dataset instead; from the docs:
The sub-sample size is controlled with the max_samples parameter if bootstrap=True (default), otherwise the whole dataset is used to build each tree.
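In practice, the two behaviors can be selected like so (a sketch only; the toy dataset from make_classification and the 0.8 sub-sample fraction are my own choices for illustration):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=200, random_state=0)

    # Default behavior: each tree is built on a bootstrap sample
    # (here additionally limited to 80% of the rows via max_samples)
    rf_bootstrap = RandomForestClassifier(
        n_estimators=100, bootstrap=True, max_samples=0.8, random_state=0
    ).fit(X, y)

    # Bootstrap disabled: every tree is built on the whole dataset
    rf_full = RandomForestClassifier(
        n_estimators=100, bootstrap=False, random_state=0
    ).fit(X, y)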
The situation is similar in the standard R implementation, where the respective parameter is called replace and, as in scikit-learn, it is also set to TRUE by default.
So, nothing really strange here, beyond the (generally desirable) design choice of leaving the practitioner room and flexibility to use bootstrap sampling or not. In the early days of RF, bootstrap sampling offered the extra possibility of calculating the out-of-bag (OOB) error without resorting to cross-validation, an idea that (I think...) eventually fell out of favor and "freed" practitioners to try dropping the bootstrap sampling option, if doing so leads to better performance.
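For completeness, here is a sketch of the OOB idea in scikit-learn (oob_score requires bootstrap=True, since the estimate relies on the samples each tree did not see; again, the toy data is only illustrative):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=500, random_state=0)

    # OOB error is only available with bootstrap sampling enabled
    rf = RandomForestClassifier(
        n_estimators=200, bootstrap=True, oob_score=True, random_state=0
    ).fit(X, y)

    print("OOB accuracy estimate:", rf.oob_score_)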
You may also find parts of my answer in Why is Random Forest with a single tree much better than a Decision Tree classifier? useful.