Using a few mathematical tricks and MATLAB, we can easily calculate the entropy of a given input. For instance:
x = [10 25 4 10 9 4 4]
[a,b] = hist(x,unique(x))
x =
10 25 4 10 9 4 4
a =
3 1 2 1
b =
4 9 10 25
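For comparison, here is a minimal sketch of the same counting step done without hist, using unique together with accumarray. The variable names vals, idx and counts are my own and not part of the session above; this is just one possible way to get the same numbers, assuming a reasonably recent MATLAB release.

```matlab
% Sketch: count how often each distinct value occurs, without hist.
x = [10 25 4 10 9 4 4];

% vals holds the sorted distinct values; idx maps each element of x
% to its position in vals.
[vals, ~, idx] = unique(x);

% Add 1 into the bin of every element; idx(:) forces a column vector,
% and the transpose returns the counts as a row.
counts = accumarray(idx(:), 1).';
```

Here vals plays the role of b and counts the role of a.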
My question is the following: because we are using the log function, is it advisable to add a small constant inside the logarithm to ensure the calculation is numerically well-behaved? For instance, should we use +eps? As an example:
probabilities = a./numel(x)
probabilities =
0.4286 0.1429 0.2857 0.1429
-sum(probabilities.*log2(probabilities))
ans =
1.8424
-sum(probabilities.*log2(probabilities+eps))
ans =
1.8424