Although your image is 2D, your histogram of gray values is only 1D. Finding peaks, or maxima, basically amounts to searching for points with a higher value than all of their neighboring points.
However, because your histogram curve isn't smooth, a naive search will find a great many local peaks caused by the tiny oscillations.
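To make the problem concrete, here is a minimal sketch of the naive search. The function name `naive_peaks` and the synthetic two-hump histogram are illustrative assumptions, not taken from your image; the point is only that the jitter produces far more "peaks" than the two humps you care about.

```python
import numpy as np

def naive_peaks(hist):
    """Indices whose value is strictly greater than both immediate neighbours."""
    hist = np.asarray(hist)
    interior = hist[1:-1]
    is_peak = (interior > hist[:-2]) & (interior > hist[2:])
    return np.flatnonzero(is_peak) + 1  # +1 because the interior starts at index 1

# Synthetic stand-in for a gray-level histogram: two humps plus counting noise.
rng = np.random.default_rng(0)
levels = np.arange(256)
ideal = (1000 * np.exp(-0.5 * ((levels - 60) / 12) ** 2)
         + 800 * np.exp(-0.5 * ((levels - 190) / 15) ** 2))
noisy_hist = rng.poisson(ideal + 20)

naive = naive_peaks(noisy_hist)
print(f"{len(naive)} local maxima found, although the curve has only two humps")
```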
What you want is to find maxima using a "coarser" version of your curve. You can get that by smoothing it first. This is done via convolution with a low-pass filter, which amounts to taking a local weighted average of the values within a certain window.
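As a rough illustration of the smoothing step, the sketch below convolves a noisy histogram with a normalized Gaussian kernel (one possible low-pass filter) and counts local maxima before and after. The kernel choice, `sigma=5.0`, the helper names, and the synthetic data are all assumptions you would tune for your own histogram.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized Gaussian weights covering roughly +/- 3 sigma."""
    radius = int(3 * sigma)
    offsets = np.arange(-radius, radius + 1)
    weights = np.exp(-0.5 * (offsets / sigma) ** 2)
    return weights / weights.sum()  # weights sum to 1: a weighted average

def smooth(hist, sigma=5.0):
    """Low-pass filter the histogram by convolving it with the kernel."""
    return np.convolve(hist, gaussian_kernel(sigma), mode="same")

def count_local_maxima(curve):
    interior = curve[1:-1]
    return int(np.sum((interior > curve[:-2]) & (interior > curve[2:])))

# Synthetic stand-in for a gray-level histogram: two humps plus counting noise.
rng = np.random.default_rng(0)
levels = np.arange(256)
ideal = (1000 * np.exp(-0.5 * ((levels - 60) / 12) ** 2)
         + 800 * np.exp(-0.5 * ((levels - 190) / 15) ** 2))
noisy_hist = rng.poisson(ideal + 20).astype(float)

smoothed_hist = smooth(noisy_hist)
print("local maxima before smoothing:", count_local_maxima(noisy_hist))
print("local maxima after smoothing: ", count_local_maxima(smoothed_hist))
```

A wider kernel removes more of the oscillations but also blurs nearby peaks together, so the window size is a trade-off.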
scipy.signal.find_peaks_cwt handles the smoothing for you: it convolves the array with wavelets of the widths you specify and returns the peaks it finds. All you need to supply is the range of widths you expect the peaks of interest to have.
That will give you the indices of the peaks. If you want the "outer" ones, simply take the first and last. Then use those indices to find the corresponding histogram bins (grayscale values).
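Put together, those steps might look like the sketch below. The image here is synthetic (two gray-level populations plus a little clutter), and the `widths` range of 5 to 19 bins is only a guess; both are assumptions to replace with your own image and the peak widths you see in its histogram. Depending on the noise, `find_peaks_cwt` may also report spurious peaks, so check the result against a plot.

```python
import numpy as np
from scipy.signal import find_peaks_cwt

# Synthetic stand-in for a grayscale image (not the coins image).
rng = np.random.default_rng(0)
pixels = np.concatenate([
    rng.normal(60, 10, size=60_000),    # darker population
    rng.normal(190, 12, size=35_000),   # brighter population
    rng.uniform(0, 255, size=5_000),    # faint clutter across all gray levels
])
image = np.clip(pixels, 0, 255).astype(np.uint8).reshape(250, 400)

# 1D histogram of gray values, one bin per gray level.
counts, bin_edges = np.histogram(image.ravel(), bins=256, range=(0, 256))

# Expected peak widths in bins (an assumption to tune per image).
widths = np.arange(5, 20)
peak_indices = np.asarray(find_peaks_cwt(counts.astype(float), widths)).astype(int)

# Bin i covers gray values [i, i + 1), so its left edge is the gray level.
peak_grays = bin_edges[peak_indices]
outer_low, outer_high = peak_grays[0], peak_grays[-1]
print("peak gray levels:", peak_grays)
print("outer peaks:", outer_low, outer_high)
```

Indexing with `[0]` and `[-1]` assumes at least one peak was found; if the array comes back empty, try a different `widths` range.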
Note, however, that for region-based segmentation in general the relevant peaks might not always be the outer ones! That happened to be the case for that particular coins image, but you will probably need to experiment a bit depending on the image. The outer peaks are more likely to be the relevant ones when there is high contrast between background and foreground (and both are roughly homogeneous). In the tutorial you referred to, it seems to me the peaks were actually chosen by (human) inspection of the histogram.