Unicode characters outside the Basic Multilingual Plane, that is, characters with code points above 0xFFFF, are encoded in UTF-16 as pairs of 16-bit code units called **surrogate pairs**, by the following scheme:
- 0x010000 is subtracted from the code point, leaving a 20-bit number in the range 0..0x0FFFFF;
- the top ten bits (a number in the range 0..0x03FF) are added to 0xD800 to give the first code unit or high surrogate, which will be in the range 0xD800..0xDBFF;
- the low ten bits (also in the range 0..0x03FF) are added to 0xDC00 to give the second code unit or low surrogate, which will be in the range 0xDC00..0xDFFF.
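
The scheme above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; the function name `to_surrogate_pair` and its range check are assumptions made for the example and do not come from any standard library.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a supplementary-plane code point into (high, low) surrogates.

    Hypothetical helper for illustration; follows the three steps listed above.
    """
    # Only code points above 0xFFFF (and within Unicode's limit) need a pair.
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point must be in the range 0x10000..0x10FFFF")

    # Step 1: subtract 0x010000, leaving a 20-bit number (0..0x0FFFFF).
    offset = code_point - 0x10000

    # Step 2: the top ten bits, added to 0xD800, give the high surrogate.
    high = 0xD800 + (offset >> 10)

    # Step 3: the low ten bits, added to 0xDC00, give the low surrogate.
    low = 0xDC00 + (offset & 0x3FF)

    return high, low


# U+1F600 encodes as the pair 0xD83D, 0xDE00.
high, low = to_surrogate_pair(0x1F600)
print(hex(high), hex(low))  # prints: 0xd83d 0xde00
```

The result can be cross-checked against Python's built-in codec: `chr(0x1F600).encode("utf-16-be")` yields the same two code units as four big-endian bytes.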