Excerpt from an email from Ken Thompson, the designer of UTF-8, found
on the internet at http://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt

We define 7 byte types:
T0	0xxxxxxx	7 free bits
Tx	10xxxxxx	6 free bits
T1	110xxxxx	5 free bits
T2	1110xxxx	4 free bits
T3	11110xxx	3 free bits
T4	111110xx	2 free bits
T5	111111xx	2 free bits
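
The seven patterns above partition all 256 byte values, so any byte can
be classified by testing its leading bits. The Python sketch below is an
illustration only, not code from the email; the names BYTE_TYPES and
byte_type are hypothetical.

```python
# Hypothetical helper: classify a byte according to the table above.
# Each entry is (type name, mask of the fixed bits, expected pattern, free bits).
BYTE_TYPES = [
    ("T0", 0x80, 0x00, 7),  # 0xxxxxxx
    ("Tx", 0xC0, 0x80, 6),  # 10xxxxxx
    ("T1", 0xE0, 0xC0, 5),  # 110xxxxx
    ("T2", 0xF0, 0xE0, 4),  # 1110xxxx
    ("T3", 0xF8, 0xF0, 3),  # 11110xxx
    ("T4", 0xFC, 0xF8, 2),  # 111110xx
    ("T5", 0xFC, 0xFC, 2),  # 111111xx
]

def byte_type(b):
    """Return (type name, free bits) for a byte value in 0..255."""
    if not 0 <= b <= 0xFF:
        raise ValueError("not a byte value")
    for name, mask, pattern, free in BYTE_TYPES:
        if (b & mask) == pattern:
            return name, free
    raise AssertionError("unreachable: the seven patterns cover every byte")
```

For example, byte_type(0x41) returns ("T0", 7) and byte_type(0xFF)
returns ("T5", 2).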

Encoding is as follows.
From hex	Thru hex	Sequence		Bits
00000000	0000007f	T0			7
00000080	000007FF	T1 Tx			11
00000800	0000FFFF	T2 Tx Tx		16
00010000	001FFFFF	T3 Tx Tx Tx		21
00200000	03FFFFFF	T4 Tx Tx Tx Tx		26
04000000	FFFFFFFF	T5 Tx Tx Tx Tx Tx	32
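
The table can be read as an algorithm: pick the first row whose upper
bound is not below the code, put the top bits in the lead byte, and put
the rest into Tx bytes six bits at a time. The Python sketch below
assumes that reading; it is an illustration, not code from the email,
and the names ROWS and encode are hypothetical.

```python
# One entry per table row:
# (last code of the row, fixed bits of the lead byte, number of Tx bytes)
ROWS = [
    (0x0000007F, 0x00, 0),  # T0
    (0x000007FF, 0xC0, 1),  # T1 Tx
    (0x0000FFFF, 0xE0, 2),  # T2 Tx Tx
    (0x001FFFFF, 0xF0, 3),  # T3 Tx Tx Tx
    (0x03FFFFFF, 0xF8, 4),  # T4 Tx Tx Tx Tx
    (0xFFFFFFFF, 0xFC, 5),  # T5 Tx Tx Tx Tx Tx
]

def encode(code):
    """Encode a 32-bit code as bytes, using the shortest row that fits."""
    if not 0 <= code <= 0xFFFFFFFF:
        raise ValueError("code is outside the 32-bit range")
    for last, marker, ntx in ROWS:
        if code <= last:
            # The lead byte carries the bits above the 6*ntx low bits.
            lead = marker | (code >> (6 * ntx))
            # Each Tx byte carries six bits, most significant group first.
            tail = [0x80 | ((code >> (6 * i)) & 0x3F)
                    for i in range(ntx - 1, -1, -1)]
            return bytes([lead] + tail)
```

For example, encode(0x20AC).hex() gives 'e282ac', which matches what
Python's own "\u20ac".encode("utf-8") produces. The two agree only for
codes that present-day UTF-8 also accepts; the five- and six-byte rows
have no counterpart there.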

Some notes:

1. The 2 byte sequence has 2^11 codes, yet only 2^11-2^7
are allowed. The codes in the range 0-7f are illegal.
I think this is preferable to a pile of magic additive
constants for no real benefit. Similar comment applies
to all of the longer sequences.
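
In a decoder, note 1 amounts to one extra comparison: after assembling
the code, reject it if it is below the first code of the row for that
sequence length. The Python sketch below is an illustration under that
assumption, not code from the email; FORMS and decode_one are
hypothetical names.

```python
# One entry per sequence form:
# (mask of the lead byte's fixed bits, their value, number of Tx bytes,
#  first code that is legal at this length)
FORMS = [
    (0x80, 0x00, 0, 0x00000000),
    (0xE0, 0xC0, 1, 0x00000080),
    (0xF0, 0xE0, 2, 0x00000800),
    (0xF8, 0xF0, 3, 0x00010000),
    (0xFC, 0xF8, 4, 0x00200000),
    (0xFC, 0xFC, 5, 0x04000000),
]

def decode_one(seq):
    """Decode exactly one sequence; raise ValueError if it is malformed."""
    if not seq:
        raise ValueError("empty input")
    lead = seq[0]
    for mask, marker, ntx, first in FORMS:
        if (lead & mask) == marker:
            break
    else:
        raise ValueError("a sequence cannot start with a Tx byte")
    if len(seq) != ntx + 1:
        raise ValueError("wrong number of bytes for this lead byte")
    code = lead & ~mask & 0xFF          # keep only the lead byte's free bits
    for b in seq[1:]:
        if (b & 0xC0) != 0x80:
            raise ValueError("expected a Tx byte")
        code = (code << 6) | (b & 0x3F)
    if code < first:                     # the check described in note 1
        raise ValueError("code is too small for this sequence length")
    return code
```

With this check, decode_one(b"\xc2\x80") returns 0x80, while
decode_one(b"\xc1\x81"), a two-byte spelling of 0x41, raises ValueError.
No offset is added or subtracted anywhere; the illegal codes are simply
refused.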

2. The 4, 5, and 6 byte sequences are only there for
political reasons. I would prefer to delete these.

3. The 6 byte sequence covers 32 bits, the FSS-UTF
proposal only covers 31.
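
The 32 in note 3, like every entry in the Bits column, is the sum of
the free bits of the bytes in the sequence. The short Python sketch
below is an illustration, not code from the email; FREE and
payload_bits are hypothetical names.

```python
# Free bits per byte type, copied from the byte-type table.
FREE = {"T0": 7, "Tx": 6, "T1": 5, "T2": 4, "T3": 3, "T4": 2, "T5": 2}

def payload_bits(sequence):
    """Total free bits in a sequence written as in the table, e.g. 'T1 Tx'."""
    return sum(FREE[name] for name in sequence.split())
```

Here payload_bits("T5 Tx Tx Tx Tx Tx") is 2 + 5*6 = 32.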

4. All of the sequences synchronize on any byte that is
not a Tx byte.
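
In practice note 4 means that a reader dropped at an arbitrary offset
can find a sequence boundary by skipping Tx bytes. The Python sketch
below is an illustration, not code from the email; is_tx, next_start and
prev_start are hypothetical names.

```python
def is_tx(b):
    """True for a continuation byte of the form 10xxxxxx."""
    return (b & 0xC0) == 0x80

def next_start(buf, pos):
    """Index of the first non-Tx byte at or after pos, or len(buf) if none."""
    while pos < len(buf) and is_tx(buf[pos]):
        pos += 1
    return pos

def prev_start(buf, pos):
    """Index of the nearest non-Tx byte at or before pos (0 <= pos < len(buf)).

    Returns 0 if only Tx bytes precede pos, which can happen when the
    buffer itself begins in the middle of a sequence.
    """
    while pos > 0 and is_tx(buf[pos]):
        pos -= 1
    return pos
```

With buf = b"a\xe2\x82\xacb" (one T0 byte, a three-byte sequence, one T0
byte), next_start(buf, 2) returns 4 and prev_start(buf, 2) returns 1.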
