Ken Thompson and Rob Pike produced the first implementation for the Plan 9 operating system in September 1992.
UTF-8 was designed as a superior alternative to UTF-1, a proposed variable-width encoding with partial ASCII compatibility which lacked some features including self-synchronization and fully ASCII-compatible handling of characters such as slashes.
#Utf 8 converter from binary code
Since ASCII bytes do not occur when encoding non-ASCII code points into UTF-8, UTF-8 is safe to use within most programming and document languages that interpret certain ASCII characters in a special way, such as / ( slash) in filenames, \ ( backslash) in escape sequences, and % in printf. It was designed for backward compatibility with ASCII: the first 128 characters of Unicode, which correspond one-to-one with ASCII, are encoded using a single byte with the same binary value as ASCII, so that valid ASCII text is valid UTF-8-encoded Unicode as well. Code points with lower numerical values, which tend to occur more frequently, are encoded using fewer bytes. UTF-8 is capable of encoding all 1,112,064 valid character code points in Unicode using one to four one- byte (8-bit) code units. Defined by the Unicode Standard, the name is derived from Unicode (or Universal Coded Character Set) Transformation Format – 8-bit. UTF-8 is a variable-width character encoding used for electronic communication.