Unicode - Wikipedia, the free encyclopedia
|
||
|
Haven’t mastered the basics of Unicode and character sets? Please don’t write another line of code until you’ve read this article. ... Unicode was a brave effort to create a single character set that included every reasonable writing system on the planet and some make-believe ones like Klingon, too.
|
||
|
Formally, a version of the Unicode Standard is defined by an edition of the core specification, The Unicode Standard, together with the Code Charts, Unicode Standard Annexes and the Unicode Character Database.
|
||
|
@@@ The Unicode Standard 5.2 @@@+ U52M090904.lst Final Unicode 5.2 names list. (Amd 5 & Amd 6) This file is semi-automatically derived from UnicodeData.txt and a set of manually created annotations using a script to select or suppress information from the data file.
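As a rough illustration of what the names list provides, the character names derived from UnicodeData.txt can be queried from Python's standard `unicodedata` module. This is only a sketch: the helper name `describe` is hypothetical, and the module reflects whichever Unicode version the interpreter bundles, which is not necessarily 5.2.

```python
import unicodedata

def describe(ch: str) -> str:
    """Format one character as 'U+XXXX NAME' (hypothetical helper)."""
    # name() takes a default that is returned when the code point has no name
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{ord(ch):04X} {name}"

# Version of the Unicode Character Database bundled with this interpreter
print(unicodedata.unidata_version)

print(describe("\u00e9"))                            # look up a name by character
print(unicodedata.lookup("LATIN CAPITAL LETTER A"))  # look up a character by name
print(describe("\uffff"))                            # a noncharacter has no name
```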
|
||
|
Unicode was originally designed as a 16-bit character set to cover all the world's major living languages, in addition to scientific symbols and dead languages that are the subject of scholarly interest; since Unicode 2.0 its code space extends to U+10FFFF, beyond what a single 16-bit unit can hold. It was intended to eliminate the complexity of the multibyte character sets used on UNIX and Windows to support Asian languages.
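The limits of a 16-bit view can be seen with a small sketch. Code points up to U+FFFF fit in one 16-bit UTF-16 code unit, while code points above that are encoded as a surrogate pair. The helper name `utf16_units` is an assumption made for this illustration.

```python
def utf16_units(ch: str) -> list:
    """Return the 16-bit UTF-16 code units for one character (hypothetical helper)."""
    data = ch.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

bmp_char = "\u4e2d"         # U+4E2D, inside the Basic Multilingual Plane
astral_char = "\U0001F600"  # U+1F600, outside the 16-bit range

for ch in (bmp_char, astral_char):
    units = [f"{u:04X}" for u in utf16_units(ch)]
    print(f"U+{ord(ch):04X} -> {units}")
```

The second character needs two code units, which is why treating Unicode as a fixed-width 16-bit set breaks down for text containing such characters.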
|
||
|
If you have no access to the paper documents defining the Unicode character set, you can look up all Unicode characters except for the Hangul syllables on charts.unicode.org, but there you will not find the additional information on how these characters interact.
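Some of that interaction data (combining classes, decompositions, normalization) is exposed by Python's standard `unicodedata` module. The following is a minimal sketch, not a substitute for the standard itself; the helper name `interaction_info` is hypothetical.

```python
import unicodedata

def interaction_info(ch: str) -> dict:
    """Collect a few properties that govern how a character combines (hypothetical helper)."""
    return {
        "category": unicodedata.category(ch),            # general category, e.g. 'Mn'
        "combining": unicodedata.combining(ch),          # canonical combining class
        "decomposition": unicodedata.decomposition(ch),  # decomposition mapping, '' if none
    }

print(interaction_info("\u0301"))  # COMBINING ACUTE ACCENT
print(interaction_info("\u00e9"))  # LATIN SMALL LETTER E WITH ACUTE

# 'e' followed by a combining acute accent composes to one code point under NFC
composed = unicodedata.normalize("NFC", "e\u0301")
print(composed == "\u00e9")

# Hangul syllable names are derived algorithmically rather than listed one by one
print(unicodedata.name("\uac00"))
```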
|
||
|
HTML 4.0 - Unicode instead of ISO 8859-1
As mentioned, HTML 4.0 uses Unicode as its base character set. With this change, a whole new set of officially named and numbered character entities is introduced.
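To make the idea of named and numbered character entities concrete, here is a short sketch using Python's standard `html` module. The sample entity (`eacute`, code point 233) is chosen only for illustration.

```python
import html
from html.entities import name2codepoint

# The same character written three ways: named, decimal, and hexadecimal
named = html.unescape("&eacute;")
decimal = html.unescape("&#233;")
hexadecimal = html.unescape("&#xE9;")

print(named, decimal, hexadecimal)

# The named entity maps to the same Unicode code point as the numeric forms
print(name2codepoint["eacute"])
```

All three forms resolve to the same Unicode code point, which is what using Unicode as the document character set implies.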
|
Copyright © 2009, Dictionary.com, LLC. All rights reserved.