Ships free on qualified orders.
$17.50
List price: $74.99
HARDCOVER, USED
Usually ships in 5 to 7 business days
available for shipping or prepaid pickup only
Qty Store Section
1 Remote Warehouse Computers Reference- General


The Unicode Standard, Version 4.0: The Unicode Consortium with CDROM

by Joan Aliprand


Synopses & Reviews

Publisher Comments:

Ideographic Description Characters: U+2FF0-U+2FFB.

Bopomofo.

Bopomofo: U+3100-U+312F.

Hiragana and Katakana.

Hiragana: U+3040-U+309F.

Katakana: U+30A0-U+30FF.

Katakana Phonetic Extensions: U+31F0-U+31FF.

Halfwidth and Fullwidth Forms: U+FF00-U+FFEF.

Hangul.

Hangul Jamo: U+1100-U+11FF.

Hangul Compatibility Jamo: U+3130-U+318F.

Hangul Syllables: U+AC00-U+D7A3.

Yi.

Yi: U+A000-U+A4CF.

12. Additional Modern Scripts.

Ethiopic.

Ethiopic: U+1200-U+137F.

Mongolian.

Mongolian: U+1800-U+18AF.

Osmanya.

Osmanya: U+10480-U+104AF.

Cherokee.

Cherokee: U+13A0-U+13FF.

Canadian Aboriginal Syllabics.

Canadian Aboriginal Syllabics: U+1400-U+167F.

Deseret.

Deseret: U+10400-U+1044F.

Shavian.

Shavian: U+10450-U+1047F.

13. Archaic Scripts.

Ogham.

Ogham: U+1680-U+169F.

Old Italic.

Old Italic: U+10300-U+1032F.

Runic.

Runic: U+16A0-U+16F0.

Gothic.

Gothic: U+10330-U+1034F.

Ugaritic.

Ugaritic: U+10380-U+1039F.

Linear B.

Linear B Syllabary: U+10000-U+1007F.

Linear B Ideograms: U+10080-U+100FF.

Aegean Numbers: U+10100-U+1013F.

Cypriot Syllabary.

Cypriot Syllabary: U+10800-U+1083F.

14. Symbols.

Currency Symbols.

Currency Symbols: U+20A0-U+20CF.

Letterlike Symbols.

Letterlike Symbols: U+2100-U+214F.

Math Alphanumeric Symbols: U+1D400-U+1D7FF.

Mathematical Alphabets.

Fonts Used for Mathematical Alphabets.

Number Forms.

Number Forms: U+2150-U+218F.

Superscripts and Subscripts: U+2070-U+209F.

Mathematical Symbols.

Mathematical Operators: U+2200-U+22FF.

Supplements to Mathematical Symbols and Arrows.

Supplemental Math Operators: U+2A00-U+2AFF.

Miscellaneous Math Symbols-A: U+27C0-U+27EF.

Miscellaneous Math Symbols-B: U+2980-U+29FF.

Arrows: U+2190-U+21FF.

Supplemental Arrows.

Standardized Variants of Mathematical Symbols.

Technical Symbols.

Control Pictures: U+2400-U+243F.

Miscellaneous Technical: U+2300-U+23FF.

Optical Character Recognition: U+2440-U+245F.

Geometrical Symbols.

Box Drawing: U+2500-U+257F.

Block Elements: U+2580-U+259F.

Geometric Shapes: U+25A0-U+25FF.

Miscellaneous Symbols and Dingbats.

Miscellaneous Symbols: U+2600-U+26FF.

Dingbats: U+2700-U+27BF.

Yijing Hexagram Symbols: U+4DC0-U+4DFF.

Tai Xuan Jing Symbols: U+1D300-U+1D356.

Enclosed and Square.

Enclosed Alphanumerics: U+2460-U+24FF.

Enclosed CJK Letters and Months: U+3200-U+32FF.

CJK Compatibility: U+3300-U+33FF.

Braille.

Braille Patterns: U+2800-U+28FF.

Byzantine Musical Symbols.

Byzantine Musical Symbols: U+1D000-U+1D0FF.

Western Musical Symbols.

Musical Symbols: U+1D100-U+1D1FF.

15. Special Areas and Format Characters.

Control Codes.

Layout Controls.

Invisible Operators.

Deprecated Format Characters.

Deprecated Format Characters: U+206A-U+206F.

Surrogates Area.

Surrogates Area: U+D800-U+DFFF.

Variation Selectors.

Private-Use Characters.

Private Use Area: U+E000-U+F8FF.

Supplementary Private Use Areas.

Noncharacters.

Noncharacters: U+FFFE, U+FFFF, and Others.

Specials.

Specials: U+FEFF, U+FFF0-U+FFFD.

Tag Characters.

Tag Characters: U+E0000-U+E007F.

16. Code Charts.

Character Names List.

Images in the Code Charts and Character Lists.

Character Names.

Aliases.

Cross References.

Information About Languages.

Case Mappings.

Decompositions.

Reserved Characters.

Noncharacters.

Subheads.

CJK Unified Ideographs.

Hangul Syllables.

17. Han Radical-Stroke Index.

A. Han Unification History.

B. Abstracts of Unicode Technical Reports.

Unicode Standard Annexes.

UAX #9: The Bidirectional Algorithm.

UAX #11: East Asian Width.

UAX #14: Line Breaking Properties.

UAX #15: Unicode Normalization Forms.

UAX #24: Script Names.

UAX #29: Text Boundaries.

Unicode Technical Standards.

UTS #6: A Standard Compression Scheme for Unicode.

UTS #10: Unicode Collation Algorithm.

Unicode Technical Reports.

UTR #16: UTF-EBCDIC.

UTR #17: Character Encoding Model.

UTR #18: Unicode Regular Expression Guidelines.

UTR #20: Unicode in XML and Other Markup Languages.

UTR #22: Character Mapping Markup Language (CharMapML).

UTR #26: Compatibility Encoding Scheme for UTF-16: 8-Bit (CESU-8).

Other Unicode References.

Unicode Technical Notes.

FAQ (Frequently Asked Questions).

Charts.

Conferences.

Policies.

Updates and Errata.

Versions.

Where Is My Character?

C. Relationship to ISO/IEC 10646.

History.

Unicode 1.0.

Unicode 2.0.

Unicode 3.0.

Unicode 4.0.

Encoding Forms in ISO/IEC 10646.

Zero Extending.

UCS Transformation Formats.

UTF-8.

UTF-16.

Synchronization of the Standards.

Identification of Features for the Unicode Standard.

Character Names.

Character Functional Specifications.

D. Changes from Unicode Version 3.0.

Versions of the Unicode Standard.

Changes from Unicode Version 3.0 to Version 3.1.

New Characters Added.

Unicode Character Database Changes.

Changes Affecting Conformance.

Unicode Standard Annexes.

Changes from Unicode Version 3.1 to Version 3.2.

New Characters Added.

Unicode Character Database Changes.

Changes Affecting Conformance.

Unicode Standard Annexes.

Changes from Unicode Version 3.2 to Version 4.0.

New Characters Added.

Unicode Character Database Changes.

Changes Affecting Conformance.

Unicode Standard Annexes.

Errata.

G. Glossary.

R. References.

Source Standards and Specifications.

Source Dictionaries for Han Unification.

Other Sources for the Unicode Standard.

Selected Resources: Technical.

Selected Resources: Scripts and Languages.

I. Indices.

Unicode Names Index.

General Index.

Book News Annotation:

The Unicode Standard defines a consistent way of encoding multilingual text necessary for easy electronic exchange of text data and is the default encoding of HTML and XML on the Web. Treating all characters (textual, ideographic, and symbolic) equivalently, it requires no control codes or escape sequences. It is also conformant with international standard ISO/IEC 10646. Version 4.0 has added Old Italic, Gothic, Deseret, Shavian, Cypriot, Ugaritic, Tai Le, and other scripts and has doubled the number of Han ideographs. This text explains the use of the standard and provides code charts for 96,248 common characters. The CD-ROM contains the entire Unicode Character Database, as well as the text of the Unicode Standard Annexes. Annotation (c)2003 Book News, Inc., Portland, OR (booknews.com)

Synopsis:

A comprehensive guide to the Unicode programming standard, created and authorized by the Unicode Consortium. The accompanying CD-ROM contains the entire Unicode Character Database, plus other materials.

Synopsis:

The authoritative guide to universal character encoding

The official way to implement ISO/IEC 10646

The key to advancing global interoperability in information technology products

Unicode 4.0: The Unicode Standard

The Unicode Standard provides a unique code number for every character in electronic text, no matter what the platform, no matter what the application, no matter what the language. It is required for XML and is at the core of modern software products. Unicode 4.0 contains 96,248 characters covering languages of the world. The Unicode Standard contains extensive descriptions of each writing system, as well as definitions of character properties and detailed conformance requirements. It is the complete and definitive user's guide for novices and experts alike.
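As a rough illustration of the "unique code number for every character" idea, here is a minimal Python sketch. It is not taken from the book: the helper names `describe` and `encoded_lengths` are made up for this example, and Python's standard `unicodedata` module is built against a newer release of the Unicode Character Database than Version 4.0, so treat it only as a convenient way to look up characters that already existed in 4.0.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for one character (hypothetical helper)."""
    # ord() gives the code point; unicodedata.name() looks up the character
    # name in the Unicode Character Database bundled with Python.
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


def encoded_lengths(ch: str) -> dict:
    """Byte length of one character in each Unicode encoding form."""
    # The code point stays the same; only its byte representation differs.
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }


if __name__ == "__main__":
    # Latin, Greek, Han, and a supplementary-plane Deseret letter.
    for ch in ["A", "\u03a9", "\u4e2d", "\U00010400"]:
        print(describe(ch), encoded_lengths(ch))
```

The Deseret letter lies outside the Basic Multilingual Plane, so it takes four bytes in UTF-16 (a surrogate pair), while the other three samples take two.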

This edition, The Unicode Standard, Version 4.0, adds 47,188 new characters for minority and historic scripts, several sets of symbols, and a very large collection of additional CJK ideographs. It provides updated specifications covering structure, conformance, character behavior and semantics, as well as implementation guidelines, detailed discussions of writing systems, comprehensive charts, and an extensive glossary. The accompanying CD-ROM includes the text of all the Unicode Standard Annexes and the entire Unicode Character Database.


About the Author

The Unicode Consortium is a non-profit organization founded to develop, extend, and promote the use of the Unicode Standard. The membership of the Consortium represents a broad spectrum of corporations and organizations in the computer and information processing industry. The Unicode Consortium actively cooperates with many of the leading standards development organizations, including ISO/IEC JTC1, W3C, IETF, and ECMA.


Table of Contents

Acknowledgments.

Unicode Consortium Members and Directors.

Figures.

Tables.

Preface.

1. Introduction.

Coverage.

Standards Coverage.

New Characters.

Design Goals.

Text Handling.

Interpreting Characters.

Text Elements.

The Unicode Standard and ISO/IEC 10646.

The Unicode Consortium.

The Unicode Technical Committee.

Submitting New Characters.

2. General Structure.

Architectural Context.

Basic Text Processes.

Text Elements, Characters, and Text Processes.

Text Processes and Encoding.

Unicode Design Principles.

Universality.

Efficiency.

Characters, Not Glyphs.

Semantics.

Plain Text.

Logical Order.

Unification.

Dynamic Composition.

Equivalent Sequences.

Convertibility.

Compatibility Characters.

Compatibility Characters.

Compatibility Decomposable Characters.

Mapping Compatibility Characters.

Code Points and Characters.

Types of Code Points.

Encoding Forms.

UTF-32.

UTF-16.

UTF-8.

Comparison of the Advantages of UTF-32, UTF-16, and UTF-8.

Encoding Schemes.

Unicode Strings.

Unicode Allocation.

Planes.

Allocation Areas and Character Blocks.

Details of Allocation.

Assignment of Code Points.

Writing Direction.

Combining Characters.

Sequence of Base Characters and Diacritics.

Multiple Combining Characters.

Ligated Multiple Base Characters.

Spacing Clones of European Diacritical Marks.

"Characters" and Grapheme Clusters.

Special Characters and Noncharacters.

Byte Order Mark (BOM).

Special Noncharacter Code Points.

Layout and Format Control Characters.

The Replacement Character.

Control Codes.

Conforming to the Unicode Standard.

Supported Subsets.

Related Publications.

3. Conformance.

Versions of the Unicode Standard.

Stability.

Version Numbering.

Errata, Corrigenda, and Future Updates.

References to the Unicode Standard.

References to Unicode Character Properties.

References to Unicode Algorithms.

Conformance Requirements.

Byte Ordering.

Unassigned Code Points.

Interpretation.

Modification.

Character Encoding Forms.

Character Encoding Schemes.

Bidirectional Text.

Normalization Forms.

Normative References.

Unicode Algorithms.

Default Casing Operations.

Unicode Standard Annexes.

Semantics.

Definitions.

Character Identity and Semantics.

Characters and Encoding.

Properties.

Normative and Informative Properties.

Simple and Derived Properties.

Property Aliases.

Default Property Values.

Private Use.

Combination.

Decomposition.

Compatibility Decomposition.

Canonical Decomposition.

Surrogates.

Unicode Encoding Forms.

UTF-32.

UTF-16.

UTF-8.

Encoding Form Conversion.

Unicode Encoding Schemes.

Canonical Ordering Behavior.

Application of Combining Marks.

Combining Classes.

Canonical Ordering.

Canonical Ordering and Collation.

Conjoining Jamo Behavior.

Hangul Syllable Boundaries.

Standard Korean Syllables.

Hangul Syllable Composition.

Hangul Syllable Decomposition.

Hangul Syllable Names.

Default Case Operations.

Definitions.

Case Conversion of Strings.

Case Detection for Strings.

Caseless Matching.

4. Character Properties.

Unicode Character Database.

Case-Normative.

Case Mapping.

Combining Classes-Normative.

Directionality-Normative.

General Category-Normative.

Numeric Value-Normative.

Ideographic Numeric Values.

Bidi Mirrored-Normative.

Unicode 1.0 Names.

Letters, Alphabetic, and Ideographic.

Boundary Control.

Characters with Unusual Properties.

5. Implementation Guidelines.

Transcoding to Other Standards.

Issues.

Multistage Tables.

ANSI/ISO C wchar_t.

Unknown and Missing Characters.

Reserved and Private-Use Character Codes.

Interpretable but Unrenderable Characters.

Default Property Values.

Default Ignorable Code Points.

Interacting with Downlevel Systems.

Handling Surrogate Pairs in UTF-16.

Handling Numbers.

Normalization.

Compression.

Newline Guidelines.

Definitions.

Background.

Recommendations.

Regular Expressions.

Language Information in Plain Text.

Requirements for Language Tagging.

Language Tags and Han Unification.

Editing and Selection.

Consistent Text Elements.

Strategies for Handling Nonspacing Marks.

Keyboard Input.

Truncation.

Rendering Nonspacing Marks.

Canonical Equivalence.

Positioning Methods.

Locating Text Element Boundaries.

Identifiers.

Property-Based Identifier Syntax.

Syntactic Rule.

Alternative Recommendation.

Sorting and Searching.

Culturally Expected Sorting and Searching.

Language-Insensitive Sorting.

Searching.

Sublinear Searching.

Binary Order.

UTF-8 in UTF-16 Order.

UTF-16 in UTF-8 Order.

Case Mappings.

Complications for Case Mapping.

Reversibility.

Caseless Matching.

Normalization.

Unicode Security.

Default Ignorable Code Points.

6. Writing Systems and Punctuation.

Writing Systems.

General Punctuation.

Punctuation: U+0020-U+00BF.

General Punctuation: U+2000-U+206F.

CJK Symbols and Punctuation: U+3000-U+303F.

CJK Compatibility Forms: U+FE30-U+FE4F.

Small Form Variants: U+FE50-U+FE6F.

7. European Alphabetic Scripts.

Latin.

Letters of Basic Latin: U+0041-U+007A.

Letters of the Latin-1 Supplement: U+00C0-U+00FF.

Latin Extended-A: U+0100-U+017F.

Latin Extended-B: U+0180-U+024F.

IPA Extensions: U+0250-U+02AF.

Phonetic Extensions: U+1D00-U+1D6A.

Latin Extended Additional: U+1E00-U+1EFF.

Latin Ligatures: U+FB00-U+FB06.

Greek.

Greek: U+0370-U+03FF.

Greek Extended: U+1F00-U+1FFF.

Cyrillic.

Cyrillic: U+0400-U+04FF.

Cyrillic Supplement: U+0500-U+052F.

Armenian.

Armenian: U+0530-U+058F.

Georgian.

Georgian: U+10A0-U+10FF.

Modifier Letters.

Spacing Modifier Letters: U+02B0-U+02FF.

Combining Marks.

Combining Diacritical Marks: U+0300-U+036F.

Combining Marks for Symbols: U+20D0-U+20FF.

Combining Half Marks: U+FE20-U+FE2F.

8. Middle Eastern Scripts.

Hebrew.

Hebrew: U+0590-U+05FF.

Alphabetic Presentation Forms: U+FB1D-U+FB4F.

Arabic.

Arabic: U+0600-U+06FF.

Cursive Joining.

Ligatures.

Arabic Presentation Forms-A: U+FB50-U+FDFF.

Arabic Presentation Forms-B: U+FE70-U+FEFF.

Syriac.

Syriac: U+0700-U+074F.

Syriac Shaping.

Syriac Cursive Joining.

Ligatures.

Thaana.

Thaana: U+0780-U+07BF.

9. South Asian Scripts.

Devanagari.

Devanagari: U+0900-U+097F.

Bengali.

Bengali: U+0980-U+09FF.

Gurmukhi.

Gurmukhi: U+0A00-U+0A7F.

Gujarati.

Gujarati: U+0A80-U+0AFF.

Oriya.

Oriya: U+0B00-U+0B7F.

Tamil.

Tamil: U+0B80-U+0BFF.

Telugu.

Telugu: U+0C00-U+0C7F.

Kannada.

Kannada: U+0C80-U+0CFF.

Malayalam.

Malayalam: U+0D00-U+0D7F.

Sinhala.

Sinhala: U+0D80-U+0DFF.

Tibetan.

Tibetan: U+0F00-U+0FFF.

Limbu.

Limbu: U+1900-U+194F.

10. Southeast Asian Scripts.

Thai.

Thai: U+0E00-U+0E7F.

Lao.

Lao: U+0E80-U+0EFF.

Myanmar.

Myanmar: U+1000-U+109F.

Khmer.

Khmer: U+1780-U+17FF.

Khmer Symbols: U+19E0-U+19FF.

Tai Le.

Tai Le: U+1950-U+197F.

Philippine Scripts.

Tagalog: U+1700-U+171F.

Hanunoo: U+1720-U+173F.

Buhid: U+1740-U+175F.

Tagbanwa: U+1760-U+177F.

11. East Asian Scripts.

Han.

CJK Unified Ideographs.

CJK Unified Ideographs Ext. B: U+20000-U+2A6D6.

CJK Compatibility Ideographs: U+F900-U+FAFF.

CJK Compatibility Supplement: U+2F800-U+2FA1D.

Kanbun: U+3190-U+319F.

CJK and KangXi Radicals: U+2E80-U+2FD5.

Ideographic

Product Details

ISBN:
9780321185785
Subtitle:
The Unicode Consortium [With CDROM]
Editor:
Aliprand, Joan
Editor:
Becker, Joe
Editor:
Allen, Julie
Author:
The Unicode Consortium
Publisher:
Addison Wesley Publishing Company
Location:
Boston
Subject:
Programming Languages - General
Subject:
Programming - General
Subject:
Unicode (Computer character set)
Subject:
Unicode
Subject:
Data Processing - General
Subject:
Data processing
Copyright:
Series Volume:
relatório no 6/2002
Publication Date:
August 2003
Binding:
Hardcover
Grade Level:
Professional and scholarly
Language:
English
Illustrations:
Yes
Pages:
1462
Dimensions:
29 cm. +


Powell's City of Books is an independent bookstore in Portland, Oregon, that fills a whole city block with more than a million new, used, and out of print books. Shop those shelves — plus literally millions more books, DVDs, and eBooks — here at Powells.com.