System News
Challenges in Developing Indic Scripts Support
Support, Processing and Presentation Considerations
October 15, 2002
Volume 56, Issue 3

Developers implementing Indic scripts in applications and systems face challenges such as language diversity, a lack of presentation standards, and inconsistencies in character support between ISCII and Unicode.

India officially recognizes 15 languages and their writing scripts: Hindi, Marathi, Sanskrit, Punjabi, Gujarati, Oriya, Bengali, Assamese, Telugu, Kannada, Malayalam, Tamil, Urdu, Sindhi and Kashmiri.

Of these, Urdu, Sindhi and Kashmiri are usually written in Perso-Arabic scripts, though sometimes in Devanagari. Apart from the Perso-Arabic scripts, the remaining 10 scripts used for Indian languages evolved from a common source, the ancient Brahmi script. Their shared phonetic structure makes a common character set possible. Unicode (ISO/IEC 10646) covers most recognized scripts in India today. However, the standard requires further elucidation to simplify its use.

The major scripts of India, including Devanagari, are encoded in Unicode so that comparable characters are in the same order and relative location within each script's block. This structural arrangement is based on the Indian national standard encoding for scripts, ISCII (Indian Script Code for Information Interchange). Unlike Unicode, ISCII is an 8-bit encoding that uses escape sequences to announce the particular Indic script being represented by a coded character sequence.
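The parallel layout described above can be checked directly against the Unicode character database. The following is a minimal sketch, not part of the original article: the block start values are taken from the published Unicode charts, and the names BLOCK_START, KA_OFFSET and letter_at are illustrative only.

```python
import unicodedata

# First code point of each 128-code-point Unicode block (per the Unicode charts).
BLOCK_START = {
    "Devanagari": 0x0900,
    "Bengali": 0x0980,
    "Gurmukhi": 0x0A00,
    "Gujarati": 0x0A80,
    "Oriya": 0x0B00,
    "Tamil": 0x0B80,
    "Telugu": 0x0C00,
    "Kannada": 0x0C80,
    "Malayalam": 0x0D00,
}

# Offset of the consonant KA inside every one of these blocks.
KA_OFFSET = 0x15


def letter_at(script: str, offset: int) -> str:
    """Return the character found at `offset` inside the block of `script`."""
    return chr(BLOCK_START[script] + offset)


if __name__ == "__main__":
    # Print the code point and official name of KA in each script.
    for script in BLOCK_START:
        ch = letter_at(script, KA_OFFSET)
        print(f"{script:<10} U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

Every script in the table reports a "LETTER KA" at the same offset, which is the "same order and relative location" property in practice. Not every offset is assigned in every script, so the correspondence holds only where a script actually has the comparable character.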

ISCII provides the Indian script character set in the upper 96 characters, while retaining the ASCII character set in the lower half. The Indian script keyboard overlay is designed for the standard English QWERTY overlay, ensuring that English can co-exist with Indian scripts.
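As a rough illustration of this 8-bit layout, the sketch below splits a byte string into ASCII runs and Indic runs. It assumes that "the upper 96 characters" means byte values 0xA0 through 0xFF and treats 0x80 through 0x9F as outside both ranges; the function names and the sample bytes are hypothetical and are not decoded into actual letters.

```python
from itertools import groupby


def classify_byte(value: int) -> str:
    """Classify one byte of an ISCII-style stream by its position in the code table."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a single byte value")
    if value < 0x80:
        return "ascii"   # lower half: unchanged ASCII
    if value >= 0xA0:
        return "indic"   # upper 96 positions (assumed to be 0xA0-0xFF)
    return "other"       # 0x80-0x9F: assumed not used for characters


def split_runs(data: bytes) -> list[tuple[str, bytes]]:
    """Group consecutive bytes of the same class into (class, bytes) runs."""
    return [(kind, bytes(group)) for kind, group in groupby(data, key=classify_byte)]


if __name__ == "__main__":
    # "Hi " followed by three arbitrary upper-half bytes and "!".
    sample = b"Hi \xb3\xda\xd4!"
    for kind, run in split_runs(sample):
        print(kind, run)
```

Because the lower half is plain ASCII, English text passes through such a stream unchanged, which is what lets English and Indian scripts share one document.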

Having a common code and keyboard for all the Indian scripts would yield many advantages. Software that uses ISCII codes for Indian scripts would be more commercially viable and would allow immediate transliteration between different Indian scripts, simply by changing display modes. Simultaneous availability of multiple Indian languages will accelerate the scripts' development and facilitate national integration.
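The transliteration idea can be sketched with Unicode rather than ISCII bytes, since the Unicode Indic blocks share the ISCII-derived ordering. The example below is an approximation under that assumption: it shifts each character from the source block to the same offset in the target block and leaves a character untouched when the target position is unassigned. The names BLOCK_START and transliterate are illustrative, and real transliteration needs script-specific handling that this sketch omits.

```python
import unicodedata

# First code point of each 128-code-point Unicode block (per the Unicode charts).
BLOCK_START = {
    "Devanagari": 0x0900,
    "Bengali": 0x0980,
    "Gurmukhi": 0x0A00,
    "Gujarati": 0x0A80,
    "Oriya": 0x0B00,
    "Tamil": 0x0B80,
    "Telugu": 0x0C00,
    "Kannada": 0x0C80,
    "Malayalam": 0x0D00,
}
BLOCK_SIZE = 0x80


def transliterate(text: str, source: str, target: str) -> str:
    """Shift characters of `source` script to the same offsets in `target` script."""
    src_base = BLOCK_START[source]
    dst_base = BLOCK_START[target]
    result = []
    for ch in text:
        offset = ord(ch) - src_base
        if 0 <= offset < BLOCK_SIZE:
            candidate = chr(dst_base + offset)
            # Only substitute when the target code point is an assigned character.
            if unicodedata.name(candidate, ""):
                ch = candidate
        result.append(ch)
    return "".join(result)


if __name__ == "__main__":
    word = "\u0915\u092e\u0932"  # Devanagari letters KA, MA, LA
    print(transliterate(word, "Devanagari", "Bengali"))
```

Characters outside the source block, such as ASCII letters and digits, are passed through unchanged, mirroring the way English text co-exists with Indic text in ISCII.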

Sun has actively participated in Open Source initiatives for Indic scripts and has led the Indian language support for Mozilla™. Sun was one of the first organizations to use the PLS (Portable Layout Services) APIs, developed by OpenGroup/X-Open more than eight years ago, for its CTL (Complex Text Layout) script implementation in the Solaris™ Operating Environment (Solaris OE). Support was initially provided for Thai, Arabic and Hebrew, and more recently for Indic scripts in Solaris 9 OE.

Sun added Hindi support to CDE/Motif in the latest release of Solaris 9 OE; another seven Indian scripts will be added in subsequent update releases of Solaris 9 OE. At present, only Sun supports Indian scripts (CTL scripts) in CDE/Motif. Sun is also in the process of moving from CDE to GNOME as the default desktop.

On the Java™ platform, areas of development include "pluggable" locales and the creation of a greater variety of locales (e.g. th_TH_TH). For the desktop, Sun has also added Hindi calendar support, with text rendering optimized throughout.

For more information, see:

http://www.sun.com/developers/gadc/technicalpublications/articles/indic/indic1.html
