Working with Chinese Text in JavaScript

Alex Amies April 27, 2009

Introduction

This article is a follow-on to the article Processing Chinese Text with PHP, which discussed server-side processing of Chinese text in PHP and laid a foundation for creating interactive web applications that either use Chinese text or provide tools for working with Chinese text. This article extends that by discussing the principles of making more responsive and usable applications with JavaScript. The goal is to help developers of Chinese language or English / Chinese bilingual sites with some JavaScript that will give their users a better experience. The article also gives some insight into the language tools used at chinesenotes.com.

This article introduces processing Chinese text with JavaScript. It starts with some simple examples and goes on to discuss, with examples, the areas where JavaScript is strong and where it is weak. In addition to basic JavaScript features, I will discuss some libraries, such as Prototype, Dojo, and Google. There are a number of other very good toolkits that I do not discuss, such as MochiKit, YUI (Yahoo), and EXT.

The JavaScript language is written using the Unicode character set. This means that it can represent nearly any human language. ECMAScript versions 1 and 2 only allow Unicode characters in comments and quoted string literals, but ECMAScript version 3 allows Unicode characters anywhere in a JavaScript program. This gives us a great head start in processing Chinese text. However, some internationalization features are missing, such as a way to externalize display text based on the user's locale, as Java resource bundles do. Sorting and comparing strings is another area where JavaScript lacks out-of-the-box support for Chinese text. In any case, the focus of this article will be on the things that are special to Chinese text, problems specific to web sites that aim to help people learn Chinese, and Chinese language web sites that want to be more friendly to people with limited Chinese reading ability.
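As a minimal sketch of these points, the snippet below shows Chinese characters in string literals and what the default array sort does with them. The variable and function names are illustrative only, and it assumes the source file is saved and served as UTF-8.

```javascript
// Chinese characters can be typed directly in a string literal or
// written as \uXXXX escapes; both forms produce the same string.
var greeting = "你好";
var escaped = "\u4F60\u597D";
var sameString = (greeting === escaped);

// length counts 16-bit code units; each of these characters is one unit.
var unitCount = greeting.length;

// Hypothetical helper: format the code unit at a position as U+XXXX.
function toCodeLabel(str, index) {
  var hex = str.charCodeAt(index).toString(16).toUpperCase();
  while (hex.length < 4) {
    hex = "0" + hex;
  }
  return "U+" + hex;
}

// The default sort compares code unit values, so the result follows
// Unicode order (一 U+4E00, 中 U+4E2D, 你 U+4F60) rather than
// pinyin order, which would be 你 (ni), 一 (yi), 中 (zhong).
var words = ["中", "一", "你"];
var sorted = words.slice().sort();
```

Here `sorted` ends up in Unicode order, which is rarely what a reader of Chinese expects. Ordering by pinyin or stroke count needs extra data, such as a lookup table from characters to readings, that the core language does not supply.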

The version of JavaScript used will be JavaScript 1.5, which is the version number from the Mozilla Foundation. This is equivalent to Edition 3 of the European Computer Manufacturers Association standard ECMA-262 (ECMAScript) and to Microsoft's JScript 5.5. The article will only discuss client-side JavaScript and the core JavaScript language. It will depend on the DOM implementations in Firefox 1.0 or later and Internet Explorer 5.0 or later. The article is applicable to other browsers, such as Safari, but I haven't taken the time to test any of the examples with other browsers. However, many parts of the article do not apply to older browsers that are 5 or more years old.

This article has something for those who have little background in JavaScript and for those who have more. The first few sections assume only a very minimal background, and experienced programmers may want to skip them. For an introduction to JavaScript or a refresher, please see the article A re-introduction to JavaScript [MDC2] or the very comprehensive book JavaScript: The Definitive Guide [FLAN]. Other generally good starting points are the Mozilla Developer Center JavaScript page [MDC3] and the Microsoft Developer Network site [MSDN]. You will also find the Core JavaScript 1.5 Reference [MDC1] at the Mozilla Developer Center and the Microsoft JScript Reference at the Microsoft Developer Network site.

References

Copyright Alex Amies 2008. Please send comments to alex@chinesenotes.com.