Chinese Notes

Dictionary Style Guide

This page describes the things to consider when writing a dictionary entry. To add a word use the form at Add or Change Word. The design of the dictionary is described in the page Dictionary Design.

Except for the CEDICT project, which has a Creative Commons Attribution license, or the Unihan Database, which has a license that allows copying, never copy text directly. For sources subject to copyright, compare several sources and check them against the text in CEDICT, Wikipedia, Baidu Baike, and examples from the corpus. Use the most reasonable subject and/or reword the text. It is not necessary to reference the CEDICT project for specific entries because the dictionary is based on CEDICT. If published sources are used then they should be referenced in Harvard short style or with an abbreviation from the Help page.

Dictionary Entry Components


Headword

Headwords can be words or multi-word expressions. Multi-word expressions should be included if they are used as a unit, for example a company name, or are common expressions, for example 红色预警 'red alert.' A headword is created by adding one or more entries to the words.txt file. The chinesenotes software will group together multiple lexical units with the same simplified and traditional Chinese text into the same headword. If the term is very specific to a single organization then it should be added in a file of its own. An example of this is the fgs_mwe.txt file used for Fo Guang Shan specific terms. Each headword has a unique headword id.

The headword is strictly defined by the traditional Chinese text of the word. For example, the character xì could be any of the traditional characters , , or . In this case, there are three headwords, one for each traditional writing.

Word id

Each lexical unit has a unique word id. These are assigned in numeric sequence. There will generally be multiple lexical units for each headword. Each lexical unit will differ in word sense, grammar, or pronunciation.
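The relationship between lexical units and headwords can be illustrated with a short Python sketch. It is not the chinesenotes implementation: the LexicalUnit class, its field names, and the word ids below are assumptions made for illustration only. It groups lexical units that share the same simplified and traditional text, as described above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalUnit:
    """Hypothetical record for one lexical unit (not the real file schema)."""
    word_id: int
    simplified: str
    traditional: str  # left empty when identical to the simplified text
    pinyin: str
    english: str


def group_headwords(units):
    """Group lexical units that share simplified and traditional text."""
    groups = defaultdict(list)
    for unit in units:
        groups[(unit.simplified, unit.traditional)].append(unit)
    return dict(groups)


# Illustrative data only; the word ids are made up.
units = [
    LexicalUnit(1, "好", "", "hǎo", "good"),
    LexicalUnit(2, "好", "", "hào", "to be fond of"),
    LexicalUnit(3, "红色预警", "紅色預警", "hóngsè yùjǐng", "red alert"),
]
headwords = group_headwords(units)
```

Here the two lexical units for 好 differ in pronunciation and word sense but fall under one headword, while 红色预警 forms a headword of its own.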

Simplified Chinese

The Simplified Chinese field is compulsory. If the simplified Chinese text is the same as the traditional Chinese then add it here and leave the traditional Chinese field blank. For single character word entries check the Unihan Database.

Traditional Chinese

The Traditional Chinese field is only added if the traditional Chinese text is different from the simplified Chinese. The Unihan Database (“Unicode Han Database,” 2015) should be consulted for mappings between simplified and traditional Chinese characters.
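The rule for the two script fields can be sketched as a small Python helper. The function name traditional_field is hypothetical and is not part of the chinesenotes software; the sketch only restates the rule that the traditional field stays blank when it would repeat the simplified text.

```python
def traditional_field(simplified, traditional):
    """Return the value to store in the Traditional Chinese field.

    The simplified text is compulsory. The traditional text is stored
    only when it differs from the simplified text.
    """
    if not simplified:
        raise ValueError("The Simplified Chinese field is compulsory")
    return "" if traditional == simplified else traditional


stored_fengli = traditional_field("凤梨", "鳳梨")   # differs, so it is kept
stored_shuiguo = traditional_field("水果", "水果")  # identical, so left blank
```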


Pinyin

The Pinyin field is mandatory. Add the pinyin text with the proper diacritics to indicate tone. For example, 凤梨 fènglí. Use the Format Pinyin tool to add the diacritics.
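For readers who draft pinyin with tone numbers, the following Python sketch shows one common way to place the diacritics. It is not the Format Pinyin tool and makes no claim about how that tool works; the function names are hypothetical. It assumes the usual placement convention: the mark goes on a or e if present, on the o of ou, and otherwise on the last vowel. Tone 5 is treated as the neutral tone and left unmarked.

```python
import re

# Tone-marked forms of each vowel, indexed by tone number minus one.
TONE_MARKS = {
    "a": "āáǎà", "e": "ēéěè", "i": "īíǐì",
    "o": "ōóǒò", "u": "ūúǔù", "ü": "ǖǘǚǜ",
}


def mark_syllable(syllable, tone):
    """Place the tone mark for tones 1-4; tone 5 (neutral) is unmarked."""
    if tone == 5:
        return syllable
    lower = syllable.lower()
    if "a" in lower:
        idx = lower.index("a")
    elif "e" in lower:
        idx = lower.index("e")
    elif "ou" in lower:
        idx = lower.index("ou")
    else:
        vowels = [i for i, ch in enumerate(lower) if ch in TONE_MARKS]
        if not vowels:
            return syllable  # nothing to mark, e.g. a bare consonant
        idx = vowels[-1]
    marked = TONE_MARKS[lower[idx]][tone - 1]
    if syllable[idx].isupper():
        marked = marked.upper()
    return syllable[:idx] + marked + syllable[idx + 1:]


def format_pinyin(text):
    """Convert numbered pinyin such as 'feng4li2' to diacritic form."""
    return re.sub(
        r"([A-Za-züÜ]+)([1-5])",
        lambda m: mark_syllable(m.group(1), int(m.group(2))),
        text,
    )


formatted = format_pinyin("feng4li2")  # 'fènglí'
```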


Grammar

The Grammar field is mandatory. Use one of the values in grammar.txt.

English equivalents

The English field is mandatory. It should contain direct translations or near synonyms in English delimited by '/'. Commas, periods, and other punctuation other than double quotes can be included. Do not use plain ASCII double quotes ("). If you have to use double quotes, use Unicode stylized double quotes like “”. Provide several English equivalents of the Chinese source word, because one English equivalent will rarely be adequate in all situations.
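The rules for the English field lend themselves to a simple automated check. The Python sketch below is an illustration only; check_english is a hypothetical helper, not a function of the chinesenotes software, and the message texts are made up. It returns a list of problems, which is empty when the field follows the rules above.

```python
def check_english(english):
    """Return a list of style problems found in an English field value."""
    if not english.strip():
        return ["the English field is mandatory"]
    problems = []
    if '"' in english:
        problems.append("ASCII double quote found; use “ ” instead")
    # Equivalents are delimited by '/'; surrounding spaces are ignored.
    equivalents = [part.strip() for part in english.split("/")]
    if any(not part for part in equivalents):
        problems.append("empty equivalent between '/' delimiters")
    if len(equivalents) < 2:
        problems.append("consider giving more than one English equivalent")
    return problems


ok = check_english("pineapple / ananas")
bad = check_english('a "red alert"')
```

The check for a single equivalent is advisory rather than a hard rule, since the guide asks for several equivalents without making them compulsory.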


Concept

Concept is an optional field indicating a hyperonym, that is, the kind of thing that a word refers to. For example, for 凤梨 'pineapple' the concept is (水果, Fruit). If you enter a value for concept you must add both simplified Chinese and English values. There is no fixed list of concepts.

Domain Labels

Domain Label fields are mandatory. Domain labels indicate specialized fields that words are used in, such as linguistics or medicine. In this dictionary domain labels are critical for separating modern Chinese, literary Chinese, and Buddhist vocabulary. Other values are helpful for dictionary users. For example, in the entry for 凤梨 'pineapple' the domain is (饮食, Food and Drink). The topic_cn and topic_en fields, in Chinese and English respectively, are used for this purpose. The values must be chosen from the list in topics.txt. The parent_cn and parent_en fields hold subdomains, which are optional. For example, History | China, i.e. Chinese History as a subdomain of the domain History. There is no fixed list for subdomains. They can have any value but, if a value is provided, then both Simplified Chinese and English should be provided.
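The domain and subdomain rules can be expressed as a small check, sketched here in Python. This is an assumption-laden illustration: check_domain is a hypothetical helper, the entry is modeled as a plain dict keyed by the field names above, and the topic list is passed in as a set of (Chinese, English) pairs because the layout of topics.txt is not described on this page. The sample topic values are illustrative and may not match the real list.

```python
def check_domain(entry, topics):
    """Return a list of problems with the domain and subdomain fields."""
    problems = []
    topic = (entry.get("topic_cn", ""), entry.get("topic_en", ""))
    if topic not in topics:
        problems.append(f"domain {topic} is not in the topic list")
    # Subdomains are optional, but need both languages when given.
    has_cn = bool(entry.get("parent_cn"))
    has_en = bool(entry.get("parent_en"))
    if has_cn != has_en:
        problems.append("subdomain needs both Chinese and English values")
    return problems


# Illustrative topic list; the real values come from topics.txt.
topics = {("饮食", "Food and Drink"), ("历史", "History")}

pineapple = {"topic_cn": "饮食", "topic_en": "Food and Drink"}
half_subdomain = {"topic_cn": "历史", "topic_en": "History", "parent_en": "China"}

pineapple_problems = check_domain(pineapple, topics)
subdomain_problems = check_domain(half_subdomain, topics)
```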


Notes

Try to provide some notes so that users can find out more about a word. The notes section provides additional information about the word. Examples should illustrate the word sense with an English gloss of the Chinese example. The English gloss is given in single quotes. The source of the example is given in English or pinyin and Chinese. Examples should be taken from the corpus, so that they are real, and should not be invented. Citations should be to items listed in the References page. Explanations of collocations may be added as well, preferably with an example. If an example from the corpus is too long or contains extraneous words, replace the omitted words with '...'. Make sure that there are entries for any Chinese words used in the notes.

Synonyms in Chinese may be provided to illustrate the word sense. Synonyms should exist in the dictionary. If the synonym does not exist then add it.

Definitions or explanations of words may be provided, especially for concepts that do not have native English equivalents. This is important for historic and Buddhist words, so that the user does not need to consult further references to understand a concept. Definitions of common concepts are difficult to write and may not be very useful. It may be more useful to the reader to cite a metaphor.

If the word is another way of writing a word that is fully defined elsewhere then use the text 'See {main headword}.' For example, the notes for 北京市 Běijīng shì 'Beijing Municipality' are 'See 北京.'

Examples are encouraged but should be based on evidence. That evidence may be based on usage in the corpus, especially collocations, or idioms (chengyu). The compact form of chengyu and historic roots make chengyu good candidates for examples to illustrate word senses for the individual words making up the chengyu.

Hyperonyms may optionally be used in the way that definitions in monolingual dictionaries are frequently formed (Svensén 2009, pp. 248-249). For example, square: a quadrilateral with four equal sides and four right angles.


For more on specific types of entries see the Dictionary Entry Templates page.


References

  1. Atkins, B.T.S. & Rundell, M., 2008. The Oxford Guide to Practical Lexicography. Oxford University Press, Oxford.
  2. Svensén, B., 2009. A Handbook of Lexicography: The Theory and Practice of Dictionary-Making. Cambridge University Press, New York.