Comprehensive Language Knowledge Base (Institute of Computational Linguistics)

The Comprehensive Language Knowledge Base (CLKB) has been built by the Peking University Institute of Computational Linguistics since 1986. CLKB comprises 6 language knowledge bases, 10 specifications and standards, basic software tools, and four application systems, which support one another and form an integrated whole. The CLKB series covers linguistic units at the word, phrase, sentence, and discourse levels, as well as lexical, syntactic, and semantic aspects. It extends from Chinese to multiple languages and from the general domain to specialized domains.

1 to 6 of 6 Results
Mar 8, 2018
Yu, Shiwen; Duan, Huiming; Wu, Yunfang, 2018, "Corpus of Multi-level Processing for Modern Chinese", http://doi.org/10.18170/DVN/SEYRX5, Peking University Open Research Data Platform, V1
The Peking University Institute of Computational Linguistics began research on multi-level processing of modern Chinese in 1992, and annotated a corpus of the 1998 People's Daily between April 1999 and April 2002. The modern Chinese multi-level processing corpus includes 52 mill...
Jan 12, 2018
Sui, Zhifang; Yu, Shiwen, 2018, "Multi-domain Chinese-English Terminology Database", http://doi.org/10.18170/DVN/PUDSHB, Peking University Open Research Data Platform, V1
Terminology is a condensed form of specialized domain knowledge. In the practice of domain knowledge engineering, the Peking University Institute of Computational Linguistics has accumulated a number of terminology databases in specialized fields, including: Sports Terminology Datab...
Jan 11, 2018
Liu, Yang; Yu, Shiwen, 2017, "Multilingual Concept Dictionary", http://doi.org/10.18170/DVN/JAU6RB, Peking University Open Research Data Platform, V2
(1) The Chinese Concept Dictionary (CCD) provides Chinese equivalents for the English concepts in WordNet 1.6. The total number of concepts is close to 100,000 (and the total number of words far exceeds 100,000), including 66,025 noun concepts, 12,127 of ver...
Jan 3, 2018
Yu, Shiwen; Zhu, Xuefeng, 2017, "A dictionary of modern Chinese grammar information", http://doi.org/10.18170/DVN/EDQWIL, Peking University Open Research Data Platform, V3
A dictionary of modern Chinese grammar information, containing 3.6 million grammatical attributes for 80,000 words.
Jan 3, 2018
Yu, Shiwen, 2018, "Knowledge Base of Phrase Structure in Modern Chinese", http://doi.org/10.18170/DVN/NPDNSO, Peking University Open Research Data Platform, V1
The Knowledge Base of Phrase Structure in Modern Chinese contains 676 structural rules for Chinese phrases (including compound words), all of which are context-free grammar rules. This release includes three sample libraries: 160 rules that contain the adjective, 184 rules that co...
Jan 3, 2018
Yu, Shiwen, 2018, "CLKB Common Material", http://doi.org/10.18170/DVN/XR0STB, Peking University Open Research Data Platform, V1
This dataset holds relevant information about the CLKB, such as the introduction of CLKB, award certificates and related information about the authors.