The Design and Construction of the Corpus of China English
Abstract: This paper describes the design and construction of the Corpus of China English (CCE). With the emergence of China English as a developing variety in the family of World Englishes, a growing body of research has explored its use in China. To provide a reliable resource for researchers in this field, CCE was built with due consideration given to representativeness and authenticity. The guiding design principles were authenticity, representativeness, and a manageable size. The corpus comprises 13,962,102 tokens in 15,333 texts, evenly divided among four genres: newspapers, magazines, fiction, and academic writing. The texts cover a wide range of domains, including news, finance, politics, the environment, society, culture, technology, sports, education, philosophy, and literature. The corpus is intended as a resource for research on China English, computational linguistics, natural language processing, and corpus linguistics. It is also a valuable resource for English language teaching in the context of English as a lingua franca (ELF), in which the theory and practice of English language teaching shift from teaching "Standard" English to teaching English as a lingua franca. With corpus evidence, English teaching textbooks and materials can be developed with due consideration given to China English.
2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence (ACAI 2020)