Corpus-based and corpus-driven research, although certainly no longer new, remains an exciting direction for researchers interested in lexicogrammatical patterns in learner writing. This corpus offers such researchers a collection of 3.7 million words written by learners of English from 16 mother tongue backgrounds (Bulgarian, Chinese, Czech, Dutch, Finnish, French, German, Italian, Japanese, Norwegian, Polish, Russian, Spanish, Swedish, Turkish, and Tswana). Each mother tongue subcorpus consists of approximately 200,000 words, except for the Chinese subcorpus, which consists of nearly 500,000 words. Argumentative essays make up 93% of the texts; the remaining 7% are literary analyses and responses to articles. A majority of the texts (approximately 65%) were written under untimed conditions, with unrestricted access to reference materials. However, the Chinese-Cantonese subcorpus, the largest of the mother tongue subcorpora, consists mainly of timed texts. The texts were produced by high-intermediate to advanced learners (B2-C2 proficiency bands of the Common European Framework of Reference), although the editors acknowledge that the proficiency level of the writers is a fuzzy variable and that...