A cross-linguistic database of children's printed words in three Slavic languages

RG_CrosslinguisticPV2007.pdf (317.35 KB)
Published Version
Garabik, Radovan
Caravolas, Marketa
Kessler, Brett
Hoeflerova, Eva
Masterson, Jackie
Mikulajova, Marina
Szczerbinski, Marcin
Wierzchon, Piotr
We describe a lexical database consisting of morphologically and phonetically tagged words that occur in the texts primarily used for language arts instruction in the Czech Republic, Poland and Slovakia in the initial period of primary education (up to grade 4 or 5). The database aims to parallel the contents and usage of the British English Children's Printed Word Database. It contains words from texts of the most widely used Czech, Polish and Slovak textbooks. The corpus is accessible via a simple WWW interface, allowing regular expression searches and Boolean expressions across word forms, lemmas, morphology tags and phonemic transcription, and providing useful statistics on the textwords included. We anticipate extensive usage of the database as a reference in the development of psychodiagnostic batteries for literacy impairments in the three languages, as well as for the creation of experimental materials in psycholinguistic research.
Language arts instruction, Slavic languages, Primary education, Psycholinguistic research
Garabík, R., Caravolas, M., Kessler, B., Höflerová, E., Masterson, J., Mikulajová, M., Szczerbiński, M., Wierzchoń, P. (2007). 'A cross-linguistic database of children’s printed words in three Slavic languages'. In Levická, J., & Garabík, R. (Eds.). Computer Treatment of Slavic and East European Languages: Fourth International Seminar, Bratislava, Slovakia, 25−27 October 2007: Proceedings (pp. 51−64). Bratislava: Tribun.