Poster: Due mila parole Italiane: Wordlists for Language Acquisition in Schools

The MultiLingProfiler, a lexical profiling tool, supports language education for French, Spanish, and German in British schools. With plans to include Italian, a wordlist of 2,000 Italian lemmas for teaching up to B1 level was curated, drawing on data from several Italian corpora. Despite challenges such as regional variation and the influence of the underlying data on frequency lists, this resource highlights the need for nuanced approaches in lexical profiling for Italian language education.

The MultiLingProfiler (MLP) is a lexical profiling tool created by the University of York in collaboration with the Department of Education (Finlayson et al., 2023, 4). It is used for French, Spanish and German language education in British schools, but not yet for Italian. A wordlist of 2,000 Italian lemmas has therefore been developed to facilitate language teaching with the MLP up to level B1 of the Common European Framework of Reference for Languages (CEFR) (Finlayson et al., 2023, 12).
As a first step, data from four Italian corpora were compiled: the Perugia Corpus, a corpus of written and spoken Italian; the Corpus pilota di italiano parlato; the Corpus Certificati di Lingua Italiana, a corpus of written Italian by learners with a different L1; and the Corpus di apprendenti di Italiano, which is likewise based on learner data. In total, the compiled dataset consists of almost 27 million words of written and spoken Italian, which were then merged, sorted and tagged by word class.
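The merge-and-sort step can be illustrated with a minimal sketch. It assumes each corpus has already been reduced to counts keyed by (lemma, word class); the function names, the tag labels and the toy numbers are hypothetical and are not taken from the project itself.

```python
from collections import Counter


def merge_frequency_lists(corpora):
    """Sum (lemma, word class) counts across several corpora."""
    merged = Counter()
    for counts in corpora:
        merged.update(counts)  # Counter.update adds counts, it does not overwrite
    return merged


def top_lemmas(merged, n):
    """Return the n most frequent entries; ties are broken alphabetically."""
    ranked = sorted(merged.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Toy counts standing in for per-corpus frequency lists (invented numbers).
corpus_a = {("essere", "VERB"): 120, ("casa", "NOUN"): 40, ("di", "ADP"): 300}
corpus_b = {("essere", "VERB"): 80, ("scuola", "NOUN"): 25, ("di", "ADP"): 210}

merged = merge_frequency_lists([corpus_a, corpus_b])
wordlist = top_lemmas(merged, 3)
```

Keying the counts on both lemma and word class keeps entries that share a spelling but differ in word class apart during the merge. Raw counts are summed here, so a larger corpus weighs more heavily in the ranking.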
A wordlist of 2,000 lemmas was curated from the resulting frequency lists to ensure relevance for learners from multiple backgrounds. Regional variation is particularly pronounced in Italian, and because many concepts have several regional variants, standardization is not always possible. These so-called geosynonyms, as well as polysemes and homonyms, are not always easy to process or distinguish when referenced in the MultiLingProfiler. Additionally, as Finlayson and Marsden (2023, 143) point out, frequency lists are shaped by the available data, especially beyond the first 500 rows, so the resulting wordlist might differ from everyday Italian.
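The homonym problem can be made concrete with a minimal sketch of how a text might be profiled against a wordlist. This is not the MultiLingProfiler's actual implementation; the function name, the tiny wordlist and the pre-lemmatised input are assumptions made for illustration only.

```python
def profile(lemmas, wordlist):
    """Report the share of lemmas found on the wordlist and list the rest."""
    off_list = [lemma for lemma in lemmas if lemma not in wordlist]
    on_list_count = len(lemmas) - len(off_list)
    coverage = on_list_count / len(lemmas) if lemmas else 0.0
    return coverage, off_list


# Hypothetical mini wordlist and a lemmatised version of "La pesca è matura".
wordlist = {"il", "essere", "pesca"}
text = ["il", "pesca", "essere", "maturo"]

coverage, off_list = profile(text, wordlist)
# "pesca" counts as on-list whether it means "peach" or "fishing":
# a lookup on the bare lemma cannot tell the two senses apart.
```

A lookup of this kind treats every occurrence of a spelling as the same item, which is why homonyms, polysemes and regional variants need separate handling.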
This work constitutes a foundational resource for Italian language education, highlighting the need for nuanced approaches in lexical profiling tools.

Info

Day: 2024-05-10
Start time: 10:30
Duration: 00:10
Room: Poster titles
Track: Applied Linguistics
Language: en
