ILC4CWALM

CNR-ILC contribution to the project CWALM
A lexical corpus-based model of Contemporary Written Arabic
(MUR | PRIN 2020)

CWALM provided the scientific community of Social Sciences and Humanities with:

  • A new theoretical approach that moves beyond the traditional description of the Arabic linguistic system.
  • A final test model that aims to be the first large-scale validated CWA resource.
  • A lexicographic resource that could have a positive social impact on Arabic-speaking communities.

For more details, please visit:


Language Resources and Tools

A growing set of resources and tools to support research on Moroccan Arabic


CWALM Linguistic Resources

DiMorph Tool

DiMorph (Dialect Morphological Analyzer) is a tool designed specifically for dialectal Arabic and, so far, applied to the Moroccan dialect.
It supports automatic Part-of-Speech (POS) tagging and syntactic analysis, enabling researchers to efficiently explore dialectal data.

Corpus

The corpus consists of Moroccan dialect data.
We collected the data from Facebook to build a rich and diverse corpus of user-generated content, with the aim of broad coverage of colloquial usage.

Lexical Resources

Lexical resources are integrated into our DiMorph tool. So far, they are provided as tabular text files (.txt).
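
As an illustration of how a tabular .txt resource might be loaded, here is a minimal Python sketch. It assumes a tab-separated, UTF-8 file with a header row. The column names (form, lemma, pos) and the sample entry are hypothetical placeholders, not the actual layout of the CWALM files, so check the real files in the repository before relying on it.

```python
import csv
import io


def read_tabular(stream, delimiter="\t"):
    """Read a delimited text stream into a list of dicts keyed by the header row.

    Assumption: the first line holds column names and fields are tab-separated.
    Quoting is disabled so quote characters in the text are kept literally.
    """
    reader = csv.DictReader(stream, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    return [dict(row) for row in reader]


# Hypothetical sample standing in for a real resource file.
sample = "form\tlemma\tpos\nكتاب\tكتاب\tNOUN\n"
rows = read_tabular(io.StringIO(sample))
print(rows[0]["pos"])
```

To read an actual file, pass an opened file object instead of the in-memory sample, for example open(path, encoding="utf-8", newline=""), where path points to one of the downloaded .txt files.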

All resources are openly accessible and distributed under the CC-BY license via the repository: