Database and Annotation Tool for Computational Modeling of Arabic Nominal Gender, Number and Rationality Morphosyntax
Technology #cu14137
- License this Technology
- NON-COMMERCIAL SOFTWARE LICENSE AGREEMENT- CU14137 $0.00
- Nizar Habash
- Managed By
- Richard Nguyen
This technology is a linguistic database of Arabic functional gender, functional number, and rationality, features that are important for modeling Arabic morphosyntactic agreement. It also includes a tool for annotating the Linguistic Data Consortium (LDC) Arabic treebanks with this morphosyntactic information. Arabic has complex agreement patterns and irregular morphology, yet current LDC Arabic treebanks represent nominal gender and number by shallow (non-functional) forms and do not include nominal rationality. The database and annotation tool can improve computational modeling of Arabic for natural language processing and linguistics research applications.
The annotation tool requires that researchers obtain Arabic corpora from the LDC.
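To make the distinction between shallow form-based features and functional features concrete, here is a minimal Python sketch. The class, field names, and value codes are hypothetical illustrations and do not reflect the actual database schema or the annotation tool's interface. The agreement rule is a deliberately simplified version of one well-known Modern Standard Arabic pattern (adjectives modifying irrational plural nouns take feminine singular agreement) and ignores the dual and other special cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NominalFeatures:
    """Hypothetical record; field names are illustrative, not the real schema."""
    lemma: str        # informal transliteration
    form_gender: str  # "M" or "F", read off the surface morphology
    form_number: str  # "S", "D", or "P", read off the surface morphology
    func_gender: str  # "M" or "F", the gender that drives agreement
    func_number: str  # "S", "D", or "P", the number that drives agreement
    rationality: str  # "rational" (human) or "irrational" (non-human)


def expected_adjective_agreement(noun: NominalFeatures) -> tuple:
    """Return the (gender, number) an attributive adjective is expected to show.

    Simplified assumption: irrational plurals trigger feminine singular
    agreement; every other noun passes on its functional gender and number.
    """
    if noun.func_number == "P" and noun.rationality == "irrational":
        return ("F", "S")
    return (noun.func_gender, noun.func_number)


# Broken plural "kutub" (books): no plural suffix, so the form looks
# masculine singular, but the noun is functionally masculine plural.
kutub = NominalFeatures("kutub", "M", "S", "M", "P", "irrational")

# Sound plural "mudarrisuun" (teachers): form and function coincide.
mudarrisuun = NominalFeatures("mudarrisuun", "M", "P", "M", "P", "rational")

for noun in (kutub, mudarrisuun):
    print(noun.lemma, expected_adjective_agreement(noun))
```

In this sketch, the form-based features of "kutub" alone would give no hint of the feminine singular adjective it takes; the functional number and rationality fields supply the information needed to predict it.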
- Computationally annotate Arabic corpora with correct morphosyntactic agreement.
- Build computational models of Arabic morphology and syntax.
- Engineer Arabic language processing systems.
- Study Arabic linguistic phenomena.
- Translate Arabic with correct nominal gender, number, and rationality agreement.
- Annotates LDC treebanks with missing information regarding nominal gender, number, and rationality agreement.
- Improves computational modeling of Arabic morphosyntax for natural language processing applications.
Tech Ventures Reference: IR CU14137
S. Alkuhlani, N. Habash. A Corpus for Modeling Morpho-Syntactic Agreement in Arabic: Gender, Number and Rationality. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Short Papers, Vol. 2, June 2011, pp. 357-362.
S. Alkuhlani, N. Habash. Identifying Broken Plurals, Irregular Gender, and Rationality in Arabic Text. Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, April 2012, pp. 675-685.