Next-generation AI-based Ecosystem for Noolaham

Summary: The project aims to build a platform capable of large-scale content analysis of digitised Sri Lankan Tamil texts. This relates to the field of semantic culturomics, in which researchers data-mine large digital archives to investigate cultural phenomena reflected in language and word usage. It is a form of computational lexicology that ...

Entity Extraction in Tamil Tweets

Summary: Social media text, such as posts on Twitter, holds important information on a wide range of subjects. Extracting that information begins with one of the most basic tasks in Natural Language Processing: entity extraction. Entities are real-world elements or objects such as person names, organization names, product names and location names. Entities are often referred to as Named ...

Paraphrase Identification System for Tamil

Summary: A paraphrase can be defined as “the same meaning of a sentence is expressed in another sentence using different words”. Paraphrases can be identified, generated or extracted. This project focuses on sentence-level paraphrase identification for Tamil. Identifying paraphrases in Tamil is a difficult task because evaluating the semantic similarity of the underlying content and ...

Language Resource Development for Tamil

Summary: The prerequisites for developing NLP applications in any language are lexical resources, corpora and computational models. The scarcity of these resources for Tamil is one of the major reasons for the slow growth of NLP work in the Tamil language. Through this project we aim to reduce the gap by creating required language ...

A Taxonomy of Tamil NLP Research

Summary: The Center for Tamil NLP research conducts a thorough literature review of the existing methodologies, prior work and language resources for the research topics identified. The review aims to identify gaps in current knowledge, avoid reinventing the wheel and show that we build on a foundation of existing knowledge and ...