Community Discourse Modeling using Indic NLP
Overview:
Janastu (http://www.janastu.org/) is an organization that focuses on empowering communities through the development of Free and Open Source Software (FOSS) and ICT-led interventions. One of its initiatives is the COW mesh (https://cowmesh.net/), a community-owned WiFi mesh that lets communities take ownership of local communications and reduces the cost of access to information. COW mesh has been operating in some villages of the Tumkur district of Karnataka, where it has been very successful among the population. Recently, COW mesh played a significant role in spreading awareness about COVID-19 and precautionary behaviour.
Community-driven discourses are deeply rooted in local realities and history, and often differ greatly from mainstream discourse on several issues. Over its several years of operation, COW mesh has gathered a large corpus of community interactions that contains ample insight into the community’s worldview and its pressing concerns. Understanding community worldviews is a critical requirement for designing effective interventions and policies for sustainable development.
However, a large part of the dataset is in the local language (in this case, Kannada), is mostly in the form of audio recordings, and makes ample use of local slang and metaphors. Understanding community interactions at scale therefore requires the development of NLP techniques for Indic languages.
In this regard, the IndicNLP project of the AI4Bharat initiative (https://indicnlp.ai4bharat.org/) at IIT Madras is very relevant. AI4Bharat has created large corpora for Indian languages, including Kannada, along with several embedding models (including the polysemy-aware BERT) and NLP models pre-trained on Indian-language corpora.
The AI4Bharat toolkit provides a good foundation on which we can attempt to develop community discourse models, using the COW mesh data as our corpus of study. As part of this project, we plan to develop or improve speech-to-text conversion for Kannada, and to develop semantic processing models for tasks such as topic modeling, argument mining, and sentiment analysis.
This project is directly relevant to the ongoing field work being conducted by Janastu. We are targeting two types of impact that support Janastu's objective of empowering local communities to find local, cooperative social solutions.
By applying AI and NLP analytics to the available data corpus, we expect:
- i) to learn which themes, topics, and concerns dominate the mindshare of the communities;
- ii) to use local knowledge to examine how, and to what extent, the COW mesh technology has enabled communities to address these concerns.
TEAM MEMBERS:
- Dr. Sridhar Mandyam, CADS
Other Members
- Dr. Srinath Srinivasa, WSL
- Dr. T B Dinesh, Janastu