The government has taken extensive measures to modernize the use of the Bangla language in the digital sphere.
As part of a string of initiatives to facilitate the use of Bangla on digital platforms and devices, the government recently launched trial versions of the “Bangla Spell Checker - Sathik” and “Bangla OCR - Barna” applications. Users of the trial versions are encouraged to provide feedback.
Sathik, a free app listed under e-governance solutions in the software and IT service catalog, was developed to fix spelling, grammar, and other writing errors in documents.
Meanwhile, Barna is software that converts non-editable Bangla text in PDFs or images into editable text.
Both applications were developed under the Bengali Language Enrichment in Information Technology through Research and Development Project (EBLICT). The project was launched in 2017 and is set to end in June 2024, costing around Tk159 crore.
REVE Systems, Beximco, Gga, green 71, and Znf are among the firms that have provided technical support to the project.
The trial version of the app Sathik is now available at spell.bangla.gov.bd. Users can also visit https://ocr.bangla.gov.bd/#/login to provide feedback on Barna.
Project Chief Consultant Mohammad Mamun Or Rashid, an assistant professor in the Bangla Department at Jahangirnagar University, said the launch of the two apps is a major milestone for the digital use of the Bangla language.
"Even if there is weblink, we have an MS Word plugin, browser plugin, and mobile keyboard ready. The spell checker can be used through these extensions," he added.
Around 50 consultants have been involved in the project. In addition to Sathik and Barna, project officials are working on Sentiment Analysis, Bangla Voice to Text, Bangla National Corpus, Bangla Font Converter, Ethnic Language Digitization, Bangla Virtual Assistant, Universal Keyboard Software (UBoard), Search Engine (Samudra), Screen Reader, Bengali Sign Language Recognition System, Bangla Data Driven Dictionary, Integrated Service Platform, IPA Converter, and Text-to-Sign Puppet.
Sentiment Analysis - Janamat
The “Janamat” application can perform sentiment analysis of Bengali text. If a sentence, paragraph, or document is given as input, it determines whether the content is positive, negative or neutral. This can help with conducting surveys and data analysis, including for elections.
Bengali voice to text - Katha
The Bengali speech-to-text application is called “Katha”. It can convert clear, standard Bengali pronunciation to text. Support for regionally influenced varieties of Bangla will be added to the app gradually. The application is currently undergoing internal testing and will be released soon.
Bangla National Corpus
A National Corpus is a repository of electronic texts of a language. Work on it is ongoing, and once completed it is expected to transform Bangla language technology.
Bengali font converter - Rupantar
The app “Rupantar” is a dynamic mapping system for Bangla encodings. With it, text in any of the older Bangla encoding systems can be converted to Unicode. Apart from web applications, it also has various other uses.
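To illustrate what mapping an old encoding to Unicode involves, here is a minimal Python sketch. It is not Rupantar's implementation: the function name `convert_legacy` and the tiny mapping table are hypothetical, loosely modelled on visual-order ASCII font encodings of the Bijoy type, and the entries should be treated as illustrative assumptions. The sketch shows two typical steps: longest-match lookup of legacy keys, and moving a vowel sign that the legacy font stores before its consonant to the position Unicode expects, after the consonant.

```python
# Toy legacy-to-Unicode converter. The table below is an illustrative
# assumption, not Rupantar's actual mapping data.
LEGACY_TO_UNICODE = {
    "Av": "\u0986",  # আ (independent vowel AA), a two-character legacy key
    "A": "\u0985",   # অ
    "K": "\u0995",   # ক
    "g": "\u09AE",   # ম
    "v": "\u09BE",   # া (vowel sign AA)
    "w": "\u09BF",   # ি (vowel sign I)
}

# Vowel signs that visual-order fonts store before the consonant,
# whereas Unicode stores them after it.
PRE_BASE_SIGNS = {"\u09BF"}

# Try longer keys first so that "Av" wins over "A".
_KEYS = sorted(LEGACY_TO_UNICODE, key=len, reverse=True)


def convert_legacy(text: str) -> str:
    """Convert legacy-encoded text to Unicode using the toy table above."""
    out = []
    pending = ""  # pre-base vowel signs waiting for the next character
    i = 0
    while i < len(text):
        for key in _KEYS:
            if text.startswith(key, i):
                ch = LEGACY_TO_UNICODE[key]
                i += len(key)
                break
        else:
            # Unknown characters pass through unchanged.
            ch = text[i]
            i += 1
        if ch in PRE_BASE_SIGNS:
            pending += ch
            continue
        out.append(ch)
        if pending:
            # Simplification: attach the sign to whatever character follows.
            out.append(pending)
            pending = ""
    out.append(pending)  # flush a trailing sign, if any
    return "".join(out)


print(convert_legacy("Avwg"))  # আমি
```

A real converter would need a separate table for each legacy font, along with handling for conjuncts and other reordering cases that this sketch ignores.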
Digitization of ethnic languages
A language archive is being developed by collecting data on 40 languages of Bangladesh. In the process, many endangered languages used in Bangladesh are being scientifically preserved. Data collection for 26 languages has already been completed.
Bangla Virtual Assistant
A chatbot will provide the service, backed by a powerful information retrieval system. It will be able to carry on dialogues and general conversation, such as greetings and farewells, and will provide information about certain government services and domains.
Universal Keyboard Software - UBoard
A universal keyboard has been developed for the first time for all layouts and scripts of the Bangla language and other languages of the country. As a result, languages of minority groups, including Bangla dialects, can be accurately written on computers.
Search engine - Samudra
A new-generation Bengali search engine called “Samudra” is being built on the Bengali corpus. It is essentially an information retrieval system. It also includes a powerful real-time web crawler that continuously collects data from the web.
Screen reader
A screen reader is a software program that helps visually impaired users operate a computer by reading on-screen content aloud through a speech synthesizer. It acts as the interface between the computer's operating system, its applications, and the user.
Bengali Sign Language Recognition System
This application will recognize a sign language user's hands, fingers, face, and other relevant body parts and convert the signs into conventional Bangla.
Bangla Data Driven Dictionary
This dictionary is the first computer-generated and human-validated Bangla dictionary. Its words will come from the corpus and related tools. At least 200,000 unique words can be published in the edited dictionary.
Integrated Service Platform
All the services of the project will be connected through this component. An ordinary user can easily access all the Bangla voice, text, and image services through the platform's graphical frontend. As a result, this component can serve as an Integrated Service Platform.
IPA Converter
The International Phonetic Alphabet (IPA) is used by linguists, foreign language students and teachers, speech pathologists, singers, and translators. This application converts Bangla Unicode text to IPA, quickly producing the pronunciation of Bengali text according to international standards.
Text-to-Sign Puppet
The Text-to-Sign Puppet (TTuSP) module will use artificial intelligence to convert Bengali text into sign language and display it through an animated puppet. This module will help make daily news, road signs, notice boards, airport instructions, and similar content easier to understand for sign language users, such as speech- and hearing-impaired people.