India can become the leader of AI in non-English markets, says LLM Sutra’s Founder

What if artificial intelligence (AI) could fluently communicate in every Indian language? 

At the Mint Digital Innovation Summit 2024, being held in Mumbai, Pranav Mistry, founder and CEO of Two Platforms Inc., shared his vision for India to become a leader in AI for non-English markets. Mistry emphasized that by developing new multilingual models, which don’t require training from scratch, India can overcome the complexity of its diverse languages and achieve this ambitious goal.

“Language is the interface between people, and the advancement of AI has put it at the centre of communication with machines. Over the past few years, LLMs (large language models) have made tremendous progress by showing human-level performance but only in the English language. In other languages, LLMs have been unable to capture cultural contexts and have often produced incoherent and incorrect answers,” Mistry explained.

Mistry believes that Indian startups, many of which currently fine-tune existing models, can drive this change by rethinking their approach. Developing new multilingual models that do not need training from the ground up, he said, could help India become the leader of AI in non-English markets.

“But the gap in LLMs should not exist, and given India’s diverse language and dialects, we deserve AI with multilingual fluency. But only if we can work around the complexity of Indian languages,” he added.

Two Platforms, Mistry’s Silicon Valley-based deep tech startup, backed by Mukesh Ambani’s Jio Platforms and South Korea’s Naver Corp., recently released Sutra, a multilingual large language model designed specifically for the Indian market. 

“The Sutra’s core innovation is separating the concept of learning from language. We at Sutra have our own 256k new tokenizer, a balanced tokenizer that includes all the languages in a very balanced manner along with high-quality data,” Mistry elaborated.

Sutra, he said, outperforms most local Indian LLMs as well as models such as GPT-3.5, GPT-4 and Llama, not only in Hindi but also in languages such as Gujarati.

“The English-centric model of large language models cannot solve our problem,” he noted.

Published: 24 May 2024, 09:11 PM IST
