Sarvam AI launches first LLM developed in India for local languages, built with NVIDIA AI
Bangalore, India: Friday, October 25, 2024: Created with NVIDIA NeMo software and trained on NVIDIA Hopper GPUs, the Sarvam 1 model delivers efficient support for 11 languages to advance generative AI development across the nation.
Sarvam AI has developed Sarvam 1, India’s first home-grown multilingual large language model (LLM), built entirely on NVIDIA technology. Sarvam 1 is a 2-billion-parameter model, trained on NVIDIA H100 Tensor Core GPUs using 4 trillion tokens curated by Sarvam. Its custom tokenizer is up to four times more efficient on Indian-language text than those of leading English-trained models. Sarvam 1 supports 11 languages: Bengali, Gujarati, Hindi, Marathi, Malayalam, Kannada, Oriya, Tamil, Telugu, Punjabi, and English.
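Tokenizer efficiency of the kind claimed above is commonly measured as "fertility": the average number of tokens produced per word, where a lower value means fewer tokens per sentence and therefore cheaper, faster inference. The helper below is an illustrative sketch of that measurement, not Sarvam's own benchmark; the `tokenize` argument stands in for any tokenizer callable.

```python
# Illustrative sketch: measuring tokenizer efficiency ("fertility") on text.
# A lower tokens-per-word ratio means the tokenizer represents the same text
# with fewer tokens, which translates to cheaper and faster inference.
# `tokenize` is any callable that maps a string to a list of tokens.

def tokens_per_word(texts, tokenize):
    """Average number of tokens produced per whitespace-separated word."""
    total_tokens = sum(len(tokenize(t)) for t in texts)
    total_words = sum(len(t.split()) for t in texts)
    return total_tokens / total_words

# With real tokenizers (e.g. loaded via Hugging Face `transformers`), one
# could pass `lambda t: tok.encode(t)` for each model and compare the ratios
# on Hindi or Tamil sentences to reproduce a comparison like the one above.
```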
Sarvam 1 is already powering generative AI agents and other applications from Sarvam AI. Developers can use the base model — available on Hugging Face — to build their own generative AI applications for Indic language speakers.
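For developers, building on the base model follows the standard Hugging Face workflow. The sketch below shows the general shape, assuming the `transformers` library is installed; the repository id used here is a placeholder, so check Sarvam AI's Hugging Face page for the actual one.

```python
# Minimal sketch of generating text with the Sarvam 1 base model.
# Assumption: MODEL_ID is a hypothetical placeholder -- consult Sarvam AI's
# Hugging Face page for the real repository id before running this.
MODEL_ID = "sarvamai/sarvam-1"  # hypothetical repo id


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    # Imported lazily so the sketch can be read without the heavy dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Since Sarvam 1 is a base model rather than an instruction-tuned chat model, developers would typically fine-tune it on task-specific data before deployment.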
“The Sarvam 1 model is the first example of an LLM trained from scratch with data, research, and compute fully in India,” said Dr. Pratyush Kumar, Co-Founder, Sarvam. He added: “We expect it to power a range of use cases including voice and messaging agents. This is the beginning of our mission to build full-stack sovereign AI. We are deeply excited to be working with NVIDIA toward this mission.”
Sarvam leveraged NVIDIA NeMo Curator to accelerate its data processing pipelines and curate a high-quality pretraining corpus. NeMo Curator’s domain and quality classifier models were crucial in improving training data quality and enhancing the model’s final accuracy.
Trained across a broad range of tasks, Sarvam 1 serves as an effective base model for fine-tuning on specialized applications. These include formal and code-mixed translation, transliteration, text preprocessing for text-to-speech systems, and embedding generation for Indic content retrieval, as well as quality assessment and domain classification of pretraining data.
"Enterprises are seeking to leverage generative AI to accelerate innovation and tackle complex challenges at scale," said Kari Briski, vice president of AI software, models and services atNVIDIA. "Sarvam AI's multilingual model, developed using NVIDIA's full-stack AI platform including NeMo and Hopper GPUs, showcases how tailored AI solutions can address linguistic diversity and drive inclusive technological growth in regions like India."
NVIDIA TensorRT-LLM enables low-precision FP8 inference of the Sarvam 1 model on H100 GPUs, and the model can be efficiently served and scaled using the NVIDIA Triton Inference Server with the TensorRT-LLM backend. Sarvam AI uses the model within its voice-to-voice platform, an industry-leading solution for enterprises developing voice bots in Indian languages. Built on NVIDIA Riva speech and translation AI microservices, included with NVIDIA AI Enterprise, the platform addresses use cases in the legal, public, finance, and other sectors particularly relevant to the Indian market.
Sarvam AI’s models can run on NVIDIA-accelerated infrastructure on premises and on instances from NVIDIA’s global and Indian cloud partners, helping advance AI adoption in India. This initiative marks a milestone in the country’s AI journey, helping position India as a leader in AI innovation and making advanced capabilities accessible to millions.
About Sarvam AI
Sarvam AI is a generative AI startup focused on efficient Indian-language voice bots and productivity tools for knowledge workers. The company is innovating across layers: building unique datasets, speech models and LLMs for Indian languages, and low-code authoring experiences for customer-facing and professional agents. Sarvam AI is domiciled in India and aims to offer a sovereign stack for population-scale AI usage.