REQUEST FOR PROPOSAL (RFP)
Speech Data Collection and Annotation for Public Health ASR Models
Closing Date for Submission
March 31, 2026, at 23:59
LEHS is a charitable organization in India whose purpose is to offer basic health and education for the poor. In furtherance of its charitable objectives, LEHS operates through its flagship program, Wadhwani AI, which aims to build equitable and sustainable systems by making quality primary healthcare available and accessible to underserved populations, and to bring the benefits of modern AI technology to those populations by building and deploying AI solutions for social impact across domains such as healthcare, agriculture, governance, and education in India. LEHS aims to promote the integration of technologies, particularly in emerging domains such as artificial intelligence, and innovations into India's mainstream primary healthcare, education, and agriculture systems through partnerships with State and National Governments, apex institutions, international agencies, and private-sector partners (e.g., innovators, social enterprises, and other ecosystem contributors), in line with its stated objectives for the betterment of society, with a particular focus on projects of national and social significance.
Wadhwani AI, a unit of LEHS, focuses on developing, deploying and evaluating artificial intelligence solutions to address critical social challenges in India, particularly in domains such as healthcare, agriculture, and education.
In line with its mission to support projects of national and social significance, LEHS also undertakes rigorous monitoring, evaluation, and learning (MEL) activities to assess the impact, usability, and scalability of different programmatic interventions.
Frontline health workers (FLWs) in India - such as Accredited Social Health Activists (ASHAs), Anganwadi Workers (AWWs), and Auxiliary Nurse Midwives (ANMs) - form the backbone of the public health delivery system. They serve as the primary interface between communities and the formal health system, supporting service delivery across maternal and child health, nutrition, immunization, and disease prevention programs.
To support FLWs in their daily work, an AI-enabled conversational assistant (chatbot) is being developed. The assistant is designed to function across multiple program areas and be accessible through voice-based interactions, recognizing the realities of digital literacy, time constraints, and field conditions faced by FLWs.
The assistant relies on Automatic Speech Recognition (ASR) models to interpret spoken user inputs. However, existing ASR models do not adequately capture the linguistic diversity, colloquial usage, and medical terminology used by FLWs at the last mile. Variations in pronunciation, dialects and local medical terms significantly impact ASR performance.
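ASR performance of the kind described above is commonly quantified with word error rate (WER): the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. This RFP does not specify an evaluation metric, so the following is only an illustrative sketch; the function name and the sample sentences are assumptions, not project data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative, made-up transcripts: one of four words is misrecognized.
sample_wer = word_error_rate("bacche ko bukhar hai", "bacche ko bukhaar hai")
```

A single spelling or pronunciation variant in a short utterance already moves the score noticeably, which is one reason dialect and terminology coverage in the training data matters.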
There is a need to collect and annotate large-scale, high-quality, domain-specific speech datasets to improve ASR performance for healthcare use cases. The initiative plans to collect up to 2,400 hours of Hindi, Marathi, and Oriya speech data across 12 states and Union Territories: Bihar, Uttar Pradesh, Rajasthan, Haryana, Chhattisgarh, Madhya Pradesh, Uttarakhand, Jharkhand, Himachal Pradesh, Maharashtra, Orissa, and Delhi. Additionally, the initiative aims to annotate up to 500 hours of speech data across the listed geographies.
There are two key objectives of the project:
- Collection: To collect large-scale, high-quality speech data reflecting real-world usage by FLWs and other public health professionals, including colloquial expressions, local dialects, code-mixed speech, and healthcare-specific terminology. The recordings must reflect real-world speech as observed in public health settings. To ensure diversity, the dataset must be collected from FLWs with varied demographics across regions, age groups, and educational qualifications.
- Annotation: To transcribe and validate the collected datasets with high accuracy, ensuring linguistic correctness.
The collected and annotated voice data will be used to train and improve foundational ASR models that power the speech-to-text module of the AI assistant for FLWs.
The selected agency is expected to:
- Establish partnerships with state health departments and relevant authorities.
- Identify appropriate FLWs and medical professionals for data collection (e.g., ASHA, AWW, ANM).
- Secure administrative approvals for audio recording activities.
- Ensure high-quality, timely, and complete data submissions with annotated transcriptions and associated metadata.
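As an illustration of what a "complete" submission with associated metadata might involve, the sketch below checks a single recording entry against the languages and geographies named in this RFP. The record layout, field names, and `validate` helper are hypothetical; the actual submission format is defined in the complete RFP, not here.

```python
from dataclasses import dataclass

# Languages and geographies as listed in this RFP.
LANGUAGES = {"Hindi", "Marathi", "Oriya"}
REGIONS = {
    "Bihar", "Uttar Pradesh", "Rajasthan", "Haryana", "Chhattisgarh",
    "Madhya Pradesh", "Uttarakhand", "Jharkhand", "Himachal Pradesh",
    "Maharashtra", "Orissa", "Delhi",
}


@dataclass
class Recording:
    """Hypothetical metadata for one audio file (field names are assumptions)."""
    audio_file: str
    language: str
    region: str
    speaker_role: str          # e.g. "ASHA", "AWW", "ANM"
    duration_seconds: float
    transcription: str = ""    # empty when the clip is not annotated


def validate(rec: Recording) -> list[str]:
    """Return human-readable problems; an empty list means the record passes."""
    problems = []
    if not rec.audio_file:
        problems.append("missing audio file reference")
    if rec.language not in LANGUAGES:
        problems.append(f"unexpected language: {rec.language}")
    if rec.region not in REGIONS:
        problems.append(f"unexpected region: {rec.region}")
    if not rec.speaker_role:
        problems.append("missing speaker role")
    if rec.duration_seconds <= 0:
        problems.append("duration must be positive")
    return problems


# Illustrative record, not real project data.
example = Recording("clip_0001.wav", "Marathi", "Maharashtra", "ASHA", 12.5)
issues = validate(example)
```

The transcription field is left optional in this sketch because only part of the collected audio is expected to be annotated.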
All responses to this RFP must be received no later than March 31, 2026. Proposals must be submitted by e-mail only, in PDF format, addressed to the Procurement Team at the following e-mail address: rfp.lehs@wadhwaniai.org
Note: Only shortlisted vendors will be contacted for presentations or negotiations. If you do not hear from us within two weeks of submission, consider your proposal not selected. LEHS reserves the right to reject any or all proposals and to negotiate terms and conditions with the selected vendor.
For detailed information, please refer to the complete version of the RFP attached below.