Syed Saad Ahmed

Harnessing AI to Save Lives: How ARMMAN, an Indian Nonprofit, Uses Technology to Improve Maternal and Child Health

There are many Artificial Intelligence tools related to maternal and child health,¹ such as apps to detect malnutrition in children² and algorithms to predict risks and complications during pregnancies.³

But these tools raise several questions: what are the benefits and downsides of AI compared to current approaches? What are the challenges and risks of using AI? How can we ensure that AI tools promote health equity and access rather than deepening existing divides?

I delved into some of these issues in a podcast with Amrita Mahale, the Director of Product and Innovation at ARMMAN. ARMMAN is an Indian non-profit that creates cost-effective, tech-based solutions to reduce maternal and child mortality and morbidity. Through its programs, ARMMAN has reached around 50 million women and children and 400,000 health workers in 21 states across India.⁴

Here is an edited transcript of the podcast with Amrita Mahale.

What are the maternal and child health challenges that ARMMAN is trying to address?

India has made great strides towards reducing maternal mortality, but still, every 20 minutes or so, a woman dies in childbirth.⁵ And for every woman who dies, many more suffer lifelong complications.

One challenge is low healthcare-seeking behavior.⁶ Due to patriarchal strictures and norms, neither women nor their families pay much attention to their health needs.⁷,⁸ These issues are exacerbated for women from low-income and low-education backgrounds.

Besides, community health workers are often underskilled and overworked.⁹ They are supposed to provide basic care for simple conditions, but that doesn’t always happen. So, women either delay seeking care, opt for private clinics (which can be expensive), or go to tertiary health facilities (which are usually overburdened) [See Glossary #1 below for more details]. Delay in accessing healthcare increases the risk of severe complications and deaths. If you have good preventive care systems, most complications can be averted or caught early and dealt with at the health system’s lower levels. That way, only the most acute cases go to secondary or tertiary facilities.

How is ARMMAN using technology to solve these challenges?

Our programs broadly fall into two buckets. One, programs that empower women with preventive care information through pregnancy and infancy so that they seek healthcare early and regularly.

Two, programs that train and support health workers to better provide healthcare and detect and manage complications early. We try to prevent health complications and ensure that when they occur, they are tackled at the health system’s lower levels to avoid overburdening tertiary facilities.

Two of our largest programs are Kilkari and Mobile Academy. Kilkari, which we run in collaboration with the Government of India, delivers weekly pre-recorded messages to women over mobile phones from the fourth month of pregnancy till the child turns one. It has reached over 47 million women to date and has 3.5 million active subscribers across 20 states of India. Mobile Academy trains frontline health workers known as ASHAs [See Glossary #2 below for more details] using phone-based training modules. Another program is mMitra, which sends automated voice calls with critical health information to around 100,000 women in the state of Maharashtra.

There have been Randomized Controlled Trials (RCTs) [See Glossary #3 below for more details] to evaluate the impact of our programs. The mMitra RCT showed a 38% increase in pregnant women who completed the prescribed doses of iron-folic acid tablets and a 22% increase in the number of children who tripled their birth weight after one year. There were also improvements in tetanus vaccine uptake, consulting a doctor for spotting or bleeding during pregnancy, and delivery in a hospital. Johns Hopkins University conducted the RCT for Kilkari about five years ago. It showed improvements in the vaccination of children, delivery in a hospital, use of contraceptives, and fathers’ knowledge regarding maternal and child health.

What prompted you to use AI in your programs?

ARMMAN has been using AI for several years now, starting with the mMitra program.

In mMitra, we observed dwindling engagement over time, which is common for mobile health programs globally. Some women listen actively for a few weeks or months, but since the program goes on for 18 months, they stop listening for various reasons. We wanted them to listen to every message during the program because, say, if a pregnant woman drops out before the child is born, she will miss out on information related to immunization, exclusive breastfeeding, complementary feeding, etc.

We had some rules-based systems to avoid drop-offs. So, when a woman stopped listening, we would call her from our call center, but it would often be too late. And the bigger challenge is that we don’t have a large workforce. These resource constraints meant that we couldn’t call everyone who stopped listening. So, we wanted to identify and predict listenership patterns early on and strategically intervene to ensure higher success rates.

We realized that we have a lot of data, so why not use AI to solve this problem? We partnered with Google Research India and used restless multi-armed bandit algorithms to predict which users are likely to drop off and who among them will benefit the most from an intervention.¹⁰
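In a restless multi-armed bandit, each beneficiary is an "arm" whose engagement keeps changing whether or not she is called, and only a limited number of calls can be made in each round. The Python sketch below is an illustration only, not ARMMAN's or Google Research's implementation. Instead of an index policy such as the Whittle index, which is commonly used for these problems, it uses a simpler one-step (myopic) ranking: call the women for whom a call is expected to raise the chance of listening next week the most. The class, the field names, and every probability are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Beneficiary:
    name: str
    engaged: bool            # is she currently listening to messages?
    # Probability of being engaged next week, by current state and action.
    # All values are hypothetical; in practice they would be estimated from data.
    p_stay_passive: float    # engaged now, no call
    p_stay_active: float     # engaged now, call made
    p_return_passive: float  # not engaged now, no call
    p_return_active: float   # not engaged now, call made


def intervention_gain(b: Beneficiary) -> float:
    """Expected one-step increase in P(engaged next week) if we call her."""
    if b.engaged:
        return b.p_stay_active - b.p_stay_passive
    return b.p_return_active - b.p_return_passive


def pick_calls(beneficiaries, budget):
    """Myopic policy: spend the limited call budget where the gain is largest."""
    ranked = sorted(beneficiaries, key=intervention_gain, reverse=True)
    return [b.name for b in ranked[:budget]]
```

The point of the sketch is that the ranking rewards responsiveness to a call, not just the risk of dropping off: a woman who is very likely to stop listening but unlikely to respond to a call ranks below one who is at moderate risk and responds well.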

Now, we are trying to do the same in Kilkari as well. Kilkari is different from mMitra. In mMitra, we enroll the women ourselves and collect demographic information, so it’s easy to transfer insights from older cohorts to newer ones. When a woman joins, we can figure out a lot based on her socio-demographic characteristics and information from past subscribers.

In Kilkari, we have no demographic information, so all we can do is look at a woman’s listening trajectory and make predictions about her future. So, we had to make many tweaks to the AI approach to meet the needs of a national program like Kilkari. But because we’ve done it once, we know what to expect and how to design an effective AI study.
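To show what working from the listening trajectory alone can look like, here is a minimal Python sketch. It assumes each subscriber's history is stored as a list of fractions (how much of each weekly message she heard, from 0 to 1) and flags her when her recent average falls below a cut-off. The threshold rule is a stand-in for a learned model, not the approach used in Kilkari, and the function names, window size, and threshold are all hypothetical.

```python
from statistics import mean


def recent_engagement(trajectory, window=4):
    """Mean fraction of each message heard over the last `window` calls."""
    recent = trajectory[-window:]
    return mean(recent) if recent else 0.0


def flag_at_risk(trajectories, window=4, threshold=0.25):
    """Return the IDs of subscribers whose recent engagement is below `threshold`.

    `trajectories` maps a subscriber ID to her list of listening fractions,
    oldest first. No demographic information is used.
    """
    return sorted(
        sid
        for sid, traj in trajectories.items()
        if recent_engagement(traj, window) < threshold
    )
```

A real model would also look at trends and at call pick-up patterns, but the input would stay the same: a sequence of listening records and nothing else.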

We follow an evidence-based approach to scaling innovation. So, we start with small pilot projects [See Glossary #4 below for more details] and then increase their scale before large rollouts.

What other AI initiatives are you working on?

We are working on broadly two kinds of AI projects: 1) using machine learning and data science to improve our programs, and 2) using generative AI and Large Language Models (LLMs) [See Glossary #5 below for more details].

We are planning a pilot where we use AI to predict what the best time slot to call a woman is. And this is a great example of using insights from the real world to select a use case to deploy AI.

We’ve seen in rural India that phone usage patterns are very different from urban India. In cities, we keep checking our phones, but this is not common among our target audience. They use their phones early in the morning, then get busy with domestic chores or work in the fields and check their phones again only after lunch or at the end of the day. So in a day, there are only 2-3 brief windows during which we can reach these women. And many share their phones with their husbands, who might be away at work for most of the day.

But in Kilkari, women can’t choose a time slot to receive calls. And we don’t know if they are using shared phones. So how do we know when to call them? We are thinking of using AI to optimize which time slot the woman gets a call in. We are still in the brainstorming phase, but later this year, we’ll run a pilot to see if we can use machine learning to predict the best time to call a woman, especially given the constraints of the automated calling system.
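One simple way to frame the time-slot question is to estimate, for each woman, how often she has picked up in each slot so far and then call in the slot with the best estimate. The Python sketch below does only that, with a small smoothing term so that a slot tried once does not look perfect or hopeless. It is an assumption-laden illustration, not the design of the planned pilot: the slot names, the log format, and the smoothing values are hypothetical.

```python
from collections import defaultdict

# Hypothetical slot names; a real calling system would define its own.
SLOTS = ("morning", "afternoon", "evening")


def best_slot(call_log, slots=SLOTS, prior_answered=1, prior_calls=2):
    """Pick the slot with the highest smoothed pick-up rate.

    `call_log` is a list of (slot, picked_up) pairs for one subscriber.
    The priors act as two imaginary calls, one of them answered, so every
    slot starts at 0.5 and moves as real calls are observed.
    """
    answered = defaultdict(int)
    total = defaultdict(int)
    for slot, picked_up in call_log:
        total[slot] += 1
        answered[slot] += int(picked_up)

    def rate(slot):
        return (answered[slot] + prior_answered) / (total[slot] + prior_calls)

    # With no history, all slots tie and the first slot in `slots` is returned.
    return max(slots, key=rate)
```

A rule like this never tries a slot again once it looks bad, so a pilot would likely need some deliberate exploration of other slots as well.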

The other kind of AI solution we are excited about uses generative AI and LLMs. Last summer, we decided to build a learning program for Auxiliary Nurse Midwives (ANMs) [See Glossary #6 below for more details]. We had earlier developed 20 detailed protocols on high-risk factors. We train workers face-to-face regarding these protocols and provide them with digital learning materials for self-paced learning. However, ANMs would sometimes get overwhelmed because of the information overload.

So, we started a WhatsApp helpline where they could pose queries and doctors would respond. The doctors are overworked, so they would often take hours or days to respond. Eventually, ANMs stopped using the service because they were used to getting answers at the speed of a Google search.

That’s when we thought of using LLMs to not generate an answer, but pick the most appropriate response from a list of frequently asked questions and answers in response to the ANMs’ queries. Or the LLM would generate an answer, which the doctor would verify before sending it to the ANM. But we saw that the LLM generated excellent answers! So, we thought of keeping the doctor out of the loop and having the LLM send responses directly to the ANM.

It’s interesting you say that you found LLMs useful because they are often criticized for generating incorrect, incomplete or biased responses. How did you overcome these challenges?

Yes, we were nervous about this problem, so we approached it with caution and responsibility. It was clear to us that we would not launch anything without validating it thoroughly and that we would evaluate the model in small, incremental steps. So, we did not just let the LLM make up answers.

We use something called retrieval-augmented generation — we force the LLM to take answers from the training manuals and clinically validated protocols we had created. In this aspect, we were privileged compared to other organizations trying to build chatbots because we did not have to create any resources from scratch. All we had to do was make them more machine-readable. Since they are learning aids for health workers, the protocols are visual — there are flowcharts, decision trees, and images. So, we had to convert those into plain text, which was slightly time-consuming.
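The retrieval step can be sketched in a few lines of Python. The example below is a simplified illustration and not ARMMAN's system: it scores plain-text protocol passages by word overlap with the question (production systems typically use embedding-based search), keeps the best matches, and builds a prompt that tells the model to answer only from those excerpts. No LLM is called here, and the function names and prompt wording are assumptions.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, passages, top_k=2):
    """Return up to `top_k` passages sharing the most words with the query."""
    q = tokenize(query)
    scored = [(len(q & tokenize(p)), i) for i, p in enumerate(passages)]
    scored = [(s, i) for s, i in scored if s > 0]
    scored.sort(key=lambda t: (-t[0], t[1]))  # best score first, then original order
    return [passages[i] for _, i in scored[:top_k]]


def build_prompt(query, passages, top_k=2):
    """Build a grounded prompt, or return None if nothing relevant was found."""
    context = retrieve(query, passages, top_k)
    if not context:
        return None  # nothing to ground the answer in; hand over to a human
    excerpts = "\n".join(f"- {p}" for p in context)
    return (
        "Answer using ONLY the protocol excerpts below. "
        "If they do not cover the question, say so.\n"
        f"Excerpts:\n{excerpts}\n"
        f"Question: {query}"
    )
```

Returning `None` when no passage matches is one way to keep the model from answering without source material. This is also why converting the flowcharts and decision trees into plain text matters: the retriever can only find what exists as text.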

We also had a lot of evaluation materials — health workers take quizzes at the beginning and end of the courses, which help us evaluate the courses’ impact on learning levels. We also have a module on ethics. These became evaluation materials for the LLM. We made sure at every step that the LLM was able to give correct answers in the quiz and match the ethical aspects with the correct answers. So, even before we began using LLMs widely, we ensured it worked well in a variety of contexts.
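A quiz-based check of this kind can be pictured as a small test harness. The Python sketch below is a hypothetical illustration, not ARMMAN's evaluation code: it runs any answering function over a list of (question, expected answer) pairs, reports accuracy, and lists the questions that failed so they can be reviewed. The exact-match comparison suits multiple-choice items only, and the 90% pass mark is an arbitrary placeholder.

```python
def evaluate(answer_fn, quiz, pass_rate=0.9):
    """Score `answer_fn` on a quiz of (question, expected_answer) pairs.

    `answer_fn` is any callable that takes a question string and returns
    an answer string, for example a wrapper around an LLM.
    """
    correct = 0
    failures = []
    for question, expected in quiz:
        got = answer_fn(question).strip().lower()
        if got == expected.strip().lower():
            correct += 1
        else:
            failures.append(question)
    accuracy = correct / len(quiz) if quiz else 0.0
    return {
        "accuracy": accuracy,
        "passed": accuracy >= pass_rate,
        "failures": failures,
    }
```

Running the same harness after every change to the prompts or source documents is what makes the small, incremental evaluation described above possible.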

How do you ensure that AI applications promote health equity and access rather than deepening existing divides?

We follow a problem-first approach and not a technology-first approach for our innovation pilots — AI as well as non-AI. We identify the core problem we have to solve for our users and how technology or AI can solve this problem more efficiently. That ensures we use AI only to create meaningful impact.

ARMMAN’s pilots also go through an ethics review. There is an interdisciplinary team that looks at the study design and preliminary results and thinks through the risks: potential harm and sources of bias.

We work with external collaborators on our AI projects, but we do not share any personally identifiable information with them. Internally also, only those who cannot do their job without this data have access to it; others don’t.

To ensure equity, we follow inclusive design principles. We do extensive user research to understand how our AI projects, especially LLMs, will be used, perceived and interpreted on the ground. We figure out who could get left out if we introduce certain technologies in our program.

For example, in the case of the chatbot we spoke about earlier, we did user research even before we developed the chatbot. We used a prototyping technique called ‘Wizard of Oz’. In this experiment, we simulate an automated experience, but a human actually controls the flow. For the chatbot, we had ANMs send their questions on WhatsApp, but instead of chatbots responding, a human at the other end replied using a set of scripts. It was not a free-flowing conversation.

We learnt early on that many ANMs cannot type. They have nursing diplomas, they can read and write, but they are not comfortable typing complex messages with medical terms. So, they defaulted to sending voice messages. Our initial plan was to launch a proof of concept that only used text mode because voice is much harder to get right. But after the experiment, we realized that we couldn’t launch a product that didn’t have voice mode because many ANMs would be left out. Often, the ANMs not comfortable typing are the ones who probably need this kind of service the most. So, it wouldn’t just leave out a certain percentage of users, but also those users who would benefit the most from the service. So, we made sure we prioritized voice mode even if it delayed development.

Glossary

1. India has a three-tier public health system: primary, secondary and tertiary. Primary healthcare comprises community health workers and doctors at ‘primary health centers’. They are often the first point of contact for most pregnant women and families.
2. ASHA stands for Accredited Social Health Activist. They are Indian grassroots health workers who provide health education and encourage people to avail of public healthcare services.
3. A randomized controlled trial (RCT) is a scientific study to test the efficacy of an intervention. In these studies, participants are randomly allocated to either a treatment group (those who receive the intervention) or a control group (those who do not receive the intervention).
4. Pilot projects are small-scale preliminary studies to test new services, projects, or products before deploying them at a large scale.
5. Machine learning is a branch of artificial intelligence (AI) that enables machines to automatically learn from data and past experiences to identify patterns and make predictions with minimal human intervention. Data science is the study of data to extract meaningful insights. Generative AI is an artificial intelligence technology that can produce content such as text, video, images, etc., usually in response to a prompt. Large Language Models are artificial intelligence tools that can comprehend and produce human language. For example, ChatGPT.
6. Auxiliary Nurse Midwives are grassroots health workers who provide basic nursing care. They provide antenatal check-ups and immunization, among other maternal and child health services.

Disclaimer

Some of the visuals used in this blog are AI-generated on Canva.

Citations

1. Khan M, Khurshid M, Vatsa M, Singh R, Duggal M, Singh K. On AI Approaches for Promoting Maternal and Neonatal Health in Low Resource Settings: A Review. Front Public Health. 2022;10:880034. Published 2022 Sep 30. doi:10.3389/fpubh.2022.880034
2. Cullinan M. Innovative app for detecting malnutrition wins 2023 Gold Anthem Award. Action Against Hunger. Published February 15, 2023. Accessed June 22, 2024. https://www.actionagainsthunger.org/press-releases/innovative-app-for-detecting-malnutrition-wins-2023-gold-anthem-award/.
3. Islam MN, Mustafina SN, Mahmud T, Khan NI. Machine learning to predict pregnancy outcomes: a systematic review, synthesizing framework and future research agenda. BMC Pregnancy and Childbirth. 2022;22(1). doi:10.1186/s12884-022-04594-2
4. Leveraging technology to create scalable solutions to empower mothers and enable the nurturing of healthy children. ARMMAN. Accessed May 28, 2024. https://armman.org/.
5. This Mother’s Day Help Expecting Mothers and Their Babies Survive. Child Rights and You. Published May 7, 2022. Accessed June 18, 2024. https://www.cry.org/blog/this-mothers-day-help-expecting-mothers-and-their-babies-survive/.
6. Reddy PMC, Rineetha T, Sreeharshika D, Jothula KY. Health care seeking behaviour among rural women in Telangana: A cross sectional study. Journal of Family Medicine and Primary Care. 2020;9(9):4778-4783. doi:10.4103/jfmpc.jfmpc_489_20
7. Moradhvaj, Saikia N. Gender disparities in health care expenditures and financing strategies (HCFS) for inpatient care in India. SSM Popul Health. 2019;9:100372. Published 2019 Feb 2. doi:10.1016/j.ssmph.2019.100372
8. The Editorial Board. Women’s health is just not considered important enough. The Telegraph Online. Published August 13, 2019. Accessed June 22, 2024. https://www.telegraphindia.com/opinion/womens-health-is-just-not-considered-important-enough/cid/1697387.
9. Mahajan N, Kaur B. Community health workers in rural Punjab, India: analyzing their role, expectations and challenges. Journal of Health Research. 2021;36(2):255-264. doi:10.1108/jhr-04-2020-0103
10. Lalan A, Verma S, Madhu Sudan K, Mahale A, Hegde A, Tambe M, Taneja A. Analyzing and Predicting Low-Listenership Trends in a Large-Scale Mobile Health Program: A Preliminary Investigation. KDD workshop on Data Science for Social Good; July 2023. https://arxiv.org/pdf/2311.07139.

ABOUT THE THOUGHT LEADERSHIP FOR PUBLIC HEALTH FELLOWSHIP

The Boston Congress of Public Health Thought Leadership for Public Health Fellowship (BCPH Fellowship) seeks to:

Incubate the next generation of thought leaders in public health;
Advance collective impact for health equity through public health advocacy; and
Diversify, democratize, and broaden evidence-based public health dialogue and expression.

It is guided by an overall vision to provide a platform, training, and support network for the next generation of public health thought leaders and public scholars to explore and grow their voice.