Using Diet Analysis to Predict and Prevent Child Malnutrition



HDSC Winter 23 Capstone Project Presentation

A Project by Team SEABORN


Malnutrition, as highlighted by the World Health Organization (WHO), remains a pressing global health issue, particularly in low- and middle-income countries. It results from an inadequate or unbalanced diet and hinders growth and development in both children and adults. Undernutrition, a significant factor in child mortality, contributes to approximately 45% of all deaths among children under five worldwide. Stunting, wasting, and underweight are forms of undernutrition that increase children’s vulnerability to common childhood diseases such as diarrhea, pneumonia, and malaria, because insufficient nutrition weakens the immune system.

WHO estimated that in 2020, 149 million children under five were stunted, 45 million were wasted, and 38.9 million were overweight or obese, representing various facets of malnutrition. To combat this global issue, the World Health Assembly is closely examining the factors that contribute to malnutrition in children and adults and how to address them. A nutritious diet is a crucial element in giving children the nutrients they need, reducing the burden of malnutrition and preventing stunting, wasting, and obesity. Analyzing the components of a child’s diet is pivotal in achieving these goals.

Problem Statement

This project seeks to use machine learning and deep learning algorithms to build models that predict malnutrition (stunting, wasting, overweight) in children under the age of 5, using the child’s diet as the predictor.

Aim and Objectives

The aim is to use features of children’s diets to predict and prevent malnutrition in children under the age of 5.

The objectives include:

  1. Investigating the elements of the child’s diet.
  2. Developing a machine/deep learning model to predict malnutrition in children under 5.
  3. Evaluating and deploying the model.

Data Understanding

The data were obtained from the Global Nutrition Report website. The Global Nutrition Report captures the state of nutrition and progress towards the global nutrition targets at the country, regional and global levels. To predict and prevent malnutrition in children using diet analysis, the diet and burden of malnutrition datasets are used.

Diet: The diet dataset comprises variables such as exclusive breastfeeding, early initiation, solid foods, minimum diet diversity, minimum acceptable diet and other nutrients which are important for both children and adults.

Burden of Malnutrition: The burden of malnutrition dataset contains variables such as stunting, wasting, low birth weight and overweight for children below 5 years, as well as other measures of malnutrition for adults, at both country and regional levels. These values are given as percentages.

Data Preparation

  • Reducing Dimensionality (Number of Features)

The features from both the diet and burden of malnutrition datasets were reduced to only those relating to children under 5. After cleaning, the diet dataset had 126 columns and 3242 rows, while the burden of malnutrition dataset had 67 columns and 5906 rows. The features from the burden of malnutrition dataset were then summed to get the overall wasting, stunting and overweight values.

  • Missing Values, Duplicates and Outliers

All missing values were replaced with 0.
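The preparation steps above can be sketched in pandas. This is only an illustration under stated assumptions: the column names (`stunting_boys`, `adult_obesity`, and so on) and the toy values are hypothetical, and dropping duplicate rows is assumed from the heading rather than described in the text.

```python
import numpy as np
import pandas as pd

# Hypothetical toy stand-in for the burden of malnutrition dataset.
burden = pd.DataFrame({
    "country": ["A", "B", "B"],
    "year": [2010, 2010, 2010],
    "stunting_boys": [10.0, np.nan, np.nan],
    "stunting_girls": [8.0, 12.0, 12.0],
    "wasting_boys": [3.0, 4.0, 4.0],
    "wasting_girls": [np.nan, 5.0, 5.0],
    "adult_obesity": [20.0, 25.0, 25.0],  # not an under-5 feature
})

def prepare(df, outcomes=("stunting", "wasting")):
    """Keep under-5 outcome columns, fill gaps with 0, sum per outcome."""
    out = df[["country", "year"]].copy()
    filled = df.fillna(0)  # all missing values replaced with 0
    for name in outcomes:
        # Columns sharing an outcome prefix are summed into one total.
        cols = [c for c in df.columns if c.startswith(name + "_")]
        out[name] = filled[cols].sum(axis=1)
    # Assumption: exact duplicate rows are dropped.
    return out.drop_duplicates().reset_index(drop=True)

clean = prepare(burden)
print(clean)
```

Filling with 0 makes a missing measurement indistinguishable from a true 0% prevalence, which is worth keeping in mind when reading the model errors later on.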

Exploratory Data Analysis

EDA is an important step in the data analysis process because it allows us to understand the data before applying any statistical models or making any decisions. After carrying out EDA, the following observations were made.

  • Africa and Asia have recorded the highest number of stunting and wasting cases from 2000–2021.
  • Europe has the highest number of severe overweight cases, followed by North America.
  • Burundi, an African country, has had the highest percentage of stunted children over the past two decades, while Ukraine leads with the highest number of overweight cases in children under the age of 5.
  • Regions in Europe have the highest percentage of children receiving food from all age groups while regions in Africa have the lowest percentage.
  • Children aged 20–23 months have the highest percentage of receiving food.
  • Children aged 0–5 months are exclusively breastfed across the different regions.
  • Across all regions the percentage of children who are exclusively breastfed reduces as age increases.
  • Globally boys are more affected by the burden of malnutrition than girls.

Prevalence of Malnutrition in Children under the age of 5 at Regional Level

Figure 1: Prevalence of stunting, wasting, and overweight across regions (2000–2021)

Top 10 countries with the most cases of malnutrition from 2000–2021

Figure 2: Top 10 leading countries with highest cases of stunting and overweight from 2000–2021

Modelling & Evaluation

The deep learning regression and machine learning regression algorithms were used to build models. The response variables are stunting, wasting and overweight while the predictors are all the elements of the diet dataset.

The following procedures are carried out.

  • Train and Test Sets

The training and testing sets were formed from an 80/20 split (respectively) of the dataset.
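A minimal sketch of an 80/20 split, written with NumPy only. The function name `split_80_20`, the seed and the toy arrays are assumptions for illustration; scikit-learn's `train_test_split` with `test_size=0.2` does the same job.

```python
import numpy as np

def split_80_20(X, y, test_fraction=0.2, seed=42):
    """Shuffle row indices, then hold out `test_fraction` of rows for testing."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Toy data: 10 rows, 5 diet features (hypothetical values).
X = np.arange(50, dtype=float).reshape(10, 5)
y = np.arange(10, dtype=float)

X_train, X_test, y_train, y_test = split_80_20(X, y)
print(X_train.shape, X_test.shape)
```

Fixing the seed keeps the split reproducible, so different models are compared on the same held-out rows.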

  • Deep Regression Model

Three layers were used: an input, a hidden and an output layer, comprising 100, 10 and 1 neurons respectively. The model was compiled with the Adam optimizer and the mean absolute error (MAE). It was trained for 100 epochs and for 500 epochs.
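To make the architecture concrete, here is a from-scratch NumPy sketch of a 100-10-1 network trained with Adam on an MAE loss. It is not the team's code. Several details are assumptions because the text does not state them: ReLU activations, He initialization, one full-batch update per epoch, MAE used as the training loss, the learning rate, and the synthetic data.

```python
import numpy as np

def init_params(n_in, rng):
    """Weights and biases for layers of 100, 10 and 1 units."""
    sizes = [n_in, 100, 10, 1]
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        params.append(rng.normal(0.0, np.sqrt(2.0 / fan_in), (fan_in, fan_out)))
        params.append(np.zeros(fan_out))
    return params

def forward(params, X):
    W1, b1, W2, b2, W3, b3 = params
    h1 = np.maximum(0.0, X @ W1 + b1)   # ReLU (assumed activation)
    h2 = np.maximum(0.0, h1 @ W2 + b2)
    return h1, h2, (h2 @ W3 + b3).ravel()

def mae_and_grads(params, X, y):
    """MAE loss and its (sub)gradients via backpropagation."""
    W1, b1, W2, b2, W3, b3 = params
    h1, h2, out = forward(params, X)
    loss = np.mean(np.abs(out - y))
    d_out = (np.sign(out - y) / len(y))[:, None]
    d_h2 = (d_out @ W3.T) * (h2 > 0)
    d_h1 = (d_h2 @ W2.T) * (h1 > 0)
    grads = [X.T @ d_h1, d_h1.sum(axis=0),
             h1.T @ d_h2, d_h2.sum(axis=0),
             h2.T @ d_out, d_out.sum(axis=0)]
    return loss, grads

def train(X, y, epochs=100, lr=1e-3, seed=0):
    """Full-batch Adam; returns the parameters and the per-epoch loss."""
    rng = np.random.default_rng(seed)
    params = init_params(X.shape[1], rng)
    m = [np.zeros_like(p) for p in params]
    v = [np.zeros_like(p) for p in params]
    beta1, beta2, eps = 0.9, 0.999, 1e-8
    history = []
    for t in range(1, epochs + 1):
        loss, grads = mae_and_grads(params, X, y)
        history.append(loss)
        for i, g in enumerate(grads):
            m[i] = beta1 * m[i] + (1 - beta1) * g
            v[i] = beta2 * v[i] + (1 - beta2) * g ** 2
            m_hat = m[i] / (1 - beta1 ** t)   # bias correction
            v_hat = v[i] / (1 - beta2 ** t)
            params[i] = params[i] - lr * m_hat / (np.sqrt(v_hat) + eps)
    return params, history

# Synthetic stand-in for diet features and a prevalence target.
rng = np.random.default_rng(1)
X = rng.random((200, 5))
y = 10 + 5 * X[:, 0] - 3 * X[:, 1] + rng.normal(0, 0.1, 200)

params, history = train(X, y, epochs=200, lr=0.01)
print(f"MAE: {history[0]:.2f} -> {history[-1]:.2f}")
```

The `history` list is what a loss curve plots, so it can be used to check where the loss flattens out. In a framework such as Keras, the same setup is typically expressed as three `Dense` layers followed by `compile(optimizer="adam", loss="mae")`.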

Deep Learning Regression Model Building (1)

Deep Learning Regression Model Building (2)

Evaluation Result (1): 100 epochs

Evaluation Result (2): 500 epochs

Loss Curve Plots after 100 epochs

Loss Curve after 100 epochs

Based on the plot above, the model’s loss stabilizes after 20 epochs. The loss value did not decrease any further as the number of epochs increased.


Loss Curve after 500 epochs

  • Machine Learning Models

Six machine learning algorithms were used to develop the predictive models, and mean absolute error (MAE) and R² were used for evaluation. Polynomial regression outperformed the other models, with the lowest MSE of 2.85.
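As an illustration of polynomial regression and the error metrics mentioned here, the sketch below expands features to degree 2, fits them by ordinary least squares, and computes MAE, MSE and R² on held-out rows. The degree, the helper names and the synthetic data are assumptions; the text does not state the settings used in the project.

```python
from itertools import combinations_with_replacement

import numpy as np

def poly_features(X, degree=2):
    """Bias column plus all monomials of the inputs up to `degree`."""
    cols = [np.ones(len(X))]
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(X.shape[1]), d):
            cols.append(np.prod(X[:, list(combo)], axis=1))
    return np.column_stack(cols)

def fit_least_squares(Phi, y):
    """Ordinary least squares on the expanded features."""
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return coef

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

def r2_score(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Synthetic, noise-free data with an interaction term (hypothetical).
rng = np.random.default_rng(0)
X = rng.random((100, 2))
y = 1 + 2 * X[:, 0] - X[:, 1] + 0.5 * X[:, 0] * X[:, 1]

# Fit on the first 80 rows, evaluate on the remaining 20.
Phi = poly_features(X, degree=2)
coef = fit_least_squares(Phi[:80], y[:80])
y_pred = Phi[80:] @ coef

print("MAE:", mae(y[80:], y_pred))
print("MSE:", mse(y[80:], y_pred))
print("R2 :", r2_score(y[80:], y_pred))
```

MAE and MSE are on different scales (MSE is in squared units of the target), so values of the two metrics should not be compared with each other directly.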

Results from Machine Learning Models


Recommendations

It is recommended that:

  • targeted interventions be set up to encourage and support exclusive breastfeeding in the first five months of an infant’s life.
  • increased efforts be made to improve the dietary diversity in regions with lower percentages of children receiving food from multiple groups, particularly in younger age groups.

Data on more dietary nutrients are needed for further analysis.


Conclusion

Based on our findings, Polynomial Regression moderately outperformed the other machine learning algorithms in predicting the malnutrition status of children. The deep learning model was built with the Adam optimizer and mean absolute error. This research primarily centered on identifying and predicting key risk factors associated with stunting, wasting, and overweight using machine learning algorithms, aiming to contribute to the reduction of malnutrition among children.


References

  • “Investigate the risk factors of stunting, wasting, and underweight among under-five Bangladeshi children and its prediction based on machine learning approach.”
  • World Health Organization, “Malnutrition.”
  • A. Talukder and B. Ahammed, “Machine learning algorithms for predicting malnutrition among under-five children in Bangladesh,” Nutrition, vol. 78, Oct. 2020, doi: 10.1016/j.nut.2020.110861.



