African Conflict Over-time: Fatality Forecasting

HDSC Fall ’23 Premiere Project By Team DataMind

Feb 6, 2024

Project Overview


“History is a vast early warning system.”

Norman Cousins, American journalist (1915–1990)

Over the years, Africa has endured many conflicts, ranging from severe cases (genocide and massacres) to milder ones. Africa ranks second among the regions in terms of the number of non-international armed conflicts (NIACs), with over 35 such conflicts occurring in countries including Burkina Faso, Cameroon, the Central African Republic (CAR), the Democratic Republic of the Congo, Ethiopia, Mali, Mozambique, Nigeria, Senegal, Somalia, South Sudan, and Sudan, involving multiple armed groups fighting government forces and/or each other (geneva-academy). Machine learning models make it possible to examine historical data and detect regions or circumstances that may be susceptible to an escalation in conflict and a surge in casualties. This supports conflict management and enables proactive interventions and preventive measures to mitigate violence.

Goal and Objectives

“If you want to understand today you have to search yesterday.”

Pearl S. Buck, American novelist (1892–1973)

The goal of this project is to build a machine learning model that predicts conflict fatalities. Such a model can serve as an early warning system to support conflict management and help prevent escalation. To achieve this goal, we set the following objectives:

  • Explore and analyse historical data on conflicts across Africa.
  • Build a machine learning model to predict casualties.
  • Evaluate and improve the model.
  • Deploy the model for public use.

Project Description

Data Collection

Data Source

The African conflict dataset was sourced from Kaggle: Africa Conflict 1997–2020.

Data Description

The dataset provides historical information on conflict events across Africa from 1997 to 2020. The recorded characteristics include conflict type, location, agent, date, political violence event, and fatalities, among others.

Data Pre-processing

The dataset consists of 65535 entries and 29 features, a mixture of text and numerical features. Some of the text features contain missing values, which we filled with a blank space. We also checked for duplicate values and data type mismatches.
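The pre-processing steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`YEAR`, `COUNTRY`, `NOTES`, `FATALITIES`) and the toy rows are assumptions made for the example, and dropping duplicates (rather than only checking for them) is one possible way to handle what the check finds.

```python
import pandas as pd

# Toy stand-in for the Kaggle data. Column names and values are
# assumptions for illustration, not taken from the real dataset.
raw = pd.DataFrame({
    "YEAR": [1999, 1999, 2005, 2005],
    "COUNTRY": ["Eritrea", "Eritrea", "Angola", "Angola"],
    "NOTES": ["Clash near border", "Clash near border", None, None],
    "FATALITIES": ["12", "12", "0", "3"],  # stored as text: a type mismatch
})

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing text, drop exact duplicates, and fix numeric types."""
    out = df.copy()
    # Fill missing values in the text columns with a single space.
    text_cols = ["COUNTRY", "NOTES"]  # hypothetical list of text features
    out[text_cols] = out[text_cols].fillna(" ")
    # Remove rows that are exact duplicates of an earlier row.
    out = out.drop_duplicates().reset_index(drop=True)
    # Convert fatalities to a numeric type; unparseable values become NaN.
    out["FATALITIES"] = pd.to_numeric(out["FATALITIES"], errors="coerce")
    return out

clean = preprocess(raw)
print(clean.dtypes)
print(clean)
```

In practice, the list of text columns would come from inspecting the real dataset rather than being hard-coded.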

Exploratory Data Analysis

In this section, we sought a deeper understanding of the dataset and its key insights. We plotted several of the features to understand trends and patterns of conflict across Africa, and performed exploratory data analysis to answer the following questions:

  1. In which year did Africa suffer the highest casualties and which region/country was most affected?
  2. Which Event/Sub event type will likely lead to more casualties in each country?
  3. What has been the pattern of violence occurrence over time?

a) In which year did Africa suffer the highest casualties and which region/country was most affected?

Africa recorded its highest number of casualties in 1999, followed by 1998, as shown in Figure 2. From 2004 to 2008 there were fewer conflict casualties. East and Central Africa appear to suffer the most casualties every year, with the deadliest year being 1999 (Figure 3). Central Africa suffered heavy losses in 1998 and 1999. Next, we look at the countries in East and Central Africa that were most affected.
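An aggregation of this kind can be sketched with a pandas `groupby`. The snippet below is only an illustration: the column names (`YEAR`, `REGION`, `FATALITIES`) are assumed, and the numbers are made-up toy values, not figures from the dataset.

```python
import pandas as pd

# Made-up toy events; the values are NOT real casualty figures.
events = pd.DataFrame({
    "YEAR": [1998, 1999, 1999, 2005, 2005],
    "REGION": ["Central Africa", "East Africa", "Central Africa",
               "East Africa", "West Africa"],
    "FATALITIES": [300, 900, 400, 10, 5],
})

# Total fatalities per year, and the year with the highest total.
fatalities_by_year = events.groupby("YEAR")["FATALITIES"].sum()
deadliest_year = int(fatalities_by_year.idxmax())

# Total fatalities per (year, region), then the worst region in that year.
by_year_region = events.groupby(["YEAR", "REGION"])["FATALITIES"].sum()
worst_region = by_year_region.loc[deadliest_year].idxmax()

print(fatalities_by_year)
print(deadliest_year, worst_region)
```

The same two grouped series are what a bar or line plot of casualties per year and per region would be drawn from.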

Figure 2: Number of casualties over the years (1997 to 2020)

Figure 3: Region with the highest number of casualties

In East Africa, Eritrea and Ethiopia were the most affected (Figure 4). The two countries suffered heavy losses in 1999, with Eritrea suffering the most. Investigating further, we found that Eritrea and Ethiopia were at war with each other in 1999 (Source). Known as the Badme War, this was a major armed conflict between Ethiopia and Eritrea that lasted from May 1998 to June 2000, and many casualties were reported.

In the Central Africa region, Angola suffered the most conflict casualties in 1998 and 1999 (Figure 5). This is because Angola was still in the middle of a civil war that had started years earlier (Source).

“In 1998, the then President Dos Santos initiated an armed offensive against UNITA which is a group known as the rebels. Both the Angolan government and UNITA engaged in scorched-earth offensives, siege warfare, and other tactics that primarily targeted civilians. UNITA in particular aimed to push civilian populations into government-held cities to stress the government’s capacity to protect its citizens.” Source

By the end of 1999, the country was strewn with landmines, which endangered citizens in the area.

Figure 4: A focus on East Africa

Figure 5: A Focus on Central Africa

b) Which Event/Sub event type will likely lead to more casualties in each country?

From the graph (Figure 6), we observe that in most countries battles caused the most casualties. These are followed by violence against civilians and, in some cases such as the DRC, riots. Next, we investigate which sub-event types lead to more casualties.

Our plot (Figure 7) shows that armed clashes and attacks account for the majority of casualties in the various countries. This is because armed clashes and attacks are the most frequent sub-event types in Africa, as shown in the plot below.

Figure 6: Fatality record by event type

Figure 7: Fatality record by sub-event type

c) What has been the pattern of violence occurrence over time?

It is noteworthy that although the occurrence of violence has increased steadily over the years, the number of fatalities resulting from these conflicts has decreased significantly compared to earlier years (Figure 8). This trend may be attributed to the emergence of other event types that are less brutal than the battle events that dominated between 1997 and 2003.

Figure 8: Violence occurrence by year

Taking a deeper look at the occurrence of different conflict events over the years, we can see that between 2019 and 2020 there were more protests than in other years (Figure 9). Recall that the COVID-19 pandemic affected the whole world during this period, and there were many protests against government policies implemented to mitigate it. In earlier years, from 1997 to 2009, battles and violence against civilians made up most conflict events, while protests were comparatively rare.

Figure 9: Violence occurrence by year for each event type

Model Development

Model Evaluation

To evaluate our model's predictions, we used three regression metrics: mean absolute error (MAE), root mean squared error (RMSE), and the R² score.
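For reference, the three metrics can be computed directly from their standard definitions. The sketch below uses NumPy and a handful of made-up true and predicted values; the helper name `regression_metrics` is hypothetical, and this is not the project's evaluation code.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and R² for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred
    # Mean absolute error: average size of the errors.
    mae = np.mean(np.abs(errors))
    # Root mean squared error: penalises large errors more heavily.
    rmse = np.sqrt(np.mean(errors ** 2))
    # R²: 1 minus residual sum of squares over total sum of squares.
    ss_res = np.sum(errors ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": float(mae), "rmse": float(rmse), "r2": float(r2)}

# Made-up fatality counts and predictions, for illustration only.
scores = regression_metrics([0, 2, 4, 10], [1, 2, 3, 8])
print(scores)
```

MAE and RMSE are in the same unit as the target (fatalities), while R² is unitless, which is why the three are often reported together.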

Model Validation

To guard against overfitting, we used k-fold cross-validation with k = 5 splits.
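A 5-fold cross-validation run could look roughly like the sketch below, using scikit-learn. The synthetic data and the choice of `LinearRegression` are placeholders chosen for the example, since the model and features used in the project are not described here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic features and target, standing in for the real dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 5 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Placeholder model; any scikit-learn regressor could be used instead.
model = LinearRegression()

# Five folds: each fold serves once as the held-out validation set.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# scikit-learn reports errors as negative scores, so flip the sign.
neg_mae = cross_val_score(model, X, y, cv=cv,
                          scoring="neg_mean_absolute_error")
mae_per_fold = -neg_mae

print(mae_per_fold)
print(mae_per_fold.mean())
```

Comparing the per-fold scores gives a sense of how stable the model is: a large spread between folds suggests the result depends heavily on which rows were held out.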


“Those who do not remember the past are condemned to repeat it.”

George Santayana, American philosopher (1863–1952)

The goal of this project is to create an early warning system that uses machine learning to detect circumstances that may be susceptible to an escalation in conflict and a surge in casualties. We analysed the historical dataset on conflicts in Africa and gained insights into the pattern of violence and the number of fatalities over time, as well as the countries and regions that have suffered the most.

We built a machine-learning model to predict the number of fatalities. Our model evaluation results showed promising applications in detecting and preventing conflict escalation.



