HDSC Winter ’22 Premiere Project Presentation: Global Population Estimates

HamoyeHQ
Mar 3, 2022

A Project by Team Clustering

Introduction

The world’s population has grown tremendously over the last half century. The global population was around 3 billion in 1960. By 1987, less than three decades later, it had surpassed 5 billion, and by 2018 there were around 7.6 billion people in the world.

This growth varies greatly across regions. Since 1960, the largest relative growth has taken place in Sub-Saharan Africa, where the population expanded from 227 million in 1960 to more than 1 billion in 2018, a nearly fivefold increase.

The second largest growth over the period can be seen in the Middle East and North Africa, where the population increased more than fourfold, from 105 million to 449 million. Because growth differs so much from region to region, we need a model that can predict the population for the year 2050 with high accuracy.
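The regional growth figures quoted above can be checked with a little arithmetic. The sketch below is only an illustration: the helper names `growth_factor` and `annual_growth_rate` are our own, and "more than 1 billion" is taken as exactly 1,000 million, so the Sub-Saharan Africa result is a lower bound.

```python
def growth_factor(start, end):
    """Return how many times larger `end` is than `start`."""
    return end / start


def annual_growth_rate(start, end, years):
    """Compound annual growth rate implied by growing from `start` to `end`."""
    return (end / start) ** (1 / years) - 1


# Populations in millions for 1960 and 2018, as quoted in the text.
# Assumption: "more than 1 billion" is rounded down to 1000.
regions = {
    "Sub-Saharan Africa": (227, 1000),
    "Middle East and North Africa": (105, 449),
}
YEARS = 2018 - 1960

for name, (pop_1960, pop_2018) in regions.items():
    factor = growth_factor(pop_1960, pop_2018)
    rate = annual_growth_rate(pop_1960, pop_2018, YEARS)
    print(f"{name}: x{factor:.2f} overall, {rate:.2%} per year")
```

The compound annual rate is a convenient way to compare regions, because it puts growth over the same 58-year span on a per-year footing.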

Aims and Objectives

This project aims to identify, describe and visualize the population structures of countries across continents. It also predicts populations for the year 2050 on the basis of previous years’ populations. To do this, we build data mining models and aim for high prediction accuracy.
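To make the idea of predicting 2050 from earlier years concrete, here is a minimal sketch that fits a straight line to the three rounded world totals mentioned in the introduction and extrapolates it. This is not the model the team built; the linear trend, the `fit_line` helper and the rounded inputs are all assumptions chosen to keep the example small.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Rounded world population in billions, taken from the introduction.
years = [1960, 1987, 2018]
population_bn = [3.0, 5.0, 7.6]

slope, intercept = fit_line(years, population_bn)
estimate_2050 = slope * 2050 + intercept
print(f"Linear-trend estimate for 2050: {estimate_2050:.1f} billion")
```

A real model would train on the full yearly series per country and would be validated on held-out years before its accuracy could be judged.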

Data Collection

Link to dataset:

https://www.kaggle.com/theworldbank/global-population-estimates

This dataset contains a variety of population estimates for countries around the world, with 44,812 rows and 95 columns. Each country is tabulated against its population estimates.

Data Cleaning

Data cleaning is essential in any machine learning project, because the success of model training depends on how well the project data has been cleaned. Our group spent considerable time on this step. We filtered out null values and missing data, and separated categorical values from numeric values. After doing this, our data was ready for exploratory data analysis (EDA).
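The cleaning steps described above might look roughly like the following in pandas. The small DataFrame and its column names are hypothetical stand-ins for the real dataset, and dropping only rows with no estimates at all is one possible reading of "filtered out"; the team's exact rules are not stated.

```python
import pandas as pd

# Hypothetical miniature of the dataset: text columns plus yearly estimates.
df = pd.DataFrame({
    "Country Name": ["A", "B", "C", "D"],
    "Indicator Name": ["Population, total"] * 4,
    "1960": [1.0, None, 3.0, None],
    "2050": [2.0, 4.0, None, None],
})

# Separate categorical (non-numeric) columns from numeric ones.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

# Assumption: rows with no numeric estimate at all are filtered out.
df_clean = df.dropna(subset=numeric_cols, how="all")

print("numeric:", numeric_cols)
print("categorical:", categorical_cols)
print("rows kept:", len(df_clean), "of", len(df))
```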

Missing Values in Dataset

We handled null values by replacing them with the mean, the median or zero.
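As a rough illustration of the three replacement strategies, the sketch below applies each one to a toy table with pandas. The values and column names are made up, and the source does not say which strategy was applied to which columns, so all three are shown side by side.

```python
import pandas as pd

# Toy yearly estimates with gaps (values are made up for illustration).
df = pd.DataFrame({
    "1960": [1.0, None, 3.0, 8.0],
    "1961": [2.0, 4.0, None, 6.0],
    "1962": [None, 5.0, 7.0, 9.0],
})

# Each strategy fills a missing cell from its own column.
mean_filled = df.fillna(df.mean())      # column mean
median_filled = df.fillna(df.median())  # column median
zero_filled = df.fillna(0)              # constant zero

print(mean_filled)
print(median_filled)
print(zero_filled)
```

The median is less affected by extreme values than the mean, while zero is only sensible where a missing entry genuinely means "none".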

Heat map before data cleaning
Heat map after data cleaning

Annual World Population Growth Rate

World Population By Selected Region in 1960

World Population By Selected Region in 2050

Conclusion

The project’s objectives were adequately met, and we were able to build a model that predicts population estimates for the year 2050 across countries.

Thank you for reading.
