HDSC Spring ’24 Premiere Projects

Apr 29, 2024

Countless breakthroughs have transformed the landscape of the machine learning industry, from artificial neural networks (ANNs) and autonomous robots to intelligent chatbots and commercialized AI. These strides wouldn’t have been possible without the bold visionaries who delved deep into the realm of machine learning, paving the way for unprecedented achievements. Today, humanity reaps the rewards of their pioneering work, which continues to evolve and shape our future.

At the core of our mission lies the commitment to foster the next generation of problem solvers. We are dedicated to providing every intern with hands-on experience, allowing them to apply their theoretical knowledge to real-world scenarios. Throughout their tenure at Hamoye, interns are encouraged to embark on practical machine learning projects. This not only reinforces their understanding of key concepts but also sharpens their ability to tackle tangible challenges. As John Kenneth Galbraith once said, “Technology is the systematic application of knowledge to practical tasks.”

HDSC Spring ’24 Premiere Projects

Details of the premiere projects are provided here. Find your project details using the project topic assigned to your project group.


Topic: Remittance patterns and economic development (Dataset)

Project Instructions

Focusing on the top remittance-recipient countries, analyze the economic impact, observable trends, future projections, and key insights. Members on the Gen AI track should consider how AI technologies could enhance these findings, potentially improving the accuracy of predictions and uncovering deeper insights into the data.


Topic: Developmental Stage Classification of Animals (Dataset)

Project Instructions

Classify animals by developmental stage using machine learning techniques. Leverage the dataset collected from nature reserves in South Africa to develop a robust classification model capable of accurately determining the developmental stage of animals observed in the wild. Contribute to wildlife conservation efforts by providing insights into population dynamics and ecological health through non-invasive monitoring methods.

Gen AI interns within the group can use the provided dataset to extend the project in several ways. By analyzing the images of animals and their habitats, generative models can produce new images, augment the dataset, and detect anomalies. They can also learn to segment images into different semantic classes and impute missing data. Through these methods, Gen AI track interns can improve the understanding of wildlife environments and contribute to tasks such as animal detection, habitat mapping, and ecological analysis, ultimately supporting wildlife conservation efforts.


Topic: Predicting High-Risk Zones for Malaria Outbreaks (Dataset)

Project Instructions

In 2022, there were 249 million malaria cases globally, resulting in 608,000 deaths, with children under 5 accounting for 76% of those deaths. Leveraging this data, develop predictive models to identify areas prone to malaria outbreaks, contributing to proactive measures in combating the disease.

Gen AI learners will analyze malaria data, develop predictive models, and evaluate their performance. By addressing real-world health challenges, learners gain practical experience in data analysis and machine learning while contributing to efforts to prevent malaria-related deaths.


Topic: Forecasting Antimalarial Drug Needs (Dataset)

Project Instructions

To accurately forecast the demand for antimalarial drugs in Africa, it is essential to acknowledge the stark reality that malaria significantly impacts child mortality, with a child under five dying nearly every minute from the disease. The goal is to provide robust predictions of antimalarial drug requirements across the continent. Members focusing on the Gen AI track should explore how AI technologies could refine these forecasts by enhancing the precision of predictions and providing deeper analytical insights into the data. AI could be instrumental in modeling complex variables that affect drug demand, such as seasonal outbreaks, population growth, and healthcare access improvements.


Topic: Predicting future electrification needs (Dataset)

Project Instructions

Utilize the database containing information on electricity access percentages from various sources across different countries to assess and predict their future electrification needs. Your research should analyze current trends and project future demands, focusing on enhancing the accuracy of your forecasts. With the help of the Gen AI intern(s), incorporate AI technologies to process and analyze large datasets effectively, identify patterns, and make data-driven predictions. This integration of AI will not only refine your analysis but also provide deeper insights, potentially revealing underlying factors that influence electrification needs. Aim to develop actionable strategies that can guide the expansion and optimization of electrical infrastructure based on your findings.


Topic: Predicting School Completion Rates in Emerging Economies (Dataset)

Project Instructions

Utilize the dataset to assess and predict school completion rates in developing countries, focusing on the impact of educational attainment. Begin by segmenting the data according to demographic factors like age, gender, socio-economic status, and location, using statistical analysis to uncover trends in completion rates among educated individuals. With the help of the Gen AI intern(s), integrate AI technologies to enhance this analysis. Your final report should not only outline these findings but also provide strategic recommendations for improving school completion rates, emphasizing how AI can inform more effective educational policies and interventions.
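As a starting point for the segmentation step described above, the sketch below computes completion rates per demographic segment. The records, the choice of gender and location as segment keys, and the function name are all hypothetical and only illustrate the idea; the real dataset will have its own columns and will likely be handled with a dataframe library.

```python
from collections import defaultdict

# Hypothetical records: (gender, location, completed_school).
# These values are made up for illustration, not taken from the dataset.
records = [
    ("female", "rural", True),
    ("female", "rural", False),
    ("female", "urban", True),
    ("male", "rural", False),
    ("male", "urban", True),
    ("male", "urban", True),
]


def completion_rate_by_segment(rows):
    """Return {(gender, location): share of rows where school was completed}."""
    totals = defaultdict(int)
    completed = defaultdict(int)
    for gender, location, done in rows:
        key = (gender, location)
        totals[key] += 1
        completed[key] += int(done)
    return {key: completed[key] / totals[key] for key in totals}


rates = completion_rate_by_segment(records)
for segment, rate in sorted(rates.items()):
    print(segment, f"{rate:.2f}")
```

Segments with very few records produce unstable rates, so it may be worth reporting the group size next to each rate before drawing conclusions.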


Topic: Malaria in Africa (Dataset)

Project Instructions

For your research on Malaria in Africa, analyze the dataset covering all African countries from 2007 to 2017. Each country is identified by a unique ISO-3 country code, and the dataset includes geographical data such as latitude and longitude. It details reported malaria cases and the preventive measures implemented annually across these nations. Employ the help of the Gen AI intern(s) to enhance the analysis of this data. Utilize machine learning algorithms to identify patterns and trends in malaria incidence and the effectiveness of different prevention strategies over the years. This advanced analysis can help predict future outbreaks and the impact of current preventive measures, offering valuable insights for policymakers and health organizations to strategize more effectively against malaria in Africa.


Topic: Forecasting the Probability of a Conflict Arising in an African Nation (Dataset)

Project Instructions

The Armed Conflict Location and Event Data Project (ACLED) is crafted for detailed conflict analysis and crisis mapping. This dataset records the dates and locations of reported political violence and protest events across numerous developing countries in Africa. Political violence and protest events encompass incidents within civil wars, periods of instability, public protests, and regime breakdowns. ACLED spans all African countries from 1997 to the present day.

With the help of the Gen AI intern(s), use AI technologies to process and analyze this data more effectively. Utilize AI to model potential future conflicts by identifying patterns and trends from past data.

Model DB

Topic: Predicting Future Electrification Needs (Dataset)

Project Instructions

Predict future electrification needs using a dataset containing new information on household electrification across 124 non-OECD countries. With data dating back to 1960 for certain nations, including rural, urban, and aggregate electrification rates, forecast future electrification requirements. By leveraging historical trends and demographic factors, develop predictive models that identify areas likely to experience increased demand for electricity in the coming years, facilitating proactive planning and resource allocation to meet future needs.

Generative AI techniques can enhance the project by generating synthetic data to augment the existing dataset. By simulating various electrification scenarios and population growth patterns, generative models can provide insights into potential future electrification trends and help validate the accuracy of predictive models. Additionally, generative AI can assist in generating plausible scenarios for electrification expansion, aiding policymakers in making informed decisions to ensure adequate infrastructure development and access to electricity for all.

Model monitor

Topic: Impact of Maternal Education on Child Malaria Rates (Dataset)

Project Instructions

For your research on the impact of maternal education on child malaria rates, analyze the correlation between a woman’s education level and her children’s health, particularly in terms of malaria prevalence. The dataset should include information on the educational attainment of mothers and corresponding health outcomes for their children, including access to medical care, which is crucial for preventing and treating malaria.

Generative AI intern(s) in this track should deepen the analysis to identify complex patterns and relationships within the data that may not be immediately apparent. AI can help model the direct and indirect effects of maternal education on malaria rates among children, potentially revealing insights into how educational policies could be leveraged to improve health outcomes.

Open AI

Topic: Predictive Modeling of Electricity Access in Africa (Dataset)

Project Instructions

Utilize the World Bank dataset covering electricity access rates from 1990 to 2021 across African countries. Complete an exploratory analysis to understand trends and correlations. Then develop predictive models using statistical and machine learning techniques to forecast future electricity access trends. With the help of the interns in the Gen AI track, apply advanced AI technologies to enhance model accuracy and uncover complex patterns influencing electricity access. Where possible, validate the models on a held-out subset of the data, and use the findings to formulate policy recommendations aimed at improving electricity access in underserved regions. Provide a comprehensive report summarizing the methodology, findings, and recommendations for stakeholders.
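One simple way to validate a forecast on a subset of the data is to hold out the most recent years, fit on the earlier ones, and measure the error on the held-out years. The sketch below does this with a plain least-squares linear trend. The access values are made up for a single hypothetical country, and the linear trend is only a baseline assumption, not the method the project prescribes.

```python
def fit_linear_trend(years, values):
    """Ordinary least-squares fit of values = slope * year + intercept."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up electricity access rates (% of population) for illustration only.
years = list(range(2010, 2022))
access = [31.0, 32.4, 34.1, 35.2, 37.0, 38.3, 39.9, 41.6, 42.8, 44.5, 45.9, 47.2]

# Hold out the last three years; for time series, split by time, not at random.
train_years, test_years = years[:-3], years[-3:]
train_access, test_access = access[:-3], access[-3:]

slope, intercept = fit_linear_trend(train_years, train_access)

# Access is a percentage, so clip predictions to the 0-100 range.
predictions = [min(100.0, max(0.0, slope * y + intercept)) for y in test_years]
holdout_mae = mean_absolute_error(test_access, predictions)
print(f"slope per year: {slope:.2f}, hold-out MAE: {holdout_mae:.2f}")
```

Any more advanced model can be compared against this baseline using the same split, which makes it clear whether the added complexity actually improves the hold-out error.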


Topic: Forecasting Future Education Spending (Dataset)

Project Instructions

Analyze historical data on education financing to predict future spending trends using datasets from the UNESCO Institute for Statistics’ SDG 4 Data Hub. Using statistical and machine learning techniques, develop models to forecast education budgets, incorporating AI to enhance prediction accuracy. Your analysis should identify factors influencing spending trends and assist policymakers and educational planners in strategic funding to meet educational goals.

Power BI

Topic: Impact of natural disasters on remittance flow (Dataset)

Project Instructions

Remittances are a channel through which events such as natural disasters can affect a nation’s economy, positively or negatively. Explore the dataset, which covers several nations prone to different natural disasters. Examine how natural disasters like hurricanes and earthquakes affect remittance inflows to disaster-stricken countries. Using data from the World Bank on remittances as a percentage of GDP, analyze the variations in remittance volumes before, during, and after such events. The project will correlate remittance data with information on natural disasters, apply statistical techniques to highlight impacts, and develop predictive models. Recommend policies to support remittance-dependent communities in times of crisis, enhancing economic resilience.
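The before, during, and after comparison can be sketched as a simple event-window calculation. The series, the event year, and the two-year window below are hypothetical and chosen only to show the mechanics; the real analysis would use the World Bank figures and actual disaster dates.

```python
def window_means(series, event_year, window=2):
    """Compare remittances (% of GDP) before, during, and after an event year.

    series maps year -> value; years missing from the series are skipped.
    """
    def mean(values):
        values = list(values)
        return sum(values) / len(values) if values else None

    before = mean(
        series[y] for y in range(event_year - window, event_year) if y in series
    )
    after = mean(
        series[y]
        for y in range(event_year + 1, event_year + window + 1)
        if y in series
    )
    return {"before": before, "during": series.get(event_year), "after": after}


# Made-up remittance shares for one hypothetical country and event year.
remittances_pct_gdp = {2008: 4.0, 2009: 4.2, 2010: 5.5, 2011: 5.0, 2012: 4.6}
summary = window_means(remittances_pct_gdp, event_year=2010)
print(summary)
```

A difference between windows is only descriptive. Attributing it to the disaster would need a comparison against countries or years without an event.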


Topic: Predictive Modeling of Policy Impact on Refugee Numbers (Dataset)

Project Instructions

Investigate the effects of national and international policies on global refugee statistics using the provided dataset from the UN Department of Economic and Social Affairs. Analyze historical data to identify trends, apply machine learning techniques to predict how policy changes might influence refugee flows, and simulate the outcomes of potential policy modifications. Your report could equip policymakers and humanitarian organizations with data-driven insights to craft more effective and responsive refugee policies.


Topic: Animal Species Detection Using Deep Learning (Dataset)

Project Instructions

Employ deep learning methods to identify different animal species from a dataset of African wildlife images available on Kaggle. Consider preparing the images for processing, developing and training convolutional neural networks (CNNs), and integrating advanced AI techniques like transfer learning to improve detection accuracy. Your research will not only enhance biodiversity monitoring but could also significantly aid conservation efforts. The project’s ultimate goal is to develop a robust tool for accurate and rapid animal species classification, which could be adapted for broader ecological applications.


Topic: Forecasting Scholarship Aid Based on Regional Needs (Dataset)

Project Instructions

Provide a dynamic tool for educational stakeholders to allocate scholarships more effectively, ensuring that resources are directed where they are most needed and can have the highest impact. Work on a targeted approach showing how scholarship aid can significantly contribute to reducing educational disparities and promoting equitable education opportunities globally. Utilize advanced AI techniques to predict the need for scholarship aid across various regions, leveraging data from the UNESCO Institute for Statistics. The final goal is to enable policymakers and educational institutions to allocate resources effectively, thus reducing educational disparities.

Seldon core

Topic: Animal Count in Images (Dataset)

Project Instructions

Develop a model for accurately counting animals in images captured within South African nature reserves. The dataset, originally collected for real-time animal detection, contains diverse images of wildlife in natural habitats. By leveraging machine learning and computer vision techniques, create a model capable of efficiently and accurately detecting and counting animals. The goal is to provide a valuable tool for wildlife conservationists and researchers to monitor animal populations and assess ecological health in nature reserves.


Topic: Pneumonia Detection Model (Dataset)

Project Instructions

Where possible, employ AI (Gen AI interns, time to shine!) to automate pneumonia diagnosis from chest X-ray images using a dataset from Kaggle. The study should focus on using convolutional neural networks (CNNs) to analyze the images and enhance diagnostic accuracy beyond traditional methods. Preprocess the images for uniformity, then develop and train the model as usual. The same methodology can be applied to identify signs of other respiratory conditions such as tuberculosis or lung cancer. Moreover, the principles of this model can be transferred to different types of medical imaging, like CT scans or MRIs, to diagnose a broader range of conditions across various body parts.


Topic: Using Diet Analysis to Predict and Prevent Child Malnutrition (Dataset)

Project Instructions

Analyze children’s dietary patterns using information from the dataset. The aim is to identify nutritional deficiencies that are predictive of malnutrition by analyzing dietary intake data collected from various global regions. This involves employing machine learning algorithms to recognize dietary trends and correlations with malnutrition cases, thereby facilitating the development of predictive models.

The models are expected to help in forecasting malnutrition risks and suggesting preventative measures. Explore which AI technologies can enhance the processing and analysis of the datasets, enabling the identification of at-risk populations more efficiently and accurately. The outcome of this research could enable healthcare providers and policymakers to implement targeted interventions, such as personalized nutrition plans or community-based educational programs, aimed at reducing malnutrition among children and improving public health outcomes.


Topic: Infrastructure Deficit (Dataset, Dataset II)

Project Instructions

Analyze the infrastructure deficit across African countries utilizing the Africa Infrastructure Development Index (AIDI) dataset, produced by the African Development Bank. The AIDI serves several critical purposes, including monitoring and evaluating the status and progress of infrastructure development, assisting in resource allocation, and contributing to policy dialogue within the Bank and between the Bank and other development organizations.

The AIDI dataset comprises various indices, including the AIDI Index itself, as well as composite indices for transport, electricity, information and communication technology (ICT), and water supply and sanitation (WSS). By leveraging this dataset, assess the infrastructure gaps and deficiencies within African nations across these key sectors.

Through comprehensive analysis and visualization techniques, provide valuable insights into the extent of the infrastructure deficit, identify areas requiring urgent attention and investment, and support evidence-based decision-making for policymakers, development organizations, and stakeholders involved in infrastructure development initiatives across the continent.




Our mission is to develop an army of creative problem solvers using an innovative approach to internships.