[HDSC Spring ’23] Capstone Project — Team Opencv
Fig 1 Animal Species Detection
Roadkill, the result of animals being struck and killed by vehicles, is a significant global issue, causing high wildlife mortality rates. In the United States alone, over 1 million vertebrate animals are killed daily in vehicle collisions. Globally, this number exceeds 5.5 million daily, totaling over 2 billion annually. A recent study has identified vulnerable animal populations, such as leopards (83% increased risk of roadkill-related extinction), Brazilian wolves (34% increased risk), Brazilian cats (0 to 75% increased risk), and Southern African hyenas (0 to 75% increased risk).
Wildlife Vehicle Collisions (WVC) not only threaten animal safety but can also harm vehicle occupants and cause damage. To address this issue, actions like highway fencing, wildlife crossings, escape routes, the use of high beam headlights at night, and animal sensors alongside highways should be implemented by road authorities to ensure the safety of both wildlife and humans.
This project aims to create an efficient computer vision model through Deep Learning algorithms. The model will be deployed as part of detection systems to identify wildlife in urban settings and on highways. It will utilize real-time visuals to alert humans about possible collisions with wildlife, enhancing safety measures.
- Dataset 1 consists of images from the previous cohort of interns, sourced from Kaggle, an open-source data repository. This dataset includes images of four different animal species. Additionally, text files containing annotations in YOLO format are provided for each image file within Dataset 1.
- Dataset 2, provided by Hamoye and hosted on Kaggle, encompasses images of 11 distinct animal species. Notably, this dataset lacks accompanying text files containing annotations, and a significant portion of the images have low resolutions.
- Dataset 3, acquired from Kaggle, comprises images featuring 80 unique animal species. This dataset includes text files with annotations in Pascal format and offers high-resolution images for analysis.
We improved overall dataset quality by keeping only the high-resolution images from Dataset 2, since many of its images had a low resolution of roughly 250 pixels.
For annotation, we used the Make Sense tool to annotate the selected high-resolution images from Dataset 2, which was necessary for training the model effectively. We also converted the Dataset 3 annotations from Pascal VOC format to YOLO format, but only for the classes present in both datasets.
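The Pascal VOC to YOLO conversion can be sketched in a few lines. This is a minimal illustration, not the team's actual script: the class list here is hypothetical, and a standard VOC XML layout is assumed.

```python
import xml.etree.ElementTree as ET

# Hypothetical class list; the real project used the classes shared across datasets.
CLASSES = ["tiger", "lion", "elephant"]

def voc_to_yolo(xml_text: str) -> list[str]:
    """Convert one Pascal VOC annotation into YOLO-format label lines:
    '<class_id> <x_center> <y_center> <width> <height>', normalized to [0, 1]."""
    root = ET.fromstring(xml_text)
    img_w = float(root.findtext("size/width"))
    img_h = float(root.findtext("size/height"))
    lines = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        if name not in CLASSES:  # keep only the classes shared across datasets
            continue
        box = obj.find("bndbox")
        xmin, ymin = float(box.findtext("xmin")), float(box.findtext("ymin"))
        xmax, ymax = float(box.findtext("xmax")), float(box.findtext("ymax"))
        # VOC stores corner coordinates; YOLO wants a normalized center and size.
        xc = (xmin + xmax) / 2 / img_w
        yc = (ymin + ymax) / 2 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append(f"{CLASSES.index(name)} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines
```

The resulting lines are written to a `.txt` file with the same base name as the image, which is the layout YOLO expects.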
To ensure data quality, we removed unsuitable data. Specifically, we excluded the leopard class, as most of its images also contained animals from other classes. Additionally, we incorporated the tiger class from Dataset 3.
Data Exploratory Analysis
This dataset includes images of 10 different animal species, namely zebras, rhinos, pandas, lions, elephants, buffalos, foxes, tigers, cheetahs, and jaguars. It comprises a total of 2,609 images, each accompanied by a corresponding text file containing annotations for objects within the image.
The dataset is well-balanced, with each animal species adequately represented. Most images contain only one or two objects, though some contain more. Aspect ratio, an important factor for object detection algorithms, shows moderate variability, ranging from 0.56 to 3.34 across the images.
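Summary statistics like these can be computed directly from the image dimensions. The `(width, height)` pairs below are illustrative placeholders, not the actual dataset values.

```python
import statistics

# Hypothetical (width, height) pairs standing in for the real image set.
dims = [(640, 480), (1024, 768), (500, 890), (1920, 1080), (300, 300)]

# Aspect ratio = width / height for each image.
ratios = [w / h for w, h in dims]
summary = {
    "min": min(ratios),
    "max": max(ratios),
    "mean": statistics.mean(ratios),
    "stdev": statistics.stdev(ratios),  # spread of aspect ratios
}
```

In practice the dimensions would be read from the files themselves (e.g. with Pillow's `Image.open(...).size`).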
Fig 2 EDA — number of objects per image, image-dimension statistics, and number of images per class
We employed the You Only Look Once (YOLO) deep learning algorithm, specifically YOLOv8, to construct an animal detection model. YOLOv8 offers enhanced speed and accuracy while providing a unified framework for Object Detection, Instance Segmentation, and Image Classification tasks.
The YOLO model efficiently predicts bounding boxes and class probabilities using a single network during a single evaluation, enabling real-time predictions.
For dataset management, we divided the dataset into training, validation, and test sets with a split ratio of 0.7, 0.15, and 0.15, respectively. YOLOv8 was implemented through a command-line interface using the Ultralytics package. Pre-trained weights were utilized, specifying the task as “detect,” the mode as “train,” and the model as “yolov8n.” The model was trained for 50 epochs to achieve the desired performance.
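The 0.7/0.15/0.15 split described above can be sketched as follows. This is a simplified illustration: the seed and file names are arbitrary, and the real workflow would also copy each image's label file alongside it.

```python
import random

def split_dataset(image_names: list[str], seed: int = 42):
    """Shuffle and split image names into 70% train, 15% val, 15% test."""
    names = sorted(image_names)           # deterministic starting order
    random.Random(seed).shuffle(names)
    n = len(names)
    n_train = int(0.70 * n)
    n_val = int(0.15 * n)
    return (names[:n_train],
            names[n_train:n_train + n_val],
            names[n_train + n_val:])

# The Ultralytics CLI training run described above takes roughly this form:
#   yolo task=detect mode=train model=yolov8n.pt data=data.yaml epochs=50
# where data.yaml points at the train/val/test folders and lists the 10 classes.
```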
Fig 3 YOLO
The performance of the object detection algorithm is evaluated with metrics such as Mean Average Precision (mAP), F1 score, and the confusion matrix.
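All of these metrics rest on Intersection over Union (IoU), which measures how well a predicted box overlaps a ground-truth box; a prediction typically counts as a true positive when IoU is at least 0.5. A minimal sketch:

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle (zero-sized if the boxes do not intersect).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter)
```

mAP@0.5 is then the average, over classes, of the area under each class's precision–recall curve computed with this 0.5 IoU cutoff.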
Model Performance on the Training Data Set
- The PR curve below shows a mean Average Precision (mAP) of 0.95 across all classes at an IoU threshold of 0.5, suggesting that the model performs well on the object detection task.
Fig 4 PR curve
- Each training cycle consists of two phases: a training phase and a validation phase.
The YOLOv8 object detection model achieved good performance with high precision, recall, F1-score, and mAP scores across different thresholds.
Fig 5 Training
- In the F1-Confidence Curve below, the training F1 score (the harmonic mean of precision and recall) reaches 0.93 at a confidence threshold of 0.556, with a recall of 0.915 and a precision of 0.944, indicating balanced performance in accurately identifying and classifying objects in the training dataset.
Fig 6 Confidence Curve
- Training set
Fig 7 Confusion matrix — Training
- Test set
Fig 8 Confusion matrix — Test
The model was deployed using Streamlit, an open-source Python app framework, and hosted on Hugging Face.
Fig 9 Application Interface
Fig 10 Interaction with the application
Conclusion and Recommendation
Roads are vital infrastructure for people and supplies, but they can harm wildlife where they intersect with nature. Our model analyzes real-time highway images, triggering sensors and alarms in vehicles to warn drivers of detected animals. To enhance the model's accuracy, we recommend expanding the training dataset to cover a broader range of species. Additionally, this model has applications in education and agriculture.