SYNTHESIZING REALISTIC WILDLIFE IMAGES FOR CONSERVATION AND EDUCATION

HamoyeHQ
Aug 9, 2024

Ridwan Ibidunni, Auwal Ibrahim, Obi Chinyere Mary, Samson Oguntuwase, Dominion Akinrotimi, Rishabh Shrivastava, Niket Kumar, Eguagie-Suyi Precious, Abdulkabir Badru, Adeleke Adekola Emmanuel, Benjamin Muoka, Akinwunmi Toluwani Adebayo, Muhammad Asif, Ade Adesipo, Lolu Zaccheus, Chinazom Enukoha

INTRODUCTION

  • Overview

The African Manatee and African Golden Cat are just two examples of the many endangered species in Africa that remain unfamiliar to most people. This lack of awareness highlights a critical issue in wildlife conservation: the scarcity of data, particularly images, needed to monitor and protect these vulnerable species.

The International Union for Conservation of Nature (IUCN) lists over 79,000 species on its Red List of Threatened Species [4]. However, the Living Planet Report, which provides the most comprehensive record of global species populations, covers only about 10,300 populations representing approximately 3,000 species. This discrepancy, roughly 4% coverage (about 3,000 of 79,000 species on the IUCN Red List), underscores the magnitude of the data gap in wildlife conservation efforts [1].

The global decline in wildlife populations, as documented by the IUCN Red List, emphasizes the urgent need for effective conservation and educational initiatives. Despite advances in conservation techniques, the lack of realistic wildlife imagery remains a significant obstacle. This shortage of authentic visual data hinders the development of conservation policies, impedes scientific research, and limits public engagement efforts.

To address this gap, strengthen conservation strategies, and raise public awareness, this paper proposes synthesizing realistic wildlife images from text descriptions. Drawing inspiration from methodologies such as those of Berger-Wolf et al. (2017), we explore how artificial intelligence and computer vision can be used to generate high-quality wildlife imagery that supports conservation efforts and fosters greater public understanding of endangered species.

LITERATURE REVIEW

This section examines key research on using AI and computer vision to generate wildlife images for conservation and education. We review studies that demonstrate how these technologies can address the scarcity of authentic wildlife visual data. The review informs our approach, summarizes the current state of the field, and highlights the gaps our project aims to fill.

  • Wildbook: Crowdsourcing, computer vision, and data science for conservation: The paper presents Wildbook, a system that combines crowdsourcing and computer vision to support wildlife conservation. It collects and analyzes wildlife images contributed by citizen scientists and the public, enabling the identification of individual animals and informing evidence-based conservation policies. The system both improves the monitoring of wildlife populations and engages the public in conservation initiatives, fostering a deeper connection to nature. [1]
  • SketchyGAN: Towards Diverse and Realistic Sketch to Image Synthesis: The paper introduces an approach for generating realistic images from human-drawn sketches using Generative Adversarial Networks (GANs). Its primary aim is to overcome the challenges of sketch-based image synthesis, producing plausible images across 50 diverse categories without requiring extensive artistic skill from the user. Key contributions include a data augmentation technique to enlarge the training set, a GAN model with improved objective functions and network architecture, and a demonstration of better image quality and diversity than existing methods. This research is relevant to conservation and education because it allows wildlife and natural environments to be visualized from sketches. [2]
  • Generative Adversarial Text to Image Synthesis: The paper develops a model that generates realistic images from detailed textual descriptions using Generative Adversarial Networks (GANs). It addresses the challenge of translating human-written descriptions into visually coherent images, linking natural language and visual content. The authors propose an architecture that pairs a character-level text encoder with a GAN conditioned on the resulting text embeddings, and they demonstrate it on fine-grained datasets of birds and flowers. The model's ability to produce images that correspond to specific textual inputs is particularly relevant to wildlife conservation and education, since it allows species to be visualized from descriptive data. [3]

PROPOSED APPROACH

Our research leverages recent advances in generative AI to address the scarcity of wildlife imagery in conservation efforts. We propose fine-tuning a pre-trained generative model, a form of transfer learning. This approach capitalizes on the strengths of advanced generative models while avoiding the extensive computational resources and time required to train from scratch.

By fine-tuning the runwayml/stable-diffusion-v1-5 [6] model using the DreamBooth [5] technique on a specific dataset of wildlife images, we aim to create a powerful tool for synthesizing realistic wildlife imagery. This method allows us to adapt a sophisticated pre-existing model to our specific needs, potentially revolutionizing the availability of diverse and authentic visual data for wildlife conservation, research, and education.
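As a rough illustration of what such a fine-tuning run might look like, the sketch below assembles a command line for the `train_dreambooth.py` example script that ships with the Hugging Face diffusers repository [5]. The article does not report its training settings, so the folder names, the `sks` identifier token in the prompt, and every hyperparameter value here are assumptions (the values mirror the documentation's example, not our actual run). Running one command per species is also an assumption.

```python
import shlex

BASE_MODEL = "runwayml/stable-diffusion-v1-5"  # base model named in the article


def dreambooth_command(instance_dir, instance_prompt, output_dir, max_train_steps=400):
    """Return the argv list for one DreamBooth fine-tuning run.

    All hyperparameter values are illustrative assumptions, not the
    settings used in the study.
    """
    return [
        "accelerate", "launch", "train_dreambooth.py",
        f"--pretrained_model_name_or_path={BASE_MODEL}",
        f"--instance_data_dir={instance_dir}",
        f"--instance_prompt={instance_prompt}",
        f"--output_dir={output_dir}",
        "--resolution=512",  # matches the 512 x 512 preprocessing
        "--train_batch_size=1",
        "--gradient_accumulation_steps=1",
        "--learning_rate=5e-6",
        "--lr_scheduler=constant",
        "--lr_warmup_steps=0",
        f"--max_train_steps={max_train_steps}",
    ]


if __name__ == "__main__":
    # Hypothetical folder layout and prompt for one species.
    cmd = dreambooth_command(
        instance_dir="data/african_golden_cat",
        instance_prompt="a photo of sks african golden cat",
        output_dir="models/african_golden_cat",
    )
    print(shlex.join(cmd))  # copy-pasteable shell command
```

Building the command as a list keeps each flag inspectable and lets the same helper be reused for each species folder.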

DATA COLLECTION AND PREPROCESSING

The species pictured in this study are listed on the International Union for Conservation of Nature (IUCN) Red List as Vulnerable or Critically Endangered. We focused on three specific animal species:

  1. African Manatee (Trichechus senegalensis): This large aquatic mammal, found in the coastal and freshwater systems of West Africa, weighs up to 500 kg and measures about 3 meters in length. Classified as Vulnerable by the IUCN, it faces significant threats from poaching, habitat degradation, and accidental capture in fishing nets. [7]
  2. African Golden Cat (Caracal aurata): A medium-sized wild cat endemic to the tropical rainforests of West and Central Africa. Its fur color varies from reddish-brown to greyish-brown. Classified as Vulnerable due to habitat loss and hunting, it remains one of the least studied felines. [8]
  3. White-backed Vulture (Gyps africanus): A large scavenger bird native to sub-Saharan Africa, with a wingspan of up to 2.25 meters. It plays a crucial role in the ecosystem by consuming carrion. The species is classified as Critically Endangered due to threats such as poisoning, habitat loss, and hunting. [9]

Images of these species were collected and organized into respective folders. For preprocessing, all images were resized to a uniform size of 512 x 512 pixels using the bulk image resizing website “Birme” [10]. This standardization ensures consistency in the input data for our model.
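We performed this step with the Birme web tool, but the same standardization could be scripted. The following is a minimal sketch using the Pillow library, assuming a center crop to a square before resizing (the crop mode we used in Birme is not recorded here) and hypothetical folder names. The helper `center_square_box` is pure Python; `resize_folder` imports Pillow only when called.

```python
from pathlib import Path

TARGET = 512  # output side length in pixels, as stated above


def center_square_box(width, height):
    """Return (left, upper, right, lower) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


def resize_folder(src_dir, dst_dir, size=TARGET):
    """Center-crop and resize every JPEG/PNG in src_dir; save JPEGs to dst_dir.

    Returns the number of images written. Center cropping is an assumption.
    """
    from PIL import Image  # Pillow, imported lazily

    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = 0
    for path in sorted(src.iterdir()):
        if path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue  # skip non-image files
        with Image.open(path) as img:
            square = img.convert("RGB").crop(center_square_box(*img.size))
            resized = square.resize((size, size), Image.LANCZOS)
            resized.save(dst / f"{path.stem}_{size}x{size}.jpg", format="JPEG")
        written += 1
    return written
```

For example, `resize_folder("raw/african_golden_cat", "data/african_golden_cat")` would process one species folder at a time, keeping the per-species folder layout described above.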

For our model selection, we chose to fine-tune the runwayml/stable-diffusion-v1-5 [6] model using the DreamBooth [5] technique. This pre-trained generative model was selected for its high-quality image generation capabilities. We obtained the pre-trained weights from the Hugging Face Model Hub.

RESULTS & DISCUSSION

Comparing Generated Images with Real Wildlife Photos

Figure 1: Real Wildlife Photographs of the African Golden Cat (Caracal aurata)

Figure 2: Generated Photographs of the African Golden Cat (Caracal aurata)

Figure 3: Real Wildlife Photographs of the African Manatee (Trichechus senegalensis)

Figure 4: Generated Photographs of the African Manatee (Trichechus senegalensis)

Figure 5: Real Wildlife Photographs of the White-backed Vulture (Gyps africanus)

Figure 6: Generated Photographs of the White-backed Vulture (Gyps africanus)

Exploring Model Generalization

Generated Images of Untrained Species

  • African Elephant

[Image pair: African Elephant (generated) and African Elephant (real)]

  • African Wild Dog

[Image pair: African Wild Dog (generated) and African Wild Dog (real)]

  • Addax

[Image pair: Addax (generated) and Addax (real)]

  • Lappet-faced Vulture

[Image pair: Lappet-faced Vulture (generated) and Lappet-faced Vulture (real)]

  • African Penguin

[Image pair: African Penguin (generated) and African Penguin (real)]

Discussion

The results of our study demonstrate the potential of fine-tuned generative AI models to produce realistic wildlife imagery, particularly for endangered species. On visual comparison, the generated images of the African Golden Cat, African Manatee, and White-backed Vulture closely resemble their real counterparts (Figures 1 to 6), capturing key features and characteristics of each species.

For the African Golden Cat, the model successfully reproduced the distinctive coat patterns and facial features, although some generated images show slight variations in coloration. The African Manatee images accurately depict the animal’s aquatic environment and body shape, but fine details like skin texture may require further refinement. The White-backed Vulture generations are particularly impressive, capturing the bird’s distinctive silhouette and head features.

The model’s ability to generalize to untrained species is noteworthy. Generated images of the African Elephant, African Wild Dog, Addax, Lappet-faced Vulture, and African Penguin demonstrate the model’s potential to extend beyond its initial training set. This generalization capability could prove invaluable for creating visual representations of rare or elusive species where photographic documentation is scarce.
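To make the comparison between trained and untrained species concrete, here is a hedged sketch of how images for both groups could be requested from a fine-tuned checkpoint using the diffusers `StableDiffusionPipeline`. The prompt template, the checkpoint path, the sampling settings, and the use of a CUDA GPU are all assumptions for illustration; the article does not record the exact prompts used.

```python
TRAINED = ["African Golden Cat", "African Manatee", "White-backed Vulture"]
UNTRAINED = [
    "African Elephant", "African Wild Dog", "Addax",
    "Lappet-faced Vulture", "African Penguin",
]


def build_prompt(species, habitat="its natural habitat"):
    """Build a text prompt for one species (template is a hypothetical example)."""
    article = "an" if species[0].lower() in "aeiou" else "a"
    return f"a realistic wildlife photograph of {article} {species} in {habitat}"


def generate(model_dir, species_names, seed=0):
    """Generate one image per species from a fine-tuned checkpoint at model_dir.

    Assumes a CUDA GPU plus the torch and diffusers packages.
    """
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        model_dir, torch_dtype=torch.float16
    ).to("cuda")
    generator = torch.Generator("cuda").manual_seed(seed)  # reproducible sampling
    images = {}
    for name in species_names:
        result = pipe(
            build_prompt(name),
            num_inference_steps=50,  # assumed sampling settings
            guidance_scale=7.5,
            generator=generator,
        )
        images[name] = result.images[0]
    return images
```

Calling `generate("models/wildlife", TRAINED + UNTRAINED)` with a hypothetical checkpoint path would return a dictionary of PIL images keyed by species name, which can then be placed side by side with real photographs.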

However, some limitations are evident. Occasional inconsistencies in anatomical details or environmental contexts suggest room for improvement. These could be addressed through more extensive training data or refined model parameters.

The potential applications of this technology are vast. In conservation education, these generated images could provide vivid, diverse representations of endangered species, enhancing public engagement and awareness. For researchers, the ability to generate images based on textual descriptions could aid in visualizing historical or hypothetical scenarios, supporting conservation planning and species recovery efforts.

CONCLUSION

Our study demonstrates the viability of using fine-tuned generative AI models to create realistic wildlife imagery, particularly for endangered African species. The model’s ability to produce high-quality images of both trained and untrained species represents a significant step forward in addressing the scarcity of visual data in wildlife conservation.

The implications of this technology are far-reaching. By providing a means to generate diverse, realistic wildlife imagery, we can enhance conservation education, support research efforts, and potentially aid in the documentation and study of elusive or critically endangered species.

Future research should focus on expanding the model’s capabilities to cover a broader range of species, improving the accuracy and consistency of generated images, and exploring integration with other conservation technologies. Additionally, ethical considerations regarding the use of AI-generated imagery in scientific contexts should be thoroughly explored.

In conclusion, while there is room for refinement, the potential of this technology to revolutionize wildlife conservation efforts is clear. By bridging the gap in visual data availability, we can foster greater public engagement, support more informed conservation strategies, and ultimately contribute to the preservation of Earth’s biodiversity.

REFERENCES

  1. Berger-Wolf, T. Y., Rubenstein, D. I., Stewart, C. V., Holmberg, J. A., Parham, J., Menon, S., . . . Joppa, L. (2017). Wildbook: Crowdsourcing, computer vision, and data science for conservation. arXiv preprint arXiv:1710.08880.
  2. Chen, W., & Hays, J. (2018). SketchyGAN: Towards diverse and realistic sketch to image synthesis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 9416–9425).
  3. Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., & Lee, H. (2016). Generative adversarial text to image synthesis. In International Conference on Machine Learning (pp. 1060–1069). PMLR.
  4. International Union for Conservation of Nature. (2024). Search results for African endangered species. The IUCN Red List of Threatened Species. https://www.iucnredlist.org/search?query=africa&searchType=species
  5. Hugging Face. (2024). DreamBooth. Hugging Face Documentation. https://huggingface.co/docs/diffusers/en/training/dreambooth?installation=PyTorch
  6. Runway. (2022). Stable Diffusion v1.5. Hugging Face Model Hub. https://huggingface.co/runwayml/stable-diffusion-v1-5
  7. IUCN SSC Sirenian Specialist Group. (2019). Trichechus senegalensis. The IUCN Red List of Threatened Species. https://www.iucnredlist.org/species/22104/97168578
  8. Bahaa-el-din, L., Mills, D., Hunter, L., & Henschel, P. (2015). Caracal aurata. The IUCN Red List of Threatened Species. https://www.iucnredlist.org/species/18306/50663128
  9. BirdLife International. (2018). Gyps africanus. The IUCN Red List of Threatened Species. https://www.iucnredlist.org/species/22695189/126667006
  10. Birme. (2024). Bulk Image Resizing Made Easy. https://www.birme.net/?image_format=jpeg&rename=africangoldencat_512x512&rename_start=55
