Title: AI for species identification that explains like an expert.
Abstract: This project aims to enable large-scale biodiversity monitoring by citizen scientists by developing Explainable Machine Learning methods that reason like a taxonomist: they explicitly detect the relevant traits in an image of a specimen and reach a species identification based on them. This will help scientists obtain valuable data on rare or undescribed species, make use of low-quality real-world images, and make it easier for everyone to become an amateur naturalist, thus raising awareness of biodiversity and the rapid pace at which we are losing it. We will jointly use Computer Vision (CV) and Natural Language Processing (NLP) methods to extract the relevant visual features and to model taxonomic descriptions.
First, we will leverage the vast amount of structured textual species descriptions available online, for example on Wikipedia, to train an initial NLP model, fine-tuned from a pre-trained transformer-based model, that discriminates species-description text from other text. This model will then be used to grow the corpus of descriptions by parsing additional websites that contain species descriptions.
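To make this first step concrete, here is a minimal pure-Python sketch of the description-vs-other-text decision. It uses a keyword-cue heuristic as a stand-in for the fine-tuned transformer classifier; the cue list, function names, and example texts are all illustrative, not part of the project's actual code.

```python
# Stand-in for the description classifier: score a text by how many
# morphological/measurement cues it contains. The real system would
# fine-tune a pre-trained transformer instead of using keywords.

DESCRIPTION_CUES = {
    "wingspan", "plumage", "petal", "dorsal", "ventral",
    "mm", "cm", "coloration", "juvenile", "adult",
}

def looks_like_description(text: str, threshold: int = 2) -> bool:
    """Return True if the text contains enough cue words to
    plausibly be a species description."""
    tokens = {t.strip(".,;()").lower() for t in text.split()}
    return len(tokens & DESCRIPTION_CUES) >= threshold

# Filtering candidate paragraphs scraped from a website:
candidates = [
    "The adult has a wingspan of 45 mm and brown dorsal coloration.",
    "The species was first catalogued in 1837 by a French expedition.",
]
kept = [t for t in candidates if looks_like_description(t)]
```

A transformer classifier would replace `looks_like_description` but keep the same role: a filter that decides which scraped paragraphs enter the description corpus.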
In a second step, the textual descriptions will be analyzed with part-of-speech tagging to identify the life stages (e.g. egg, hatchling, immature, female adult, male adult) and parts (e.g. leaf, stem, flower, bark, fruit) being described, extract their corresponding attributes, and detect comparative descriptions in which another species is mentioned for comparison. The result is a knowledge graph relating species, life stages, parts, and attributes.
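The kind of structured output this second step targets can be sketched as (life stage, part, attribute) triples. The sketch below uses simple pattern matching over controlled vocabularies rather than real part-of-speech tagging; the vocabularies and regular expression are assumptions for illustration only.

```python
import re

# Extract (life stage, part, attribute) triples from one description
# sentence. The project proposes part-of-speech analysis; this sketch
# approximates it with an "<adjective> <part>" pattern.

LIFE_STAGES = ["egg", "hatchling", "immature", "female adult", "male adult"]
PARTS = ["leaf", "stem", "flower", "bark", "fruit", "wing", "beak"]

def extract_triples(sentence: str):
    s = sentence.lower()
    # First life stage mentioned in the sentence, if any.
    stage = next((ls for ls in LIFE_STAGES if ls in s), None)
    triples = []
    for part in PARTS:
        # Crude attribute pattern: the word directly before the part,
        # e.g. "yellow beak" -> attribute "yellow".
        for m in re.finditer(r"(\w+)\s+" + part, s):
            triples.append((stage, part, m.group(1)))
    return triples
```

A real implementation would resolve pronouns, handle multi-word attributes, and attach the comparative clauses ("larger than X") that the text analysis is meant to detect.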
The last step requires developing a method for linking the resulting knowledge graph to a Computer Vision model. To do this, we will leverage the millions of images annotated with species names, covering tens of thousands of species, that are freely available from the Global Biodiversity Information Facility (GBIF).
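One way to picture this linking step: a vision model predicts which traits are visible in an image, and species are then ranked by how well the detected traits match each species' knowledge-graph entry. The sketch below does this with a Jaccard overlap; the trait names, species entries, and scoring choice are invented for illustration and are not the project's proposed method.

```python
# Hypothetical species entries derived from textual descriptions.
SPECIES_TRAITS = {
    "Parus major": {"yellow breast", "black head", "white cheek"},
    "Cyanistes caeruleus": {"yellow breast", "blue crown", "white cheek"},
}

def rank_species(detected_traits: set) -> list:
    """Rank species by Jaccard overlap between the traits a vision
    model detected in an image and the traits each species'
    description lists. Higher overlap ranks first."""
    def jaccard(a, b):
        return len(a & b) / len(a | b) if a | b else 0.0
    scores = {sp: jaccard(detected_traits, tr)
              for sp, tr in SPECIES_TRAITS.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Traits the (hypothetical) vision model detected in one image:
ranking = rank_species({"yellow breast", "blue crown"})
```

Because the ranking is driven by named traits rather than an opaque score, the identification comes with an explanation: the traits in the intersection are exactly the evidence for the top-ranked species.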
The presentation will be in English and streamed on Webex.