Cite As: S Badirli, CJ Picard, G Mohler, Z Akata, M Dundar (2021). Data from Classifying the Unknown: Insect Identification by Deep Zero-shot Bayesian Learning.
Published As: S Badirli, CJ Picard, G Mohler, Z Akata, M Dundar (2021). Data from Classifying the Unknown: Insect Identification by Deep Zero-shot Bayesian Learning.
Found At: IUPUI University Library
Sponsorship: M.D. and S.B. were sponsored by the National Science Foundation (NSF) grant IIS-1252648 (CAREER).
G.M. was sponsored by NSF grant ATD-2124313.
Abstract:
Insects represent a large majority of biodiversity on Earth, yet relatively few species have been described. Describing a new species typically requires specific taxonomic expertise to identify morphological characters that distinguish it from other known species, and DNA-based methods have aided in providing additional evidence of separate species. Machine learning (ML) offers a powerful means of identifying new species because it can be sensitive to subtle physical differences in images that humans may not perceive. We develop a Bayesian deep learning method for zero-shot classification of species. The proposed approach forms a Bayesian hierarchy of species around corresponding genera and uses deep embeddings of images and DNA barcodes to identify insects to the lowest taxonomic level possible. As a proof of concept, we use a database of 32,848 insect images from 1,040 described species, split into training and test data such that the test data include 243 species not present in the training data. Our results demonstrate that, using DNA sequences and images together, known insects can be classified with 96.66% accuracy, while insects unknown to the database are assigned to the correct genus with 81.39% accuracy. The proposed deep zero-shot Bayesian model demonstrates a powerful new approach that can be used for the gargantuan task of identifying new insect species.
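The idea of identifying an insect "to the lowest taxonomic level possible" can be illustrated with a small toy sketch. This is not the authors' Bayesian model: it swaps in plain nearest-centroid matching with a distance cutoff, purely to show the fallback from species to genus. The function names (`fit`, `classify`), the `threshold` parameter, the 2-D embeddings, and the species and genus labels are all hypothetical.

```python
from collections import defaultdict
from math import dist


def mean(vectors):
    """Component-wise mean of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))


def fit(train):
    """Build species and genus centroids from (embedding, species, genus) rows."""
    by_species = defaultdict(list)
    genus_of = {}
    for emb, species, genus in train:
        by_species[species].append(emb)
        genus_of[species] = genus
    species_centroid = {s: mean(v) for s, v in by_species.items()}

    # A genus centroid is the mean of its species centroids, a crude
    # stand-in for a genus-level layer in a species/genus hierarchy.
    by_genus = defaultdict(list)
    for s, c in species_centroid.items():
        by_genus[genus_of[s]].append(c)
    genus_centroid = {g: mean(v) for g, v in by_genus.items()}
    return species_centroid, genus_centroid


def classify(emb, model, threshold):
    """Return ("species", name) if a known species is close enough,
    otherwise fall back to ("genus", name). `threshold` is an assumed
    hand-picked cutoff, not something taken from the source."""
    species_centroid, genus_centroid = model
    s = min(species_centroid, key=lambda k: dist(emb, species_centroid[k]))
    if dist(emb, species_centroid[s]) <= threshold:
        return ("species", s)
    g = min(genus_centroid, key=lambda k: dist(emb, genus_centroid[k]))
    return ("genus", g)


# Toy training data: 2-D embeddings with made-up labels.
train = [
    ((0.0, 0.0), "A a", "A"), ((0.2, 0.0), "A a", "A"),
    ((2.0, 0.0), "A b", "A"), ((2.2, 0.0), "A b", "A"),
    ((10.0, 10.0), "B c", "B"), ((10.2, 10.0), "B c", "B"),
]
model = fit(train)
known = classify((0.1, 0.1), model, threshold=0.5)
unseen = classify((1.1, 1.5), model, threshold=0.5)
```

In this toy setup, a sample close to a training species is labeled at the species level, while a sample that is far from every known species but near a genus centroid is labeled only at the genus level, mirroring the known/unknown distinction in the reported results.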