Uncertainty Estimation in Edge AI


Why should we use it? (04.12.2023)

Funding year 2023 / Scholarship Call #18 / Project ID: 6885 / Project: Increasing Trustworthiness of Edge AI by Adding Uncertainty Estimation to Object Detection

Current machine learning models for object detection, e.g., for detecting people in images, cannot express how uncertain they are, which makes their predictions less trustworthy. This work evaluates uncertainty estimation approaches that are efficient enough for Edge AI.

Today, object detection is applied in various domains where safety is critical, such as pedestrian detection [1, 3] and vehicle detection [7, 8]. However, existing models do not express uncertainty, and their class probability scores are often too optimistic [9]. This is especially the case for input images outside the distribution of the training dataset. An example of a regular object detection model (without uncertainty estimation) can be seen in the image below, where the model was trained on the Oxford Town Centre dataset [2] to detect pedestrians.

Object detection example: predictions on an Oxford Town Centre image [2].

In the image above, we can see that the model reliably detects the people on the street and assigns high classification scores (> 90%) to the Person class. Since this image is from the test split of the dataset the model was trained on, it is expected to perform well here. However, when faced with inputs outside the distribution of the training data, the behavior of the model can be unexpected, as shown below.

Object detection on the original vs. the modified image: Oxford Town Centre image [2] with predictions and modifications.

In the image snippets above, we can see the original image on the left and a modified version on the right. Some objects (trash cans) have been inserted into the right image to challenge the object detection model with input it was not trained on. Our regular model without uncertainty estimation now tries to detect persons in both images. On the left, detection works normally for persons. On the right, however, the newly inserted trash cans are wrongly detected as persons as well.

This is where uncertainty estimation comes into play. A model capable of expressing its uncertainty would mark the person on the top left with low uncertainty, as the model was trained for this type of detection. The green trash can at the bottom, however, would get a high uncertainty score, as it is an out-of-distribution entity the model has not seen during training.

Training data is always finite, and therefore the knowledge of a model is limited as well. Still, given an input, the model will always produce an output. Uncertainty estimation allows the model to signal when an input goes beyond what it has learned. Applications making use of this model could then handle uncertain detections differently, e.g., by triggering fallback safety mechanisms or by letting a human expert take over.
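The handling of uncertain detections described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the method evaluated in this project: it assumes a sampling-based approach such as Monte Carlo dropout [4] or deep ensembles [6], which yields several class probability vectors per detection, and it uses the normalized entropy of their mean as the uncertainty score. The function names, the threshold of 0.5, and all probability values are hypothetical.

```python
import math

def mean_probs(samples):
    """Average class probabilities over several forward passes or ensemble members."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

def normalized_entropy(probs):
    """Shannon entropy of a probability vector, scaled to [0, 1] via log(num_classes)."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))

def route(samples_per_detection, max_uncertainty=0.5):
    """Split detections into accepted ones and ones handed to a fallback mechanism."""
    accepted, fallback = [], []
    for name, samples in samples_per_detection.items():
        uncertainty = normalized_entropy(mean_probs(samples))
        target = accepted if uncertainty <= max_uncertainty else fallback
        target.append((name, uncertainty))
    return accepted, fallback

# Hypothetical [Person, Background] probabilities from four passes per detection.
# These are illustrative values, not measured results.
detections = {
    "pedestrian": [[0.95, 0.05], [0.96, 0.04], [0.94, 0.06], [0.95, 0.05]],
    "trash can": [[0.95, 0.05], [0.10, 0.90], [0.80, 0.20], [0.15, 0.85]],
}

accepted, fallback = route(detections)
print("accepted:", accepted)
print("fallback:", fallback)
```

With these example values, the passes agree on the pedestrian, so it is accepted, while they disagree strongly on the trash can, so it is routed to the fallback path. A suitable threshold would have to be chosen per application.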

Popular uncertainty estimation methods [4, 6] often come with a computational overhead [5], making them infeasible for resource-constrained Edge AI applications. Therefore, when evaluating different uncertainty estimation approaches, challenges regarding the following aspects need to be addressed:

Inference time, as decisions in Edge AI applications, such as traffic detection in smart cities, need to happen within milliseconds
Memory utilization, as Edge devices, unlike the Cloud, are limited in computational resources
Model construction, as network structures and loss functions different from those of regular models need to be implemented
Detection accuracy, as a model with uncertainty estimation also needs to be competitive with unmodified models on regular detections
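The first two aspects, inference time and memory utilization, can be measured with a small harness like the following sketch. It relies only on the Python standard library; `benchmark` and `dummy_predict` are hypothetical names, and `dummy_predict` is merely a stand-in for a real detection model. Note that `tracemalloc` only tracks memory allocated by Python itself, so memory used by native libraries or accelerators would need a different tool.

```python
import time
import tracemalloc

def benchmark(predict, inputs, warmup=2):
    """Return mean latency (ms) and peak Python heap usage (KiB) of `predict`."""
    # Warm-up calls so that one-time setup costs do not distort the timing.
    for x in inputs[:warmup]:
        predict(x)

    # Timing pass, run without memory tracing to avoid its overhead.
    start = time.perf_counter()
    for x in inputs:
        predict(x)
    latency_ms = 1000.0 * (time.perf_counter() - start) / len(inputs)

    # Separate memory pass: peak size of Python-level allocations.
    tracemalloc.start()
    for x in inputs:
        predict(x)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {"latency_ms": latency_ms, "peak_kib": peak_bytes / 1024.0}

def dummy_predict(image):
    """Stand-in for a detector: sums each image row (an assumption for illustration)."""
    return [sum(row) for row in image]

# Ten hypothetical 64x64 single-channel "images".
images = [[[1.0] * 64 for _ in range(64)] for _ in range(10)]
stats = benchmark(dummy_predict, images)
print(stats)
```

Running the same harness on a regular model and on a variant with uncertainty estimation would give a first impression of the overhead on a given device.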

[1] Zahid Ahmed, R. Iniyavan, et al. Enhanced vulnerable pedestrian detection using deep learning. In 2019 International Conference on Communication and Signal Processing (ICCSP), pages 0971–0974. IEEE, 2019.

[2] Ben Benfold and Ian Reid. Stable multi-target tracking in real-time surveillance video. In CVPR 2011, pages 3457–3464, 2011.

[3] Sebastian Cygert and Andrzej Czyzewski. Toward robust pedestrian detection with data augmentation. IEEE Access, 8:136674–136683, 2020.

[4] Yarin Gal and Zoubin Ghahramani. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In International Conference on Machine Learning, pages 1050–1059. PMLR, 2016.

[5] Fredrik K. Gustafsson, Martin Danelljan, and Thomas B. Schön. Evaluating scalable Bayesian deep learning methods for robust computer vision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 318–319, 2020.

[6] Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. Simple and scalable predictive uncertainty estimation using deep ensembles. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pages 6405–6416, 2017.

[7] Ivan Lujic, Vincenzo De Maio, Klaus Pollhammer, Ivan Bodrozic, Josip Lasic, and Ivona Brandic. Increasing traffic safety with real-time edge analytics and 5G. In Proceedings of the 4th International Workshop on Edge Systems, Analytics and Networking, pages 19–24, 2021.

[8] Buse Pehlivan, Ceren Kahraman, Deniz Kurtel, Mert Nakip, and Cüneyt Güzelis. Real-time implementation of mini autonomous car based on MobileNet-Single Shot Detector. In 2020 Innovations in Intelligent Systems and Applications Conference (ASYU), pages 1–6. IEEE, 2020.

[9] Murat Sensoy, Lance Kaplan, and Melih Kandemir. Evidential deep learning to quantify classification uncertainty. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, pages 3183–3193, 2018.