Improved Mask R-CNN And Cosine Similarity Using RGBD Segmentation For Occlusion Handling In Multi Object Tracking
Dublin Core
Title
Improved Mask R-CNN And Cosine Similarity Using RGBD Segmentation For Occlusion Handling In Multi Object Tracking
Subject
Occlusion, RGBD, Mask R-CNN, Late fusion, Cosine similarity
Description
In this study, depth images were added to enrich the information available at each image pixel. Segmentation by its nature processes an image at the pixel level, so it can detect even the smallest visible part of an object, including when the object overlaps another object. The main goal of using segmentation is to keep tracking an object for longer, from the moment it starts to be occluded until it is severely occluded, right before it disappears completely. Object tracking based on object detection was developed by modifying the Mask R-CNN architecture to process RGBD images. Features were extracted from each detection result using HOG (histogram of oriented gradients) and compared with the target objects using cosine similarity; the detected object with the maximum similarity updated the target object for the next frame. The model was evaluated using mAP (mean average precision). Mask R-CNN RGBD with late fusion scored about 5 percentage points higher than Mask R-CNN RGB: 68.234% versus 63.668%. Tracking was evaluated with the traditional method of counting ID switches during the tracking process. Over 295 frames, the original Mask R-CNN method had ten ID switches, whereas the proposed Mask R-CNN RGBD method tracked much better, with ID switches close to 0.
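The matching step described above (compare each detection's feature vector with the target, then let the best match become the target for the next frame) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine_similarity` and `match_target` are hypothetical, the short lists stand in for real HOG descriptors, and returning a similarity of 0 for an all-zero vector is an assumption made here for safety.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: an all-zero descriptor is treated as "no match".
        return 0.0
    return dot / (norm_a * norm_b)


def match_target(target_feature, detection_features):
    """Return (index, score) of the detection most similar to the target."""
    if not detection_features:
        return None, 0.0
    scores = [cosine_similarity(target_feature, f) for f in detection_features]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy descriptors standing in for HOG vectors of one target and three detections.
target = [1.0, 0.0, 1.0]
detections = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]

best_index, best_score = match_target(target, detections)

# The best-matching detection becomes the target for the next frame.
target = detections[best_index]
```

Because cosine similarity depends only on the direction of the vectors, the second detection matches the target perfectly even though its magnitude differs, which makes the comparison insensitive to uniform scaling of the descriptor.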
Creator
Siti Hadiyan Pratiwi, Putri Shaniya, Grafika Jati, Wisnu Jatmiko
Source
http://dx.doi.org/10.21609/jiki.v16i1.1073
Publisher
Faculty of Computer Science Universitas Indonesia
Date
2023-02-28
Contributor
Sri Wahyuni
Rights
e-ISSN: 2502-9274; printed ISSN: 2088-7051
Format
PDF
Language
English
Type
Text
Coverage
Jurnal Ilmu Komputer dan Informasi (Journal of Computer Science and Information)
Citation
Siti Hadiyan Pratiwi, Putri Shaniya, Grafika Jati, Wisnu Jatmiko, “Improved Mask R-CNN And Cosine Similarity Using RGBD Segmentation For Occlusion Handling In Multi Object Tracking,” Repository Horizon University Indonesia, accessed May 22, 2025, https://repository.horizon.ac.id/items/show/8849.