Improved Mask R-CNN And Cosine Similarity Using RGBD Segmentation For Occlusion Handling In Multi Object Tracking
Dublin Core
Title
Improved Mask R-CNN And Cosine Similarity Using RGBD Segmentation For Occlusion Handling In Multi Object Tracking
            Subject
Occlusion, RGBD, Mask R-CNN, Late fusion, Cosine similarity
            Description
In this study, additional depth images were used to enrich the information available at each image pixel. Segmentation by its nature processes an image down to the pixel level, so it can detect even the smallest visible part of an object when that object is overlapped by another. The main goal of using segmentation is to keep tracking for longer: from the moment an object starts to be occluded, through severe occlusion, until just before it disappears completely. Object tracking based on object detection was developed by modifying the Mask R-CNN architecture to process RGBD images. Features were extracted from each detection result using Histogram of Oriented Gradients (HOG) and compared with those of the target objects. The comparison used cosine similarity, and the detected object with the maximum similarity updated the target object for the next frame. The model was evaluated using mAP. Mask R-CNN RGBD late fusion scored about 5 percentage points higher than Mask R-CNN RGB: 68.234% versus 63.668%. Tracking was evaluated with the traditional method of counting ID switches during the tracking process. Over 295 frames, the original Mask R-CNN method had ten ID switches, whereas the proposed Mask R-CNN RGBD method tracked much better, with ID switches close to 0.
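The matching step in the description (compare each detection's feature vector with the target using cosine similarity, then let the best match become the target for the next frame) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names and the toy three-element vectors are hypothetical, and the vectors stand in for HOG descriptors that would be computed elsewhere.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_target(
    target: Sequence[float], detections: List[Sequence[float]]
) -> Tuple[int, float]:
    """Return (index, score) of the detection most similar to the target."""
    if not detections:
        raise ValueError("no detections in this frame")
    scores = [cosine_similarity(target, d) for d in detections]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical toy descriptors standing in for HOG features of one frame.
target = [1.0, 0.0, 2.0]
detections = [[0.0, 1.0, 0.0], [2.0, 0.0, 4.0], [1.0, 1.0, 1.0]]

best_index, best_score = match_target(target, detections)

# The best-matching detection becomes the target for the next frame.
target = detections[best_index]
```

Because cosine similarity depends only on the angle between vectors, the second detection (a scaled copy of the target) is selected here even though its magnitude differs. A real tracker would likely also need a minimum-similarity threshold to handle frames where the target is fully occluded; that threshold is an assumption and is not described in the abstract.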
Creator
Siti Hadiyan Pratiwi, Putri Shaniya, Grafika Jati, Wisnu Jatmiko
            Source
http://dx.doi.org/10.21609/jiki.v16i1.1073
            Publisher
Faculty of Computer Science Universitas Indonesia
            Date
2023-02-28
            Contributor
Sri Wahyuni
            Rights
e-ISSN : 2502-9274 printed ISSN : 2088-7051
            Format
PDF
            Language
English
            Type
Text
            Coverage
Jurnal Ilmu Komputer dan Informasi (Journal of Computer Science and Information)
            Files
Collection
Citation
Siti Hadiyan Pratiwi, Putri Shaniya, Grafika Jati, Wisnu Jatmiko, “Improved Mask R-CNN And Cosine Similarity Using RGBD Segmentation For Occlusion Handling In Multi Object Tracking,” Repository Horizon University Indonesia, accessed October 31, 2025, https://repository.horizon.ac.id/items/show/8849.