A Hybrid Vision Transformer Model for Efficient Waste Classification

Dublin Core

Title

A Hybrid Vision Transformer Model for Efficient Waste Classification

Subject

Deep Learning, Fine-Tuning, Hybrid Approach, ResNet50, Vision Transformers, Waste Classification

Description

The rapid and accurate sorting of municipal waste is essential for efficient recycling and sustainable resource recovery. Most existing AI solutions focus on only four common materials (plastic, paper, metal, and glass), overlooking many other routinely encountered waste types and losing accuracy when applied to the mixed waste compositions seen in operational environments. We introduce HR-ViT, a hybrid network that combines ResNet50 residual blocks, which capture fine-grained local cues, with the global self-attention of a Vision Transformer. Trained on a balanced six-class benchmark of about 775 images per class (plastic, paper, organic, metal, glass, batteries), HR-ViT attains 98.27% accuracy and a macro-averaged F1-score of 0.98, outperforming a pure ViT, VT-MLH-CNN, and Garbage FusionNet by up to five percentage points in both metrics. The gains arise from selective fine-tuning of the last ten ResNet layers, lightweight ViT hyper-parameter optimisation, and targeted data augmentation that mitigates cluttered backgrounds, uneven lighting, and object deformation. These results show that hybrid attention-residual architectures provide reliable predictions under complex imaging conditions. Future work will extend the method to multi-object scenes and domain-adaptive deployment in smart-city recycling systems.
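
The description reports both accuracy and a macro-averaged F1-score. As a minimal sketch of how these two metrics are computed for a six-class problem, the snippet below uses the standard definitions in plain Python. The function names and the toy labels are assumptions made for illustration; they are not taken from the paper and do not reproduce its results.

```python
from collections import Counter

# The six classes named in the description.
CLASSES = ["plastic", "paper", "organic", "metal", "glass", "batteries"]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of the per-class F1 scores.

    Per-class F1 is 2*TP / (2*TP + FP + FN); a class with no true or
    predicted samples contributes 0.0 (a convention chosen here).
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(classes)


# Toy labels, invented for illustration: two samples per class,
# with one plastic item misclassified as paper.
y_true = [c for c in CLASSES for _ in range(2)]
y_pred = list(y_true)
y_pred[1] = "paper"

print("accuracy:", accuracy(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred, CLASSES))
```

Because the macro average weights every class equally, a single confusion between two classes lowers it even when overall accuracy stays high, which is why the two figures are usually reported together on a balanced benchmark like the one described.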

Creator

Amir Mahmud Husein, Baren Baruna Harahap, Tio Fulalo Simatupang, Karunia Syukur Baeha, Bintang Keitaro Sinambela

Source

DOI: http://dx.doi.org/10.21609/jiki.v18i2.1489

Publisher

Faculty of Computer Science UI

Date

2025-02-26

Contributor

Sri Wahyuni

Rights

ISSN: 2502-9274

Format

PDF

Language

English

Type

Text

Tags

Repository, Repository Horizon University Indonesia, Repository Universitas Horizon Indonesia, Horizon.ac.id, Horizon University Indonesia, Universitas Horizon Indonesia, HorizonU, Repo Horizon

Citation

Amir Mahmud Husein, Baren Baruna Harahap, Tio Fulalo Simatupang, Karunia Syukur Baeha, Bintang Keitaro Sinambela, “A Hybrid Vision Transformer Model for Efficient Waste Classification,” Repository Horizon University Indonesia, accessed January 11, 2026, https://repository.horizon.ac.id/items/show/9887.