ANoM STEMMER: Nazief & Andriani Modification for Maduresse Stemming
Dublin Core
Title
ANoM STEMMER: Nazief & Andriani Modification for Maduresse Stemming
Subject
stemming; morphophonemic; understemming; overstemming; Madurese language
Description
Madurese is one of the regional languages of Indonesia and a cultural asset that needs to be preserved. With its
distinctive features and word formation rules, Madurese is a suitable subject for Information Retrieval techniques such as
stemming. Madurese is closely related to Javanese, for which several studies have applied stemming methods such as a
modified Nazief & Adriani algorithm with good results. That approach, however, has not been studied for Madurese, and
its effectiveness there has not been demonstrated. Previous studies also have not used the morphophonemic rules that
influence word formation in Madurese. This research therefore modifies Nazief & Adriani's algorithm for Madurese, based
on Madurese morphology, by removing affixes, namely ter-ater (prefix) and panoteng (suffix), and by applying
morphophonemic rules. The corpus consists of 1000 affixed words taken from a Madurese dictionary. The algorithm stems
890 of these words correctly, for an overall accuracy of 89.0% and an error rate of 11%. By affix type, accuracy is 93.81%
for prefixes, 83.78% for suffixes, and 80.07% for confixes. Understemming occurs in 104 words and overstemming in 6
words. The time it takes to compile is 31.31 seconds.
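The overall figures in the description can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `stemming_summary` and its parameters are hypothetical and do not come from the paper, and it uses only the counts stated above (1000 words, 890 correct, 104 understemmed, 6 overstemmed). The per-affix accuracies are not recomputed because the description does not give the per-affix word counts.

```python
def stemming_summary(corpus_size, correct, understemmed, overstemmed):
    """Summarize stemmer evaluation counts (hypothetical helper, not from the paper)."""
    errors = understemmed + overstemmed
    # Every word is either stemmed correctly or counted as an error.
    if correct + errors != corpus_size:
        raise ValueError("counts do not add up to the corpus size")
    return {
        "errors": errors,
        # Multiply before dividing so the percentages stay exact for these counts.
        "accuracy_pct": 100 * correct / corpus_size,
        "error_rate_pct": 100 * errors / corpus_size,
    }


# Counts as reported in the description.
summary = stemming_summary(corpus_size=1000, correct=890,
                           understemmed=104, overstemmed=6)
print(summary)
```

With the reported counts, the 104 understemmed and 6 overstemmed words account for all 110 errors, which agrees with the stated 89.0% accuracy and 11% error rate.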
Creator
Enni Lindrawati, Ema Utami, Ainul Yaqin
Source
http://jurnal.iaii.or.id
Publisher
Professional Organization Ikatan Ahli Informatika Indonesia (IAII)/Indonesian Informatics Experts Association
Date
December 2023
Contributor
Sri Wahyuni
Rights
ISSN Media Electronic: 2580-0760
Format
PDF
Language
English
Type
Text
Files
Collection
Citation
Enni Lindrawati, Ema Utami, Ainul Yaqin, “ANoM STEMMER: Nazief & Andriani Modification for Maduresse Stemming,” Repository Horizon University Indonesia, accessed January 12, 2026, https://repository.horizon.ac.id/items/show/10144.