TELKOMNIKA Telecommunication, Computing, Electronics and Control
Analysis of frequent itemset generation based on trie data structure in Apriori algorithm
Dublin Core
Title
TELKOMNIKA Telecommunication, Computing, Electronics and Control
Analysis of frequent itemset generation based on trie data structure in Apriori algorithm
Subject
Apriori
Hash-node calculation
Level pruning
Multi-thread
Trie data structure
Description
Apriori is an association rule mining technique that aims to extract
correlations between sets of items in a transaction database. The main
problem with the Apriori algorithm is that it scans the database
repeatedly to generate itemset candidates. This research examines the
combination of pruning using the trie approach and a multi-thread
implementation in three algorithms to obtain frequent itemsets. A trie is a
data structure in the form of an ordered tree that stores a set of strings,
where all descendants of a node share the same prefix. The use of a full
combination trie (different from the frequent pattern (FP) tree, which uses
links) allows the implementation of arrays and a hash calculation to address
each itemset combination. In this research, the measure used to get the
address is called the Hash-node calculation, and it is used to update the
support value. For these three alternatives, run time is analyzed based on
the number of itemset combinations and transactions at a certain minimum
support value. The experimental results show that an algorithm that exploits
resource capabilities by applying multi-threading performs almost seven times
better than an algorithm implemented in a single thread in calculating the
hash-node. The fastest run time of the multi-thread approach is 43 minutes
with 150-itemset combinations on 100,000 transactions.
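The idea of addressing itemset combinations in an array can be illustrated with a minimal sketch. The abstract does not give the paper's Hash-node formula, so the code below substitutes a simple bitmask over item positions as a stand-in address function; the names node_address and frequent_itemsets, the sample transactions, and the single-threaded level-by-level pruning loop are all assumptions made for illustration, not the authors' implementation. The array has one slot per possible combination, so this form is only practical for a small number of items.

```python
from itertools import combinations


def node_address(itemset, index_of):
    """Hypothetical stand-in for the Hash-node calculation:
    a bitmask over item positions gives each combination a unique slot."""
    address = 0
    for item in itemset:
        address |= 1 << index_of[item]
    return address


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search with supports kept in a flat array."""
    items = sorted({item for t in transactions for item in t})
    index_of = {item: i for i, item in enumerate(items)}
    # One slot per node of the full combination trie (2^n slots, small n only).
    support = [0] * (1 << len(items))
    frequent = {}
    survivors = set(items)  # items still present in some frequent itemset
    level = 1
    while survivors and level <= len(items):
        # Update support values for every level-sized combination in each transaction.
        for t in transactions:
            kept = sorted(set(t) & survivors)
            for combo in combinations(kept, level):
                support[node_address(combo, index_of)] += 1
        # Keep the combinations of this level that reach the minimum support.
        level_frequent = {}
        for combo in combinations(sorted(survivors), level):
            count = support[node_address(combo, index_of)]
            if count >= min_support:
                level_frequent[combo] = count
        if not level_frequent:
            break
        frequent.update(level_frequent)
        # Level pruning: drop items that appear in no frequent itemset of this level.
        survivors = {item for combo in level_frequent for item in combo}
        level += 1
    return frequent


transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
result = frequent_itemsets(transactions, min_support=2)
for itemset, count in sorted(result.items()):
    print(itemset, count)
```

Because every combination maps directly to an array index, updating a support value needs no tree traversal or link following, which is the property the abstract contrasts with the FP-tree.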
Creator
Ade Hodijah, Urip Teguh Setijohatmo
Source
http://journal.uad.ac.id/index.php/TELKOMNIKA
Date
Apr 22, 2021
Contributor
peri irawan
Format
pdf
Language
English
Type
text
Citation
Ade Hodijah, Urip Teguh Setijohatmo, “TELKOMNIKA Telecommunication, Computing, Electronics and Control
Analysis of frequent itemset generation based on trie data structure in Apriori algorithm,” Repository Horizon University Indonesia, accessed April 4, 2025, https://repository.horizon.ac.id/items/show/4199.