A High-Performance Data Accessing and Processing System for Campus
Real-time Power Usage
Dublin Core
Title
A High-Performance Data Accessing and Processing System for Campus
Real-time Power Usage
Subject
Internet of Things; Big data warehouse; Real-time processing; Spark; Hive; Impala;
Description
With the growth of Internet of Things (IoT) technology, power data from ubiquitous devices can be linked to the Internet and analyzed to meet
real-time monitoring requirements. Over time, this power data can accumulate to the terabyte level. To build a real-time power-monitoring
platform on top of such data, efficient and novel implementation techniques have been developed, and they form the core of this thesis. The
proposed platform integrates multiple software subsystems in layers; from bottom to top, it is composed of Ubuntu (operating system), Hadoop
(storage subsystem), Hive (data warehouse), and Spark MLlib (data analytics). The power data comes from smart meters installed in the
factories of an enterprise. Data collection and storage are handled by the Hadoop subsystem, and ingestion into the Hive data warehouse is
carried out by Spark. For system verification, HiveQL and Impala SQL were tested for query-response efficiency under single-record queries,
and the same software modules were also tested under full-table queries. The main contributions of this work are twofold: the details of
building an efficient real-time power-monitoring platform, and the corresponding query-response efficiency results, provided for reference.
Creator
Sheng-Cang Chou, Chao-Tung Yang
Date
2020
Contributor
peri irawan
Format
pdf
Language
english
Type
text
Files
Collection
Citation
Sheng-Cang Chou, Chao-Tung Yang, “A High-Performance Data Accessing and Processing System for Campus
Real-time Power Usage,” Repository Horizon University Indonesia, accessed June 5, 2025, https://repository.horizon.ac.id/items/show/9229.