Research on a Distributed Processing Model Based on Kafka for Large-Scale Seismic Waveform Data


Bibliographic Details
Published in: IEEE Access, 2020, Vol. 8, pp. 39971-39981
Main Authors: Chai, Xu-Chao; Wang, Qing-Liang; Chen, Wen-Sheng; Wang, Wen-Qing; Wang, Dan-Ning; Li, Yue
Format: Article
Language: English
Description
Summary: To meet the storage and recovery requirements for large-scale seismic waveform data at the National Earthquake Data Backup Center (NEDBC), a distributed cluster processing model based on Kafka message queues is designed to optimize the write efficiency of seismic waveform data stored in HBase at NEDBC. First, the characteristics of big data storage architectures are compared with those of traditional disk-array storage architectures. Second, seismic waveform data parsing and periodic truncation are implemented, and the results are written to HBase as NoSQL records through a Spark Streaming cluster. Finally, the read/write performance of the proposed big data platform's processing pipeline is compared and tested against that of the traditional storage architecture. Results show that the Kafka-based seismic waveform data processing architecture designed and implemented in this paper achieves higher read/write speeds than the traditional architecture while preserving the redundancy required for NEDBC data backup, which verifies the validity and practicability of the proposed approach.
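The abstract's "periodic truncation … written to HBase as NoSQL records" step can be illustrated with a minimal sketch. The window length, key layout (`network.station.channel|window_start`), and all function names below are assumptions for illustration only; the paper does not publish its schema, and the Kafka/Spark Streaming/HBase plumbing is omitted so the record-forming logic stands alone:

```python
from datetime import datetime, timezone

# Hypothetical truncation period; the paper does not specify one.
WINDOW_SECONDS = 60

def row_key(network: str, station: str, channel: str, ts: datetime) -> str:
    """Build an HBase-style row key by truncating the sample timestamp to
    the start of its fixed-length window, so that samples from the same
    station/channel/window collect in one row."""
    epoch = int(ts.timestamp())
    window_start = epoch - (epoch % WINDOW_SECONDS)
    return f"{network}.{station}.{channel}|{window_start}"

def truncate(samples, start_epoch, rate_hz, network, station, channel):
    """Split a continuous sample stream into window-sized records,
    returned as {row_key: [samples...]} - a stand-in for the NoSQL
    records a Spark Streaming job would write to HBase."""
    records = {}
    for i, value in enumerate(samples):
        ts = datetime.fromtimestamp(start_epoch + i / rate_hz, tz=timezone.utc)
        records.setdefault(row_key(network, station, channel, ts), []).append(value)
    return records
```

For example, 120 samples at 1 Hz starting at epoch 0 split into two 60-second rows, `BJ.XAN.BHZ|0` and `BJ.XAN.BHZ|60`. In the paper's pipeline, each such row would be one HBase put issued from the Spark Streaming cluster.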
ISSN: 2169-3536