Techniques and guidelines for effective migration from RDBMS to NoSQL

Bibliographic Details
Published in: The Journal of Supercomputing, 2020-10, Vol. 76 (10), p. 7936-7950
Main Authors: Kim, Ho-Jun; Ko, Eun-Jeong; Jeon, Young-Ho; Lee, Ki-Hoon
Format: Article
Language: English
Description
Summary: Migration from RDBMS to NoSQL has become an important topic in the big data era. This paper provides comprehensive techniques and guidelines for effective migration from RDBMS to NoSQL. We discuss the challenges faced in translating SQL queries; the effects of denormalization, column families, secondary indexes, join algorithms, and column name length; and decision support for the migration. We focus on a column-oriented NoSQL database, HBase, because it is widely used by many Internet enterprises such as Facebook, Twitter, and LinkedIn. Because HBase does not support SQL, we use Apache Phoenix as an SQL layer on top of HBase. Experimental results using TPC-H show that column-level denormalization with atomicity and grouping columns into column families significantly improve query performance; secondary indexes on foreign keys are not as effective as in RDBMSs; the query optimizer of Phoenix is not very sophisticated; shortened column names significantly reduce the database size and improve query performance; and an SVM classifier can predict whether migration improves query performance. Important open problems in NoSQL research are supporting complex SQL queries, automatic index selection, and optimizing SQL queries for NoSQL.
ISSN: 0920-8542, 1573-0484
DOI: 10.1007/s11227-018-2361-2