EDISON‐DATA: A flexible and extensible platform for processing and analysis of computational science data

Bibliographic Details
Published in: Software, practice & experience, 2019-10, Vol.49 (10), p.1509-1530
Main Authors: Ahn, Sunil; Lee, Jeongcheol; Kim, Jaesung; Lee, JongSuk R.
Format: Article
Language:English
Description
Summary: With the recent emergence of new paradigms such as open science and big data, the need for data sharing and collaboration is becoming important in the computational science field as well. The EDISON‐DATA platform aims to provide services through which computational simulation data can be easily published, preserved, shared, reused, discovered, and analyzed. This paper first analyzes the platform‐related issues, encountered during the development of the EDISON‐DATA platform, that affect the sharing and reuse of computational science data. These issues include data complexity, diversity, reliability, and heterogeneity. To address them and to support data analysis in an efficient and integrated manner, this study proposes several ideas used in the EDISON‐DATA platform. First, we suggest an automated preprocessing framework to handle the complexity of computational science data. Second, to address the diversity issue, we present ways to develop preprocessing logic and data presentation logic customized for each data type. Third, to improve the reliability of computational science data, we present quality control and provenance management techniques. Fourth, we propose a way to manage related data in groups. Fifth, to solve the data heterogeneity problem and to analyze data in an integrated way, we have the preprocessing framework use controlled vocabularies to express descriptive metadata. Lastly, we demonstrate the feasibility and usability of the proposed ideas by presenting a case study of building a research portal service in the materials field based on the EDISON‐DATA platform.
ISSN: 0038-0644, 1097-024X
DOI:10.1002/spe.2732