TY - GEN
T1 - High productivity processing - Engaging in big data around distributed computing
AU - Riedel, Morris
AU - Memon, M.
AU - Memon, A.
AU - Fiameni, G.
AU - Cacciari, C.
AU - Lippert, Thomas
PY - 2013
Y1 - 2013
N2 - The steadily increasing amount of scientific data and the analysis of big data are fundamental characteristics of computational simulations based on numerical methods or known physical laws. This represents both an opportunity and a challenge, on different levels, for traditional distributed computing approaches, architectures, and infrastructures. At the lowest level, data-intensive computing is a challenge because CPU speed has outpaced the I/O capabilities of HPC resources; at higher levels, complex cross-disciplinary data sharing via data infrastructures is envisioned in order to consolidate fragmented answers to societal challenges. This paper highlights how these levels share a demand for high-productivity processing of big data, including the sharing and analysis of large-scale scientific data sets. The paper describes approaches such as the high-level European data infrastructure EUDAT as well as low-level requirements arising from HPC simulations used in distributed computing. It also addresses the fact that big data analysis methods such as computational steering and visualization, MapReduce, R, and others already exist, but considerable research and evaluation is still needed before they yield scientific insights in the context of traditional distributed computing infrastructures.
UR - https://www.scopus.com/pages/publications/84886931280
M3 - Conference contribution
SN - 9789532330762
T3 - 2013 36th International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO 2013 - Proceedings
SP - 145
EP - 150
BT - 2013 36th International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO 2013 - Proceedings
T2 - 2013 36th International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO 2013
Y2 - 20 May 2013 through 24 May 2013
ER -