ARTICLE
TITLE

Experimental evaluation of the temporal efficiency of big data processing for specified storage formats

V.A. Belov    
E.V. Nikulchev    

Abstract

Choosing data storage formats is one of the most important tasks in building a modern big data processing platform. The choice rests on performance criteria that depend on the class of objects being stored and on the requirements placed on the platform. One of the most important criteria is the time spent on the various big data processing operations. The paper studies the five most popular formats for storing big data (Avro, CSV, JSON, ORC, Parquet), proposes an experimental test bench for assessing time efficiency, and presents a comparative analysis of the experimental estimates obtained for these formats. The experiment covers basic data processing operations run with the Apache Spark framework. The format selection algorithm is based on the hierarchy analysis method (analytic hierarchy process). The result is a methodology for choosing a storage format from a set of alternatives that combines experimental parameter estimates with the hierarchy analysis method, aimed at time-efficient basic operations on big data storage formats in the Apache Hadoop system using Apache Spark.
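The abstract does not give the details of the selection algorithm, so the following is only a minimal sketch of how a hierarchy analysis (analytic hierarchy process) step might turn timing estimates into priority weights. It assumes the row geometric-mean approximation of the priority vector and assumes that pairwise preferences are derived as ratios of measured times; both choices, the function names `ahp_weights` and `pairwise_from_times`, and the sample numbers are illustrative and not taken from the paper.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    pairwise[i][j] states how strongly item i is preferred over item j;
    a reciprocal matrix is expected (pairwise[j][i] == 1 / pairwise[i][j]).
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def pairwise_from_times(times):
    """Build a pairwise comparison matrix from operation times.

    Assumption (not from the paper): alternative i is preferred over j
    in proportion to times[j] / times[i], so lower time means higher priority.
    """
    return [[tj / ti for tj in times] for ti in times]


# Made-up timings in seconds for three placeholder alternatives;
# these are NOT measurements reported in the paper.
placeholder_times = [2.0, 4.0, 8.0]
weights = ahp_weights(pairwise_from_times(placeholder_times))

for name, weight in zip(["alt_1", "alt_2", "alt_3"], weights):
    print(f"{name}: {weight:.3f}")
```

In a full hierarchy, weights of this kind would be computed per operation (criterion) and then combined using the criteria weights; that aggregation is omitted here because the abstract does not describe it.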

Similar articles

       
 
Donghyun Kang    
Despite the technological achievements of unmanned aerial vehicles (UAVs) growing in academia and industry, there is a lack of studies on the storage devices in UAVs. However, this is an important aspect because the storage devices in UAVs have a limited...
Journal: Aerospace

 
Daniele Granata, Alberto Savino and Alex Zanotti    
The present study aimed to investigate the capability of mid-fidelity aerodynamic solvers in performing a preliminary evaluation of the static and dynamic stability derivatives of aircraft configurations in their design phase. In this work, the mid-fidel...
Journal: Aerospace

 
Chenglin Yang, Dongliang Xu and Xiao Ma    
Due to the increasing severity of network security issues, training corresponding detection models requires large datasets. In this work, we propose a novel method based on generative adversarial networks to synthesize network data traffic. We introduced...
Journal: Applied Sciences

 
Changchang Li, Botao Xu, Zhiwei Chen, Xiaoou Huang, Jing (Selena) He and Xia Xie    
University students, as a special group, face multiple psychological pressures and challenges, making them susceptible to social anxiety disorder. However, there are currently no articles using machine learning algorithms to identify predictors of social...
Journal: Applied Sciences

 
Zheng Li, Xinkai Chen, Jiaqing Fu, Ning Xie and Tingting Zhao    
With the development of electronic game technology, the content of electronic games presents a larger number of units, richer unit attributes, more complex game mechanisms, and more diverse team strategies. Multi-agent deep reinforcement learning shines ...
Journal: Algorithms