JOURNAL
Information

ARTICLE

TITLE

Hadoop Performance Analysis Model with Deep Data Locality

Sungchul Lee, Ju-Yeon Jo and Yoohwan Kim

Abstract

Background: Hadoop has become a base framework for big data systems, built on the simple concept that moving computation is cheaper than moving data. Hadoop increases data locality in the Hadoop Distributed File System (HDFS) to improve system performance: the more data is local to the machine that processes it, the less network traffic flows among the nodes. Previous research increased data locality in a single MapReduce stage to improve Hadoop performance. However, there is currently no mathematical performance model for data locality in Hadoop. Methods: This study built a Hadoop performance analysis model with data locality that covers the entire MapReduce process. The paper explains the data locality concept in the map stage and the shuffle stage, and shows how to apply the model to improve Hadoop performance through deep data locality. Results: Three tests (a simulation-based test, a cloud test, and a physical test) demonstrated that deep data locality increases Hadoop performance. According to the tests, deep data locality improved the Hadoop system by over 34%. Conclusions: Deep data locality improved Hadoop performance by reducing data movement in HDFS.
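The link between data locality and network traffic described in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' performance model; it only counts how many map tasks run on a node that already holds their input block. The function name `map_locality`, the 128 MB block size, the node names, and the placements are all assumptions made for illustration.

```python
from typing import Dict, List

BLOCK_SIZE_MB = 128  # assumed HDFS block size, for illustration only


def map_locality(replicas: Dict[str, List[str]], assignment: Dict[str, str]) -> dict:
    """Count data-local map tasks and estimate the data moved over the network.

    replicas:   block id -> nodes holding a replica of that block
    assignment: block id -> node chosen to run the map task for that block
    """
    local = sum(1 for block, node in assignment.items() if node in replicas[block])
    remote = len(assignment) - local
    return {
        "local_tasks": local,
        "remote_tasks": remote,
        "locality_ratio": local / len(assignment),
        "network_mb": remote * BLOCK_SIZE_MB,  # each remote task pulls one block
    }


# Hypothetical cluster: four blocks, each replicated on two of three nodes.
replicas = {"b1": ["n1", "n2"], "b2": ["n2", "n3"], "b3": ["n1", "n3"], "b4": ["n2", "n3"]}
poor = {"b1": "n3", "b2": "n1", "b3": "n2", "b4": "n1"}  # no task is data-local
good = {"b1": "n1", "b2": "n2", "b3": "n3", "b4": "n2"}  # every task is data-local

print(map_locality(replicas, poor))
print(map_locality(replicas, good))
```

In this toy setting the fully local assignment moves no input data across the network, while the non-local one moves every block. The sketch covers only map-stage input; the shuffle stage, which the paper also addresses, is not modeled here.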

Keywords

MapReduce - Hadoop - data locality - HDFS - deep data locality

ISSUE

Volume: 10 Issue: 7 (2019)

SUBJECTS

ENGINEERING AND CIVIL CONSTRUCTION
TECHNOLOGY

SIMILAR JOURNALS

Applied Sciences
International Journal of Open Information Technologies
Informatics

DOI

https://doi.org/10.3390/info10070222

Similar articles

A New Big Data Benchmark for OLAP Cube Design Using Data Pre-Aggregation Techniques

Roberto Tardío, Alejandro Maté and Juan Trujillo

In recent years, several new technologies have enabled OLAP processing over Big Data sources. Among these technologies, we highlight those that allow data pre-aggregation because of their demonstrated performance in data querying. This is the case of Apa...

Journal: Applied Sciences

Recognizing Indonesian Acronym and Expansion Pairs with Supervised Learning and MapReduce

Taufik Fuadi Abidin, Amir Mahazir, Muhammad Subianto, Khairul Munadi and Ridha Ferdhiana

During the previous decades, intelligent identification of acronym and expansion pairs from a large corpus has garnered considerable research attention, particularly in the fields of text mining, entity extraction, and information retrieval. Herein, we p...

Journal: Information

Distributed Centrality Analysis of Social Network Data Using MapReduce

Ranjan Kumar Behera, Santanu Kumar Rath, Sanjay Misra, Robertas Damaševičius and Rytis Maskeliūnas

Analyzing the structure of a social network helps in gaining insights into interactions and relationships among users while revealing the patterns of their online behavior. Network centrality is a metric of importance of a network node in a network, whic...

Journal: Algorithms

A Cloud-based Malware Detection Framework

Eman Ahmed, Amin A. Sorrour, Mohamed A. Sobh, Ayman M. Bahaa-Eldin. pp. 113 - 127

Malwares are increasing rapidly. The nature of distribution and effects of malwares attacking several applications requires a real-time response. Therefore, a high performance detection platform is required. In this paper, Hadoop is utilized to perform s...

Journal: International Journal of Interactive Mobile Technologies (iJIM)

Development of a Prototype Web-Based Decision Support System for Watershed Management

Dejian Zhang, Xingwei Chen and Huaxia Yao

Using distributed hydrological models to evaluate the effectiveness of reducing non-point source pollution by applying best management practices (BMPs) is an important support to decision making for watershed management. However, complex interfaces and t...

Journal: Water
