ARTICLE
TITLE

Methods of integration, reduction of sizes and normalization of processing of heterogeneous and multi-scale data

R. A. Bagutdinov    
M. F. Stepanov    

Abstract

The paper analyzes existing methods for processing big data that can be applied to heterogeneous and multi-scale data. In this work, heterogeneous data means any data with high variability in type, format, and origin. Such data can be ambiguous and of poor quality because of missing values, high redundancy, or unreliability. This makes it difficult to integrate and aggregate the data for further processing or for making specific decisions. Of particular interest are the acquisition of knowledge from autonomous, semantically heterogeneous, and distributed data sources, and query-oriented approaches to data integration. A lack of integrity in such data is usually associated with invalid or incomplete data. Data consistency is the most critical issue in continuous auditing systems for big data; it concerns data that is interdependent across applications and across the entire organization. Analyzing large, heterogeneous data can be problematic because it often involves collecting and storing mixed data based on different patterns or rules. The context of the data and its description play an important role here. The authors therefore consider the relevant aspects of data processing and the choice of processing methods for heterogeneous data, including data cleansing, data integration, size reduction, and normalization, together with the corresponding system and analytical analysis. The potential for fusion of heterogeneous data is also considered. The paper describes some of the advantages and disadvantages of the most commonly used methods for processing heterogeneous data and identifies the problems of processing heterogeneous and multi-scale data. It also presents tools for processing big data and some traditional data mining methods, including machine learning.
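As an illustration of two of the steps named above, cleansing and normalization, the following is a minimal sketch in Python. It is not the authors' method: the function names, the choice of mean imputation, and the sample values are all assumptions made for this example. It shows how two measurements on very different scales can be brought onto a common range once missing values are filled.

```python
from statistics import mean, pstdev


def impute_missing(values):
    """Cleansing step (assumed strategy): replace None with the mean of observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no scale information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center values on their mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical readings from two sources with very different scales.
temperature_c = [20.0, None, 22.0, 24.0]
pressure_pa = [100000.0, 101000.0, None, 102000.0]

temperature_scaled = min_max(impute_missing(temperature_c))
pressure_scaled = min_max(impute_missing(pressure_pa))
```

After this step both columns lie in [0, 1], so neither dominates a distance-based or gradient-based method merely because of its units. Mean imputation is only one possible cleansing choice; it reduces variance and may be unsuitable when values are not missing at random.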
