ARTICLE
TITLE

Development of Computational Pipeline Software for Genome/Exome Analysis on the K Computer

Kento Aoyama    
Masanori Kakuta    
Yuri Matsuzaki    
Takashi Ishida    
Masahito Ohue    
Yutaka Akiyama    

Abstract

Pipeline software, which chains tools and applications for specific data-processing tasks, is used extensively in bioinformatics research to analyze several types of data, such as genome data. Recent trends in genome analysis require pipeline software that makes optimal use of computational resources so that the large-scale biological data accumulated daily can be handled efficiently. However, using pipeline software in bioinformatics tends to be problematic owing to its large memory and storage requirements, the growing number of job submissions, and a wide range of software dependencies. This paper presents massively parallel genome/exome analysis pipeline software that addresses these difficulties and can be executed on a large number of K computer nodes. The proposed pipeline incorporates workflow management functionality that takes the task-dependency graph of internal executions into account by extending the dynamic task distribution framework. Evaluation experiments on an actual exome dataset show that the core pipeline functionality scales well when using over a thousand nodes. Additionally, this study proposes several approaches to resolving pipeline performance bottlenecks, a major challenge facing pipeline parallelization, by drawing on domain knowledge of the pipeline's internal executions.
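
The abstract does not describe how the workflow manager is implemented, so the following is only a minimal sketch of the general idea of dependency-aware dynamic task distribution, not the authors' method. It assumes a single machine with a thread pool standing in for compute nodes; the function name `run_task_graph` and the stage names in the usage example are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait


def run_task_graph(tasks, deps, max_workers=4):
    """Run callables in `tasks` (name -> function) while respecting
    `deps` (name -> set of prerequisite task names).

    Hypothetical helper for illustration only. Returns the task names
    in the order in which they completed.
    """
    # Unfinished prerequisites for every task that has not been dispatched yet.
    remaining = {name: set(deps.get(name, ())) for name in tasks}
    finished = []
    running = {}  # future -> task name

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining or running:
            # Dynamically dispatch every task whose prerequisites are all done.
            ready = [name for name, pending in remaining.items() if not pending]
            for name in ready:
                del remaining[name]
                running[pool.submit(tasks[name])] = name

            if not running:
                # Nothing is running and nothing can start.
                raise ValueError("cyclic or unsatisfiable dependencies")

            # Block until at least one running task completes.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                future.result()  # re-raise any exception from the worker
                finished.append(name)
                # Mark this task as satisfied for all waiting tasks.
                for pending in remaining.values():
                    pending.discard(name)
    return finished


if __name__ == "__main__":
    # Hypothetical stages: two independent alignments feed one merge step.
    stages = {
        "align_1": lambda: None,
        "align_2": lambda: None,
        "merge": lambda: None,
    }
    print(run_task_graph(stages, {"merge": {"align_1", "align_2"}}))
```

In this sketch, independent tasks are handed to idle workers as soon as they become runnable, while a dependent task such as `merge` is held back until every prerequisite has finished.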
