Mechanisms and techniques for scheduling in supercomputers

Tesis doctoral de José Antonio Pascual Saiz

This thesis analyzes the performance of the scheduling process in space-shared, large-scale supercomputers. These systems are specifically designed to run fine-grained parallel applications in which the communications/computation ratio is high. The way of using the interconnection network has a significant bearing on applications performance and, therefore, on the overall system performance. The scheduling process can be divided into three stages, driven by a set of policies or strategies. Assuming that users send parallel jobs to a single scheduling queue, (1) a job is selected to run, the (2) the resources (set of nodes) required by the job have to be located in the system and reserved for the job, and (3) job task have to be mapped onto the selected nodes. This dissertation studies ways of improving the performance of the scheduling process focusing on stages 2 (partitioning) and 3 (mapping). In particular we use contiguous partitioning as the strategy to assign partitions to jobs. Contiguous partitioning strategies have a well-known disadvantage: high fragmentation that results in low levels of system utilization. However they provide jobs with a running environment that, due to the locality of communications and the lack of interference with other running jobs, substantially reduce running times. In order to effectively exploit these advantages, an appropriate task-to-node mapping has to be implemented. Through extensive simulation-based experimentation, it is demonstrated that combinations of consecutive partitioning and application-aware mappings achieve excellent job throughput, compared to non-contiguous partitioning alternatives. Although the main topic of our work is contiguous partitioning, we have also explored some aspects of non-contiguous partitioning strategies, due to its common use in production environments. Regarding system topology, we mainly focus on cube-shaped topologies such as meshes and tori, but we also study alternative partitioning strategies for tree topologies.

 

Datos académicos de la tesis doctoral «Mechanisms and techniques for scheduling in supercomputers«

  • Título de la tesis:  Mechanisms and techniques for scheduling in supercomputers
  • Autor:  José Antonio Pascual Saiz
  • Universidad:  País vasco/euskal herriko unibertsitatea
  • Fecha de lectura de la tesis:  28/06/2013

 

Dirección y tribunal

  • Director de la tesis
    • José Antonio Lozano Alonso
  • Tribunal
    • Presidente del tribunal: clemente Rodriguez lafuente
    • Javier Navaridas palma (vocal)
    • José angel Gregorio monasterio (vocal)
    • Francisco Fernández de vega (vocal)

 

Deja un comentario

Tu dirección de correo electrónico no será publicada. Los campos obligatorios están marcados con *

Scroll al inicio