SLURM

Slurm Workload Manager

The Slurm Workload Manager, whose name derives from "Simple Linux Utility for Resource Management" (SLURM), is a free and open-source job scheduler for Linux and Unix-like kernels, released under the GNU General Public License. It is used by many of the world's top supercomputers and by HPC and HPDA clusters. It is estimated that Slurm is the workload manager on about 60% of the TOP500 supercomputers. Slurm uses a best-fit algorithm based on Hilbert curve scheduling or fat-tree network topology to optimize the locality of task assignments on parallel computers.


SLURM WORKLOAD MANAGER ENABLES THREE KEY FUNCTIONS:

  • Allocating exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work.
  • Providing a framework for starting, executing, and monitoring work (typically a parallel job such as MPI) on a set of allocated nodes.
  • Arbitrating contention for resources by managing a queue of pending jobs.


What Makes Slurm Workload Manager so Popular?

  • No single point of failure, backup daemons, fault-tolerant job options.
  • Highly scalable (schedules up to 100,000 independent jobs on the 100,000 sockets of IBM Sequoia).
  • High performance (up to 1000 job submissions per second and 600 job executions per second).
  • Free and open-source software (GNU General Public License).
  • Highly configurable with about 100 plugins.
  • Fair-share scheduling with hierarchical bank accounts.
  • Preemptive and gang scheduling (time-slicing of parallel jobs).
  • Integrated with database for accounting and configuration.
  • Resource allocations optimized for network topology and on-node topology (sockets, cores and hyperthreads).
  • Advanced reservation.
  • Idle nodes can be powered down.
  • Different operating systems can be booted for each job.
  • Scheduling for generic resources (e.g. graphics processing units, GPUs).
  • Real-time accounting down to the task level (identify specific tasks with high CPU or memory usage).
  • Resource limits by user or bank account.
  • Accounting for power usage by job.
  • Support of IBM Parallel Environment (PE/POE).
  • Support for job arrays.
  • Job profiling (periodic sampling of each task's CPU use, memory use, power consumption, network and file system use).
  • Accounting for a job's power consumption.
  • Sophisticated multifactor job prioritization algorithms.
  • Support for MapReduce+.
  • Improved job array data structure and scalability.
  • Support for heterogeneous generic resources.
  • User options to set the CPU governor.
  • Automatic job requeue policy based on exit value.
  • Report API use by user, type, count and time consumed.
  • Communication gateway nodes improve scalability.
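
As a rough illustration of the job array support listed above, the following batch script is a minimal sketch, not a recommended template. The job name array-demo and the input_N.dat file naming are hypothetical, and the resource values are placeholders. The #SBATCH lines are read by sbatch at submission time; to plain bash they are ordinary comments.

```shell
#!/bin/bash
# Ask for ten array tasks with indices 0 through 9.
#SBATCH --job-name=array-demo
#SBATCH --array=0-9
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
# %A expands to the array's master job ID, %a to the task index.
#SBATCH --output=array-demo-%A_%a.out

# Slurm sets SLURM_ARRAY_TASK_ID in each array task; default to 0 for a dry run outside Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-0}"

# Hypothetical naming scheme: one input file per array task.
input_file="input_${task_id}.dat"

echo "Array task ${task_id} would process ${input_file}"
```

Each array task runs the same script with a different SLURM_ARRAY_TASK_ID, so the index is usually what selects the input or parameter set.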

List of Slurm Commands

View information about Slurm nodes & partitions

sinfo [-p partition_name or -M cluster_name]

List example SLURM scripts

ls -p /util/slurm-scripts | less

Submit a job script for later execution

sbatch script-file
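
For readers who have not seen a job script before, here is a minimal sketch of what the script file passed to sbatch might contain. The job name hello and all resource values are illustrative assumptions. Check which partitions and limits apply on your cluster before reusing them.

```shell
#!/bin/bash
# Directives read by sbatch at submission time; plain bash treats them as comments.
#SBATCH --job-name=hello
#SBATCH --nodes=1
#SBATCH --ntasks=1
# Wall-clock limit in HH:MM:SS.
#SBATCH --time=00:05:00
# %j expands to the job ID assigned by Slurm.
#SBATCH --output=hello-%j.out

# SLURM_JOB_ID is set by Slurm inside a job; fall back to "none" for a dry run outside Slurm.
job_id="${SLURM_JOB_ID:-none}"

message="Hello from job ${job_id} on $(hostname)"
echo "$message"
```

Saved as, say, hello.sh (a hypothetical file name), it would be submitted with sbatch hello.sh. sbatch prints the assigned job ID, which is the value that scancel and the other job-level commands expect.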

Cancel a pending or running job

scancel jobid

Check the state of a user’s jobs

squeue --user=username
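
The default squeue columns are often narrower than the job names in use. The sketch below picks columns with --format. The column widths are arbitrary choices, and the guard around the call is only there so the snippet does nothing harmful on a machine without Slurm.

```shell
#!/bin/bash
# User to query: first argument if given, otherwise the current user.
target_user="${1:-${USER:-$(id -un)}}"

# squeue format codes: %i job ID, %P partition, %j job name, %T state,
# %M time used, %D node count, %R pending reason or allocated node list.
squeue_format="%.10i %.12P %.20j %.10T %.10M %.6D %R"

if command -v squeue >/dev/null 2>&1; then
    squeue --user="$target_user" --format="$squeue_format"
else
    echo "squeue not found; is Slurm installed and on PATH?" >&2
fi
```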

Allocate compute nodes for interactive use

salloc

Run a command on allocated compute nodes

srun
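
salloc and srun are typically used together: salloc obtains the allocation and starts a shell, and srun launches tasks inside it. The following is a sketch under assumed resource values (2 nodes, 4 tasks, 30 minutes), not a site-specific recipe.

```shell
#!/bin/bash
# Typical interactive flow (type the salloc line yourself at the prompt):
#   salloc --nodes=2 --ntasks=4 --time=00:30:00
# Once the allocation is granted, salloc starts a shell in which SLURM_JOB_ID is set.

if [ -n "${SLURM_JOB_ID:-}" ]; then
    # Inside an allocation: run one copy of hostname per allocated task.
    srun hostname
    allocation_state="inside"
else
    echo "No allocation detected; run salloc first." >&2
    allocation_state="outside"
fi
```

Leaving the shell that salloc started releases the allocation.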

Display node information

snodes [node cluster/partition state]

Launch an interactive job

fisbatch [various sbatch options]

List priorities of queued jobs

sranks

Get the efficiency of a running job

sueff user-name

Get SLURM accounting information for a user’s jobs from start date to now

suacct start-date user-name

Get SLURM accounting and node information for a job

slist jobid
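
Several commands in this list (slist among them) appear to be site-provided wrappers rather than part of the core Slurm distribution, so they may be missing on other clusters. Where accounting is enabled, the standard sacct command reports similar per-job information. The sketch below assumes a placeholder job ID of 12345 and an arbitrary choice of fields.

```shell
#!/bin/bash
# Hypothetical job ID; replace with a real one from sbatch or squeue output.
job_id="${1:-12345}"

# Accounting fields to display; the full list is printed by: sacct --helpformat
sacct_fields="JobID,JobName,Partition,State,ExitCode,Elapsed,MaxRSS,NodeList"

if command -v sacct >/dev/null 2>&1; then
    sacct --jobs="$job_id" --format="$sacct_fields"
else
    echo "sacct not found; is Slurm accounting available on this machine?" >&2
fi
```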

Get resource usage and accounting information for a user’s jobs from start date to now

slogs start-date user-list

Get estimated starting times for queued jobs

stimes [various squeue options]

Monitor performance of a SLURM job

/util/ccrjobvis/slurmjobvis jobid


SchedMD: the Company Behind Slurm Workload Manager

SchedMD® is the core company behind the Slurm workload manager, a free, open-source workload manager designed specifically to satisfy the demanding needs of high-performance computing. Slurm is in widespread use at government laboratories, universities, and companies worldwide. As of the June 2017 TOP500 list, Slurm was performing workload management on six of the ten most powerful computers in the world, including the number 1 system, Sunway TaihuLight, with 10,649,600 computing cores. That made it the preferred choice for workload management among the top ten systems.

SchedMD distributes and maintains the canonical version of Slurm as well as providing Slurm support, development, training, installation, and configuration.

Slurm is a highly configurable open-source workload manager. In its simplest configuration, it can be installed and configured in a few minutes (see Caos NSA and Perceus: All-in-one Cluster Software Stack by Jeffrey B. Layton). Use of optional plugins provides the functionality needed to satisfy the needs of demanding HPC centers. More complex configurations rely upon a database for archiving accounting records, managing resource limits by user or bank account, and supporting sophisticated scheduling algorithms.