Los Alamos Message-Passing Interface
Application Communications and Performance Team, CCS-1, LANL
23 December 2005
Version 1.5.13
mpirun
- run MPI programs using the Los Alamos Message-Passing Interface, LA-MPI.
Table of Contents
mpirun
[-n|-np
#,...]
-N #
-H hostlist
[...] prog
[args ...]
mpirun
-h|-help
mpirun
-version
mpirun
starts an MPI application compiled and linked against the
LA-MPI implementation of MPI.
LA-MPI is a network-fault-tolerant message-passing system designed for
the challenges of the terascale cluster environment.
Command-line options may start with one or more dashes. Options
may also be specied in a configuration file ($HOME/.lampi.conf)
using the associated variable (see "Advanced options" below).
prog
must be either a single executable or a
comma-delimited list of executables for each host or machine (which is
not supported for all environments). Executable arguments apply to all
specified executables.
- -h, -help
-
Print help and exit.
- -n|-np number of processes
- A single value
specifies the total number of processes, while a comma-delimited
list of numbers specifies the number of processes on each host or
machine.
- -N number of hosts
- The number of hosts to use.
- -H hostlist
- A comma-delimited list of name(s) of
hosts to use.
- -t,-p
- Turn on tagging of standard output and error lines
with either GR[HR.LR.HN] (GR = global rank, HR = host rank, LR =
local rank, HN = host name), or simply HR (on RMS/prun systems).
- -v
- Turn on verbose debugging information.
- -V, -version
-
Print version and exit.
- -local
-
Run on the local host using fork/exec to spawn the processes.
- -threads
- Turn on thread safe code.
- -crc
- Use 32-bit CRCs instead of 32-bit additive checksums
(the LA-MPI default) for application to application data integrity
where applicable.
- -qf,-mf,-ib flaglist
- A comma-delimited list of
keywords that affect operation on, respectively Quadrics,
Myrinet(GM) and InfiniBand networks. The keywords that are supported
are "noack", "ack", "nochecksum", and "checksum". "noack" disables
retransmission of data, but allows data sends to complete upon local
notification by the network adapter. "nochecksum" disables
all LA-MPI CRC/checksum generation and verification; in this mode,
data integrity guarantees are provided only at the network adapter
level. The defaults are "ack", and "checksum" for a LA-MPI library
built with reliability mode enabled.
- -N,-nhosts ARG,...
-
Number of hosts (machines)
Configuration file variable: HostCount
- -H, -hostlist ARG,...
-
List of hosts (machines)
Configuration file variable: HostList
- -machinefile ARG,...
-
File containing list of hosts (machines)
Configuration file variable: MachineFile
- -n, -np, -nprocs ARG,...
-
Number of processes total or on each host (machine)
Configuration file variable: Procs
- -f, -no-arg-check
-
No checking of MPI arguments
Configuration file variable: NoArgCheck
- -t, -p, -output-prefix
-
Prefix standard output and error with helpful information
Configuration file variable: OutputPrefix
- -q, -quiet
-
Suppress start-up messages
Configuration file variable: Quiet
- -v, -verbose
-
Extremely verbose progress information
Configuration file variable: Verbose
- -w, -warn
-
Enable warning messages
Configuration file variable: Warn
- -i, -interface ARG,...
-
List of interface names to be used by TCP/UDP paths
Configuration file variable: InterfaceList
- -ni, -interfaces ARG,...
-
Maximum number of interfaces to be used by TCP/UDP paths
Configuration file variable: Interfaces
- -s, -mpirun-hostname ARG,...
-
A comma-delimited list of preferred IP interface name
fragments (whole, suffix, or prefix) or addresses for TCP/IP
administrative and UDP/IP data traffic.
Configuration file variable: MpirunHostname
- -nolsf
-
No LSF control will be used in job
Configuration file variable: NoLSF
- -cpulist ARG,...
-
Which CPUs to use (when available)
Configuration file variable: CpuList
- -d, -working-dir ARG,...
-
Current directory
Configuration file variable: WorkingDir
- -rusage
-
Print resource usage (when available)
Configuration file variable: PrintRusage
- -dapp ARG,...
-
Absolute path to binary
Configuration file variable: DirectoryOfBinary
- -e, -exe ARG,...
-
Binary name
Configuration file variable: Executable
- -dev ARG,...
-
Type of network. A colon-delimited list of one or more network
paths to use for this MPI job. Currently the choices are some
combination of tcp
(TCP/IP), udp
(UDP/IP), gm
(Myrinet/GM), quadrics
(QsNet/Elan3) and ib
(InfiniBand), depending on your installation.
Configuration file variable: NetworkDevices
- -debug-daemon
-
Wait in LA-MPI daemon for debugger to attach (when applicable)
Configuration file variable: DebugDaemon
- -env ARG,...
-
List environment variables and settings
Configuration file variable: EnvVars
- -retry ARG,...
-
Number of times to retry for resources
Configuration file variable: MaxRetryWhenNoResource
- -local
-
Run on the local host using fork/exec to spawn the processes
Configuration file variable: Local
- -ssh
-
use SSH instead of RSH to create remote processes
Configuration file variable: UseSSH
- -threads
-
Threads used in job
Configuration file variable: NoThreads
- -timeout ARG,...
-
Time in seconds to wait for client applications to connect back
Configuration file variable: ConnectTimeout
- -heartbeat-period ARG,...
-
Period in seconds of heartbeats between mpirun and daemon processes
Configuration file variable: HeartbeatPeriod
- -crc
-
Use CRCs instead of checksums
Configuration file variable: UseCRC
- -list-options
-
Print detailed list of all options and exit
Configuration file variable: ListOptions
- 1. mpirun prog args
- On systems with LSF, this
command can be used to start prog args
on some number of
hosts and processors preassigned by LSF. Note: In some
installations LA-MPI does not interact directly with LSF, so that a
-np NPROCS
option may be required.
- 2. mpirun -n 64 prog args
- On systems with LSF, this
command will start 64 processes of prog args
on either all
or some subset of preassigned hosts and processors. If it is a
subset, then mpirun attempts to use all of the preassigned
processors on a host before allocating from the next reserved host's
allocation. On systems with rsh spawning only, this command will
start 64 processes on the local host or machine.
- 3. mpirun -n 64 -N 40 prog args
- On systems with
standard LSF, this command will start 64 processes on 40 preassigned
hosts as explained in example #2. On systems with LSF/RMS, mpirun
will start 64 processes on 40 hosts using prun(1)
in a block
assignment mode. This command will not work on systems with rsh
spawning only, as there is no specific host information available.
- 4. mpirun -n 66 -H n1,n2,n3,n4 prog args
- This
command will start 66 processes on the 4 named hosts. If the system
supports LSF, then the assignment will be made using the processors
available on each host. If the system only support rsh spawning,
then mpirun will assign as close as possible the same number of
processes per host (in this example, 17 processes will be started on
n1 and n2, and 16 on n3 and n4).
- 5. mpirun -dev tcp -t -np 2,10,30,22 -H n1,n2,n3,n4 prog args
-
On systems with rsh spawning, this command will
start 2 processes on n1, 10 on n2, 30 on n3, and 22 on n4. Prefix
tagging of standard output and standard error will be done, and only
TCP/IP will be used for intercommunication between hosts (if
available).
- mpirun
- Job spawner using LA-MPI.
- libmpi.so
- LA-MPI shared-object library.
- mpi.h
- LA-MPI main C header file.
- mpif.h
- LA-MPI main Fortran header file.
- $MPI_ROOT/etc/lampi.conf
- System LA-MPI configuration file.
- $HOME/.lampi.conf
- User LA-MPI configuration file.
Instead of using command-line options to mpirun, administrators and
users can also set the default behavior of mpirun
by defining
variables in the system and user configuration files,
$MPI_ROOT/lampi.conf
and $HOME/.lampi.conf
file.
For example, the configuration file
Quiet 1
OutputPrefix 1
MyrinetFlags noack,nochecksum
is equivalent to setting mpirun -q -t -mf noack,nochecksum.
The list of available configuration file variables can be
obtained from mpirun -help.
mpirun
exits when all of the processes in the parallel program
have exited successfully (with exit status 0), when one or more
processes exit with non-zero exit status, when one process has been
killed by an unhandled signal, or if an uncorrectable user or system
error is detected.
If all processes exit successfully then the exit status of
mpirun
is 0. If some of the processes exit with non-zero
status, then the exit status of mpirun
is the binary OR of
their individual exit statuses. If one of the processes is killed by
an unhandled signal, the exit status of mpirun
is set to 128
plus the signal number.
A number of special exit codes are returned to indicate LA-MPI
system-specific events:
- 122
- Run-time environment error (e.g. a LA-MPI daemon died)
- 123
- MPI_Abort() was called
- 124
- System error
- 125
- Session expired (e.g. LSF)
- 126
- Resource not available (e.g. asking for more than allocated)
- 127
- Invalid arguments
Note that a few unusual shells treat exit codes of 128 or above
eccentrically. For example the Tru64 csh sets $status to signal
number minus 128, so that if a parallel application is killed with
signal 11 the exit status will be -117, and not 139 as would be the
case with most shells.
On error, LA-MPI attempts to provide detailed diagnostic messages
which are written both to standard error and the file lampi.log
in the application's working directory. Occasionally, if a job dies
unexpectedly, diagnostic error messages may not be delivered to
stderr, but are likely to be successfully written to lampi.log.
Additional warning messages can be enabled with mpirun -w.
Extremely verbose runtime information can be generated with
mpirun -verbose,
which should generally be avoided, but may be
useful for checking the sanity of an installation.
On systems that support job spawning via rsh
or ssh,
make sure that your shell startup environment (e.g., .cshrc, .bashrc
etc.) correctly set the PATH and LD_LIBRARY_PATH environment
variables.
The LA-MPI project has now merged with Open MPI
(http://www.open-mpi.org).
Open MPI version 1.0.0 was released
at SC|05, Seattle in November 2005.
LA-MPI will continue to be maintained until Open MPI has sufficiently
matured for production use on terascale platforms.
Version: 1.5.13 of 23 December 2005.
Please report bugs to lampi-support@lanl.gov.
Copyright 2002-2005. The Regents of the University of
California. This material was produced under U.S. Government contract
W-7405-ENG-36 for Los Alamos National Laboratory, which is operated by
the University of California for the U.S. Department of Energy. The
Government is granted for itself and others acting on its behalf a
paid-up, nonexclusive, irrevocable worldwide license in this material
to reproduce, prepare derivative works, and perform publicly and
display publicly. Beginning five (5) years after October 10,2002
subject to additional five-year worldwide renewals, the Government is
granted for itself and others acting on its behalf a paid-up,
nonexclusive, irrevocable worldwide license in this material to
reproduce, prepare derivative works, distribute copies to the public,
perform publicly and display publicly, and to permit others to do
so. NEITHER THE UNITED STATES NOR THE UNITED STATES DEPARTMENT OF
ENERGY, NOR THE UNIVERSITY OF CALIFORNIA, NOR ANY OF THEIR EMPLOYEES,
MAKES ANY WARRANTY, EXPRESS OR IMPLIED, OR ASSUMES ANY LEGAL LIABILITY
OR RESPONSIBILITY FOR THE ACCURACY, COMPLETENESS, OR USEFULNESS OF ANY
INFORMATION, APPARATUS, PRODUCT, OR PROCESS DISCLOSED, OR REPRESENTS
THAT ITS USE WOULD NOT INFRINGE PRIVATELY OWNED RIGHTS.
Additionally, this program is free software; you can distribute it
and/or modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either version 2
of the License, or any later version. Accordingly, this program is
distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
License for more details.
LA-MPI has been approved for release with associated LA-CC Number
LA-CC-02-41.
LA-MPI is developed by the Application Communications and Performance
Team, CCS-1 at Los Alamos National Laboratory. Contributors have
included Rob T. Aulwes, Laura Casswell, David J. Daniel, Denis
G. Dimick, Nehal N. Desai, Richard L. Graham, L. Dean Risinger,
Mitchel W. Sukalski, Timothy S. Woodall, and Ginger A. Young
Contact: lampi-support@lanl.gov