COSHELL(1)                USER COMMANDS                COSHELL(1)


NAME

coshell - network shell coprocess server

SYNOPSIS

coshell + [ info ... ]

coshell -

coshell -op [ arg ... ]

DESCRIPTION

coshell is a local network shell coprocess server for programs using coshell(3). There is one coshell server per user. This server runs as a daemon on the user's home host, and only processes running on the home host have access to the server. The server controls a background ksh(1) shell process, initiated by rsh(1), on each of the connected hosts. The environment of the local host shell is inherited from the server whereas the environment of remote shells is initialized by .profile and $ENV. The shells run with the ksh bgnice and monitor options on.

Job requests are accepted from user processes on the local host and are executed on the connected hosts. stdout, stderr, FPATH, NPROC (see ENVIRONMENT), PWD, PATH, VPATH, vpath, umask and the environment variables listed in COEXPORT (see ENVIRONMENT) are set to match requesting user values. stdin is set to /dev/null; coshell does not directly support interactive jobs. Job scheduling is based on load and idle time information generated by the ss(1) system status daemon. This information is updated every 60 seconds on average.

The server is started by running coshell +. The command exits after a child server process is forked in the background. The optional info arguments name files containing local network host information, which may be generated by the two shell scripts genlocal and genshare in the bin subdirectory of the installation root directory. If no files are specified then the default file local is used. The local network consists of the hosts that share the same file name space.

Attributes used by coshell fall into two types, global and host-specific. The global attributes control coshell and are not associated with any particular host. Attribute value pairs, not including readonly ones, may be specified in the local network host information files or in COATTRIBUTES (see ENVIRONMENT), may be set or added using coshell -a, and may be referenced in an expression in COATTRIBUTES. Attribute names must match [a-zA-Z_][a-zA-Z_0-9]*. In the following description of these attributes, host may be an actual host name or a comma separated list of attribute value pairs specified in COATTRIBUTES.
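The naming rule above can be checked before an attribute is written to a host information file. The following is a minimal sketch in POSIX sh; is_attr_name is a hypothetical helper, not part of coshell, and it tests only the name pattern, not whether coshell accepts the attribute.

```sh
#!/bin/sh
# Sketch: check a candidate attribute name against [a-zA-Z_][a-zA-Z_0-9]*.
# is_attr_name is a hypothetical helper, not part of coshell.
LC_ALL=C; export LC_ALL    # keep the bracket ranges ASCII-only

is_attr_name() {
    case $1 in
        '')              return 1 ;;   # empty name
        [!a-zA-Z_]*)     return 1 ;;   # first character not a letter or _
        *[!a-zA-Z_0-9]*) return 1 ;;   # some character not a letter, digit or _
        *)               return 0 ;;
    esac
}

for name in rating mem_size 9lives my-attr; do
    if is_attr_name "$name"; then
        echo "$name: ok"
    else
        echo "$name: invalid"
    fi
done
```

With these four names the loop reports rating and mem_size as ok and 9lives and my-attr as invalid.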

The attributes used by coshell are:

auto=n

auto=1 adds the host to the automatic selection pool. Hosts named in the file given by the file global attribute have auto=1 by default.
bias=n.m

The host scheduling bias. Hosts with a higher bias are (linearly) less likely to be scheduled for job execution. The default bias is 1.00.
busy=time

(global attribute) The grace period for jobs running on busy hosts. A host is busy when its idle attribute is non-zero and its minimum logged in user idle time is less than the value of busy. For a job running on an idle host, busy is the maximum amount of time the job may run after the host becomes busy. If the job does not finish within busy, the SIGSTOP signal is sent to the job and the job stops. When the host idle time exceeds busy, the SIGCONT signal is sent to the job and the job resumes. The time unit may be m (minutes) or h (hours). The default is busy=2m.
cpu=n

The number of cpus on the host. The default is 1.
file=path

(global attribute) Names a file containing default attributes for machines on the local network. If no directory components are specified then the subdirectory share/lib/cs of the installation root directory is searched. The default attribute file for the local network is share/lib/cs/local.
idle=time

The minimum logged in user idle time before jobs will be scheduled on the host. The time unit may be m (minutes) or h (hours). The default is 0, meaning no idle time restrictions. idle is usually 15m for workstations and is not specified (i.e., always available) for compute servers.
label=string

string labels either the current coshell connection (via coopen in coshell(3)) or the current job (via coexec in coshell(3)). Labels are displayed but are otherwise ignored.
load=n.m

A readonly attribute that evaluates to the host load average.
maxload=n.m

(global attribute) The maximum host load average. No job will be scheduled on a host with load average >= maxload. The default maxload=0 means no load average limit.
name=host

The host name in the local domain (i.e., no .'s in the name). The name= may be omitted. In a host selection context: host may be a sh(1) file match pattern; - matches any host; local matches the local host.
open=fd

A readonly attribute that evaluates to 0 if the host shell is closed, <0 if the host shell is being opened, and >0 if the host shell is open.
percpu=n

(global attribute) The maximum number of concurrent jobs on each cpu. The default is 3.
perserver=n

(global attribute) The maximum number of concurrent jobs run by coshell. perserver has an upper limit that is silently enforced: half the number of file descriptors allowed. perserver=0 queues jobs until perserver>0.
peruser=n

(global attribute) The maximum number of concurrent jobs per user connection to coshell. The default is 12.
pool=n

(global attribute) The number of cpus in the processor pool.
rating=n.m

The host cpu rating, usually in mips relative to the other hosts on the local network. This is usually the observed rating rather than the one in the vendor's advertisements.
type=string

The host type that differentiates different processor types, usually related to the object and executable attributes. The default type is *.
up=n
A readonly attribute that evaluates to the number of seconds the host has been up. If n is less than 0 then it is the number of seconds the host has been down.

Other user-defined attributes may be specified. They may be referenced in COATTRIBUTES expressions, but are otherwise ignored by coshell.
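The time-valued attributes (busy, idle) and the busy-host rule described under busy can be sketched in POSIX sh as follows. Both functions are hypothetical helpers written for illustration, not coshell code. Treating a value with no m or h suffix as a count of seconds is an assumption; the text above only names the m and h units.

```sh
#!/bin/sh
# to_seconds VALUE: convert a time value such as 15m or 2h to seconds.
# Assumption: a value with no suffix is already a number of seconds.
to_seconds() {
    case $1 in
        *m) echo $(( ${1%m} * 60 )) ;;
        *h) echo $(( ${1%h} * 3600 )) ;;
        *)  echo $(( $1 )) ;;
    esac
}

# host_is_busy IDLE_ATTR MIN_USER_IDLE BUSY
# Succeeds when the host counts as busy under the rule in the busy entry:
# the idle attribute is non-zero and the minimum logged in user idle time
# is less than busy.
host_is_busy() {
    idle_attr=$(to_seconds "$1")
    user_idle=$(to_seconds "$2")
    busy=$(to_seconds "$3")
    [ "$idle_attr" -ne 0 ] && [ "$user_idle" -lt "$busy" ]
}

# A workstation with idle=15m whose user typed one minute ago, busy=2m.
if host_is_busy 15m 1m 2m; then echo busy; else echo "not busy"; fi
# A compute server with no idle restriction is never busy in this sense.
if host_is_busy 0 1m 2m; then echo busy; else echo "not busy"; fi
```

The first call reports busy and the second reports not busy, matching the description that compute servers without an idle attribute are always available.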

coshell - opens an interactive connection to the running server. The commands are:

a host[,attributes ...]

Set or add attributes for the named hosts.
c host ...

Close the shell connections to the named hosts. The hosts are also removed from the automatic selection pool.
d [level]

Set the server stderr to the stderr of the calling process. If level is specified then the server debug trace level is set to -level. Higher debug levels produce more output on stderr.
f [fd]

This is a debugging option and may not be present in all implementations. If fd is specified then close the internal server file descriptor fd, otherwise list the status of all open file descriptors in the server.
g
List global state.
h
List command help.
j
List the status of all jobs. The status fields are:

JOB
The id assigned to the job by the server. This number may be used as an argument to the k command.
USR
The id assigned to the requesting user by the server.
RID
The id assigned to the job by the requesting user.
PID
The job process id, QUEUE if the shell is in the process of opening, START if the PID has not been determined yet, and WARPED if the job completed before its PID was determined.
TIME
The elapsed time since the job started. * follows the time if the job is about to terminate.
HOST
The host where the job is running. The most recent signal sent to the job follows the host name.
LABEL

The label assigned to the job by the requesting user.

k [ c | k | s | t ] job

Kill the job with the server JOB id job. If no argument is specified then the SIGTERM signal is sent to the job. c sends SIGCONT, k sends SIGKILL, s sends SIGSTOP, and t sends SIGTERM.
l expr

List all host names matching the attribute expression expr. The names are sorted in scheduling rank order from best to worst. If pool=n is specified in expr then only the first n names (after sorting) are listed.
o host ...

Open a shell connection to the named hosts.
q
Quit the interactive connection.
Q
Kill the server and quit the interactive connection.
r host [ command ]

Run command on host. host may be an attribute expression. If command is omitted then hostname(1) is used.
s [ a | e | l | o | p | s | t ]

List the shell connection status. There is at most one shell connection per host. If no argument is specified then only open connections are listed. a lists the attributes for all shells, e lists all shells, l lists all shells in the processor pool, o lists all open shells, p lists the process id of all open shells, s lists the shell scheduling status (primarily for debugging), and t lists all open shells sorted by the recent job activities running on each host.

The status fields for se and sl are:

CON
The id assigned to the open shell by the server, \@ if the shell is not open and is not in the processor pool, - if the shell is not open, and + if an open is in progress.
JOBS
The number of jobs currently running on the host. * follows the number if any of the jobs are queued pending the completion of an open in progress.
TOTAL

The total number of jobs run on the host.
USER
The accumulated user time (times(2)) of all jobs on the host.
SYS
The accumulated sys time (times(2)) of all jobs on the host.
IDLE
The elapsed time since the most recent logged in user activity. * follows the time if the host does not meet the processor pool idle time requirements.
CPU
The number of cpus on the host.
LOAD
The host load average.
RATING

The host rating, usually in network relative mips.
BIAS
The scheduling bias. Hosts with lower bias are more likely to be scheduled.
TYPE
The host type, usually related to object and executable attributes.
HOST
The host name.

The status fields for so, ss, and st are:


CON
The id assigned to the open shell by the server, \@ if the shell is not open and is not in the processor pool, - if the shell is not open, and + if an open is in progress.
OPEN
The accumulated number of times the server has connected to the host.
USERS

The current number of active users.
UP
The amount of time the host has been up.
CONNECT

The amount of time the server has been connected to the host.
UPDATE

The amount of time before the host status information is out-of-date.
OVERRIDE

The amount of time the host connection is kept, followed by the host identification code: 1 for the local host, 0 for other hosts in the network.
IDLE
The specified idle time.
TEMP
A measure of the recent job activities running on the host.
RANK
A measure of the desirability of the host. This takes the idle time restriction, the load average, and the number of cpus into account. The two digits after the decimal point are random and are used to break ties between different coshell servers. Hosts with lower RANK are more likely to be scheduled.
HOST
The host name.

t
List the accumulated totals. The fields are:

SHELLS

The number of active shell connections followed by the total number of successful shell connections.
USERS

The number of active user connections followed by the total number of successful user connections.
JOBS
The number of active jobs followed by the total number of jobs run.
CMDS
The number of server-user transactions.
UP
The elapsed time since the server started.
REAL
The elapsed time during which the USER and SYS times were accumulated.
USER
The accumulated user time for all jobs on all hosts.
SYS
The accumulated sys time for all jobs on all hosts.
CPU
The number of cpus available on all connected hosts followed by the processor pool cpu limit plus the explicit host override count. An override host is a connected host that does not meet the processor pool idle time requirements.
LOAD
The load average, averaged over all connected hosts.
RATING

The host rating, averaged over all connected hosts.

u
List connected user status. The status fields are:

CON
The id assigned to the user connection by the server.
PID
The user process id.
JOBS
The number of jobs currently running on behalf of the user.
TOTAL

The total number of jobs requested by the user.
TTY
The user process stderr file name.
LABEL

The label assigned to the connection by the requesting user.

v
List the server version stamp.

The interactive commands are useful for tuning global attribute values. For example, one could set NPROC to 100, export it, and then control the number of jobs actually executed using the coshell interactive command:

     	coshell> a local,peruser=10,perserver=40

The interactive commands may be used as options for non-interactive coshell queries. For example, coshell -sl produces a long shell status listing and coshell -c dodo closes the shell connection to the host dodo.
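A small wrapper makes the mapping from interactive command to non-interactive option explicit. This is a sketch only: co_query and the COSHELL_CMD override are hypothetical names introduced here so that the function can be tried without a running server. The only behavior taken from the text above is that an interactive command becomes an option by prefixing it with -.

```sh
#!/bin/sh
# co_query COMMAND [ARG ...]: run one interactive coshell command as a
# non-interactive query by prefixing it with "-".
# COSHELL_CMD is a hypothetical override used for dry runs and testing.
co_query() {
    if [ $# -lt 1 ]; then
        echo "usage: co_query command [arg ...]" >&2
        return 2
    fi
    op=$1
    shift
    ${COSHELL_CMD:-coshell} "-$op" "$@"
}

# Dry run: print the command line instead of contacting a server.
show() { printf '%s\n' "coshell $*"; }
COSHELL_CMD=show
co_query sl          # prints: coshell -sl
co_query c dodo      # prints: coshell -c dodo
```

Unsetting COSHELL_CMD makes the same calls run the real coshell command found on PATH.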

EXAMPLES

The following environment variables must be set if coshell is installed in a non-standard directory (not /bin, /usr/bin, or /usr/local/bin):
root=<coshell-installation-root-directory>
export PATH=$root/bin:$PATH
If coshell is dynamically linked, the LD_LIBRARY_PATH environment variable needs to be set.
export LD_LIBRARY_PATH=$root/lib:$LD_LIBRARY_PATH

The following two commands are used to generate the local network host information which is shared among all the coshell users and only needs to be generated once unless this information needs to be updated. If you run into permission problems, contact your system administrator.

genshare > $root/lib/cs/share
genlocal > $root/share/lib/cs/local
The genshare command is run first to generate information on servers for the network. By default, this information is stored in $root/lib/cs/share. Based on this information, genlocal is run to generate the local host attribute file. By default, this information is stored in $root/share/lib/cs/local. If the share file generated by the genshare command is not stored in the default path, you need to pass its path to the genlocal command using the -f option.

You may also modify the generated files to meet your needs.

A sample local host attribute file follows:

#
# local host attributes
#
local	pool=8	bias=4	busy=1m
server	type=sun4	rating=20
cruncher	type=mips	rating=30	cpu=20
station	type=sun3	rating=6	idle=15m
token	type=3b	rating=0.1	idle=15m
The local entry sets the processor pool size to 8, the local host bias to 4, and the busy host grace period to 1 minute. Compute servers that are available to all users usually have no idle attribute whereas personal workstations are given at least idle=15m out of courtesy to the workstation owner.
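When editing a generated attribute file by hand, it can help to read back a single value. The sketch below does this with awk for the simple "host name=value ..." layout of the sample file. host_attr is a hypothetical helper, not a coshell command, and it makes no attempt to handle any other syntax the real file format may allow.

```sh
#!/bin/sh
# host_attr FILE HOST ATTR: print the value of ATTR on HOST's line in FILE.
# Hypothetical helper; understands only "host name=value ..." lines.
host_attr() {
    awk -v host="$2" -v attr="$3" '
        NF == 0 || $1 ~ /^#/ { next }       # skip blank and comment lines
        $1 == host {
            for (i = 2; i <= NF; i++) {
                eq = index($i, "=")
                if (eq > 0 && substr($i, 1, eq - 1) == attr) {
                    print substr($i, eq + 1)
                    found = 1
                }
            }
        }
        END { exit found ? 0 : 1 }          # non-zero status if not found
    ' "$1"
}

# Try it on a private copy of the sample file shown above.
sample=${TMPDIR:-/tmp}/cs-local.$$
cat > "$sample" <<'EOF'
#
# local host attributes
#
local pool=8 bias=4 busy=1m
server type=sun4 rating=20
cruncher type=mips rating=30 cpu=20
station type=sun3 rating=6 idle=15m
token type=3b rating=0.1 idle=15m
EOF

host_attr "$sample" cruncher cpu      # prints 20
host_attr "$sample" station idle      # prints 15m
rm -f "$sample"
```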

The following starts the coshell server. The processor pool size is taken from the local host attribute file.

coshell +
The following instructs programs using coshell(3) to use coshell rather than ksh or sh for command execution and sets the command execution concurrency level to 8.
export COSHELL=coshell
export NPROC=8

The shell function cosh provides a convenient interface for common coshell actions:

export FPATH=$root/fun:$FPATH
# start coshell, export COSHELL,NPROC, and set window title
cosh
coshell (AT&T Bell Laboratories) 10/11/93  [first time only]
# run hostname on best host
on - hostname
dodo
# interact with server ...
cosh -
coshell>

CAVEATS

A coshell connect stream file is created in the /tmp/cs directory. Some systems:
(1)
do not update the times on the connect stream file when it is accessed
(2)
automatically remove stale files from /tmp
(3)
fail to generate a poll(2) or select(2) event when the connect stream file is removed
(4)
do not handle mounted streams or sockets.

In any of these cases, the environment variable CS_MOUNT_LOCAL needs to be set to another file system where all the users have read and write permissions. For example:

export CS_MOUNT_LOCAL=<coshell-installation-root-directory>/tmp
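Before exporting CS_MOUNT_LOCAL it may be worth confirming that the chosen directory is usable. The sketch below checks only that the directory exists and is readable and writable by the current user; use_cs_mount is a hypothetical helper, and verifying access for all users is left to the administrator.

```sh
#!/bin/sh
# use_cs_mount DIR: export CS_MOUNT_LOCAL=DIR if DIR is a directory that
# the current user can read and write. Hypothetical helper, not part of
# coshell; it does not check permissions for other users.
use_cs_mount() {
    if [ -d "$1" ] && [ -r "$1" ] && [ -w "$1" ]; then
        CS_MOUNT_LOCAL=$1
        export CS_MOUNT_LOCAL
    else
        echo "use_cs_mount: $1 is not a readable, writable directory" >&2
        return 1
    fi
}

# Example call, assuming root names the installation root directory:
#   use_cs_mount "$root/tmp" && coshell +
```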

On some systems the server may not detect that its connect stream file has been unlinked, resulting in erroneous `server not running' errors. To handle this situation the server checks and recreates the connect stream file on receipt of a SIGINT signal.

NFS cache inconsistencies may arise for files generated via NFS on remote hosts but serviced via the native file system on the local host. Running coshell from a diskless host avoids the problem.

Host load average and logged in user idle times are used to schedule hosts and jobs. Some terminal lock programs, e.g., xlock(1), inflate the load average, usually by doing complex graphics operations on displays that have long since been blanked out by an independent screen saver. A simple lock program that blocks on a read request may open up idle cycles for better use.

ENVIRONMENT


COATTRIBUTES

Host attribute expression, (type@local) by default. Non-numeric valued attributes may appear as the first operand of the comparison operators <, <=, ==, !=, >= and >, where the second operand must be a "..." or '...' string that is compared with the attribute value. For the == and != operators the second operand is taken to be a ksh(1) file match pattern. For example, given the host definitions:
coot	type=sun4	mem=8m	rating=11.0	cad
dodo	type=sun3	mem=4m	rating=2.0
loon	type=mips	mem=16m	rating=20.0
(type=='sun*'&&mem>6m) selects coot, (rating>=11.0) selects coot and loon, and (cad) selects coot. attribute@host represents the attribute value for host. For example, type@local matches the type of the host running the coshell server.
COEXPORT

A colon separated list of environment variables to export to each job. This is to support the rare cases where some environment variables change after the coshell server has been started. For example, some commands use environment variables rather than arguments or options to pass input data.
COSHELL

Set to coshell for the network shell service.
COTEMP

Set to a different value for each shell command. It is used for temporary file names (see Engine Variables in nmake(1)). This variable may be referenced in .profile.
HOMEHOST

Set within each action to the name of the host executing coshell.
HOSTNAME

Set within each action to the name of the host executing the action. This variable may be referenced in .profile.
HOSTTYPE

Set within each action to the type (from the local coshell host attribute file) of the host executing the action. This variable may be referenced in .profile.
NPROC

Default command concurrency level.
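The COATTRIBUTES example (type=='sun*'&&mem>6m) can be traced by hand with a short POSIX sh sketch. This is not coshell's expression evaluator: select_hosts is a hypothetical function that hard-codes this one expression, and it assumes every mem value ends in m.

```sh
#!/bin/sh
# select_hosts: read "host attr=value ..." lines on stdin and print the
# hosts for which type matches the pattern sun* and mem exceeds 6m.
# Hypothetical illustration of (type=='sun*'&&mem>6m); assumes mem values
# are whole numbers followed by "m".
select_hosts() {
    while read -r host attrs; do
        htype=
        mem=0
        for a in $attrs; do
            case $a in
                type=*) htype=${a#type=} ;;
                mem=*)  mem=${a#mem=}; mem=${mem%m} ;;
            esac
        done
        case $htype in
            sun*) [ "$mem" -gt 6 ] && echo "$host" ;;   # pattern match, then numeric test
        esac
    done
    return 0
}

select_hosts <<'EOF'
coot type=sun4 mem=8m rating=11.0 cad
dodo type=sun3 mem=4m rating=2.0
loon type=mips mem=16m rating=20.0
EOF
```

Only coot is printed: dodo matches sun* but has mem=4m, and loon has enough memory but is of type mips.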

FILES

share/lib/cs/local

local network host attributes

AUTHOR

Glenn Fowler
glenn.s.fowler@gmail.com
AT&T Bell Laboratories

SEE ALSO

3d(1), ksh(1), nmake(1), rsh(1), ss(1), coshell(3), cs(3)


November 07, 2006