This post was originally written by Stephen Phil Hill (Stephen.p.Hill@alcatel-lucent.com) on Oct 8, 2008. An experienced developer in SU (software upgrade), Phil has spent years writing KSH and TCL scripts. Moreover, this KSH coding standard has become a necessary manual/set of rules in ALU for all developers who write shell scripts. It is posted here with Stephen Hill's permission. If you have any questions, comments or suggestions, please leave a message here or email me (dave.tian@alcatel-lucent.com) or Phil.
NOTE: These suggestions, recommendations, and rules are not actual requirements of the project. Instead they are a combination of recommendations that help improve code efficiency and robustness, recommendations that help smooth the code inspection process, and even some style suggestions. Do not be concerned that you must follow these exactly – but be prepared for a slower code inspection if you disregard them totally.
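NOTE: These standards assume ksh93 is being used. Most of the recommendations and the code illustrating them are also compatible with ksh88, but a few features used below (for example pattern substitution, substring expansion and the long-option getopts spec) require ksh93. Compatibility with bash is not guaranteed but many of the rules apply equally well to bash code.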
To determine the ksh version you are using from the command line, simply enter Ctrl-v. This works in either EMACS or VI line-editing mode (in VI mode, press Esc first to get into command mode).
You can also execute:
print ${.sh.version}
for the same result.
If your program needs to verify this at run time, one other way is to check for specific functionality:
if [[ $'x' == x ]] ; then
	echo "is using ksh93 parsing"
fi
Some very minimal style recommendations
RULE : indent via tabs, not spaces or a mixture of tabs and spaces.
RECOMMENDATION: an exception – wrapping long lines. It can look very readable if you indent the wrap the same number of tabs as the first line, plus 4 spaces.
Eg:
	command -a -b -c -d --arg1 stuff --arg2 stuff \
	    --arg3 stuff --arg4 stuff
	next command
So the first line is indented one tab. The continuation on the second line is by one tab then 4 spaces. The following line returns to one tab indentation.
This preserves a reasonable alignment for people that use different tabstop settings.
RULE : declare functions before ‘main body’ code in your programs. Don’t mix and match.
Exceptions – maybe loading libraries, typesets of variables, etc.
RECOMMENDATION : Label the main body of the program
At the point of the “main body” start put a ksh comment that includes the word “main”. It helps us old-timer C-programmers out. Eg:
# Begin "main()" program body.
RULE : at least some minimal function prologs
RECOMMENDATION : It can be pretty brief. Just a sentence or two on what it does.
RULE : Don’t put too much. You don’t want to give explicit detail there and have to update it every time you change one line of code. Be realistic – it won’t be updated and it will be wrong in no time.
Where to put the then or do
if [[ condition ]]
then
	stuff
fi
while [[ condition ]]
do
	stuff
done
-VS-
if [[ condition ]] ; then
	stuff
fi
while [[ condition ]] ; do
	stuff
done
RECOMMENDATION: I have a personal preference for the second/shorter method, but I won’t try to make any rule on that. The separate line version doesn’t seem to add any readability or anything else I can see though…
RULE: Just don’t switch mid-stream. If you are adding 10 lines in an existing function then follow the method used in the rest of that function.
If you are adding a new function then do what you please.
Ordering functions
When creating a library of functions for use by others, order the functions in some logical way – either by functionality or alphabetically.
Coding standards and recommendations
RULE : use the ksh syntax checker
Always test your code with the built-in syntax checker and fix those warnings before you even try the code. Eg:
ksh -n myscript.ksh
RULE : use the ksh syntax checker on bash code too
If writing bash code, try out the ksh syntax checker on it. You will have to discard some false positives but it is still better than the built-in bash syntax checking.
But also try the bash built-in syntax checker.
RULE : Use [ ] sparingly.
Use of single square brackets is somewhat deprecated, largely replaced by double square brackets for string operations and double parentheses (( )) for math ops.
It should technically still be allowed for simple tests ( eg, if [ -f myfile ] ) but you will find that many reviewers have a religious zeal for eliminating all uses of [ ]. Use at your own risk.
RULE : Use (( )) for arithmetic expressions.
Using (( )) is almost identical to using ‘let’. Use <>=! style comparison operators. Eg:
if (( $i > 3 )) ; then
if (( i > 3 )) ; then
# note those two forms - using "$i" and using "i"
# are identical except using "i" is faster
(( i-- ))
(( i++ ))
cmd
if (( $? != 0 )) ; then
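To see those pieces working together, here is a minimal sketch that counts how many values exceed a threshold using only (( )) for the math. The function name count_above and its arguments are made up for illustration.

```shell
# Count how many of the remaining arguments exceed the first argument.
# "count_above" is a hypothetical name used only for this sketch.
function count_above {
	typeset -i threshold=$1 count=0 value
	shift
	for value in "$@" ; do
		# no "$" needed on variable names inside (( ))
		if (( value > threshold )) ; then
			(( count += 1 ))
		fi
	done
	echo "${count}"
}

big=$(count_above 3 1 5 2 8 3)
echo "values above 3: ${big}"
```

Only 5 and 8 are above 3, so the count comes out as 2.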
Note: I don’t know of any benefit for using (( )) around simple assignments, eg:
i=7
RULE : Use double square brackets for string operations.
Eg:
if [[ ${HOST_SIDE} != "A" ]]; then
if [[ -e ${VALDATA} ]]; then
RULE : Do not use the old Bourne shell command substitution syntax
x=`cmd`
Instead use the newer KSH syntax for command substitution:
x=$(cmd)
This is much more flexible, nestable, etc.
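A small sketch of what "nestable" buys you: one command substitution sitting inside another, with no escaping. The path is just an arbitrary sample value.

```shell
# With backticks the inner call would need escaped backticks (\`...\`).
# With $( ) the nesting reads naturally.
path=/var/tmp/upgrade/logs/run.log      # sample value, made up for this sketch
parent=$(basename "$(dirname "${path}")")
echo "parent directory: ${parent}"
```

Here dirname yields /var/tmp/upgrade/logs and basename reduces that to logs.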
RULE : Use the {} in variable references almost all the time.
Wrong: $x
Right: ${x}
Right: ${1}
Also Right: $1
It is hard to justify this rule with technical reasons – in 99.9% of usage it won’t matter at all. It actually comes down to a “Because we said so!” kind of rule, and in SU code you need to match existing style somewhat.
RULE : typeset locals
Prevent collision between the global and function-local namespaces – typeset all your variables in functions.
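Here is a minimal sketch of the kind of collision this prevents. The names count_args, i and n are made up for illustration. One thing worth knowing: in ksh93 only functions declared with the "function" keyword get function-local scope for typeset variables, so the sketch uses that form.

```shell
i=42    # the caller's variable

# "count_args" is a hypothetical function for this sketch.
function count_args {
	# Without this typeset, the for loop below would overwrite the caller's i.
	typeset i n=0
	for i in "$@" ; do
		(( n += 1 ))
	done
	echo "${n} argument(s)"
}

count_args a b c
echo "caller's i is still ${i}"
```

The function loops over a, b and c using its own i, and the caller's i keeps the value 42.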
RULE : global/local variable naming conventions
Make globals distinguishable from locals by name: global variable names should be in ALL CAPS.
Locals should be all lower case, though mixed case can be used if you feel it is appropriate.
RULE : no littering the filesystem
no tmp files in random directories – put them somewhere specific, not $CWD
BAD: cmd > out
BAD: do_stuff_with out
BAD: rm -f out
BETTER: outfile=/var/tmp/out.$$
BETTER: cmd > ${outfile}
BETTER: do_stuff_with ${outfile}
BETTER: rm -f ${outfile}
RULE : clean up tmp files
Clean up after yourself, especially in error legs. Traps are good for this.
trap "rm -f some_tmp_file ; print -u2 'IT FAILED BADLY'" EXIT
code
more code
# done, clean up and exit. Clear trap to avoid error message.
rm -f some_tmp_file
trap "" EXIT
print "ALL DONE"
exit 0
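A runnable sketch of the same idea, assuming it is acceptable to do the work in a subshell so the effect of the trap can be checked afterwards. The failure is simulated, and the file name follows the /var/tmp/out.$$ convention from the previous rule.

```shell
outfile=/var/tmp/out.$$

status=0
(
	# Remove the tmp file however this subshell exits, error legs included.
	trap 'rm -f "${outfile}"' EXIT
	echo "some data" > "${outfile}"
	false || exit 1         # simulated failure; the trap still fires
) || status=$?

if [[ ! -e ${outfile} ]] ; then
	echo "tmp file cleaned up, exit status ${status}"
fi
```

The exit status of the failed leg is preserved, and the tmp file is gone even though the cleanup line at the end of the "normal" path was never reached.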
RULE : get return codes from rsh’d commands
'rsh' doesn’t return the remote return code to you. For that reason we have the wrapper function 'remote_sh' for you to use. There is also the ksh script 'remote_shell'.
Rule of thumb: if you are using it multiple times in one script, definitely use remote_sh.
If you are only calling it once you can use remote_shell.
RULE : LEARN ABOUT KSH PATTERN MATCHING!!!!!!!!!!!!
*NEVER EVER use*
echo ${ACRNLIST} | grep [b-km-zA-Z].: > /dev/null
if [ $? -eq 0 ]; then
instead use ksh pattern matching.
if [[ "${ACRNLIST}" == *[b-km-zA-Z]?:* ]]; then
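As a sketch of how a grep-style test maps onto a ksh pattern: grep matches anywhere in the string and "." means any character, so the ksh pattern needs * on both sides and ? in the middle. The helper name has_acrn_entry and the sample strings are made up, since the real format of ACRNLIST is not shown here.

```shell
# "has_acrn_entry" is a hypothetical helper for this sketch.
function has_acrn_entry {
	# * on both sides: match anywhere in the string, like an unanchored grep.
	# ? matches any single character, like "." in a regular expression.
	[[ "$1" == *[b-km-zA-Z]?:* ]]
}

for list in "a1:,a2:" "a1:,c2:" ; do
	if has_acrn_entry "${list}" ; then
		echo "${list}: match"
	else
		echo "${list}: no match"
	fi
done
```

The first string has no letter from the bracket set in front of "x:", so it does not match. The second one matches on c2:.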
RULE : LEARN ABOUT KSH PATTERN SUBSTITUTION!
*Don’t do this, with an external sed process to launch*
ACRNLIST=$(echo "${ACRNLIST}" | sed "s/,/ /g")
instead use ksh pattern substitution
ACRNLIST=${ACRNLIST//,/ }
In this example the double-slash means the same as the ‘g’ did for sed. See the manpage.
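A quick sketch of the difference between the double-slash and single-slash forms. The value of ACRNLIST is a made-up sample.

```shell
ACRNLIST="ab,cd,ef"          # sample value, made up for this sketch

all=${ACRNLIST//,/ }         # // replaces every match, like sed's "g" flag
first=${ACRNLIST/,/ }        # /  replaces only the first match

echo "all:   ${all}"
echo "first: ${first}"
```

With this input, all becomes "ab cd ef" while first becomes "ab cd,ef".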
RULE : LEARN ABOUT KSH SUBSTRING SUPPORT:
*Don’t do this, with an external cut process to launch*
x=$(echo ${blah} | cut -c3-10)
y=$(echo ${blah} | cut -c10-)
instead use ksh substring support
x=${blah:2:8}
y=${blah:9}
Remember in converting these that it goes from 1-based to 0-based, and from start-to-end to start-and-length. See the manpage.
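The conversion arithmetic can be worked through on a sample string (made up for this sketch): the offset is the cut start minus 1, and the length is end minus start plus 1.

```shell
blah="abcdefghijklmnop"      # sample string, made up for this sketch

# cut -c3-10  : characters 3 through 10 (1-based, inclusive)
# ${blah:2:8} : offset 3 - 1 = 2 (0-based), length 10 - 3 + 1 = 8
x=${blah:2:8}

# cut -c10-   : character 10 to the end, so offset 9 and no length
y=${blah:9}

echo "x=${x} y=${y}"
```

For this string x is "cdefghij" and y is "jklmnop", the same as the two cut commands would produce.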
Anecdote: I spotted a script with one of those ‘echo | cut’ statements in it. It was inside a nested loop and was being hit about 4400 times to parse some data at script startup. It consistently took 55 seconds to run just that loop on ihgp. Changing that ONE LINE appropriately made the nested loop complete in less than 1 second.
RULE : Check user input data types
If the user is typing an answer to a prompt and the script is using it in a way that bad data may cause a ksh syntax error, check that the data is valid.
Eg, verify that it is numeric:
/bin/echo -e "Enter number of apples desired:\c"
read A
if [[ "${A}" != +([[:digit:]]) ]];then
	error "answer was not a number"
fi
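The same check can be wrapped in a small helper and tried against a few answers without a prompt. The name is_number is made up for this sketch.

```shell
# "is_number" is a hypothetical helper for this sketch.
function is_number {
	# +( ) means "one or more of", so an empty answer is rejected too.
	[[ "$1" == +([[:digit:]]) ]]
}

for answer in 12 "" 3x "1 2" ; do
	if is_number "${answer}" ; then
		echo "'${answer}' is a number"
	else
		echo "'${answer}' is not a number"
	fi
done
```

Only 12 passes. The empty string, 3x and "1 2" are all rejected before they can reach an arithmetic expression.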
RULE : Don’t overuse ‘rm -fr’
Never use 'rm -fr' on a file; only use the 'r' on a directory.
RULE : Exit with a useful value
Scripts should always exit with a meaningful exit value.
RULE : Return with a useful return value
Functions should almost always return with a meaningful return value.
RULE : Case statements should handle default (usually)
Case statements should almost always include a default case ( with * ) to catch unexpected errors.
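A minimal sketch combining the last two rules: a function that returns a meaningful value and a case statement with a default leg. The function side_to_number and the side names are made up for illustration.

```shell
# "side_to_number" is a hypothetical function for this sketch.
function side_to_number {
	case "$1" in
	A)	echo 0 ;;
	B)	echo 1 ;;
	*)	# default leg: unexpected input is reported, not silently ignored
		echo "unknown side: $1" >&2
		return 1 ;;
	esac
	return 0
}

rc=0
side_to_number "C" 2>/dev/null || rc=$?
echo "return code for bad input: ${rc}"
```

The caller can tell from the return code alone that "C" was not handled, without parsing any output.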
RULE : Safer string handling
When comparing strings, use quotes to prevent errors due to unusual strings. Eg:
if [[ "${my_var}" == "done" ]]; then
NOTE: quoting changes it from a pattern match to a string comparison. If you have a pattern you must NOT quote it but you must escape special characters – eg, whitespace.
if [[ "${my_var}" == *it\ is\ done ]]; then
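To make the difference concrete, here is a sketch that runs the same string through a quoted comparison and an unquoted pattern. The value of my_var is made up.

```shell
my_var="now it is done"      # sample value, made up for this sketch

# Quoted right-hand side: literal comparison, so "*" is just a character.
if [[ "${my_var}" == "*it is done" ]] ; then
	quoted=match
else
	quoted=nomatch
fi

# Unquoted pattern with escaped spaces: "*" matches any leading text.
if [[ "${my_var}" == *it\ is\ done ]] ; then
	pattern=match
else
	pattern=nomatch
fi

echo "quoted: ${quoted}, pattern: ${pattern}"
```

The quoted form does not match because the string does not literally start with an asterisk. The pattern form does match.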
RULE : checking if a process is alive by PID
When you have the PID of a process and you need to check if it is still running, some people are tempted to construct elaborate pipelines of ps, grep, sed, and cut. Do not! This is done exactly the same way you would in C, with signal 0:
kill -0 ${pid}
if (( $? != 0 )) ; then
	# process is finished
	wait ${pid}
	echo "process exited with $?"
fi
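A runnable sketch of the signal 0 check, using a short background subshell as a stand-in for a real job. The helper name is_alive is made up. Keep in mind that kill -0 also fails if you are not allowed to signal the process, so this fits best for children of your own script.

```shell
# "is_alive" is a hypothetical helper for this sketch.
function is_alive {
	kill -0 "$1" 2>/dev/null
}

( sleep 1 ; exit 3 ) &      # stand-in for a real background job
pid=$!

if is_alive "${pid}" ; then
	echo "process ${pid} is running"
fi

rc=0
wait "${pid}" || rc=$?      # wait returns the child's exit status
echo "process exited with ${rc}"

if ! is_alive "${pid}" ; then
	echo "process ${pid} is gone"
fi
```

No ps, grep, sed or cut is involved, and wait hands back the exit status of the child.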
RULE : use getopts for argument parsing
NOTE: the long-option spec syntax used in this example is ksh93-specific.
For parsing arguments, usually use getopts. There are exceptions but most scripts that have their own parsing code end up being messy to use and messy to extend later. A simple example follows illustrating several good points:
- Supports longname aliases for the short names ( eg, -v is the same as --verbose )
- Supports longname options that do not have short names ( eg, --help )
- No #feature dependency concentration on the option spec string since it is multi-line
#
# getopts options spec.
#
OPT_SPEC=":"
OPT_SPEC+="[-][99:-help] "
OPT_SPEC+="[-][98:-examples] "
OPT_SPEC+="[-][v:-verbose] "
OPT_SPEC+="[-][s:-set]: "
OPT_SPEC+="[-][h:-hostname]: "

typeset -u SET_ARG
typeset -i VERBOSE
SET_ARG=""
VERBOSE=0
HOST_ARG=""

# parse command-line
while getopts "${OPT_SPEC}" arg ; do
	case "${arg}" in
	99)	# --help
		print "${USAGE}"
		scriptexit "${CURR_CMD}" 0
		;;
	98)	# --examples
		print "${USAGE}"
		print "${USAGE2}"
		scriptexit "${CURR_CMD}" 0
		;;
	v)	((VERBOSE++))
		;;
	s)	SET_ARG=${OPTARG}
		SET="Y"
		;;
	h)	HOST_ARG=${OPTARG}
		;;
	*)	print "ERROR: Invalid Argument"
		print "${USAGE}"
		scriptexit "${CURR_CMD}" 1
		;;
	esac
done
shift $(($OPTIND-1))
Strangely, your posting of this (and Google) helped me reconnect with a friend I had lost touch with 10 years ago. There was enough specific information here including the use of my nickname ‘Phil’ so he found me easily when he searched.
Thanks!
Very very interesting – what a small world, right:) Anyway, it is nice that your ‘lost’ friend gets back and thank you for sharing this excellent summary again!