Bash rpad, lpad, and “center-pad”

One thing bash seems to lack is a general tool for padding strings. If you want to left-pad a number (say, 3) with zeros to a certain “string width” (say, 5 characters wide), you could do:

$ echo `printf "%05d" 3`
00003

However, the zero flag only works with numeric conversions, and %d only accepts integers (printf does have %f for floats, though bash's own arithmetic is integer-only). Worse, zeros only ever go on the left, and the only other padding printf offers is spaces. No “other character” padding.

$ echo `printf "%05d" -`
bash: printf: -: invalid number
00000
$ echo `printf "%05s" -`
-
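
For what it's worth, printf's width field does apply to strings; it just pads with spaces rather than a character of your choosing. (The padding in the %05s example above was most likely there too, and stripped by the unquoted backticks.) A minimal sketch, where pad_left_spaces and pad_right_spaces are made-up helper names for illustration:

```shell
# Space-padding with printf's width field.
# pad_left_spaces and pad_right_spaces are made-up names for illustration.
pad_left_spaces() {
    # e.g. "%11s" right-aligns the string in an 11-character field
    printf "%${2}s\n" "$1"
}
pad_right_spaces() {
    # e.g. "%-11s" left-aligns it, so the spaces end up on the right
    printf "%-${2}s\n" "$1"
}

echo "'$(pad_left_spaces hello 11)'"    # '      hello'
echo "'$(pad_right_spaces hello 11)'"   # 'hello      '
```

Quoting the command substitution is what keeps the spaces intact.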

Breaking out of numbers and into string concatenation, however, makes it a bit easier:

function lpad {
    word="$1"
    while [ ${#word} -lt $2 ]; do
        word="$3$word";
    done;
    echo "$word";
}
$ lpad 3 5 0
00003

“But it’s the same output!” you say? Well, for this input it happens to be, yes.

$ lpad 4 5 -
----4
$ lpad hello 11 -
------hello
$ lpad hello 11 !
!!!!!!hello
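
As a small usage sketch, here is the kind of thing this is handy for: zero-padding counters so that generated names sort correctly. The "frame_" prefix and ".png" suffix are invented for the example; lpad is repeated only so the snippet runs on its own.

```shell
# lpad as defined above, repeated so the snippet is self-contained
function lpad {
    word="$1"
    while [ ${#word} -lt $2 ]; do
        word="$3$word";
    done;
    echo "$word";
}

# Zero-pad each counter to 4 characters; the file name pattern is made up.
for i in 7 42 123; do
    echo "frame_$(lpad "$i" 4 0).png"
done
# frame_0007.png
# frame_0042.png
# frame_0123.png
```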

We can even pad with whitespace:

$ echo "'`lpad hello 11 \ `'"
'      hello'

Here I’ve printed single quotes around the results to show the space padding, and used the backslash to escape the space in the command, so it would be used as $3 rather than be seen as CLI whitespace (and ignored).
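
If the backslash looks too cryptic, quoting the space should work just as well, and $(...) is a bit easier to read than backticks once quotes get involved. A sketch of the same call, with lpad repeated so it runs on its own (the variable name "padded" is just for illustration):

```shell
# lpad as defined above, repeated so the snippet is self-contained
function lpad {
    word="$1"
    while [ ${#word} -lt $2 ]; do
        word="$3$word";
    done;
    echo "$word";
}

# A quoted space is passed through as $3, just like the escaped one.
padded="$(lpad hello 11 ' ')"
echo "'$padded'"    # '      hello'
```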

Of course, this function can be switched around to make an rpad function, and extended to make a cpad function:

function rpad {
    word="$1"
    while [ ${#word} -lt $2 ]; do
        word="$word$3";
    done;
    echo "$word";
}
function cpad {
    word="$1"
    while [ ${#word} -lt $2 ]; do
        word="$word$3";
        if [ ${#word} -lt $2 ]; then
            word="$3$word"
        fi;
    done;
    echo "$word";
}
$ rpad "w00t" 15 ^
w00t^^^^^^^^^^^
$ cpad "hello, world" 50 -
-------------------hello, world-------------------
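
One detail worth seeing in action: cpad always pads the right side first, so when the amount of padding is odd, the extra character ends up on the right. A short sketch, with cpad repeated so the snippet is self-contained:

```shell
# cpad as defined above, repeated so the snippet is self-contained
function cpad {
    word="$1"
    while [ ${#word} -lt $2 ]; do
        word="$word$3";
        if [ ${#word} -lt $2 ]; then
            word="$3$word"
        fi;
    done;
    echo "$word";
}

cpad ab 5 -     # -ab--   (3 characters of padding: 1 left, 2 right)
cpad abc 6 .    # .abc..  (same story: 1 left, 2 right)
cpad abc 7 .    # ..abc.. (even padding splits cleanly)
```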

Of course, these functions aren’t perfect, as they make (at least) the following assumptions:

  1. Inputs $1 (word to pad), $2 (length to pad to), and $3 (padding characters) are given.
  2. Input $2 is a number.
  3. Input $3 is one character, and no more.

Assumption #1 means that if, for example, only $1 and $2 are given, the function will go into an infinite loop, because an empty $3 never makes the string any longer. If $2 is missing as well, the length test fails with an error instead (“bash: [: 8: unary operator expected”). This can be fixed by checking whether each argument was given, and falling back to a default value if it was not:

function lpad {
    if [ "$1" ]; then
        word="$1";
    else
        word="";
    fi;

    if [ "$2" ]; then
        len=$2;
    else
        len=${#word};
    fi;

    if [ "$3" ]; then
        padding="$3";
    else
        padding=" ";
    fi;

    while [ ${#word} -lt $len ]; do
        word="$padding$word";
    done;
    echo "$word";
}

Here, the "word" defaults to a blank string, the "length" defaults to the length of the word, and the "padding" defaults to a space. If only the word is given, the word will be returned. If word and length are given, word will be padded with spaces. If all three are given, it works fine.
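
The same defaults can probably be written more compactly with bash's ${parameter:-default} expansion, which substitutes the default when the parameter is unset or empty. This is a sketch of an alternative, not a replacement for the version above; the use of "local" is my addition, to keep the variables from leaking into the calling shell.

```shell
function lpad {
    local word="${1:-}"           # default: empty string
    local len="${2:-${#word}}"    # default: the word's own length, so nothing is added
    local padding="${3:- }"       # default: a single space
    while [ ${#word} -lt $len ]; do
        word="$padding$word"
    done
    echo "$word"
}

lpad hello          # hello
lpad 3 5 0          # 00003
echo "'$(lpad hi 5)'"   # '   hi'
```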

Assumption #2 means that if $2 is a string containing anything other than digits 0-9, the function will exit with an error ("bash: [: <length>: integer expression expected"). This can be fixed by "filtering" any non-digit characters out of the length string:

    if [ "$2" ]; then
        len=$(( $(echo "$2" | sed 's/[^0-9]//g') ));
    else
        len=${#word};
    fi;

This uses the sed tool, which should be available on most if not all *nix systems, to replace all non-digits ([^0-9], in regex notation) with nothing, then converts the result from a string of digits to an actual number with $(()). This means that "5uh0oh0" will be turned into 500. We'll get a scary big string in that case, but at least we won't get errors!
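
If you'd rather not shell out to sed, bash's own pattern substitution, ${variable//pattern/replacement}, can do the same filtering. The sketch below also guards against one thing $(()) trips over: a number with a leading zero such as 08 is read as octal and rejected, so it forces base 10 with the 10# prefix. The function name digits_only is hypothetical.

```shell
# digits_only is a hypothetical helper name, used here for illustration.
digits_only() {
    local digits="${1//[^0-9]/}"    # delete every character that is not a digit
    # 10# forces base 10, so "08" is not treated as (invalid) octal;
    # the extra leading 0 keeps the expression valid when no digits are left.
    echo $(( 10#0$digits ))
}

digits_only 5uh0oh0    # 500
digits_only 08         # 8
digits_only abc        # 0
```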

Assumption #3 means that if the pad string is more than one character, the result *could* be longer than the length we wanted. For example:

$ string=`cpad hey 50 -`
$ echo ${#string}
50
$ string=`cpad hey 50 -=`
$ echo ${#string}
51

In the second example, the output string is one character too long. This is because the function only checks the length before adding padding; once the last addition has pushed the string past the target, nothing trims it back. The fix is slightly less simple, and requires different logic for each function:

function rpad {
    ...
    while [ ${#word} -lt $len ]; do
        word="$word$padding";
    done;
    while [ ${#word} -gt $len ]; do
        word=${word:0:$((${#word}-1))}
    done;
    echo "$word";
}
function lpad {
    ...
    while [ ${#word} -lt $len ]; do
        word="$padding$word";
    done;
    while [ ${#word} -gt $len ]; do
        word=${word:1:$((${#word}-1))}
    done;
    echo "$word";
}
function cpad {
    ...
    while [ ${#word} -lt $len ]; do
        word="$word$padding";
        if [ ${#word} -lt $len ]; then
            word="$padding$word"
        fi;
    done;
    while [ ${#word} -gt $len ]; do
        word=${word:0:$((${#word}-1))}
        if [ ${#word} -gt $len ]; then
            word=${word:1:$((${#word}-1))}
        fi;
    done;
    echo "$word";
}

The solution uses the bash substring notation, ${variable:start:length}, to strip off extra characters at the end/beginning when the resulting string is too long. Of course, this isn't a perfect fix; if you specify a shorter length than your input string, the string will simply be truncated to that length. Also, if you use a long string for padding with cpad, you'll get unbalanced output.

$ lpad "hello, world" 5
world
$ rpad "hello, world" 5
hello
$ cpad "hello, world" 5
lo, w
$ cpad hello 50 ---=--=---
-=--=------=--=---hello---=--=------=--=------=--=
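
One way to get balanced output with a multi-character padding might be to work out the left and right widths up front, build each side separately, and cut each to size with the same substring notation. This is only a sketch under my own assumptions: fill and cpad_even are made-up names, the pattern restarts on each side, and (unlike the functions above) a word that is already too long is returned untouched rather than truncated.

```shell
# fill and cpad_even are hypothetical names, not part of the functions above.

# Print exactly $2 characters made by repeating the pattern $1.
fill() {
    local pattern="${1:- }" n="$2" out=""
    while [ "${#out}" -lt "$n" ]; do
        out="$out$pattern"
    done
    printf '%s' "${out:0:$n}"    # cut off any overshoot in one step
}

# Center $1 in a field of $2 characters, using the pattern $3 on both sides.
cpad_even() {
    local word="$1" len="$2" pattern="${3:- }"
    local total=$(( len - ${#word} ))
    if [ "$total" -le 0 ]; then
        printf '%s\n' "$word"    # already long enough: leave it alone
        return
    fi
    local left=$(( total / 2 ))
    local right=$(( total - left ))    # an odd leftover character goes on the right
    printf '%s\n' "$(fill "$pattern" "$left")$word$(fill "$pattern" "$right")"
}

cpad_even hello 11 -=    # -=-hello-=-
cpad_even ab 5 -         # -ab--
```

Whether the right-hand side should restart the pattern or mirror it is a matter of taste; this version restarts it.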

You could easily continue to refine these functions to make them obey all sorts of preconceptions, but I'm going to stop here. If you use the functions as they're meant to be used -- numerical lengths, single-character paddings, etc. -- it'll all be fine.

Though if you do find a nice, concise way to make these behave "properly", feel free to share in the comments!

1 Comment:

  1. Left-aligned padding
    printf "%-50s %-50s\n" "hello" "world"
