testregex - regex(3) test harness

  testregex [ options ]

  testregex reads regex(3) test specifications, one per line, from the
  standard input and writes one output line for each failed test. A
  summary line is written after all tests are done. Each successful
  test is run again with REG_NOSUB. Unsupported features are noted
  before the first test, and tests requiring these features are
  silently ignored.

  -c	catch signals and non-terminating calls
  -e	ignore error return mismatches
  -h	list help on standard error
  -n	do not repeat successful tests with regnexec()
  -o	ignore match[] overrun errors
  -p	ignore negative position mismatches
  -s	use stack instead of malloc
  -x	do not repeat successful tests with REG_NOSUB
  -v	list each test line
  -A	list failed test lines with actual answers
  -B	list all test lines with actual answers
  -F	list failed test lines
  -P	list passed test lines
  -S	output one summary line

  Input lines may be blank, a comment beginning with #, or a test
  specification. A specification is five fields separated by one
  or more tabs. NULL denotes the empty string and NIL denotes the
  0 pointer.

  Field 1: the regex(3) flags to apply, one character per REG_feature
  flag. The test is skipped if REG_feature is not supported by the
  implementation. If the first character is not [BEASKLP] then the
  specification is a global control line. One or more of [BEASKLP] may be
  specified; the test will be repeated for each mode.

    B 	basic			BRE	(grep, ed, sed)
    E 	REG_EXTENDED		ERE	(egrep)
    A	REG_AUGMENTED		ARE	(egrep with negation)
    S	REG_SHELL		SRE	(sh glob)
    L	REG_LITERAL		LRE	(fgrep)

    a	REG_LEFT|REG_RIGHT	implicit ^...$
    b	REG_NOTBOL		lhs does not match ^
    c	REG_COMMENT		ignore space and #...\n
    d	REG_SHELL_DOT		explicit leading . match
    e	REG_NOTEOL		rhs does not match $
    f	REG_MULTIPLE		multiple \n separated patterns
    g	FNM_LEADING_DIR		testfnmatch only -- match until /
    h	REG_MULTIREF		multiple digit backref
    i	REG_ICASE		ignore case
    j	REG_SPAN		. matches \n
    k	REG_ESCAPE		\ to ecape [...] delimiter
    l	REG_LEFT		implicit ^...
    m	REG_MINIMAL		minimal match
    n	REG_NEWLINE		explicit \n match
    o	REG_ENCLOSED		(|&) magic inside [@|&](...)
    p	REG_SHELL_PATH		explicit / match
    q	REG_DELIMITED		delimited pattern
    r	REG_RIGHT		implicit ...$
    s	REG_SHELL_ESCAPED	\ not special
    t	REG_MUSTDELIM		all delimiters must be specified
    u	standard unspecified behavior -- errors not counted
    v	REG_CLASS_ESCAPE	\ special inside [...]
    w	REG_NOSUB		no subexpression match array
    x	REG_LENIENT		let some errors slide
    y	REG_LEFT		regexec() implicit ^...
    z	REG_NULL		NULL subexpressions ok
    $	                        expand C \c escapes in fields 2 and 3
    /	                        field 2 is a regsubcomp() expression
    =	                        field 3 is a regdecomp() expression

  Field 1 control lines:

    C		set LC_COLLATE and LC_CTYPE to locale in field 2

    ?test ...	output field 5 if passed and != EXPECTED, silent otherwise
    &test ...	output field 5 if current and previous passed
    |test ...	output field 5 if current passed and previous failed
    ; ...	output field 2 if previous failed
    {test ...	skip if failed until }
    }		end of skip

    : comment		comment copied as output NOTE
    :comment:test	:comment: ignored
    N[OTE] comment	comment copied as output NOTE
    T[EST] comment	comment

    number		use number for nmatch (20 by default)

  Field 2: the regular expression pattern; SAME uses the pattern from
    the previous specification.

  Field 3: the string to match.

  Field 4: the test outcome. This is either one of the posix error
    codes (with REG_ omitted) or the match array, a list of (m,n)
    entries with m and n being first and last+1 positions in the
    field 3 string, or NULL if REG_NOSUB is in effect and success
    is expected. BADPAT is acceptable in place of any regcomp(3)
    error code. The match[] array is initialized to (-2,-2) before
    each test. All array elements from 0 to nmatch-1 must be specified
    in the outcome. Unspecified endpoints (offset -1) are denoted by ?.
    Unset endpoints (offset -2) are denoted by X. {x}(o:n) denotes a
    matched (?{...}) expression, where x is the text enclosed by {...},
    o is the expression ordinal counting from 1, and n is the length of
    the unmatched portion of the subject string. If x starts with a
    number then that is the return value of re_execf(), otherwise 0 is

  Field 5: optional comment appended to the report.

    If a regex implementation misbehaves with memory then all bets are off.

  Glenn Fowler        (ksh strmatch, regex extensions)
  David Korn        (ksh glob matcher)
  Doug McIlroy       (ast regex/testre in C++)
  Tom Lord            (rx tests)
  Henry Spencer       (original public regex)
  Andrew Hume     (gre tests)
  John Maddock (regex++ tests)
  Philip Hazel              (pcre tests)
  Ville Laurikari                   (libtre tests)