PCREGREP(1)                                                        PCREGREP(1)


NAME
       pcregrep - a grep with Perl-compatible regular expressions.


SYNOPSIS
       pcregrep [options] [long options] [pattern] [path1 path2 ...]


DESCRIPTION

       pcregrep  searches  files  for  character  patterns, in the same way as
       other grep commands do, but it uses the PCRE regular expression library
       to support patterns that are compatible with the regular expressions of
       Perl 5. See pcrepattern for a full description of syntax and  semantics
       of the regular expressions that PCRE supports.

       Patterns,  whether  supplied on the command line or in a separate file,
       are given without delimiters. For example:

         pcregrep Thursday /etc/motd

       If you attempt to use delimiters (for example, by surrounding a pattern
       with  slashes,  as  is common in Perl scripts), they are interpreted as
       part of the pattern. Quotes can of course be used on the  command  line
       because they are interpreted by the shell, and indeed they are required
       if a pattern contains white space or shell metacharacters.

       The first argument that follows any option settings is treated  as  the
       single  pattern  to be matched when neither -e nor -f is present.  Con-
       versely, when one or both of these options are  used  to  specify  pat-
       terns, all arguments are treated as path names. At least one of -e, -f,
       or an argument pattern must be provided.

       If no files are specified, pcregrep reads the standard input. The stan-
       dard  input  can  also  be  referenced by a name consisting of a single
       hyphen.  For example:

         pcregrep some-pattern /file1 - /file3

       By default, each line that matches the pattern is copied to  the  stan-
       dard  output, and if there is more than one file, the file name is out-
       put at the start of each line. However,  there  are  options  that  can
       change how pcregrep behaves. In particular, the -M option makes it pos-
       sible to search for patterns that span line boundaries.

       Patterns are limited to 8K  or  BUFSIZ  characters,  whichever  is  the
       greater.  BUFSIZ is defined in <stdio.h>.

       If  the  LC_ALL  or LC_CTYPE environment variable is set, pcregrep uses
       the value to set a locale when calling the PCRE library.  The  --locale
       option can be used to override this.


OPTIONS

       --        This  terminate the list of options. It is useful if the next
                 item on the command line starts with a hyphen but is  not  an
                 option.  This allows for the processing of patterns and file-
                 names that start with hyphens.

       -A number, --after-context=number
                 Output number lines of context after each matching  line.  If
                 filenames and/or line numbers are being output, a hyphen sep-
                 arator is used instead of a colon for the  context  lines.  A
                 line  containing  "--" is output between each group of lines,
                 unless they are in fact contiguous in  the  input  file.  The
                 value  of number is expected to be relatively small. However,
                 pcregrep guarantees to have up to 8K of following text avail-
                 able for context output.

       -B number, --before-context=number
                 Output  number lines of context before each matching line. If
                 filenames and/or line numbers are being output, a hyphen sep-
                 arator  is  used  instead of a colon for the context lines. A
                 line containing "--" is output between each group  of  lines,
                 unless  they  are  in  fact contiguous in the input file. The
                 value of number is expected to be relatively small.  However,
                 pcregrep guarantees to have up to 8K of preceding text avail-
                 able for context output.

       -C number, --context=number
                 Output number lines of context both  before  and  after  each
                 matching  line.  This is equivalent to setting both -A and -B
                 to the same value.

       -c, --count
                 Do not output individual lines; instead just output  a  count
                 of the number of lines that would otherwise have been output.
                 If several files are given, a count is  output  for  each  of
                 them. In this mode, the -A, -B, and -C options are ignored.

       --colour, --color
                 If this option is given without any data, it is equivalent to
                 "--colour=auto".  If data is required, it must  be  given  in
                 the same shell item, separated by an equals sign.

       --colour=value, --color=value
                 This  option specifies under what circumstances the part of a
                 line that matched a pattern should be coloured in the output.
                 The  value may be "never" (the default), "always", or "auto".
                 In the latter case, colouring happens only  if  the  standard
                 output  is  connected to a terminal. The colour can be speci-
                 fied by setting the environment variable  PCREGREP_COLOUR  or
                 PCREGREP_COLOR. The value of this variable should be a string
                 of two numbers, separated by a semicolon.   They  are  copied
                 directly into the control string for setting colour on a ter-
                 minal, so it is your responsibility to ensure that they  make
                 sense.  If  neither  of the environment variables is set, the
                 default is "1;31", which gives red.

       -D action, --devices=action
                 If an input path is  not  a  regular  file  or  a  directory,
                 "action"  specifies  how  it is to be processed. Valid values
                 are "read" (the default) or "skip" (silently skip the  path).

       -d action, --directories=action
                 If an input path is a directory, "action" specifies how it is
                 to be processed.  Valid  values  are  "read"  (the  default),
                 "recurse"  (equivalent to the -r option), or "skip" (silently
                 skip the path). In the default case, directories are read  as
                 if  they  were  ordinary files. In some operating systems the
                 effect of reading a directory like this is an immediate  end-
                 of-file.

       -e pattern, --regex=pattern,
                 --regexp=pattern Specify a pattern to be matched. This option
                 can be used multiple times in order to specify  several  pat-
                 terns.  It  can  also be used as a way of specifying a single
                 pattern that starts with a hyphen. When -e is used, no  argu-
                 ment  pattern  is  taken from the command line; all arguments
                 are treated as file names. There is an overall maximum of 100
                 patterns. They are applied to each line in the order in which
                 they are defined until one matches (or fails to match  if  -v
                 is  used).  If  -f is used with -e, the command line patterns
                 are matched first, followed by the patterns  from  the  file,
                 independent  of  the  order in which these options are speci-
                 fied. Note that multiple use of -e is not the same as a  sin-
                 gle  pattern  with  alternatives.  For example, X|Y finds the
                 first character in a line that is X or Y, whereas if the  two
                 patterns  are  given  separately,  pcregrep  finds X if it is
                 present, even if it follows Y in the line. It finds Y only if
                 there  is  no  X in the line. This really matters only if you
                 are using -o to show the portion of the line that matched.

       --exclude=pattern
                 When pcregrep is searching the files in a directory as a con-
                 sequence of the -r (recursive search) option, any files whose
                 names match the pattern are excluded. The pattern is  a  PCRE
                 regular expression. If a file name matches both --include and
                 --exclude, it is excluded. There is no short  form  for  this
                 option.

       -F, --fixed-strings
                 Interpret  each pattern as a list of fixed strings, separated
                 by newlines, instead of  as  a  regular  expression.  The  -w
                 (match  as  a  word) and -x (match whole line) options can be
                 used with -F. They apply to each of the fixed strings. A line
                 is selected if any of the fixed strings are found in it (sub-
                 ject to -w or -x, if present).

       -f filename, --file=filename
                 Read a number of patterns from the file, one  per  line,  and
                 match  them against each line of input. A data line is output
                 if any of the patterns match it. The filename can be given as
                 "-" to refer to the standard input. When -f is used, patterns
                 specified on the command line using -e may also  be  present;
                 they are tested before the file's patterns. However, no other
                 pattern is taken from the command  line;  all  arguments  are
                 treated  as  file  names.  There is an overall maximum of 100
                 patterns. Trailing white space is removed from each line, and
                 blank  lines  are ignored. An empty file contains no patterns
                 and therefore matches nothing.

       -H, --with-filename
                 Force the inclusion of the filename at the  start  of  output
                 lines  when searching a single file. By default, the filename
                 is not shown in this case. For matching lines,  the  filename
                 is  followed  by  a  colon  and a space; for context lines, a
                 hyphen separator is used. If a line number is also being out-
                 put, it follows the file name without a space.

       -h, --no-filename
                 Suppress  the output filenames when searching multiple files.
                 By default, filenames  are  shown  when  multiple  files  are
                 searched.  For  matching lines, the filename is followed by a
                 colon and a space; for context lines, a hyphen  separator  is
                 used.  If  a line number is also being output, it follows the
                 file name without a space.

       --help    Output a brief help message and exit.

       -i, --ignore-case
                 Ignore upper/lower case distinctions during comparisons.

       --include=pattern
                 When pcregrep is searching the files in a directory as a con-
                 sequence  of  the  -r  (recursive  search) option, only those
                 files whose names match the pattern are included. The pattern
                 is  a  PCRE  regular  expression. If a file name matches both
                 --include and --exclude, it is excluded. There  is  no  short
                 form for this option.

       -L, --files-without-match
                 Instead  of  outputting lines from the files, just output the
                 names of the files that do not contain any lines  that  would
                 have  been  output. Each file name is output once, on a sepa-
                 rate line.

       -l, --files-with-matches
                 Instead of outputting lines from the files, just  output  the
                 names of the files containing lines that would have been out-
                 put. Each file name is  output  once,  on  a  separate  line.
                 Searching  stops  as  soon  as  a matching line is found in a
                 file.

       --label=name
                 This option supplies a name to be used for the standard input
                 when file names are being output. If not supplied, "(standard
                 input)" is used. There is no short form for this option.

       --locale=locale-name
                 This option specifies a locale to be used for pattern  match-
                 ing.  It  overrides the value in the LC_ALL or LC_CTYPE envi-
                 ronment variables.  If  no  locale  is  specified,  the  PCRE
                 library's  default (usually the "C" locale) is used. There is
                 no short form for this option.

       -M, --multiline
                 Allow patterns to match more than one line. When this  option
                 is given, patterns may usefully contain literal newline char-
                 acters and internal occurrences of ^ and  $  characters.  The
                 output  for  any one match may consist of more than one line.
                 When this option is set, the PCRE library is called in  "mul-
                 tiline"  mode.   There is a limit to the number of lines that
                 can be matched, imposed by the way that pcregrep buffers  the
                 input  file as it scans it. However, pcregrep ensures that at
                 least 8K characters or the rest of the document (whichever is
                 the  shorter)  are  available for forward matching, and simi-
                 larly the previous 8K characters (or all the previous charac-
                 ters,  if  fewer  than 8K) are guaranteed to be available for
                 lookbehind assertions.

       -n, --line-number
                 Precede each output line by its line number in the file, fol-
                 lowed  by  a colon and a space for matching lines or a hyphen
                 and a space for context lines. If the filename is also  being
                 output, it precedes the line number.

       -o, --only-matching
                 Show  only  the  part  of the line that matched a pattern. In
                 this mode, no context is shown. That is, the -A, -B,  and  -C
                 options are ignored.

       -q, --quiet
                 Work quietly, that is, display nothing except error messages.
                 The exit status indicates whether or  not  any  matches  were
                 found.

       -r, --recursive
                 If  any given path is a directory, recursively scan the files
                 it contains, taking note of any --include and --exclude  set-
                 tings.  By  default, a directory is read as a normal file; in
                 some operating systems this gives an  immediate  end-of-file.
                 This  option  is  a  shorthand  for  setting the -d option to
                 "recurse".

       -s, --no-messages
                 Suppress error  messages  about  non-existent  or  unreadable
                 files.  Such  files  are quietly skipped. However, the return
                 code is still 2, even if matches were found in other files.

       -u, --utf-8
                 Operate in UTF-8 mode. This option is available only if  PCRE
                 has  been compiled with UTF-8 support. Both patterns and sub-
                 ject lines must be valid strings of UTF-8 characters.

       -V, --version
                 Write the version numbers of pcregrep and  the  PCRE  library
                 that is being used to the standard error stream.

       -v, --invert-match
                 Invert  the  sense  of  the match, so that lines which do not
                 match any of the patterns are the ones that are found.

       -w, --word-regex, --word-regexp
                 Force the patterns to match only whole words. This is equiva-
                 lent to having \b at the start and end of the pattern.

       -x, --line-regex, --line-regexp
                 Force  the  patterns to be anchored (each must start matching
                 at the beginning of a line) and in addition, require them  to
                 match  entire  lines.  This  is  equivalent to having ^ and $
                 characters at the start and end of each alternative branch in
                 every pattern.


ENVIRONMENT VARIABLES

       The  environment  variables  LC_ALL  and LC_CTYPE are examined, in that
       order, for a locale. The first one that is set is  used.  This  can  be
       overridden  by  the  --locale  option.  If  no  locale is set, the PCRE
       library's default (usually the "C" locale) is used.


OPTIONS COMPATIBILITY

       The majority of short and long forms of pcregrep's options are the same
       as  in  the  GNU grep program. Any long option of the form --xxx-regexp
       (GNU terminology) is also available as --xxx-regex (PCRE  terminology).
       However,  the  --locale,  -M,  --multiline, -u, and --utf-8 options are
       specific to pcregrep.


OPTIONS WITH DATA

       There are four different ways in which an option with data can be spec-
       ified.   If  a  short  form option is used, the data may follow immedi-
       ately, or in the next command line item. For example:

         -f/some/file
         -f /some/file

       If a long form option is used, the data may appear in the same  command
       line item, separated by an equals character, or (with one exception) it
       may appear in the next command line item. For example:

         --file=/some/file
         --file /some/file

       Note, however, that if you want to supply a file name beginning with  ~
       as  data  in  a  shell  command,  and have the shell expand ~ to a home
       directory, you must separate the file name from the option, because the
       shell  does not treat ~ specially unless it is at the start of an item.

       The exception to the above is the --colour  (or  --color)  option,  for
       which  the  data is optional. If this option does have data, it must be
       given in the first form, using an equals character. Otherwise  it  will
       be assumed that it has no data.


MATCHING ERRORS

       It  is  possible  to supply a regular expression that takes a very long
       time to fail to match certain lines.  Such  patterns  normally  involve
       nested  indefinite repeats, for example: (a+)*\d when matched against a
       line of a's with no final digit.  The  PCRE  matching  function  has  a
       resource  limit that causes it to abort in these circumstances. If this
       happens, pcregrep outputs an error message and the line that caused the
       problem  to  the  standard error stream. If there are more than 20 such
       errors, pcregrep gives up.


DIAGNOSTICS

       Exit status is 0 if any matches were found, 1 if no matches were found,
       and  2 for syntax errors and non-existent or inacessible files (even if
       matches were found in other files) or too many matching  errors.  Using
       the  -s  option to suppress error messages about inaccessble files does
       not affect the return code.


AUTHOR

       Philip Hazel
       University Computing Service
       Cambridge CB2 3QG, England.

Last updated: 23 January 2006
Copyright (c) 1997-2006 University of Cambridge.
