1 \input texinfo @c -*-texinfo-*-
5 @settitle GNU Grep @value{VERSION}
17 This manual is for @command{grep}, a pattern matching engine.
19 Copyright @copyright{} 1999-2002, 2005, 2008-2011 Free Software Foundation, Inc.
22 Permission is granted to copy, distribute and/or modify this document
23 under the terms of the GNU Free Documentation License, Version 1.3 or
24 any later version published by the Free Software Foundation; with no
25 Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
26 Texts. A copy of the license is included in the section entitled
27 ``GNU Free Documentation License''.
31 @dircategory Text creation and manipulation
33 * grep: (grep). Print lines matching a pattern.
37 @title GNU Grep: Print lines matching a pattern
38 @subtitle version @value{VERSION}, @value{UPDATED}
39 @author Alain Magloire et al.
41 @vskip 0pt plus 1filll
52 @command{grep} prints lines that match a pattern.
54 This manual is for version @value{VERSION} of GNU Grep.
60 * Introduction:: Introduction.
61 * Invoking:: Command-line options, environment, exit status.
62 * Regular Expressions:: Regular Expressions.
64 * Reporting Bugs:: Reporting Bugs.
65 * Copying:: License terms for this manual.
66 * Index:: Combined index.
73 @cindex searching for a pattern
75 @command{grep} searches the input files
76 for lines containing a match to a given pattern list.
77 When it finds a match in a line,
78 it copies the line to standard output (by default),
79 or produces whatever other sort of output you have requested with options.
81 Though @command{grep} expects to do the matching on text,
82 it has no limits on input line length other than available memory,
83 and it can match arbitrary characters within a line.
84 If the final byte of an input file is not a newline,
85 @command{grep} silently supplies one.
86 Since newline is also a separator for the list of patterns,
87 there is no way to match newline characters in a text.
91 @chapter Invoking @command{grep}
93 The general synopsis of the @command{grep} command line is
96 grep @var{options} @var{pattern} @var{input_file_names}
100 There can be zero or more @var{options}.
101 @var{pattern} will only be seen as such
102 (and not as an @var{input_file_name})
103 if it wasn't already specified within @var{options}
104 (by using the @samp{-e@ @var{pattern}}
105 or @samp{-f@ @var{file}} options).
106 There can be zero or more @var{input_file_names}.
109 * Command-line Options:: Short and long names, grouped by category.
110 * Environment Variables:: POSIX, GNU generic, and GNU grep specific.
111 * Exit Status:: Exit status returned by @command{grep}.
112 * grep Programs:: @command{grep} programs.
115 @node Command-line Options
116 @section Command-line Options
118 @command{grep} comes with a rich set of options:
119 some from @sc{posix.2} and some being @sc{gnu} extensions.
120 Long option names are always a @sc{gnu} extension,
121 even for options that are from @sc{posix} specifications.
122 Options that are specified by @sc{posix},
123 under their short names,
124 are explicitly marked as such
125 to facilitate @sc{posix}-portable programming.
126 A few option names are provided
127 for compatibility with older or more exotic implementations.
130 * Generic Program Information::
132 * General Output Control::
133 * Output Line Prefix Control::
134 * Context Line Control::
135 * File and Directory Selection::
139 Several additional options control
140 which variant of the @command{grep} matching engine is used.
141 @xref{grep Programs}.
143 @node Generic Program Information
144 @subsection Generic Program Information
150 @cindex usage summary, printing
151 Print a usage message briefly summarizing the command-line options
152 and the bug-reporting address, then exit.
158 @cindex version, printing
159 Print the version number of @command{grep} to the standard output stream.
160 This version number should be included in all bug reports.
164 @node Matching Control
165 @subsection Matching Control
169 @item -e @var{pattern}
170 @itemx --regexp=@var{pattern}
172 @opindex --regexp=@var{pattern}
174 Use @var{pattern} as the pattern.
175 This can be used to specify multiple search patterns,
176 or to protect a pattern beginning with a @samp{-}.
177 (@samp{-e} is specified by @sc{posix}.)
180 @itemx --file=@var{file}
183 @cindex pattern from file
184 Obtain patterns from @var{file}, one per line.
185 The empty file contains zero patterns, and therefore matches nothing.
186 (@samp{-f} is specified by @sc{posix}.)
193 @opindex --ignore-case
194 @cindex case insensitive search
195 Ignore case distinctions in both the pattern and the input files.
196 @samp{-y} is an obsolete synonym that is provided for compatibility.
197 (@samp{-i} is specified by @sc{posix}.)
200 @itemx --invert-match
202 @opindex --invert-match
203 @cindex invert matching
204 @cindex print non-matching lines
205 Invert the sense of matching, to select non-matching lines.
206 (@samp{-v} is specified by @sc{posix}.)
211 @opindex --word-regexp
212 @cindex matching whole words
213 Select only those lines containing matches that form whole words.
214 The test is that the matching substring must either
215 be at the beginning of the line,
216 or preceded by a non-word constituent character.
Similarly, it must be either at the end of the line
219 or followed by a non-word constituent character.
220 Word-constituent characters are letters, digits, and the underscore.
225 @opindex --line-regexp
226 @cindex match the whole line
227 Select only those matches that exactly match the whole line.
228 (@samp{-x} is specified by @sc{posix}.)
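
As a rough illustration of the matching-control options above, the following sketch runs them against a small made-up word list. The sample data and the shell variable names are assumptions chosen for this example; they are not part of @command{grep}.

```shell
# Hypothetical sample input; the contents are illustrative only.
sample=$(printf '%s\n' 'Word' 'words' 'sword' 'a word here' '-dash')

# -i: ignore case, so "Word" matches as well; -c prints the count.
ignore_case_count=$(printf '%s\n' "$sample" | grep -i -c 'word')

# -w: the match must form a whole word.
whole_word=$(printf '%s\n' "$sample" | grep -w 'word')

# -x: the match must span the whole line.
whole_line=$(printf '%s\n' "$sample" | grep -x 'words')

# -e: protects a pattern that begins with "-" and allows several patterns.
dash_or_sword=$(printf '%s\n' "$sample" | grep -e '-dash' -e 'sword')

# -v: select the lines that do not match.
inverted=$(printf '%s\n' "$sample" | grep -v -i 'word')

printf '%s\n' "$ignore_case_count" "$whole_word" "$whole_line" \
  "$dash_or_sword" "$inverted"
```

Only @samp{a word here} survives @samp{-w}, because in @samp{words} and @samp{sword} the match is adjacent to another word-constituent character.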
232 @node General Output Control
233 @subsection General Output Control
241 @cindex counting lines
242 Suppress normal output;
243 instead print a count of matching lines for each input file.
With the @samp{-v} or @samp{--invert-match} option,
245 count non-matching lines.
246 (@samp{-c} is specified by @sc{posix}.)
248 @item --color[=@var{WHEN}]
249 @itemx --colour[=@var{WHEN}]
252 @cindex highlight, color, colour
253 Surround the matched (non-empty) strings, matching lines, context lines,
254 file names, line numbers, byte offsets, and separators (for fields and
groups of context lines) with escape sequences to display them in color
on the terminal.
257 The colors are defined by the environment variable @var{GREP_COLORS}
258 and default to @samp{ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36}
259 for bold red matched text, magenta file names, green line numbers,
260 green byte offsets, cyan separators, and default terminal colors otherwise.
261 The deprecated environment variable @var{GREP_COLOR} is still supported,
262 but its setting does not have priority;
it defaults to @samp{01;31} (bold red)
264 which only covers the color for matched text.
265 @var{WHEN} is @samp{never}, @samp{always}, or @samp{auto}.
268 @itemx --files-without-match
270 @opindex --files-without-match
271 @cindex files which don't match
272 Suppress normal output;
273 instead print the name of each input file from which
274 no output would normally have been printed.
275 The scanning of every file will stop on the first match.
278 @itemx --files-with-matches
280 @opindex --files-with-matches
281 @cindex names of matching files
282 Suppress normal output;
283 instead print the name of each input file from which
284 output would normally have been printed.
285 The scanning of every file will stop on the first match.
286 (@samp{-l} is specified by @sc{posix}.)
289 @itemx --max-count=@var{num}
293 Stop reading a file after @var{num} matching lines.
294 If the input is standard input from a regular file,
295 and @var{num} matching lines are output,
296 @command{grep} ensures that the standard input is positioned
297 just after the last matching line before exiting,
298 regardless of the presence of trailing context lines.
299 This enables a calling process to resume a search.
For example, the following shell script makes use of it:

@example
while grep -m 1 PATTERN; do echo xxxx; done < FILE
@end example

But the following probably will not work because a pipe is not a regular
file:

@example
# This probably will not work.
cat FILE | while grep -m 1 PATTERN; do echo xxxx; done
@end example

321 When @command{grep} stops after @var{num} matching lines,
322 it outputs any trailing context lines.
323 Since context does not include matching lines,
324 @command{grep} will stop when it encounters another matching line.
325 When the @samp{-c} or @samp{--count} option is also used,
326 @command{grep} does not output a count greater than @var{num}.
327 When the @samp{-v} or @samp{--invert-match} option is also used,
328 @command{grep} stops after outputting @var{num} non-matching lines.
331 @itemx --only-matching
333 @opindex --only-matching
334 @cindex only matching
335 Print only the matched (non-empty) parts of matching lines,
336 with each such part on a separate output line.
344 @cindex quiet, silent
345 Quiet; do not write anything to standard output.
346 Exit immediately with zero status if any match is found,
347 even if an error was detected.
348 Also see the @samp{-s} or @samp{--no-messages} option.
349 (@samp{-q} is specified by @sc{posix}.)
354 @opindex --no-messages
355 @cindex suppress error messages
356 Suppress error messages about nonexistent or unreadable files.
Portability note: unlike @sc{gnu} @command{grep},
359 7th Edition Unix @command{grep} did not conform to @sc{posix},
360 because it lacked @samp{-q}
361 and its @samp{-s} option behaved like
362 @sc{gnu} @command{grep}'s @samp{-q} option.
363 @sc{usg}-style @command{grep} also lacked @samp{-q}
364 but its @samp{-s} option behaved like @sc{gnu} @command{grep}'s.
365 Portable shell scripts should avoid both
366 @samp{-q} and @samp{-s} and should redirect
367 standard and error output to @file{/dev/null} instead.
368 (@samp{-s} is specified by @sc{posix}.)
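
The output-control options above can be sketched as follows. The scratch directory, the file names @file{a.log} and @file{b.log}, and their contents are hypothetical and exist only for this example; the last step shows the portable redirection recommended in place of @samp{-q} and @samp{-s}.

```shell
# Hypothetical log files in a scratch directory; names and contents are made up.
dir=$(mktemp -d)
printf '%s\n' 'error: disk' 'ok' 'error: net' > "$dir/a.log"
printf '%s\n' 'ok' 'ok' > "$dir/b.log"

# -c: count matching lines instead of printing them.
count=$(grep -c 'error' "$dir/a.log")

# -l and -L: list files with, or without, at least one match.
with_match=$(cd "$dir" && grep -l 'error' a.log b.log)
without_match=$(cd "$dir" && grep -L 'error' a.log b.log)

# -m 1: stop reading after the first matching line.
first=$(grep -m 1 'error' "$dir/a.log")

# -o: print only the matched part, one match per output line.
only=$(grep -o 'error' "$dir/a.log")

# Portable alternative to -q and -s: discard both output streams and
# rely on the exit status alone.
if grep 'error' "$dir/a.log" >/dev/null 2>&1; then
  found=yes
else
  found=no
fi

rm -rf "$dir"
printf '%s\n' "$count" "$with_match" "$without_match" "$first" "$found"
```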
372 @node Output Line Prefix Control
373 @subsection Output Line Prefix Control
375 When several prefix fields are to be output,
376 the order is always file name, line number, and byte offset,
377 regardless of the order in which these options were specified.
384 @opindex --byte-offset
386 Print the 0-based byte offset within the input file
387 before each line of output.
388 If @samp{-o} (@samp{--only-matching}) is specified,
389 print the offset of the matching part itself.
390 When @command{grep} runs on @sc{ms-dos} or @sc{ms}-Windows,
391 the printed byte offsets depend on whether
the @samp{-u} (@samp{--unix-byte-offsets}) option is used; see below.
396 @itemx --with-filename
398 @opindex --with-filename
399 @cindex with filename prefix
400 Print the file name for each match.
401 This is the default when there is more than one file to search.
406 @opindex --no-filename
407 @cindex no filename prefix
408 Suppress the prefixing of file names on output.
409 This is the default when there is only one file
410 (or only standard input) to search.
412 @item --label=@var{LABEL}
414 @cindex changing name of standard input
415 Display input actually coming from standard input
416 as input coming from file @var{LABEL}. This is
417 especially useful when implementing tools like
418 @command{zgrep}; e.g.:
421 gzip -cd foo.gz | grep --label=foo -H something
427 @opindex --line-number
428 @cindex line numbering
429 Prefix each line of output with the 1-based line number within its input file.
430 (@samp{-n} is specified by @sc{posix}.)
435 @opindex --initial-tab
436 @cindex tab-aligned content lines
437 Make sure that the first character of actual line content lies on a tab stop,
438 so that the alignment of tabs looks normal.
439 This is useful with options that prefix their output to the actual content:
440 @samp{-H}, @samp{-n}, and @samp{-b}.
441 In order to improve the probability that lines
442 from a single file will all start at the same column,
443 this also causes the line number and byte offset (if present)
444 to be printed in a minimum-size field width.
447 @itemx --unix-byte-offsets
449 @opindex --unix-byte-offsets
450 @cindex @sc{ms-dos}/@sc{ms}-Windows byte offsets
451 @cindex byte offsets, on @sc{ms-dos}/@sc{ms}-Windows
452 Report Unix-style byte offsets.
453 This option causes @command{grep} to report byte offsets
454 as if the file were a Unix-style text file,
455 i.e., the byte offsets ignore the @code{CR} characters that were stripped.
456 This will produce results identical
457 to running @command{grep} on a Unix machine.
458 This option has no effect unless the @samp{-b} option is also used;
459 it has no effect on platforms other than @sc{ms-dos} and @sc{ms}-Windows.
465 @cindex zero-terminated file names
466 Output a zero byte (the @sc{ascii} @code{NUL} character)
467 instead of the character that normally follows a file name.
469 @samp{grep -lZ} outputs a zero byte after each file name
470 instead of the usual newline.
471 This option makes the output unambiguous,
472 even in the presence of file names containing unusual characters like newlines.
473 This option can be used with commands like
474 @samp{find -print0}, @samp{perl -0}, @samp{sort -z}, and @samp{xargs -0}
475 to process arbitrary file names,
476 even those that contain newline characters.
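
The prefix options above combine as in the following sketch. The three-line input, the label @samp{sample}, and the variable names are assumptions made for illustration only.

```shell
# Hypothetical three-line input.
input=$(printf '%s\n' 'alpha' 'beta' 'gamma')

# -n: 1-based line number prefix.
numbered=$(printf '%s\n' "$input" | grep -n 'mm')

# -b: 0-based byte offset of the start of the line
# ("alpha\n" is 6 bytes and "beta\n" is 5, so "gamma" starts at 11).
offset=$(printf '%s\n' "$input" | grep -b 'mm')

# -b with -o: offset of the matching part itself.
match_offset=$(printf '%s\n' "$input" | grep -b -o 'mm')

# Fields appear as file name, line number, byte offset,
# regardless of the order of the options.
labelled=$(printf '%s\n' "$input" | grep -b -n -H --label=sample 'mm')

# -Z with -l: the file name is followed by a NUL byte;
# tr turns it into a newline so that it can be captured here.
file=$(mktemp)
printf 'gamma\n' > "$file"
names=$(grep -lZ 'mm' "$file" | tr '\0' '\n')
rm -f "$file"

printf '%s\n' "$numbered" "$offset" "$match_offset" "$labelled"
```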
480 @node Context Line Control
481 @subsection Context Line Control
483 Regardless of how these options are set,
484 @command{grep} will never print any given line more than once.
485 If the @samp{-o} or @samp{--only-matching} option is specified,
486 these options have no effect and a warning is given upon their use.
491 @itemx --after-context=@var{num}
493 @opindex --after-context
494 @cindex after context
495 @cindex context lines, after match
496 Print @var{num} lines of trailing context after matching lines.
499 @itemx --before-context=@var{num}
501 @opindex --before-context
502 @cindex before context
503 @cindex context lines, before match
504 Print @var{num} lines of leading context before matching lines.
508 @itemx --context=@var{num}
513 Print @var{num} lines of leading and trailing output context.
515 @item --group-separator=@var{string}
516 @opindex --group-separator
517 @cindex group separator
518 When @option{-A}, @option{-B} or @option{-C} are in use,
print @var{string} instead of @samp{--} around disjoint groups
of lines.
522 @item --no-group-separator
@opindex --no-group-separator
524 @cindex group separator
525 When @option{-A}, @option{-B} or @option{-C} are in use,
526 print disjoint groups of lines adjacent to each other.
530 Matching lines normally use @samp{:} as a separator
531 between prefix fields and actual line content.
532 Context (i.e., non-matching) lines use @samp{-} instead.
533 When no context is specified,
534 matching lines are simply output one right after another.
535 When nonzero context is specified,
536 lines that are adjacent in the input form a group
537 and are output one right after another, while
a separator appears by default between disjoint groups, on a line
of its own and without any prefix. The default separator
is @samp{--}; the options above can change its appearance or suppress it.
A group may contain several matching lines: when matches are close enough
that their context lines touch or overlap, the groups that would otherwise
be separate merge into a single contiguous one.
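
The grouping rules above can be observed with a short experiment. The nine-line input and the separator string @samp{***} are arbitrary choices for this sketch.

```shell
# Hypothetical input with two matches, on lines 3 and 8.
lines=$(printf '%s\n' one two MATCH four five six seven MATCH nine)

# One line of trailing context: two disjoint groups separated by "--".
after=$(printf '%s\n' "$lines" | grep -n -A 1 'MATCH')

# Same groups, with a custom separator and with no separator at all.
custom=$(printf '%s\n' "$lines" | grep -n -A 1 --group-separator='***' 'MATCH')
plain=$(printf '%s\n' "$lines" | grep -n -A 1 --no-group-separator 'MATCH')

# Two lines of context on each side: the groups become adjacent
# (lines 1-5 and 6-9), so they merge and no separator is printed.
merged=$(printf '%s\n' "$lines" | grep -n -C 2 'MATCH')

printf '%s\n' "$after"
```

In the captured output, matching lines carry @samp{:} after the line number and context lines carry @samp{-}.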
546 @node File and Directory Selection
547 @subsection File and Directory Selection
555 @cindex suppress binary data
557 Process a binary file as if it were text;
558 this is equivalent to the @samp{--binary-files=text} option.
560 @itemx --binary-files=@var{type}
561 @opindex --binary-files
563 If the first few bytes of a file indicate that the file contains binary data,
564 assume that the file is of type @var{type}.
565 By default, @var{type} is @samp{binary},
566 and @command{grep} normally outputs either
567 a one-line message saying that a binary file matches,
568 or no message if there is no match.
569 If @var{type} is @samp{without-match},
570 @command{grep} assumes that a binary file does not match;
571 this is equivalent to the @samp{-I} option.
572 If @var{type} is @samp{text},
573 @command{grep} processes a binary file as if it were text;
574 this is equivalent to the @samp{-a} option.
575 @emph{Warning:} @samp{--binary-files=text} might output binary garbage,
576 which can have nasty side effects
577 if the output is a terminal and
578 if the terminal driver interprets some of it as commands.
580 @item -D @var{action}
581 @itemx --devices=@var{action}
584 @cindex device search
585 If an input file is a device, FIFO, or socket, use @var{action} to process it.
586 By default, @var{action} is @samp{read},
587 which means that devices are read just as if they were ordinary files.
588 If @var{action} is @samp{skip},
589 devices, FIFOs, and sockets are silently skipped.
591 @item -d @var{action}
592 @itemx --directories=@var{action}
594 @opindex --directories
595 @cindex directory search
596 If an input file is a directory, use @var{action} to process it.
597 By default, @var{action} is @samp{read},
598 which means that directories are read just as if they were ordinary files
599 (some operating systems and file systems disallow this,
600 and will cause @command{grep}
601 to print error messages for every directory or silently skip them).
602 If @var{action} is @samp{skip}, directories are silently skipped.
603 If @var{action} is @samp{recurse},
604 @command{grep} reads all files under each directory, recursively;
605 this is equivalent to the @samp{-r} option.
607 @item --exclude=@var{glob}
609 @cindex exclude files
610 @cindex searching directory trees
611 Skip files whose base name matches @var{glob}
612 (using wildcard matching).
613 A file-name glob can use
614 @samp{*}, @samp{?}, and @samp{[}...@samp{]} as wildcards,
615 and @code{\} to quote a wildcard or backslash character literally.
617 @item --exclude-from=@var{file}
618 @opindex --exclude-from
619 @cindex exclude files
620 @cindex searching directory trees
621 Skip files whose base name matches any of the file-name globs
622 read from @var{file} (using wildcard matching as described
623 under @samp{--exclude}).
625 @item --exclude-dir=@var{dir}
626 @opindex --exclude-dir
627 @cindex exclude directories
Exclude directories matching the pattern @var{dir} from recursive
directory searches.
632 Process a binary file as if it did not contain matching data;
633 this is equivalent to the @samp{--binary-files=without-match} option.
635 @item --include=@var{glob}
637 @cindex include files
638 @cindex searching directory trees
639 Search only files whose base name matches @var{glob}
640 (using wildcard matching as described under @samp{--exclude}).
647 @cindex recursive search
648 @cindex searching directory trees
649 For each directory mentioned on the command line,
650 read and process all files in that directory, recursively.
651 This is the same as the @samp{--directories=recurse} option.
656 @opindex --only-files
657 @cindex ignoring special files
658 @cindex ignoring symlinked directories
659 Ignore all special files, except for symlinks.
660 When recursing into directories, ignore symlinked directories as well.
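
A minimal sketch of recursive searching with file and directory filters follows. The directory layout (@file{src}, @file{build}) and the search string @samp{needle} are hypothetical.

```shell
# Hypothetical source tree in a scratch directory.
tree=$(mktemp -d)
mkdir -p "$tree/src" "$tree/build"
printf 'needle\n' > "$tree/src/main.c"
printf 'needle\n' > "$tree/src/notes.txt"
printf 'needle\n' > "$tree/build/main.c"

# --include: search only files whose base name matches the glob.
c_only=$(cd "$tree" && grep -r -l --include='*.c' 'needle' . | sort)

# --exclude-dir: do not descend into directories matching the pattern.
no_build=$(cd "$tree" && grep -r -l --exclude-dir=build 'needle' . | sort)

# --exclude combined with --exclude-dir.
src_code=$(cd "$tree" &&
  grep -r -l --exclude='*.txt' --exclude-dir=build 'needle' .)

rm -rf "$tree"
printf '%s\n' "$c_only" "$no_build" "$src_code"
```

The globs are quoted so that the shell passes them to @command{grep} unexpanded.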
665 @subsection Other Options
669 @item --line-buffered
670 @opindex --line-buffered
671 @cindex line buffering
672 Use line buffering on output.
673 This can cause a performance penalty.
677 @cindex memory mapped input
678 This option is ignored for backwards compatibility. It used to read
679 input with the @code{mmap} system call, instead of the default @code{read}
system call. On modern systems, @code{mmap} would rarely if ever yield
better performance.
687 @cindex @sc{ms-dos}/@sc{ms}-Windows binary files
688 @cindex binary files, @sc{ms-dos}/@sc{ms}-Windows
689 Treat the file(s) as binary.
690 By default, under @sc{ms-dos} and @sc{ms}-Windows,
691 @command{grep} guesses the file type
692 by looking at the contents of the first 32kB read from the file.
693 If @command{grep} decides the file is a text file,
694 it strips the @code{CR} characters from the original file contents
695 (to make regular expressions with @code{^} and @code{$} work correctly).
696 Specifying @samp{-U} overrules this guesswork,
697 causing all files to be read and passed to the matching mechanism verbatim;
698 if the file is a text file with @code{CR/LF} pairs at the end of each line,
699 this will cause some regular expressions to fail.
700 This option has no effect
701 on platforms other than @sc{ms-dos} and @sc{ms}-Windows.
707 @cindex zero-terminated lines
708 Treat the input as a set of lines, each terminated by a zero byte (the
709 @sc{ascii} @code{NUL} character) instead of a newline.
710 Like the @samp{-Z} or @samp{--null} option,
711 this option can be used with commands like
712 @samp{sort -z} to process arbitrary file names.
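
As a small sketch of zero-terminated input, the following feeds @command{grep} two made-up records separated by NUL bytes, the first of which contains an embedded newline. With @samp{-z}, @sc{gnu} @command{grep} also terminates each output line with a zero byte, so @command{tr} is used here to make the result printable.

```shell
# Two hypothetical NUL-terminated records; the first spans two text lines.
selected=$(printf 'first\nrecord\0second record\0' |
  grep -z 'second' | tr '\0' '\n')

# -c counts records, not newline-terminated lines.
record_count=$(printf 'first\nrecord\0second record\0' | grep -z -c 'record')

printf '%s\n' "$selected" "$record_count"
```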
716 @node Environment Variables
717 @section Environment Variables
719 The behavior of @command{grep} is affected
720 by the following environment variables.
722 The locale for category @w{@code{LC_@var{foo}}}
723 is specified by examining the three environment variables
@env{LC_ALL}, @w{@env{LC_@var{foo}}}, and @env{LANG},
in that order.
726 The first of these variables that is set specifies the locale.
727 For example, if @env{LC_ALL} is not set,
728 but @env{LC_MESSAGES} is set to @samp{pt_BR},
729 then the Brazilian Portuguese locale is used
730 for the @code{LC_MESSAGES} category.
731 The @samp{C} locale is used if none of these environment variables are set,
732 if the locale catalog is not installed,
733 or if @command{grep} was not compiled
734 with national language support (@sc{nls}).
736 @cindex environment variables
741 @vindex GREP_OPTIONS @r{environment variable}
742 @cindex default options environment variable
This variable specifies default options to be placed in front of any
explicit options.
745 For example, if @code{GREP_OPTIONS} is
746 @samp{--binary-files=without-match --directories=skip}, @command{grep}
747 behaves as if the two options @samp{--binary-files=without-match} and
748 @samp{--directories=skip} had been specified before
749 any explicit options.
Option specifications are separated by whitespace.
752 A backslash escapes the next character, so it can be used to
753 specify an option containing whitespace or a backslash.
756 @vindex GREP_COLOR @r{environment variable}
757 @cindex highlight markers
758 This variable specifies the color used to highlight matched (non-empty) text.
759 It is deprecated in favor of @code{GREP_COLORS}, but still supported.
760 The @samp{mt}, @samp{ms}, and @samp{mc} capabilities of @code{GREP_COLORS}
761 have priority over it.
762 It can only specify the color used to highlight
763 the matching non-empty text in any matching line
764 (a selected line when the @samp{-v} command-line option is omitted,
765 or a context line when @samp{-v} is specified).
766 The default is @samp{01;31},
767 which means a bold red foreground text on the terminal's default background.
770 @vindex GREP_COLORS @r{environment variable}
771 @cindex highlight markers
772 This variable specifies the colors and other attributes
773 used to highlight various parts of the output.
774 Its value is a colon-separated list of capabilities
775 that defaults to @samp{ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36}
776 with the @samp{rv} and @samp{ne} boolean capabilities omitted (i.e., false).
777 Supported capabilities are as follows.
781 @vindex sl GREP_COLORS @r{capability}
SGR substring for whole selected lines (i.e.,
784 matching lines when the @samp{-v} command-line option is omitted,
785 or non-matching lines when @samp{-v} is specified).
786 If however the boolean @samp{rv} capability
787 and the @samp{-v} command-line option are both specified,
788 it applies to context matching lines instead.
789 The default is empty (i.e., the terminal's default color pair).
792 @vindex cx GREP_COLORS @r{capability}
SGR substring for whole context lines (i.e.,
795 non-matching lines when the @samp{-v} command-line option is omitted,
796 or matching lines when @samp{-v} is specified).
797 If however the boolean @samp{rv} capability
798 and the @samp{-v} command-line option are both specified,
799 it applies to selected non-matching lines instead.
800 The default is empty (i.e., the terminal's default color pair).
803 @vindex rv GREP_COLORS @r{capability}
804 Boolean value that reverses (swaps) the meanings of
805 the @samp{sl=} and @samp{cx=} capabilities
806 when the @samp{-v} command-line option is specified.
807 The default is false (i.e., the capability is omitted).
810 @vindex mt GREP_COLORS @r{capability}
SGR substring for matching non-empty text in any matching line (i.e.,
813 a selected line when the @samp{-v} command-line option is omitted,
814 or a context line when @samp{-v} is specified).
815 Setting this is equivalent to setting both @samp{ms=} and @samp{mc=}
816 at once to the same value.
817 The default is a bold red text foreground over the current line background.
820 @vindex ms GREP_COLORS @r{capability}
821 SGR substring for matching non-empty text in a selected line.
822 (This is only used when the @samp{-v} command-line option is omitted.)
823 The effect of the @samp{sl=} (or @samp{cx=} if @samp{rv}) capability
824 remains active when this kicks in.
825 The default is a bold red text foreground over the current line background.
828 @vindex mc GREP_COLORS @r{capability}
829 SGR substring for matching non-empty text in a context line.
830 (This is only used when the @samp{-v} command-line option is specified.)
831 The effect of the @samp{cx=} (or @samp{sl=} if @samp{rv}) capability
832 remains active when this kicks in.
833 The default is a bold red text foreground over the current line background.
836 @vindex fn GREP_COLORS @r{capability}
837 SGR substring for file names prefixing any content line.
838 The default is a magenta text foreground over the terminal's default background.
841 @vindex ln GREP_COLORS @r{capability}
842 SGR substring for line numbers prefixing any content line.
843 The default is a green text foreground over the terminal's default background.
846 @vindex bn GREP_COLORS @r{capability}
847 SGR substring for byte offsets prefixing any content line.
848 The default is a green text foreground over the terminal's default background.
@vindex se GREP_COLORS @r{capability}
852 SGR substring for separators that are inserted
853 between selected line fields (@samp{:}),
854 between context line fields (@samp{-}),
855 and between groups of adjacent lines
856 when nonzero context is specified (@samp{--}).
857 The default is a cyan text foreground over the terminal's default background.
860 @vindex ne GREP_COLORS @r{capability}
861 Boolean value that prevents clearing to the end of line
862 using Erase in Line (EL) to Right (@samp{\33[K})
863 each time a colorized item ends.
864 This is needed on terminals on which EL is not supported.
865 It is otherwise useful on terminals
866 for which the @code{back_color_erase}
867 (@code{bce}) boolean terminfo capability does not apply,
868 when the chosen highlight colors do not affect the background,
869 or when EL is too slow or causes too much flicker.
870 The default is false (i.e., the capability is omitted).
873 Note that boolean capabilities have no @samp{=}... part.
874 They are omitted (i.e., false) by default and become true when specified.
876 See the Select Graphic Rendition (SGR) section
877 in the documentation of your text terminal
878 for permitted values and their meaning as character attributes.
879 These substring values are integers in decimal representation
880 and can be concatenated with semicolons.
881 @command{grep} takes care of assembling the result
882 into a complete SGR sequence (@samp{\33[}...@samp{m}).
883 Common values to concatenate include
885 @samp{4} for underline,
887 @samp{7} for inverse,
888 @samp{39} for default foreground color,
889 @samp{30} to @samp{37} for foreground colors,
890 @samp{90} to @samp{97} for 16-color mode foreground colors,
891 @samp{38;5;0} to @samp{38;5;255}
892 for 88-color and 256-color modes foreground colors,
893 @samp{49} for default background color,
894 @samp{40} to @samp{47} for background colors,
895 @samp{100} to @samp{107} for 16-color mode background colors,
896 and @samp{48;5;0} to @samp{48;5;255}
897 for 88-color and 256-color modes background colors.
902 @vindex LC_ALL @r{environment variable}
903 @vindex LC_COLLATE @r{environment variable}
904 @vindex LANG @r{environment variable}
905 @cindex character type
906 @cindex national language support
908 These variables specify the locale for the @code{LC_COLLATE} category,
909 which determines the collating sequence
910 used to interpret range expressions like @samp{[a-z]}.
915 @vindex LC_ALL @r{environment variable}
916 @vindex LC_CTYPE @r{environment variable}
917 @vindex LANG @r{environment variable}
918 These variables specify the locale for the @code{LC_CTYPE} category,
919 which determines the type of characters,
920 e.g., which characters are whitespace.
925 @vindex LC_ALL @r{environment variable}
926 @vindex LC_MESSAGES @r{environment variable}
927 @vindex LANG @r{environment variable}
928 @cindex language of messages
929 @cindex message language
930 @cindex national language support
931 @cindex translation of message language
932 These variables specify the locale for the @code{LC_MESSAGES} category,
933 which determines the language that @command{grep} uses for messages.
934 The default @samp{C} locale uses American English messages.
936 @item POSIXLY_CORRECT
937 @vindex POSIXLY_CORRECT @r{environment variable}
938 If set, @command{grep} behaves as @sc{posix.2} requires; otherwise,
939 @command{grep} behaves more like other @sc{gnu} programs.
@sc{posix.2} requires that options that
follow file names must be treated as file names;
by default,
such options are permuted to the front of the operand list
and are treated as options.
946 Also, @code{POSIXLY_CORRECT} disables special handling of an
947 invalid bracket expression. @xref{invalid-bracket-expr}.
949 @item _@var{N}_GNU_nonoption_argv_flags_
950 @vindex _@var{N}_GNU_nonoption_argv_flags_ @r{environment variable}
951 (Here @code{@var{N}} is @command{grep}'s numeric process ID.)
952 If the @var{i}th character of this environment variable's value is @samp{1},
953 do not consider the @var{i}th operand of @command{grep} to be an option,
954 even if it appears to be one.
955 A shell can put this variable in the environment for each command it runs,
956 specifying which operands are the results of file name wildcard expansion
957 and therefore should not be treated as options.
958 This behavior is available only with the @sc{gnu} C library,
959 and only when @code{POSIXLY_CORRECT} is not set.
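
Two of the environment variables described in this section can be tried out as follows. This is only a sketch: the color value @samp{01;32} (bold green) and the sample text are arbitrary, and the check looks only for the start of the SGR sequence rather than assuming the exact bytes that @command{grep} emits.

```shell
esc=$(printf '\033')

# Highlight matched text in bold green instead of the default bold red.
colored=$(printf 'hello world\n' |
  GREP_COLORS='mt=01;32' grep --color=always 'world')

# Look for the SGR sequence assembled from the mt= capability.
case $colored in
  *"${esc}[01;32m"*) highlight=custom ;;
  *) highlight=other ;;
esac

# LC_ALL takes precedence over LC_COLLATE and LANG; in the C locale
# the range [a-z] covers only the ASCII lowercase letters.
ascii_lower=$(printf '%s\n' 'abc' 'ABC' | LC_ALL=C grep '^[a-z]*$')

printf '%s\n' "$highlight" "$ascii_lower"
```

Setting the variables on the command line of a single invocation, as here, avoids changing the environment of the rest of the script.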
967 @cindex return status
969 Normally, the exit status is 0 if selected lines are found and 1 otherwise.
970 But the exit status is 2 if an error occurred, unless the @option{-q} or
@option{--quiet} or @option{--silent} option is used and a selected line
is found.
973 Note, however, that @sc{posix} only mandates,
974 for programs such as @command{grep}, @command{cmp}, and @command{diff},
975 that the exit status in case of error be greater than 1;
976 it is therefore advisable, for the sake of portability,
977 to use logic that tests for this general condition
978 instead of strict equality with@ 2.
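
The portability advice above might be followed as in this sketch, which treats any status greater than 1 as an error instead of comparing against 2. The function name @code{classify} and the temporary file are hypothetical.

```shell
# classify PATTERN FILE: report how grep's exit status should be read.
classify() {
  grep -e "$1" -- "$2" >/dev/null 2>&1
  status=$?
  if [ "$status" -eq 0 ]; then
    echo match
  elif [ "$status" -eq 1 ]; then
    echo no-match
  else
    echo error    # any status greater than 1
  fi
}

file=$(mktemp)
printf 'hello\n' > "$file"

hit=$(classify 'hello' "$file")
miss=$(classify 'absent' "$file")
failed=$(classify 'hello' "$file.does-not-exist")

rm -f "$file"
printf '%s\n' "$hit" "$miss" "$failed"
```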
982 @section @command{grep} Programs
983 @cindex @command{grep} programs
@cindex variants of @command{grep}
986 @command{grep} searches the named input files
987 (or standard input if no files are named,
988 or the file name @file{-} is given)
989 for lines containing a match to the given pattern.
990 By default, @command{grep} prints the matching lines.
991 There are four major variants of @command{grep},
992 controlled by the following options.
997 @itemx --basic-regexp
999 @opindex --basic-regexp
1000 @cindex matching basic regular expressions
1001 Interpret the pattern as a basic regular expression (BRE).
1002 This is the default.
1005 @itemx --extended-regexp
1007 @opindex --extended-regexp
1008 @cindex matching extended regular expressions
1009 Interpret the pattern as an extended regular expression (ERE).
1010 (@samp{-E} is specified by @sc{posix}.)
1013 @itemx --fixed-strings
1015 @opindex --fixed-strings
1016 @cindex matching fixed strings
1017 Interpret the pattern as a list of fixed strings, separated
1018 by newlines, any of which is to be matched.
1019 (@samp{-F} is specified by @sc{posix}.)
1022 @itemx --perl-regexp
1024 @opindex --perl-regexp
1025 @cindex matching Perl regular expressions
1026 Interpret the pattern as a Perl regular expression.
1027 This is highly experimental and
1028 @samp{grep@ -P} may warn of unimplemented features.
In addition,
two variant programs @command{egrep} and @command{fgrep} are available.
1034 @command{egrep} is the same as @samp{grep@ -E}.
1035 @command{fgrep} is the same as @samp{grep@ -F}.
1036 Direct invocation as either
1037 @command{egrep} or @command{fgrep} is deprecated,
1038 but is provided to allow historical applications
1039 that rely on them to run unmodified.
1042 @node Regular Expressions
1043 @chapter Regular Expressions
1044 @cindex regular expressions
1046 A @dfn{regular expression} is a pattern that describes a set of strings.
1047 Regular expressions are constructed analogously to arithmetic expressions,
1048 by using various operators to combine smaller expressions.
1049 @command{grep} understands
1050 three different versions of regular expression syntax:
``basic'' (BRE), ``extended'' (ERE), and ``perl''.
1052 In @sc{gnu} @command{grep},
there is no difference in available functionality between basic and
extended syntaxes.
1055 In other implementations, basic regular expressions are less powerful.
1056 The following description applies to extended regular expressions;
1057 differences for basic regular expressions are summarized afterwards.
1058 Perl regular expressions give additional functionality, and are
1059 documented in pcresyntax(3) and pcrepattern(3), but may not be
1060 available on every system.
1063 * Fundamental Structure::
1064 * Character Classes and Bracket Expressions::
1065 * The Backslash Character and Special Expressions::
1067 * Back-references and Subexpressions::
1068 * Basic vs Extended::
1071 @node Fundamental Structure
1072 @section Fundamental Structure
The fundamental building blocks are the regular expressions that match
a single character.
1076 Most characters, including all letters and digits,
1077 are regular expressions that match themselves.
Any meta-character
with special meaning may be quoted by preceding it with a backslash.
1081 A regular expression may be followed by one of several
1082 repetition operators:
1090 The period @samp{.} matches any single character.
1094 @cindex question mark
1095 @cindex match expression at most once
1096 The preceding item is optional and will be matched at most once.
1101 @cindex match expression zero or more times
1102 The preceding item will be matched zero or more times.
1107 @cindex match expression one or more times
1108 The preceding item will be matched one or more times.
1111 @opindex @{@var{n}@}
1112 @cindex braces, one argument
1113 @cindex match expression @var{n} times
1114 The preceding item is matched exactly @var{n} times.
1117 @opindex @{@var{n},@}
1118 @cindex braces, second argument omitted
1119 @cindex match expression @var{n} or more times
1120 The preceding item is matched @var{n} or more times.
1123 @opindex @{,@var{m}@}
1124 @cindex braces, first argument omitted
1125 @cindex match expression at most @var{m} times
1126 The preceding item is matched at most @var{m} times.
1128 @item @{@var{n},@var{m}@}
1129 @opindex @{@var{n},@var{m}@}
1130 @cindex braces, two arguments
1131 @cindex match expression from @var{n} to @var{m} times
The preceding item is matched at least @var{n} times, but not more than
@var{m} times.
1137 Two regular expressions may be concatenated;
1138 the resulting regular expression
1139 matches any string formed by concatenating two substrings
1140 that respectively match the concatenated expressions.
1142 Two regular expressions may be joined by the infix operator @samp{|};
1143 the resulting regular expression
matches any string matching either alternate expression.
1146 Repetition takes precedence over concatenation,
1147 which in turn takes precedence over alternation.
1148 A whole expression may be enclosed in parentheses
1149 to override these precedence rules and form a subexpression.
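The precedence rules above can be sketched with a few commands.
This is an illustration only: the sample input is made up, and the
@option{-x} option is used so that each pattern must match a whole line.

@example
# Repetition binds tighter than concatenation:
# 'ab*' is 'a' followed by zero or more 'b'.
printf 'a\nabbb\nabab\n' | grep -E -x 'ab*'      # prints a and abbb

# Parentheses apply the repetition to the whole group.
printf 'a\nabbb\nabab\n' | grep -E -x '(ab)*'    # prints abab

# Concatenation binds tighter than alternation:
# 'ab|cd' is either 'ab' or 'cd'.
printf 'ab\ncd\nacd\nabd\n' | grep -E -x 'ab|cd'     # prints ab and cd

# Parentheses restrict the alternation to the middle character.
printf 'ab\ncd\nacd\nabd\n' | grep -E -x 'a(b|c)d'   # prints acd and abd
@end example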
1151 @node Character Classes and Bracket Expressions
1152 @section Character Classes and Bracket Expressions
1154 @cindex bracket expression
1155 @cindex character class
A @dfn{bracket expression} is a list of characters enclosed by @samp{[} and
@samp{]}.
1158 It matches any single character in that list;
1159 if the first character of the list is the caret @samp{^},
1160 then it matches any character @strong{not} in the list.
1161 For example, the regular expression
1162 @samp{[0123456789]} matches any single digit.
1164 @cindex range expression
1165 Within a bracket expression, a @dfn{range expression} consists of two
1166 characters separated by a hyphen.
1167 It matches any single character that
1168 sorts between the two characters, inclusive, using the locale's
1169 collating sequence and character set.
1170 For example, in the default C
1171 locale, @samp{[a-d]} is equivalent to @samp{[abcd]}.
Many locales sort
characters in dictionary order, and in these locales @samp{[a-d]} is
1174 typically not equivalent to @samp{[abcd]};
1175 it might be equivalent to @samp{[aBbCcDd]}, for example.
1176 To obtain the traditional interpretation
1177 of bracket expressions, you can use the @samp{C} locale by setting the
1178 @env{LC_ALL} environment variable to the value @samp{C}.
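As a small sketch of that advice, the following command sets
@env{LC_ALL} for a single invocation only, so the range is interpreted
in the @samp{C} locale regardless of the user's settings.
The sample input is an assumption chosen for illustration.

@example
# In the C locale, [a-d] means exactly a, b, c, and d,
# so the upper-case lines are not selected.
printf 'a\nB\nc\nD\n' | LC_ALL=C grep '[a-d]'    # prints a and c
@end example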
1180 Finally, certain named classes of characters are predefined within
1181 bracket expressions, as follows.
1182 Their interpretation depends on the @code{LC_CTYPE} locale;
1183 the interpretation below is that of the @samp{C} locale,
1184 which is the default if no @code{LC_CTYPE} locale is specified.
1186 @cindex classes of characters
1187 @cindex character classes
1191 @opindex alnum @r{character class}
1192 @cindex alphanumeric characters
1193 Alphanumeric characters:
1194 @samp{[:alpha:]} and @samp{[:digit:]}.
1197 @opindex alpha @r{character class}
1198 @cindex alphabetic characters
1199 Alphabetic characters:
1200 @samp{[:lower:]} and @samp{[:upper:]}.
1203 @opindex blank @r{character class}
1204 @cindex blank characters
1209 @opindex cntrl @r{character class}
1210 @cindex control characters
1212 In @sc{ascii}, these characters have octal codes 000
1213 through 037, and 177 (@code{DEL}).
1214 In other character sets, these are
1215 the equivalent characters, if any.
1218 @opindex digit @r{character class}
1219 @cindex digit characters
1220 @cindex numeric characters
1221 Digits: @code{0 1 2 3 4 5 6 7 8 9}.
1224 @opindex graph @r{character class}
1225 @cindex graphic characters
1226 Graphical characters:
1227 @samp{[:alnum:]} and @samp{[:punct:]}.
1230 @opindex lower @r{character class}
1231 @cindex lower-case letters
1233 @code{a b c d e f g h i j k l m n o p q r s t u v w x y z}.
1236 @opindex print @r{character class}
1237 @cindex printable characters
1238 Printable characters:
1239 @samp{[:alnum:]}, @samp{[:punct:]}, and space.
1242 @opindex punct @r{character class}
1243 @cindex punctuation characters
1244 Punctuation characters:
1245 @code{!@: " # $ % & ' ( ) * + , - .@: / : ; < = > ?@: @@ [ \ ] ^ _ ` @{ | @} ~}.
1248 @opindex space @r{character class}
1249 @cindex space characters
1250 @cindex whitespace characters
1252 tab, newline, vertical tab, form feed, carriage return, and space.
1253 @xref{Usage}, for more discussion of matching newlines.
1256 @opindex upper @r{character class}
1257 @cindex upper-case letters
1259 @code{A B C D E F G H I J K L M N O P Q R S T U V W X Y Z}.
1262 @opindex xdigit @r{character class}
1263 @cindex xdigit class
1264 @cindex hexadecimal digits
1266 @code{0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f}.
1269 For example, @samp{[[:alnum:]]} means @samp{[0-9A-Za-z]}, except the latter
1270 depends upon the @samp{C} locale and the @sc{ascii} character
1271 encoding, whereas the former is independent of locale and character set.
1272 (Note that the brackets in these class names are
1273 part of the symbolic names, and must be included in addition to
1274 the brackets delimiting the bracket expression.)
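The double brackets can be seen in the following sketch, which uses
made-up input and forces the @samp{C} locale so that the results match
the class definitions given above.

@example
# Select lines containing at least one letter or digit.
printf 'abc123\n!!!\n' | LC_ALL=C grep '[[:alnum:]]'       # prints abc123

# Classes can be combined with other list items and repeated.
printf 'A1\nzz\n' | LC_ALL=C grep '[[:upper:]][[:digit:]]' # prints A1

# Match 'x' and 'y' separated by one whitespace character.
printf 'x y\nxy\n' | LC_ALL=C grep 'x[[:space:]]y'         # prints x y
@end example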
1276 @anchor{invalid-bracket-expr}
If you mistakenly omit the outer brackets and search for, say, @samp{[:upper:]},
1278 GNU @command{grep} prints a diagnostic and exits with status 2, on
1279 the assumption that you did not intend to search for the nominally
1280 equivalent regular expression: @samp{[:epru]}.
1281 Set the @code{POSIXLY_CORRECT} environment variable to disable this feature.
1283 Most meta-characters lose their special meaning inside bracket expressions.
1287 ends the bracket expression if it's not the first list item.
1288 So, if you want to make the @samp{]} character a list item,
1289 you must put it first.
1292 represents the open collating symbol.
1295 represents the close collating symbol.
1298 represents the open equivalence class.
1301 represents the close equivalence class.
1304 represents the open character class symbol, and should be followed by a valid character class name.
1307 represents the close character class symbol.
represents the range if it's not first or last in a list or the ending point
of a range.
1314 represents the characters not in the list.
1315 If you want to make the @samp{^}
1316 character a list item, place it anywhere but first.
1320 @node The Backslash Character and Special Expressions
1321 @section The Backslash Character and Special Expressions
1324 The @samp{\} character,
1325 when followed by certain ordinary characters,
1326 takes a special meaning:
1331 Match the empty string at the edge of a word.
1334 Match the empty string provided it's not at the edge of a word.
1337 Match the empty string at the beginning of word.
1340 Match the empty string at the end of word.
Match word constituent; it is a synonym for @samp{[_[:alnum:]]}.
Match non-word constituent; it is a synonym for @samp{[^_[:alnum:]]}.
Match whitespace; it is a synonym for @samp{[[:space:]]}.
Match non-whitespace; it is a synonym for @samp{[^[:space:]]}.
For example, @samp{\brat\b} matches the separate word @samp{rat},
while @samp{\Brat\B} matches @samp{crate} but not @samp{furry rat}.
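The word-edge operators can be tried with the following sketch.
It assumes @sc{gnu} @command{grep}, since these backslash expressions
are extensions, and the three input lines are chosen for illustration.

@example
# 'rat' must stand alone as a word.
printf 'rat\ncrate\nfurry rat\n' | grep '\brat\b'   # prints rat and furry rat

# 'rat' must have word characters on both sides.
printf 'rat\ncrate\nfurry rat\n' | grep '\Brat\B'   # prints crate
@end example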
1363 The caret @samp{^} and the dollar sign @samp{$} are meta-characters that
1364 respectively match the empty string at the beginning and end of a line.
1366 @node Back-references and Subexpressions
1367 @section Back-references and Subexpressions
1368 @cindex subexpression
1369 @cindex back-reference
1371 The back-reference @samp{\@var{n}}, where @var{n} is a single digit, matches
1372 the substring previously matched by the @var{n}th parenthesized subexpression
1373 of the regular expression.
1374 For example, @samp{(a)\1} matches @samp{aa}.
1375 When used with alternation, if the group does not participate in the match then
1376 the back-reference makes the whole match fail.
1377 For example, @samp{a(.)|b\1}
1378 will not match @samp{ba}.
1379 When multiple regular expressions are given with
1380 @samp{-e} or from a file (@samp{-f file}),
1381 back-references are local to each expression.
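A short sketch of these rules follows.
The input strings are assumptions chosen for illustration;
the last command shows that @samp{\1} refers to the group of its own
expression when several expressions are given.

@example
# ERE: the group must be followed by a copy of what it matched.
printf 'abcabc\nabcabd\n' | grep -E '(abc)\1'    # prints abcabc

# BRE: the same pattern with backslashed parentheses.
printf 'abcabc\nabcabd\n' | grep '\(abc\)\1'     # prints abcabc

# Each -e expression has its own first group.
printf 'aa\nbb\nab\n' | grep -E -e '(a)\1' -e '(b)\1'   # prints aa and bb
@end example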
1383 @node Basic vs Extended
1384 @section Basic vs Extended Regular Expressions
1385 @cindex basic regular expressions
1387 In basic regular expressions the meta-characters @samp{?}, @samp{+},
1388 @samp{@{}, @samp{|}, @samp{(}, and @samp{)} lose their special meaning;
1389 instead use the backslashed versions @samp{\?}, @samp{\+}, @samp{\@{},
1390 @samp{\|}, @samp{\(}, and @samp{\)}.
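The difference can be sketched as follows, using made-up input.
Note that the backslashed form @samp{\?} in a basic regular expression
is a @sc{gnu} extension, so the first command assumes
@sc{gnu} @command{grep}.

@example
# BRE: the optional 'u' needs the backslashed operator.
printf 'color\ncolour\n' | grep 'colou\?r'      # prints both lines

# ERE: the same operator is written without a backslash.
printf 'color\ncolour\n' | grep -E 'colou?r'    # prints both lines

# BRE: an unescaped '?' is an ordinary character.
printf 'color\ncolor?\n' | grep 'color?'        # prints color?
@end example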
1392 @cindex interval specifications
1393 Traditional @command{egrep} did not support the @samp{@{} meta-character,
1394 and some @command{egrep} implementations support @samp{\@{} instead, so
1395 portable scripts should avoid @samp{@{} in @samp{grep@ -E} patterns and
1396 should use @samp{[@{]} to match a literal @samp{@{}.
1398 @sc{gnu} @command{grep@ -E} attempts to support traditional usage by
1399 assuming that @samp{@{} is not special if it would be the start of an
1400 invalid interval specification.
1401 For example, the command
1402 @samp{grep@ -E@ '@{1'} searches for the two-character string @samp{@{1}
1403 instead of reporting a syntax error in the regular expression.
@sc{posix.2} allows this behavior as an extension, but portable scripts
should avoid it.
1411 @cindex usage, examples
1412 Here is an example command that invokes @sc{gnu} @command{grep}:
1415 grep -i 'hello.*world' menu.h main.c
1419 This lists all lines in the files @file{menu.h} and @file{main.c} that
1420 contain the string @samp{hello} followed by the string @samp{world};
1421 this is because @samp{.*} matches zero or more characters within a line.
1422 @xref{Regular Expressions}.
1423 The @samp{-i} option causes @command{grep}
1424 to ignore case, causing it to match the line @samp{Hello, world!}, which
1425 it would not otherwise match.
1426 @xref{Invoking}, for more details about
1427 how to invoke @command{grep}.
1429 @cindex using @command{grep}, Q&A
1430 @cindex FAQ about @command{grep} usage
1431 Here are some common questions and answers about @command{grep} usage.
1436 How can I list just the names of matching files?
1443 lists the names of all C files in the current directory whose contents
1444 mention @samp{main}.
1447 How do I search directories recursively?
1450 grep -r 'hello' /home/gigi
1454 searches for @samp{hello} in all files
1455 under the @file{/home/gigi} directory.
1456 For more control over which files are searched,
1457 use @command{find}, @command{grep}, and @command{xargs}.
1458 For example, the following command searches only C files:
1461 find /home/gigi -name '*.c' -print0 | xargs -0r grep -H 'hello'
1464 This differs from the command:
1467 grep -rH 'hello' *.c
1470 which merely looks for @samp{hello} in all files in the current
1471 directory whose names end in @samp{.c}.
1472 Here the @option{-r} is
1473 probably unnecessary, as recursion occurs only in the unlikely event
that one of the @samp{.c} files is a directory.
1475 The @samp{find ...} command line above is more similar to the command:
1478 grep -rH --include='*.c' 'hello' /home/gigi
1482 What if a pattern has a leading @samp{-}?
1485 grep -e '--cut here--' *
1489 searches for all lines matching @samp{--cut here--}.
Without @samp{-e},
@command{grep} would attempt to parse @samp{--cut here--} as a list of
options.
1495 Suppose I want to search for a whole word, not a part of a word?
1502 searches only for instances of @samp{hello} that are entire words;
1503 it does not match @samp{Othello}.
1504 For more control, use @samp{\<} and
1505 @samp{\>} to match the start and end of words.
searches only for words ending in @samp{hello}, so it matches the word
@samp{Othello}.
1517 How do I output context around the matching lines?
1524 prints two lines of context around each matching line.
1527 How do I force @command{grep} to print the name of the file?
1529 Append @file{/dev/null}:
1532 grep 'eli' /etc/passwd /dev/null
1538 /etc/passwd:eli:x:2098:1000:Eli Smith:/home/eli:/bin/bash
1541 Alternatively, use @samp{-H}, which is a @sc{gnu} extension:
1544 grep -H 'eli' /etc/passwd
1548 Why do people use strange regular expressions on @command{ps} output?
1551 ps -ef | grep '[c]ron'
1554 If the pattern had been written without the square brackets, it would
1555 have matched not only the @command{ps} output line for @command{cron},
1556 but also the @command{ps} output line for @command{grep}.
1557 Note that on some platforms,
1558 @command{ps} limits the output to the width of the screen;
1559 @command{grep} does not have any limit on the length of a line
1560 except the available memory.
1563 Why does @command{grep} report ``Binary file matches''?
1565 If @command{grep} listed all matching ``lines'' from a binary file, it
1566 would probably generate output that is not useful, and it might even
1567 muck up your display.
1568 So @sc{gnu} @command{grep} suppresses output from
1569 files that appear to be binary files.
1570 To force @sc{gnu} @command{grep}
1571 to output lines even from files that appear to be binary, use the
1572 @samp{-a} or @samp{--binary-files=text} option.
To eliminate the
``Binary file matches'' messages, use the @samp{-I} or
1575 @samp{--binary-files=without-match} option.
1578 Why doesn't @samp{grep -lv} print non-matching file names?
1580 @samp{grep -lv} lists the names of all files containing one or more
1581 lines that do not match.
1582 To list the names of all files that contain no
matching lines, use the @samp{-L} or @samp{--files-without-match}
option.
1587 I can do @sc{or} with @samp{|}, but what about @sc{and}?
1590 grep 'paul' /etc/motd | grep 'franc,ois'
1594 finds all lines that contain both @samp{paul} and @samp{franc,ois}.
1597 How can I search in both standard input and in files?
1599 Use the special file name @samp{-}:
1602 cat /etc/passwd | grep 'alain' - /etc/motd
1607 How to express palindromes in a regular expression?
1609 It can be done by using back-references;
for example, a palindrome of 5 characters can be written with a BRE:
1614 grep -w -e '\(.\)\(.\).\2\1' file
It matches the word @samp{radar} or @samp{civic}.
1619 Guglielmo Bondioni proposed a single RE
1620 that finds all palindromes up to 19 characters long
1621 using @w{9 subexpressions} and @w{9 back-references}:
1624 grep -E -e '^(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?).?\9\8\7\6\5\4\3\2\1$' file
1627 Note this is done by using @sc{gnu} ERE extensions;
1628 it might not be portable to other implementations of @command{grep}.
1631 Why is this back-reference failing?
1634 echo 'ba' | grep -E '(a)\1|b\1'
1637 This gives no output, because the first alternate @samp{(a)\1} does not match,
1638 as there is no @samp{aa} in the input, so the @samp{\1} in the second alternate
1639 has nothing to refer back to, meaning it will never match anything.
1640 (The second alternate in this example can only match
1641 if the first alternate has matched---making the second one superfluous.)
1644 How can I match across lines?
1646 Standard grep cannot do this, as it is fundamentally line-based.
1647 Therefore, merely using the @code{[:space:]} character class does not
1648 match newlines in the way you might expect. However, if your grep is
1649 compiled with Perl patterns enabled, the Perl @samp{s}
1650 modifier (which makes @code{.} match newlines) can be used:
1653 printf 'foo\nbar\n' | grep -P '(?s)foo.*?bar'
1656 With the GNU @command{grep} option @code{-z} (@pxref{File and
1657 Directory Selection}), the input is terminated by null bytes. Thus,
1658 you can match newlines in the input, but the output will be the whole
file, so this is really only useful to determine if the pattern is
present:
1663 printf 'foo\nbar\n' | grep -z -q 'foo[[:space:]]\+bar'
1666 Failing either of those options, you need to transform the input
1667 before giving it to @command{grep}, or turn to @command{awk},
1668 @command{sed}, @command{perl}, or many other utilities that are
1669 designed to operate across lines.
1672 What do @command{grep}, @command{fgrep}, and @command{egrep} stand for?
1674 The name @command{grep} comes from the way line editing was done on Unix.
For example,
@command{ed} uses the following syntax
1677 to print a list of matching lines on the screen:
1680 global/regular expression/print
1684 @command{fgrep} stands for Fixed @command{grep};
1685 @command{egrep} stands for Extended @command{grep}.
1690 @node Reporting Bugs
1691 @chapter Reporting bugs
1693 @cindex bugs, reporting
1694 Email bug reports to @email{bug-grep@@gnu.org},
1695 a mailing list whose web page is
1696 @url{http://lists.gnu.org/mailman/listinfo/bug-grep}.
1697 The Savannah bug tracker for @command{grep} is located at
1698 @url{http://savannah.gnu.org/bugs/?group=grep}.
1703 Large repetition counts in the @samp{@{n,m@}} construct may cause
1704 @command{grep} to use lots of memory.
1705 In addition, certain other
1706 obscure regular expressions require exponential time and
1707 space, and may cause @command{grep} to run out of memory.
1709 Back-references are very slow, and may require exponential time.
GNU grep is licensed under the GNU GPL, which makes it @dfn{free
software}.
1719 The ``free'' in ``free software'' refers to liberty, not price. As
1720 some GNU project advocates like to point out, think of ``free speech''
1721 rather than ``free beer''. In short, you have the right (freedom) to
1722 run and change grep and distribute it to other people, and---if you
1723 want---charge money for doing either. The important restriction is
that you have to grant your recipients the same rights and impose the
same restrictions.
1727 This general method of licensing software is sometimes called
1728 @dfn{open source}. The GNU project prefers the term ``free software''
1729 for reasons outlined at
1730 @url{http://www.gnu.org/philosophy/open-source-misses-the-point.html}.
1732 This manual is free documentation in the same sense. The
1733 documentation license is included below. The license for the program
1734 is available with the source code, or at
1735 @url{http://www.gnu.org/licenses/gpl.html}.
1738 * GNU Free Documentation License::
1741 @node GNU Free Documentation License
1742 @section GNU Free Documentation License