summaryrefslogtreecommitdiff
path: root/grep.h
AgeCommit message (Collapse)AuthorFilesLines
2011-12-16grep: enable threading with -p and -W using lazy attribute lookupLibravatar Thomas Rast1-0/+10
Lazily load the userdiff attributes in match_funcname(). Use a separate mutex around this loading to protect the (not thread-safe) attributes machinery. This lets us re-enable threading with -p and -W while reducing the overhead caused by looking up attributes. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-20Use kwset in grepLibravatar Fredrik Kuivinen1-0/+2
Benchmarks for the hot cache case: before: $ perf stat --repeat=5 git grep qwerty > /dev/null Performance counter stats for 'git grep qwerty' (5 runs): 3,478,085 cache-misses # 2.322 M/sec ( +- 2.690% ) 11,356,177 cache-references # 7.582 M/sec ( +- 2.598% ) 3,872,184 branch-misses # 0.363 % ( +- 0.258% ) 1,067,367,848 branches # 712.673 M/sec ( +- 2.622% ) 3,828,370,782 instructions # 0.947 IPC ( +- 0.033% ) 4,043,832,831 cycles # 2700.037 M/sec ( +- 0.167% ) 8,518 page-faults # 0.006 M/sec ( +- 3.648% ) 847 CPU-migrations # 0.001 M/sec ( +- 3.262% ) 6,546 context-switches # 0.004 M/sec ( +- 2.292% ) 1497.695495 task-clock-msecs # 3.303 CPUs ( +- 2.550% ) 0.453394396 seconds time elapsed ( +- 0.912% ) after: $ perf stat --repeat=5 git grep qwerty > /dev/null Performance counter stats for 'git grep qwerty' (5 runs): 2,989,918 cache-misses # 3.166 M/sec ( +- 5.013% ) 10,986,041 cache-references # 11.633 M/sec ( +- 4.899% ) (scaled from 95.06%) 3,511,993 branch-misses # 1.422 % ( +- 0.785% ) 246,893,561 branches # 261.433 M/sec ( +- 3.967% ) 1,392,727,757 instructions # 0.564 IPC ( +- 0.040% ) 2,468,142,397 cycles # 2613.494 M/sec ( +- 0.110% ) 7,747 page-faults # 0.008 M/sec ( +- 3.995% ) 897 CPU-migrations # 0.001 M/sec ( +- 2.383% ) 6,535 context-switches # 0.007 M/sec ( +- 1.993% ) 944.384228 task-clock-msecs # 3.177 CPUs ( +- 0.268% ) 0.297257643 seconds time elapsed ( +- 0.450% ) So we gain about 35% by using the kwset code. As a side effect of using kwset two grep tests are fixed by this patch. The first is fixed because kwset can deal with case-insensitive search containing NULs, something strcasestr cannot do. The second one is fixed because we consider patterns containing NULs as fixed strings (regcomp cannot accept patterns with NULs). Signed-off-by: Fredrik Kuivinen <frekui@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-01grep: add option to show whole function as contextLibravatar René Scharfe1-0/+1
Add a new option, -W, to show the whole surrounding function of a match. It uses the same regular expressions as -p and diff to find the beginning of sections. Currently it will not display comments in front of a function, but those that are following one. Despite this shortcoming it is already useful, e.g. to simply see a more complete applicable context or to extract whole functions. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-05grep: add --headingLibravatar René Scharfe1-0/+1
With --heading, the filename is printed once before matches from that file instead of at the start of each line, giving more screen space to the actual search results. This option is taken from ack (http://betterthangrep.com/). And now git grep can dress up like it: $ git config alias.ack "grep --break --heading --line-number" $ git ack -e --heading Documentation/git-grep.txt 154:--heading:: t/t7810-grep.sh 785:test_expect_success 'grep --heading' ' 786: git grep --heading -e char -e lo_w hello.c hello_world >actual && 808: git grep --break --heading -n --color \ Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-05grep: add --breakLibravatar René Scharfe1-0/+1
With --break, an empty line is printed between matches from different files, increasing readability. This option is taken from ack (http://betterthangrep.com/). Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-09git-grep: Learn PCRELibravatar Michał Kiedrowicz1-0/+9
This patch teaches git-grep the --perl-regexp/-P options (naming borrowed from GNU grep) in order to allow specifying PCRE regexes on the command line. PCRE has a number of features which make them more handy to use than POSIX regexes, like consistent escaping rules, extended character classes, ungreedy matching etc. git isn't build with PCRE support automatically. USE_LIBPCRE environment variable must be enabled (like `make USE_LIBPCRE=YesPlease`). Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-13log --author: take union of multiple "author" requestsLibravatar Junio C Hamano1-0/+2
In the olden days, log --author=me --committer=him --grep=this --grep=that used to be turned into: (OR (HEADER-AUTHOR me) (HEADER-COMMITTER him) (PATTERN this) (PATTERN that)) showing my patches that do not have any "this" nor "that", which was totally useless. 80235ba ("log --author=me --grep=it" should find intersection, not union, 2010-01-17) improved it greatly to turn the same into: (ALL-MATCH (HEADER-AUTHOR me) (HEADER-COMMITTER him) (OR (PATTERN this) (PATTERN that))) That is, "show only patches by me and committed by him, that have either this or that", which is a lot more natural thing to ask. We however need to be a bit more clever when the user asks more than one "author" (or "committer"); because a commit has only one author (and one committer), they ought to be interpreted as asking for union to be useful. The current implementation simply added another author/committer pattern at the same top-level for ALL-MATCH to insist on matching all, finding nothing. Turn log --author=me --author=her \ --committer=him --committer=you \ --grep=this --grep=that into (ALL-MATCH (OR (HEADER-AUTHOR me) (HEADER-AUTHOR her)) (OR (HEADER-COMMITTER him) (HEADER-COMMITTER you)) (OR (PATTERN this) (PATTERN that))) instead. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-21Merge branch 'gv/portable'Libravatar Junio C Hamano1-4/+4
* gv/portable: test-lib: use DIFF definition from GIT-BUILD-OPTIONS build: propagate $DIFF to scripts Makefile: Tru64 portability fix Makefile: HP-UX 10.20 portability fixes Makefile: HPUX11 portability fixes Makefile: SunOS 5.6 portability fix inline declaration does not work on AIX Allow disabling "inline" Some platforms lack socklen_t type Make NO_{INET_NTOP,INET_PTON} configured independently Makefile: some platforms do not have hstrerror anywhere git-compat-util.h: some platforms with mmap() lack MAP_FAILED definition test_cmp: do not use "diff -u" on platforms that lack one fixup: do not unconditionally disable "diff -u" tests: use "test_cmp", not "diff", when verifying the result Do not use "diff" found on PATH while building and installing enums: omit trailing comma for portability Makefile: -lpthread may still be necessary when libc has only pthread stubs Rewrite dynamic structure initializations to runtime assignment Makefile: pass CPPFLAGS through to fllow customization Conflicts: Makefile wt-status.h
2010-05-31enums: omit trailing comma for portabilityLibravatar Gary V. Vaughan1-4/+4
Without this patch at least IBM VisualAge C 5.0 (I have 5.0.2) on AIX 5.1 fails to compile git. enum style is inconsistent already, with some enums declared on one line, some over 3 lines with the enum values all on the middle line, sometimes with 1 enum value per line... and independently of that the trailing comma is sometimes present and other times absent, often mixing with/without trailing comma styles in a single file, and sometimes in consecutive enum declarations. Clearly, omitting the comma is the more portable style, and this patch changes all enum declarations to use the portable omitted dangling comma style consistently. Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-24grep: support NUL chars in search strings for -FLibravatar René Scharfe1-0/+2
Search patterns in a file specified with -f can contain NUL characters. The current code ignores all characters on a line after a NUL. Pass the actual length of the line all the way from the pattern file to fixmatch() and use it for case-sensitive fixed string matching. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-20Merge branch 'ml/color-grep'Libravatar Junio C Hamano1-0/+6
* ml/color-grep: grep: Colorize selected, context, and function lines grep: Colorize filename, line number, and separator Add GIT_COLOR_BOLD_* and GIT_COLOR_BG_*
2010-03-08grep: Colorize selected, context, and function linesLibravatar Mark Lodato1-0/+3
Colorize non-matching text of selected lines, context lines, and function name lines. The default for all three is no color, but they can be configured using color.grep.<slot>. The first two are similar to the corresponding options in GNU grep, except that GNU grep applies the color to the entire line, not just non-matching text. Signed-off-by: Mark Lodato <lodatom@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-08grep: Colorize filename, line number, and separatorLibravatar Mark Lodato1-0/+3
Colorize the filename, line number, and separator in git grep output, as GNU grep does. The colors are customizable through color.grep.<slot>. The default is to only color the separator (in cyan), since this gives the biggest legibility increase without overwhelming the user with colors. GNU grep also defaults cyan for the separator, but defaults to magenta for the filename and to green for the line number, as well. There is one difference from GNU grep: When a binary file matches without -a, GNU grep does not color the <file> in "Binary file <file> matches", but we do. Like GNU grep, if --null is given, the null separators are not colored. For config.txt, use a a sub-list to describe the slots, rather than a single paragraph with parentheses, since this is much more readable. Remove the cast to int for `rm_eo - rm_so` since it is not necessary. Signed-off-by: Mark Lodato <lodatom@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-02Merge branch 'jc/grep-author-all-match-implicit'Libravatar Junio C Hamano1-0/+2
* jc/grep-author-all-match-implicit: "log --author=me --grep=it" should find intersection, not union
2010-01-26Threaded grepLibravatar Fredrik Kuivinen1-0/+6
Make git grep use threads when it is available. The results below are best of five runs in the Linux repository (on a box with two cores). With the patch: git grep qwerty 1.58user 0.55system 0:01.16elapsed 183%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+800outputs (0major+5774minor)pagefaults 0swaps Without: git grep qwerty 1.59user 0.43system 0:02.02elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+800outputs (0major+3716minor)pagefaults 0swaps And with a pattern with quite a few matches: With the patch: $ /usr/bin/time git grep void 5.61user 0.56system 0:03.44elapsed 179%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+800outputs (0major+5587minor)pagefaults 0swaps Without: $ /usr/bin/time git grep void 5.36user 0.51system 0:05.87elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+800outputs (0major+3693minor)pagefaults 0swaps In either case we gain about 40% by the threading. Signed-off-by: Fredrik Kuivinen <frekui@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-25"log --author=me --grep=it" should find intersection, not unionLibravatar Junio C Hamano1-0/+2
Historically, any grep filter in "git log" family of commands were taken as restricting to commits with any of the words in the commit log message. However, the user almost always want to find commits "done by this person on that topic". With "--all-match" option, a series of grep patterns can be turned into a requirement that all of them must produce a match, but that makes it impossible to ask for "done by me, on either this or that" with: log --author=me --committer=him --grep=this --grep=that because it will require both "this" and "that" to appear. Change the "header" parser of grep library to treat the headers specially, and parse it as: (all-match-OR (HEADER-AUTHOR me) (HEADER-COMMITTER him) (OR (PATTERN this) (PATTERN that) ) ) Even though the "log" command line parser doesn't give direct access to the extended grep syntax to group terms with parentheses, this change will cover the majority of the case the users would want. This incidentally revealed that one test in t7002 was bogus. It ran: log --author=Thor --grep=Thu --format='%s' and expected (wrongly) "Thu" to match "Thursday" in the author/committer date, but that would never match, as the timestamp in raw commit buffer does not have the name of the day-of-the-week. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-13grep: rip out support for external grepLibravatar Junio C Hamano1-1/+0
We still allow people to pass --[no-]ext-grep on the command line, but the option is ignored. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-16grep: Allow case insensitive search of fixed-stringsLibravatar Brian Collins1-0/+2
"git grep" currently an error when you combine the -F and -i flags. This isn't in line with how GNU grep handles it. This patch allows the simultaneous use of those flags. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Brian Collins <bricollins@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-13Merge branch 'maint'Libravatar Junio C Hamano1-0/+1
* maint: GIT 1.6.4.3 svn: properly escape arguments for authors-prog http.c: remove verification of remote packs grep: accept relative paths outside current working directory grep: fix exit status if external_grep() punts Conflicts: GIT-VERSION-GEN RelNotes
2009-09-13Merge branch 'cb/maint-1.6.3-grep-relative-up' into maintLibravatar Junio C Hamano1-0/+1
* cb/maint-1.6.3-grep-relative-up: grep: accept relative paths outside current working directory grep: fix exit status if external_grep() punts Conflicts: t/t7002-grep.sh
2009-09-07grep: accept relative paths outside current working directoryLibravatar Clemens Buchacher1-0/+1
"git grep" would barf at relative paths pointing outside the current working directory (or subdirectories thereof). Use quote_path_relative(), which can handle such cases just fine. [jc: added tests.] Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-22grep: Add --max-depth option.Libravatar Michał Kiedrowicz1-0/+1
It is useful to grep directories non-recursively, e.g. when one wants to look for all files in the toplevel directory, but not in any subdirectory, or in Documentation/, but not in Documentation/technical/. This patch adds support for --max-depth <depth> option to git-grep. If it is given, git-grep descends at most <depth> levels of directories below paths specified on the command line. Note that if path specified on command line contains wildcards, this option makes no sense, e.g. $ git grep -l --max-depth 0 GNU -- 'contrib/*' (note the quotes) will search all files in contrib/, even in subdirectories, because '*' matches all files. Documentation updates, bash-completion and simple test cases are also provided. Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01grep -p: support user defined regular expressionsLibravatar René Scharfe1-0/+1
Respect the userdiff attributes and config settings when looking for lines with function definitions in git grep -p. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01grep: add option -p/--show-functionLibravatar René Scharfe1-0/+1
The new option -p instructs git grep to print the previous function definition as a context line, similar to diff -p. Such context lines are marked with an equal sign instead of a dash. This option complements the existing context options -A, -B, -C. Function definitions are detected using the same heuristic that diff uses. User defined regular expressions are not supported, yet. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01grep: print context hunk marks between filesLibravatar René Scharfe1-0/+1
Print a hunk mark before matches from a new file are shown, in addition to the current behaviour of printing them if lines have been skipped. The result is easier to read, as (presumably unrelated) matches from different files are separated by a hunk mark. GNU grep does the same. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01grep: move context hunk mark handling into show_line()Libravatar René Scharfe1-0/+1
Move last_shown into struct grep_opt, to make it available in show_line(), and then make the function handle the printing of hunk marks for context lines in a central place. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-09grep: use parseoptLibravatar René Scharfe1-14/+14
Convert git-grep to parseopt. The bitfields in struct grep_opt are converted to full ints, increasing its size. This shouldn't be a problem as there is only a single instance in memory. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07grep: add support for coloring with external grepsLibravatar René Scharfe1-0/+1
Add the config variable color.grep.external, which can be used to switch on coloring of external greps. To enable auto coloring with GNU grep, one needs to set color.grep.external to --color=always to defeat the pager started by git grep. The value of the config variable will be passed to the external grep only if it would colorize internal grep's output, so automatic terminal detected works. The default is to not pass any option, because the external grep command could be a program without color support. Also set the environment variables GREP_COLOR and GREP_COLORS to pass the configured color for matches to the external grep. This works with GNU grep; other variables could be added as needed. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07grep: color patterns in outputLibravatar René Scharfe1-0/+3
Coloring matches makes them easier to spot in the output. Add two options and two parameters: color.grep (to turn coloring on or off), color.grep.match (to set the color of matches), --color and --no-color (to turn coloring on or off, respectively). The output of external greps is not changed. This patch is based on earlier ones by Nguyễn Thái Ngọc Duy and Thiago Alves. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07grep: remove grep_opt argument from match_expr_eval()Libravatar René Scharfe1-0/+1
The only use of the struct grep_opt argument of match_expr_eval() is to pass the option word_regexp to match_one_pattern(). By adding a pattern flag for it we can reduce the number of function arguments of these two functions, as a cleanup and preparation for adding more in the next patch. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-09grep: don't call regexec() for fixed stringsLibravatar René Scharfe1-0/+1
Add the new flag "fixed" to struct grep_pat and set it if the pattern is doesn't contain any regex control characters in addition to if the flag -F/--fixed-strings was specified. This gives a nice speed up on msysgit, where regexec() seems to be extra slow. Before (best of five runs): $ time git grep grep v1.6.1 >/dev/null real 0m0.552s user 0m0.000s sys 0m0.000s $ time git grep -F grep v1.6.1 >/dev/null real 0m0.170s user 0m0.000s sys 0m0.015s With the patch: $ time git grep grep v1.6.1 >/dev/null real 0m0.173s user 0m0.000s sys 0m0.000s The difference is much smaller on Linux, but still measurable. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-01git grep: Add "-z/--null" option as in GNU's grep.Libravatar Raphael Zimmerer1-0/+1
Here's a trivial patch that adds "-z" and "--null" options to "git grep". It was discussed on the mailing-list that git's "-z" convention should be used instead of GNU grep's "-Z". So things like 'git grep -l -z "$FOO" | xargs -0 sed -i "s/$FOO/$BOO/"' do work now. Signed-off-by: Raphael Zimmerer <killekulla@rdrz.de> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-09-04log --author/--committer: really match only with name partLibravatar Junio C Hamano1-0/+7
When we tried to find commits done by AUTHOR, the first implementation tried to pattern match a line with "^author .*AUTHOR", which later was enhanced to strip leading caret and look for "^author AUTHOR" when the search pattern was anchored at the left end (i.e. --author="^AUTHOR"). This had a few problems: * When looking for fixed strings (e.g. "git log -F --author=x --grep=y"), the regexp internally used "^author .*x" would never match anything; * To match at the end (e.g. "git log --author='google.com>$'"), the generated regexp has to also match the trailing timestamp part the commit header lines have. Also, in order to determine if the '$' at the end means "match at the end of the line" or just a literal dollar sign (probably backslash-quoted), we would need to parse the regexp ourselves. An earlier alternative tried to make sure that a line matches "^author " (to limit by field name) and the user supplied pattern at the same time. While it solved the -F problem by introducing a special override for matching the "^author ", it did not solve the trailing timestamp nor tail match problem. It also would have matched every commit if --author=author was asked for, not because the author's email part had this string, but because every commit header line that talks about the author begins with that field name, regardleses of who wrote it. Instead of piling more hacks on top of hacks, this rethinks the grep machinery that is used to look for strings in the commit header, and makes sure that (1) field name matches literally at the beginning of the line, followed by a SP, and (2) the user supplied pattern is matched against the remainder of the line, excluding the trailing timestamp data. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2006-09-27grep --all-matchLibravatar Junio C Hamano1-0/+2
This lets you say: git grep --all-match -e A -e B -e C to find lines that match A or B or C but limit the matches from the files that have all of A, B and C. This is different from git grep -e A --and -e B --and -e C in that the latter looks for a single line that has all of these at the same time. Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-27grep: free expressions and patterns when done.Libravatar Junio C Hamano1-0/+1
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-20Update grep internal for grepping only in head/bodyLibravatar Junio C Hamano1-0/+7
This further updates the built-in grep engine so that we can say something like "this pattern should match only in head". This can be used to simplify grepping in the log messages. Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-20builtin-grep: make pieces of it available as library.Libravatar Junio C Hamano1-0/+71
This makes three functions and associated option structures from builtin-grep available from other parts of the system. * options to drive built-in grep engine is stored in struct grep_opt; * pattern strings and extended grep expressions are added to struct grep_opt with append_grep_pattern(); * when finished calling append_grep_pattern(), call compile_grep_patterns() to prepare for execution; * call grep_buffer() to find matches in the in-core buffer. This also adds an internal option "status_only" to grep_opt, which suppresses any output from grep_buffer(). Callers of the function as library can use it to check if there is a match without producing any output. Signed-off-by: Junio C Hamano <junkio@cox.net>