The command "git grep -w ''" dies as soon as it encounters an empty line,
reporting (wrongly) that "regexp returned nonsense". The first hunk of
this patch relaxes the sanity check that is responsible for that,
allowing matches to start at the end.
The second hunk complements it by making sure that empty matches are
rejected if -w was specified, as they are not really words.
GNU grep does the same:
$ echo foo | grep -c ''
1
$ echo foo | grep -c -w ''
0
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
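The acceptance rule described above can be sketched roughly as follows. This is only an illustration, not the actual patch: the names is_word_char() and word_match_ok() are hypothetical, and "word character" is assumed to mean alphanumeric or underscore.

```c
#include <assert.h>
#include <ctype.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: treat alphanumerics and '_' as word characters. */
static int is_word_char(char ch)
{
	return isalnum((unsigned char)ch) || ch == '_';
}

/*
 * Decide whether a match [so, eo) in a line of length len is acceptable
 * under -w. A match may start at the very end of the line (so == len),
 * but an empty match is never accepted because it is not a word.
 */
int word_match_ok(const char *line, size_t len, regoff_t so, regoff_t eo)
{
	assert(so >= 0 && so <= eo && (size_t)eo <= len);
	if (so == eo)
		return 0; /* empty match */
	if (so > 0 && is_word_char(line[so - 1]))
		return 0; /* preceded by a word character */
	if ((size_t)eo < len && is_word_char(line[eo]))
		return 0; /* followed by a word character */
	return 1;
}
```

With this rule an empty pattern yields no hit for -w, which agrees with the GNU grep output quoted above.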
|
|
If a zero-length match is encountered, break out of loop and show the rest
of the line uncoloured. Otherwise we'd be looping forever, trying to make
progress by advancing the pointer by zero characters.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
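A minimal sketch of such a loop guard, using POSIX regexec(). The function highlight_matches() and its helper are hypothetical, and '[' and ']' stand in for the colour escape sequences.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Append at most n bytes of src to the NUL-terminated buffer out. */
static void append(char *out, size_t outsz, const char *src, size_t n)
{
	size_t used = strlen(out);

	if (used + n >= outsz)
		n = outsz - used - 1; /* truncate instead of overflowing */
	memcpy(out + used, src, n);
	out[used + n] = '\0';
}

/*
 * Copy line into out, wrapping every match of re in '[' and ']'.
 * Returns the number of highlighted matches.
 */
int highlight_matches(const regex_t *re, const char *line,
		      char *out, size_t outsz)
{
	const char *p = line;
	regmatch_t m;
	int eflags = 0, count = 0;

	assert(outsz > 0);
	out[0] = '\0';
	while (*p && regexec(re, p, 1, &m, eflags) == 0) {
		if (m.rm_so == m.rm_eo)
			break; /* zero-length match: we could never advance */
		append(out, outsz, p, (size_t)m.rm_so);
		append(out, outsz, "[", 1);
		append(out, outsz, p + m.rm_so, (size_t)(m.rm_eo - m.rm_so));
		append(out, outsz, "]", 1);
		p += m.rm_eo;
		eflags = REG_NOTBOL; /* p is no longer the start of the line */
		count++;
	}
	append(out, outsz, p, strlen(p)); /* rest of the line, unmarked */
	return count;
}
```

A pattern such as "x*" matches the empty string at the start of "abc", so the loop exits at once and the whole line is emitted without markers.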
|
|
After bol is forwarded, it doesn't represent the beginning of the line
any more. This means that the beginning-of-line marker (^) mustn't match,
i.e. the regex flag REG_NOTBOL needs to be set.
This bug was introduced by fb62eb7fab97cea880ea7fe4f341a4dfad14ab48
("grep -w: forward to next possible position after rejected match").
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
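The effect of REG_NOTBOL can be illustrated with a small sketch. The function search_from() is hypothetical and only shows how the flag follows from the start pointer having been moved.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/*
 * Search for re in line, starting skip bytes in. Once the start has
 * been forwarded it is no longer the beginning of the line, so '^'
 * must not match there.
 */
int search_from(const regex_t *re, const char *line, size_t skip)
{
	int eflags = skip > 0 ? REG_NOTBOL : 0;

	assert(skip <= strlen(line));
	return regexec(re, line + skip, 0, NULL, eflags) == 0;
}
```

Without the flag, "^bar" would wrongly match inside "foobar" once the search resumes at offset 3.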
|
|
As noticed by Dmitry Gryazin: When a pattern is found but it doesn't
start and end at word boundaries, bol is forwarded to after the match and
the pattern is searched again. When a pattern is finally found between
word boundaries, the match offsets are off by the number of characters
that have been skipped.
This patch corrects the offsets to be relative to the value of bol as
passed to match_one_pattern() by its caller.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
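A rough sketch of the offset correction, not the real match_one_pattern(): find_word() and is_word_char() are hypothetical names, the retry position is simply one past the start of the rejected match, and the sketch gives up on an empty match.

```c
#include <assert.h>
#include <ctype.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: treat alphanumerics and '_' as word characters. */
static int is_word_char(char ch)
{
	return isalnum((unsigned char)ch) || ch == '_';
}

/*
 * Find the first whole-word match of re in line. On success return 1
 * and store offsets relative to line, as the caller passed it in.
 */
int find_word(const regex_t *re, const char *line, regoff_t *so, regoff_t *eo)
{
	const char *bol, *eol;
	regmatch_t m;
	int eflags = 0;

	assert(re && line && so && eo);
	bol = line;
	eol = line + strlen(line);
	while (bol < eol && regexec(re, bol, 1, &m, eflags) == 0) {
		const char *start = bol + m.rm_so;
		const char *end = bol + m.rm_eo;

		if (start == end)
			return 0; /* empty matches are not words */
		if ((start == line || !is_word_char(start[-1])) &&
		    (end == eol || !is_word_char(*end))) {
			/* regexec() reported offsets from bol, which may have moved */
			*so = (regoff_t)(start - line);
			*eo = (regoff_t)(end - line);
			return 1;
		}
		bol = start + 1; /* rejected: retry further along the line */
		eflags = REG_NOTBOL;
	}
	return 0;
}
```

For the line "foobar foo" and the pattern "foo", the second regexec() call reports offsets 6 and 9 relative to the forwarded pointer; the corrected result is 7 and 10.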
|
|
* maint:
grep: fix segfault when "git grep '('" is given
Documentation: fix a grammatical error in api-builtin.txt
builtin-merge: fix a typo in an error message
|
|
* maint-1.6.1:
grep: fix segfault when "git grep '('" is given
Documentation: fix a grammatical error in api-builtin.txt
builtin-merge: fix a typo in an error message
|
|
* maint-1.6.0:
grep: fix segfault when "git grep '('" is given
Documentation: fix a grammatical error in api-builtin.txt
builtin-merge: fix a typo in an error message
|
|
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Avoid a segfault when the command
git log --all-match
was issued, by ignoring the option.
Signed-off-by: Michele Ballabio <barra_cuda@katamail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
On some systems, regoff_t, the type of the rm_so/rm_eo members, is
wider than int; the %.*s precision specifier expects an int, so use an
explicit cast.
A breakage reported on Darwin by Brian Gernhardt should be fixed with
this patch.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
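The cast in question looks roughly like this. The function format_match() is a hypothetical stand-in for the output code, writing into a buffer rather than to stdout.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Format the matched part of line into out; returns snprintf()'s result. */
int format_match(char *out, size_t outsz, const char *line,
		 const regmatch_t *m)
{
	assert(m->rm_so >= 0 && m->rm_so <= m->rm_eo);
	/* regoff_t may be wider than int, but "%.*s" takes an int precision */
	return snprintf(out, outsz, "%.*s",
			(int)(m->rm_eo - m->rm_so), line + m->rm_so);
}
```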
|
|
Coloring matches makes them easier to spot in the output.
Add two options and two parameters: color.grep (to turn coloring on
or off), color.grep.match (to set the color of matches), --color
and --no-color (to turn coloring on or off, respectively).
The output of external greps is not changed.
This patch is based on earlier ones by Nguyễn Thái Ngọc Duy and
Thiago Alves.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Push pmatch and eflags to the callers of match_one_pattern(), which
allows them to specify regex execution flags and to get the location
of a match.
Since we only use the first element of the matches array and aren't
interested in submatches, no provision is made for callers to
provide a larger array.
eflags are ignored for fixed patterns, but that's OK, since they
only have a meaning in connection with regular expressions
containing ^ or $.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The only use of the struct grep_opt argument of match_expr_eval()
is to pass the option word_regexp to match_one_pattern(). By adding
a pattern flag for it we can reduce the number of function arguments
of these two functions, as a cleanup and preparation for adding more
in the next patch.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In addition to returning if an expression matches a line,
match_expr_eval() updates the expression's hit flag if the parameter
collect_hits is set. It never sets collect_hits for children of AND
nodes, though, so their hit flag will never be updated. Because of
that we can return early if the first child didn't match, no matter
if collect_hits is set or not.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
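The early return can be sketched on a simplified expression tree. The struct expr layout and eval() below are assumptions made for illustration (atoms use strstr() instead of a compiled pattern); they are not the definitions used by match_expr_eval().

```c
#include <assert.h>
#include <string.h>

enum node_type { NODE_ATOM, NODE_AND, NODE_OR };

struct expr {
	enum node_type type;
	const char *needle;        /* NODE_ATOM only */
	struct expr *left, *right; /* NODE_AND and NODE_OR only */
	int hit;
};

int eval(struct expr *x, const char *line, int collect_hits)
{
	int h = 0;

	assert(x != NULL);
	switch (x->type) {
	case NODE_ATOM:
		h = strstr(line, x->needle) != NULL;
		break;
	case NODE_AND:
		/* children of AND never collect hits, so stop at the first miss */
		if (!eval(x->left, line, 0))
			return 0;
		h = eval(x->right, line, 0);
		break;
	case NODE_OR:
		if (!collect_hits)
			return eval(x->left, line, 0) || eval(x->right, line, 0);
		/* evaluate both sides so that each child's hit flag is updated */
		h = eval(x->left, line, 1);
		h |= eval(x->right, line, 1);
		break;
	}
	if (collect_hits)
		x->hit |= h;
	return h;
}
```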
|
|
Add is_regex_special(), a character class macro for chars that have a
special meaning in regular expressions.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
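One way such a macro could be written is sketched below. The character set is an assumption, since the message does not list it, and a strchr() lookup is used here only for brevity. has_regex_special() is a hypothetical caller.

```c
#include <assert.h>
#include <string.h>

/* Assumed set of regex metacharacters; not taken from the patch. */
#define REGEX_SPECIAL_CHARS "$()*+.?[\\^{|"

/* strchr() also finds the terminating NUL, so exclude it explicitly. */
#define is_regex_special(ch) \
	((ch) != '\0' && strchr(REGEX_SPECIAL_CHARS, (ch)) != NULL)

/* Return 1 if s contains any character from the assumed set. */
int has_regex_special(const char *s)
{
	assert(s != NULL);
	for (; *s; s++)
		if (is_regex_special(*s))
			return 1;
	return 0;
}
```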
|
|
Replace isspecial() by the new macro is_glob_special(), which is more,
well, specialized. The former included the NUL char in its character
class, while the latter only includes characters that are special to
file name globbing.
The new name contains underscores because they enhance readability
considerably now that it's made up of three words. Renaming the
function is necessary to document its changed scope.
The call sites of isspecial() are updated to check explicitly for NUL.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
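The changed call-site pattern might look like the following sketch. The set of glob characters is an assumption, and literal_prefix_len() is a hypothetical call site, not one from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed set of glob metacharacters; NUL is deliberately not a member. */
#define is_glob_special(ch) \
	((ch) != '\0' && strchr("*?[\\", (ch)) != NULL)

/*
 * Length of the leading part of path that is free of glob characters.
 * Since the macro no longer covers NUL, the loop has to test for the
 * end of the string explicitly.
 */
size_t literal_prefix_len(const char *path)
{
	size_t n = 0;

	assert(path != NULL);
	while (path[n] != '\0' && !is_glob_special(path[n]))
		n++;
	return n;
}
```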
|
|
Add the new flag "fixed" to struct grep_pat and set it if the pattern
doesn't contain any regex control characters, in addition to when the
flag -F/--fixed-strings was specified.
This gives a nice speed up on msysgit, where regexec() seems to be
extra slow. Before (best of five runs):
$ time git grep grep v1.6.1 >/dev/null
real 0m0.552s
user 0m0.000s
sys 0m0.000s
$ time git grep -F grep v1.6.1 >/dev/null
real 0m0.170s
user 0m0.000s
sys 0m0.015s
With the patch:
$ time git grep grep v1.6.1 >/dev/null
real 0m0.173s
user 0m0.000s
sys 0m0.000s
The difference is much smaller on Linux, but still measurable.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
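The idea can be sketched as follows. struct pattern and its functions are hypothetical simplifications of struct grep_pat, the metacharacter set is an assumption, and options such as case-insensitive matching are ignored.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

struct pattern {
	const char *text;
	int fixed;     /* set when a plain substring search is enough */
	regex_t regex; /* valid only when fixed is 0 */
};

/* Assumed set of regex metacharacters. */
static int has_regex_special(const char *s)
{
	return strpbrk(s, "$()*+.?[\\^{|") != NULL;
}

/* Returns 0 on success, or regcomp()'s error code. */
int compile_pattern(struct pattern *p, const char *text, int fixed_strings)
{
	assert(p && text);
	p->text = text;
	p->fixed = fixed_strings || !has_regex_special(text);
	if (p->fixed)
		return 0; /* nothing to compile */
	return regcomp(&p->regex, text, REG_NOSUB);
}

int match_line(const struct pattern *p, const char *line)
{
	if (p->fixed)
		return strstr(line, p->text) != NULL; /* skips regexec() */
	return regexec(&p->regex, line, 0, NULL, 0) == 0;
}

void free_pattern(struct pattern *p)
{
	if (!p->fixed)
		regfree(&p->regex);
}
```

A pattern like "grep" takes the strstr() path even without -F, which is where the speed-up in the timings above would come from.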
|
|
grep -w only accepts matches between non-word characters. If a match
from regexec() doesn't meet this criterion, grep continues its search
after the first character of that match.
We can be a bit smarter here and skip all positions that follow a word
character first, as they can't match our criteria. This way we can
consume characters quite cheaply and don't need to special-case the
handling of the beginning of a line.
Here's a contrived example command on msysgit (best of five runs):
$ time git grep -w ...... v1.6.1 >/dev/null
real 0m1.611s
user 0m0.000s
sys 0m0.015s
With the patch it's quite a bit faster:
$ time git grep -w ...... v1.6.1 >/dev/null
real 0m1.179s
user 0m0.000s
sys 0m0.015s
More common search patterns will gain a lot less, but it's a nice clean
up anyway.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
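The skipping step can be sketched like this. next_candidate() and is_word_char() are hypothetical names, and word characters are assumed to be alphanumerics plus underscore.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical helper: treat alphanumerics and '_' as word characters. */
static int is_word_char(char ch)
{
	return isalnum((unsigned char)ch) || ch == '_';
}

/*
 * Given the start of a rejected match, return the next position where a
 * word match could begin: past the first character, then past every
 * position that directly follows a word character.
 */
const char *next_candidate(const char *start, const char *eol)
{
	const char *p = start;

	assert(start < eol);
	p++;
	while (p < eol && is_word_char(p[-1]))
		p++;
	return p;
}
```

For "foobar baz", a rejected match at offset 0 moves the search straight to offset 7, so regexec() is not called again for the positions inside "foobar".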
|
|
LF at the end of format strings given to die() is redundant because
die already adds one on its own.
Signed-off-by: Alexander Potashev <aspotashev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* maint:
Fix non-literal format in printf-style calls
git-submodule: Avoid printing a spurious message.
git ls-remote: make usage string match manpage
Makefile: help people who run 'make check' by mistake
|
|
These were found using gcc 4.3.2-1ubuntu11 with the warning:
warning: format not a string literal and no format arguments
Incorporated suggestions from Brandon Casey <casey@nrlssc.navy.mil>.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
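The kind of change that silences this warning is sketched below. format_message() is a hypothetical example, not one of the call sites touched by the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy msg into out verbatim; returns snprintf()'s result. */
int format_message(char *out, size_t outsz, const char *msg)
{
	assert(out && msg && outsz > 0);
	/*
	 * Not snprintf(out, outsz, msg): a '%' inside msg would then be
	 * interpreted as a conversion specification.
	 */
	return snprintf(out, outsz, "%s", msg);
}
```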
|
|
Here's a trivial patch that adds "-z" and "--null" options to "git
grep". It was discussed on the mailing-list that git's "-z"
convention should be used instead of GNU grep's "-Z".
So things like 'git grep -l -z "$FOO" | xargs -0 sed -i "s/$FOO/$BOO/"'
do work now.
Signed-off-by: Raphael Zimmerer <killekulla@rdrz.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
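What the option changes is only the byte written after each file name. A minimal sketch, writing into a buffer instead of stdout; put_name() and its flag parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Store name in out, followed by NUL when null_following_name is set
 * and by LF otherwise. Returns the number of bytes written.
 */
size_t put_name(char *out, size_t outsz, const char *name,
		int null_following_name)
{
	size_t n;

	assert(out && name);
	n = strlen(name);
	assert(n + 1 <= outsz);
	memcpy(out, name, n);
	out[n] = null_following_name ? '\0' : '\n';
	return n + 1;
}
```

Since NUL cannot occur in a path, names containing spaces or newlines stay unambiguous for consumers such as xargs -0.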
|
|
When we tried to find commits done by AUTHOR, the first implementation
tried to pattern match a line with "^author .*AUTHOR", which later was
enhanced to strip leading caret and look for "^author AUTHOR" when the
search pattern was anchored at the left end (i.e. --author="^AUTHOR").
This had a few problems:
* When looking for fixed strings (e.g. "git log -F --author=x --grep=y"),
the regexp internally used "^author .*x" would never match anything;
* To match at the end (e.g. "git log --author='google.com>$'"), the
generated regexp has to also match the trailing timestamp part the
commit header lines have. Also, in order to determine if the '$' at
the end means "match at the end of the line" or just a literal dollar
sign (probably backslash-quoted), we would need to parse the regexp
ourselves.
An earlier alternative tried to make sure that a line matches "^author "
(to limit by field name) and the user supplied pattern at the same time.
While it solved the -F problem by introducing a special override for
matching the "^author ", it did not solve the trailing timestamp nor tail
match problem. It also would have matched every commit if --author=author
was asked for, not because the author's email part had this string, but
because every commit header line that talks about the author begins with
that field name, regardless of who wrote it.
Instead of piling more hacks on top of hacks, this rethinks the grep
machinery that is used to look for strings in the commit header, and makes
sure that (1) field name matches literally at the beginning of the line,
followed by a SP, and (2) the user supplied pattern is matched against the
remainder of the line, excluding the trailing timestamp data.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
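The two-step match described above can be sketched as follows. header_field_matches() is a hypothetical function, and the rule used to drop the timestamp (cut after the last '>') is an assumption made for this illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return 1 if line starts with field followed by a space and re
 * matches the rest of the line with the trailing timestamp removed,
 * 0 if not, and -1 if memory runs out.
 */
int header_field_matches(const char *line, const char *field,
			 const regex_t *re)
{
	const char *value, *end;
	size_t flen, vlen;
	char *copy;
	int hit;

	assert(line && field && re);
	flen = strlen(field);
	/* (1) the field name must match literally, followed by a SP */
	if (strncmp(line, field, flen) != 0 || line[flen] != ' ')
		return 0;
	value = line + flen + 1;
	/* (2) assumed layout: the ident ends at the last '>' */
	end = strrchr(value, '>');
	vlen = end ? (size_t)(end + 1 - value) : strlen(value);
	copy = malloc(vlen + 1);
	if (!copy)
		return -1;
	memcpy(copy, value, vlen);
	copy[vlen] = '\0';
	hit = regexec(re, copy, 0, NULL, 0) == 0;
	free(copy);
	return hit;
}
```

With the timestamp cut off, a user pattern ending in '$' anchors at the end of the e-mail address, and a pattern starting with '^' anchors at the start of the name rather than at the field name.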
|
|
We already have two instances where we want to determine if a buffer
contains binary data as opposed to text.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This is a mechanical clean-up of the way *.c files include
system header files.
(1) sources under compat/, platform sha-1 implementations, and
xdelta code are exempt from the following rules;
(2) the first #include must be "git-compat-util.h" or one of
our own header files that includes it first (e.g. config.h,
builtin.h, pkt-line.h);
(3) system headers that are included in "git-compat-util.h"
need not be included in individual C source files.
(4) "git-compat-util.h" does not have to include subsystem
specific header files (e.g. expat.h).
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
This lets you say:
git grep --all-match -e A -e B -e C
to find lines that match A or B or C but limit the matches to
the files that have all of A, B and C.
This is different from
git grep -e A --and -e B --and -e C
in that the latter looks for a single line that has all of these
at the same time.
Signed-off-by: Junio C Hamano <junkio@cox.net>
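The difference between the two modes can be sketched on an in-memory "file". Both functions are hypothetical and use strstr() in place of real pattern matching.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * --all-match style: count the lines that contain any pattern, but only
 * if every pattern occurs somewhere in the file.
 */
size_t count_all_match(const char **lines, size_t nlines,
		       const char **pats, size_t npats)
{
	size_t i, j, hits = 0;

	assert(lines && pats);
	for (j = 0; j < npats; j++) {
		int seen = 0;

		for (i = 0; i < nlines && !seen; i++)
			seen = strstr(lines[i], pats[j]) != NULL;
		if (!seen)
			return 0; /* one pattern never hit: show nothing */
	}
	for (i = 0; i < nlines; i++) {
		for (j = 0; j < npats; j++) {
			if (strstr(lines[i], pats[j])) {
				hits++;
				break;
			}
		}
	}
	return hits;
}

/* --and style: count the lines that contain all patterns at once. */
size_t count_and(const char **lines, size_t nlines,
		 const char **pats, size_t npats)
{
	size_t i, j, hits = 0;

	assert(lines && pats);
	for (i = 0; i < nlines; i++) {
		for (j = 0; j < npats; j++)
			if (!strstr(lines[i], pats[j]))
				break;
		if (j == npats)
			hits++;
	}
	return hits;
}
```

For a file whose lines are "A here", "B there", "nothing" and "C and A", the first function reports three lines and the second reports none.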
|
|
"git grep --fixed-strings -e GIT --and -e VERSION .gitignore"
misbehaved because we did not notice this needs to grab lines
that have the given two fixed strings at the same time.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
This further updates the built-in grep engine so that we can say
something like "this pattern should match only in head". This
can be used to simplify grepping in the log messages.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
This makes three functions and associated option structures from
builtin-grep available from other parts of the system.
* options to drive the built-in grep engine are stored in struct
grep_opt;
* pattern strings and extended grep expressions are added to
struct grep_opt with append_grep_pattern();
* when finished calling append_grep_pattern(), call
compile_grep_patterns() to prepare for execution;
* call grep_buffer() to find matches in the in-core buffer.
This also adds an internal option "status_only" to grep_opt,
which suppresses any output from grep_buffer(). Callers that use the
function as a library can use it to check if there is a match
without producing any output.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|