summaryrefslogtreecommitdiff
path: root/builtin/grep.c
AgeCommit message (Collapse)AuthorFilesLines
2017-07-13Merge branch 'ab/grep-lose-opt-regflags'Libravatar Junio C Hamano1-2/+0
Code cleanup. * ab/grep-lose-opt-regflags: grep: remove redundant REG_NEWLINE when compiling fixed regex grep: remove regflags from the public grep_opt API grep: remove redundant and verbose re-assignments to 0 grep: remove redundant "fixed" field re-assignment to 0 grep: adjust a redundant grep pattern type assignment grep: remove redundant double assignment to 0
2017-07-12Merge branch 'rs/use-div-round-up'Libravatar Junio C Hamano1-1/+1
Code cleanup. * rs/use-div-round-up: use DIV_ROUND_UP
2017-07-10use DIV_ROUND_UPLibravatar René Scharfe1-1/+1
Convert code that divides and rounds up to use DIV_ROUND_UP to make the intent clearer and reduce the number of magic constants. Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-05Merge branch 'bw/repo-object'Libravatar Junio C Hamano1-1/+2
Introduce a "repository" object to eventually make it easier to work in multiple repositories (the primary focus is to work with the superproject and its submodules) in a single process. * bw/repo-object: ls-files: use repository object repository: enable initialization of submodules submodule: convert is_submodule_initialized to work on a repository submodule: add repo_read_gitmodules submodule-config: store the_submodule_cache in the_repository repository: add index_state to struct repo config: read config from a repository object path: add repo_worktree_path and strbuf_repo_worktree_path path: add repo_git_path and strbuf_repo_git_path path: worktree_git_path() should not use file relocation path: convert do_git_path to take a 'struct repository' path: convert strbuf_git_common_path to take a 'struct repository' path: always pass in commondir to update_common_dir path: create path.h environment: store worktree in the_repository environment: place key repository state in the_repository repository: introduce the repository object environment: remove namespace_len variable setup: add comment indicating a hack setup: don't perform lazy initialization of repository state
2017-06-30grep: remove regflags from the public grep_opt APILibravatar Ævar Arnfjörð Bjarmason1-2/+0
Refactor calls to the grep machinery to always pass opt.ignore_case & opt.extended_regexp_option instead of setting the equivalent regflags bits. The bug fixed when making -i work with -P in commit 9e3cbc59d5 ("log: make --regexp-ignore-case work with --perl-regexp", 2017-05-20) was really just plastering over the code smell which this change fixes. The reason for adding the extensive commentary here is that I discovered some subtle complexity in implementing this that really should be called out explicitly to future readers. Before this change we'd rely on the difference between `extended_regexp_option` and `regflags` to serve as a membrane between our preliminary parsing of grep.extendedRegexp and grep.patternType, and what we decided to do internally. Now that those two are the same thing, it's necessary to unset `extended_regexp_option` just before we commit in cases where both of those config variables are set. See 84befcd0a4 ("grep: add a grep.patternType configuration setting", 2012-08-03) for the code and documentation related to that. The explanation of why the if/else branches in grep_commit_pattern_type() are ordered the way they are exists in that commit message, but I think it's worth calling this subtlety out explicitly with a comment for future readers. Even though grep_commit_pattern_type() is the only caller of grep_set_pattern_type_option() it's simpler to reset the extended_regexp_option flag in the latter, since 2/3 branches in the former would otherwise need to reset it, this way we can do it in one place. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-27Spelling fixesLibravatar Ville Skyttä1-1/+1
Signed-off-by: Ville Skyttä <ville.skytta@iki.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-24Merge branch 'bw/config-h'Libravatar Junio C Hamano1-0/+1
Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h
2017-06-23submodule: convert is_submodule_initialized to work on a repositoryLibravatar Brandon Williams1-1/+2
Convert 'is_submodule_initialized()' to take a repository object and while we're at it, lets rename the function to 'is_submodule_active()' and remove the NEEDSWORK comment. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23Merge branches 'bw/ls-files-sans-the-index' and 'bw/config-h' into ↵Libravatar Junio C Hamano1-0/+1
bw/repo-object * bw/ls-files-sans-the-index: ls-files: factor out tag calculation ls-files: factor out debug info into a function ls-files: convert show_files to take an index ls-files: convert show_ce_entry to take an index ls-files: convert prune_cache to take an index ls-files: convert ce_excluded to take an index ls-files: convert show_ru_info to take an index ls-files: convert show_other_files to take an index ls-files: convert show_killed_files to take an index ls-files: convert write_eolinfo to take an index ls-files: convert overlay_tree_on_cache to take an index tree: convert read_tree to take an index parameter convert: convert renormalize_buffer to take an index convert: convert convert_to_git to take an index convert: convert convert_to_git_filter_fd to take an index convert: convert crlf_to_git to take an index convert: convert get_cached_convert_stats_ascii to take an index * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h alias: use the early config machinery to expand aliases t7006: demonstrate a problem with aliases in subdirectories t1308: relax the test verifying that empty alias values are disallowed help: use early config when autocorrecting aliases config: report correct line number upon error discover_git_directory(): avoid setting invalid git_dir
2017-06-19Merge branch 'bw/object-id'Libravatar Junio C Hamano1-11/+11
Conversion from uchar[20] to struct object_id continues. * bw/object-id: (33 commits) diff: rename diff_fill_sha1_info to diff_fill_oid_info diffcore-rename: use is_empty_blob_oid tree-diff: convert path_appendnew to object_id tree-diff: convert diff_tree_paths to struct object_id tree-diff: convert try_to_follow_renames to struct object_id builtin/diff-tree: cleanup references to sha1 diff-tree: convert diff_tree_sha1 to struct object_id notes-merge: convert write_note_to_worktree to struct object_id notes-merge: convert verify_notes_filepair to struct object_id notes-merge: convert find_notes_merge_pair_ps to struct object_id notes-merge: convert merge_from_diffs to struct object_id notes-merge: convert notes_merge* to struct object_id tree-diff: convert diff_root_tree_sha1 to struct object_id combine-diff: convert find_paths_* to struct object_id combine-diff: convert diff_tree_combined to struct object_id diff: convert diff_flush_patch_id to struct object_id patch-ids: convert to struct object_id diff: finish conversion for prepare_temp_file to struct object_id diff: convert reuse_worktree_file to struct object_id diff: convert fill_filespec to struct object_id ...
2017-06-19Merge branch 'ab/pcre-v2'Libravatar Junio C Hamano1-3/+13
Update "perl-compatible regular expression" support to enable JIT and also allow linking with the newer PCRE v2 library. * ab/pcre-v2: grep: add support for PCRE v2 grep: un-break building with PCRE >= 8.32 without --enable-jit grep: un-break building with PCRE < 8.20 grep: un-break building with PCRE < 8.32 grep: add support for the PCRE v1 JIT API log: add -P as a synonym for --perl-regexp grep: skip pthreads overhead when using one thread grep: don't redundantly compile throwaway patterns under threading
2017-06-15config: don't include config.h by defaultLibravatar Brandon Williams1-0/+1
Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13Merge branch 'sb/submodule-blanket-recursive'Libravatar Junio C Hamano1-0/+3
Many commands learned to pay attention to submodule.recurse configuration. * sb/submodule-blanket-recursive: builtin/fetch.c: respect 'submodule.recurse' option builtin/push.c: respect 'submodule.recurse' option builtin/grep.c: respect 'submodule.recurse' option Introduce 'submodule.recurse' option for worktree manipulators submodule loading: separate code path for .gitmodules and config overlay reset/checkout/read-tree: unify config callback for submodule recursion submodule test invocation: only pass additional arguments submodule recursing: do not write a config variable twice
2017-06-02Merge branch 'ab/grep-preparatory-cleanup'Libravatar Junio C Hamano1-4/+19
The internal implementation of "git grep" has seen some clean-up. * ab/grep-preparatory-cleanup: (31 commits) grep: assert that threading is enabled when calling grep_{lock,unlock} grep: given --threads with NO_PTHREADS=YesPlease, warn pack-objects: fix buggy warning about threads pack-objects & index-pack: add test for --threads warning test-lib: add a PTHREADS prerequisite grep: move is_fixed() earlier to avoid forward declaration grep: change internal *pcre* variable & function names to be *pcre1* grep: change the internal PCRE macro names to be PCRE1 grep: factor test for \0 in grep patterns into a function grep: remove redundant regflags assignments grep: catch a missing enum in switch statement perf: add a comparison test of log --grep regex engines with -F perf: add a comparison test of log --grep regex engines perf: add a comparison test of grep regex engines with -F perf: add a comparison test of grep regex engines perf: emit progress output when unpacking & building perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do grep: add tests to fix blind spots with \0 patterns grep: prepare for testing binary regexes containing rx metacharacters grep: add a test helper function for less verbose -f \0 tests ...
2017-06-02Merge branch 'jk/diff-blob'Libravatar Junio C Hamano1-1/+3
The result from "git diff" that compares two blobs, e.g. "git diff $commit1:$path $commit2:$path", used to be shown with the full object name as given on the command line, but it is more natural to use the $path in the output and use it to look up .gitattributes. * jk/diff-blob: diff: use blob path for blob/file diffs diff: use pending "path" if it is available diff: use the word "path" instead of "name" for blobs diff: pass whole pending entry in blobinfo handle_revision_arg: record paths for pending objects handle_revision_arg: record modes for "a..b" endpoints t4063: add tests of direct blob diffs get_sha1_with_context: dynamically allocate oc->path get_sha1_with_context: always initialize oc->symlink_path sha1_name: consistently refer to object_context as "oc" handle_revision_arg: add handle_dotdot() helper handle_revision_arg: hoist ".." check out of range parsing handle_revision_arg: stop using "dotdot" as a generic pointer handle_revision_arg: simplify commit reference lookups handle_revision_arg: reset "dotdot" consistently
2017-06-02grep: convert to struct object_idLibravatar Brandon Williams1-11/+11
Convert the remaining parts of grep to use struct object_id. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-01builtin/grep.c: respect 'submodule.recurse' optionLibravatar Stefan Beller1-0/+3
In builtin/grep.c we parse the config before evaluating the command line options. This makes the task of teaching grep to respect the new config option 'submodule.recurse' very easy by just parsing that option. As an alternative I had implemented a similar structure to treat submodules as the fetch/push command have, including * aligning the meaning of the 'recurse_submodules' to possible submodule values RECURSE_SUBMODULES_* as defined in submodule.h. * having a callback to parse the value and * reacting to the RECURSE_SUBMODULES_DEFAULT state that was the initial state. However all this is not needed for a true boolean value, so let's keep it simple. However this adds another place where "submodule.recurse" is parsed. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-29Merge branch 'bc/object-id'Libravatar Junio C Hamano1-1/+1
Conversion from uchar[20] to struct object_id continues. * bc/object-id: (53 commits) object: convert parse_object* to take struct object_id tree: convert parse_tree_indirect to struct object_id sequencer: convert do_recursive_merge to struct object_id diff-lib: convert do_diff_cache to struct object_id builtin/ls-tree: convert to struct object_id merge: convert checkout_fast_forward to struct object_id sequencer: convert fast_forward_to to struct object_id builtin/ls-files: convert overlay_tree_on_cache to object_id builtin/read-tree: convert to struct object_id sha1_name: convert internals of peel_onion to object_id upload-pack: convert remaining parse_object callers to object_id revision: convert remaining parse_object callers to object_id revision: rename add_pending_sha1 to add_pending_oid http-push: convert process_ls_object and descendants to object_id refs/files-backend: convert many internals to struct object_id refs: convert struct ref_update to use struct object_id ref-filter: convert some static functions to struct object_id Convert struct ref_array_item to struct object_id Convert the verify_pack callback to struct object_id Convert lookup_tag to struct object_id ...
2017-05-26grep: skip pthreads overhead when using one threadLibravatar Ævar Arnfjörð Bjarmason1-0/+2
Skip the administrative overhead of using pthreads when only using one thread. Instead take the non-threaded path which would be taken under NO_PTHREADS. The threading support was initially added in commit 5b594f457a ("Threaded grep", 2010-01-25) with a hardcoded compile-time number of 8 threads. Later the number of threads was made configurable in commit 89f09dd34e ("grep: add --threads=<num> option and grep.threads configuration", 2015-12-15). That change did not add any special handling for --threads=1. Now we take a slightly faster path by skipping thread handling entirely when 1 thread is requested. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26grep: don't redundantly compile throwaway patterns under threadingLibravatar Ævar Arnfjörð Bjarmason1-3/+11
Change the pattern compilation logic under threading so that grep doesn't compile a pattern it never ends up using on the non-threaded code path, only to compile it again N times for N threads which will each use their own copy, ignoring the initially compiled pattern. This redundant compilation dates back to the initial introduction of the threaded grep in commit 5b594f457a ("Threaded grep", 2010-01-25). There was never any reason for doing this redundant work other than an oversight in the initial commit. Jeff King suggested on-list in <20170414212325.fefrl3qdjigwyitd@sigill.intra.peff.net> that this might be needed to check the pattern for sanity before threaded execution commences. That's not the case. The pattern is compiled under threading in start_threads() before any concurrent execution has started by calling pthread_create(), so if the pattern contains an error we still do the right thing. I.e. die with one error before any threaded execution has commenced, instead of e.g. spewing out an error for each N threads, which could be a regression a change like this might inadvertently introduce. This change is not meant as an optimization, any performance gains from this are in the hundreds to thousands of nanoseconds at most. If we wanted more performance here we could just re-use the compiled patterns in multiple threads (regcomp(3) is thread-safe), or partially re-use them and the associated structures in the case of later PCRE JIT changes. Rather, it's just to make the code easier to reason about. It's confusing to debug this under threading & non-threading when the threading codepaths redundantly compile a pattern which is never used. The reason the patterns are recompiled is as a side-effect of duplicating the whole grep_opt structure, which is not thread safe, writable, and munged during execution. The grep_opt structure then points to the grep_pat structure where pattern or patterns are stored. I looked into e.g. splitting the API into some "do & alloc threadsafe stuff", "spawn thread", "do and alloc non-threadsafe stuff", but the execution time of grep_opt_dup() & pattern compilation is trivial compared to actually executing the grep, so there was no point. Even with the more expensive JIT changes to follow the most expensive PCRE patterns take something like 0.0X milliseconds to compile at most[1]. The undocumented --debug mode added in commit 17bf35a3c7 ("grep: teach --debug option to dump the parse tree", 2012-09-13) still works properly with this change. It only emits debugging info during pattern compilation, which is now dumped by the pattern compiled just before the first thread is started. 1. http://sljit.sourceforge.net/pcre.html Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26grep: assert that threading is enabled when calling grep_{lock,unlock}Libravatar Ævar Arnfjörð Bjarmason1-4/+4
Change the grep_{lock,unlock} functions to assert that num_threads is true, instead of only locking & unlocking the pthread mutex lock when it is. These functions are never called when num_threads isn't true, this logic has gone through multiple iterations since the initial introduction of grep threading in commit 5b594f457a ("Threaded grep", 2010-01-25), but ever since then they'd only be called if num_threads was true, so this check made the code confusing to read. Replace the check with an assertion, so that it's clear to the reader that this code path is never taken unless we're spawning threads. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26grep: given --threads with NO_PTHREADS=YesPlease, warnLibravatar Ævar Arnfjörð Bjarmason1-0/+13
Add a warning about missing thread support when grep.threads or --threads is set to a non 0 (default) or 1 (no parallelism) value under NO_PTHREADS=YesPlease. This is for consistency with the index-pack & pack-objects commands, which also take a --threads option & are configurable via pack.threads, and have long warned about the same under NO_PTHREADS=YesPlease. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26grep: catch a missing enum in switch statementLibravatar Ævar Arnfjörð Bjarmason1-0/+2
Add a die(...) to a default case for the switch statement selecting between grep pattern types under --recurse-submodules. Normally this would be caught by -Wswitch, but the grep_pattern_type type is converted to int by going through parse_options(). Changing the argument type passed to compile_submodule_options() won't work, the value will just get coerced. The -Wswitch-default warning will warn about it, but that produces a lot of noise across the codebase, this potential issue would be drowned in that noise. Thus catching this at runtime is the least bad option. This won't ever trigger in practice, but if a new pattern type were to be added this catches an otherwise silent bug during development. See commit 0281e487fd ("grep: optionally recurse into submodules", 2016-12-16) for the initial addition of this code. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-24get_sha1_with_context: dynamically allocate oc->pathLibravatar Jeff King1-1/+3
When a sha1 lookup returns the tree path via "struct object_context", it just copies it into a fixed-size buffer. This means the result can be truncated, and it means our "struct object_context" consumes a lot of stack space. Instead, let's allocate a string on the heap. Because most callers don't care about this information, we'll avoid doing it by default (so they don't all have to start calling free() on the result). There are basically two options for the caller to signal to us that it's interested: 1. By setting a pointer to storage in the object_context. 2. By passing a flag in another parameter. Doing (1) would match the way that sha1_object_info_extended() works. But it would mean that every caller would have to initialize the object_context, which they don't currently have to do. This patch does (2), and adds a new bit to the function's flags field. All of the callers that look at the "path" field are updated to pass the new flag. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08object: convert parse_object* to take struct object_idLibravatar brian m. carlson1-1/+1
Make parse_object, parse_object_or_die, and parse_object_buffer take a pointer to struct object_id. Remove the temporary variables inserted earlier, since they are no longer necessary. Transform all of the callers using the following semantic patch: @@ expression E1; @@ - parse_object(E1.hash) + parse_object(&E1) @@ expression E1; @@ - parse_object(E1->hash) + parse_object(E1) @@ expression E1, E2; @@ - parse_object_or_die(E1.hash, E2) + parse_object_or_die(&E1, E2) @@ expression E1, E2; @@ - parse_object_or_die(E1->hash, E2) + parse_object_or_die(E1, E2) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1.hash, E2, E3, E4, E5) + parse_object_buffer(&E1, E2, E3, E4, E5) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1->hash, E2, E3, E4, E5) + parse_object_buffer(E1, E2, E3, E4, E5) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06dir: convert fill_directory to take an indexLibravatar Brandon Williams1-1/+1
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-19Merge branch 'ab/grep-plug-pathspec-leak'Libravatar Junio C Hamano1-0/+1
Call clear_pathspec() to release resources immediately before the cmd_grep() function returns. * ab/grep-plug-pathspec-leak: grep: plug a trivial memory leak
2017-04-16grep: plug a trivial memory leakLibravatar Ævar Arnfjörð Bjarmason1-0/+1
Change the cleanup phase for the grep command to free the pathspec struct that's allocated earlier in the same block, and used just a few lines earlier. With "grep hi README.md" valgrind reports a loss of 239 bytes now, down from 351. The relevant --num-callers=40 --leak-check=full --show-leak-kinds=all backtrace is: [...] 187 (112 direct, 75 indirect) bytes in 1 blocks are definitely lost in loss record 70 of 110 [...] at 0x4C2BBAF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) [...] by 0x60B339: do_xmalloc (wrapper.c:59) [...] by 0x60B2F6: xmalloc (wrapper.c:86) [...] by 0x576B37: parse_pathspec (pathspec.c:652) [...] by 0x4519F0: cmd_grep (grep.c:1215) [...] by 0x4062EF: run_builtin (git.c:371) [...] by 0x40544D: handle_builtin (git.c:572) [...] by 0x4060A2: run_argv (git.c:624) [...] by 0x4051C6: cmd_main (git.c:701) [...] by 0x4C5901: main (common-main.c:43) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-30Merge branch 'bw/recurse-submodules-relative-fix'Libravatar Junio C Hamano1-16/+25
A few commands that recently learned the "--recurse-submodule" option misbehaved when started from a subdirectory of the superproject. * bw/recurse-submodules-relative-fix: ls-files: fix bug when recursing with relative pathspec ls-files: fix typo in variable name grep: fix bug when recursing with relative pathspec setup: allow for prefix to be passed to git commands grep: fix help text typo
2017-03-28Merge branch 'sb/checkout-recurse-submodules'Libravatar Junio C Hamano1-1/+1
"git checkout" is taught the "--recurse-submodules" option. * sb/checkout-recurse-submodules: builtin/read-tree: add --recurse-submodules switch builtin/checkout: add --recurse-submodules switch entry.c: create submodules when interesting unpack-trees: check if we can perform the operation for submodules unpack-trees: pass old oid to verify_clean_submodule update submodules: add submodule_move_head submodule.c: get_super_prefix_or_empty update submodules: move up prepare_submodule_repo_env submodules: introduce check to see whether to touch a submodule update submodules: add a config option to determine if submodules are updated update submodules: add submodule config parsing make is_submodule_populated gently lib-submodule-update.sh: define tests for recursing into submodules lib-submodule-update.sh: replace sha1 by hash lib-submodule-update: teach test_submodule_content the -C <dir> flag lib-submodule-update.sh: do not use ./. as submodule remote lib-submodule-update.sh: reorder create_lib_submodule_repo submodule--helper.c: remove duplicate code connect_work_tree_and_git_dir: safely create leading directories
2017-03-28Merge branch 'bw/grep-recurse-submodules'Libravatar Junio C Hamano1-12/+9
Build fix for NO_PTHREADS build. * bw/grep-recurse-submodules: grep: fix builds with with no thread support grep: set default output method
2017-03-18grep: fix builds with with no thread supportLibravatar Brandon Williams1-12/+9
Commit 0281e487fd91 ("grep: optionally recurse into submodules") added functions grep_submodule() and grep_submodule_launch() which use "struct work_item" which is defined only when thread support is available. The original implementation of grep_submodule() used the "struct work_item" in order to gain access to a strbuf to store its output which was to be printed at a later point in time. This differs from how both grep_file() and grep_sha1() handle their output. This patch eliminates the reliance on the "struct work_item" and instead opts to use the output function stored in the output field of the "struct grep_opt" object directly, making it behave similarly to both grep_file() and grep_sha1(). Reported-by: Rahul Bedarkar <rahul.bedarkar@imgtec.com> Signed-off-by: Brandon Williams <bmwill@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-17Merge branch 'bc/object-id'Libravatar Junio C Hamano1-12/+12
"uchar [40]" to "struct object_id" conversion continues. * bc/object-id: wt-status: convert to struct object_id builtin/merge-base: convert to struct object_id Convert object iteration callbacks to struct object_id sha1_file: introduce an nth_packed_object_oid function refs: simplify parsing of reflog entries refs: convert each_reflog_ent_fn to struct object_id reflog-walk: convert struct reflog_info to struct object_id builtin/replace: convert to struct object_id Convert remaining callers of resolve_refdup to object_id builtin/merge: convert to struct object_id builtin/clone: convert to struct object_id builtin/branch: convert to struct object_id builtin/grep: convert to struct object_id builtin/fmt-merge-message: convert to struct object_id builtin/fast-export: convert to struct object_id builtin/describe: convert to struct object_id builtin/diff-tree: convert to struct object_id builtin/commit: convert to struct object_id hex: introduce parse_oid_hex
2017-03-17grep: fix bug when recursing with relative pathspecLibravatar Brandon Williams1-15/+24
When using the --recurse-submodules flag with a relative pathspec which includes "..", an error is produced inside the child process spawned for a submodule. When creating the pathspec struct in the child, the ".." is interpreted to mean "go up a directory" which causes an error stating that the path ".." is outside of the repository. While it is true that ".." is outside the scope of the submodule, it is confusing to a user who originally invoked the command where ".." was indeed still inside the scope of the superproject. Since the child process launched for the submodule has some context that it is operating underneath a superproject, this error could be avoided. This patch fixes the bug by passing the 'prefix' to the child process. Now each child process that works on a submodule has two points of reference to the superproject: (1) the 'super_prefix' which is the path from the root of the superproject down to root of the submodule and (2) the 'prefix' which is the path from the root of the superproject down to the directory where the user invoked the git command. With these two pieces of information a child process can correctly interpret the pathspecs provided by the user as well as being able to properly format its output relative to the directory the user invoked the original command from. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-17grep: fix help text typoLibravatar Brandon Williams1-1/+1
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-15make is_submodule_populated gentlyLibravatar Stefan Beller1-1/+1
We need the gentle version in a later patch. As we have just one caller, migrate the caller. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-22builtin/grep: convert to struct object_idLibravatar brian m. carlson1-12/+12
Convert several functions to use struct object_id, and rename them so that they no longer refer to SHA-1. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-14grep: treat revs the same for --untracked as for --no-indexLibravatar Jeff King1-5/+5
git-grep has always disallowed grepping in a tree (as opposed to the working directory) with both --untracked and --no-index. But we traditionally did so by first collecting the revs, and then complaining when any were provided. The --no-index option recently learned to detect revs much earlier. This has two user-visible effects: - we don't bother to resolve revision names at all. So when there's a rev/path ambiguity, we always choose to treat it as a path. - likewise, when you do specify a revision without "--", the error you get is "no such path" and not "--untracked cannot be used with revs". The rationale for doing this with --no-index is that it is meant to be used outside a repository, and so parsing revs at all does not make sense. This patch gives --untracked the same treatment. While it _is_ meant to be used in a repository, it is explicitly about grepping the non-repository contents. Telling the user "we found a rev, but you are not allowed to use revs" is not really helpful compared to "we treated your argument as a path, and could not find it". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-14grep: do not diagnose misspelt revs with --no-indexLibravatar Jeff King1-1/+1
If we are using --no-index, then our arguments cannot be revs in the first place. Not only is it pointless to diagnose them, but if we are not in a repository, we should not be trying to resolve any names. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-14grep: avoid resolving revision names in --no-index caseLibravatar Jeff King1-0/+6
We disallow the use of revisions with --no-index, but we don't actually check and complain until well after we've parsed the revisions. This is the cause of a few problems: 1. We shouldn't be calling get_sha1() at all when we aren't in a repository, as it might access the ref or object databases. For now, this should generally just return failure, but eventually it will become a BUG(). 2. When there's a "--" disambiguator and you're outside a repository, we'll complain early with "unable to resolve revision". But we can give a much more specific error. 3. When there isn't a "--" disambiguator, we still do the normal rev/path checks. This is silly, as we know we cannot have any revs with --no-index. Everything we see must be a path. Outside of a repository this doesn't matter (since we know it won't resolve), but inside one, we may complain unnecessarily if a filename happens to also match a refname. This patch skips the get_sha1() call entirely in the no-index case, and behaves as if it failed (with the exception of giving a better error message). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-14grep: fix "--" rev/pathspec disambiguationLibravatar Jeff King1-5/+24
If we see "git grep pattern rev -- file" then we apply the usual rev/pathspec disambiguation rules: any "rev" before the "--" must be a revision, and we do not need to apply the verify_non_filename() check. But there are two bugs here: 1. We keep a seen_dashdash flag to handle this case, but we set it in the same left-to-right pass over the arguments in which we parse "rev". So when we see "rev", we do not yet know that there is a "--", and we mistakenly complain if there is a matching file. We can fix this by making a preliminary pass over the arguments to find the "--", and only then checking the rev arguments. 2. If we can't resolve "rev" but there isn't a dashdash, that's OK. We treat it like a path, and complain later if it doesn't exist. But if there _is_ a dashdash, then we know it must be a rev, and should treat it as such, complaining if it does not resolve. The current code instead ignores it and tries to treat it like a path. This patch fixes both bugs, and tries to comment the parsing flow a bit better. It adds tests that cover the two bugs, but also some related situations (which already worked, but this confirms that our fixes did not break anything). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-14grep: re-order rev-parsing loopLibravatar Jeff King1-9/+11
We loop over the arguments, but every branch of the loop hits either a "continue" or a "break". Surely we can make this simpler. The final conditional is: if (arg is a rev) { ... handle rev ... continue; } break; We can rewrite this as: if (arg is not a rev) break; ... handle rev ... That makes the flow a little bit simpler, and will make things much easier to follow when we add more logic in future patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-14grep: do not unnecessarily query repo for "--"Libravatar Jonathan Tan1-4/+5
When running a command of the form git grep --no-index pattern -- path in the absence of a Git repository, an error message will be printed: fatal: BUG: setup_git_env called without repository This is because "git grep" tries to interpret "--" as a rev. "git grep" has always tried to first interpret "--" as a rev for at least a few years, but this issue was upgraded from a pessimization to a bug in commit 59332d1 ("Resurrect "git grep --no-index"", 2010-02-06), which calls get_sha1 regardless of whether --no-index was specified. This bug appeared to be benign until commit b1ef400 ("setup_git_env: avoid blind fall-back to ".git"", 2016-10-20) when Git was taught to die in this situation. (This "git grep" bug appears to be one of the bugs that commit b1ef400 is meant to flush out.) Therefore, always interpret "--" as signaling the end of options, instead of trying to interpret it as a rev first. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-14grep: move thread initialization a little lowerLibravatar Jeff King1-14/+14
Originally, we set up the threads for grep before parsing the non-option arguments. In 53b8d931b (grep: disable threading in non-worktree case, 2011-12-12), the thread code got bumped lower in the function because it now needed to know whether we got any revision arguments. That put a big block of code in between the parsing of revs and the parsing of pathspecs, both of which share some loop variables. That makes it harder to read the code than the original, where the shared loops were right next to each other. Let's bump the thread initialization until after all of the parsing is done. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-22grep: search history of moved submodulesLibravatar Brandon Williams1-2/+18
If a submodule was renamed at any point since it's inception then if you were to try and grep on a commit prior to the submodule being moved, you wouldn't be able to find a working directory for the submodule since the path in the past is different from the current path. This patch teaches grep to find the .git directory for a submodule in the parents .git/modules/ directory in the event the path to the submodule in the commit that is being searched differs from the state of the currently checked out commit. If found, the child process that is spawned to grep the submodule will chdir into its gitdir instead of a working directory. In order to override the explicit setting of submodule child process's gitdir environment variable (which was introduced in '10f5c526') `GIT_DIR_ENVIORMENT` needs to be pushed onto child process's env_array. This allows the searching of history from a submodule's gitdir, rather than from a working directory. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-22grep: enable recurse-submodules to work on <tree> objectsLibravatar Brandon Williams1-6/+70
Teach grep to recursively search in submodules when provided with a <tree> object. This allows grep to search a submodule based on the state of the submodule that is present in a commit of the super project. When grep is provided with a <tree> object, the name of the object is prefixed to all output. In order to provide uniformity of output between the parent and child processes the option `--parent-basename` has been added so that the child can preface all of it's output with the name of the parent's object instead of the name of the commit SHA1 of the submodule. This changes output from the command `git grep -e. -l --recurse-submodules HEAD` from: HEAD:file <commit sha1 of submodule>:sub/file to: HEAD:file HEAD:sub/file Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-22grep: optionally recurse into submodulesLibravatar Brandon Williams1-19/+281
Allow grep to recognize submodules and recursively search for patterns in each submodule. This is done by forking off a process to recursively call grep on each submodule. The top level --super-prefix option is used to pass a path to the submodule which can in turn be used to prepend to output or in pathspec matching logic. Recursion only occurs for submodules which have been initialized and checked out by the parent project. If a submodule hasn't been initialized and checked out it is simply skipped. In order to support the existing multi-threading infrastructure in grep, output from each child process is captured in a strbuf so that it can be later printed to the console in an ordered fashion. To limit the number of theads that are created, each child process has half the number of threads as its parents (minimum of 1), otherwise we potentailly have a fork-bomb. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07cache: convert struct cache_entry to use struct object_idLibravatar brian m. carlson1-1/+2
Convert struct cache_entry to use struct object_id by applying the following semantic patch and the object_id transforms from contrib, plus the actual change to the struct: @@ struct cache_entry E1; @@ - E1.sha1 + E1.oid.hash @@ struct cache_entry *E1; @@ - E1->sha1 + E1->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-13Merge branch 'nd/ita-cleanup'Libravatar Junio C Hamano1-2/+2
Git does not know what the contents in the index should be for a path added with "git add -N" yet, so "git grep --cached" should not show hits (or show lack of hits, with -L) in such a path, but that logic does not apply to "git grep", i.e. searching in the working tree files. But we did so by mistake, which has been corrected. * nd/ita-cleanup: grep: fix grepping for "intent to add" files t7810-grep.sh: fix a whitespace inconsistency t7810-grep.sh: fix duplicated test name
2016-07-01grep: fix grepping for "intent to add" filesLibravatar Charles Bailey1-2/+2
This reverts commit 4d5520053 (grep: make it clear i-t-a entries are ignored, 2015-12-27) and adds an alternative fix to maintain the -L --cached behavior. 4d5520053 caused 'git grep' to no longer find matches in new files in the working tree where the corresponding index entry had the "intent to add" bit set, despite the fact that these files are tracked. The content in the index of a file for which the "intent to add" bit is set is considered indeterminate and not empty. For most grep queries we want these to behave the same, however for -L --cached (files without a match) we don't want to respond positively for "intent to add" files as their contents are indeterminate. This is in contrast to files with empty contents in the index (no lines implies no matches for any grep query expression) which should be reported in the output of a grep -L --cached invocation. Add tests to cover this case and a few related cases which previously lacked coverage. Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Charles Bailey <cbailey32@bloomberg.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>