summaryrefslogtreecommitdiff
path: root/builtin/blame.c
AgeCommit message (Collapse)AuthorFilesLines
2021-12-21Merge branch 'ld/sparse-diff-blame'Libravatar Junio C Hamano1-0/+3
Teach diff and blame to work well with sparse index. * ld/sparse-diff-blame: blame: enable and test the sparse index diff: enable and test the sparse index diff: replace --staged with --cached in t1092 tests repo-settings: prepare_repo_settings only in git repos test-read-cache: set up repo after git directory commit-graph: return if there is no git directory git: ensure correct git directory setup with -h
2021-12-06blame: enable and test the sparse indexLibravatar Lessley Dennington1-0/+3
Enable the sparse index for the 'git blame' command. The index was already not expanded with this command, so the most interesting thing to do is to add tests that verify that 'git blame' behaves correctly when the sparse index is enabled and that its performance improves. More specifically, these cases are: 1. The index is not expanded for 'blame' when given paths in the sparse checkout cone at multiple levels. 2. Performance measurably improves for 'blame' with sparse index when given paths in the sparse checkout cone at multiple levels. The `p2000` tests demonstrate a ~60% execution time reduction when running 'blame' for a file two levels deep and and a ~30% execution time reduction for a file three levels deep. Test before after ---------------------------------------------------------------- 2000.62: git blame f2/f4/a (full-v3) 0.31 0.32 +3.2% 2000.63: git blame f2/f4/a (full-v4) 0.29 0.31 +6.9% 2000.64: git blame f2/f4/a (sparse-v3) 0.55 0.23 -58.2% 2000.65: git blame f2/f4/a (sparse-v4) 0.57 0.23 -59.6% 2000.66: git blame f2/f4/f3/a (full-v3) 0.77 0.85 +10.4% 2000.67: git blame f2/f4/f3/a (full-v4) 0.78 0.81 +3.8% 2000.68: git blame f2/f4/f3/a (sparse-v3) 1.07 0.72 -32.7% 2000.99: git blame f2/f4/f3/a (sparse-v4) 1.05 0.73 -30.5% We do not include paths outside the sparse checkout cone because blame does not support blaming files that are not present in the working directory. This is true in both sparse and full checkouts. Signed-off-by: Lessley Dennington <lessleydennington@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-25Merge branch 'ab/parse-options-cleanup'Libravatar Junio C Hamano1-0/+3
Random changes to parse-options implementation. * ab/parse-options-cleanup: parse-options: change OPT_{SHORT,UNSET} to an enum parse-options tests: test optname() output parse-options.[ch]: make opt{bug,name}() "static" commit-graph: stop using optname() parse-options.c: move optname() earlier in the file parse-options.h: make the "flags" in "struct option" an enum parse-options.c: use exhaustive "case" arms for "enum parse_opt_result" parse-options.[ch]: consistently use "enum parse_opt_result" parse-options.[ch]: consistently use "enum parse_opt_flags" parse-options.h: move PARSE_OPT_SHELL_EVAL between enums
2021-10-08parse-options.[ch]: consistently use "enum parse_opt_result"Libravatar Ævar Arnfjörð Bjarmason1-0/+3
Use the "enum parse_opt_result" instead of an "int flags" as the return value of the applicable functions in parse-options.c. This will help catch future bugs, such as the missing "case" arms in the two existing users of the API in "blame.c" and "shortlog.c". A third caller in 309be813c9b (update-index: migrate to parse-options API, 2010-12-01) was already checking for these. As can be seen when trying to sort through the deluge of warnings produced when compiling this with CC=g++ (mostly unrelated to this change) we're not consistently using "enum parse_opt_result" even now, i.e. we'll return error() and "return 0;". See f41179f16ba (parse-options: avoid magic return codes, 2019-01-27) for a commit which started changing some of that. I'm not doing any more of that exhaustive migration here, and it's probably not worthwhile past the point of being able to check "enum parse_opt_result" in switch(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27builtin/blame.c: refactor commit_info_init() to COMMIT_INFO_INIT macroLibravatar Ævar Arnfjörð Bjarmason1-17/+13
Remove the commit_info_init() function addded in ea02ffa3857 (mailmap: simplify map_user() interface, 2013-01-05) and instead initialize the "struct commit_info" with a macro. This is the more idiomatic pattern in the codebase, and doesn't leave us wondering when we see the *_init() function if this struct needs more complex initialization than a macro can provide. The get_commit_info() function is only called by the three callers being changed here immediately after initializing the struct with the macros, so by moving the initialization to the callers we don't need to do it in get_commit_info() anymore. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-25Merge branch 'rs/blame-optim'Libravatar Junio C Hamano1-5/+3
Optimization in "git blame" * rs/blame-optim: blame: remove unnecessary use of get_commit_info()
2021-02-17blame: remove unnecessary use of get_commit_info()Libravatar Rafael Silva1-5/+3
When `git blame --color-by-age`, the determine_line_heat() is called to select how to color the output based on the commit's author date. It uses the get_commit_info() to parse the information into a `commit_info` structure, however, this is actually unnecessary because the determine_line_heat() caller also does the same. Instead, let's change the determine_line_heat() to take a `commit_info` structure and remove the internal call to get_commit_info() thus cleaning up and optimizing the code path. Enabling Git's trace2 API in order to record the execution time for every call to determine_line_heat() function: + trace2_region_enter("blame", "determine_line_heat", the_repository); determine_line_heat(ent, &default_color); + trace2_region_enter("blame", "determine_line_heat", the_repository); Then, running `git blame` for "kernel/fork.c" in linux.git and summing all the execution time for every call (around 1.3k calls) resulted in 2.6x faster execution (best out 3): git built from 328c109303 (The eighth batch, 2021-02-12) = 42ms git built from 328c109303 + this change = 16ms Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25Merge branch 'ab/mailmap'Libravatar Junio C Hamano1-1/+1
Clean-up docs, codepaths and tests around mailmap. * ab/mailmap: (22 commits) shortlog: remove unused(?) "repo-abbrev" feature mailmap doc + tests: document and test for case-insensitivity mailmap tests: add tests for empty "<>" syntax mailmap tests: add tests for whitespace syntax mailmap tests: add a test for comment syntax mailmap doc + tests: add better examples & test them tests: refactor a few tests to use "test_commit --append" test-lib functions: add an --append option to test_commit test-lib functions: add --author support to test_commit test-lib functions: document arguments to test_commit test-lib functions: expand "test_commit" comment template mailmap: test for silent exiting on missing file/blob mailmap tests: get rid of overly complex blame fuzzing mailmap tests: add a test for "not a blob" error mailmap tests: remove redundant entry in test mailmap tests: improve --stdin tests mailmap tests: modernize syntax & test idioms mailmap tests: use our preferred whitespace syntax mailmap doc: start by mentioning the comment syntax check-mailmap doc: note config options ...
2021-01-12shortlog: remove unused(?) "repo-abbrev" featureLibravatar Ævar Arnfjörð Bjarmason1-1/+1
Remove support for the magical "repo-abbrev" comment in .mailmap files. This was added to .mailmap parsing in [1], as a generalized feature of the git-shortlog Perl script added earlier in [2]. There was no documentation or tests for this feature, and I don't think it's used in practice anymore. What it did was to allow you to specify a single string to be search-replaced with "/.../" in the .mailmap file. E.g. for linux.git's current .mailmap: git archive --remote=git@gitlab.com:linux-kernel/linux.git \ HEAD -- .mailmap | grep -a repo-abbrev # repo-abbrev: /pub/scm/linux/kernel/git/ Then when running e.g.: git shortlog --merges --author=Linus -1 v5.10-rc7..v5.10 | grep Merge We'd emit (the [...] is mine): Merge tag [...]git://git.kernel.org/.../tip/tip But will now emit: Merge tag [...]git.kernel.org/pub/scm/linux/kernel/git/tip/tip I think at this point this is just a historical artifact we can get rid of. It was initially meant for Linus's own use when we integrated the Perl script[2], but since then it seems he's stopped using it. Digging through Linus's release announcements on the LKML[3] the last release I can find that made use of this output is Linux 2.6.25-rc6 back in March 2008[4]. Later on Linus started using --no-merges[5], and nowadays seems to prefer some custom not-quite-shortlog format of merges from lieutenants[6]. You will still see it on linux.git if you run "git shortlog" manually yourself with --merges, with this removed you can still get the same output with: git log --pretty=fuller v5.10-rc7..v5.10 | sed 's!/pub/scm/linux/kernel/git/!/.../!g' | git shortlog Arguably we should do the same for the search-replacing of "[PATCH]" at the beginning with "". That seems to be another relic of a bygone era when linux.git patches would have their E-Mail subject lines applied as-is by "git am" or whatever. But we documented that feature in "git-shortlog(1)", and it seems more widely applicable than something purely kernel-specific. 1. 7595e2ee6ef (git-shortlog: make common repository prefix configurable with .mailmap, 2006-11-25) 2. fa375c7f1b6 (Add git-shortlog perl script, 2005-06-04) 3. https://lore.kernel.org/lkml/ 4. https://lore.kernel.org/lkml/alpine.LFD.1.00.0803161651350.3020@woody.linux-foundation.org/ 5. https://lore.kernel.org/lkml/BANLkTinrbh7Xi27an3uY7pDWrNKhJRYmEA@mail.gmail.com/ 6. https://lore.kernel.org/lkml/CAHk-=wg1+kf1AVzXA-RQX0zjM6t9J2Kay9xyuNqcFHWV-y5ZYw@mail.gmail.com/ Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06builtin/*: update usage formatLibravatar ZheNing Hu1-24/+24
According to the guidelines in parse-options.h, we should not end in a full stop or start with a capital letter. Fix old error and usage messages to match this expectation. Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01blame: simplify 'setup_blame_bloom_data' interfaceLibravatar Philippe Blain1-1/+1
The penultimate commit moved the initialization of 'sb.path' in 'builtin/blame.c::cmd_blame' before the call to 'blame.c::setup_blame_bloom_data'. Since 'cmd_blame' is the only caller of 'setup_blame_bloom_data', it is now unnecessary for 'setup_blame_bloom_data' to receive 'path' as a separate argument, as 'sb.path' is already initialized. Remove this argument from setup_blame_bloom_data's interface and use the 'path' field of the 'sb' 'struct blame_scoreboard' instead. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01blame: simplify 'setup_scoreboard' interfaceLibravatar Philippe Blain1-1/+1
The previous commit moved the initialization of 'sb.path' in 'builtin/blame.c::cmd_blame' before the call to 'blame.c::setup_scoreboard'. Since 'cmd_blame' is the only caller of 'setup_scoreboard', it is now unnecessary for 'setup_scoreboard' to receive 'path' as a separate argument, as 'sb.path' is already initialized. Remove this argument from setup_scoreboard's interface and use the 'path' field of the 'sb' 'struct blame_scoreboard' instead. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01blame: enable funcname blaming with userdiff driverLibravatar Philippe Blain1-2/+2
In blame.c::cmd_blame, we send the 'path' field of the 'sb' 'struct blame_scoreboard' as the 'path' argument to 'line-range.c::parse_range_arg', but 'sb.path' is not set yet; it's set to the local variable 'path' a few lines later at line 1137. This 'path' argument is only used in 'parse_range_arg' if we are blaming a funcname, i.e. `git blame -L :<funcname> <path>`, and in that case it is sent to 'parse_range_funcname', where it is used to determine if a userdiff driver should be used for said <path> to match the given funcname. Since 'path' is yet unset, the userdiff driver is never used, so we fall back to the default funcname regex, which is usually not appropriate for paths that are set to use a specific userdiff driver, and thus either we match some unrelated lines, or we die with fatal: -L parameter '<funcname>' starting at line 1: no match This has been the case ever since `git blame` learned to blame a funcname in 13b8f68c1f (log -L: :pattern:file syntax to find by funcname, 2013-03-28). Enable funcname blaming for paths using specific userdiff drivers by initializing 'sb.path' earlier in 'cmd_blame', when some of its other fields are initialized, so that it is set when passed to 'parse_range_arg'. Add a regression test in 'annotate-tests.sh', which is sourced in t8001-annotate.sh and t8002-blame.sh, leveraging an existing file used to test the userdiff patterns in t4018-diff-funcname. Also, use 'sb.path' instead of 'path' when constructing the error message at line 1114, for consistency. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01line-log: mention both modes in 'blame' and 'log' short helpLibravatar Philippe Blain1-1/+2
'git blame -h' and 'git log -h' both show '-L <n,m>' and describe this option as "Process only line range n,m, counting from 1". No hint is given that a function name regex can also be used. Use <range> instead, and expand the description of the option to mention both modes. Remove "counting from 1" as it's uneeded; it's uncommon to refer to the first line of a file as "line 0". Also, for 'git log', improve the wording to better reflect the long help. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-12blame: handle deref_tag() returning NULLLibravatar René Scharfe1-0/+2
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-04Merge branch 'jc/blame-ignore-fix'Libravatar Junio C Hamano1-2/+25
"git blame --ignore-rev/--ignore-revs-file" failed to validate their input are valid revision, and failed to take into account that the user may want to give an annotated tag instead of a commit, which has been corrected. * jc/blame-ignore-fix: blame: validate and peel the object names on the ignore list t8013: minimum preparatory clean-up
2020-09-24blame: validate and peel the object names on the ignore listLibravatar Junio C Hamano1-2/+25
The command reads list of object names to place on the ignore list either from the command line or from a file, but they are not checked with their object type (those read from the file are not even checked for object existence). Extend the oidset_parse_file() API and allow it to take a callback that can be used to die (e.g. when an inappropriate input is read) or modify the object name read (e.g. when a tag pointing at a commit is read, and the caller wants a commit object name), and use it in the code that handles ignore list. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-14messages: avoid SHA-1 in end-user facing messagesLibravatar Junio C Hamano1-1/+1
There are still a handful mentions of SHA-1 when we meant the (hexadecimal) object names in end-user facing messages. Rewrite them. I was hoping that this can mostly be s/SHA-1/object name/, but a few messages needed rephrasing to keep the result readable. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-05Merge branch 'dl/opt-callback-cleanup'Libravatar Junio C Hamano1-2/+2
Code cleanup. * dl/opt-callback-cleanup: Use OPT_CALLBACK and OPT_CALLBACK_F
2020-04-28Use OPT_CALLBACK and OPT_CALLBACK_FLibravatar Denton Liu1-2/+2
In the codebase, there are many options which use OPTION_CALLBACK in a plain ol' struct definition. However, we have the OPT_CALLBACK and OPT_CALLBACK_F macros which are meant to abstract these plain struct definitions away. These macros are useful as they semantically signal to developers that these are just normal callback option with nothing fancy happening. Replace plain struct definitions of OPTION_CALLBACK with OPT_CALLBACK or OPT_CALLBACK_F where applicable. The heavy lifting was done using the following (disgusting) shell script: #!/bin/sh do_replacement () { tr '\n' '\r' | sed -e 's/{\s*OPTION_CALLBACK,\s*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\s*0,\(\s*[^[:space:]}]*\)\s*}/OPT_CALLBACK(\1,\2,\3,\4,\5,\6)/g' | sed -e 's/{\s*OPTION_CALLBACK,\s*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\(\s*[^[:space:]}]*\)\s*}/OPT_CALLBACK_F(\1,\2,\3,\4,\5,\6,\7)/g' | tr '\r' '\n' } for f in $(git ls-files \*.c) do do_replacement <"$f" >"$f.tmp" mv "$f.tmp" "$f" done The result was manually inspected and then reformatted to match the style of the surrounding code. Finally, using `git grep OPTION_CALLBACK \*.c`, leftover results which were not handled by the script were manually transformed. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-16blame: use changed-path Bloom filtersLibravatar Derrick Stolee1-0/+10
The changed-path Bloom filters help reduce the amount of tree parsing required during history queries. Before calculating a diff, we can ask the filter if a path changed between a commit and its first parent. If the filter says "no" then we can move on without parsing trees. If the filter says "maybe" then we parse trees to discover if the answer is actually "yes" or "no". When computing a blame, there is a section in find_origin() that computes a diff between a commit and one of its parents. When this is the first parent, we can check the Bloom filters before calling diff_tree_oid(). In order to make this work with the blame machinery, we need to initialize a struct bloom_key with the initial path. But also, we need to add more keys to a list if a rename is detected. We then check to see if _any_ of these keys answer "maybe" in the diff. During development, I purposefully left out this "add a new key when a rename is detected" to see if the test suite would catch my error. That is how I discovered the issues with GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS from the previous change. With that change, we can feel some confidence in the coverage of this change. If a user requests copy detection using "git blame -C", then there are more places where the set of "important" files can expand. I do not know enough about how this happens in the blame machinery. Thus, the Bloom filter integration is explicitly disabled in this mode. A later change could expand the bloom_key data with an appropriate call (or calls) to add_bloom_key(). If we did not disable this mode, then the following tests would fail: t8003-blame-corner-cases.sh t8011-blame-split-file.sh Generally, this is a performance enhancement and should not change the behavior of 'git blame' in any way. If a repo has a commit-graph file with computed changed-path Bloom filters, then they should notice improved performance for their 'git blame' commands. Here are some example timings that I found by blaming some paths in the Linux kernel repository: git blame arch/x86/kernel/topology.c >/dev/null Before: 0.83s After: 0.24s git blame kernel/time/time.c >/dev/null Before: 0.72s After: 0.24s git blame tools/perf/ui/stdio/hist.c >/dev/null Before: 0.27s After: 0.11s I specifically looked for "deep" paths that were also edited many times. As a counterpoint, the MAINTAINERS file was edited many times but is located in the root tree. This means that the cost of computing a diff relative to the pathspec is very small. Here are the timings for that command: git blame MAINTAINERS >/dev/null Before: 20.1s After: 18.0s These timings are the best of five. The worst-case runs were on the order of 2.5 minutes for both cases. Note that the MAINTAINERS file has 18,740 lines across 17,000+ commits. This happens to be one of the cases where this change provides the least improvement. The lack of improvement for the MAINTAINERS file and the relatively modest improvement for the other examples can be easily explained. The blame machinery needs to compute line-level diffs to determine which lines were changed by each commit. That makes up a large proportion of the computation time, and this change does not attempt to improve on that section of the algorithm. The MAINTAINERS file is large and changed often, so it takes time to determine which lines were updated by which commit. In contrast, the code files are much smaller, and it takes longer to comute the line-by-line diff for a single patch on the Linux mailing lists. Outside of the "-C" integration, I believe there is little more to gain from the changed-path Bloom filters for 'git blame' after this patch. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-01Merge branch 'sg/blame-indent-heuristics-is-now-the-default'Libravatar Junio C Hamano1-8/+0
Message update. * sg/blame-indent-heuristics-is-now-the-default: builtin/blame.c: remove '--indent-heuristic' from usage string
2019-11-10Merge branch 'hv/bitshift-constants-in-blame'Libravatar Junio C Hamano1-12/+12
Move the definition of a set of bitmask constants from 0ctal literal to (1U<<count) notation. * hv/bitshift-constants-in-blame: builtin/blame.c: constants into bit shift format
2019-10-29builtin/blame.c: remove '--indent-heuristic' from usage stringLibravatar SZEDER Gábor1-8/+0
The indent heuristic is our default diff heuristic since 33de716387 (diff: enable indent heuristic by default, 2017-05-08), but the usage string of 'git blame' still mentions it as "experimental heuristic". We could simply update the short help associated with the option, but according to the comment above the option's declaration it was "only included here to get included in the "-h" output". That made sense while the feature was still experimental and we wanted to give it more exposure, but nowadays it's unnecessary. So let's rather remove the '--indent-heuristic' option from 'git blame's usage string. Note that 'git blame' will still accept this option, as it is parsed in parse_revision_opt(). Astute readers may notice that this patch removes a comment mentioning "the following two options", but it only removes one option. The reason is that the comment is outdated: that other options was '--compaction-heuristic', and it has already been removed in 3cde4e02ee (diff: retire "compaction" heuristics, 2016-12-23), but that commit forgot to update this comment. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-21builtin/blame.c: constants into bit shift formatLibravatar Hariom Verma1-12/+12
We are looking at bitfield constants, and elsewhere in the Git source code, such cases are handled via bit shift operators rather than octal numbers, which also makes it easier to spot holes in the range (if, say, 1<<5 was missing, it is easier to spot it between 1<<4 and 1<<6 than it is to spot a missing 040 between a 020 and a 0100). Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-11Merge branch 'rs/dedup-includes'Libravatar Junio C Hamano1-1/+0
Code cleanup. * rs/dedup-includes: treewide: remove duplicate #include directives
2019-10-04treewide: remove duplicate #include directivesLibravatar René Scharfe1-1/+0
Found with "git grep '^#include ' '*.c' | sort | uniq -d". Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19builtin/blame: switch uses of GIT_SHA1_HEXSZ to the_hash_algoLibravatar brian m. carlson1-3/+4
Switch several uses of GIT_SHA1_HEXSZ to the_hash_algo. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-19Merge branch 'br/blame-ignore'Libravatar Junio C Hamano1-0/+56
"git blame" learned to "ignore" commits in the history, whose effects (as well as their presence) get ignored. * br/blame-ignore: t8014: remove unnecessary braces blame: drop some unused function parameters blame: add a test to cover blame_coalesce() blame: use the fingerprint heuristic to match ignored lines blame: add a fingerprint heuristic to match ignored lines blame: optionally track line fingerprints during fill_blame_origin() blame: add config options for the output of ignored or unblamable lines blame: add the ability to ignore commits and their changes blame: use a helper function in blame_chunk() Move oidset_parse_file() to oidset.c fsck: rename and touch up init_skiplist()
2019-06-20cache-tree/blame: avoid reusing the DEBUG constantLibravatar Jeff Hostetler1-3/+3
In MS Visual C, the `DEBUG` constant is set automatically whenever compiling with debug information. This is clearly not what was intended in `cache-tree.c` nor in `builtin/blame.c`, so let's use a less ambiguous name there. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-16blame: add config options for the output of ignored or unblamable linesLibravatar Barret Rhoden1-0/+18
When ignoring commits, the commit that is blamed might not be responsible for the change, due to the inaccuracy of our heuristic. Users might want to know when a particular line has a potentially inaccurate blame. Furthermore, guess_line_blames() may fail to find any parent commit for a given line touched by an ignored commit. Those 'unblamable' lines remain blamed on an ignored commit. Users might want to know if a line is unblamable so that they do not spend time investigating a commit they know is uninteresting. This patch adds two config options to mark these two types of lines in the output of blame. The first option can identify ignored lines by specifying blame.markIgnoredLines. When this option is set, each blame line that was blamed on a commit other than the ignored commit is marked with a '?'. For example: 278b6158d6fdb (Barret Rhoden 2016-04-11 13:57:54 -0400 26) appears as: ?278b6158d6fd (Barret Rhoden 2016-04-11 13:57:54 -0400 26) where the '?' is placed before the commit, and the hash has one fewer characters. Sometimes we are unable to even guess at what ancestor commit touched a line. These lines are 'unblamable.' The second option, blame.markUnblamableLines, will mark the line with '*'. For example, say we ignore e5e8d36d04cbe, yet we are unable to blame this line on another commit: e5e8d36d04cbe (Barret Rhoden 2016-04-11 13:57:54 -0400 26) appears as: *e5e8d36d04cb (Barret Rhoden 2016-04-11 13:57:54 -0400 26) When these config options are used together, every line touched by an ignored commit will be marked with either a '?' or a '*'. Signed-off-by: Barret Rhoden <brho@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-16blame: add the ability to ignore commits and their changesLibravatar Barret Rhoden1-0/+38
Commits that make formatting changes or function renames are often not interesting when blaming a file. A user may deem such a commit as 'not interesting' and want to ignore and its changes it when assigning blame. For example, say a file has the following git history / rev-list: ---O---A---X---B---C---D---Y---E---F Commits X and Y both touch a particular line, and the other commits do not: X: "Take a third parameter" -MyFunc(1, 2); +MyFunc(1, 2, 3); Y: "Remove camelcase" -MyFunc(1, 2, 3); +my_func(1, 2, 3); git-blame will blame Y for the change. I'd like to be able to ignore Y: both the existence of the commit as well as any changes it made. This differs from -S rev-list, which specifies the list of commits to process for the blame. We would still process Y, but just don't let the blame 'stick.' This patch adds the ability for users to ignore a revision with --ignore-rev=rev, which may be repeated. They can specify a set of files of full object names of revs, e.g. SHA-1 hashes, one per line. A single file may be specified with the blame.ignoreRevFile config option or with --ignore-rev-file=file. Both the config option and the command line option may be repeated multiple times. An empty file name "" will clear the list of revs from previously processed files. Config options are processed before command line options. For a typical use case, projects will maintain the file containing revisions for commits that perform mass reformatting, and their users have the option to ignore all of the commits in that file. Additionally, a user can use the --ignore-rev option for one-off investigation. To go back to the example above, X was a substantive change to the function, but not the change the user is interested in. The user inspected X, but wanted to find the previous change to that line - perhaps a commit that introduced that function call. To make this work, we can't simply remove all ignored commits from the rev-list. We need to diff the changes introduced by Y so that we can ignore them. We let the blames get passed to Y, just like when processing normally. When Y is the target, we make sure that Y does not *keep* any blames. Any changes that Y is responsible for get passed to its parent. Note we make one pass through all of the scapegoats (parents) to attempt to pass blame normally; we don't know if we *need* to ignore the commit until we've checked all of the parents. The blame_entry will get passed up the tree until we find a commit that has a diff chunk that affects those lines. One issue is that the ignored commit *did* make some change, and there is no general solution to finding the line in the parent commit that corresponds to a given line in the ignored commit. That makes it hard to attribute a particular line within an ignored commit's diff correctly. For example, the parent of an ignored commit has this, say at line 11: commit-a 11) #include "a.h" commit-b 12) #include "b.h" Commit X, which we will ignore, swaps these lines: commit-X 11) #include "b.h" commit-X 12) #include "a.h" We can pass that blame entry to the parent, but line 11 will be attributed to commit A, even though "include b.h" came from commit B. The blame mechanism will be looking at the parent's view of the file at line number 11. ignore_blame_entry() is set up to allow alternative algorithms for guessing per-line blames. Any line that is not attributed to the parent will continue to be blamed on the ignored commit as if that commit was not ignored. Upcoming patches have the ability to detect these lines and mark them in the blame output. The existing algorithm is simple: blame each line on the corresponding line in the parent's diff chunk. Any lines beyond that stay with the target. For example, the parent of an ignored commit has this, say at line 11: commit-a 11) void new_func_1(void *x, void *y); commit-b 12) void new_func_2(void *x, void *y); commit-c 13) some_line_c commit-d 14) some_line_d After a commit 'X', we have: commit-X 11) void new_func_1(void *x, commit-X 12) void *y); commit-X 13) void new_func_2(void *x, commit-X 14) void *y); commit-c 15) some_line_c commit-d 16) some_line_d Commit X nets two additionally lines: 13 and 14. The current guess_line_blames() algorithm will not attribute these to the parent, whose diff chunk is only two lines - not four. When we ignore with the current algorithm, we get: commit-a 11) void new_func_1(void *x, commit-b 12) void *y); commit-X 13) void new_func_2(void *x, commit-X 14) void *y); commit-c 15) some_line_c commit-d 16) some_line_d Note that line 12 was blamed on B, though B was the commit for new_func_2(), not new_func_1(). Even when guess_line_blames() finds a line in the parent, it may still be incorrect. Signed-off-by: Barret Rhoden <brho@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-25Merge branch 'sg/blame-in-bare-start-at-head'Libravatar Junio C Hamano1-0/+13
"git blame -- path" in a non-bare repository starts blaming from the working tree, and the same command in a bare repository errors out because there is no working tree by definition. The command has been taught to instead start blaming from the commit at HEAD, which is more useful. * sg/blame-in-bare-start-at-head: blame: default to HEAD in a bare repo when no start commit is given
2019-04-08blame: default to HEAD in a bare repo when no start commit is givenLibravatar SZEDER Gábor1-0/+13
When 'git blame' is invoked without specifying the commit to start blaming from, it starts from the given file's state in the work tree. However, when invoked in a bare repository without a start commit, then there is no work tree state to start from, and it dies with the following error message: $ git rev-parse --is-bare-repository true $ git blame file.c fatal: this operation must be run in a work tree This is misleading, because it implies that 'git blame' doesn't work in bare repositories at all, but it does, in fact, work just fine when it is given a commit to start from. We could improve the error message, of course, but let's just default to HEAD in a bare repository instead, as most likely that is what the user wanted anyway (if they wanted to start from an other commit, then they would have specified that in the first place). 'git annotate' is just a thin wrapper around 'git blame', so in the same situation it printed the same misleading error message, and this patch fixes it, too. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-07Merge branch 'nd/diff-parseopt'Libravatar Junio C Hamano1-1/+1
The diff machinery, one of the oldest parts of the system, which long predates the parse-options API, uses fairly long and complex handcrafted option parser. This is being rewritten to use the parse-options API. * nd/diff-parseopt: diff.c: convert --raw diff.c: convert -W|--[no-]function-context diff.c: convert -U|--unified diff.c: convert -u|-p|--patch diff.c: prepare to use parse_options() for parsing diff.h: avoid bit fields in struct diff_flags diff.h: keep forward struct declarations sorted parse-options: allow ll_callback with OPTION_CALLBACK parse-options: avoid magic return codes parse-options: stop abusing 'callback' for lowlevel callbacks parse-options: add OPT_BITOP() parse-options: disable option abbreviation with PARSE_OPT_KEEP_UNKNOWN parse-options: add one-shot mode parse-options.h: remove extern on function prototypes
2019-02-06Merge branch 'lt/date-human'Libravatar Junio C Hamano1-0/+4
A new date format "--date=human" that morphs its output depending on how far the time is from the current time has been introduced. "--date=auto" can be used to use this new format when the output is going to the pager or to the terminal and otherwise the default format. * lt/date-human: Add `human` date format tests. Add `human` format to test-tool Add 'human' date format documentation Replace the proposed 'auto' mode with 'auto:' Add 'human' date format
2019-01-27parse-options: stop abusing 'callback' for lowlevel callbacksLibravatar Nguyễn Thái Ngọc Duy1-1/+1
Lowlevel callbacks have different function signatures. Add a new field in 'struct option' with the right type for lowlevel callbacks. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switchLibravatar Nguyễn Thái Ngọc Duy1-1/+2
By default, index compat macros are off from now on, because they could hide the_index dependency. Only those in builtin can use it. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-18Add 'human' date formatLibravatar Linus Torvalds1-0/+4
This adds --date=human, which skips the timezone if it matches the current time-zone, and doesn't print the whole date if that matches (ie skip printing year for dates that are "this year", but also skip the whole date itself if it's in the last few days and we can just say what weekday it was). For really recent dates (same day), use the relative date stamp, while for old dates (year doesn't match), don't bother with time and timezone. Also add 'auto' date mode, which defaults to human if we're using the pager. So you can do git config --add log.date auto and your "git log" commands will show the human-legible format unless you're scripting things. Note that this time format still shows the timezone for recent enough events (but not so recent that they show up as relative dates). You can combine it with the "-local" suffix to never show timezones for an even more simplified view. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Stephen P. Smith <ischis2@cox.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-15Merge branch 'nd/show-gitcomp-compilation-fix' into maintLibravatar Junio C Hamano1-0/+2
Portability fix for a recent update to parse-options API. * nd/show-gitcomp-compilation-fix: parse-options: fix SunCC compiler warning
2018-12-12parse-options: fix SunCC compiler warningLibravatar Nguyễn Thái Ngọc Duy1-0/+2
The compiler reports this because show_gitcomp() never actually returns a value: "parse-options.c", line 520: warning: Function has no return statement : show_gitcomp We could shut the compiler up. But instead let's not bury exit() too deep. Do the same as internal -h handling, return a special error code and handle the exit() in parse_options() (and other parse_options_step() callers) instead. Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-06assert NOARG/NONEG behavior of parse-options callbacksLibravatar Jeff King1-0/+4
When we define a parse-options callback, the flags we put in the option struct must match what the callback expects. For example, a callback which does not handle the "unset" parameter should only be used with PARSE_OPT_NONEG. But since the callback and the option struct are not defined next to each other, it's easy to get this wrong (as earlier patches in this series show). Fortunately, the compiler can help us here: compiling with -Wunused-parameters can show us which callbacks ignore their "unset" parameters (and likewise, ones that ignore "arg" expect to be triggered with PARSE_OPT_NOARG). But after we've inspected a callback and determined that all of its callers use the right flags, what do we do next? We'd like to silence the compiler warning, but do so in a way that will catch any wrong calls in the future. We can do that by actually checking those variables and asserting that they match our expectations. Because this is such a common pattern, we'll introduce some helper macros. The resulting messages aren't as descriptive as we could make them, but the file/line information from BUG() is enough to identify the problem (and anyway, the point is that these should never be seen). Each of the annotated callbacks in this patch triggers -Wunused-parameters, and was manually inspected to make sure all callers use the correct options (so none of these BUGs should be triggerable). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-21revision.c: remove implicit dependency on the_indexLibravatar Nguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-21line-range.c: remove implicit dependency on the_indexLibravatar Nguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20Merge branch 'nd/no-the-index'Libravatar Junio C Hamano1-0/+1
The more library-ish parts of the codebase learned to work on the in-core index-state instance that is passed in by their callers, instead of always working on the singleton "the_index" instance. * nd/no-the-index: (24 commits) blame.c: remove implicit dependency on the_index apply.c: remove implicit dependency on the_index apply.c: make init_apply_state() take a struct repository apply.c: pass struct apply_state to more functions resolve-undo.c: use the right index instead of the_index archive-*.c: use the right repository archive.c: avoid access to the_index grep: use the right index instead of the_index attr: remove index from git_attr_set_direction() entry.c: use the right index instead of the_index submodule.c: use the right index instead of the_index pathspec.c: use the right index instead of the_index unpack-trees: avoid the_index in verify_absent() unpack-trees: convert clear_ce_flags* to avoid the_index unpack-trees: don't shadow global var the_index unpack-trees: add a note about path invalidation unpack-trees: remove 'extern' on function declaration ls-files: correct index argument to get_convert_attr_ascii() preload-index.c: use the right index instead of the_index dir.c: remove an implicit dependency on the_index in pathspec code ...
2018-08-15Merge branch 'nd/i18n'Libravatar Junio C Hamano1-1/+1
Many more strings are prepared for l10n. * nd/i18n: (23 commits) transport-helper.c: mark more strings for translation transport.c: mark more strings for translation sha1-file.c: mark more strings for translation sequencer.c: mark more strings for translation replace-object.c: mark more strings for translation refspec.c: mark more strings for translation refs.c: mark more strings for translation pkt-line.c: mark more strings for translation object.c: mark more strings for translation exec-cmd.c: mark more strings for translation environment.c: mark more strings for translation dir.c: mark more strings for translation convert.c: mark more strings for translation connect.c: mark more strings for translation config.c: mark more strings for translation commit-graph.c: mark more strings for translation builtin/replace.c: mark more strings for translation builtin/pack-objects.c: mark more strings for translation builtin/grep.c: mark strings for translation builtin/config.c: mark more strings for translation ...
2018-08-13blame.c: remove implicit dependency on the_indexLibravatar Nguyễn Thái Ngọc Duy1-0/+1
Side note, since we gain access to the right repository, we can stop rely on the_repository in this code as well. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-02Merge branch 'is/parsing-line-range'Libravatar Junio C Hamano1-2/+2
Parsing of -L[<N>][,[<M>]] parameters "git blame" and "git log" take has been tweaked. * is/parsing-line-range: log: prevent error if line range ends past end of file blame: prevent error if range ends past end of file
2018-07-24Merge branch 'sb/blame-color'Libravatar Junio C Hamano1-1/+3
Code clean-up. * sb/blame-color: blame: prefer xsnprintf to strcpy for colors
2018-07-23Update messages in preparation for i18nLibravatar Nguyễn Thái Ngọc Duy1-1/+1
Many messages will be marked for translation in the following commits. This commit updates some of them to be more consistent and reduce diff noise in those commits. Changes are - keep the first letter of die(), error() and warning() in lowercase - no full stop in die(), error() or warning() if it's single sentence messages - indentation - some messages are turned to BUG(), or prefixed with "BUG:" and will not be marked for i18n - some messages are improved to give more information - some messages are broken down by sentence to be i18n friendly (on the same token, combine multiple warning() into one big string) - the trailing \n is converted to printf_ln if possible, or deleted if not redundant - errno_errno() is used instead of explicit strerror() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>