summaryrefslogtreecommitdiff
path: root/git-add--interactive.perl
AgeCommit message (Collapse)AuthorFilesLines
2017-07-10Merge branch 'pw/unquote-path-in-git-pm'Libravatar Junio C Hamano1-42/+1
Code refactoring. * pw/unquote-path-in-git-pm: t9700: add tests for Git::unquote_path() Git::unquote_path(): throw an exception on bad path Git::unquote_path(): handle '\a' add -i: move unquote_path() to Git.pm
2017-06-30add -i: move unquote_path() to Git.pmLibravatar Phillip Wood1-42/+1
Move unquote_path() from git-add--interactive to Git.pm so it can be used by other scripts. Note this is a straight copy, it does not handle '\a'. That will be fixed in the next commit. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-26Merge branch 'jk/add-p-commentchar-fix'Libravatar Junio C Hamano1-1/+2
"git add -p" were updated in 2.12 timeframe to cope with custom core.commentchar but the implementation was buggy and a metacharacter like $ and * did not work. * jk/add-p-commentchar-fix: add--interactive: quote commentChar regex add--interactive: handle EOF in prompt_yesno
2017-06-21add--interactive: quote commentChar regexLibravatar Jeff King1-1/+1
Since c9d961647 (i18n: add--interactive: mark edit_hunk_manually message for translation, 2016-12-14), when the user asks to edit a hunk manually, we respect core.commentChar in generating the edit instructions. However, when we then strip out comment lines, we use a simple regex like: /^$commentChar/ If your chosen comment character is a regex metacharacter, then that will behave in a confusing manner ("$", for instance, would only eliminate blank lines, not actual comment lines). We can fix that by telling perl not to respect metacharacters. Reported-by: Christian Rösch <christian@croesch.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-21add--interactive: handle EOF in prompt_yesnoLibravatar Jeff King1-0/+1
The prompt_yesno function loops indefinitely waiting for a "y" or "n" response. But it doesn't handle EOF, meaning that we can end up in an infinite loop of reading EOF from stdin. One way to simulate that is with: echo e | GIT_EDITOR='echo corrupt >' git add -p Let's break out of the loop and propagate the undef to the caller. Without modifying the callers that effectively turns it into a "no" response. This is reasonable for both of the current callers, and it leaves room for any future caller to check for undef explicitly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-09add--interactive: drop diff.indentHeuristic handlingLibravatar Jeff King1-4/+0
Now that diff.indentHeuristic is handled automatically by the plumbing commands, there's no need to propagate it manually. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Marc Branchaud <marcnarc@xiplink.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-19Merge branch 'va/i18n-perl-scripts'Libravatar Junio C Hamano1-1/+1
Message fix. * va/i18n-perl-scripts: git-add--interactive.perl: add missing dot in a message
2017-04-13git-add--interactive.perl: add missing dot in a messageLibravatar Ralf Thielow1-1/+1
One message appears twice in the translations and the only difference is a dot at the end. So add this dot to make the messages being identical. Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-17Merge branch 'jk/add-i-use-pathspecs'Libravatar Junio C Hamano1-11/+2
"git add -p <pathspec>" unnecessarily expanded the pathspec to a list of individual files that matches the pathspec by running "git ls-files <pathspec>", before feeding it to "git diff-index" to see which paths have changes, because historically the pathspec language supported by "diff-index" was weaker. These days they are equivalent and there is no reason to internally expand it. This helps both performance and avoids command line argument limit on some platforms. * jk/add-i-use-pathspecs: add--interactive: do not expand pathspecs with ls-files
2017-03-14add--interactive: do not expand pathspecs with ls-filesLibravatar Jeff King1-11/+2
When we want to get the list of modified files, we first expand any user-provided pathspecs with "ls-files", and then feed the resulting list of paths as arguments to "diff-index" and "diff-files". If your pathspec expands into a large number of paths, you may run into one of two problems: 1. The OS may complain about the size of the argument list, and refuse to run. For example: $ (ulimit -s 128 && git add -p drivers) Can't exec "git": Argument list too long at .../git-add--interactive line 177. Died at .../git-add--interactive line 177. That's on the linux.git repository, which has about 20K files in the "drivers" directory (none of them modified in this case). The "ulimit -s" trick is necessary to show the problem on Linux even for such a gigantic set of paths. Other operating systems have much smaller limits (e.g., a real-world case was seen with only 5K files on OS X). 2. Even when it does work, it's really slow. The pathspec code is not optimized for huge numbers of paths. Here's the same case without the ulimit: $ time git add -p drivers No changes. real 0m16.559s user 0m53.140s sys 0m0.220s We can improve this by skipping "ls-files" completely, and just feeding the original pathspecs to the diff commands. This solution was discussed in 2010: http://public-inbox.org/git/20100105041438.GB12574@coredump.intra.peff.net/ but at the time the diff code's pathspecs were more primitive than those used by ls-files (e.g., they did not support globs). Making the change would have caused a user-visible regression, so we didn't. Since then, the pathspec code has been unified, and the diff commands natively understand pathspecs like '*.c'. This patch implements that solution. That skips the argument-list limits, and the result runs much faster: $ time git add -p drivers No changes. real 0m0.149s user 0m0.116s sys 0m0.080s There are two new tests. The first just exercises the globbing behavior to confirm that we are not causing a regression there. The second checks the actual argument behavior using GIT_TRACE. We _could_ do it with the "ulimit -s" trick, as above. But that would mean the test could only run where "ulimit -s" works. And tests of that sort are expensive, because we have to come up with enough files to actually bust the limit (we can't just shrink the "128" down infinitely, since it is also the in-program stack size). Finally, two caveats and possibilities for future work: a. This fixes one argument-list expansion, but there may be others. In fact, it's very likely that if you run "git add -i" and select a large number of modified files that the script would try to feed them all to a single git command. In practice this is probably fine. The real issue here is that the argument list was growing with the _total_ number of files, not the number of modified or selected files. b. If the repository contains filenames with literal wildcard characters (e.g., "foo*"), the original code expanded them via "ls-files" and then fed those wildcard names to "diff-index", which would have treated them as wildcards. This was a bug, which is now fixed (though unless you really go through some contortions with ":(literal)", it's likely that your original pathspec would match whatever the accidentally-expanded wildcard would anyway). So this takes us one step closer to working correctly with files whose names contain wildcard characters, but it's likely that others remain (e.g., if "git add -i" feeds the selected paths to "git add"). Reported-by: Wincent Colaiuta <win@wincent.com> Reported-by: Mislav Marohnić <mislav.marohnic@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-02add--interactive: fix missing file prompt for patch mode with "-i"Libravatar Jeff King1-4/+4
When invoked as "git add -i", each menu interactive menu option prompts the user to select a list of files. This includes the "patch" option, which gets the list before starting the hunk-selection loop. As "git add -p", it behaves differently, and jumps straight to the hunk selection loop. Since 0539d5e6d (i18n: add--interactive: mark patch prompt for translation, 2016-12-14), the "add -i" case mistakenly jumps to straight to the hunk-selection loop. Prior to that commit the distinction between the two cases was managed by the $patch_mode variable. That commit used $patch_mode for something else, and moved the old meaning to the "$cmd" variable. But it forgot to update the $patch_mode check inside patch_update_cmd() which controls the file-list behavior. The simplest fix would be to change that line to check $cmd. But while we're here, let's use a less obscure name for this flag: $patch_mode_only, a boolean which tells whether we are in full-interactive mode or only in patch-mode. Reported-by: Henrik Grubbström <grubba@grubba.org> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-24Merge branch 'rt/align-add-i-help-text'Libravatar Junio C Hamano1-1/+1
Doc update. * rt/align-add-i-help-text: git add -i: replace \t with blanks in the help message
2017-02-22git add -i: replace \t with blanks in the help messageLibravatar Ralf Thielow1-1/+1
Within the help message of 'git add -i', the 'diff' command uses one tab character and blanks to create the space between the name and the description while the others use blanks only. So if the tab size is not at 4 characters, this description will not be in range. Replace the tab character with blanks. Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-10Merge branch 'jc/retire-compaction-heuristics'Libravatar Junio C Hamano1-3/+0
"git diff" and its family had two experimental heuristics to shift the contents of a hunk to make the patch easier to read. One of them turns out to be better than the other, so leave only the "--indent-heuristic" option and remove the other one. * jc/retire-compaction-heuristics: diff: retire "compaction" heuristics
2016-12-27Merge branch 'va/i18n-perl-scripts'Libravatar Junio C Hamano1-114/+215
Porcelain scripts written in Perl are getting internationalized. * va/i18n-perl-scripts: i18n: difftool: mark warnings for translation i18n: send-email: mark composing message for translation i18n: send-email: mark string with interpolation for translation i18n: send-email: mark warnings and errors for translation i18n: send-email: mark strings for translation i18n: add--interactive: mark status words for translation i18n: add--interactive: remove %patch_modes entries i18n: add--interactive: mark edit_hunk_manually message for translation i18n: add--interactive: i18n of help_patch_cmd i18n: add--interactive: mark patch prompt for translation i18n: add--interactive: mark plural strings i18n: clean.c: match string with git-add--interactive.perl i18n: add--interactive: mark strings with interpolation for translation i18n: add--interactive: mark simple here-documents for translation i18n: add--interactive: mark strings for translation Git.pm: add subroutines for commenting lines
2016-12-23diff: retire "compaction" heuristicsLibravatar Junio C Hamano1-3/+0
When a patch inserts a block of lines, whose last lines are the same as the existing lines that appear before the inserted block, "git diff" can choose any place between these existing lines as the boundary between the pre-context and the added lines (adjusting the end of the inserted block as appropriate) to come up with variants of the same patch, and some variants are easier to read than others. We have been trying to improve the choice of this boundary, and Git 2.11 shipped with an experimental "compaction-heuristic". Since then another attempt to improve the logic further resulted in a new "indent-heuristic" logic. It is agreed that the latter gives better result overall, and the former outlived its usefulness. Retire "compaction", and keep "indent" as an experimental feature. The latter hopefully will be turned on by default in a future release, but that should be done as a separate step. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-14i18n: add--interactive: mark status words for translationLibravatar Vasco Almeida1-6/+6
Mark words 'nothing', 'unchanged' and 'binary' used to display what has been staged or not, in "git add -i" status command. Alternatively one could mark N__('nothing') no-op in order to xgettext(1) extract the string and then trigger the translation at run time only with __($print->{FILE}), but that has the side effect of triggering retrieval of translations for the changes indicator too (e.g. +2/-1) which may or may not be a problem. To avoid that potential problem, mark only where there is certain to trigger translation only of those words but in this case we must also retrieve the translation for the eq tests, since the value assigned was of the translation, not the English source. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-14i18n: add--interactive: remove %patch_modes entriesLibravatar Vasco Almeida1-21/+0
Remove unnecessary entries from %patch_modes. After the i18n conversion, these entries are not used anymore. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-14i18n: add--interactive: mark edit_hunk_manually message for translationLibravatar Vasco Almeida1-13/+39
Mark message of edit_hunk_manually displayed in the editing file when user chooses 'e' option. The message had to be unfolded to allow translation of the $participle verb. Some messages end up being exactly the same for some use cases, but left it for easier change in the future, e.g., wanting to change wording of one particular use case. The comment character is now used according to the git configuration core.commentchar. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-14i18n: add--interactive: i18n of help_patch_cmdLibravatar Vasco Almeida1-8/+46
Mark help message of help_patch_cmd for translation. The message must be unfolded to be free of variables so we can have high quality translations. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-14i18n: add--interactive: mark patch prompt for translationLibravatar Vasco Almeida1-8/+46
Mark prompt message assembled in place for translation, unfolding each use case for each entry in the %patch_modes hash table. Previously, this script relied on whether $patch_mode was set to run the command patch_update_cmd() or show status and loop the main loop. Now, it uses $cmd to indicate we must run patch_update_cmd() and $patch_mode is used to tell which flavor of the %patch_modes are we on. This is introduced in order to be able to mark and unfold the message prompt knowing in which context we are. The tracking of context was done previously by point %patch_mode_flavour hash table to the correct entry of %patch_modes, focusing only on value of %patch_modes. Now, we are also interested in the key ('staged', 'stash', 'checkout_head', ...). Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-14i18n: add--interactive: mark plural stringsLibravatar Vasco Almeida1-9/+18
Mark plural strings for translation. Unfold each action case in one entire sentence. Pass new keyword for xgettext to extract. Update test to include new subroutine __n() for plural strings handling. Update documentation to include a description of the new __n() subroutine. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-14i18n: add--interactive: mark strings with interpolation for translationLibravatar Vasco Almeida1-12/+13
Since at this point Git::I18N.perl lacks support for Perl i18n placeholder substitution, use of sprintf following die or error_msg is necessary for placeholder substitution take place. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-14i18n: add--interactive: mark simple here-documents for translationLibravatar Vasco Almeida1-3/+5
Mark messages in here-documents without interpolation for translation. The here-document delimiter \EOF, which is the same as 'EOF', indicates that the text is to be treated literally without interpolation of its content. Unfortunately xgettext is not able to extract here-documents delimited with \EOF but it is with delimiter enclosed in single quotes. So change \EOF to 'EOF', although in this case does not make difference what variation of here-document to use since there is nothing to interpolate. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-14i18n: add--interactive: mark strings for translationLibravatar Vasco Almeida1-34/+42
Mark simple strings (without interpolation) for translation. Brackets around first parameter of ternary operator is necessary because otherwise xgettext fails to extract strings marked for translation from the rest of the file. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-19diff: improve positioning of add/delete blocks in diffsLibravatar Michael Haggerty1-1/+4
Some groups of added/deleted lines in diffs can be slid up or down, because lines at the edges of the group are not unique. Picking good shifts for such groups is not a matter of correctness but definitely has a big effect on aesthetics. For example, consider the following two diffs. The first is what standard Git emits: --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl @@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) { } if (!$smtp_server) { + $smtp_server = $repo->config('sendemail.smtpserver'); +} +if (!$smtp_server) { foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) { if (-x $_) { $smtp_server = $_; The following diff is equivalent, but is obviously preferable from an aesthetic point of view: --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl @@ -230,6 +230,9 @@ if (!defined $initial_reply_to && $prompting) { $initial_reply_to =~ s/(^\s+|\s+$)//g; } +if (!$smtp_server) { + $smtp_server = $repo->config('sendemail.smtpserver'); +} if (!$smtp_server) { foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) { if (-x $_) { This patch teaches Git to pick better positions for such "diff sliders" using heuristics that take the positions of nearby blank lines and the indentation of nearby lines into account. The existing Git code basically always shifts such "sliders" as far down in the file as possible. The only exception is when the slider can be aligned with a group of changed lines in the other file, in which case Git favors depicting the change as one add+delete block rather than one add and a slightly offset delete block. This naive algorithm often yields ugly diffs. Commit d634d61ed6 improved the situation somewhat by preferring to position add/delete groups to make their last line a blank line, when that is possible. This heuristic does more good than harm, but (1) it can only help if there are blank lines in the right places, and (2) always picks the last blank line, even if there are others that might be better. The end result is that it makes perhaps 1/3 as many errors as the default Git algorithm, but that still leaves a lot of ugly diffs. This commit implements a new and much better heuristic for picking optimal "slider" positions using the following approach: First observe that each hypothetical positioning of a diff slider introduces two splits: one between the context lines preceding the group and the first added/deleted line, and the other between the last added/deleted line and the first line of context following it. It tries to find the positioning that creates the least bad splits. Splits are evaluated based only on the presence and locations of nearby blank lines, and the indentation of lines near the split. Basically, it prefers to introduce splits adjacent to blank lines, between lines that are indented less, and between lines with the same level of indentation. In more detail: 1. It measures the following characteristics of a proposed splitting position in a `struct split_measurement`: * the number of blank lines above the proposed split * whether the line directly after the split is blank * the number of blank lines following that line * the indentation of the nearest non-blank line above the split * the indentation of the line directly below the split * the indentation of the nearest non-blank line after that line 2. It combines the measured attributes using a bunch of empirically-optimized weighting factors to derive a `struct split_score` that measures the "badness" of splitting the text at that position. 3. It combines the `split_score` for the top and the bottom of the slider at each of its possible positions, and selects the position that has the best `split_score`. I determined the initial set of weighting factors by collecting a corpus of Git histories from 29 open-source software projects in various programming languages. I generated many diffs from this corpus, and determined the best positioning "by eye" for about 6600 diff sliders. I used about half of the repositories in the corpus (corresponding to about 2/3 of the sliders) as a training set, and optimized the weights against this corpus using a crude automated search of the parameter space to get the best agreement with the manually-determined values. Then I tested the resulting heuristic against the full corpus. The results are summarized in the following table, in column `indent-1`: | repository | count | Git 2.9.0 | compaction | compaction-fixed | indent-1 | indent-2 | | --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- | | afnetworking | 109 | 89 (81.7%) | 37 (33.9%) | 37 (33.9%) | 2 (1.8%) | 2 (1.8%) | | alamofire | 30 | 18 (60.0%) | 14 (46.7%) | 15 (50.0%) | 0 (0.0%) | 0 (0.0%) | | angular | 184 | 127 (69.0%) | 39 (21.2%) | 23 (12.5%) | 5 (2.7%) | 5 (2.7%) | | animate | 313 | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) | | ant | 380 | 356 (93.7%) | 152 (40.0%) | 148 (38.9%) | 15 (3.9%) | 15 (3.9%) | * | bugzilla | 306 | 263 (85.9%) | 109 (35.6%) | 99 (32.4%) | 14 (4.6%) | 15 (4.9%) | * | corefx | 126 | 91 (72.2%) | 22 (17.5%) | 21 (16.7%) | 6 (4.8%) | 6 (4.8%) | | couchdb | 78 | 44 (56.4%) | 26 (33.3%) | 28 (35.9%) | 6 (7.7%) | 6 (7.7%) | * | cpython | 937 | 158 (16.9%) | 50 (5.3%) | 49 (5.2%) | 5 (0.5%) | 5 (0.5%) | * | discourse | 160 | 95 (59.4%) | 42 (26.2%) | 36 (22.5%) | 18 (11.2%) | 13 (8.1%) | | docker | 307 | 194 (63.2%) | 198 (64.5%) | 253 (82.4%) | 8 (2.6%) | 8 (2.6%) | * | electron | 163 | 132 (81.0%) | 38 (23.3%) | 39 (23.9%) | 6 (3.7%) | 6 (3.7%) | | git | 536 | 470 (87.7%) | 73 (13.6%) | 78 (14.6%) | 16 (3.0%) | 16 (3.0%) | * | gitflow | 127 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | | ionic | 133 | 89 (66.9%) | 29 (21.8%) | 38 (28.6%) | 1 (0.8%) | 1 (0.8%) | | ipython | 482 | 362 (75.1%) | 167 (34.6%) | 169 (35.1%) | 11 (2.3%) | 11 (2.3%) | * | junit | 161 | 147 (91.3%) | 67 (41.6%) | 66 (41.0%) | 1 (0.6%) | 1 (0.6%) | * | lighttable | 15 | 5 (33.3%) | 0 (0.0%) | 2 (13.3%) | 0 (0.0%) | 0 (0.0%) | | magit | 88 | 75 (85.2%) | 11 (12.5%) | 9 (10.2%) | 1 (1.1%) | 0 (0.0%) | | neural-style | 28 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | | nodejs | 781 | 649 (83.1%) | 118 (15.1%) | 111 (14.2%) | 4 (0.5%) | 5 (0.6%) | * | phpmyadmin | 491 | 481 (98.0%) | 75 (15.3%) | 48 (9.8%) | 2 (0.4%) | 2 (0.4%) | * | react-native | 168 | 130 (77.4%) | 79 (47.0%) | 81 (48.2%) | 0 (0.0%) | 0 (0.0%) | | rust | 171 | 128 (74.9%) | 30 (17.5%) | 27 (15.8%) | 16 (9.4%) | 14 (8.2%) | | spark | 186 | 149 (80.1%) | 52 (28.0%) | 52 (28.0%) | 2 (1.1%) | 2 (1.1%) | | tensorflow | 115 | 66 (57.4%) | 48 (41.7%) | 48 (41.7%) | 5 (4.3%) | 5 (4.3%) | | test-more | 19 | 15 (78.9%) | 2 (10.5%) | 2 (10.5%) | 1 (5.3%) | 1 (5.3%) | * | test-unit | 51 | 34 (66.7%) | 14 (27.5%) | 8 (15.7%) | 2 (3.9%) | 2 (3.9%) | * | xmonad | 23 | 22 (95.7%) | 2 (8.7%) | 2 (8.7%) | 1 (4.3%) | 1 (4.3%) | * | --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- | | totals | 6668 | 4391 (65.9%) | 1496 (22.4%) | 1491 (22.4%) | 150 (2.2%) | 144 (2.2%) | | totals (training set) | 4552 | 3195 (70.2%) | 1053 (23.1%) | 1061 (23.3%) | 86 (1.9%) | 88 (1.9%) | | totals (test set) | 2116 | 1196 (56.5%) | 443 (20.9%) | 430 (20.3%) | 64 (3.0%) | 56 (2.6%) | In this table, the numbers are the count and percentage of human-rated sliders that the corresponding algorithm got *wrong*. The columns are * "repository" - the name of the repository used. I used the diffs between successive non-merge commits on the HEAD branch of the corresponding repository. * "count" - the number of sliders that were human-rated. I chose most, but not all, sliders to rate from those among which the various algorithms gave different answers. * "Git 2.9.0" - the default algorithm used by `git diff` in Git 2.9.0. * "compaction" - the heuristic used by `git diff --compaction-heuristic` in Git 2.9.0. * "compaction-fixed" - the heuristic used by `git diff --compaction-heuristic` after the fixes from earlier in this patch series. Note that the results are not dramatically different than those for "compaction". Both produce non-ideal diffs only about 1/3 as often as the default `git diff`. * "indent-1" - the new `--indent-heuristic` algorithm, using the first set of weighting factors, determined as described above. * "indent-2" - the new `--indent-heuristic` algorithm, using the final set of weighting factors, determined as described below. * `*` - indicates that repo was part of training set used to determine the first set of weighting factors. The fact that the heuristic performed nearly as well on the test set as on the training set in column "indent-1" is a good indication that the heuristic was not over-trained. Given that fact, I ran a second round of optimization, using the entire corpus as the training set. The resulting set of weights gave the results in column "indent-2". These are the weights included in this patch. The final result gives consistently and significantly better results across the whole corpus than either `git diff` or `git diff --compaction-heuristic`. It makes only about 1/30 as many errors as the former and about 1/10 as many errors as the latter. (And a good fraction of the remaining errors are for diffs that involve weirdly-formatted code, sometimes apparently machine-generated.) The tools that were used to do this optimization and analysis, along with the human-generated data values, are recorded in a separate project [1]. This patch adds a new command-line option `--indent-heuristic`, and a new configuration setting `diff.indentHeuristic`, that activate this heuristic. This interface is only meant for testing purposes, and should be finalized before including this change in any release. [1] https://github.com/mhagger/diff-slider-tools Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-16add--interactive: respect diff.compactionHeuristicLibravatar Jeff King1-0/+4
We use plumbing to generate the diff, so it doesn't automatically pick up UI config like compactionHeuristic. Let's forward it on, since interactive adding is porcelain. Note that we only need to handle the "true" case. There's no point in passing --no-compaction-heuristic when the variable is false, since nothing else could have turned it on. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-28add--interactive: allow custom diff highlighting programsLibravatar Jeff King1-2/+10
The patch hunk selector of add--interactive knows how ask git for colorized diffs, and correlate them with the uncolored diffs we apply. But there's not any way for somebody who uses a diff-filter tool like contrib's diff-highlight to see their normal highlighting. This patch lets users define an arbitrary shell command to pipe the colorized diff through. The exact output shouldn't matter (since we just show the result to humans) as long as it is line-compatible with the original diff (so that hunk-splitting can split the colorized version, too). I left two minor issues with the new system that I don't think are worth fixing right now, but could be done later: 1. We only filter colorized diffs. Theoretically a user could want to filter a non-colorized diff, but I find it unlikely in practice. Users who are doing things like diff-highlighting are likely to want color, too. 2. add--interactive will re-colorize a diff which has been hand-edited, but it won't have run through the filter. Fixing this is conceptually easy (just pipe the diff through the filter), but practically hard to do without using tempfiles (it would need to feed data to and read the result from the filter without deadlocking; this raises portability questions with respect to Windows). I've punted on both issues for now, and if somebody really cares later, they can do a patch on top. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17Merge branch 'ak/add-i-empty-candidates'Libravatar Junio C Hamano1-0/+5
The interactive "show a list and let the user choose from it" interface "add -i" used showed and prompted to the user even when the candidate list was empty, against which the only "choice" the user could have made was to choose nothing. * ak/add-i-empty-candidates: add -i: return from list_and_choose if there is no candidate
2015-01-22add -i: return from list_and_choose if there is no candidateLibravatar Alexander Kuleshov1-0/+5
The list_and_choose() helper is given a prompt and a list, asks the user to make selection from the list, and then returns a list of items chosen. Even when it is given an empty list as the original candidate set to choose from, it gave a prompt to the user, who can only say "I am done choosing". Return an empty result when the input is an empty list without bothering the user. The existing caller must already have a logic to say "Nothing to do" or an equivalent when the returned list is empty (i.e. the user chose to select nothing) if it is necessary, so no change to the callers is necessary. This fixes the case where "add untracked" is asked in "git add -i" and there is no untracked files in the working tree. We used to give an empty list of files to choose from with a prompt, but with this change, we no longer do. Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-15add--interactive: leave main loop on read errorLibravatar Jeff King1-0/+1
The main hunk loop for add--interactive will loop if it does not get a known input. This is a good thing if the user typed some invalid input. However, if we have an uncorrectable read error, we'll end up looping infinitely. We can fix this by noticing read errors (i.e., <STDIN> returns undef) and breaking out of the loop. One easy way to trigger this is if you have an editor that does not take over the terminal (e.g., one that spawns a window in an existing process and waits), start the editor with the hunk-edit command, and hit ^C to send SIGINT. The editor process dies due to SIGINT, but the perl add--interactive process does not (perl suspends SIGINT for the duration of our system() call). We return to the main loop, but further reads from stdin don't work. The SIGINT _also_ killed our parent git process, which orphans our process group, meaning that further reads from the terminal will always fail. We loop infinitely, getting EIO on each read. Note that there are several other spots where we read from stdin, too. However, in each of those cases, we do something sane when the read returns undef (breaking out of the loop, taking the input as "no", etc). They don't need similar treatment. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-08Merge branch 'jl/nor-or-nand-and'Libravatar Junio C Hamano1-2/+2
Eradicate mistaken use of "nor" (that is, essentially "nor" used not in "neither A nor B" ;-)) from in-code comments, command output strings, and documentations. * jl/nor-or-nand-and: code and test: fix misuses of "nor" comments: fix misuses of "nor" contrib: fix misuses of "nor" Documentation: fix misuses of "nor"
2014-03-31code and test: fix misuses of "nor"Libravatar Justin Lebar1-2/+2
Signed-off-by: Justin Lebar <jlebar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-03git-add--interactive: warn if module for interactive.singlekey is missingLibravatar Simon Ruderich1-0/+3
Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Simon Ruderich <simon@ruderich.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-25add-interactive: handle unborn branch in patch modeLibravatar Jeff King1-9/+13
The list_modified function already knows how to handle an unborn branch by diffing against the empty tree. However, the diff we perform to get the actual hunks does not. Let's use the same logic for both diffs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-04add--interactive: fix external command invocation on WindowsLibravatar Johannes Sixt1-1/+1
Back in 21e9757e (Hack git-add--interactive to make it work with ActiveState Perl, 2007-08-01), the invocation of external commands was changed to use qx{} on Windows. The rationale was that the command interpreter on Windows is not a POSIX shell, but rather Windows's CMD. That patch was wrong to include 'msys' in the check whether to use qx{} or not: 'msys' identifies MSYS perl as shipped with Git for Windows, which does not need the special treatment; qx{} should be used only with ActiveState perl, which is identified by 'MSWin32'. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-23add -i: add extra options at the right place in "diff" command lineLibravatar Junio C Hamano1-1/+1
Appending "--diff-algorithm=histogram" at the end of canned command line for various modes of "diff" is correct for most of them but not for "stash" that has a non-option already wired in, like so: 'stash' => { DIFF => 'diff-index -p HEAD', Appending an extra option after non-option may happen to work due to overly lax command line parser, but that is not something we should rely on. Instead, splice in the extra argument immediately after the command name (i.e. 'diff-index', 'diff-files', etc.). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-12add--interactive: respect diff.algorithmLibravatar John Keeping1-0/+5
When staging hunks interactively it is sometimes useful to use an alternative diff algorithm which splits the changes into hunks in a more logical manner. This is not possible because the plumbing commands called by add--interactive ignore the "diff.algorithm" configuration option (as they should). Since add--interactive is a porcelain command it should respect this configuration variable. To do this, make it read diff.algorithm and pass its value to the underlying diff-index and diff-files invocations. At this point, do not add options to "git add", "git reset" or "git checkout" (all of which can call git-add--interactive). If a user wants to override the value on the command line they can use: git -c diff.algorithm=$ALGO ... Signed-off-by: John Keeping <john@keeping.me.uk> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-12Correct common spelling mistakes in comments and testsLibravatar Stefano Lattarini1-1/+1
Most of these were found using Lucas De Marchi's codespell tool. Signed-off-by: Stefano Lattarini <stefano.lattarini@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-06-25git-add--interactive.perl: Remove two unused variablesLibravatar Thomas Badie1-2/+0
The patch 8f0bef6 refactored this script and made the variable $fh unneeded in subs diff_applies and patch_update_file, but forgot to remove them. Signed-off-by: Thomas Badie <badie@lrde.epita.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-05add--interactive: ignore unmerged entries in patch modeLibravatar Jeff King1-7/+18
When "add -p" sees an unmerged entry, it shows the combined diff and then immediately skips the hunk. This can be confusing in a variety of ways, depending on whether there are other changes to stage (in which case you get the superfluous combined diff output in between other hunks) or not (in which case you get the combined diff and the program exits immediately, rather than seeing "No changes"). The current behavior was not planned, and is just what the implementation happens to do. Instead, let's explicitly remove unmerged entries from our list of modified files, and print a warning that we are ignoring them. We can cheaply find which entries are unmerged by adding "--raw" output to the "diff-files --numstat" we already run. There is one non-obvious thing we must change when parsing this combined output. Before this patch, when we saw a numstat line for a file that did not have index changes, we would create a new record with 'unchanged' in the 'INDEX' field. Because "--raw" comes before "--numstat", we must move this special-case down to the raw-line case (and it is sufficient to move it rather than handle it in both places, since any file which has a --numstat will also have a --raw entry). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-17add -i: ignore terminal escape sequencesLibravatar Thomas Rast1-0/+19
On the author's terminal, the up-arrow input sequence is ^[[A, and thus fat-fingering an up-arrow into 'git checkout -p' is quite dangerous: git-add--interactive.perl will ignore the ^[ and [ characters and happily treat A as "discard everything". As a band-aid fix, use Term::Cap to get all terminal capabilities. Then use the heuristic that any capability value that starts with ^[ (i.e., \e in perl) must be a key input sequence. Finally, given an input that starts with ^[, read more characters until we have read a full escape sequence, then return that to the caller. We use a timeout of 0.5 seconds on the subsequent reads to avoid getting stuck if the user actually input a lone ^[. Since none of the currently recognized keys start with ^[, the net result is that the sequence as a whole will be ignored and the help displayed. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29"add -p": work-around an old laziness that does not coalesce hunksLibravatar Junio C Hamano1-1/+1
Since 0beee4c (git-add--interactive: remove hunk coalescing, 2008-07-02), "git add--interactive" behaves lazily and passes overlapping hunks to the underlying "git apply" without coalescing. This was partially corrected by 7a26e65 (its partial revert, 2009-05-16), but overlapping hunks are still passed when the patch is edited. Teach "git apply" the --allow-overlap option that disables a safety feature that avoids misapplication of patches by not applying patches to overlapping hunks, and pass this option form "add -p" codepath. Do not even advertise the option, as this is merely a workaround, and the correct fix is to make "add -p" correctly coalesce adjacent patch hunks. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29add--interactive.perl: factor out repeated --recount optionLibravatar Junio C Hamano1-8/+8
Depending on the direction and the target of patch application, we would need to pass --cached and --reverse to underlying "git apply". Also we only pass --check when we are not applying but just checking. But we always pass --recount since 8cbd431 (git-add--interactive: replace hunk recounting with apply --recount, 2008-07-02). Instead of repeating the same --recount over and over again, move it to a single place that actually runs the command, namely, "run_git_apply" subroutine. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29add -p: 'q' should really quitLibravatar Junio C Hamano1-5/+4
The "quit" command was added in 9a7a1e0 (git add -p: new "quit" command at the prompt, 2009-04-10) to allow the user to say that hunks other than what have already been chosen are undesirable, and exit the interactive loop immediately. It forgot that there may be an undecided hunk before the current one. In such a case, the interactive loop still goes back to the beginning. Clear all the USE bit for undecided hunks and exit the loop. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-29Merge branch 'jl/add-p-reverse-message'Libravatar Junio C Hamano1-2/+11
* jl/add-p-reverse-message: Correct help blurb in checkout -p and friends
2010-10-28Correct help blurb in checkout -p and friendsLibravatar Jonathan "Duke" Leto1-2/+11
When git checkout -p from the index or HEAD is run in edit mode, the help message about removing '-' and '+' lines was backwards. Because it is reverse applying the patch, the meanings of '-' and '+' are reversed. Signed-off-by: Jonathan "Duke" Leto <jonathan@leto.net> Acked-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-27perl: use "use warnings" instead of -wLibravatar Ævar Arnfjörð Bjarmason1-1/+2
Change the Perl scripts to turn on lexical warnings instead of setting the global $^W variable via the -w switch. The -w sets warnings for all code that interpreter runs, while "use warnings" is lexically scoped. The former is probably not what the authors wanted. As an auxiliary benefit it's now possible to build Git with: PERL_PATH='/usr/bin/env perl' Which would previously result in failures, since "#!/usr/bin/env perl -w" doesn't work as a shebang. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-27perl: bump the required Perl version to 5.8 from 5.6.[21]Libravatar Ævar Arnfjörð Bjarmason1-0/+1
Formalize our dependency on perl 5.8, bumped from 5.6.[12]. We already used the three-arg form of open() which was introduced in 5.6.1, but t/t9700/test.pl explicitly depended on 5.6.2. However git-add--interactive.pl has been failing on the 5.6 line since it was introduced in v1.5.0-rc0~12^2~2 back in 2006 due to this open syntax: sub run_cmd_pipe { my $fh = undef; open($fh, '-|', @_) or die; return <$fh>; } Which when executed dies on "Can't use an undefined value as filehandle reference". Several of our tests also fail on 5.6 (even more when compiled with NO_PERL_MAKEMAKER=1): t2016-checkout-patch.sh t3904-stash-patch.sh t3701-add-interactive.sh t7105-reset-patch.sh t7501-commit.sh t9700-perl-git.sh Our code is bitrotting on 5.6 with no-one interested in fixing it, and pinning us to such an ancient release of Perl is keeping us from using useful features introduced in the 5.8 release. The 5.6 series is now over 10 years old, and the 5.6.2 maintenance release almost 7. 5.8 on the other hand is more than 8 years old. All the modern Unix-like operating systems have now upgraded to it or a later version, and 5.8 packages are available for old IRIX, AIX Solaris and Tru64 systems. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Tor Arntsen <tor@spacetec.no> Acked-by: Randal L. Schwartz <merlyn@stonehenge.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-13add-interactive: Clarify “remaining hunks in the file”Libravatar Jonathan Nieder1-3/+3
The "a" and "d" commands to ‘add --patch’ (accept/reject rest of file) interact with "j", "g", and "/" (skip some hunks) in a perhaps confusing way: after accepting or rejecting all _later_ hunks in the file, they return to the earlier, skipped hunks and prompt the user about them again. This behavior can be very useful in practice. One can still accept or reject _all_ undecided hunks in a file by using the "g" command to move to hunk #1 first. Reported-by: Frédéric Brière <fbriere@fbriere.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>