summaryrefslogtreecommitdiff
path: root/builtin/shortlog.c
AgeCommit message (Collapse)AuthorFilesLines
2021-01-25Merge branch 'ab/mailmap'Libravatar Junio C Hamano1-14/+2
Clean-up docs, codepaths and tests around mailmap. * ab/mailmap: (22 commits) shortlog: remove unused(?) "repo-abbrev" feature mailmap doc + tests: document and test for case-insensitivity mailmap tests: add tests for empty "<>" syntax mailmap tests: add tests for whitespace syntax mailmap tests: add a test for comment syntax mailmap doc + tests: add better examples & test them tests: refactor a few tests to use "test_commit --append" test-lib functions: add an --append option to test_commit test-lib functions: add --author support to test_commit test-lib functions: document arguments to test_commit test-lib functions: expand "test_commit" comment template mailmap: test for silent exiting on missing file/blob mailmap tests: get rid of overly complex blame fuzzing mailmap tests: add a test for "not a blob" error mailmap tests: remove redundant entry in test mailmap tests: improve --stdin tests mailmap tests: modernize syntax & test idioms mailmap tests: use our preferred whitespace syntax mailmap doc: start by mentioning the comment syntax check-mailmap doc: note config options ...
2021-01-12shortlog: remove unused(?) "repo-abbrev" featureLibravatar Ævar Arnfjörð Bjarmason1-14/+2
Remove support for the magical "repo-abbrev" comment in .mailmap files. This was added to .mailmap parsing in [1], as a generalized feature of the git-shortlog Perl script added earlier in [2]. There was no documentation or tests for this feature, and I don't think it's used in practice anymore. What it did was to allow you to specify a single string to be search-replaced with "/.../" in the .mailmap file. E.g. for linux.git's current .mailmap: git archive --remote=git@gitlab.com:linux-kernel/linux.git \ HEAD -- .mailmap | grep -a repo-abbrev # repo-abbrev: /pub/scm/linux/kernel/git/ Then when running e.g.: git shortlog --merges --author=Linus -1 v5.10-rc7..v5.10 | grep Merge We'd emit (the [...] is mine): Merge tag [...]git://git.kernel.org/.../tip/tip But will now emit: Merge tag [...]git.kernel.org/pub/scm/linux/kernel/git/tip/tip I think at this point this is just a historical artifact we can get rid of. It was initially meant for Linus's own use when we integrated the Perl script[2], but since then it seems he's stopped using it. Digging through Linus's release announcements on the LKML[3] the last release I can find that made use of this output is Linux 2.6.25-rc6 back in March 2008[4]. Later on Linus started using --no-merges[5], and nowadays seems to prefer some custom not-quite-shortlog format of merges from lieutenants[6]. You will still see it on linux.git if you run "git shortlog" manually yourself with --merges, with this removed you can still get the same output with: git log --pretty=fuller v5.10-rc7..v5.10 | sed 's!/pub/scm/linux/kernel/git/!/.../!g' | git shortlog Arguably we should do the same for the search-replacing of "[PATCH]" at the beginning with "". That seems to be another relic of a bygone era when linux.git patches would have their E-Mail subject lines applied as-is by "git am" or whatever. But we documented that feature in "git-shortlog(1)", and it seems more widely applicable than something purely kernel-specific. 1. 7595e2ee6ef (git-shortlog: make common repository prefix configurable with .mailmap, 2006-11-25) 2. fa375c7f1b6 (Add git-shortlog perl script, 2005-06-04) 3. https://lore.kernel.org/lkml/ 4. https://lore.kernel.org/lkml/alpine.LFD.1.00.0803161651350.3020@woody.linux-foundation.org/ 5. https://lore.kernel.org/lkml/BANLkTinrbh7Xi27an3uY7pDWrNKhJRYmEA@mail.gmail.com/ 6. https://lore.kernel.org/lkml/CAHk-=wg1+kf1AVzXA-RQX0zjM6t9J2Kay9xyuNqcFHWV-y5ZYw@mail.gmail.com/ Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06builtin/*: update usage formatLibravatar ZheNing Hu1-5/+5
According to the guidelines in parse-options.h, we should not end in a full stop or start with a capital letter. Fix old error and usage messages to match this expectation. Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11shortlog: use strset from strmap.hLibravatar Elijah Newren1-57/+4
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-02hashmap: provide deallocation function namesLibravatar Elijah Newren1-1/+1
hashmap_free(), hashmap_free_entries(), and hashmap_free_() have existed for a while, but aren't necessarily the clearest names, especially with hashmap_partial_clear() being added to the mix and lazy-initialization now being supported. Peff suggested we adopt the following names[1]: - hashmap_clear() - remove all entries and de-allocate any hashmap-specific data, but be ready for reuse - hashmap_clear_and_free() - ditto, but free the entries themselves - hashmap_partial_clear() - remove all entries but don't deallocate table - hashmap_partial_clear_and_free() - ditto, but free the entries This patch provides the new names and converts all existing callers over to the new naming scheme. [1] https://lore.kernel.org/git/20201030125059.GA3277724@coredump.intra.peff.net/ Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27shortlog: allow multiple groups to be specifiedLibravatar Jeff King1-24/+40
Now that shortlog supports reading from trailers, it can be useful to combine counts from multiple trailers, or between trailers and authors. This can be done manually by post-processing the output from multiple runs, but it's non-trivial to make sure that each name/commit pair is counted only once. This patch teaches shortlog to accept multiple --group options on the command line, and pull data from all of them. That makes it possible to run: git shortlog -ns --group=author --group=trailer:co-authored-by to get a shortlog that counts authors and co-authors equally. The implementation is mostly straightforward. The "group" enum becomes a bitfield, and the trailer key becomes a list. I didn't bother implementing the multi-group semantics for reading from stdin. It would be possible to do, but the existing matching code makes it awkward, and I doubt anybody cares. The duplicate suppression we used for trailers now covers authors and committers as well (though in non-trailer single-group mode we can skip the hash insertion and lookup, since we only see one value per commit). There is one subtlety: we now care about the case when no group bit is set (in which case we default to showing the author). The caller in builtin/log.c needs to be adapted to ask explicitly for authors, rather than relying on shortlog_init(). It would be possible with some gymnastics to make this keep working as-is, but it's not worth it for a single caller. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27shortlog: parse trailer identsLibravatar Jeff King1-0/+6
Trailers don't necessarily contain name/email identity values, so shortlog has so far treated them as opaque strings. However, since many trailers do contain identities, it's useful to treat them as such when they can be parsed. That lets "-e" work as usual, as well as mailmap. When they can't be parsed, we'll continue with the old behavior of treating them as a single string (there's no new test for that here, since the existing tests cover a trailer like this). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27shortlog: rename parse_stdin_ident()Libravatar Jeff King1-3/+3
This function is actually useful for parsing any identity, whether from stdin or not. We'll need it for handling trailers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27shortlog: de-duplicate trailer valuesLibravatar Jeff King1-0/+58
The current documentation is vague about what happens with --group=trailer:signed-off-by when we see a commit with: Signed-off-by: One Signed-off-by: Two Signed-off-by: One We clearly should credit both "One" and "Two", but should "One" get credited twice? The current code does so, but mostly because that was the easiest thing to do. It's probably more useful to count each commit at most once. This will become especially important when we allow values from multiple sources in a future patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27shortlog: match commit trailers with --groupLibravatar Jeff King1-1/+43
If a project uses commit trailers, this patch lets you use shortlog to see who is performing each action. For example, running: git shortlog -ns --group=trailer:reviewed-by in git.git shows who has reviewed. You can even use a custom format to see things like who has helped whom: git shortlog --format="...helped %an (%ad)" \ --group=trailer:helped-by Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27shortlog: add grouping optionLibravatar Jeff King1-11/+48
In preparation for adding more grouping types, let's refactor the committer/author grouping code and add a user-facing option that binds them together. In particular: - the main option is now "--group", to make it clear that the various group types are mutually exclusive. The "--committer" option is an alias for "--group=committer". - we keep an enum rather than a binary flag, to prepare for more values - we prefer switch statements to ternary assignment, since other group types will need more custom code Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25shortlog: change "author" variables to "ident"Libravatar Jeff King1-18/+18
We already match "committer", and we're about to start matching more things. Let's use a more neutral variable to avoid confusion. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-28Use OPT_CALLBACK and OPT_CALLBACK_FLibravatar Denton Liu1-2/+2
In the codebase, there are many options which use OPTION_CALLBACK in a plain ol' struct definition. However, we have the OPT_CALLBACK and OPT_CALLBACK_F macros which are meant to abstract these plain struct definitions away. These macros are useful as they semantically signal to developers that these are just normal callback option with nothing fancy happening. Replace plain struct definitions of OPTION_CALLBACK with OPT_CALLBACK or OPT_CALLBACK_F where applicable. The heavy lifting was done using the following (disgusting) shell script: #!/bin/sh do_replacement () { tr '\n' '\r' | sed -e 's/{\s*OPTION_CALLBACK,\s*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\s*0,\(\s*[^[:space:]}]*\)\s*}/OPT_CALLBACK(\1,\2,\3,\4,\5,\6)/g' | sed -e 's/{\s*OPTION_CALLBACK,\s*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\(\s*[^[:space:]}]*\)\s*}/OPT_CALLBACK_F(\1,\2,\3,\4,\5,\6,\7)/g' | tr '\r' '\n' } for f in $(git ls-files \*.c) do do_replacement <"$f" >"$f.tmp" mv "$f.tmp" "$f" done The result was manually inspected and then reformatted to match the style of the surrounding code. Finally, using `git grep OPTION_CALLBACK \*.c`, leftover results which were not handled by the script were manually transformed. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-15Merge branch 'nd/show-gitcomp-compilation-fix' into maintLibravatar Junio C Hamano1-0/+2
Portability fix for a recent update to parse-options API. * nd/show-gitcomp-compilation-fix: parse-options: fix SunCC compiler warning
2018-12-12parse-options: fix SunCC compiler warningLibravatar Nguyễn Thái Ngọc Duy1-0/+2
The compiler reports this because show_gitcomp() never actually returns a value: "parse-options.c", line 520: warning: Function has no return statement : show_gitcomp We could shut the compiler up. But instead let's not bury exit() too deep. Do the same as internal -h handling, return a special error code and handle the exit() in parse_options() (and other parse_options_step() callers) instead. Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-21revision.c: remove implicit dependency on the_indexLibravatar Nguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-17Merge branch 'rs/parse-opt-lithelp'Libravatar Junio C Hamano1-2/+3
The parse-options machinery learned to refrain from enclosing placeholder string inside a "<bra" and "ket>" pair automatically without PARSE_OPT_LITERAL_ARGHELP. Existing help text for option arguments that are not formatted correctly have been identified and fixed. * rs/parse-opt-lithelp: parse-options: automatically infer PARSE_OPT_LITERAL_ARGHELP shortlog: correct option help for -w send-pack: specify --force-with-lease argument help explicitly pack-objects: specify --index-version argument help explicitly difftool: remove angular brackets from argument help add, update-index: fix --chmod argument help push: use PARSE_OPT_LITERAL_ARGHELP instead of unbalanced brackets
2018-08-03parse-options: automatically infer PARSE_OPT_LITERAL_ARGHELPLibravatar René Scharfe1-2/+1
Parseopt wraps argument help strings in a pair of angular brackets by default, to tell users that they need to replace it with an actual value. This is useful in most cases, because most option arguments are indeed single values of a certain type. The option PARSE_OPT_LITERAL_ARGHELP needs to be used in option definitions with arguments that have multiple parts or are literal strings. Stop adding these angular brackets if special characters are present, as they indicate that we don't deal with a simple placeholder. This simplifies the code a bit and makes defining special options slightly easier. Remove the flag PARSE_OPT_LITERAL_ARGHELP in the cases where the new and more cautious handling suffices. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-03shortlog: correct option help for -wLibravatar René Scharfe1-2/+4
Wrap the placeholders in the option help string for -w in pairs of angular brackets to document that users need to replace them with actual numbers. Use the flag PARSE_OPT_LITERAL_ARGHELP to prevent parseopt from adding another pair. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-10Merge branch 'ps/contains-id-error-message'Libravatar Junio C Hamano1-0/+1
"git tag --contains no-such-commit" gave a full list of options after giving an error message. * ps/contains-id-error-message: parse-options: do not show usage upon invalid option value
2018-03-22parse-options: do not show usage upon invalid option valueLibravatar Paul-Sebastian Ungureanu1-0/+1
Usually, the usage should be shown only if the user does not know what options are available. If the user specifies an invalid value, the user is already aware of the available options. In this case, there is no point in displaying the usage anymore. This patch applies to "git tag --contains", "git branch --contains", "git branch --points-at", "git for-each-ref --contains" and many more. Signed-off-by: Paul-Sebastian Ungureanu <ungureanupaulsebastian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15shortlog: disallow left-over arguments outside repoLibravatar Martin Ågren1-0/+5
If we are outside a repo and have any arguments left after option-parsing, `setup_revisions()` will try to do its job and something like this will happen: $ git shortlog v2.16.0.. BUG: environment.c:183: git environment hasn't been setup Aborted (core dumped) The usage is wrong, but we could obviously handle this better. Note that commit abe549e179 (shortlog: do not require to run from inside a git repository, 2008-03-14) explicitly enabled `git shortlog` to run from outside a repo, since we do not need a repo for parsing data from stdin. Disallow left-over arguments when run from outside a repo. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-13shortlog: add usage-string for stdin-readingLibravatar Martin Ågren1-1/+2
This has been missing since we learned to print usage, way back in 4e27fb06f (add commit count options to git-shortlog, 2006-10-06). While at it, drop the [] around "<path>...". This matches `git log -h` and Documentation/git-{short}log.txt. It formally makes it look like we do not allow `git shortlog --`, but we gain readability and consistency. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09shortlog: skip format/parse roundtrip for internal traversalLibravatar Jeff King1-21/+35
The original git-shortlog command parsed the output of git-log, and the logic went something like this: 1. Read stdin looking for "author" lines. 2. Parse the identity into its name/email bits. 3. Apply mailmap to the name/email. 4. Reformat the identity into a single buffer that is our "key" for grouping entries (either a name by default, or "name <email>" if --email was given). The first part happens in read_from_stdin(), and the other three steps are part of insert_one_record(). When we do an internal traversal, we just swap out the stdin read in step 1 for reading the commit objects ourselves. Prior to 2db6b83d18 (shortlog: replace hand-parsing of author with pretty-printer, 2016-01-18), that made sense; we still had to parse the ident in the commit message. But after that commit, we use pretty.c's "%an <%ae>" to get the author ident (for simplicity). Which means that the pretty printer is doing a parse/format under the hood, and then we parse the result, apply the mailmap, and format the result again. Instead, we can just ask pretty.c to do all of those steps for us (including the mailmap via "%aN <%aE>", and not formatting the address when --email is missing). And then we can push steps 2-4 into read_from_stdin(). This speeds up "git shortlog -ns" on linux.git by about 3%, and eliminates a leak in insert_one_record() of the namemailbuf strbuf. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15config: don't include config.h by defaultLibravatar Brandon Williams1-0/+1
Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-24Merge branch 'rs/shortlog-cleanup'Libravatar Junio C Hamano1-1/+0
Code clean-up. * rs/shortlog-cleanup: shortlog: don't set after_subject to an empty string
2017-03-18shortlog: don't set after_subject to an empty stringLibravatar René Scharfe1-1/+0
The string after_subject is added to a strbuf by pp_title_line() if it's not NULL. Adding an empty string has the same effect as not adding anything, but the latter is easier, so don't bother changing the context member from NULL to "". Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-10Merge branch 'rs/log-email-subject'Libravatar Junio C Hamano1-1/+1
Code clean-up. * rs/log-email-subject: pretty: use fmt_output_email_subject() log-tree: factor out fmt_output_email_subject()
2017-03-01pretty: use fmt_output_email_subject()Libravatar René Scharfe1-1/+1
Add the email-style subject prefix (e.g. "Subject: [PATCH] ") directly when it's needed instead of letting log_write_email_headers() prepare it in a static buffer in advance. This simplifies storage ownership and code flow. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-15shortlog: group by committer informationLibravatar Linus Torvalds1-3/+12
In some situations you may want to group the commits not by author, but by committer instead. For example, when I just wanted to look up what I'm still missing from linux-next in the current merge window, I don't care so much about who wrote a patch, as what git tree it came from, which generally boils down to "who committed it". So make git shortlog take a "-c" or "--committer" option to switch grouping to that. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-29use QSORT, part 2Libravatar René Scharfe1-1/+1
Convert two more qsort(3) calls to QSORT to reduce code size and for better safety and consistency. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-19Merge branch 'js/log-to-diffopt-file'Libravatar Junio C Hamano1-5/+10
The commands in the "log/diff" family have had an FILE* pointer in the data structure they pass around for a long time, but some codepaths used to always write to the standard output. As a preparatory step to make "git format-patch" available to the internal callers, these codepaths have been updated to consistently write into that FILE* instead. * js/log-to-diffopt-file: mingw: fix the shortlog --output=<file> test diff: do not color output when --color=auto and --output=<file> is given t4211: ensure that log respects --output=<file> shortlog: respect the --output=<file> setting format-patch: use stdout directly format-patch: avoid freopen() format-patch: explicitly switch off color when writing to files shortlog: support outputting to streams other than stdout graph: respect the diffopt.file setting line-log: respect diffopt's configured output file stream log-tree: respect diffopt's configured output file stream log: prepare log/log-tree to reuse the diffopt.close_file attribute
2016-06-24shortlog: respect the --output=<file> settingLibravatar Johannes Schindelin1-1/+3
Thanks to the diff option parsing, we already know about this option. We just have to make use of it. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-24shortlog: support outputting to streams other than stdoutLibravatar Johannes Schindelin1-5/+8
This will be needed to avoid freopen() in `git format-patch`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13Merge branch 'jk/parseopt-string-list' into jk/string-list-static-initLibravatar Junio C Hamano1-3/+3
* jk/parseopt-string-list: blame,shortlog: don't make local option variables static interpret-trailers: don't duplicate option strings parse_opt_string_list: stop allocating new strings
2016-06-13blame,shortlog: don't make local option variables staticLibravatar Jeff King1-3/+3
There's no need for these option variables to be static, except that they are referenced by the options array itself, which is static. But having all of this static is simply unnecessary and confusing (and inconsistent with most other commands, which either use a static global option list or a true function-local one). Note that in some cases we may need to actually initialize the variables (since we cannot rely on BSS to do so). This is a net improvement to readability, though, as we can use the more verbose initializers for our string_lists. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-28Merge branch 'jk/shortlog'Libravatar Junio C Hamano1-87/+99
"git shortlog" used to accumulate various pieces of information regardless of what was asked to be shown in the final output. It has been optimized by noticing what need not to be collected (e.g. there is no need to collect the log messages when showing only the number of changes). * jk/shortlog: shortlog: don't warn on empty author shortlog: optimize out useless string list shortlog: optimize out useless "<none>" normalization shortlog: optimize "--summary" mode shortlog: replace hand-parsing of author with pretty-printer shortlog: use strbufs to read from stdin shortlog: match both "Author:" and "author" on stdin
2016-01-19shortlog: don't warn on empty authorLibravatar Jeff King1-8/+0
Git tries to avoid creating a commit with an empty author name or email. However, commits created by older, less strict versions of git may still be in the history. There's not much point in issuing a warning to stderr for an empty author. The user can't do anything about it now, and we are better off to simply include it in the shortlog output as an empty name/email, and let the caller process it however they see fit. Older versions of shortlog differentiated between "author header not present" (which complained) and "author name/email are blank" (which included the empty ident in the output). But since switching to format_commit_message, we complain to stderr about either case (linux.git has a blank author deep in its history which triggers this). We could try to restore the older behavior (complaining only about the missing header), but in retrospect, there's not much point in differentiating these cases. A missing author header is bogus, but as for the "blank" case, the only useful behavior is to add it to the "empty name" collection. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-19shortlog: optimize out useless string listLibravatar Jeff King1-12/+31
If we are in "--summary" mode, then we do not care about the actual list of subject onelines associated with each author. We care only about the number. So rather than store a string-list for each author full of "<none>", let's just keep a count. This drops my best-of-five for "git shortlog -ns HEAD" on linux.git from: real 0m5.194s user 0m5.028s sys 0m0.168s to: real 0m5.057s user 0m4.916s sys 0m0.144s That's about 2.5%. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-19shortlog: optimize out useless "<none>" normalizationLibravatar Jeff King1-29/+34
If we are in --summary mode, we will always pass <none> to insert_one_record, which will then do some normalization (e.g., cutting out "[PATCH]"). There's no point in doing so if we aren't going to use the result anyway. This drops my best-of-five for "git shortlog -ns HEAD" on linux.git from: real 0m5.257s user 0m5.104s sys 0m0.156s to: real 0m5.194s user 0m5.028s sys 0m0.168s That's only 1%, but arguably the result is clearer to read, as we're able to group our variable declarations inside the conditional block. It also opens up further optimization possibilities for future patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-19shortlog: optimize "--summary" modeLibravatar Jeff King1-4/+6
If the user asked us only to show counts for each author, rather than the individual summary lines, then there is no point in us generating the summaries only to throw them away. With this patch, I measured the following speedup for "git shortlog -ns HEAD" on linux.git (best-of-five): [before] real 0m5.644s user 0m5.472s sys 0m0.176s [after] real 0m5.257s user 0m5.104s sys 0m0.156s That's only ~7%, but it's so easy to do, there's no good reason not to. We don't have to touch any downstream code, since we already fill in the magic string "<none>" to handle commits without a message. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-19shortlog: replace hand-parsing of author with pretty-printerLibravatar Jeff King1-36/+26
When gathering the author and oneline subject for each commit, we hand-parse the commit headers to find the "author" line, and then continue past to the blank line at the end of the header. We can replace this tricky hand-parsing by simply asking the pretty-printer for the relevant items. This also decouples the author and oneline parsing, opening up some new optimizations in further commits. One reason to avoid the pretty-printer is that it might be less efficient than hand-parsing. However, I measured no slowdown at all running "git shortlog -ns HEAD" on linux.git. As a bonus, we also fix a memory leak in the (uncommon) case that the author field is blank. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-19shortlog: use strbufs to read from stdinLibravatar Jeff King1-9/+12
We currently use fixed-size buffers with fgets(), which could lead to incorrect results in the unlikely event that a line had something like "Author:" at exactly its 1024th character. But it's easy to convert this to a strbuf, and because we can reuse the same buffer through the loop, we don't even pay the extra allocation cost. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-19shortlog: match both "Author:" and "author" on stdinLibravatar Jeff King1-3/+4
The original git-shortlog could read both the normal "git log" output as well as "git log --format=raw". However, when it was converted to C by b8ec592 (Build in shortlog, 2006-10-22), the trailing colon became mandatory, and we no longer matched the raw output. Given the amount of intervening time without any bug reports, it's probable that nobody cares. But it's relatively easy to fix, and the end result is hopefully more readable than the original. Note that this no longer matches "author: ", which we did before, but that has never been a format generated by git. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-20Convert struct object to object_idLibravatar brian m. carlson1-1/+1
struct object is one of the major data structures dealing with object IDs. Convert it to use struct object_id instead of an unsigned char array. Convert get_object_hash to refer to the new member as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
2015-06-29convert "enum date_mode" into a structLibravatar Jeff King1-1/+1
In preparation for adding date modes that may carry extra information beyond the mode itself, this patch converts the date_mode enum into a struct. Most of the conversion is fairly straightforward; we pass the struct as a pointer and dereference the type field where necessary. Locations that declare a date_mode can use a "{}" constructor. However, the tricky case is where we use the enum labels as constants, like: show_date(t, tz, DATE_NORMAL); Ideally we could say: show_date(t, tz, &{ DATE_NORMAL }); but of course C does not allow that. Likewise, we cannot cast the constant to a struct, because we need to pass an actual address. Our options are basically: 1. Manually add a "struct date_mode d = { DATE_NORMAL }" definition to each caller, and pass "&d". This makes the callers uglier, because they sometimes do not even have their own scope (e.g., they are inside a switch statement). 2. Provide a pre-made global "date_normal" struct that can be passed by address. We'd also need "date_rfc2822", "date_iso8601", and so forth. But at least the ugliness is defined in one place. 3. Provide a wrapper that generates the correct struct on the fly. The big downside is that we end up pointing to a single global, which makes our wrapper non-reentrant. But show_date is already not reentrant, so it does not matter. This patch implements 3, along with a minor macro to keep the size of the callers sane. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-14standardize usage info string formatLibravatar Alex Henrie1-1/+1
This patch puts the usage info strings that were not already in docopt- like format into docopt-like format, which will be a litle easier for end users and a lot easier for translators. Changes include: - Placing angle brackets around fill-in-the-blank parameters - Putting dashes in multiword parameter names - Adding spaces to [-f|--foobar] to make [-f | --foobar] - Replacing <foobar>* with [<foobar>...] Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05replace {pre,suf}fixcmp() with {starts,ends}_with()Libravatar Christian Couder1-3/+3
Leaving only the function definitions and declarations so that any new topic in flight can still make use of the old functions, replace existing uses of the prefixcmp() and suffixcmp() with new API functions. The change can be recreated by mechanically applying this: $ git grep -l -e prefixcmp -e suffixcmp -- \*.c | grep -v strbuf\\.c | xargs perl -pi -e ' s|!prefixcmp\(|starts_with\(|g; s|prefixcmp\(|!starts_with\(|g; s|!suffixcmp\(|ends_with\(|g; s|suffixcmp\(|!ends_with\(|g; ' on the result of preparatory changes in this series. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-24Merge branch 'jk/shortlog-tolerate-broken-commit'Libravatar Jonathan Nieder1-2/+4
* jk/shortlog-tolerate-broken-commit: shortlog: ignore commits with missing authors
2013-09-18shortlog: ignore commits with missing authorsLibravatar Jeff King1-2/+4
Most of git's traversals are robust against minor breakages in commit data. For example, "git log" will still output an entry for a commit that has a broken encoding or missing author, and will not abort the whole operation. Shortlog, on the other hand, will die as soon as it sees a commit without an author, meaning that a repository with a broken commit cannot get any shortlog output at all. Let's downgrade this fatal error to a warning, and continue the operation. We simply ignore the commit and do not count it in the total (since we do not have any author under which to file it). Alternatively, we could output some kind of "<empty>" record to collect these bogus commits. It is probably not worth it, though; we have already warned to stderr, so the user is aware that such bogosities exist, and any placeholder we came up with would either be syntactically invalid, or would potentially conflict with real data. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>