diff options
author | SZEDER Gábor <szeder.dev@gmail.com> | 2019-12-09 12:52:57 +0100 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2019-12-09 13:33:01 -0800 |
commit | 49f7a2fde98e375562c2f8f2e3e3effba70d0402 (patch) | |
tree | 231bec7df797870efd57d7458961c75b1c29f7f9 /blob.h | |
parent | name-rev: use 'name->tip_name' instead of 'tip_name' (diff) | |
download | tgif-49f7a2fde98e375562c2f8f2e3e3effba70d0402.tar.xz |
name-rev: eliminate recursion in name_rev()
The name_rev() function calls itself recursively for each interesting
parent of the commit it got as parameter, and, consequently, it can
segfault when processing a deep history if it exhausts the available
stack space. E.g. running 'git name-rev --all' and 'git name-rev
HEAD~100000' in the gcc, gecko-dev, llvm, and WebKit repositories
results in segfaults on my machine ('ulimit -s' reports 8192kB of
stack size limit), and nowadays the former segfaults in the Linux repo
as well (it reached the necessasry depth sometime between v5.3-rc4 and
-rc5).
Eliminate the recursion by inserting the interesting parents into a
LIFO 'prio_queue' [1] and iterating until the queue becomes empty.
Note that the parent commits must be added in reverse order to the
LIFO 'prio_queue', so their relative order is preserved during
processing, i.e. the first parent should come out first from the
queue, because otherwise performance greatly suffers on mergy
histories [2].
The stacksize-limited test 'name-rev works in a deep repo' in
't6120-describe.sh' demonstrated this issue and expected failure. Now
the recursion is gone, so flip it to expect success. Also gone are
the dmesg entries logging the segfault of that segfaulting 'git
name-rev' process on every execution of the test suite.
Note that this slightly changes the order of lines in the output of
'git name-rev --all', usually swapping two lines every 35 lines in
git.git or every 150 lines in linux.git. This shouldn't matter in
practice, because the output has always been unordered anyway.
This patch is best viewed with '--ignore-all-space'.
[1] Early versions of this patch used a 'commit_list', resulting in
~15% performance penalty for 'git name-rev --all' in 'linux.git',
presumably because of the memory allocation and release for each
insertion and removal. Using a LIFO 'prio_queue' has basically no
effect on performance.
[2] We prefer shorter names, i.e. 'v0.1~234' is preferred over
'v0.1^2~5', meaning that usually following the first parent of a
merge results in the best name for its ancestors. So when later
we follow the remaining parent(s) of a merge, and reach an already
named commit, then we usually find that we can't give that commit
a better name, and thus we don't have to visit any of its
ancestors again.
OTOH, if we were to follow the Nth parent of the merge first, then
the name of all its ancestors would include a corresponding '^N'.
Those are not the best names for those commits, so when later we
reach an already named commit following the first parent of that
merge, then we would have to update the name of that commit and
the names of all of its ancestors as well. Consequently, we would
have to visit many commits several times, resulting in a
significant slowdown.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'blob.h')
0 files changed, 0 insertions, 0 deletions