tgif.git - Terin's Improved Git Fork

author	Jeff King <peff@peff.net>	2017-02-08 15:53:10 -0500
committer	Junio C Hamano <gitster@pobox.com>	2017-02-08 15:39:55 -0800
commit	ab6eea6f7b9a5289d72c05476da19ab2bb457fd3 (patch)
tree	70640ab6aea899442df3e67d3d5cf32e24ca7758 /t
parent	add oidset API (diff)
download	tgif-ab6eea6f7b9a5289d72c05476da19ab2bb457fd3.tar.xz

receive-pack: use oidset to de-duplicate .have lines

If you have an alternate object store with a very large number of refs, the peak memory usage of the sha1_array can grow high, even if most of them are duplicates that end up not being printed at all.

The similar for_each_alternate_ref() code-paths in fetch-pack solve this by using flags in "struct object" to de-duplicate (and so are relying on obj_hash at the core).

But we don't have a "struct object" at all in this case. We could call lookup_unknown_object() to get one, but if our goal is reducing memory footprint, it's not great:

  - an unknown object is as large as the largest object type (a commit), which is bigger than an oidset entry

  - we can free the memory after our ref advertisement, but "struct object" entries persist forever (and the receive-pack may hang around for a long time, as the bottleneck is often client upload bandwidth).

So let's use an oidset. Note that unlike a sha1-array it doesn't sort the output as a side effect. However, our output is at least stable, because for_each_alternate_ref() will give us the sha1s in ref-sorted order.

In one particularly pathological case with an alternate that has 60,000 unique refs out of 80 million total, this reduced the peak heap usage of "git receive-pack . </dev/null" from 13GB to 14MB.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
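
The de-duplication idea described above can be illustrated with a small standalone sketch. This is not the code from this commit and does not use git's oidset API: `struct idset`, `idset_insert()`, and `advertise_haves()` are hypothetical names, the set is a plain open-addressing hash table, and the output omits pkt-line framing. It only shows the pattern of skipping an id when the insert reports it was already present, then freeing the set once the advertisement is done.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ID_LEN 20 /* raw SHA-1 length in bytes */

/* Hypothetical stand-in for an oidset: a hash set of fixed-size ids. */
struct idset {
	unsigned char *slots; /* cap * ID_LEN bytes */
	unsigned char *used;  /* one flag per slot */
	size_t cap;           /* always a power of two */
	size_t nr;            /* number of stored ids */
};

static void idset_init(struct idset *s, size_t cap)
{
	assert(cap >= 2 && (cap & (cap - 1)) == 0);
	s->slots = calloc(cap, ID_LEN);
	s->used = calloc(cap, 1);
	if (!s->slots || !s->used)
		abort();
	s->cap = cap;
	s->nr = 0;
}

static void idset_clear(struct idset *s)
{
	free(s->slots);
	free(s->used);
	memset(s, 0, sizeof(*s));
}

static size_t id_hash(const unsigned char *id)
{
	size_t h;
	memcpy(&h, id, sizeof(h)); /* hash bytes are already well mixed */
	return h;
}

static int idset_insert(struct idset *s, const unsigned char *id);

static void idset_grow(struct idset *s)
{
	struct idset bigger;
	size_t i;

	idset_init(&bigger, s->cap * 2);
	for (i = 0; i < s->cap; i++)
		if (s->used[i])
			idset_insert(&bigger, s->slots + i * ID_LEN);
	idset_clear(s);
	*s = bigger;
}

/* Returns 1 if id was already in the set, 0 if it was newly added. */
static int idset_insert(struct idset *s, const unsigned char *id)
{
	size_t i;

	if ((s->nr + 1) * 2 > s->cap) /* keep the table at most half full */
		idset_grow(s);
	i = id_hash(id) & (s->cap - 1);
	while (s->used[i]) {
		if (!memcmp(s->slots + i * ID_LEN, id, ID_LEN))
			return 1;
		i = (i + 1) & (s->cap - 1); /* linear probing */
	}
	memcpy(s->slots + i * ID_LEN, id, ID_LEN);
	s->used[i] = 1;
	s->nr++;
	return 0;
}

/*
 * Print one ".have" line per unique id, in input order.
 * "ids" holds n ids of ID_LEN bytes each, back to back.
 * Returns the number of lines printed.
 */
size_t advertise_haves(const unsigned char *ids, size_t n, FILE *out)
{
	struct idset seen;
	size_t i, j, shown = 0;

	idset_init(&seen, 16);
	for (i = 0; i < n; i++) {
		const unsigned char *id = ids + i * ID_LEN;

		if (idset_insert(&seen, id))
			continue; /* duplicate: already advertised */
		for (j = 0; j < ID_LEN; j++)
			fprintf(out, "%02x", id[j]);
		fputs(" .have\n", out);
		shown++;
	}
	idset_clear(&seen); /* released as soon as the advertisement ends */
	return shown;
}
```

In this sketch the memory held during the advertisement scales with the number of unique ids rather than the total number of refs, and because duplicates are dropped as they arrive, the output order simply follows the input order.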

Diffstat (limited to 't')

0 files changed, 0 insertions, 0 deletions

