commit-graph: reuse existing Bloom filters during write

Add logic to
a) parse Bloom filter information from the commit graph file and,
b) re-use existing Bloom filters.

See Documentation/technical/commit-graph-format for the format in which
the Bloom filter information is written to the commit graph file.

To read Bloom filter for a given commit with lexicographic position
'i' we need to:
1. Read BIDX[i] which essentially gives us the starting index in BDAT for
   filter of commit i+1. It is essentially the index past the end
   of the filter of commit i. It is called end_index in the code.

2. For i>0, read BIDX[i-1] which will give us the starting index in BDAT for
   filter of commit i. It is called the start_index in the code.
   For the first commit, where i = 0, Bloom filter data starts at the
   beginning, just past the header in the BDAT chunk. Hence, start_index
   will be 0.

3. The length of the filter will be end_index - start_index, because
   BIDX[i] gives the cumulative 8-byte words including the ith
   commit's filter.

We toggle whether Bloom filters should be recomputed based on the
compute_if_not_present flag.

Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
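
The three lookup steps described in the message can be sketched as below. This is only an illustration, not the code added by this commit: `struct filter_span`, `read_be32()` and `locate_filter()` are hypothetical names, and the sketch assumes the BIDX payload is a plain array of 32-bit big-endian cumulative offsets (consistent with the `hashwrite_be32(f, cur_pos)` call in the diff). Offsets are in whatever unit the writer uses for `filter->len`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of where one commit's filter sits in BDAT. */
struct filter_span {
	uint32_t start_index; /* offset of the first unit of filter i */
	uint32_t end_index;   /* offset one past the last unit of filter i */
};

/* Decode a 32-bit big-endian value from a byte buffer. */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Locate the filter of the commit at lexicographic position i.
 * bidx points at the first BIDX entry; each entry is 4 bytes wide.
 * Offsets are relative to the start of the filter data, that is,
 * just past the BDAT header.
 */
static struct filter_span locate_filter(const unsigned char *bidx,
					size_t nr_commits, size_t i)
{
	struct filter_span span;

	assert(i < nr_commits);

	/* Step 1: BIDX[i] is the index past the end of filter i. */
	span.end_index = read_be32(bidx + 4 * i);

	/* Step 2: BIDX[i-1] is where filter i starts; 0 for the first commit. */
	span.start_index = i ? read_be32(bidx + 4 * (i - 1)) : 0;

	/* Step 3: the caller derives the length as end_index - start_index. */
	return span;
}
```

With BIDX entries 2, 2 and 7, for example, the first filter spans offsets 0 to 2, the second is empty, and the third spans offsets 2 to 7.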
author: Garima Singh <garima.singh@microsoft.com> 2020-04-06 16:59:50 +0000
committer: Junio C Hamano <gitster@pobox.com> 2020-04-06 11:08:37 -0700
commit: 1217c03e7b87b15f2c78af5b1e1915a675050454 (patch)
tree: e19c660138425048b891a39028c6b6ce567c62d2 /commit-graph.c
parent: commit-graph: write Bloom filters to commit graph file (diff)
download: tgif-1217c03e7b87b15f2c78af5b1e1915a675050454.tar.xz
1 files changed, 3 insertions, 3 deletions
diff --git a/commit-graph.c b/commit-graph.c
index a8b6b5cca5..77668629e2 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1086,7 +1086,7 @@ static void write_graph_chunk_bloom_indexes(struct hashfile *f,
 			ctx->commits.nr);
 
 	while (list < last) {
-		struct bloom_filter *filter = get_bloom_filter(ctx->r, *list);
+		struct bloom_filter *filter = get_bloom_filter(ctx->r, *list, 0);
 		cur_pos += filter->len;
 		display_progress(progress, ++i);
 		hashwrite_be32(f, cur_pos);
@@ -1115,7 +1115,7 @@ static void write_graph_chunk_bloom_data(struct hashfile *f,
 	hashwrite_be32(f, settings->bits_per_entry);
 
 	while (list < last) {
-		struct bloom_filter *filter = get_bloom_filter(ctx->r, *list);
+		struct bloom_filter *filter = get_bloom_filter(ctx->r, *list, 0);
 		display_progress(progress, ++i);
 		hashwrite(f, filter->data, filter->len * sizeof(unsigned char));
 		list++;
@@ -1296,7 +1296,7 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
 
 	for (i = 0; i < ctx->commits.nr; i++) {
 		struct commit *c = sorted_commits[i];
-		struct bloom_filter *filter = get_bloom_filter(ctx->r, c);
+		struct bloom_filter *filter = get_bloom_filter(ctx->r, c, 1);
 		ctx->total_bloom_filter_data_size += sizeof(unsigned char) * filter->len;
 		display_progress(progress, i + 1);
 	}