summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-02-24commit-graph.c: display correct number of chunks when writingLibravatar Taylor Blau1-4/+3
When writing a commit-graph, a progress meter is shown which indicates the number of pieces of data to write (one per commit in each chunk). In 47410aa837 (commit-graph: use chunk-format write API, 2021-02-18), the number of chunks became tracked by the new chunk-format API. But a stray local variable was left behind from when write_commit_graph_file() used to keep track of the same. Since this was no longer updated after 47410aa837, the progress meter appeared broken: $ git commit-graph write --reachable Expanding reachable commits in commit graph: 837569, done. Writing out commit graph in 3 passes: 166% (4187845/2512707), done. Drop the local variable and rely instead on the chunk-format API to tell us the correct number of chunks. Reported-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18chunk-format: add technical docsLibravatar Derrick Stolee3-0/+122
The chunk-based file format is now an API in the code, but we should also take time to document it as a file format. Specifically, it matches the CHUNK LOOKUP sections of the commit-graph and multi-pack-index files, but there are some commonalities that should be grouped in this document. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18chunk-format: restore duplicate chunk checksLibravatar Derrick Stolee1-0/+9
Before refactoring into the chunk-format API, the commit-graph parsing logic included checks for duplicate chunks. It is unlikely that we would desire a chunk-based file format that allows duplicate chunk IDs in the table of contents, so add duplicate checks into read_table_of_contents(). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18midx: use 64-bit multiplication for chunk sizesLibravatar Derrick Stolee1-5/+6
When calculating the sizes of certain chunks, we should use 64-bit multiplication always. This allows us to properly predict the chunk sizes without risk of overflow. Other possible overflows were discovered by evaluating each multiplication in midx.c and ensuring that at least one side of the operator was of type size_t or off_t. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18midx: use chunk-format read APILibravatar Derrick Stolee2-50/+29
Instead of parsing the table of contents directly, use the chunk-format API methods read_table_of_contents() and pair_chunk(). In particular, we can use the return value of pair_chunk() to generate an error when a required chunk is missing. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18commit-graph: use chunk-format read APILibravatar Derrick Stolee2-106/+55
Instead of parsing the table of contents directly, use the chunk-format API methods read_table_of_contents() and pair_chunk(). While the current implementation loses the duplicate-chunk detection, that will be added in a future change. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18chunk-format: create read chunk APILibravatar Derrick Stolee2-0/+127
Add the capability to read the table of contents, then pair the chunks with necessary logic using read_chunk_fn pointers. Callers will be added in future changes, but the typical outline will be: 1. initialize a 'struct chunkfile' with init_chunkfile(NULL). 2. call read_table_of_contents(). 3. for each chunk to parse, a. call pair_chunk() to assign a pointer with the chunk position, or b. call read_chunk() to run a callback on the chunk start and size. 4. call free_chunkfile() to clear the 'struct chunkfile' data. We are re-using the anonymous 'struct chunkfile' data, as it is internal to the chunk-format API. This gives it essentially two modes: write and read. If the same struct instance was used for both reads and writes, then there would be failures. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18midx: use chunk-format API in write_midx_internal()Libravatar Derrick Stolee1-86/+20
The chunk-format API allows writing the table of contents and all chunks using the anonymous 'struct chunkfile' type. We only need to convert our local chunk logic to this API for the multi-pack-index writes to share that logic with the commit-graph file writes. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18midx: drop chunk progress during writeLibravatar Derrick Stolee1-7/+0
Most expensive operations in write_midx_internal() use the context struct's progress member, and these indicate the process of the expensive operations within the chunk writing methods. However, there is a competing progress struct that counts the progress over all chunks. This is not very helpful compared to the others, so drop it. This also reduces our barriers to combining the chunk writing code with chunk-format.c. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18midx: return success/failure in chunk write methodsLibravatar Derrick Stolee1-36/+27
Historically, the chunk-writing methods in midx.c have returned the amount of data written so the writer method could compare this with the table of contents. This presents with some interesting issues: 1. If a chunk writing method has a bug that miscalculates the written bytes, then we can satisfy the table of contents without actually writing the right amount of data to the hashfile. The commit-graph writing code checks the hashfile struct directly for a more robust verification. 2. There is no way for a chunk writing method to gracefully fail. Returning an int presents an opportunity to fail without a die(). 3. The current pattern doesn't match chunk_write_fn type exactly, so we cannot share code with commit-graph.c For these reasons, convert the midx chunk writer methods to return an 'int'. Since none of them fail at the moment, they all return 0. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18midx: add num_large_offsets to write_midx_contextLibravatar Derrick Stolee1-7/+10
In an effort to align write_midx_internal() with the chunk-format API, continue to group necessary data into "struct write_midx_context". This change collects the "uint32_t num_large_offsets" into the context. With this new data, write_midx_large_offsets() now matches the chunk_write_fn type. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18midx: add pack_perm to write_midx_contextLibravatar Derrick Stolee1-19/+21
In an effort to align write_midx_internal() with the chunk-format API, continue to group necessary data into "struct write_midx_context". This change collects the "uint32_t *pack_perm" and large_offsets_needed bit into the context. Update write_midx_object_offsets() to match chunk_write_fn. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18midx: add entries to write_midx_contextLibravatar Derrick Stolee1-23/+26
In an effort to align write_midx_internal() with the chunk-format API, continue to group necessary data into "struct write_midx_context". This change collects the "struct pack_midx_entry *entries" list and its count into the context. Update write_midx_oid_fanout() and write_midx_oid_lookup() to take the context directly, as these are easy conversions with this new data. Only the callers of write_midx_object_offsets() and write_midx_large_offsets() are updated here, since additional data in the context before those methods can match chunk_write_fn. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18midx: use context in write_midx_pack_names()Libravatar Derrick Stolee1-11/+10
In an effort to align the write_midx_internal() to use the chunk-format API, start converting chunk writing methods to match chunk_write_fn. The first case is to convert write_midx_pack_names() to take "void *data". We already have the necessary data in "struct write_midx_context", so this conversion is rather mechanical. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18midx: rename pack_info to write_midx_contextLibravatar Derrick Stolee1-65/+65
In an effort to streamline our chunk-based file formats, align some of the code structure in write_midx_internal() to be similar to the patterns in write_commit_graph_file(). Specifically, let's create a "struct write_midx_context" that can be used as a data parameter to abstract function types. This change only renames "struct pack_info" to "struct write_midx_context" and the names of instances from "packs" to "ctx". In future changes, we will expand the data inside "struct write_midx_context" and align our chunk-writing method with the chunk-format API. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18commit-graph: use chunk-format write APILibravatar Derrick Stolee1-82/+37
The commit-graph write logic is ready to make use of the chunk-format write API. Each chunk write method is already in the correct prototype. We only need to use the 'struct chunkfile' pointer and the correct API calls. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18chunk-format: create chunk format write APILibravatar Derrick Stolee3-0/+112
In anticipation of combining the logic from the commit-graph and multi-pack-index file formats, create a new chunk-format API. Use a 'struct chunkfile' pointer to keep track of data that has been registered for writes. This struct is anonymous outside of chunk-format.c to ensure no user attempts to interfere with the data. The next change will use this API in commit-graph.c, but the general approach is: 1. initialize the chunkfile with init_chunkfile(f). 2. add chunks in the intended writing order with add_chunk(). 3. write any header information to the hashfile f. 4. write the chunkfile data using write_chunkfile(). 5. free the chunkfile struct using free_chunkfile(). Helped-by: Taylor Blau <me@ttaylorr.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05commit-graph: anonymize data in chunk_write_fnLibravatar Derrick Stolee1-10/+19
In preparation for creating an API around file formats using chunks and tables of contents, prepare the commit-graph write code to use prototypes that will match this new API. Specifically, convert chunk_write_fn to take a "void *data" parameter instead of the commit-graph-specific "struct write_commit_graph_context" pointer. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18doc: add corrected commit date infoLibravatar Abhishek Kumar2-19/+86
With generation data chunk and corrected commit dates implemented, let's update the technical documentation for commit-graph. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18commit-reach: use corrected commit dates in paint_down_to_common()Libravatar Abhishek Kumar4-2/+25
091f4cf (commit: don't use generation numbers if not needed, 2018-08-30) changed paint_down_to_common() to use commit dates instead of generation numbers v1 (topological levels) as the performance regressed on certain topologies. With generation number v2 (corrected commit dates) implemented, we no longer have to rely on commit dates and can use generation numbers. For example, the command `git merge-base v4.8 v4.9` on the Linux repository walks 167468 commits, taking 0.135s for committer date and 167496 commits, taking 0.157s for corrected committer date respectively. While using corrected commit dates, Git walks nearly the same number of commits as commit date, the process is slower as for each comparision we have to access a commit-slab (for corrected committer date) instead of accessing struct member (for committer date). This change incidentally broke the fragile t6404-recursive-merge test. t6404-recursive-merge sets up a unique repository where all commits have the same committer date without a well-defined merge-base. While running tests with GIT_TEST_COMMIT_GRAPH unset, we use committer date as a heuristic in paint_down_to_common(). 6404.1 'combined merge conflicts' merges commits in the order: - Merge C with B to form an intermediate commit. - Merge the intermediate commit with A. With GIT_TEST_COMMIT_GRAPH=1, we write a commit-graph and subsequently use the corrected committer date, which changes the order in which commits are merged: - Merge A with B to form an intermediate commit. - Merge the intermediate commit with C. While resulting repositories are equivalent, 6404.4 'virtual trees were processed' fails with GIT_TEST_COMMIT_GRAPH=1 as we are selecting different merge-bases and thus have different object ids for the intermediate commits. As this has already causes problems (as noted in 859fdc0 (commit-graph: define GIT_TEST_COMMIT_GRAPH, 2018-08-29)), we disable commit graph within t6404-recursive-merge. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18commit-graph: use generation v2 only if entire chain doesLibravatar Abhishek Kumar3-2/+210
Since there are released versions of Git that understand generation numbers in the commit-graph's CDAT chunk but do not understand the GDAT chunk, the following scenario is possible: 1. "New" Git writes a commit-graph with the GDAT chunk. 2. "Old" Git writes a split commit-graph on top without a GDAT chunk. If each layer of split commit-graph is treated independently, as it was the case before this commit, with Git inspecting only the current layer for chunk_generation_data pointer, commits in the lower layer (one with GDAT) whould have corrected commit date as their generation number, while commits in the upper layer would have topological levels as their generation. Corrected commit dates usually have much larger values than topological levels. This means that if we take two commits, one from the upper layer, and one reachable from it in the lower layer, then the expectation that the generation of a parent is smaller than the generation of a child would be violated. It is difficult to expose this issue in a test. Since we _start_ with artificially low generation numbers, any commit walk that prioritizes generation numbers will walk all of the commits with high generation number before walking the commits with low generation number. In all the cases I tried, the commit-graph layers themselves "protect" any incorrect behavior since none of the commits in the lower layer can reach the commits in the upper layer. This issue would manifest itself as a performance problem in this case, especially with something like "git log --graph" since the low generation numbers would cause the in-degree queue to walk all of the commits in the lower layer before allowing the topo-order queue to write anything to output (depending on the size of the upper layer). Therefore, When writing the new layer in split commit-graph, we write a GDAT chunk only if the topmost layer has a GDAT chunk. This guarantees that if a layer has GDAT chunk, all lower layers must have a GDAT chunk as well. Rewriting layers follows similar approach: if the topmost layer below the set of layers being rewritten (in the split commit-graph chain) exists, and it does not contain GDAT chunk, then the result of rewrite does not have GDAT chunks either. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18commit-graph: implement generation data chunkLibravatar Abhishek Kumar10-32/+200
As discovered by Ævar, we cannot increment graph version to distinguish between generation numbers v1 and v2 [1]. Thus, one of pre-requistes before implementing generation number v2 was to distinguish between graph versions in a backwards compatible manner. We are going to introduce a new chunk called Generation DATa chunk (or GDAT). GDAT will store corrected committer date offsets whereas CDAT will still store topological level. Old Git does not understand GDAT chunk and would ignore it, reading topological levels from CDAT. New Git can parse GDAT and take advantage of newer generation numbers, falling back to topological levels when GDAT chunk is missing (as it would happen with a commit-graph written by old Git). We introduce a test environment variable 'GIT_TEST_COMMIT_GRAPH_NO_GDAT' which forces commit-graph file to be written without generation data chunk to emulate a commit-graph file written by old Git. To minimize the space required to store corrrected commit date, Git stores corrected commit date offsets into the commit-graph file, instea of corrected commit dates. This saves us 4 bytes per commit, decreasing the GDAT chunk size by half, but it's possible for the offset to overflow the 4-bytes allocated for storage. As such overflows are and should be exceedingly rare, we use the following overflow management scheme: We introduce a new commit-graph chunk, Generation Data OVerflow ('GDOV') to store corrected commit dates for commits with offsets greater than GENERATION_NUMBER_V2_OFFSET_MAX. If the offset is greater than GENERATION_NUMBER_V2_OFFSET_MAX, we set the MSB of the offset and the other bits store the position of corrected commit date in GDOV chunk, similar to how Extra Edge List is maintained. We test the overflow-related code with the following repo history: F - N - U / \ U - N - U N \ / N - F - N Where the commits denoted by U have committer date of zero seconds since Unix epoch, the commits denoted by N have committer date of 1112354055 (default committer date for the test suite) seconds since Unix epoch and the commits denoted by F have committer date of (2 ^ 31 - 2) seconds since Unix epoch. The largest offset observed is 2 ^ 31, just large enough to overflow. [1]: https://lore.kernel.org/git/87a7gdspo4.fsf@evledraar.gmail.com/ Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18commit-graph: implement corrected commit dateLibravatar Abhishek Kumar1-4/+17
With most of preparations done, let's implement corrected commit date. The corrected commit date for a commit is defined as: * A commit with no parents (a root commit) has corrected commit date equal to its committer date. * A commit with at least one parent has corrected commit date equal to the maximum of its commit date and one more than the largest corrected commit date among its parents. As a special case, a root commit with timestamp of zero (01.01.1970 00:00:00Z) has corrected commit date of one, to be able to distinguish from GENERATION_NUMBER_ZERO (that is, an uncomputed corrected commit date). To minimize the space required to store corrected commit date, Git stores corrected commit date offsets into the commit-graph file. The corrected commit date offset for a commit is defined as the difference between its corrected commit date and actual commit date. Storing corrected commit date requires sizeof(timestamp_t) bytes, which in most cases is 64 bits (uintmax_t). However, corrected commit date offsets can be safely stored using only 32-bits. This halves the size of GDAT chunk, which is a reduction of around 6% in the size of commit-graph file. However, using offsets be problematic if a commit is malformed but valid and has committer date of 0 Unix time, as the offset would be the same as corrected commit date and thus require 64-bits to be stored properly. While Git does not write out offsets at this stage, Git stores the corrected commit dates in member generation of struct commit_graph_data. It will begin writing commit date offsets with the introduction of generation data chunk. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18commit-graph: return 64-bit generation numberLibravatar Abhishek Kumar8-42/+42
In a preparatory step for introducing corrected commit dates, let's return timestamp_t values from commit_graph_generation(), use timestamp_t for local variables and define GENERATION_NUMBER_INFINITY as (2 ^ 63 - 1) instead. We rename GENERATION_NUMBER_MAX to GENERATION_NUMBER_V1_MAX to represent the largest topological level we can store in the commit data chunk. With corrected commit dates implemented, we will have two such *_MAX variables to denote the largest offset and largest topological level that can be stored. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18commit-graph: add a slab to store topological levelsLibravatar Abhishek Kumar2-15/+31
In a later commit we will introduce corrected commit date as the generation number v2. Corrected commit dates will be stored in the new seperate Generation Data chunk. However, to ensure backwards compatibility with "Old" Git we need to continue to write generation number v1 (topological levels) to the commit data chunk. Thus, we need to compute and store both versions of generation numbers to write the commit-graph file. Therefore, let's introduce a commit-slab `topo_level_slab` to store topological levels; corrected commit date will be stored in the member `generation` of struct commit_graph_data. The macros `GENERATION_NUMBER_INFINITY` and `GENERATION_NUMBER_ZERO` mark commits not in the commit-graph file and commits written by a version of Git that did not compute generation numbers respectively. Generation numbers are computed identically for both kinds of commits. A "slab-miss" should return `GENERATION_NUMBER_INFINITY` as the commit is not in the commit-graph file. However, since the slab is zero-initialized, it returns 0 (or rather `GENERATION_NUMBER_ZERO`). Thus, we no longer need to check if the topological level of a commit is `GENERATION_NUMBER_INFINITY`. We will add a pointer to the slab in `struct write_commit_graph_context` and `struct commit_graph` to populate the slab in `fill_commit_graph_info` if the commit has a pre-computed topological level as in case of split commit-graphs. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18t6600-test-reach: generalize *_three_modesLibravatar Abhishek Kumar1-31/+31
In a preparatory step to implement generation number v2, we add tests to ensure Git can read and parse commit-graph files without Generation Data chunk. These files represent commit-graph files written by Old Git and are neccesary for backward compatability. We extend run_three_modes() and test_three_modes() to *_all_modes() with the fourth mode being "commit-graph without generation data chunk". Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18commit-graph: consolidate fill_commit_graph_infoLibravatar Abhishek Kumar2-20/+31
Both fill_commit_graph_info() and fill_commit_in_graph() parse information present in commit data chunk. Let's simplify the implementation by calling fill_commit_graph_info() within fill_commit_in_graph(). fill_commit_graph_info() used to not load committer data from commit data chunk. However, with the upcoming switch to using corrected committer date as generation number v2, we will have to load committer date to compute generation number value anyway. e51217e15 (t5000: test tar files that overflow ustar headers, 30-06-2016) introduced a test 'generate tar with future mtime' that creates a commit with committer date of (2^36 + 1) seconds since EPOCH. The CDAT chunk provides 34-bits for storing committer date, thus committer time overflows into generation number (within CDAT chunk) and has undefined behavior. The test used to pass as fill_commit_graph_info() would not set struct member `date` of struct commit and load committer date from the object database, generating a tar file with the expected mtime. However, with corrected commit date, we will load the committer date from CDAT chunk (truncated to lower 34-bits to populate the generation number. Thus, Git sets date and generates tar file with the truncated mtime. The ustar format (the header format used by most modern tar programs) only has room for 11 (or 12, depending on some implementations) octal digits for the size and mtime of each file. As the CDAT chunk is overflow by 12-octal digits but not 11-octal digits, we split the existing tests to test both implementations separately and add a new explicit test for 11-digit implementation. To test the 11-octal digit implementation, we create a future commit with committer date of 2^34 - 1, which overflows 11-octal digits without overflowing 34-bits of the Commit Date chunks. To test the 12-octal digit implementation, the smallest committer date possible is 2^36 + 1, which overflows the CDAT chunk and thus commit-graph must be disabled for the test. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18revision: parse parent in indegree_walk_step()Libravatar Abhishek Kumar1-0/+3
In indegree_walk_step(), we add unvisited parents to the indegree queue. However, parents are not guaranteed to be parsed. As the indegree queue sorts by generation number, let's parse parents before inserting them to ensure the correct priority order. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18commit-graph: fix regression when computing Bloom filtersLibravatar Abhishek Kumar1-2/+6
Before computing Bloom filters, the commit-graph machinery uses commit_gen_cmp to sort commits by generation order for improved diff performance. 3d11275505 (commit-graph: examine commits by generation number, 2020-03-30) claims that this sort can reduce the time spent to compute Bloom filters by nearly half. But since c49c82aa4c (commit: move members graph_pos, generation to a slab, 2020-06-17), this optimization is broken, since asking for a 'commit_graph_generation()' directly returns GENERATION_NUMBER_INFINITY while writing. Not all hope is lost, though: 'commit_gen_cmp()' falls back to comparing commits by their date when they have equal generation number, and so since c49c82aa4c is purely a date comparison function. This heuristic is good enough that we don't seem to loose appreciable performance while computing Bloom filters. Applying this patch (compared with v2.30.0) speeds up computing Bloom filters by factors ranging from 0.40% to 5.19% on various repositories [1]. So, avoid the useless 'commit_graph_generation()' while writing by instead accessing the slab directly. This returns the newly-computed generation numbers, and allows us to avoid the heuristic by directly comparing generation numbers. [1]: https://lore.kernel.org/git/20210105094535.GN8396@szeder.dev/ Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-27Git 2.30Libravatar Junio C Hamano2-12/+9
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-27Merge branch 'pb/doc-git-linkit-fix'Libravatar Junio C Hamano1-2/+2
Docfix. * pb/doc-git-linkit-fix: git.txt: fix typos in 'linkgit' macro invocation
2020-12-27Merge tag 'l10n-2.30.0-rnd2' of https://github.com/git-l10n/git-poLibravatar Junio C Hamano12-26547/+53354
l10n for Git 2.30.0 round 2 * tag 'l10n-2.30.0-rnd2' of https://github.com/git-l10n/git-po: l10n: zh_CN: for git v2.30.0 l10n round 1 and 2 l10n: zh_TW.po: v2.30.0 round 2 (1 untranslated) l10n: pl.po: add translation and set team leader l10n: pl.po: started Polish translation l10n: de.po: Update German translation for Git 2.30.0 l10n: Update Catalan translation l10n: bg.po: Updated Bulgarian translation (5037t) l10n: fr.po v2.30.0 rnd 2 l10n: tr: v2.30.0-r2 l10n: sv.po: Update Swedish translation (5037t0f0u) l10n: vi.po(5037t): v2.30.0 rnd 2 l10n: git.pot: v2.30.0 round 2 (1 new, 2 removed) l10n: Update Catalan translation l10n: fr.po: v2.30.0 rnd 1 l10n: fr.po Fix a typo l10n: fr fix misleading message l10n: tr: v2.30.0-r1 l10n: sv.po: Update Swedish translation (5038t0f0u) l10n: git.pot: v2.30.0 round 1 (70 new, 45 removed)
2020-12-27l10n: zh_CN: for git v2.30.0 l10n round 1 and 2Libravatar Jiang Xin1-2475/+2588
Translate 71 new messages (5037t0f0u) for git 2.30.0. Reviewed-by: 依云 <lilydjwg@gmail.com> Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2020-12-25Merge branch 'l10n/zh_TW/201223' of github.com:l10n-tw/git-poLibravatar Jiang Xin1-2744/+2602
* 'l10n/zh_TW/201223' of github.com:l10n-tw/git-po: l10n: zh_TW.po: v2.30.0 round 2 (1 untranslated)
2020-12-25l10n: zh_TW.po: v2.30.0 round 2 (1 untranslated)Libravatar pan934121-2744/+2602
Signed-off-by: pan93412 <pan93412@gmail.com>
2020-12-23l10n: pl.po: add translation and set team leaderLibravatar Arusekk2-8992/+22386
Signed-off-by: Arusekk <arek_koz@o2.pl>
2020-12-23Git 2.30-rc2Libravatar Junio C Hamano2-1/+6
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-23Merge branch 'nk/refspecs-negative-fix'Libravatar Junio C Hamano2-3/+64
Hotfix for recent regression. * nk/refspecs-negative-fix: negative-refspec: improve comment on query_matches_negative_refspec negative-refspec: fix segfault on : refspec
2020-12-23Merge branch 'ma/maintenance-crontab-fix'Libravatar Junio C Hamano3-5/+20
Hotfix for a topic of this cycle. * ma/maintenance-crontab-fix: t7900-maintenance: test for magic markers gc: fix handling of crontab magic markers git-maintenance.txt: add missing word
2020-12-23Merge branch 'dl/checkout-p-merge-base'Libravatar Junio C Hamano2-2/+9
Fix to a regression introduced during this cycle. * dl/checkout-p-merge-base: checkout -p: handle tree arguments correctly again
2020-12-23Merge branch 'js/no-more-prepare-for-main-in-test'Libravatar Junio C Hamano11-619/+621
Test coverage fix. * js/no-more-prepare-for-main-in-test: tests: drop the `PREPARE_FOR_MAIN_BRANCH` prereq t9902: use `main` as initial branch name t6302: use `main` as initial branch name t5703: use `main` as initial branch name t5510: use `main` as initial branch name t5505: finalize transitioning to using the branch name `main` t3205: finalize transitioning to using the branch name `main` t3203: complete the transition to using the branch name `main` t3201: finalize transitioning to using the branch name `main` t3200: finish transitioning to the initial branch name `main` t1400: use `main` as initial branch name
2020-12-23Merge branch 'jx/pack-redundant-on-single-pack'Libravatar Junio C Hamano2-4/+39
"git pack-redandant" when there is only one packfile used to crash, which has been corrected. * jx/pack-redundant-on-single-pack: pack-redundant: fix crash when one packfile in repo
2020-12-23l10n: pl.po: started Polish translationLibravatar m4sk1n1-0/+12579
Signed-off-by: Arusekk <arek_koz@o2.pl>
2020-12-23l10n: de.po: Update German translation for Git 2.30.0Libravatar Matthias Rüster1-2600/+2729
Reviewed-by: Ralf Thielow <ralf.thielow@gmail.com> Reviewed-by: Phillip Szelat <phillip.szelat@gmail.com> Signed-off-by: Matthias Rüster <matthias.ruester@gmail.com>
2020-12-23Merge branch 'master' of github.com:Softcatala/git-poLibravatar Jiang Xin1-1082/+755
* 'master' of github.com:Softcatala/git-po: l10n: Update Catalan translation
2020-12-22git.txt: fix typos in 'linkgit' macro invocationLibravatar Philippe Blain1-2/+2
The 'linkgit' Asciidoc macro is misspelled as 'linkit' in the description of 'GIT_SEQUENCE_EDITOR' since the addition of that variable to git(1) in 902a126eca (doc: mention GIT_SEQUENCE_EDITOR and 'sequence.editor' more, 2020-08-31). Also, it uses two colons instead of one. Fix that. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-22l10n: Update Catalan translationLibravatar Jordi Mas1-1082/+755
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2020-12-22l10n: bg.po: Updated Bulgarian translation (5037t)Libravatar Alexander Shopov1-2480/+2600
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2020-12-21negative-refspec: improve comment on query_matches_negative_refspecLibravatar Nipunn Koorapati1-0/+6
Comment did not adequately explain how the two loops work together to achieve the goal of querying for matching of any negative refspec. Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21negative-refspec: fix segfault on : refspecLibravatar Nipunn Koorapati2-3/+58
The logic added to check for negative pathspec match by c0192df630 (refspec: add support for negative refspecs, 2020-09-30) looks at refspec->src assuming it is never NULL, however when remote.origin.push is set to ":", then refspec->src is NULL, causing a segfault within strcmp. Tell git to handle matching refspec by adding the needle to the set of positively matched refspecs, since matching ":" refspecs match anything as src. Add test for matching refspec pushes fetch-negative-refspec both individually and in combination with a negative refspec. Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>