summaryrefslogtreecommitdiff
path: root/merge-recursive.c
AgeCommit message (Collapse)AuthorFilesLines
2021-07-30use fspathhash() everywhereLibravatar René Scharfe1-8/+3
cf2dc1c238 (speed up alt_odb_usable() with many alternates, 2021-07-07) introduced the function fspathhash() for calculating path hashes while respecting the configuration option core.ignorecase. Call it instead of open-coding it; the resulting code is shorter and less repetitive. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-28Merge branch 'en/rename-limits-doc'Libravatar Junio C Hamano1-1/+1
Documentation on "git diff -l<n>" and diff.renameLimit have been updated, and the defaults for these limits have been raised. * en/rename-limits-doc: rename: bump limit defaults yet again diffcore-rename: treat a rename_limit of 0 as unlimited doc: clarify documentation for rename/copy limits diff: correct warning message when renameLimit exceeded
2021-07-28Merge branch 'ab/attribute-format'Libravatar Junio C Hamano1-0/+1
Many "printf"-like helper functions we have have been annotated with __attribute__() to catch placeholder/parameter mismatches. * ab/attribute-format: advice.h: add missing __attribute__((format)) & fix usage *.h: add a few missing __attribute__((format)) *.c static functions: add missing __attribute__((format)) sequencer.c: move static function to avoid forward decl *.c static functions: don't forward-declare __attribute__
2021-07-16Merge branch 'ab/struct-init'Libravatar Junio C Hamano1-2/+2
Code cleanup around struct_type_init() functions. * ab/struct-init: string-list.h users: change to use *_{nodup,dup}() string-list.[ch]: add a string_list_init_{nodup,dup}() dir.[ch]: replace dir_init() with DIR_INIT *.c *_init(): define in terms of corresponding *_INIT macro *.h: move some *_INIT to designated initializers
2021-07-16Merge branch 'en/merge-dir-rename-corner-case-fix'Libravatar Junio C Hamano1-6/+13
The merge code had funny interactions between content based rename detection and directory rename detection. * en/merge-dir-rename-corner-case-fix: merge-recursive: handle rename-to-self case merge-ort: ensure we consult df_conflict and path_conflicts t6423: test directory renames causing rename-to-self
2021-07-15rename: bump limit defaults yet againLibravatar Elijah Newren1-1/+1
These were last bumped in commit 92c57e5c1d29 (bump rename limit defaults (again), 2011-02-19), and were bumped both because processors had gotten faster, and because people were getting ugly merges that caused problems and reporting it to the mailing list (suggesting that folks were willing to spend more time waiting). Since that time: * Linus has continued recommending kernel folks to set diff.renameLimit=0 (maps to 32767, currently) * Folks with repositories with lots of renames were happy to set merge.renameLimit above 32767, once the code supported that, to get correct cherry-picks * Processors have gotten faster * It has been discovered that the timing methodology used last time probably used too large example files. The last point is probably worth explaining a bit more: * The "average" file size used appears to have been average blob size in the linux kernel history at the time (probably v2.6.25 or something close to it). * Since bigger files are modified more frequently, such a computation weights towards larger files. * Larger files may be more likely to be modified over time, but are not more likely to be renamed -- the mean and median blob size within a tree are a bit higher than the mean and median of blob sizes in the history leading up to that version for the linux kernel. * The mean blob size in v2.6.25 was half the average blob size in history leading to that point * The median blob size in v2.6.25 was about 40% of the mean blob size in v2.6.25. * Since the mean blob size is more than double the median blob size, any file as big as the mean will not be compared to any files of median size or less (because they'd be more than 50% dissimilar). * Since it is the number of files compared that provides the O(n^2) behavior, median-sized files should matter more than mean-sized ones. The combined effect of the above is that the file size used in past calculations was likely about 5x too large. Combine that with a CPU performance improvement of ~30%, and we can increase the limits by a factor of sqrt(5/(1-.3)) = 2.67, while keeping the original stated time limits. Keeping the same approximate time limit probably makes sense for diff.renameLimit (there is no progress feedback in e.g. git log -p), but the experience above suggests merge.renameLimit could be extended significantly. In fact, it probably would make sense to have an unlimited default setting for merge.renameLimit, but that would likely need to be coupled with changes to how progress is displayed. (See https://lore.kernel.org/git/YOx+Ok%2FEYvLqRMzJ@coredump.intra.peff.net/ for details in that area.) For now, let's just bump the approximate time limit from 10s to 1m. (Note: We do not want to use actual time limits, because getting results that depend on how loaded your system is that day feels bad, and because we don't discover that we won't get all the renames until after we've put in a lot of work rather than just upfront telling the user there are too many files involved.) Using the original time limit of 2s for diff.renameLimit, and bumping merge.renameLimit from 10s to 60s, I found the following timings using the simple script at the end of this commit message (on an AWS c5.xlarge which reports as "Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz"): N Timing 1300 1.995s 7100 59.973s So let's round down to nice even numbers and bump the limits from 400->1000, and from 1000->7000. Here is the measure_rename_perf script (adapted from https://lore.kernel.org/git/20080211113516.GB6344@coredump.intra.peff.net/ in particular to avoid triggering the linear handling from basename-guided rename detection): #!/bin/bash n=$1; shift rm -rf repo mkdir repo && cd repo git init -q -b main mkdata() { mkdir $1 for i in `seq 1 $2`; do (sed "s/^/$i /" <../sample echo tag: $1 ) >$1/$i done } mkdata initial $n git add . git commit -q -m initial mkdata new $n git add . cd new for i in *; do git mv $i $i.renamed; done cd .. git rm -q -rf initial git commit -q -m new time git diff-tree -M -l0 --summary HEAD^ HEAD Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-13*.c static functions: add missing __attribute__((format))Libravatar Ævar Arnfjörð Bjarmason1-0/+1
Add missing __attribute__((format)) function attributes to various "static" functions that take printf arguments. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-01string-list.h users: change to use *_{nodup,dup}()Libravatar Ævar Arnfjörð Bjarmason1-2/+2
Change all in-tree users of the string_list_init(LIST, BOOL) API to use string_list_init_{nodup,dup}(LIST) instead. As noted in the preceding commit let's leave the now-unused string_list_init() wrapper in-place for any in-flight users, it can be removed at some later date. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-30merge-recursive: handle rename-to-self caseLibravatar Elijah Newren1-6/+13
Directory rename detection can cause transitive renames, e.g. if the two different sides of history each do one half of: A/file -> B/file B/ -> C/ then directory rename detection transitively renames to give us A/file -> C/file However, when C/ == A/, note that this gives us A/file -> A/file. merge-recursive assumed that any rename D -> E would have D != E. While that is almost always true, the above is a special case where it is not. So we cannot do things like delete the rename source, we cannot assume that a file existing at path E implies a rename/add conflict and we have to be careful about what stages end up in the output. This change feels a bit hackish. It took me surprisingly many hours to find, and given merge-recursive's design causing it to attempt to enumerate all combinations of edge and corner cases with special code for each combination, I'm worried there are other similar fixes needed elsewhere if we can just come up with the right special testcase. Perhaps an audit would rule it out, but I have not the energy. merge-recursive deserves to die, and since it is on its way out anyway, fixing this particular bug narrowly will have to be good enough. Reported-by: Anders Kaseorg <andersk@mit.edu> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-14*: fix typos which duplicate a wordLibravatar Andrei Rybak1-1/+1
Fix typos in documentation, code comments, and RelNotes which repeat various words. In trivial cases, just delete the duplicated word and rewrap text, if needed. Reword the affected sentence in Documentation/RelNotes/1.8.4.txt for it to make sense. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11Merge branch 'js/merge-already-up-to-date-message-reword'Libravatar Junio C Hamano1-1/+1
A few variants of informational message "Already up-to-date" has been rephrased. * js/merge-already-up-to-date-message-reword: merge: fix swapped "up to date" message components merge(s): apply consistent punctuation to "up to date" messages
2021-05-10Merge branch 'bc/hash-transition-interop-part-1'Libravatar Junio C Hamano1-5/+5
SHA-256 transition. * bc/hash-transition-interop-part-1: hex: print objects using the hash algorithm member hex: default to the_hash_algo on zero algorithm value builtin/pack-objects: avoid using struct object_id for pack hash commit-graph: don't store file hashes as struct object_id builtin/show-index: set the algorithm for object IDs hash: provide per-algorithm null OIDs hash: set, copy, and use algo field in struct object_id builtin/pack-redundant: avoid casting buffers to struct object_id Use the final_oid_fn to finalize hashing of object IDs hash: add a function to finalize object IDs http-push: set algorithm when reading object ID Always use oidread to read into struct object_id hash: add an algo member to struct object_id
2021-05-03merge(s): apply consistent punctuation to "up to date" messagesLibravatar Eric Sunshine1-1/+1
Although the various "Already up to date" messages resulting from merge attempts share identical phrasing, they use a mix of punctuation ranging from "." to "!" and even "Yeeah!", which leads to extra work for translators. Ease the job of translators by settling upon "." as punctuation for all such messages. While at it, take advantage of printf_ln() to further ease the translation task so translators need not worry about line termination, and fix a case of missing line termination in the (unused) merge_ort_nonrecursive() function. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-30Merge branch 'ds/sparse-index-protections'Libravatar Junio C Hamano1-1/+3
Builds on top of the sparse-index infrastructure to mark operations that are not ready to mark with the sparse index, causing them to fall back on fully-populated index that they always have worked with. * ds/sparse-index-protections: (47 commits) name-hash: use expand_to_path() sparse-index: expand_to_path() name-hash: don't add directories to name_hash revision: ensure full index resolve-undo: ensure full index read-cache: ensure full index pathspec: ensure full index merge-recursive: ensure full index entry: ensure full index dir: ensure full index update-index: ensure full index stash: ensure full index rm: ensure full index merge-index: ensure full index ls-files: ensure full index grep: ensure full index fsck: ensure full index difftool: ensure full index commit: ensure full index checkout: ensure full index ...
2021-04-27hash: provide per-algorithm null OIDsLibravatar brian m. carlson1-5/+5
Up until recently, object IDs did not have an algorithm member, only a hash. Consequently, it was possible to share one null (all-zeros) object ID among all hash algorithms. Now that we're going to be handling objects from multiple hash algorithms, it's important to make sure that all object IDs have a correct algorithm field. Introduce a per-algorithm null OID, and add it to struct hash_algo. Introduce a wrapper function as well, and use it everywhere we used to use the null_oid constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16Merge branch 'en/ort-readiness'Libravatar Junio C Hamano1-0/+37
Plug the ort merge backend throughout the rest of the system, and start testing it as a replacement for the recursive backend. * en/ort-readiness: Add testing with merge-ort merge strategy t6423: mark remaining expected failure under merge-ort as such Revert "merge-ort: ignore the directory rename split conflict for now" merge-recursive: add a bunch of FIXME comments documenting known bugs merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a conflict t: mark several submodule merging tests as fixed under merge-ort merge-ort: implement CE_SKIP_WORKTREE handling with conflicted entries t6428: new test for SKIP_WORKTREE handling and conflicts merge-ort: support subtree shifting merge-ort: let renormalization change modify/delete into clean delete merge-ort: have ll_merge() use a special attr_index for renormalization merge-ort: add a special minimal index just for renormalization merge-ort: use STABLE_QSORT instead of QSORT where required
2021-04-14merge-recursive: ensure full indexLibravatar Derrick Stolee1-0/+2
Before iterating over all cache entries, ensure that a sparse index is expanded to a full index to avoid unexpected behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14*: remove 'const' qualifier for struct index_stateLibravatar Derrick Stolee1-1/+1
Several methods specify that they take a 'struct index_state' pointer with the 'const' qualifier because they intend to only query the data, not change it. However, we will be introducing a step very low in the method stack that might modify a sparse-index to become a full index in the case that our queries venture inside a sparse-directory entry. This change only removes the 'const' qualifiers that are necessary for the following change which will actually modify the implementation of index_name_stage_pos(). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30Merge branch 'ab/read-tree'Libravatar Junio C Hamano1-3/+3
Code simplification by removing support for a caller that is long gone. * ab/read-tree: tree.h API: simplify read_tree_recursive() signature tree.h API: expose read_tree_1() as read_tree_at() archive: stop passing "stage" through read_tree_recursive() ls-files: refactor away read_tree() ls-files: don't needlessly pass around stage variable tree.c API: move read_tree() into builtin/ls-files.c ls-files tests: add meaningful --with-tree tests show tests: add test for "git show <tree>"
2021-03-20tree.h API: simplify read_tree_recursive() signatureLibravatar Ævar Arnfjörð Bjarmason1-3/+3
Simplify the signature of read_tree_recursive() to omit the "base", "baselen" and "stage" arguments. No callers of it use these parameters for anything anymore. The last function to call read_tree_recursive() with a non-"" path was read_tree_recursive() itself, but that was changed in ffd31f661d5 (Reimplement read_tree_recursive() using tree_entry_interesting(), 2011-03-25). The last user of the "stage" parameter went away in the last commit, and even that use was mere boilerplate. So let's remove those and rename the read_tree_recursive() function to just read_tree(). We had another read_tree() function that I've refactored away in preceding commits, since all in-tree users read trees recursively with a callback we can change the name to signify that this is the norm. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20merge-recursive: add a bunch of FIXME comments documenting known bugsLibravatar Elijah Newren1-0/+37
The plan is to just delete merge-recursive, but not until everyone is comfortable with merge-ort as a replacement. Given that I haven't switched all callers of merge-recursive over yet (e.g. git-am still uses merge-recursive), maybe there's some value documenting known bugs in the algorithm in case we end up keeping it or someone wants to dig it up in the future. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13use CALLOC_ARRAYLibravatar René Scharfe1-5/+4
Add and apply a semantic patch for converting code that open-codes CALLOC_ARRAY to use it instead. It shortens the code and infers the element size automatically. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16commit: move reverse_commit_list() from merge-recursiveLibravatar Elijah Newren1-11/+0
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-02hashmap: provide deallocation function namesLibravatar Elijah Newren1-3/+3
hashmap_free(), hashmap_free_entries(), and hashmap_free_() have existed for a while, but aren't necessarily the clearest names, especially with hashmap_partial_clear() being added to the mix and lazy-initialization now being supported. Peff suggested we adopt the following names[1]: - hashmap_clear() - remove all entries and de-allocate any hashmap-specific data, but be ready for reuse - hashmap_clear_and_free() - ditto, but free the entries themselves - hashmap_partial_clear() - remove all entries but don't deallocate table - hashmap_partial_clear_and_free() - ditto, but free the entries This patch provides the new names and converts all existing callers over to the new naming scheme. [1] https://lore.kernel.org/git/20201030125059.GA3277724@coredump.intra.peff.net/ Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-10Merge branch 'en/eol-attrs-gotchas'Libravatar Junio C Hamano1-0/+3
All "mergy" operations that internally use the merge-recursive machinery should honor the merge.renormalize configuration, but many of them didn't. * en/eol-attrs-gotchas: checkout: support renormalization with checkout -m <paths> merge: make merge.renormalize work for all uses of merge machinery t6038: remove problematic test t6038: make tests fail for the right reason
2020-08-03merge: make merge.renormalize work for all uses of merge machineryLibravatar Elijah Newren1-0/+3
The 'merge' command is not the only one that does merges; other commands like checkout -m or rebase do as well. Unfortunately, the only area of the code that checked for the "merge.renormalize" config setting was in builtin/merge.c, meaning it could only affect merges performed by the "merge" command. Move the handling of this config setting to merge_recursive_config() so that other commands can benefit from it as well. Fixes a few tests in t6038. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-02merge-recursive: fix unclear and outright wrong commentsLibravatar Elijah Newren1-2/+3
Commits 7c0a6c8e47 ("merge-recursive: move some definitions around to clean up the header", 2019-08-17), and b4db8a2b76 ("merge-recursive: remove useless parameter in merge_trees()", 2019-08-17) added some useful documentation to the functions, but had a few places where the new comments were unclear or even misleading. Fix those comments. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-14merge-recursive: fix rename/rename(1to2) for working tree with a binaryLibravatar Elijah Newren1-0/+12
With a rename/rename(1to2) conflict, we attempt to do a three-way merge of the file contents, so that the correct contents can be placed in the working tree at both paths. If the file is a binary, however, no content merging is possible and we should just use the original version of the file at each of the paths. Reported-by: Chunlin Zhang <zhangchunlin@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-16convert: permit passing additional metadata to filter processesLibravatar brian m. carlson1-1/+1
There are a variety of situations where a filter process can make use of some additional metadata. For example, some people find the ident filter too limiting and would like to include the commit or the branch in their smudged files. This information isn't available during checkout as HEAD hasn't been updated at that point, and it wouldn't be available in archives either. Let's add a way to pass this metadata down to the filter. We pass the blob we're operating on, the treeish (preferring the commit over the tree if one exists), and the ref we're operating on. Note that we won't pass this information in all cases, such as when renormalizing or when we're performing diffs, since it doesn't make sense in those cases. The data we currently get from the filter process looks like the following: command=smudge pathname=git.c 0000 With this change, we'll get data more like this: command=smudge pathname=git.c refname=refs/tags/v2.25.1 treeish=c522f061d551c9bb8684a7c3859b2ece4499b56b blob=7be7ad34bd053884ec48923706e70c81719a8660 0000 There are a couple things to note about this approach. For operations like checkout, treeish will always be a commit, since we cannot check out individual trees, but for other operations, like archive, we can end up operating on only a particular tree, so we'll provide only a tree as the treeish. Similar comments apply for refname, since there are a variety of cases in which we won't have a ref. This commit wires up the code to print this information, but doesn't pass any of it at this point. In a future commit, we'll have various code paths pass the actual useful data down. Signed-off-by: brian m. carlson <bk2204@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-09Merge branch 'en/merge-path-collision'Libravatar Junio C Hamano1-105/+47
Handling of conflicting renames in merge-recursive have further been made consistent with how existing codepaths try to mimic what is done to add/add conflicts. * en/merge-path-collision: merge-recursive: apply collision handling unification to recursive case
2020-03-02Merge branch 'en/t3433-rebase-stat-dirty-failure'Libravatar Junio C Hamano1-2/+5
The merge-recursive machinery failed to refresh the cache entry for a merge result in a couple of places, resulting in an unnecessary merge failure, which has been fixed. * en/t3433-rebase-stat-dirty-failure: merge-recursive: fix the refresh logic in update_file_flags t3433: new rebase testcase documenting a stat-dirty-like failure
2020-02-27merge-recursive: apply collision handling unification to recursive caseLibravatar Elijah Newren1-105/+47
In the en/merge-path-collision topic (see commit ac193e0e0aa5, "Merge branch 'en/merge-path-collision'", 2019-01-04), all the "file collision" conflict types were modified for consistency. In particular, rename/add, rename/rename(2to1) and each rename/add piece of a rename/rename(1to2)/add[/add] conflict were made to behave like add/add conflicts have always been handled. However, this consistency was not enforced when opt->priv->call_depth > 0 for rename/rename conflicts. Update rename/rename(1to2) and rename/rename(2to1) conflicts in the recursive case to also be consistent. As an added bonus, this simplifies the code considerably. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-19merge-recursive: fix the refresh logic in update_file_flagsLibravatar Elijah Newren1-2/+5
If we need to delete a higher stage entry in the index to place the file at stage 0, then we'll lose that file's stat information. In such situations we may still be able to detect that the file on disk is the version we want (as noted by our comment in the code: /* do not overwrite file if already present */ ), but we do still need to update the mtime since we are creating a new cache_entry for that file. Update the logic used to determine whether we refresh a file's mtime. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-27merge-recursive: use subtraction to flip stageLibravatar Junio C Hamano1-3/+2
The flip_stage() helper uses a bit-flipping xor to switch between "2" and "3". While clever, this relies on a property of those two numbers that is mostly coincidence. Let's write it as a subtraction; that's more clear and would extend to other numbers if somebody copies the logic. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-27merge-recursive: silence -Wxor-used-as-pow warningLibravatar Jeff King1-5/+14
The merge-recursive code uses stage number constants like this: add = &ci->ren1->dst_entry->stages[2 ^ 1]; ... add = &ci->ren2->dst_entry->stages[3 ^ 1]; The xor has the effect of flipping the "1" bit, so that "2 ^ 1" becomes "3" and "3 ^ 1" becomes "2", which correspond to the "ours" and "theirs" stages respectively. Unfortunately, clang-10 and up issue a warning for this code: merge-recursive.c:1759:40: error: result of '2 ^ 1' is 3; did you mean '1 << 1' (2)? [-Werror,-Wxor-used-as-pow] add = &ci->ren1->dst_entry->stages[2 ^ 1]; ~~^~~ 1 << 1 merge-recursive.c:1759:40: note: replace expression with '0x2 ^ 1' to silence this warning We could silence it by using 0x2, as the compiler mentions. Or by just using the constants "2" and "3" directly. But after digging into it, I do think this bit-flip is telling us something. If we just wrote: add = &ci->ren2->dst_entry->stages[2]; for the second one, you might think that "ren2" and "2" correspond. But they don't. The logic is: ren2 is theirs, which is stage 3, but we are interested in the opposite side's stage, so flip it to 2. So let's keep the bit-flipping, but let's also put it behind a named function, which will make its purpose a bit clearer. This also has the side effect of suppressing the warning (and an optimizing compiler should be able to easily turn it into a constant as before). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-06Merge branch 'en/merge-recursive-oid-eq-simplify'Libravatar Junio C Hamano1-22/+11
Code cleanup. * en/merge-recursive-oid-eq-simplify: merge-recursive: remove unnecessary oid_eq function
2020-01-02merge-recursive: remove unnecessary oid_eq functionLibravatar Elijah Newren1-22/+11
Back when merge-recursive was first introduced in commit 6d297f8137 (Status update on merge-recursive in C, 2006-07-08), it created a sha_eq() function. This function pre-dated the introduction of hashcmp() to cache.h by about a month, but was switched over to using hashcmp() as part of commit 9047ebbc22 (Split out merge_recursive() to merge-recursive.c, 2008-08-12). In commit b4da9d62f9 (merge-recursive: convert leaf functions to use struct object_id, 2016-06-24), sha_eq() was renamed to oid_eq() and its hashcmp() call was switched to oideq(). oid_eq() is basically just a wrapper around oideq() that has some extra checks to protect against NULL arguments or to allow short-circuiting if one of the arguments is NULL. I don't know if any caller ever tried to call with NULL arguments, but certainly none do now which means the extra checks serve no purpose. (Also, if these checks were genuinely useful, then they probably should be added to the main oideq() so all callers could benefit from them.) Reduce the cognitive overhead of having both oid_eq() and oideq(), by getting rid of merge-recursive's special oid_eq() wrapper. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-10Merge branch 'en/merge-recursive-directory-rename-fixes'Libravatar Junio C Hamano1-23/+81
When all files from some subdirectory were renamed to the root directory, the directory rename heuristics would fail to detect that as a rename/merge of the subdirectory to the root directory, which has been corrected. * en/merge-recursive-directory-rename-fixes: t604[236]: do not run setup in separate tests merge-recursive: fix merging a subdirectory into the root directory merge-recursive: clean up get_renamed_dir_portion()
2019-10-23merge-recursive: fix merging a subdirectory into the root directoryLibravatar Elijah Newren1-3/+49
We allow renaming all entries in e.g. a directory named z/ into a directory named y/ to be detected as a z/ -> y/ rename, so that if the other side of history adds any files to the directory z/ in the mean time, we can provide the hint that they should be moved to y/. There is no reason to not allow 'y/' to be the root directory, but the code did not handle that case correctly. Add a testcase and the necessary special checks to support this case. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-23merge-recursive: clean up get_renamed_dir_portion()Libravatar Elijah Newren1-24/+36
Dscho noted a few things making this function hard to follow. Restructure it a bit and add comments to make it easier to follow. The restructurings include: * There was a special case if-check at the end of the function checking whether someone just renamed a file within its original directory, meaning that there could be no directory rename involved. That check was slightly convoluted; it could be done in a more straightforward fashion earlier in the function, and can be done more cheaply too (no call to strncmp). * The conditions for advancing end_of_old and end_of_new before calling strchr were both confusing and unnecessary. If either points at a '/', then they need to be advanced in order to find the next '/'. If either doesn't point at a '/', then advancing them one char before calling strchr() doesn't hurt. So, just rip out the if conditions and advance both before calling strchr(). Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-15Merge branch 'ew/hashmap'Libravatar Junio C Hamano1-40/+51
Code clean-up of the hashmap API, both users and implementation. * ew/hashmap: hashmap_entry: remove first member requirement from docs hashmap: remove type arg from hashmap_{get,put,remove}_entry OFFSETOF_VAR macro to simplify hashmap iterators hashmap: introduce hashmap_free_entries hashmap: hashmap_{put,remove} return hashmap_entry * hashmap: use *_entry APIs for iteration hashmap_cmp_fn takes hashmap_entry params hashmap_get{,_from_hash} return "struct hashmap_entry *" hashmap: use *_entry APIs to wrap container_of hashmap_get_next returns "struct hashmap_entry *" introduce container_of macro hashmap_put takes "struct hashmap_entry *" hashmap_remove takes "const struct hashmap_entry *" hashmap_get takes "const struct hashmap_entry *" hashmap_add takes "struct hashmap_entry *" hashmap_get_next takes "const struct hashmap_entry *" hashmap_entry_init takes "struct hashmap_entry *" packfile: use hashmap_entry in delta_base_cache_entry coccicheck: detect hashmap_entry.hash assignment diff: use hashmap_entry_init on moved_entry.ent
2019-10-15Merge branch 'en/merge-recursive-cleanup'Libravatar Junio C Hamano1-226/+346
The merge-recursive machiery is one of the most complex parts of the system that accumulated cruft over time. This large series cleans up the implementation quite a bit. * en/merge-recursive-cleanup: (26 commits) merge-recursive: fix the fix to the diff3 common ancestor label merge-recursive: fix the diff3 common ancestor label for virtual commits merge-recursive: alphabetize include list merge-recursive: add sanity checks for relevant merge_options merge-recursive: rename MERGE_RECURSIVE_* to MERGE_VARIANT_* merge-recursive: split internal fields into a separate struct merge-recursive: avoid losing output and leaking memory holding that output merge-recursive: comment and reorder the merge_options fields merge-recursive: consolidate unnecessary fields in merge_options merge-recursive: move some definitions around to clean up the header merge-recursive: rename merge_options argument to opt in header merge-recursive: rename 'mrtree' to 'result_tree', for clarity merge-recursive: use common name for ancestors/common/base_list merge-recursive: fix some overly long lines cache-tree: share code between functions writing an index as a tree merge-recursive: don't force external callers to do our logging merge-recursive: remove useless parameter in merge_trees() merge-recursive: exit early if index != head Ensure index matches head before invoking merge machinery, round N merge-recursive: remove another implicit dependency on the_repository ...
2019-10-08merge-recursive: fix the fix to the diff3 common ancestor labelLibravatar Elijah Newren1-1/+2
In commit 8e4ec337 ("merge-recursive: fix the diff3 common ancestor label for virtual commits", 2019-10-01), which was a fix to commit 743474cbfa8b ("merge-recursive: provide a better label for diff3 common ancestor", 2019-08-17), the label for the common ancestor was changed from always being "merged common ancestors" to instead be based on the number of merge bases and whether the merge base was a real commit or a virtual one: >=2: "merged common ancestors" 1, via merge_recursive_generic: "constructed merge base" 1, otherwise: <abbreviated commit hash> 0: "<empty tree>" The handling for "constructed merge base" worked by allowing opt->ancestor to be set in merge_recursive_generic(), so we paid attention to the setting of that variable in merge_recursive_internal(). Now, for the outer merge, the code flow was simply the following: ancestor_name = "merged merge bases" loop over merge_bases: merge_recursive_internal() The first merge base not needing recursion would determine its own ancestor_name however necessary and thus run ancestor_name = $SOMETHING empty loop over merge_bases... opt->ancestor = ancestor_name merge_trees_internal() Now, the next set of merge_bases that would need to be merged after this particular merge had completed would note that opt->ancestor has been set to something (to a local ancestor_name variable that has since been popped off the stack), and thus it would run: ... else if (opt->ancestor) { ancestor_name = opt->ancestor; /* OOPS! */ loop over merge_bases: merge_recursive_internal() opt->ancestor = ancestor_name merge_trees_internal() This resulted in garbage strings being printed for the virtual merge bases, which was visible in git.git by just merging commit b744c3af07 into commit 6d8cb22a4f. There are two ways to fix this: set opt->ancestor to NULL after using it to avoid re-use, or add a !opt->priv->call_depth check to the if block for using a pre-defined opt->ancestor. Apply both fixes. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap: remove type arg from hashmap_{get,put,remove}_entryLibravatar Eric Wong1-4/+2
Since these macros already take a `keyvar' pointer of a known type, we can rely on OFFSETOF_VAR to get the correct offset without relying on non-portable `__typeof__' and `offsetof'. Argument order is also rearranged, so `keyvar' and `member' are sequential as they are used as: `keyvar->member' Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07OFFSETOF_VAR macro to simplify hashmap iteratorsLibravatar Eric Wong1-5/+0
While we cannot rely on a `__typeof__' operator being portable to use with `offsetof'; we can calculate the pointer offset using an existing pointer and the address of a member using pointer arithmetic for compilers without `__typeof__'. This allows us to simplify usage of hashmap iterator macros by not having to specify a type when a pointer of that type is already given. In the future, list iterator macros (e.g. list_for_each_entry) may also be implemented using OFFSETOF_VAR to save hackers the trouble of using container_of/list_entry macros and without relying on non-portable `__typeof__'. v3: use `__typeof__' to avoid clang warnings Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap: introduce hashmap_free_entriesLibravatar Eric Wong1-3/+4
`hashmap_free_entries' behaves like `container_of' and passes the offset of the hashmap_entry struct to the internal `hashmap_free_' function, allowing the function to free any struct pointer regardless of where the hashmap_entry field is located. `hashmap_free' no longer takes any arguments aside from the hashmap itself. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap: use *_entry APIs for iterationLibravatar Eric Wong1-10/+15
Inspired by list_for_each_entry in the Linux kernel. Once again, these are somewhat compromised usability-wise by compilers lacking __typeof__ support. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap_cmp_fn takes hashmap_entry paramsLibravatar Eric Wong1-12/+21
Another step in eliminating the requirement of hashmap_entry being the first member of a struct. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap_get{,_from_hash} return "struct hashmap_entry *"Libravatar Eric Wong1-2/+4
Update callers to use hashmap_get_entry, hashmap_get_entry_from_hash or container_of as appropriate. This is another step towards eliminating the requirement of hashmap_entry being the first field in a struct. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap_put takes "struct hashmap_entry *"Libravatar Eric Wong1-2/+2
This is less error-prone than "void *" as the compiler now detects invalid types being passed. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>