summaryrefslogtreecommitdiff
path: root/cache.h
AgeCommit message (Collapse)AuthorFilesLines
2014-04-07tree-diff: rework diff_tree() to generate diffs for multiparent cases as wellLibravatar Kirill Smelkov1-0/+15
Previously diff_tree(), which is now named ll_diff_tree_sha1(), was generating diff_filepair(s) for two trees t1 and t2, and that was usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes a commit introduces. In Git, however, we have fundamentally built flexibility in that a commit can have many parents - 1 for a plain commit, 2 for a simple merge, but also more than 2 for merging several heads at once. For merges there is a so called combine-diff, which shows diff, a merge introduces by itself, omitting changes done by any parent. That works through first finding paths, that are different to all parents, and then showing generalized diff, with separate columns for +/- for each parent. The code lives in combine-diff.c . There is an impedance mismatch, however, in that a commit could generally have any number of parents, and that while diffing trees, we divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there is no special casing for multiple parents commits in e.g. revision-walker . That impedance mismatch *hurts* *performance* *badly* for generating combined diffs - in "combine-diff: optimize combine_diff_path sets intersection" I've already removed some slowness from it, but from the timings provided there, it could be seen, that combined diffs still cost more than an order of magnitude more cpu time, compared to diff for usual commits, and that would only be an optimistic estimate, if we take into account that for e.g. linux.git there is only one merge for several dozens of plain commits. That slowness comes from the fact that currently, while generating combined diff, a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). That's because at present, to compute combine-diff, for first finding paths, that "every parent touches", we use the following combine-diff property/definition: D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (w.r.t. paths) where D(A,P1...Pn) is combined diff between commit A, and parents Pi and D(A,Pi) is usual two-tree diff Pi..A So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n 1-parent diffs and intersecting results will be slow. And usually, for linux.git and other topic-based workflows, that D(A,P2) is huge, because, if merge-base of A and P2, is several dozens of merges (from A, via first parent) below, that D(A,P2) will be diffing sum of merges from several subsystems to 1 subsystem. The solution is to avoid computing n 1-parent diffs, and to find changed-to-all-parents paths via scanning A's and all Pi's trees simultaneously, at each step comparing their entries, and based on that comparison, populate paths result, and deduce we could *skip* *recursing* into subdirectories, if at least for 1 parent, sha1 of that dir tree is the same as in A. That would save us from doing significant amount of needless work. Such approach is very similar to what diff_tree() does, only there we deal with scanning only 2 trees simultaneously, and for n+1 tree, the logic is a bit more complex: D(T,P1...Pn) calculation scheme ------------------------------- D(T,P1...Pn) = D(T,P1) ^ ... ^ D(T,Pn) (regarding resulting paths set) D(T,Pj) - diff between T..Pj D(T,P1...Pn) - combined diff from T to parents P1,...,Pn We start from all trees, which are sorted, and compare their entries in lock-step: T P1 Pn - - - |t| |p1| |pn| |-| |--| ... |--| imin = argmin(p1...pn) | | | | | | |-| |--| |--| |.| |. | |. | . . . . . . at any time there could be 3 cases: 1) t < p[imin]; 2) t > p[imin]; 3) t = p[imin]. Schematic deduction of what every case means, and what to do, follows: 1) t < p[imin] -> ∀j t ∉ Pj -> "+t" ∈ D(T,Pj) -> D += "+t"; t↓ 2) t > p[imin] 2.1) ∃j: pj > p[imin] -> "-p[imin]" ∉ D(T,Pj) -> D += ø; ∀ pi=p[imin] pi↓ 2.2) ∀i pi = p[imin] -> pi ∉ T -> "-pi" ∈ D(T,Pi) -> D += "-p[imin]"; ∀i pi↓ 3) t = p[imin] 3.1) ∃j: pj > p[imin] -> "+t" ∈ D(T,Pj) -> only pi=p[imin] remains to investigate 3.2) pi = p[imin] -> investigate δ(t,pi) | | v 3.1+3.2) looking at δ(t,pi) ∀i: pi=p[imin] - if all != ø -> ⎧δ(t,pi) - if pi=p[imin] -> D += ⎨ ⎩"+t" - if pi>p[imin] in any case t↓ ∀ pi=p[imin] pi↓ ~ For comparison, here is how diff_tree() works: D(A,B) calculation scheme ------------------------- A B - - |a| |b| a < b -> a ∉ B -> D(A,B) += +a a↓ |-| |-| a > b -> b ∉ A -> D(A,B) += -b b↓ | | | | a = b -> investigate δ(a,b) a↓ b↓ |-| |-| |.| |.| . . . . ~~~~~~~~ This patch generalizes diff tree-walker to work with arbitrary number of parents as described above - i.e. now there is a resulting tree t, and some parents trees tp[i] i=[0..nparent). The generalization builds on the fact that usual diff D(A,B) is by definition the same as combined diff D(A,[B]), so if we could rework the code for common case and make it be not slower for nparent=1 case, usual diff(t1,t2) generation will not be slower, and multiparent diff tree-walker would greatly benefit generating combine-diff. What we do is as follows: 1) diff tree-walker ll_diff_tree_sha1() is internally reworked to be a paths generator (new name diff_tree_paths()), with each generated path being `struct combine_diff_path` with info for path, new sha1,mode and for every parent which sha1,mode it was in it. 2) From that info, we can still generate usual diff queue with struct diff_filepairs, via "exporting" generated combine_diff_path, if we know we run for nparent=1 case. (see emit_diff() which is now named emit_diff_first_parent_only()) 3) In order for diff_can_quit_early(), which checks DIFF_OPT_TST(opt, HAS_CHANGES)) to work, that exporting have to be happening not in bulk, but incrementally, one diff path at a time. For such consumers, there is a new callback in diff_options introduced: ->pathchange(opt, struct combine_diff_path *) which, if set to !NULL, is called for every generated path. (see new compat ll_diff_tree_sha1() wrapper around new paths generator for setup) 4) The paths generation itself, is reworked from previous ll_diff_tree_sha1() code according to "D(A,P1...Pn) calculation scheme" provided above: On the start we allocate [nparent] arrays in place what was earlier just for one parent tree. then we just generalize loops, and comparison according to the algorithm. Some notes(*): 1) alloca(), for small arrays, is used for "runs not slower for nparent=1 case than before" goal - if we change it to xmalloc()/free() the timings get ~1% worse. For alloca() we use just-introduced xalloca/xalloca_free compatibility wrappers, so it should not be a portability problem. 2) For every parent tree, we need to keep a tag, whether entry from that parent equals to entry from minimal parent. For performance reasons I'm keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ. Not doing so, we'd need to alloca another [nparent] array, which hurts performance. 3) For emitted paths, memory could be reused, if we know the path was processed via callback and will not be needed later. We use efficient hand-made realloc-style path_appendnew(), that saves us from ~1-1.5% of potential additional slowdown. 4) goto(s) are used in several places, as the code executes a little bit faster with lowered register pressure. Also - we should now check for FIND_COPIES_HARDER not only when two entries names are the same, and their hashes are equal, but also for a case, when a path was removed from some of all parents having it. The reason is, if we don't, that path won't be emitted at all (see "a > xi" case), and we'll just skip it, and FIND_COPIES_HARDER wants all paths - with diff or without - to be emitted, to be later analyzed for being copies sources. The new check is only necessary for nparent >1, as for nparent=1 case xmin_eqtotal always =1 =nparent, and a path is always added to diff as removal. ~~~~~~~~ Timings for # without -c, i.e. testing only nparent=1 case `git log --raw --no-abbrev --no-renames` before and after the patch are as follows: navy.git linux.git v3.10..v3.11 before 0.611s 1.889s after 0.619s 1.907s slowdown 1.3% 0.9% This timings show we did no harm to usual diff(tree1,tree2) generation. From the table we can see that we actually did ~1% slowdown, but I think I've "earned" that 1% in the previous patch ("tree-diff: reuse base str(buf) memory on sub-tree recursion", HEAD~~) so for nparent=1 case, net timings stays approximately the same. The output also stayed the same. (*) If we revert 1)-4) to more usual techniques, for nparent=1 case, we'll get ~2-2.5% of additional slowdown, which I've tried to avoid, as "do no harm for nparent=1 case" rule. For linux.git, combined diff will run an order of magnitude faster and appropriate timings will be provided in the next commit, as we'll be taking advantage of the new diff tree-walker for combined-diff generation there. P.S. and combined diff is not some exotic/for-play-only stuff - for example for a program I write to represent Git archives as readonly filesystem, there is initial scan with `git log --reverse --raw --no-abbrev --no-renames -c` to extract log of what was created/changed when, as a result building a map {} sha1 -> in which commit (and date) a content was added that `-c` means also show combined diff for merges, and without them, if a merge is non-trivial (merges changes from two parents with both having separate changes to a file), or an evil one, the map will not be full, i.e. some valid sha1 would be absent from it. That case was my initial motivation for combined diffs speedup. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-27Merge branch 'mh/safe-create-leading-directories'Libravatar Junio C Hamano1-2/+23
Code clean-up and protection against concurrent write access to the ref namespace. * mh/safe-create-leading-directories: rename_tmp_log(): on SCLD_VANISHED, retry rename_tmp_log(): limit the number of remote_empty_directories() attempts rename_tmp_log(): handle a possible mkdir/rmdir race rename_ref(): extract function rename_tmp_log() remove_dir_recurse(): handle disappearing files and directories remove_dir_recurse(): tighten condition for removing unreadable dir lock_ref_sha1_basic(): if locking fails with ENOENT, retry lock_ref_sha1_basic(): on SCLD_VANISHED, retry safe_create_leading_directories(): add new error value SCLD_VANISHED cmd_init_db(): when creating directories, handle errors conservatively safe_create_leading_directories(): introduce enum for return values safe_create_leading_directories(): always restore slash at end of loop safe_create_leading_directories(): split on first of multiple slashes safe_create_leading_directories(): rename local variable safe_create_leading_directories(): add explicit "slash" pointer safe_create_leading_directories(): reduce scope of local variable safe_create_leading_directories(): fix format of "if" chaining
2014-01-27Merge branch 'mh/retire-ref-fetch-rules'Libravatar Junio C Hamano1-3/+6
Code simplification. * mh/retire-ref-fetch-rules: refname_match(): always use the rules in ref_rev_parse_rules
2014-01-17Merge branch 'nd/shallow-clone'Libravatar Junio C Hamano1-0/+3
Fetching from a shallow-cloned repository used to be forbidden, primarily because the codepaths involved were not carefully vetted and we did not bother supporting such usage. This attempts to allow object transfer out of a shallow-cloned repository in a controlled way (i.e. the receiver become a shallow repository with truncated history). * nd/shallow-clone: (31 commits) t5537: fix incorrect expectation in test case 10 shallow: remove unused code send-pack.c: mark a file-local function static git-clone.txt: remove shallow clone limitations prune: clean .git/shallow after pruning objects clone: use git protocol for cloning shallow repo locally send-pack: support pushing from a shallow clone via http receive-pack: support pushing to a shallow clone via http smart-http: support shallow fetch/clone remote-curl: pass ref SHA-1 to fetch-pack as well send-pack: support pushing to a shallow clone receive-pack: allow pushes that update .git/shallow connected.c: add new variant that runs with --shallow-file add GIT_SHALLOW_FILE to propagate --shallow-file to subprocesses receive/send-pack: support pushing from a shallow clone receive-pack: reorder some code in unpack() fetch: add --update-shallow to accept refs that update .git/shallow upload-pack: make sure deepening preserves shallow roots fetch: support fetching from a shallow repository clone: support remote shallow repository ...
2014-01-14refname_match(): always use the rules in ref_rev_parse_rulesLibravatar Michael Haggerty1-3/+6
We used to use two separate rules for the normal ref resolution dwimming and dwimming done to decide which remote ref to grab. The third parameter to refname_match() selected which rules to use. When these two rules were harmonized in 2011-11-04 dd621df9cd refs DWIMmery: use the same rule for both "git fetch" and others , ref_fetch_rules was #defined to avoid potential breakages for in-flight topics. It is now safe to remove the backwards-compatibility code, so remove refname_match()'s third parameter, make ref_rev_parse_rules private to refs.c, and remove ref_fetch_rules entirely. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-10Merge branch 'jk/oi-delta-base'Libravatar Junio C Hamano1-0/+1
Teach "cat-file --batch" to show delta-base object name for a packed object that is represented as a delta. * jk/oi-delta-base: cat-file: provide %(deltabase) batch format sha1_object_info_extended: provide delta base sha1s
2014-01-06safe_create_leading_directories(): add new error value SCLD_VANISHEDLibravatar Michael Haggerty1-1/+9
Add a new possible error result that can be returned by safe_create_leading_directories() and safe_create_leading_directories_const(): SCLD_VANISHED. This value indicates that a file or directory on the path existed at one point (either it already existed or the function created it), but then it disappeared. This probably indicates that another process deleted the directory while we were working. If SCLD_VANISHED is returned, the caller might want to retry the function call, as there is a chance that a new attempt will succeed. Why doesn't safe_create_leading_directories() do the retrying internally? Because an empty directory isn't really ever safe until it holds a file. So even if safe_create_leading_directories() were absolutely sure that the directory existed before it returned, there would be no guarantee that the directory still existed when the caller tried to write something in it. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-06safe_create_leading_directories(): introduce enum for return valuesLibravatar Michael Haggerty1-2/+15
Instead of returning magic integer values (which a couple of callers go to the trouble of distinguishing), return values from an enum. Add a docstring. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-26sha1_object_info_extended: provide delta base sha1sLibravatar Jeff King1-0/+1
A caller of sha1_object_info_extended technically has enough information to determine the base sha1 from the results of the call. It knows the pack, offset, and delta type of the object, which is sufficient to find the base. However, the functions to do so are not publicly available, and the code itself is intimate enough with the pack details that it should be abstracted away. We could add a public helper to allow callers to query the delta base separately, but it is simpler and slightly more efficient to optionally grab it along with the rest of the object_info data. For cases where the object is not stored as a delta, we write the null sha1 into the query field. A careful caller could check "oi.whence == OI_PACKED && oi.u.packed.is_delta" before looking at the base sha1, but using the null sha1 provides a simple alternative (and gives a better sanity check for a non-careful caller than simply returning random bytes). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-12sha1_object_info_extended(): add an "unsigned flags" parameterLibravatar Christian Couder1-1/+1
This parameter is not used yet, but it will be used to tell sha1_object_info_extended() if it should perform object replacement or not. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-12sha1_file.c: add lookup_replace_object_extended() to pass flagsLibravatar Christian Couder1-0/+6
Currently, there is only one caller to lookup_replace_object() that can benefit from passing it some flags, but we expect that there could be more. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-12rename READ_SHA1_FILE_REPLACE flag to LOOKUP_REPLACE_OBJECTLibravatar Christian Couder1-2/+2
The READ_SHA1_FILE_REPLACE flag is more related to using the lookup_replace_object() function rather than the read_sha1_file() function. We also need such a flag to be used with sha1_object_info() instead of read_sha1_file(). The name LOOKUP_REPLACE_OBJECT is therefore better for this flag. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10add GIT_SHALLOW_FILE to propagate --shallow-file to subprocessesLibravatar Nguyễn Thái Ngọc Duy1-0/+1
This may be needed when a hook is run after a new shallow pack is received, but .git/shallow is not settled yet. A temporary shallow file to plug all loose ends should be used instead. GIT_SHALLOW_FILE is overriden by --shallow-file. --shallow-file does not work in this case because the hook may spawn many git subprocesses and the launch commands do not have --shallow-file as it's a recent addition. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10shallow.c: the 8 steps to select new commits for .git/shallowLibravatar Nguyễn Thái Ngọc Duy1-0/+2
Suppose a fetch or push is requested between two shallow repositories (with no history deepening or shortening). A pack that contains necessary objects is transferred over together with .git/shallow of the sender. The receiver has to determine whether it needs to update .git/shallow if new refs needs new shallow comits. The rule here is avoid updating .git/shallow by default. But we don't want to waste the received pack. If the pack contains two refs, one needs new shallow commits installed in .git/shallow and one does not, we keep the latter and reject/warn about the former. Even if .git/shallow update is allowed, we only add shallow commits strictly necessary for the former ref (remember the sender can send more shallow commits than necessary) and pay attention not to accidentally cut the receiver history short (no history shortening is asked for) So the steps to figure out what ref need what new shallow commits are: 1. Split the sender shallow commit list into "ours" and "theirs" list by has_sha1_file. Those that exist in current repo in "ours", the remaining in "theirs". 2. Check the receiver .git/shallow, remove from "ours" the ones that also exist in .git/shallow. 3. Fetch the new pack. Either install or unpack it. 4. Do has_sha1_file on "theirs" list again. Drop the ones that fail has_sha1_file. Obviously the new pack does not need them. 5. If the pack is kept, remove from "ours" the ones that do not exist in the new pack. 6. Walk the new refs to answer the question "what shallow commits, both ours and theirs, are required in .git/shallow in order to add this ref?". Shallow commits not associated to any refs are removed from their respective list. 7. (*) Check reachability (from the current refs) of all remaining commits in "ours". Those reachable are removed. We do not want to cut any part of our (reachable) history. We only check up commits. True reachability test is done by check_everything_connected() at the end as usual. 8. Combine the final "ours" and "theirs" and add them all to .git/shallow. Install new refs. The case where some hook rejects some refs on a push is explained in more detail in the push patches. Of these steps, #6 and #7 are expensive. Both require walking through some commits, or in the worst case all commits. And we rather avoid them in at least common case, where the transferred pack does not contain any shallow commits that the sender advertises. Let's look at each scenario: 1) the sender has longer history than the receiver All shallow commits from the sender will be put into "theirs" list at step 1 because none of them exists in current repo. In the common case, "theirs" becomes empty at step 4 and exit early. 2) the sender has shorter history than the receiver All shallow commits from the sender are likely in "ours" list at step 1. In the common case, if the new pack is kept, we could empty "ours" and exit early at step 5. If the pack is not kept, we hit the expensive step 6 then exit after "ours" is emptied. There'll be only a handful of objects to walk in fast-forward case. If it's forced update, we may need to walk to the bottom. 3) the sender has same .git/shallow as the receiver This is similar to case 2 except that "ours" should be emptied at step 2 and exit early. A fetch after "clone --depth=X" is case 1. A fetch after "clone" (from a shallow repo) is case 3. Luckily they're cheap for the common case. A push from "clone --depth=X" falls into case 2, which is expensive. Some more work may be done at the sender/client side to avoid more work on the server side: if the transferred pack does not contain any shallow commits, send-pack should not send any shallow commits to the receive-pack, effectively turning it into a normal push and avoid all steps. This patch implements all steps except #3, already handled by fetch-pack and receive-pack, #6 and #7, which has their own patch due to their size. (*) in previous versions step 7 was put before step 3. I reorder it so that the common case that keeps the pack does not need to walk commits at all. In future if we implement faster commit reachability check (maybe with the help of pack bitmaps or commit cache), step 7 could become cheap and be moved up before 6 again. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-11-01Merge branch 'sb/refs-code-cleanup'Libravatar Junio C Hamano1-1/+0
* sb/refs-code-cleanup: cache: remove unused function 'have_git_dir' refs: remove unused function invalidate_ref_cache
2013-10-30Merge branch 'nd/lift-path-max'Libravatar Junio C Hamano1-0/+1
* nd/lift-path-max: checkout_entry(): clarify the use of topath[] parameter entry.c: convert checkout_entry to use strbuf
2013-10-28Merge branch 'jx/relative-path-regression-fix'Libravatar Junio C Hamano1-0/+1
* jx/relative-path-regression-fix: Use simpler relative_path when set_git_dir relative_path should honor dos-drive-prefix test: use unambigous leading path (/foo) for MSYS
2013-10-28cache: remove unused function 'have_git_dir'Libravatar Stefan Beller1-1/+0
This function was added in d2b0708 (2008-09-27, add have_git_dir() function) as a preparation for adbc0b6 (2008-09-30, cygwin: Use native Win32 API for stat). However the second referenced commit was reverted in f66450a (2013-06-22, cygwin: Remove the Win32 l/stat() implementation), so we don't need to expose this wrapper function any more as a public API. Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Acked-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-24checkout_entry(): clarify the use of topath[] parameterLibravatar Junio C Hamano1-0/+1
The said function has this signature: extern int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *topath); At first glance, it might appear that the caller of checkout_entry() can specify to which path the contents are written out by the last parameter, and it is tempting to add "const" in front of its type. In reality, however, topath[] is to point at a buffer to store the temporary path generated by the callchain originating from this function, and the temporary path is always short, much shorter than the buffer prepared by its only caller in builtin/checkout-index.c. Document the code a bit to clarify so that future callers know how to use the function better. Noticed-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-17Merge branch 'jk/format-patch-from'Libravatar Junio C Hamano1-0/+9
"format-patch --from=<whom>" forgot to omit unnecessary in-body from line, i.e. when <whom> is the same as the real author. * jk/format-patch-from: format-patch: print in-body "From" only when needed
2013-10-17Merge branch 'es/name-hash-no-trailing-slash-in-dirs'Libravatar Junio C Hamano1-0/+4
Clean up the internal of the name-hash mechanism used to work around case insensitivity on some filesystems to cleanly fix a long-standing API glitch where the caller of cache_name_exists() that ask about a directory with a counted string was required to have '/' at one location past the end of the string. * es/name-hash-no-trailing-slash-in-dirs: dir: revert work-around for retired dangerous behavior name-hash: stop storing trailing '/' on paths in index_state.dir_hash employ new explicit "exists in index?" API name-hash: refactor polymorphic index_name_exists()
2013-10-14Use simpler relative_path when set_git_dirLibravatar Jiang Xin1-0/+1
Using a relative_path as git_dir first appears in v1.5.6-1-g044bbbc. It will make git_dir shorter only if git_dir is inside work_tree, and this will increase performance. But my last refactor effort on relative_path function (commit v1.8.3-rc2-12-ge02ca72) changed that. Always use relative_path as git_dir may bring troubles like $gmane/234434. Because new relative_path is a combination of original relative_path from path.c and original path_relative from quote.c, so in order to restore the origin implementation, save the original relative_path as remove_leading_path, and call it in setup.c. Suggested-by: Karsten Blees <karsten.blees@gmail.com> Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-09-20Merge branch 'fc/at-head'Libravatar Junio C Hamano1-1/+1
Instead of typing four capital letters "HEAD", you can say "@" now, e.g. "git log @". * fc/at-head: Add new @ shortcut for HEAD sha1-name: pass len argument to interpret_branch_name()
2013-09-20format-patch: print in-body "From" only when neededLibravatar Jeff King1-0/+9
Commit a908047 taught format-patch the "--from" option, which places the author ident into an in-body from header, and uses the committer ident in the rfc822 from header. The documentation claims that it will omit the in-body header when it is the same as the rfc822 header, but the code never implemented that behavior. This patch completes the feature by comparing the two idents and doing nothing when they are the same (this is the same as simply omitting the in-body header, as the two are by definition indistinguishable in this case). This makes it reasonable to turn on "--from" all the time (if it matches your particular workflow), rather than only using it when exporting other people's patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-17name-hash: refactor polymorphic index_name_exists()Libravatar Eric Sunshine1-0/+4
Depending upon the absence or presence of a trailing '/' on the incoming pathname, index_name_exists() checks either if a file is present in the index or if a directory is represented within the index. Each caller explicitly chooses the mode of operation by adding or removing a trailing '/' before invoking index_name_exists(). Since these two modes of operations are disjoint and have no code in common (one searches index_state.name_hash; the other dir_hash), they can be represented more naturally as distinct functions: one to search for a file, and one for a directory. Splitting index searching into two functions relieves callers of the artificial burden of having to add or remove a slash to select the mode of operation; instead they just call the desired function. A subsequent patch will take advantage of this benefit in order to eliminate the requirement that the incoming pathname for a directory search must have a trailing slash. (In order to avoid disturbing in-flight topics, index_name_exists() is retained as a thin wrapper dispatching either to index_dir_exists() or index_file_exists().) Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-12Merge branch 'jk/config-int-range-check'Libravatar Junio C Hamano1-0/+1
"git config" did not provide a way to set or access numbers larger than a native "int" on the platform; it now provides 64-bit signed integers on all platforms. * jk/config-int-range-check: git-config: always treat --int as 64-bit internally config: make numeric parsing errors more clear config: set errno in numeric git_parse_* functions config: properly range-check integer values config: factor out integer parsing from range checks
2013-09-09Merge branch 'tg/index-struct-sizes'Libravatar Junio C Hamano1-5/+5
The code that reads from a region that mmaps an on-disk index assumed that "int"/"short" are always 32/16 bits. * tg/index-struct-sizes: read-cache: use fixed width integer types
2013-09-09Merge branch 'jl/submodule-mv'Libravatar Junio C Hamano1-24/+11
"git mv A B" when moving a submodule A does "the right thing", inclusing relocating its working tree and adjusting the paths in the .gitmodules file. * jl/submodule-mv: (53 commits) rm: delete .gitmodules entry of submodules removed from the work tree mv: update the path entry in .gitmodules for moved submodules submodule.c: add .gitmodules staging helper functions mv: move submodules using a gitfile mv: move submodules together with their work trees rm: do not set a variable twice without intermediate reading. t6131 - skip tests if on case-insensitive file system parse_pathspec: accept :(icase)path syntax pathspec: support :(glob) syntax pathspec: make --literal-pathspecs disable pathspec magic pathspec: support :(literal) syntax for noglob pathspec kill limit_pathspec_to_literal() as it's only used by parse_pathspec() parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN parse_pathspec: make sure the prefix part is wildcard-free rename field "raw" to "_raw" in struct pathspec tree-diff: remove the use of pathspec's raw[] in follow-rename codepath remove match_pathspec() in favor of match_pathspec_depth() remove init_pathspec() in favor of parse_pathspec() remove diff_tree_{setup,release}_paths convert common_prefix() to use struct pathspec ...
2013-09-09Merge branch 'jc/push-cas'Libravatar Junio C Hamano1-62/+0
Allow a safer "rewind of the remote tip" push than blind "--force", by requiring that the overwritten remote ref to be unchanged since the new history to replace it was prepared. The machinery is more or less ready. The "--force" option is again the big red button to override any safety, thanks to J6t's sanity (the original round allowed --lockref to defeat --force). The logic to choose the default implemented here is fragile (e.g. "git fetch" after seeing a failure will update the remote-tracking branch and will make the next "push" pass, defeating the safety pretty easily). It is suitable only for the simplest workflows, and it may hurt users more than it helps them. * jc/push-cas: push: teach --force-with-lease to smart-http transport send-pack: fix parsing of --force-with-lease option t5540/5541: smart-http does not support "--force-with-lease" t5533: test "push --force-with-lease" push --force-with-lease: tie it all together push --force-with-lease: implement logic to populate old_sha1_expect[] remote.c: add command line option parser for "--force-with-lease" builtin/push.c: use OPT_BOOL, not OPT_BOOLEAN cache.h: move remote/connect API out of it
2013-09-09git-config: always treat --int as 64-bit internallyLibravatar Jeff King1-0/+1
When you run "git config --int", the maximum size of integer you get depends on how git was compiled, and what it considers to be an "int". This is almost useful, because your scripts calling "git config" will behave similarly to git internally. But relying on this is dubious; you have to actually know how git treats each value internally (e.g., int versus unsigned long), which is not documented and is subject to change. And even if you know it is "unsigned long", we do not have a git-config option to match that behavior. Furthermore, you may simply be asking git to store a value on your behalf (e.g., configuration for a hook). In that case, the relevant range check has nothing at all to do with git, but rather with whatever scripting tools you are using (and git has no way of knowing what the appropriate range is there). Not only is the range check useless, but it is actively harmful, as there is no way at all for scripts to look at config variables with large values. For instance, one cannot reliably get the value of pack.packSizeLimit via git-config. On an LP64 system, git happily uses a 64-bit "unsigned long" internally to represent the value, but the script cannot read any value over 2G. Ideally, the "--int" option would simply represent an arbitrarily large integer. For practical purposes, however, a 64-bit integer is large enough, and is much easier to implement (and if somebody overflows it, we will still notice the problem, and not simply return garbage). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03sha1-name: pass len argument to interpret_branch_name()Libravatar Felipe Contreras1-1/+1
This is useful to make sure we don't step outside the boundaries of what we are interpreting at the moment. For example while interpreting foobar@{u}~1, the job of interpret_branch_name() ends right before ~1, but there's no way to figure that out inside the function, unless the len argument is passed. So let's do that. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-20read-cache: use fixed width integer typesLibravatar Thomas Gummerer1-5/+5
Use the fixed width integer types uint16_t and uint32_t for on-disk structures; unsigned short and unsigned int do not have a guaranteed size. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-01Merge branch 'ob/typofixes'Libravatar Junio C Hamano1-1/+1
* ob/typofixes: many small typofixes
2013-07-29many small typofixesLibravatar Ondřej Bílka1-1/+1
Signed-off-by: Ondřej Bílka <neleai@seznam.cz> Reviewed-by: Marc Branchaud <marcnarc@xiplink.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-24Merge branch 'jk/cat-file-batch-optim'Libravatar Junio C Hamano1-0/+2
If somebody wants to only know on-disk footprint of an object without having to know its type or payload size, we can bypass a lot of code to cheaply learn it. * jk/cat-file-batch-optim: Fix some sparse warnings sha1_object_info_extended: pass object_info to helpers sha1_object_info_extended: make type calculation optional packed_object_info: make type lookup optional packed_object_info: hoist delta type resolution to helper sha1_loose_object_info: make type lookup optional sha1_object_info_extended: rename "status" to "type" cat-file: disable object/refname ambiguity check for batch mode
2013-07-22Merge branch 'jx/clean-interactive'Libravatar Junio C Hamano1-1/+1
Add "interactive" mode to "git clean". The early part to refactor relative path related helper functions looked sensible. * jx/clean-interactive: test: run testcases with POSIX absolute paths on Windows test: add t7301 for git-clean--interactive git-clean: add documentation for interactive git-clean git-clean: add ask each interactive action git-clean: add select by numbers interactive action git-clean: add filter by pattern interactive action git-clean: use a git-add-interactive compatible UI git-clean: add colors to interactive git-clean git-clean: show items of del_list in columns git-clean: add support for -i/--interactive git-clean: refactor git-clean into two phases write_name{_quoted_relative,}(): remove redundant parameters quote_path_relative(): remove redundant parameter quote.c: substitute path_relative with relative_path path.c: refactor relative_path(), not only strip prefix test: add test cases for relative_path
2013-07-22Merge branch 'hv/config-from-blob'Libravatar Junio C Hamano1-1/+5
Allow configuration data to be read from in-tree blob objects, which would help working in a bare repository and submodule updates. * hv/config-from-blob: do not die when error in config parsing of buf occurs teach config --blob option to parse config from database config: make parsing stack struct independent from actual data source config: drop cf validity check in get_next_char() config: factor out config file stack management
2013-07-22Merge branch 'nd/const-struct-cache-entry'Libravatar Junio C Hamano1-1/+1
* nd/const-struct-cache-entry: Convert "struct cache_entry *" to "const ..." wherever possible
2013-07-22Merge branch 'tr/protect-low-3-fds'Libravatar Junio C Hamano1-0/+2
When "git" is spawned in such a way that any of the low 3 file descriptors is closed, our first open() may yield file descriptor 2, and writing error message to it would screw things up in a big way. * tr/protect-low-3-fds: git: ensure 0/1/2 are open in main() daemon/shell: refactor redirection of 0/1/2 from /dev/null
2013-07-18Merge branch 'jk/in-pack-size-measurement'Libravatar Junio C Hamano1-0/+1
"git cat-file --batch-check=<format>" is added, primarily to allow on-disk footprint of objects in packfiles (often they are a lot smaller than their true size, when expressed as deltas) to be reported. * jk/in-pack-size-measurement: pack-revindex: radix-sort the revindex pack-revindex: use unsigned to store number of objects cat-file: split --batch input lines on whitespace cat-file: add %(objectsize:disk) format atom cat-file: add --batch-check=<format> cat-file: refactor --batch option parsing cat-file: teach --batch to stream blob objects t1006: modernize output comparisons teach sha1_object_info_extended a "disk_size" query zero-initialize object_info structs
2013-07-17daemon/shell: refactor redirection of 0/1/2 from /dev/nullLibravatar Thomas Rast1-0/+2
Both daemon.c and shell.c contain logic to open FDs 0/1/2 from /dev/null if they are not already open. Move the function in daemon.c to setup.c and use it in shell.c, too. While there, remove a 'not' that inverted the meaning of the comment. The point is indeed to *avoid* messing up. Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15parse_pathspec: accept :(icase)path syntaxLibravatar Nguyễn Thái Ngọc Duy1-0/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15pathspec: support :(glob) syntaxLibravatar Nguyễn Thái Ngọc Duy1-0/+2
:(glob)path differs from plain pathspec that it uses wildmatch with WM_PATHNAME while the other uses fnmatch without FNM_PATHNAME. The difference lies in how '*' (and '**') is processed. With the introduction of :(glob) and :(literal) and their global options --[no]glob-pathspecs, the user can: - make everything literal by default via --noglob-pathspecs --literal-pathspecs cannot be used for this purpose as it disables _all_ pathspec magic. - individually turn on globbing with :(glob) - make everything globbing by default via --glob-pathspecs - individually turn off globbing with :(literal) The implication behind this is, there is no way to gain the default matching behavior (i.e. fnmatch without FNM_PATHNAME). You either get new globbing or literal. The old fnmatch behavior is considered deprecated and discouraged to use. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15parse_pathspec: make sure the prefix part is wildcard-freeLibravatar Nguyễn Thái Ngọc Duy1-0/+2
Prepending prefix to pathspec is a trick to workaround the fact that commands can be executed in a subdirectory, but all git commands run at worktree's root. The prefix part should always be treated as literal string. Make it so. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15convert add_files_to_cache to take struct pathspecLibravatar Nguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15convert refresh_index to take struct pathspecLibravatar Nguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15convert report_path_error to take struct pathspecLibravatar Nguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15convert read_cache_preload() to take struct pathspecLibravatar Nguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15move struct pathspec and related functions to pathspec.[ch]Libravatar Nguyễn Thái Ngọc Duy1-20/+2
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-12sha1_object_info_extended: make type calculation optionalLibravatar Jeff King1-0/+1
Each caller of sha1_object_info_extended sets up an object_info struct to tell the function which elements of the object it wants to get. Until now, getting the type of the object has always been required (and it is returned via the return type rather than a pointer in object_info). This can involve actually opening a loose object file to determine its type, or following delta chains to determine a packed file's base type. These effects produce a measurable slow-down when doing a "cat-file --batch-check" that does not include %(objecttype). This patch adds a "typep" query to struct object_info, so that it can be optionally queried just like size and disk_size. As a result, the return type of the function is no longer the object type, but rather 0/-1 for success/error. As there are only three callers total, we just fix up each caller rather than keep a compatibility wrapper: 1. The simpler sha1_object_info wrapper continues to always ask for and return the type field. 2. The istream_source function wants to know the type, and so always asks for it. 3. The cat-file batch code asks for the type only when %(objecttype) is part of the format string. On linux.git, the best-of-five for running: $ git rev-list --objects --all >objects $ time git cat-file --batch-check='%(objectsize:disk)' on a fully packed repository goes from: real 0m8.680s user 0m8.160s sys 0m0.512s to: real 0m7.205s user 0m6.580s sys 0m0.608s Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>