summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2014-10-16make add_object_array_with_context interface more saneLibravatar Jeff King3-20/+15
When you resolve a sha1, you can optionally keep any context found during the resolution, including the path and mode of a tree entry (e.g., when looking up "HEAD:subdir/file.c"). The add_object_array_with_context function lets you then attach that context to an entry in a list. Unfortunately, the interface for doing so is horrible. The object_context structure is large and most object_array users do not use it. Therefore we keep a pointer to the structure to avoid burdening other users too much. But that means when we do use it that we must allocate the struct ourselves. And the struct contains a fixed PATH_MAX-sized buffer, which makes this wholly unsuitable for any large arrays. We can observe that there is only a single user of the "with_context" variant: builtin/grep.c. And in that use case, the only element we care about is the path. We can therefore store only the path as a pointer (the context's mode field was redundant with the object_array_entry itself, and nobody actually cared about the surrounding tree). This still requires a strdup of the pathname, but at least we are only consuming the minimum amount of memory for each string. We can also handle the copying ourselves in add_object_array_*, and free it as appropriate in object_array_release_entry. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16write_sha1_file: freshen existing objectsLibravatar Jeff King2-7/+71
When we try to write a loose object file, we first check whether that object already exists. If so, we skip the write as an optimization. However, this can interfere with prune's strategy of using mtimes to mark files in progress. For example, if a branch contains a particular tree object and is deleted, that tree object may become unreachable, and have an old mtime. If a new operation then tries to write the same tree, this ends up as a noop; we notice we already have the object and do nothing. A prune running simultaneously with this operation will see the object as old, and may delete it. We can solve this by "freshening" objects that we avoid writing by updating their mtime. The algorithm for doing so is essentially the same as that of has_sha1_file. Therefore we provide a new (static) interface "check_and_freshen", which finds and optionally freshens the object. It's trivial to implement freshening and simple checking by tweaking a single parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16pack-objects: match prune logic for discarding objectsLibravatar Jeff King4-40/+98
A recent commit taught git-prune to keep non-recent objects that are reachable from recent ones. However, pack-objects, when loosening unreachable objects, tries to optimize out the write in the case that the object will be immediately pruned. It now gets this wrong, since its rule does not reflect the new prune code (and this can be seen by running t6501 with a strategically placed repack). Let's teach pack-objects similar logic. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16pack-objects: refactor unpack-unreachable expiration checkLibravatar Jeff King1-5/+12
When we are loosening unreachable packed objects, we do not bother to process objects that would simply be pruned immediately anyway. The "would be pruned" check is a simple comparison, but is about to get more complicated. Let's pull it out into a separate function. Note that this is slightly less efficient than the original, which avoided even opening old packs, since no object in them could pass the current check, which cares only about the pack mtime. But the new rules will depend on the exact object, so we need to perform the check even for old packs. Note also that we fix a minor buglet when the pack mtime is exactly the same as the expiration time. The prune code considers that worth pruning, whereas our check here considered it worth keeping. This wasn't a big deal. Besides being unlikely to happen, the result was simply that the object was loosened and then pruned, missing the optimization. Still, we can easily fix it while we are here. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16prune: keep objects reachable from recent objectsLibravatar Jeff King5-3/+204
Our current strategy with prune is that an object falls into one of three categories: 1. Reachable (from ref tips, reflogs, index, etc). 2. Not reachable, but recent (based on the --expire time). 3. Not reachable and not recent. We keep objects from (1) and (2), but prune objects in (3). The point of (2) is that these objects may be part of an in-progress operation that has not yet updated any refs. However, it is not always the case that objects for an in-progress operation will have a recent mtime. For example, the object database may have an old copy of a blob (from an abandoned operation, a branch that was deleted, etc). If we create a new tree that points to it, a simultaneous prune will leave our tree, but delete the blob. Referencing that tree with a commit will then work (we check that the tree is in the object database, but not that all of its referred objects are), as will mentioning the commit in a ref. But the resulting repo is corrupt; we are missing the blob reachable from a ref. One way to solve this is to be more thorough when referencing a sha1: make sure that not only do we have that sha1, but that we have objects it refers to, and so forth recursively. The problem is that this is very expensive. Creating a parent link would require traversing the entire object graph! Instead, this patch pushes the extra work onto prune, which runs less frequently (and has to look at the whole object graph anyway). It creates a new category of objects: objects which are not recent, but which are reachable from a recent object. We do not prune these objects, just like the reachable and recent ones. This lets us avoid the recursive check above, because if we have an object, even if it is unreachable, we should have its referent. We can make a simple inductive argument that with this patch, this property holds (that there are no objects with missing referents in the repository): 0. When we have no objects, we have nothing to refer or be referred to, so the property holds. 1. If we add objects to the repository, their direct referents must generally exist (e.g., if you create a tree, the blobs it references must exist; if you create a commit to point at the tree, the tree must exist). This is already the case before this patch. And it is not 100% foolproof (you can make bogus objects using `git hash-object`, for example), but it should be the case for normal usage. Therefore for any sequence of object additions, the property will continue to hold. 2. If we remove objects from the repository, then we will not remove a child object (like a blob) if an object that refers to it is being kept. That is the part implemented by this patch. Note, however, that our reachability check and the actual pruning are not atomic. So it _is_ still possible to violate the property (e.g., an object becomes referenced just as we are deleting it). This patch is shooting for eliminating problems where the mtimes of dependent objects differ by hours or days, and one is dropped without the other. It does nothing to help with short races. Naively, the simplest way to implement this would be to add all recent objects as tips to the reachability traversal. However, this does not perform well. In a recently-packed repository, all reachable objects will also be recent, and therefore we have to look at each object twice. This patch instead performs the reachability traversal, then follows up with a second traversal for recent objects, skipping any that have already been marked. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16sha1_file: add for_each iterators for loose and packed objectsLibravatar Jeff King2-0/+73
We typically iterate over the reachable objects in a repository by starting at the tips and walking the graph. There's no easy way to iterate over all of the objects, including unreachable ones. Let's provide a way of doing so. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16count-objects: use for_each_loose_file_in_objdirLibravatar Jeff King1-71/+30
This drops our line count considerably, and should make things more readable by keeping the counting logic separate from the traversal. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16count-objects: do not use xsize_t when counting object sizeLibravatar Jeff King1-1/+1
The point of xsize_t is to safely cast an off_t into a size_t (because we are about to mmap). But in count-objects, we are summing the sizes in an off_t. Using xsize_t means that count-objects could fail on a 32-bit system with a 4G object (not likely, as other parts of git would fail, but we should at least be correct here). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16prune-packed: use for_each_loose_file_in_objdirLibravatar Jeff King1-46/+23
This saves us from manually traversing the directory structure ourselves. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16reachable: mark index blobs as SEENLibravatar Jeff King1-1/+6
When we mark all reachable objects for pruning, that includes blobs mentioned by the index. However, we do not mark these with the SEEN flag, as we do for objects that we find by traversing (we also do not add them to the pending list, but that is because there is nothing further to traverse with them). This doesn't cause any problems with prune, because it checks only that the object exists in the global object hash, and not its flags. However, let's mark these objects to be consistent and avoid any later surprises. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16prune: factor out loose-object directory traversalLibravatar Jeff King3-61/+143
Prune has to walk $GIT_DIR/objects/?? in order to find the set of loose objects to prune. Other parts of the code (e.g., count-objects) want to do the same. Let's factor it out into a reusable for_each-style function. Note that this is not quite a straight code movement. The original code had strange behavior when it found a file of the form "[0-9a-f]{2}/.{38}" that did _not_ contain all hex digits. It executed a "break" from the loop, meaning that we stopped pruning in that directory (but still pruned other directories!). This was probably a bug; we do not want to process the file as an object, but we should keep going otherwise (and that is how the new code handles it). We are also a little more careful with loose object directories which fail to open. The original code silently ignored any failures, but the new code will complain about any problems besides ENOENT. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16reachable: reuse revision.c "add all reflogs" codeLibravatar Jeff King3-25/+4
We want to add all reflog entries as tips for finding reachable objects. The revision machinery can already do this (to support "rev-list --reflog"); we can reuse that code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16reachable: use traverse_commit_list instead of custom walkLibravatar Jeff King1-113/+17
To find the set of reachable objects, we add a bunch of possible sources to our rev_info, call prepare_revision_walk, and then launch into a custom walker that handles each object top. This is a subset of what traverse_commit_list does, so we can just reuse that code (it can also handle more complex cases like UNINTERESTING commits and pathspecs, but we don't use those features). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16clean up name allocation in prepare_revision_walkLibravatar Jeff King1-7/+7
When we enter prepare_revision_walk, we have zero or more entries in our "pending" array. We disconnect that array from the rev_info, and then process each entry: 1. If the entry is a commit and the --source option is in effect, we keep a pointer to the object name. 2. Otherwise, we re-add the item to the pending list with a blank name. We then throw away the old array by freeing the array itself, but do not touch the "name" field of each entry. For any items of type (2), we leak the memory associated with the name. This commit fixes that by calling object_array_clear, which handles the cleanup for us. That breaks (1), though, because it depends on the memory pointed to by the name to last forever. We can solve that by making a copy of the name. This is slightly less efficient, but it shouldn't matter in practice, as we do it only for the tip commits of the traversal. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16object_array: add a "clear" functionLibravatar Jeff King3-6/+17
There's currently no easy way to free the memory associated with an object_array (and in most cases, we simply leak the memory in a rev_info's pending array). Let's provide a helper to make this easier to handle. We can make use of it in list-objects.c, which does the same thing by hand (but fails to free the "name" field of each entry, potentially leaking memory). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16object_array: factor out slopbuf-freeing logicLibravatar Jeff King1-4/+12
This is not a lot of code, but it's a logical construct that should not need to be repeated (and we are about to add a third repetition). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16isxdigit: cast input to unsigned charLibravatar Jeff King2-5/+5
Otherwise, callers must do so or risk triggering warnings -Wchar-subscript (and rightfully so; a signed char might cause us to use a bogus negative index into the hexval_table). While we are dropping the now-unnecessary casts from the caller in urlmatch.c, we can get rid of similar casts in actually parsing the hex by using the hexval() helper, which implicitly casts to unsigned (but note that we cannot implement isxdigit in terms of hexval(), as it also casts its return value to unsigned). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16foreach_alt_odb: propagate return value from callbackLibravatar Jeff King2-5/+9
We check the return value of the callback and stop iterating if it is non-zero. However, we do not make the non-zero return value available to the caller, so they have no way of knowing whether the operation succeeded or not (technically they can keep their own error flag in the callback data, but that is unlike our other for_each functions). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-13test-lib.sh: support -x option for shell-tracingLibravatar Jeff King2-4/+44
Usually running a test under "-v" makes it clear which command is failing. However, sometimes it can be useful to also see a complete trace of the shell commands being run in the test. You can do so without any support from the test suite by running "sh -x tXXXX-foo.sh". However, this produces quite a large bit of output, as we see a trace of the entire test suite. This patch instead introduces a "-x" option to the test scripts (i.e., "./tXXXX-foo.sh -x"). When enabled, this turns on "set -x" only for the tests themselves. This can still be a bit verbose, but should keep things to a more manageable level. You can even use "--verbose-only" to see the trace only for a specific test. The implementation is a little invasive. We turn on the "set -x" inside the "eval" of the test code. This lets the eval itself avoid being reported in the trace (which would be long, and redundant with the verbose listing we already showed). And then after the eval runs, we do some trickery with stderr to avoid showing the "set +x" to the user. We also show traces for test_cleanup functions (since they can impact the test outcome, too). However, we do avoid running the noop ":" cleanup (the default if the test does not use test_cleanup at all), as it creates unnecessary noise in the "set -x" output. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-13t5304: use helper to report failure of "test foo = bar"Libravatar Jeff King2-8/+17
For small outputs, we sometimes use: test "$(some_cmd)" = "something we expect" instead of a full test_cmp. The downside of this is that when it fails, there is no output at all from the script. Let's introduce a small helper to make tests easier to debug. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-13t5304: use test_path_is_* instead of "test -f"Libravatar Jeff King1-23/+23
This is slightly more robust (checking "! test -f" would not notice a directory of the same name, though that is not likely to happen here). It also makes debugging easier, as the test script will output a message on failure. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-08Update draft release notes to 2.2Libravatar Junio C Hamano1-0/+13
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-08Merge branch 'sp/stream-clean-filter'Libravatar Junio C Hamano10-52/+164
When running a required clean filter, we do not have to mmap the original before feeding the filter. Instead, stream the file contents directly to the filter and process its output. * sp/stream-clean-filter: sha1_file: don't convert off_t to size_t too early to avoid potential die() convert: stream from fd to required clean filter to reduce used address space copy_fd(): do not close the input file descriptor mmap_limit: introduce GIT_MMAP_LIMIT to allow testing expected mmap size memory_limit: use git_env_ulong() to parse GIT_ALLOC_LIMIT config.c: add git_env_ulong() to parse environment variable convert: drop arguments other than 'path' from would_convert_to_git()
2014-10-08Merge branch 'bw/use-write-script-in-tests'Libravatar Junio C Hamano1-3/+1
* bw/use-write-script-in-tests: t/lib-credential: use write_script
2014-10-08Merge branch 'nd/archive-pathspec'Libravatar Junio C Hamano2-3/+108
"git archive" learned to filter what gets archived with pathspec. * nd/archive-pathspec: archive: support filtering paths with glob
2014-10-08Merge branch 'jc/push-cert'Libravatar Junio C Hamano23-159/+932
Allow "git push" request to be signed, so that it can be verified and audited, using the GPG signature of the person who pushed, that the tips of branches at a public repository really point the commits the pusher wanted to, without having to "trust" the server. * jc/push-cert: (24 commits) receive-pack::hmac_sha1(): copy the entire SHA-1 hash out signed push: allow stale nonce in stateless mode signed push: teach smart-HTTP to pass "git push --signed" around signed push: fortify against replay attacks signed push: add "pushee" header to push certificate signed push: remove duplicated protocol info send-pack: send feature request on push-cert packet receive-pack: GPG-validate push certificates push: the beginning of "git push --signed" pack-protocol doc: typofix for PKT-LINE gpg-interface: move parse_signature() to where it should be gpg-interface: move parse_gpg_output() to where it should be send-pack: clarify that cmds_sent is a boolean send-pack: refactor inspecting and resetting status and sending commands send-pack: rename "new_refs" to "need_pack_data" receive-pack: factor out capability string generation send-pack: factor out capability string generation send-pack: always send capabilities send-pack: refactor decision to send update per ref send-pack: move REF_STATUS_REJECT_NODELETE logic a bit higher ...
2014-10-07Sync with maintLibravatar Junio C Hamano1-1/+1
* maint: git-tag.txt: Add a missing hyphen to `-s`
2014-10-07Merge branch 'maint-2.0' into maintLibravatar Junio C Hamano1-1/+1
* maint-2.0: git-tag.txt: Add a missing hyphen to `-s`
2014-10-07Merge branch 'maint-1.9' into maint-2.0Libravatar Junio C Hamano1-1/+1
* maint-1.9: git-tag.txt: Add a missing hyphen to `-s`
2014-10-07Merge branch 'maint-1.8.5' into maint-1.9Libravatar Junio C Hamano1-1/+1
* maint-1.8.5: git-tag.txt: Add a missing hyphen to `-s`
2014-10-07Merge branch 'jk/mbox-from-line' into maintLibravatar Junio C Hamano6-1/+66
Some MUAs mangled a line in a message that begins with "From " to ">From " when writing to a mailbox file and feeding such an input to "git am" used to lose such a line. * jk/mbox-from-line: mailinfo: work around -Wstring-plus-int warning mailinfo: make ">From" in-body header check more robust
2014-10-07git-tag.txt: Add a missing hyphen to `-s`Libravatar Wieland Hoffmann1-1/+1
Signed-off-by: Wieland Hoffmann <themineo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-29Sync with 2.1.2Libravatar Junio C Hamano2-1/+22
* maint: Git 2.1.2
2014-09-29Merge branch 'jt/itimer-autoconf'Libravatar Junio C Hamano2-1/+15
setitmer(2) and related API elements can be configured from Makefile but autoconf did not know about it. * jt/itimer-autoconf: autoconf: check for setitimer() autoconf: check for struct itimerval git-compat-util.h: add missing semicolon after struct itimerval
2014-09-29Merge branch 'jc/test-lazy-prereq'Libravatar Junio C Hamano2-4/+0
Test-script clean-up. * jc/test-lazy-prereq: tests: drop GIT_*_TIMING_TESTS environment variable support
2014-09-29Merge branch 'sb/merge-recursive-copy-paste-fix'Libravatar Junio C Hamano1-5/+1
"git merge-recursive" had a small bug that could have made it mishandle "one side deleted, the other side did not touch it" in a rare corner case, where the other side actually did touch to cause the blob object names to be different but both blobs before and after the change normalize to the same (e.g. correcting mistake to check in a blob with CRLF line endings by replacing it with another blob that records the same contents with LF line endings). * sb/merge-recursive-copy-paste-fix: merge-recursive: remove stale commented debugging code merge-recursive: fix copy-paste mistake
2014-09-29Merge branch 'pr/use-default-sigpipe-setting'Libravatar Junio C Hamano3-1/+50
We used to get confused when a process called us with SIGPIPE ignored; we do want to die with SIGPIPE when the output is not read by default, and do ignore the signal when appropriate. * pr/use-default-sigpipe-setting: mingw.h: add dummy functions for sigset_t operations unblock and unignore SIGPIPE
2014-09-29Git 2.1.2Libravatar Junio C Hamano4-3/+24
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-29Merge branch 'jk/fsck-exit-code-fix' into maintLibravatar Junio C Hamano3-9/+87
"git fsck" failed to report that it found corrupt objects via its exit status in some cases. * jk/fsck-exit-code-fix: fsck: return non-zero status on missing ref tips fsck: exit with non-zero status upon error from fsck_obj()
2014-09-29Merge branch 'ta/config-add-to-empty-or-true-fix' into maintLibravatar Junio C Hamano4-7/+41
"git config --add section.var val" used to lose existing section.var whose value was an empty string. * ta/config-add-to-empty-or-true-fix: config: avoid a funny sentinel value "a^" make config --add behave correctly for empty and NULL values
2014-09-29Merge branch 'mk/reachable-protect-detached-head' into maintLibravatar Junio C Hamano2-0/+25
Reachability check (used in "git prune" and friends) did not add a detached HEAD as a starting point to traverse objects still in use. * mk/reachable-protect-detached-head: reachable.c: add HEAD to reachability starting commits
2014-09-29Merge branch 'mb/fast-import-delete-root' into maintLibravatar Junio C Hamano2-1/+109
An attempt to remove the entire tree in the "git fast-import" input stream caused it to misbehave. * mb/fast-import-delete-root: fast-import: fix segfault in store_tree() t9300: test filedelete command
2014-09-29Merge branch 'jk/index-pack-threading-races' into maintLibravatar Junio C Hamano1-2/+31
When receiving an invalid pack stream that records the same object twice, multiple threads got confused due to a race. * jk/index-pack-threading-races: index-pack: fix race condition with duplicate bases
2014-09-29Merge branch 'jk/send-pack-many-refspecs' into maintLibravatar Junio C Hamano5-2/+153
"git push" over HTTP transport had an artificial limit on number of refs that can be pushed imposed by the command line length. * jk/send-pack-many-refspecs: send-pack: take refspecs over stdin
2014-09-29Merge branch 'so/rebase-doc' into maintLibravatar Junio C Hamano1-6/+3
* so/rebase-doc: Documentation/git-rebase.txt: <upstream> must be given to specify <branch> Documentation/git-rebase.txt: -f forces a rebase that would otherwise be a no-op
2014-09-29Update draft release notes to 2.2Libravatar Junio C Hamano1-0/+12
2014-09-29Merge branch 'jk/mbox-from-line'Libravatar Junio C Hamano6-1/+66
Some MUAs mangled a line in a message that begins with "From " to ">From " when writing to a mailbox file and feeding such an input to "git am" used to lose such a line. * jk/mbox-from-line: mailinfo: work around -Wstring-plus-int warning mailinfo: make ">From" in-body header check more robust
2014-09-29Merge branch 'sb/t6031-typofix'Libravatar Junio C Hamano1-0/+1
* sb/t6031-typofix: t6031-test-merge-recursive: do not forget to add file to be committed
2014-09-29Merge branch 'sb/t9300-typofix'Libravatar Junio C Hamano1-1/+1
* sb/t9300-typofix: t9300-fast-import: fix typo in test description
2014-09-29Merge branch 'rs/remote-simplify'Libravatar Junio C Hamano1-12/+5
* rs/remote-simplify: remote: simplify match_name_with_pattern() using strbuf