summaryrefslogtreecommitdiff
path: root/t
AgeCommit message (Collapse)AuthorFilesLines
2014-10-08Merge branch 'jc/push-cert'Libravatar Junio C Hamano4-1/+171
Allow "git push" request to be signed, so that it can be verified and audited, using the GPG signature of the person who pushed, that the tips of branches at a public repository really point the commits the pusher wanted to, without having to "trust" the server. * jc/push-cert: (24 commits) receive-pack::hmac_sha1(): copy the entire SHA-1 hash out signed push: allow stale nonce in stateless mode signed push: teach smart-HTTP to pass "git push --signed" around signed push: fortify against replay attacks signed push: add "pushee" header to push certificate signed push: remove duplicated protocol info send-pack: send feature request on push-cert packet receive-pack: GPG-validate push certificates push: the beginning of "git push --signed" pack-protocol doc: typofix for PKT-LINE gpg-interface: move parse_signature() to where it should be gpg-interface: move parse_gpg_output() to where it should be send-pack: clarify that cmds_sent is a boolean send-pack: refactor inspecting and resetting status and sending commands send-pack: rename "new_refs" to "need_pack_data" receive-pack: factor out capability string generation send-pack: factor out capability string generation send-pack: always send capabilities send-pack: refactor decision to send update per ref send-pack: move REF_STATUS_REJECT_NODELETE logic a bit higher ...
2014-09-29Merge branch 'jc/test-lazy-prereq'Libravatar Junio C Hamano2-4/+0
Test-script clean-up. * jc/test-lazy-prereq: tests: drop GIT_*_TIMING_TESTS environment variable support
2014-09-29Merge branch 'pr/use-default-sigpipe-setting'Libravatar Junio C Hamano1-0/+22
We used to get confused when a process called us with SIGPIPE ignored; we do want to die with SIGPIPE when the output is not read by default, and do ignore the signal when appropriate. * pr/use-default-sigpipe-setting: mingw.h: add dummy functions for sigset_t operations unblock and unignore SIGPIPE
2014-09-29Merge branch 'jk/mbox-from-line'Libravatar Junio C Hamano5-0/+49
Some MUAs mangled a line in a message that begins with "From " to ">From " when writing to a mailbox file and feeding such an input to "git am" used to lose such a line. * jk/mbox-from-line: mailinfo: work around -Wstring-plus-int warning mailinfo: make ">From" in-body header check more robust
2014-09-29Merge branch 'sb/t6031-typofix'Libravatar Junio C Hamano1-0/+1
* sb/t6031-typofix: t6031-test-merge-recursive: do not forget to add file to be committed
2014-09-29Merge branch 'sb/t9300-typofix'Libravatar Junio C Hamano1-1/+1
* sb/t9300-typofix: t9300-fast-import: fix typo in test description
2014-09-29Merge branch 'da/rev-parse-verify-quiet'Libravatar Junio C Hamano1-5/+32
"rev-parse --verify --quiet $name" is meant to quietly exit with a non-zero status when $name is not a valid object name, but still gave error messages in some cases. * da/rev-parse-verify-quiet: stash: prefer --quiet over shell redirection of the standard error stream refs: make rev-parse --quiet actually quiet t1503: use test_must_be_empty Documentation: a note about stdout for git rev-parse --verify --quiet
2014-09-29Merge branch 'hj/pretty-naked-decoration'Libravatar Junio C Hamano1-0/+11
The pretty-format specifier "%d", which expanded to " (tagname)" for a tagged commit, gained a cousin "%D" that just gives the "tagname" without frills. * hj/pretty-naked-decoration: pretty: add %D format specifier
2014-09-26Merge branch 'jk/branch-verbose-merged'Libravatar Junio C Hamano1-0/+29
The "--verbose" option no longer breaks "git branch --merged $it". * jk/branch-verbose-merged: branch: clean up commit flags after merge-filter walk
2014-09-26Merge branch 'jc/ignore-sigpipe-while-running-hooks'Libravatar Junio C Hamano1-0/+13
pre- and post-receive hooks are no longer required to read all their inputs. * jc/ignore-sigpipe-while-running-hooks: receive-pack: allow hooks to ignore its standard input stream
2014-09-26Merge branch 'jc/hash-object-fsck-tag'Libravatar Junio C Hamano1-0/+19
Using "hash-object --literally", test one of the new breakages js/fsck-tag-validation topic teaches "fsck" to catch is caught. * jc/hash-object-fsck-tag: t1450: make sure fsck detects a malformed tagger line
2014-09-26Merge branch 'js/fsck-tag-validation'Libravatar Junio C Hamano2-0/+38
Teach "git fsck" to inspect the contents of annotated tag objects. * js/fsck-tag-validation: Make sure that index-pack --strict checks tag objects Add regression tests for stricter tag fsck'ing fsck: check tag objects' headers Make sure fsck_commit_buffer() does not run out of the buffer fsck_object(): allow passing object data separately from the object itself Refactor type_from_string() to allow continuing after detecting an error
2014-09-26Merge branch 'jk/faster-name-conflicts'Libravatar Junio C Hamano1-1/+30
Optimize the check to see if a ref $F can be created by making sure no existing ref has $F/ as its prefix, which especially matters in a repository with a large number of existing refs. * jk/faster-name-conflicts: refs: speed up is_refname_available
2014-09-22mingw.h: add dummy functions for sigset_t operationsLibravatar Johannes Sixt1-2/+2
Windows does not have POSIX-like signals, and so we ignore all operations on the non-existent signal mask machinery. Do not turn sigemptyset into a function, but leave it a macro that erases the code in the argument because it is used to set sa_mask of a struct sigaction, but our dummy in mingw.h does not have that member. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-22t6031-test-merge-recursive: do not forget to add file to be committedLibravatar Stefan Beller1-0/+1
Signed-off-by: Stefan Beller <stefanbeller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-22t9300-fast-import: fix typo in test descriptionLibravatar Stefan Beller1-1/+1
Signed-off-by: Stefan Beller <stefanbeller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-19Merge branch 'jk/fsck-exit-code-fix'Libravatar Junio C Hamano2-7/+83
"git fsck" failed to report that it found corrupt objects via its exit status in some cases. * jk/fsck-exit-code-fix: fsck: return non-zero status on missing ref tips fsck: exit with non-zero status upon error from fsck_obj()
2014-09-19Merge branch 'js/no-test-cmp-for-binaries'Libravatar Junio C Hamano1-1/+1
* js/no-test-cmp-for-binaries: t9300: use test_cmp_bin instead of test_cmp to compare binary files
2014-09-19Merge branch 'ta/config-add-to-empty-or-true-fix'Libravatar Junio C Hamano1-0/+20
"git config --add section.var val" used to lose existing section.var whose value was an empty string. * ta/config-add-to-empty-or-true-fix: config: avoid a funny sentinel value "a^" make config --add behave correctly for empty and NULL values
2014-09-19Merge branch 'jc/parseopt-verify-short-name'Libravatar Junio C Hamano1-2/+2
Add checks for a common programming mistake to assign the same short option name to two separate options to help developers. * jc/parseopt-verify-short-name: parse-options: detect attempt to add a duplicate short option name
2014-09-19Merge branch 'mk/reachable-protect-detached-head'Libravatar Junio C Hamano1-0/+22
* mk/reachable-protect-detached-head: reachable.c: add HEAD to reachability starting commits
2014-09-19Merge branch 'tb/crlf-tests'Libravatar Junio C Hamano4-120/+175
* tb/crlf-tests: MinGW: update tests to handle a native eol of crlf Makefile: propagate NATIVE_CRLF to C t0027: Tests for core.eol=native, eol=lf, eol=crlf
2014-09-19Merge branch 'mb/fast-import-delete-root'Libravatar Junio C Hamano1-0/+104
An attempt to remove the entire tree in the "git fast-import" input stream caused it to misbehave. * mb/fast-import-delete-root: fast-import: fix segfault in store_tree() t9300: test filedelete command
2014-09-19Merge branch 'bb/date-iso-strict'Libravatar Junio C Hamano1-0/+8
"log --date=iso" uses a slight variant of ISO 8601 format that is made more human readable. A new "--date=iso-strict" option gives datetime output that is more strictly conformant. * bb/date-iso-strict: pretty: provide a strict ISO 8601 date format
2014-09-19Merge branch 'jk/fast-export-anonymize'Libravatar Junio C Hamano1-0/+112
Sometimes users want to report a bug they experience on their repository, but they are not at liberty to share the contents of the repository. "fast-export" was taught an "--anonymize" option to replace blob contents, names of people and paths and log messages with bland and simple strings to help them. * jk/fast-export-anonymize: docs/fast-export: explain --anonymize more completely teach fast-export an --anonymize option
2014-09-19Merge branch 'jk/send-pack-many-refspecs'Libravatar Junio C Hamano2-0/+107
The number of refs that can be pushed at once over smart HTTP was limited by the command line length. The limitation has been lifted by passing these refs from the standard input of send-pack. * jk/send-pack-many-refspecs: send-pack: take refspecs over stdin
2014-09-19refs: make rev-parse --quiet actually quietLibravatar David Aguilar1-0/+27
When a reflog is deleted, e.g. when "git stash" clears its stashes, "git rev-parse --verify --quiet" dies: fatal: Log for refs/stash is empty. The reason is that the get_sha1() code path does not allow us to suppress this message. Pass the flags bitfield through get_sha1_with_context() so that read_ref_at() can suppress the message. Use get_sha1_with_context1() instead of get_sha1() in rev-parse so that the --quiet flag is honored. Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-18pretty: add %D format specifierLibravatar Harry Jeffery1-0/+11
Add a new format specifier, '%D' that is identical in behaviour to '%d', except that it does not include the ' (' prefix or ')' suffix provided by '%d'. Signed-off-by: Harry Jeffery <harry@exec64.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-18unblock and unignore SIGPIPELibravatar Patrick Reynolds1-0/+22
Blocked and ignored signals -- but not caught signals -- are inherited across exec. Some callers with sloppy signal-handling behavior can call git with SIGPIPE blocked or ignored, even non-deterministically. When SIGPIPE is blocked or ignored, several git commands can run indefinitely, ignoring EPIPE returns from write() calls, even when the process that called them has gone away. Our specific case involved a pipe of git diff-tree output to a script that reads a limited amount of diff data. In an ideal world, git would never be called with SIGPIPE blocked or ignored. But in the real world, several real potential callers, including Perl, Apache, and Unicorn, sometimes spawn subprocesses with SIGPIPE ignored. It is easier and more productive to harden git against this mistake than to clean it up in every potential parent process. Signed-off-by: Patrick Reynolds <patrick.reynolds@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-18branch: clean up commit flags after merge-filter walkLibravatar Jeff King1-0/+29
When we run `branch --merged`, we use prepare_revision_walk with the merge-filter marked as UNINTERESTING. Any branch tips that are marked UNINTERESTING after it returns must be ancestors of that commit. As we iterate through the list of refs to show, we check item->commit->object.flags to see whether it was marked. This interacts badly with --verbose, which will do a separate walk to find the ahead/behind information for each branch. There are two bad things that can happen: 1. The ahead/behind walk may get the wrong results, because it can see a bogus UNINTERESTING flag leftover from the merge-filter walk. 2. We may omit some branches if their tips are involved in the ahead/behind traversal of a branch shown earlier. The ahead/behind walk carefully cleans up its commit flags, meaning it may also erase the UNINTERESTING flag that we expect to check later. We can solve this by moving the merge-filter state for each ref into its "struct ref_item" as soon as we finish the merge-filter walk. That fixes (2). Then we are free to clear the commit flags we used in the walk, fixing (1). Note that we actually do away with the matches_merge_filter helper entirely here, and inline it between the revision walk and the flag-clearing. This ensures that nobody accidentally calls it at the wrong time (it is only safe to check in that instant between the setting and clearing of the global flag). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-17signed push: allow stale nonce in stateless modeLibravatar Junio C Hamano1-2/+7
When operating with the stateless RPC mode, we will receive a nonce issued by another instance of us that advertised our capability and refs some time ago. Update the logic to check received nonce to detect this case, compute how much time has passed since the nonce was issued and report the status with a new environment variable GIT_PUSH_CERT_NONCE_SLOP to the hooks. GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case. The hooks are free to decide how large a slop it is willing to accept. Strictly speaking, the "nonce" is not really a "nonce" anymore in the stateless RPC mode, as it will happily take any "nonce" issued by it (which is protected by HMAC and its secret key) as long as it is fresh enough. The degree of this security degradation, relative to the native protocol, is about the same as the "we make sure that the 'git push' decided to update our refs with new objects based on the freshest observation of our refs by making sure the values they claim the original value of the refs they ask us to update exactly match the current state" security is loosened to accomodate the stateless RPC mode in the existing code without this series, so there is no need for those who are already using smart HTTP to push to their repositories to be alarmed any more than they already are. In addition, the server operator can set receive.certnonceslop configuration variable to specify how stale a nonce can be (in seconds). When this variable is set, and if the nonce received in the certificate that passes the HMAC check was less than that many seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS (instead of "SLOP") and the received nonce value is given in GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded hook to check if the certificate we received is recent enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-17signed push: teach smart-HTTP to pass "git push --signed" aroundLibravatar Junio C Hamano3-1/+39
The "--signed" option received by "git push" is first passed to the transport layer, which the native transport directly uses to notice that a push certificate needs to be sent. When the transport-helper is involved, however, the option needs to be told to the helper with set_helper_option(), and the helper needs to take necessary action. For the smart-HTTP helper, the "necessary action" involves spawning the "git send-pack" subprocess with the "--signed" option. Once the above all gets wired in, the smart-HTTP transport now can use the push certificate mechanism to authenticate its pushes. Add a test that is modeled after tests for the native transport in t5534-push-signed.sh to t5541-http-push-smart.sh. Update the test Apache configuration to pass GNUPGHOME environment variable through. As PassEnv would trigger warnings for an environment variable that is not set, export it from test-lib.sh set to a harmless value when GnuPG is not being used in the tests. Note that the added test is deliberately loose and does not check the nonce in this step. This is because the stateless RPC mode is inevitably flaky and a nonce that comes back in the actual push processing is one issued by a different process; if the two interactions with the server crossed a second boundary, the nonces will not match and such a check will fail. A later patch in the series will work around this shortcoming. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-17signed push: fortify against replay attacksLibravatar Junio C Hamano1-8/+14
In order to prevent a valid push certificate for pushing into an repository from getting replayed in a different push operation, send a nonce string from the receive-pack process and have the signer include it in the push certificate. The receiving end uses an HMAC hash of the path to the repository it serves and the current time stamp, hashed with a secret seed (the secret seed does not have to be per-repository but can be defined in /etc/gitconfig) to generate the nonce, in order to ensure that a random third party cannot forge a nonce that looks like it originated from it. The original nonce is exported as GIT_PUSH_CERT_NONCE for the hooks to examine and match against the value on the "nonce" header in the certificate to notice a replay, but returned "nonce" header in the push certificate is examined by receive-pack and the result is exported as GIT_PUSH_CERT_NONCE_STATUS, whose value would be "OK" if the nonce recorded in the certificate matches what we expect, so that the hooks can more easily check. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-16receive-pack: allow hooks to ignore its standard input streamLibravatar Junio C Hamano1-0/+13
The pre-receive and post-receive hooks were designed to be an improvement over old style update and post-update hooks, which take the update information on their command line and are limited by the command line length limit. The same information is fed from the standard input to pre/post-receive hooks instead to lift this limitation. It has been mandatory for these new style hooks to consume the update information fully from the standard input stream. Otherwise, they would risk killing the receive-pack process via SIGPIPE. If a hook does not want to look at all the information, it is easy to send its standard input to /dev/null (perhaps a niche use of hook might need to know only the fact that a push was made, without having to know what objects have been pushed to update which refs), and this has already been done by existing hooks that are written carefully. However, because there is no good way to consistently fail hooks that do not consume the input fully (a small push may result in a short update record that may fit within the pipe buffer, to which the receive-pack process may manage to write before the hook has a chance to exit without reading anything, which will not result in a death-by-SIGPIPE of receive-pack), it can lead to a hard to diagnose "once in a blue moon" phantom failure. Lift this "hooks must consume their input fully" mandate. A mandate that is not enforced strictly is not helping us to catch mistakes in hooks. If a hook has a good reason to decide the outcome of its operation without reading the information we feed it, let it do so as it pleases. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-16t1503: use test_must_be_emptyLibravatar David Aguilar1-5/+5
Use `test_must_be_be_empty <file>` instead of `test -z "$(cat <file>)"`. Suggested-by: Fabian Ruch <bafain@gmail.com> Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-16mailinfo: make ">From" in-body header check more robustLibravatar Jeff King5-0/+49
Since commit 81c5cf7 (mailinfo: skip bogus UNIX From line inside body, 2006-05-21), we have treated lines like ">From" in the body as headers. This makes "git am" work for people who erroneously paste the whole output from format-patch: From 12345abcd...fedcba543210 Mon Sep 17 00:00:00 2001 From: them Subject: [PATCH] whatever into their email body (assuming that an mbox writer then quotes "From" as ">From", as otherwise we would actually mailsplit on the in-body line). However, this has false positives if somebody actually has a commit body that starts with "From "; in this case we erroneously remove the line entirely from the commit message. We can make this check more robust by making sure the line actually looks like a real mbox "From" line. Inspect the line that begins with ">From " a more carefully to only skip lines that match the expected pattern (note that the datestamp part of the format-patch output is designed to be kept constant to help those who write magic(5) entries). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-15send-pack: send feature request on push-cert packetLibravatar Junio C Hamano1-0/+13
We would want to update the interim protocol so that we do not send the usual update commands when the push certificate feature is in use, as the same information is in the certificate. Once that happens, the push-cert packet may become the only protocol command, but then there is no packet to put the feature request behind, like we always did. As we have prepared the receiving end that understands the push-cert feature to accept the feature request on the first protocol packet (other than "shallow ", which was an unfortunate historical mistake that has to come before everything else), we can give the feature request on the push-cert packet instead of the first update protocol packet, in preparation for the next step to actually update to the final protocol. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-15receive-pack: GPG-validate push certificatesLibravatar Junio C Hamano1-2/+16
Reusing the GPG signature check helpers we already have, verify the signature in receive-pack and give the results to the hooks via GIT_PUSH_CERT_{SIGNER,KEY,STATUS} environment variables. Policy decisions, such as accepting or rejecting a good signature by a key that is not fully trusted, is left to the hook and kept outside of the core. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-15push: the beginning of "git push --signed"Libravatar Junio C Hamano1-0/+94
While signed tags and commits assert that the objects thusly signed came from you, who signed these objects, there is not a good way to assert that you wanted to have a particular object at the tip of a particular branch. My signing v2.0.1 tag only means I want to call the version v2.0.1, and it does not mean I want to push it out to my 'master' branch---it is likely that I only want it in 'maint', so the signature on the object alone is insufficient. The only assurance to you that 'maint' points at what I wanted to place there comes from your trust on the hosting site and my authentication with it, which cannot easily audited later. Introduce a mechanism that allows you to sign a "push certificate" (for the lack of better name) every time you push, asserting that what object you are pushing to update which ref that used to point at what other object. Think of it as a cryptographic protection for ref updates, similar to signed tags/commits but working on an orthogonal axis. The basic flow based on this mechanism goes like this: 1. You push out your work with "git push --signed". 2. The sending side learns where the remote refs are as usual, together with what protocol extension the receiving end supports. If the receiving end does not advertise the protocol extension "push-cert", an attempt to "git push --signed" fails. Otherwise, a text file, that looks like the following, is prepared in core: certificate version 0.1 pusher Junio C Hamano <gitster@pobox.com> 1315427886 -0700 7339ca65... 21580ecb... refs/heads/master 3793ac56... 12850bec... refs/heads/next The file begins with a few header lines, which may grow as we gain more experience. The 'pusher' header records the name of the signer (the value of user.signingkey configuration variable, falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the certificate generation. After the header, a blank line follows, followed by a copy of the protocol message lines. Each line shows the old and the new object name at the tip of the ref this push tries to update, in the way identical to how the underlying "git push" protocol exchange tells the ref updates to the receiving end (by recording the "old" object name, the push certificate also protects against replaying). It is expected that new command packet types other than the old-new-refname kind will be included in push certificate in the same way as would appear in the plain vanilla command packets in unsigned pushes. The user then is asked to sign this push certificate using GPG, formatted in a way similar to how signed tag objects are signed, and the result is sent to the other side (i.e. receive-pack). In the protocol exchange, this step comes immediately before the sender tells what the result of the push should be, which in turn comes before it sends the pack data. 3. When the receiving end sees a push certificate, the certificate is written out as a blob. The pre-receive hook can learn about the certificate by checking GIT_PUSH_CERT environment variable, which, if present, tells the object name of this blob, and make the decision to allow or reject this push. Additionally, the post-receive hook can also look at the certificate, which may be a good place to log all the received certificates for later audits. Because a push certificate carry the same information as the usual command packets in the protocol exchange, we can omit the latter when a push certificate is in use and reduce the protocol overhead. This however is not included in this patch to make it easier to review (in other words, the series at this step should never be released without the remainder of the series, as it implements an interim protocol that will be incompatible with the final one). As such, the documentation update for the protocol is left out of this step. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-14git svn: info: correctly handle absolute path argsLibravatar Eric Wong1-0/+30
Calling "git svn info $(pwd)" would hit: "Reading from filehandle failed at ..." errors due to improper prefixing and canonicalization. Strip the toplevel path from absolute filesystem paths to ensure downstream canonicalization routines are only exposed to paths tracked in git (or SVN). v2: Thanks to Andrej Manduch for originally noticing the issue and fixing my original version of this to handle more corner cases such as "/path/to/top/../top" and "/path/to/top/../top/file" as shown in the new test cases. v3: Fix pathname portability problems pointed out by Johannes Sixt with a hint from brian m. carlson. Cc: Johannes Sixt <j6t@kdbg.org> Cc: "brian m. carlson" <sandals@crustytoothpaste.net> Signed-off-by: Andrej Manduch <amanduch@gmail.com> Signed-off-by: Eric Wong <normalperson@yhbt.net>
2014-09-12t9300: use test_cmp_bin instead of test_cmp to compare binary filesLibravatar Johannes Sixt1-1/+1
test_cmp is intended to produce diff output for human consumption. The input in one instance in t9300-fast-import.sh are binary files, however. Use test_cmp_bin to compare the files. This was noticed because on Windows we have a special implementation of test_cmp in pure bash code (to ignore differences due to intermittent CR in actual output), and bash runs into an infinite loop due to the binary nature of the input. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-12refs: speed up is_refname_availableLibravatar Jeff King1-1/+30
Our filesystem ref storage does not allow D/F conflicts; so if "refs/heads/a/b" exists, we do not allow "refs/heads/a" to exist (and vice versa). This falls out naturally for loose refs, where the filesystem enforces the condition. But for packed-refs, we have to make the check ourselves. We do so by iterating over the entire packed-refs namespace and checking whether each name creates a conflict. If you have a very large number of refs, this is quite inefficient, as you end up doing a large number of comparisons with uninteresting bits of the ref tree (e.g., we know that all of "refs/tags" is uninteresting in the example above, yet we check each entry in it). Instead, let's take advantage of the fact that we have the packed refs stored as a trie of ref_entry structs. We can find each component of the proposed refname as we walk through the trie, checking for D/F conflicts as we go. For a refname of depth N (i.e., 4 in the above example), we only have to visit N nodes. And at each visit, we can binary search the M names at that level, for a total complexity of O(N lg M). ("M" is different at each level, of course, but we can take the worst-case "M" as a bound). In a pathological case of fetching 30,000 fresh refs into a repository with 8.5 million refs, this dropped the time to run "git fetch" from tens of minutes to ~30s. This may also help smaller cases in which we check against loose refs (which we do when renaming a ref), as we may avoid a disk access for unrelated loose directories. Note that the tests we add appear at first glance to be redundant with what is already in t3210. However, the early tests are not robust; they are run with reflogs turned on, meaning that we are not actually testing is_refname_available at all! The operations will still fail because the reflogs will hit D/F conflicts in the filesystem. To get a true test, we must turn off reflogs (but we don't want to do so for the entire script, because the point of turning them on was to cover some other cases). Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-12t1450: make sure fsck detects a malformed tagger lineLibravatar Junio C Hamano1-0/+19
With "hash-object --literally", write a tag object that is not supposed to pass one of the new checks added to "fsck", and make sure that the new check catches the breakage. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-12Merge branch 'js/fsck-tag-validation' into HEADLibravatar Junio C Hamano2-0/+38
* js/fsck-tag-validation: Make sure that index-pack --strict checks tag objects Add regression tests for stricter tag fsck'ing fsck: check tag objects' headers Make sure fsck_commit_buffer() does not run out of the buffer fsck_object(): allow passing object data separately from the object itself Refactor type_from_string() to allow continuing after detecting an error
2014-09-12Make sure that index-pack --strict checks tag objectsLibravatar Johannes Schindelin1-0/+19
One of the most important use cases for the strict tag object checking is when transfer.fsckobjects is set to true to catch invalid objects early on. This new regression test essentially tests the same code path by directly calling 'index-pack --strict' on a pack containing an tag object without a 'tagger' line. Technically, this test is not enough: it only exercises a code path that *warns*, not one that *fails*. The reason is that hash-object and pack-objects both insist on parsing the tag objects and would fail on invalid tag objects at this time. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-12fsck: return non-zero status on missing ref tipsLibravatar Jeff King1-0/+56
Fsck tries hard to detect missing objects, and will complain (and exit non-zero) about any inter-object links that are missing. However, it will not exit non-zero for any missing ref tips, meaning that a severely broken repository may still pass "git fsck && echo ok". The problem is that we use for_each_ref to iterate over the ref tips, which hides broken tips. It does at least print an error from the refs.c code, but fsck does not ever see the ref and cannot note the problem in its exit code. We can solve this by using for_each_rawref and noting the error ourselves. In addition to adding tests for this case, we add tests for all types of missing-object links (all of which worked, but which we were not testing). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-11Add regression tests for stricter tag fsck'ingLibravatar Johannes Schindelin1-0/+19
The intent of the new test case is to catch general breakages in the fsck_tag() function, not so much to test it extensively, trying to strike the proper balance between thoroughness and speed. While it *would* have been nice to test the code path where fsck_object() encounters an invalid tag object, this is not possible using git fsck: tag objects are parsed already before fsck'ing (and the parser already fails upon such objects). Even worse: we would not even be able write out invalid tag objects because git hash-object parses those objects, too, unless we resorted to really ugly hacks such as using something like this in the unit tests (essentially depending on Perl *and* Compress::Zlib): hash_invalid_object () { contents="$(printf '%s %d\0%s' "$1" ${#2} "$2")" && sha1=$(echo "$contents" | test-sha1) && suffix=${sha1#??} && mkdir -p .git/objects/${sha1%$suffix} && echo "$contents" | perl -MCompress::Zlib -e 'undef $/; print compress(<>)' \ > .git/objects/${sha1%$suffix}/$suffix && echo $sha1 } Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-11Merge branch 'jn/unpack-trees-checkout-m-carry-deletion'Libravatar Junio C Hamano1-0/+17
"git checkout -m" did not switch to another branch while carrying the local changes forward when a path was deleted from the index. * jn/unpack-trees-checkout-m-carry-deletion: checkout -m: attempt merge when deletion of path was staged unpack-trees: use 'cuddled' style for if-else cascade unpack-trees: simplify 'all other failures' case
2014-09-11Merge branch 'jk/prune-top-level-refs-after-packing'Libravatar Junio C Hamano1-0/+7
After "pack-refs --prune" packed refs at the top-level, it failed to prune them. * jk/prune-top-level-refs-after-packing: pack-refs: prune top-level refs like "refs/foo"
2014-09-11Merge branch 'nd/large-blobs'Libravatar Junio C Hamano1-0/+20
Teach a few codepaths to punt (instead of dying) when large blobs that would not fit in core are involved in the operation. * nd/large-blobs: diff: shortcut for diff'ing two binary SHA-1 objects diff --stat: mark any file larger than core.bigfilethreshold binary diff.c: allow to pass more flags to diff_populate_filespec sha1_file.c: do not die failing to malloc in unpack_compressed_entry wrapper.c: introduce gentle xmallocz that does not die()