path: root/builtin/fetch-pack.c
Age | Commit message | Author | Files | Lines
2011-12-13 | fetch-pack: match refs exactly | Jeff King | 1 | -4/+9
When we are determining the list of refs to fetch via fetch-pack, we have two sets of refs to compare: those on the remote side, and a "match" list of things we want to fetch. We iterate through the remote refs alphabetically, seeing if each one is wanted by the "match" list.

Since def88e9 (Commit first cut at "git-fetch-pack", 2005-07-04), we have used the "path_match" function to do a suffix match, where a remote ref is considered wanted if any of the "match" elements is a suffix of the remote refname.

This enables callers of fetch-pack to specify unqualified refs and have them matched up with remote refs (e.g., ask for "A" and get remote's "refs/heads/A"). However, if you provide a fully qualified ref, then there are corner cases where we provide the wrong answer. For example, given a remote with two refs:

  refs/foo/refs/heads/master
  refs/heads/master

asking for "refs/heads/master" will first match "refs/foo/refs/heads/master" by the suffix rule, and we will erroneously fetch it instead of refs/heads/master.

As it turns out, all callers of fetch_pack do provide fully-qualified refs for the match list. There are two ways fetch_pack can get match lists:

  1. Through the transport code (i.e., via git-fetch)

  2. On the command-line of git-fetch-pack

In the first case, we will always be providing the names of fully-qualified refs from "struct ref" objects. We will have pre-matched those ref objects already (since we have to handle more advanced matching, like wildcard refspecs), and are just providing a list of the refs whose objects we need.

In the second case, users could in theory be providing non-qualified refs on the command-line. However, the fetch-pack documentation claims that refs should be fully qualified (and has always done so since it was written in 2005).

Let's change this path_match call to simply check for string equality, matching what the callers of fetch_pack are expecting.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
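The corner case described in this commit can be reproduced with a small sketch. The names suffix_match, exact_match and first_match are hypothetical stand-ins, not the functions in fetch-pack.c, and the component-boundary check in suffix_match is an assumption about how a suffix rule would let "A" match "refs/heads/A".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the old suffix rule: `want` matches when it
 * is a suffix of `refname` that begins at a path-component boundary.
 */
static int suffix_match(const char *refname, const char *want)
{
    size_t rlen, wlen;

    assert(refname && want);
    rlen = strlen(refname);
    wlen = strlen(want);
    if (wlen == 0 || wlen > rlen)
        return 0;
    if (strcmp(refname + (rlen - wlen), want) != 0)
        return 0;
    return wlen == rlen || refname[rlen - wlen - 1] == '/';
}

/* The replacement rule: plain string equality. */
static int exact_match(const char *refname, const char *want)
{
    assert(refname && want);
    return strcmp(refname, want) == 0;
}

/*
 * Walk the remote refs in order and return the index of the first one
 * accepted by `match`, or -1 if none is accepted.
 */
static int first_match(const char **refs, size_t nr, const char *want,
                       int (*match)(const char *, const char *))
{
    size_t i;

    for (i = 0; i < nr; i++)
        if (match(refs[i], want))
            return (int)i;
    return -1;
}
```

With the two refs from the commit message in alphabetical order, asking for "refs/heads/master" selects refs/foo/refs/heads/master under suffix_match and refs/heads/master under exact_match.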
2011-12-13 | drop "match" parameter from get_remote_heads | Jeff King | 1 | -1/+1
The get_remote_heads function reads the list of remote refs during a git protocol session. It dates all the way back to def88e9 (Commit first cut at "git-fetch-pack", 2005-07-04). At that time, the idea was to come up with a list of refs we were interested in, and then filter the list as we got it from the remote side. Later, 1baaae5 (Make maximal use of the remote refs, 2005-10-28) stopped filtering at the get_remote_heads layer, letting us use the non-matching refs to find common history. As a result, all callers now simply pass an empty match list (and any future callers will want to do the same). So let's drop these now-useless parameters. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-10 | Merge branch 'mh/check-ref-format-3' | Junio C Hamano | 1 | -1/+1
* mh/check-ref-format-3: (23 commits)
  add_ref(): verify that the refname is formatted correctly
  resolve_ref(): expand documentation
  resolve_ref(): also treat a too-long SHA1 as invalid
  resolve_ref(): emit warnings for improperly-formatted references
  resolve_ref(): verify that the input refname has the right format
  remote: avoid passing NULL to read_ref()
  remote: use xstrdup() instead of strdup()
  resolve_ref(): do not follow incorrectly-formatted symbolic refs
  resolve_ref(): extract a function get_packed_ref()
  resolve_ref(): turn buffer into a proper string as soon as possible
  resolve_ref(): only follow a symlink that contains a valid, normalized refname
  resolve_ref(): use prefixcmp()
  resolve_ref(): explicitly fail if a symlink is not readable
  Change check_refname_format() to reject unnormalized refnames
  Inline function refname_format_print()
  Make collapse_slashes() allocate memory for its result
  Do not allow ".lock" at the end of any refname component
  Refactor check_refname_format()
  Change check_ref_format() to take a flags argument
  Change bad_ref_char() to return a boolean value
  ...
2011-10-05 | Change check_ref_format() to take a flags argument | Michael Haggerty | 1 | -1/+1
Change check_ref_format() to take a flags argument that indicates what is acceptable in the reference name (analogous to "git check-ref-format"'s "--allow-onelevel" and "--refspec-pattern"). This is more convenient for callers and also fixes a failure in the test suite (and likely elsewhere in the code) by enabling "onelevel" and "refspec-pattern" to be allowed independently of each other. Also rename check_ref_format() to check_refname_format() to make it obvious that it deals with refnames rather than references themselves. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
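As an illustration of why a flags argument lets "onelevel" and "refspec-pattern" be allowed independently, here is a deliberately simplified sketch. The flag names and check_name() are hypothetical, and the rules are reduced to counting slashes and '*' characters; the real refname rules are much stricter.

```c
#include <assert.h>

/* Hypothetical flag bits, modelled on the two command-line options. */
#define ALLOW_ONELEVEL  (1 << 0)
#define REFSPEC_PATTERN (1 << 1)

/*
 * Simplified checker: returns 0 if `name` is acceptable under `flags`,
 * -1 otherwise. Only two properties are examined:
 *   - a name without any '/' needs ALLOW_ONELEVEL
 *   - a single '*' needs REFSPEC_PATTERN; more than one is never allowed
 */
static int check_name(const char *name, int flags)
{
    int slashes = 0, stars = 0;
    const char *p;

    assert(name);
    if (!*name)
        return -1;
    for (p = name; *p; p++) {
        if (*p == '/')
            slashes++;
        else if (*p == '*')
            stars++;
    }
    if (!slashes && !(flags & ALLOW_ONELEVEL))
        return -1;
    if (stars > ((flags & REFSPEC_PATTERN) ? 1 : 0))
        return -1;
    return 0;
}
```

Because each property is guarded by its own bit, a caller can pass either flag, both, or neither, which a single yes/no parameter could not express.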
2011-10-05 | Merge branch 'jc/fetch-pack-fsck-objects' | Junio C Hamano | 1 | -1/+19
* jc/fetch-pack-fsck-objects:
  test: fetch/receive with fsckobjects
  transfer.fsckobjects: unify fetch/receive.fsckobjects
  fetch.fsckobjects: verify downloaded objects

Conflicts:
  Documentation/config.txt
  builtin/fetch-pack.c
2011-09-04 | transfer.fsckobjects: unify fetch/receive.fsckobjects | Junio C Hamano | 1 | -2/+12
This single variable can be set instead of setting the fsckobjects variable for fetch and receive independently. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-04 | fetch.fsckobjects: verify downloaded objects | Junio C Hamano | 1 | -0/+8
This corresponds to receive.fsckobjects configuration variable added (a lot) earlier in 20dc001 (receive-pack: allow using --strict mode for unpacking objects, 2008-02-25). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-28 | Merge branch 'nd/decorate-grafts' | Junio C Hamano | 1 | -0/+30
* nd/decorate-grafts:
  log: Do not decorate replacements with --no-replace-objects
  log: decorate "replaced" on to replaced commits
  log: decorate grafted commits with "grafted"
  Move write_shallow_commits to fetch-pack.c
  Add for_each_commit_graft() to iterate all grafts
  decoration: do not mis-decorate refs with same prefix
2011-08-18 | fetch-pack: check for valid commit from server | Nguyễn Thái Ngọc Duy | 1 | -0/+2
A malicious server can return an ACK naming a SHA-1 that does not exist or that is not a commit. lookup_commit() may return NULL in this case. Do not let fetch-pack crash by dereferencing a NULL pointer. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-18 | Move write_shallow_commits to fetch-pack.c | Nguyễn Thái Ngọc Duy | 1 | -0/+30
This function produces network traffic and should be in fetch-pack. It has been in commit.c because it needs to iterate over the (private) graft list. It can now do so using for_each_commit_graft(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-29 | Merge branch 'jk/haves-from-alternate-odb' | Junio C Hamano | 1 | -1/+1
* jk/haves-from-alternate-odb:
  receive-pack: eliminate duplicate .have refs
  bisect: refactor sha1_array into a generic sha1 list
  refactor refs_from_alternate_cb to allow passing extra data
2011-05-19 | refactor refs_from_alternate_cb to allow passing extra data | Jeff King | 1 | -1/+1
The foreach_alt_odb function triggers a callback for each alternate object db we have, with room for a single void pointer as data. Currently, we always call refs_from_alternate_cb as the callback function, and then pass another callback (to receive each ref individually) as the void pointer.

This has two problems:

  1. C technically forbids stuffing a function pointer into a "void *". In practice, this probably doesn't matter on any architectures git runs on, but it never hurts to follow the letter of the law.

  2. There is no room for an extra data pointer. Indeed, the alternate_ref_fn that refs_from_alternate_cb calls takes a void* for data, but we always pass it NULL.

Instead, let's properly stuff our function pointer into a data struct, which also leaves room for an extra caller-supplied data pointer. And to keep things simple for existing callers, let's make a for_each_alternate_ref function that takes care of creating the extra struct.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
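The pattern this commit describes, wrapping a function pointer and a data pointer in a struct so that only an object pointer travels through the "void *", can be sketched as follows. Every name here (ref_fn, ref_cb, for_each_item, for_each_ref_sketch, count_ref) is hypothetical and chosen for illustration; none is the actual git API.

```c
#include <assert.h>
#include <stddef.h>

/* Per-ref callback type that receives a caller-supplied data pointer. */
typedef void (*ref_fn)(const char *refname, void *data);

/*
 * Carrier struct: holds the function pointer and the caller's data, so
 * a pointer to it (an object pointer) can legally be passed as void *.
 */
struct ref_cb {
    ref_fn fn;
    void *data;
};

/* Generic iterator that only has room for a single void * payload. */
static void for_each_item(const char **items, size_t nr,
                          void (*cb)(const char *item, void *payload),
                          void *payload)
{
    size_t i;

    for (i = 0; i < nr; i++)
        cb(items[i], payload);
}

/* Trampoline: unpack the struct and forward to the real callback. */
static void trampoline(const char *item, void *payload)
{
    struct ref_cb *cb = payload;

    cb->fn(item, cb->data);
}

/* Convenience wrapper so callers never build the struct themselves. */
static void for_each_ref_sketch(const char **items, size_t nr,
                                ref_fn fn, void *data)
{
    struct ref_cb cb;

    assert(fn);
    cb.fn = fn;
    cb.data = data;
    for_each_item(items, nr, trampoline, &cb);
}

/* Example callback: counts refs through the caller's data pointer. */
static void count_ref(const char *refname, void *data)
{
    (void)refname;
    (*(int *)data)++;
}
```

A caller would pass count_ref together with the address of an int counter to for_each_ref_sketch(); the wrapper hides the carrier struct in the same way the commit describes for for_each_alternate_ref.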
2011-05-19 | fetch: avoid repeated commits in mark_complete | Jeff King | 1 | -2/+4
We add every local ref to a list so that we can mark them and all of their ancestors back to a certain cutoff point. However, if some refs point to the same commit, we will end up adding them to the list many times.

Furthermore, since commit_lists are stored as linked lists, we must do an O(n) traversal of the list in order to find the right place to insert each commit. This makes building the list O(n^2) in the number of refs.

For normal repositories, this isn't a big deal. We have a few hundred refs at most, and most of them are unique. But consider an "alternates" repo that serves as an object database for many other similar repos. For reachability, it needs to keep a copy of the refs in each child repo. This means it may have a large number of refs, many of which point to the same commits.

By noting commits we have already added to the list, we can shrink the size of "n" in such a repo to the number of unique commits, which is on the order of what a normal repo would contain (it's actually more than a normal repo, since child repos may have branches at different states, but in practice it tends to be much smaller than the list with duplicates).

Here are the results on one particular giant repo (containing objects for all Rails forks on GitHub):

  $ git for-each-ref | wc -l
  112514

  [before]
  $ git fetch --no-tags ../remote.git
  63.52user 0.12system 1:03.68elapsed 99%CPU (0avgtext+0avgdata 137648maxresident)k
  1856inputs+48outputs (11major+19603minor)pagefaults 0swaps

  [after]
  $ git fetch --no-tags ../remote.git
  6.15user 0.08system 0:06.25elapsed 99%CPU (0avgtext+0avgdata 123856maxresident)k
  0inputs+40outputs (0major+18872minor)pagefaults 0swaps

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
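The idea of noting commits already added can be sketched with a per-commit flag that is checked before the O(n) sorted insert. The struct layouts, the SEEN bit, and the function names below are assumptions made for this illustration and are much simpler than git's own commit and commit_list types.

```c
#include <assert.h>
#include <stddef.h>

#define SEEN (1u << 0) /* hypothetical "already queued" marker */

struct commit {
    unsigned long date;
    unsigned flags;
};

struct commit_node {
    struct commit *item;
    struct commit_node *next;
};

/* Insert `node` into a list kept sorted newest-first; an O(n) walk. */
static void insert_by_date(struct commit_node **list,
                           struct commit_node *node)
{
    struct commit_node **pp = list;

    while (*pp && (*pp)->item->date > node->item->date)
        pp = &(*pp)->next;
    node->next = *pp;
    *pp = node;
}

/*
 * Queue `c` using caller-provided storage `node`, unless it has been
 * queued before. Returns 1 if inserted, 0 if skipped as a duplicate.
 */
static int queue_once(struct commit_node **list,
                      struct commit_node *node, struct commit *c)
{
    assert(list && node && c);
    if (c->flags & SEEN)
        return 0;
    c->flags |= SEEN;
    node->item = c;
    insert_by_date(list, node);
    return 1;
}
```

When many refs point at the same commit, only the first one pays for the list walk; the rest cost a single flag test, so the quadratic term depends on the number of unique commits rather than the number of refs.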
2011-03-29 | Merge branch 'jc/fetch-progressive-stride' | Junio C Hamano | 1 | -4/+5
* jc/fetch-progressive-stride:
  Fix potential local deadlock during fetch-pack
2011-03-29 | Merge branches 'sp/maint-fetch-pack-stop-early' and 'sp/maint-upload-pack-stop-early' | Junio C Hamano | 1 | -1/+2
* sp/maint-fetch-pack-stop-early:
  enable "no-done" extension only when fetching over smart-http

* sp/maint-upload-pack-stop-early:
  enable "no-done" extension only when serving over smart-http
2011-03-29 | Revert two "no-done" reverts | Junio C Hamano | 1 | -3/+15
Last night I had to make these two emergency reverts, but now that we have a better understanding of which part of the topic was broken, let's get rid of the reverts and fix it correctly. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-29 | Fix potential local deadlock during fetch-pack | Junio C Hamano | 1 | -4/+5
The fetch-pack/upload-pack protocol relies on the underlying transport (local pipe or TCP socket) to have enough slack to allow one window worth of data in flight without blocking the writer. Traditionally we always relied on being able to have two windows of 32 "have"s in flight (roughly 3k bytes) to stream. The recent "progressive-stride" change allows "fetch-pack" to send up to 1024 "have"s without reading any response from "upload-pack". The outgoing pipe of "upload-pack" can be clogged with many ACK and NAK that are unread, while "fetch-pack" is still stuffing its outgoing pipe with more "have"s, leading to a deadlock. Revert the change unless we are in stateless rpc (aka smart-http) mode, as using a large window full of "have"s is still a good way to help reduce the number of back-and-forth, and there is no buffering issue there (it is strictly "ping-pong" without an overlap). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-29 | enable "no-done" extension only when fetching over smart-http | Junio C Hamano | 1 | -1/+2
When the 'no-done' protocol extension is used, the upload-pack (i.e. the server side) process stops listening to the fetch-pack after issuing the final NAK, and starts sending the generated pack data back, but there may be more "have" lines sent by the latter in flight that the fetch-pack is expecting to be answered with ACK/NAK. This will typically result in a deadlock (both will block on a write that the other end never reads) or SIGPIPE on the fetch-pack end (upload-pack will finish writing a small pack and go away). Disable it unless fetch-pack is running under smart-http, where there is no such streaming issue. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Shawn O. Pearce <spearce@spearce.org>
2011-03-28 | Revert "fetch-pack: Implement no-done capability" | Junio C Hamano | 1 | -15/+3
This reverts commit 761ecf0bc7b6cddf311f00877c59e6381cdbdeea.
2011-03-26 | Merge branch 'jc/fetch-progressive-stride' | Junio C Hamano | 1 | -3/+18
* jc/fetch-progressive-stride:
  fetch-pack: use smaller handshake window for initial request
  fetch-pack: progressively use larger handshake windows
  fetch-pack: factor out hardcoded handshake window size

Conflicts:
  builtin/fetch-pack.c
2011-03-22 | Merge branch 'sp/maint-fetch-pack-stop-early' | Junio C Hamano | 1 | -2/+16
* sp/maint-fetch-pack-stop-early:
  fetch-pack: Implement no-done capability
  fetch-pack: Finish negotation if remote replies "ACK %s ready"
2011-03-22 | Merge branch 'jc/maint-fetch-alt' | Junio C Hamano | 1 | -0/+12
* jc/maint-fetch-alt:
  fetch-pack: objects in our alternates are available to us
  refs_from_alternate: helper to use refs from alternates

Conflicts:
  builtin/receive-pack.c
2011-03-22 | Fix sparse warnings | Stephen Boyd | 1 | -1/+1
Fix warnings from 'make check'.

 - These files don't include 'builtin.h' causing sparse to complain that cmd_* isn't declared:

   builtin/clone.c:364, builtin/fetch-pack.c:797, builtin/fmt-merge-msg.c:34, builtin/hash-object.c:78, builtin/merge-index.c:69, builtin/merge-recursive.c:22, builtin/merge-tree.c:341, builtin/mktag.c:156, builtin/notes.c:426, builtin/notes.c:822, builtin/pack-redundant.c:596, builtin/pack-refs.c:10, builtin/patch-id.c:60, builtin/patch-id.c:149, builtin/remote.c:1512, builtin/remote-ext.c:240, builtin/remote-fd.c:53, builtin/reset.c:236, builtin/send-pack.c:384, builtin/unpack-file.c:25, builtin/var.c:75

 - These files have symbols which should be marked static since they're only file scope:

   submodule.c:12, diff.c:631, replace_object.c:92, submodule.c:13, submodule.c:14, trace.c:78, transport.c:195, transport-helper.c:79, unpack-trees.c:19, url.c:3, url.c:18, url.c:104, url.c:117, url.c:123, url.c:129, url.c:136, thread-utils.c:21, thread-utils.c:48

 - These files redeclare symbols to be different types:

   builtin/index-pack.c:210, parse-options.c:564, parse-options.c:571, usage.c:49, usage.c:58, usage.c:63, usage.c:72

 - These files use a literal integer 0 when they really should use a NULL pointer:

   daemon.c:663, fast-import.c:2942, imap-send.c:1072, notes-merge.c:362

While we're in the area, clean up some unused #includes in builtin files (mostly exec_cmd.h).

Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-20 | fetch-pack: use smaller handshake window for initial request | Junio C Hamano | 1 | -2/+4
Start the initial request small by halving the INITIAL_FLUSH (we will try to stay one window ahead of the server, so we would end up giving twice as many "have" in flight at the very beginning). We may want to tweak these values even more, taking MTU into account. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Shawn Pearce <spearce@spearce.org>
2011-03-20 | fetch-pack: progressively use larger handshake windows | Junio C Hamano | 1 | -1/+6
The client has to dig the history deeper when more recent parts of its history do not have any overlap with the server it is fetching from. Make the handshake window exponentially larger as we dig deeper, with a reasonable upper cap. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Shawn Pearce <spearce@spearce.org>
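A window that grows exponentially up to a cap might look like the sketch below. This is not the code from the commit: next_window() and both constants are hypothetical, with 16 chosen as half of the traditional 32 and 1024 taken from the "up to 1024" figure quoted elsewhere in this log.

```c
#include <assert.h>

#define INITIAL_WINDOW 16   /* assumption: half of the traditional 32 */
#define WINDOW_CAP     1024 /* assumption: upper bound on the window */

/*
 * Return the size of the next handshake window: double the current
 * one, but never exceed WINDOW_CAP.
 */
static int next_window(int current)
{
    assert(current > 0);
    if (current >= WINDOW_CAP / 2)
        return WINDOW_CAP;
    return current * 2;
}
```

Starting from INITIAL_WINDOW, the sequence runs 16, 32, 64, 128, 256, 512, 1024 and then stays at the cap, so a client with a long unshared history needs far fewer round trips than with a fixed window of 32.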
2011-03-20 | fetch-pack: factor out hardcoded handshake window size | Junio C Hamano | 1 | -3/+11
The "git fetch" client presents the most recent 32 commits it has to the server and gives a chance to the server to say "ok, we heard enough", and continues reporting what it has in chunks of 32 commits, digging its history down to older commits. Move the hardcoded size of the handshake window outside the code, so that we can tweak it more easily. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Shawn Pearce <spearce@spearce.org>
2011-03-17 | fetch-pack: objects in our alternates are available to us | Junio C Hamano | 1 | -0/+12
Use the helper function split from the receiving end of "git push" to allow the same optimization on the receiving end of "git fetch". Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Shawn O. Pearce <spearce@spearce.org>
2011-03-15 | fetch-pack: Implement no-done capability | Shawn O. Pearce | 1 | -3/+15
If enabled on the connection, "multi_ack_detailed no-done" as a pair allows the remote upload-pack process to send a PACK down to the client as soon as an "ACK %s ready" message was also sent.

Over git:// and ssh://, where a bi-directional stream is in place, this makes very little difference compared with the classical version that waits for the client to send a "done\n" line by itself. It does slightly reduce the latency involved in starting the pack stream, as one less round-trip from client->server is required.

Over smart HTTP this avoids needing to send a final RPC that has all of the prior common objects. Instead the server is able to return a pack as soon as it's ready to. For many common users the smart HTTP fetch is now just 2 requests: GET .../info/refs, and a POST .../git-upload-pack to not only negotiate but also receive the pack stream. Only users who have more than 32 local commits not shared with the remote will need additional requests to negotiate a common merge base.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-14 | fetch-pack: Finish negotation if remote replies "ACK %s ready" | Shawn O. Pearce | 1 | -0/+2
If multi_ack_detailed was selected in the protocol capabilities (both client and server are >= Git 1.6.6) the upload-pack side will send "ACK %s ready" when it knows how to safely cut the graph and produce a reasonable pack for the want list that was already sent on the connection.

Upon receiving "ACK %s ready" there is no point in looking at the remaining commits inside of rev_list. Sending additional "have %s" lines to the remote will not construct a smaller pack. It is unlikely a commit older than the current cut point will have a better delta base than the cut point itself has.

The original design of this code had fetch-pack empty rev_list by marking a commit and its transitive ancestors COMMON whenever the remote side said "ACK %s {continue,common}" and skipping over any already COMMON commits during get_rev().

This approach does not work when most of rev_list is actually COMMON_REF, commits that are pointed to by a reference on the remote, which exist locally, and which have not yet been sent to the remote as a "have %s" line.

Most of the common references are tags in the refs/tags namespace, using points in the commit graph that are more than 1 commit apart. In git.git itself, this is currently 340 tags, 339 of which point to commits in the commit graph. fetch-pack pushes all of these into rev_list, but is unable to mark them COMMON and discard during a remote's "ACK %s {continue,common}" because it does not parse through the entire parent chain. Not parsing the entire parent chain is an optimization to avoid walking back to the roots of the repository.

Assuming the client is only following the remote (and does not make its own local commits), the client needs 11 rounds to spin through the entire list of tags (32 commits per round, ceil(339/32) == 11). Unfortunately the server knows on the first "have %s" line that it can produce a good pack, and does not need to see the remaining 320 tags in the other 10 rounds.

Over git:// and ssh:// this isn't as bad as it sounds; the client is only transmitting an extra 16,000 bytes that it doesn't need to send. Over smart HTTP, the client must do an additional 10 HTTP POST requests, each of which incurs round-trip latency, and must upload the entire state vector of all known common objects. On the final POST request, this is 16 KiB worth of data.

Fix all of this by clearing rev_list as soon as the remote side says it can construct a pack.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
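The round count quoted in this commit message is a ceiling division, which can be checked with a tiny helper. rounds_needed() is a hypothetical name used only for this illustration.

```c
#include <assert.h>

/*
 * Number of negotiation rounds needed to send `count` "have" lines
 * when each round carries at most `per_round` of them (rounds up).
 */
static int rounds_needed(int count, int per_round)
{
    assert(count >= 0 && per_round > 0);
    return (count + per_round - 1) / per_round;
}
```

For the figures in the message, rounds_needed(339, 32) evaluates to 11; since the server can already answer after the first round, the remaining 10 rounds are the avoidable cost.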
2011-03-08 | add packet tracing debug code | Jeff King | 1 | -0/+2
This shows a trace of all packets coming in or out of a given program. This can help with debugging object negotiation or other protocol issues. To keep the code changes simple, we operate at the lowest level, meaning we don't necessarily understand what's in the packets. The one exception is a packet starting with "PACK", which causes us to skip that packet and turn off tracing (since the gigantic pack data will not be interesting to read, at least not in the trace format). We show both written and read packets. In the local case, this may mean you will see packets twice (written by the sender and read by the receiver). However, for cases where the other end is remote, this allows you to see the full conversation. Packet tracing can be enabled with GIT_TRACE_PACKET=<foo>, where <foo> takes the same arguments as GIT_TRACE. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-29 | commit: Add commit_list prefix in two function names. | Thiago Farina | 1 | -2/+2
Add commit_list prefix to insert_by_date function and to sort_by_date, so it's clear that these functions refer to commit_list structure. Signed-off-by: Thiago Farina <tfransosi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-22 | Move 'builtin-*' into a 'builtin/' subdirectory | Linus Torvalds | 1 | -0/+976
This shrinks the top-level directory a bit, and makes it much more pleasant to use auto-completion on the thing. Instead of

  [torvalds@nehalem git]$ em buil<tab>
  Display all 180 possibilities? (y or n)
  [torvalds@nehalem git]$ em builtin-sh
  builtin-shortlog.c     builtin-show-branch.c  builtin-show-ref.c
  builtin-shortlog.o     builtin-show-branch.o  builtin-show-ref.o
  [torvalds@nehalem git]$ em builtin-shor<tab>
  builtin-shortlog.c  builtin-shortlog.o
  [torvalds@nehalem git]$ em builtin-shortlog.c

you get

  [torvalds@nehalem git]$ em buil<tab>          [type]
  builtin/   builtin.h
  [torvalds@nehalem git]$ em builtin            [auto-completes to]
  [torvalds@nehalem git]$ em builtin/sh<tab>    [type]
  shortlog.c     shortlog.o     show-branch.c  show-branch.o  show-ref.c     show-ref.o
  [torvalds@nehalem git]$ em builtin/sho        [auto-completes to]
  [torvalds@nehalem git]$ em builtin/shor<tab>  [type]
  shortlog.c  shortlog.o
  [torvalds@nehalem git]$ em builtin/shortlog.c

which doesn't seem all that different, but not having that annoying break in "Display all 180 possibilities?" is quite a relief.

NOTE! If you do this in a clean tree (no object files etc), or using an editor that has auto-completion rules that ignores '*.o' files, you won't see that annoying 'Display all 180 possibilities?' message - it will just show the choices instead. I think bash has some cut-off around 100 choices or something.

So the reason I see this is that I'm using an odd editor, and thus don't have the rules to cut down on auto-completion. But you can simulate that by using 'ls' instead, or something similar.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>