Age | Commit message (Collapse) | Author | Files | Lines |
|
"git fetch $there $refspec" that talks over protocol v2 can take
advantage of server-side ref filtering; the code has been extended
so that this mechanism triggers also when fetching with configured
refspec.
* bw/ref-prefix-for-configured-refspec: (38 commits)
fetch: generate ref-prefixes when using a configured refspec
refspec: consolidate ref-prefix generation logic
submodule: convert push_unpushed_submodules to take a struct refspec
remote: convert check_push_refs to take a struct refspec
remote: convert match_push_refs to take a struct refspec
http-push: store refspecs in a struct refspec
transport: remove transport_verify_remote_names
send-pack: store refspecs in a struct refspec
transport: convert transport_push to take a struct refspec
push: convert to use struct refspec
push: check for errors earlier
remote: convert match_explicit_refs to take a struct refspec
remote: convert get_ref_match to take a struct refspec
remote: convert query_refspecs to take a struct refspec
remote: convert apply_refspecs to take a struct refspec
remote: convert get_stale_heads to take a struct refspec
fetch: convert prune_refs to take a struct refspec
fetch: convert get_ref_map to take a struct refspec
fetch: convert do_fetch to take a struct refspec
refspec: remove the deprecated functions
...
|
|
Remove 'transprot_verify_remote_names()' because all callers have
migrated to using 'struct refspec' which performs the same checks in
'parse_refspec()'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Convert 'transport_push()' to take a 'struct refspec' as a
parameter instead of an array of strings which represent
refspecs.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The beginning of the next-gen transfer protocol.
* bw/protocol-v2: (35 commits)
remote-curl: don't request v2 when pushing
remote-curl: implement stateless-connect command
http: eliminate "# service" line when using protocol v2
http: don't always add Git-Protocol header
http: allow providing extra headers for http requests
remote-curl: store the protocol version the server responded with
remote-curl: create copy of the service name
pkt-line: add packet_buf_write_len function
transport-helper: introduce stateless-connect
transport-helper: refactor process_connect_service
transport-helper: remove name parameter
connect: don't request v2 when pushing
connect: refactor git_connect to only get the protocol version once
fetch-pack: support shallow requests
fetch-pack: perform a fetch using v2
upload-pack: introduce fetch server command
push: pass ref prefixes when pushing
fetch: pass ref prefixes when fetching
ls-remote: pass ref prefixes when requesting a remote's refs
transport: convert transport_get_remote_refs to take a list of ref prefixes
...
|
|
Teach ls-remote to optionally accept server options by specifying them
on the cmdline via '-o' or '--server-option'. These server options are
sent to the remote end when querying for the remote end's refs using
protocol version 2.
If communicating using a protocol other than v2 the provided options are
ignored and not sent to the remote end.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* bw/protocol-v2: (35 commits)
remote-curl: don't request v2 when pushing
remote-curl: implement stateless-connect command
http: eliminate "# service" line when using protocol v2
http: don't always add Git-Protocol header
http: allow providing extra headers for http requests
remote-curl: store the protocol version the server responded with
remote-curl: create copy of the service name
pkt-line: add packet_buf_write_len function
transport-helper: introduce stateless-connect
transport-helper: refactor process_connect_service
transport-helper: remove name parameter
connect: don't request v2 when pushing
connect: refactor git_connect to only get the protocol version once
fetch-pack: support shallow requests
fetch-pack: perform a fetch using v2
upload-pack: introduce fetch server command
push: pass ref prefixes when pushing
fetch: pass ref prefixes when fetching
ls-remote: pass ref prefixes when requesting a remote's refs
transport: convert transport_get_remote_refs to take a list of ref prefixes
...
|
|
Introduce the transport-helper capability 'stateless-connect'. This
capability indicates that the transport-helper can be requested to run
the 'stateless-connect' command which should attempt to make a
stateless connection with a remote end. Once established, the
connection can be used by the git client to communicate with
the remote end natively in a stateless-rpc manner as supported by
protocol v2. This means that the client must send everything the server
needs in a single request as the client must not assume any
state-storing on the part of the server or transport.
If a stateless connection cannot be established then the remote-helper
will respond in the same manner as the 'connect' command indicating that
the client should fallback to using the dumb remote-helper commands.
A future patch will implement the 'stateless-connect' capability in our
http remote-helper (remote-curl) so that protocol v2 can be used using
the http transport.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Teach transport_get_remote_refs() to accept a list of ref prefixes,
which will be sent to the server for use in filtering when using
protocol v2. (This list will be ignored when not using protocol v2.)
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The machinery to clone & fetch, which in turn involves packing and
unpacking objects, have been told how to omit certain objects using
the filtering mechanism introduced by the jh/object-filtering
topic, and also mark the resulting pack as a promisor pack to
tolerate missing objects, taking advantage of the mechanism
introduced by the jh/fsck-promisors topic.
* jh/partial-clone:
t5616: test bulk prefetch after partial fetch
fetch: inherit filter-spec from partial clone
t5616: end-to-end tests for partial clone
fetch-pack: restore save_commit_buffer after use
unpack-trees: batch fetching of missing blobs
clone: partial clone
partial-clone: define partial clone settings in config
fetch: support filters
fetch: refactor calculation of remote list
fetch-pack: test support excluding large blobs
fetch-pack: add --no-filter
fetch-pack, index-pack, transport: partial clone
upload-pack: add object filtering for partial clone
|
|
In preparation for implementing narrow/partial clone, the machinery
for checking object connectivity used by gc and fsck has been
taught that a missing object is OK when it is referenced by a
packfile specially marked as coming from trusted repository that
promises to make them available on-demand and lazily.
* jh/fsck-promisors:
gc: do not repack promisor packfiles
rev-list: support termination at promisor objects
sha1_file: support lazily fetching missing objects
introduce fetch-object: fetch one promisor object
index-pack: refactor writing of .keep files
fsck: support promisor objects as CLI argument
fsck: support referenced promisor objects
fsck: support refs pointing to promisor objects
fsck: introduce partialclone extension
extension.partialclone: introduce partial clone extension
|
|
Move the definition of the transport-specific functions provided by
transports, whether declared in transport.c or transport-helper.c, into
an internal header. This means that transport-using code (as opposed to
transport-declaring code) can no longer access these functions (without
importing the internal header themselves), making it clear that they
should use the transport_*() functions instead, and also allowing the
interface between the transport mechanism and an individual transport to
independently evolve.
This is superficially a reversal of commit 824d5776c3f2 ("Refactor
struct transport_ops inlined into struct transport", 2007-09-19).
However, the scope of the involved variables was neither affected nor
discussed in that commit, and I think that the advantages in making
those functions more private outweigh the advantages described in that
commit's commit message. A minor additional point is that the code has
gotten more complicated since then, in that the function-pointer
variables are potentially mutated twice (once initially and once if
transport_take_over() is invoked), increasing the value of corralling
them into their own struct.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
After commit 0d0bac67ce3b ("transport: drop support for git-over-rsync",
2016-02-01), no transport in Git populates the "push" entry in the
transport vtable. Remove this entry.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Introduce fetch-object, providing the ability to fetch one object from a
promisor remote.
This uses fetch-pack. To do this, the transport mechanism has been
updated with 2 flags, "from-promisor" to indicate that the resulting
pack comes from a promisor remote (and thus should be annotated as such
by index-pack), and "no-dependents" to indicate that only the objects
themselves need to be fetched (but fetching additional objects is
nevertheless safe).
Whenever "no-dependents" is used, fetch-pack will refrain from using any
object flags, because it is most likely invoked as part of a dynamic
object fetch by another Git command (which may itself use object flags).
An alternative to this is to leave fetch-pack alone, and instead update
the allocation of flags so that fetch-pack's flags never overlap with
any others, but this will end up shrinking the number of flags available
to nearly every other Git command (that is, every Git command that
accesses objects), so the approach in this commit was used instead.
This will be tested in a subsequent commit.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Breaking down the fields in the interface makes it easier to
change the backend of for_each_alternate_ref to something
that doesn't use "struct ref" internally.
The only field that callers actually look at is the oid,
anyway. The refname is kept in the interface as a plausible
thing for future code to want.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git submodule push" learned "--recurse-submodules=only option to
push submodules out without pushing the top-level superproject.
* bw/push-submodule-only:
push: add option to push only submodules
submodules: add RECURSE_SUBMODULES_ONLY value
transport: reformat flag #defines to be more readable
|
|
Finer-grained control of what protocols are allowed for transports
during clone/fetch/push have been enabled via a new configuration
mechanism.
* bw/transport-protocol-policy:
http: respect protocol.*.allow=user for http-alternates
transport: add from_user parameter to is_transport_allowed
http: create function to get curl allowed protocols
transport: add protocol policy config option
http: always warn if libcurl version is too old
lib-proto-disable: variable name fix
|
|
Teach push the --recurse-submodules=only option. This enables push
to recursively push all unpushed submodules while leaving the
superproject unpushed.
This is a desirable feature in a scenario where updates to the
superproject are handled automatically by some other means, perhaps
a tool like Gerrit code review. In this scenario, a developer could
make a change which spans multiple submodules and then push their
commits for code review. Upon completion of the code review, their
commits can be accepted and applied to their respective submodules
while the code review tool can then automatically update the
superproject to the most recent SHA1 of each submodule. This would
reduce the merge conflicts in the superproject that could occur if
multiple people are contributing to the same submodule.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
All of the #defines for the TRANSPORT_* flags are hardcoded to be
powers of two. This can be error prone when adding a new flag and
is difficult to read. Update these defines to instead use a shift
operation to generate the flags and reformat them.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a from_user parameter to is_transport_allowed() to allow http to be
able to distinguish between protocol restrictions for redirects versus
initial requests. CURLOPT_REDIR_PROTOCOLS can now be set differently
from CURLOPT_PROTOCOLS to disallow use of protocols with the "user"
policy in redirects.
This change allows callers to query if a transport protocol is allowed,
given that the caller knows that the protocol is coming from the user
(1) or not from the user (0) such as redirects in libcurl. If unknown a
-1 should be provided which falls back to reading
`GIT_PROTOCOL_FROM_USER` to determine if the protocol came from the
user.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Always warn if libcurl version is too old because:
1. Even without a protocol whitelist, newer versions of curl have all
non-standard protocols disabled by default.
2. A future patch will introduce default "known-good" and "known-bad"
protocols which are allowed/disallowed by 'is_transport_allowed'
which older version of libcurl can't respect.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git push" and "git fetch" reports from what old object to what new
object each ref was updated, using abbreviated refnames, and they
attempt to align the columns for this and other pieces of
information. The way these codepaths compute how many display
columns to allocate for the object names portion of this output has
been updated to match the recent "auto scale the default
abbreviation length" change.
* jc/abbrev-auto:
transport: compute summary-width dynamically
transport: allow summary-width to be computed dynamically
fetch: pass summary_width down the callchain
transport: pass summary_width down the callchain
|
|
Allow the default abbreviation length, which has historically been
7, to scale as the repository grows. The logic suggests to use 12
hexdigits for the Linux kernel, and 9 to 10 for Git itself.
* lt/abbrev-auto:
abbrev: auto size the default abbreviation
abbrev: prepare for new world order
abbrev: add FALLBACK_DEFAULT_ABBREV to prepare for auto sizing
|
|
Now we have identified three callchains that have a set of refs that
they want to show their <old, new> object names in an aligned output,
we can replace their reference to the constant TRANSPORT_SUMMARY_WIDTH
with a helper function call to transport_summary_width() that takes
the set of ref as a parameter. This step does not yet iterate over
the refs and compute, which is left as an exercise to the readers.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The existing "git fetch --depth=<n>" option was hard to use
correctly when making the history of an existing shallow clone
deeper. A new option, "--deepen=<n>", has been added to make this
easier to use. "git clone" also learned "--shallow-since=<date>"
and "--shallow-exclude=<tag>" options to make it easier to specify
"I am interested only in the recent N months worth of history" and
"Give me only the history since that version".
* nd/shallow-deepen: (27 commits)
fetch, upload-pack: --deepen=N extends shallow boundary by N commits
upload-pack: add get_reachable_list()
upload-pack: split check_unreachable() in two, prep for get_reachable_list()
t5500, t5539: tests for shallow depth excluding a ref
clone: define shallow clone boundary with --shallow-exclude
fetch: define shallow boundary with --shallow-exclude
upload-pack: support define shallow boundary by excluding revisions
refs: add expand_ref()
t5500, t5539: tests for shallow depth since a specific date
clone: define shallow clone boundary based on time with --shallow-since
fetch: define shallow boundary with --shallow-since
upload-pack: add deepen-since to cut shallow repos based on time
shallow.c: implement a generic shallow boundary finder based on rev-list
fetch-pack: use a separate flag for fetch in deepening mode
fetch-pack.c: mark strings for translating
fetch-pack: use a common function for verbose printing
fetch-pack: use skip_prefix() instead of starts_with()
upload-pack: move rev-list code out of check_non_tip()
upload-pack: make check_non_tip() clean things up on error
upload-pack: tighten number parsing at "deepen" lines
...
|
|
We'll be introducing a new way to decide the default abbreviation
length by initialising DEFAULT_ABBREV to -1 to signal the first call
to "find unique abbreviation" codepath to compute a reasonable value
based on the number of objects we have to avoid collisions.
We have long relied on DEFAULT_ABBREV being a positive concrete
value that is used as the abbreviation length when no extra
configuration or command line option has overridden it. Some
codepaths wants to use such a positive concrete default value
even before making their first request to actually trigger the
computation for the auto sized default.
Introduce FALLBACK_DEFAULT_ABBREV and use it to the code that
attempts to align the report from "git fetch". For now, this
macro is also used to initialize the default_abbrev variable,
but the auto-sizing code will use -1 and then use the value of
FALLBACK_DEFAULT_ABBREV as the starting point of auto-sizing.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This implements everything that is required on the client side to make use
of push options from the porcelain push command.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In git-fetch, --depth argument is always relative with the latest
remote refs. This makes it a bit difficult to cover this use case,
where the user wants to make the shallow history, say 3 levels
deeper. It would work if remote refs have not moved yet, but nobody
can guarantee that, especially when that use case is performed a
couple months after the last clone or "git fetch --depth". Also,
modifying shallow boundary using --depth does not work well with
clones created by --since or --not.
This patch fixes that. A new argument --deepen=<N> will add <N> more (*)
parent commits to the current history regardless of where remote refs
are.
Have/Want negotiation is still respected. So if remote refs move, the
server will send two chunks: one between "have" and "want" and another
to extend shallow history. In theory, the client could send no "want"s
in order to get the second chunk only. But the protocol does not allow
that. Either you send no want lines, which means ls-remote; or you
have to send at least one want line that carries deep-relative to the
server..
The main work was done by Dongcan Jiang. I fixed it up here and there.
And of course all the bugs belong to me.
(*) We could even support --deepen=<N> where <N> is negative. In that
case we can cut some history from the shallow clone. This operation
(and --depth=<shorter depth>) does not require interaction with remote
side (and more complicated to implement as a result).
Helped-by: Duy Nguyen <pclouds@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Sometimes it is necessary to force IPv4-only or IPv6-only operation
on networks where name lookups may return a non-routable address and
stall remote operations.
The ssh(1) command has an equivalent switches which we may pass when
we run them. There may be old ssh(1) implementations out there
which do not support these switches; they should report the
appropriate error in that case.
rsync support is untouched for now since it is deprecated and
scheduled to be removed.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Reviewed-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Use struct object_id in three fields in struct ref and convert all the
necessary places that use it.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
|
|
|
|
|
|
The current callers only want to die when their transport is
prohibited. But future callers want to query the mechanism
without dying.
Let's break out a few query functions, and also save the
results in a static list so we don't have to re-parse for
each query.
Based-on-a-patch-by: Blake Burkhart <bburky@bburky.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
If we are cloning an untrusted remote repository into a
sandbox, we may also want to fetch remote submodules in
order to get the complete view as intended by the other
side. However, that opens us up to attacks where a malicious
user gets us to clone something they would not otherwise
have access to (this is not necessarily a problem by itself,
but we may then act on the cloned contents in a way that
exposes them to the attacker).
Ideally such a setup would sandbox git entirely away from
high-value items, but this is not always practical or easy
to set up (e.g., OS network controls may block multiple
protocols, and we would want to enable some but not others).
We can help this case by providing a way to restrict
particular protocols. We use a whitelist in the environment.
This is more annoying to set up than a blacklist, but
defaults to safety if the set of protocols git supports
grows). If no whitelist is specified, we continue to default
to allowing all protocols (this is an "unsafe" default, but
since the minority of users will want this sandboxing
effect, it is the only sensible one).
A note on the tests: ideally these would all be in a single
test file, but the git-daemon and httpd test infrastructure
is an all-or-nothing proposition rather than a test-by-test
prerequisite. By putting them all together, we would be
unable to test the file-local code on machines without
apache.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a new flag --sign=true (or --sign=false), which means the same
thing as the original --signed (or --no-signed). Give it a third
value --sign=if-asked to tell push and send-pack to send a push
certificate if and only if the server advertised a push cert nonce.
If not, warn the user that their push may not be as secure as they
thought.
Signed-off-by: Dave Borowitz <dborowitz@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This field was set in transport_set_option, but never read in the push
code. The push code basically ignores the smart_options field
entirely, and derives its options from the flags arguments to the
push* callbacks. Note that in git_transport_push there are already
several args set from flags that have no corresponding field in
git_transport_options; after this change, push_cert is just like
those.
Signed-off-by: Dave Borowitz <dborowitz@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a command line argument to the git push command to request atomic
pushes.
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While signed tags and commits assert that the objects thusly signed
came from you, who signed these objects, there is not a good way to
assert that you wanted to have a particular object at the tip of a
particular branch. My signing v2.0.1 tag only means I want to call
the version v2.0.1, and it does not mean I want to push it out to my
'master' branch---it is likely that I only want it in 'maint', so
the signature on the object alone is insufficient.
The only assurance to you that 'maint' points at what I wanted to
place there comes from your trust on the hosting site and my
authentication with it, which cannot easily audited later.
Introduce a mechanism that allows you to sign a "push certificate"
(for the lack of better name) every time you push, asserting that
what object you are pushing to update which ref that used to point
at what other object. Think of it as a cryptographic protection for
ref updates, similar to signed tags/commits but working on an
orthogonal axis.
The basic flow based on this mechanism goes like this:
1. You push out your work with "git push --signed".
2. The sending side learns where the remote refs are as usual,
together with what protocol extension the receiving end
supports. If the receiving end does not advertise the protocol
extension "push-cert", an attempt to "git push --signed" fails.
Otherwise, a text file, that looks like the following, is
prepared in core:
certificate version 0.1
pusher Junio C Hamano <gitster@pobox.com> 1315427886 -0700
7339ca65... 21580ecb... refs/heads/master
3793ac56... 12850bec... refs/heads/next
The file begins with a few header lines, which may grow as we
gain more experience. The 'pusher' header records the name of
the signer (the value of user.signingkey configuration variable,
falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the
certificate generation. After the header, a blank line follows,
followed by a copy of the protocol message lines.
Each line shows the old and the new object name at the tip of
the ref this push tries to update, in the way identical to how
the underlying "git push" protocol exchange tells the ref
updates to the receiving end (by recording the "old" object
name, the push certificate also protects against replaying). It
is expected that new command packet types other than the
old-new-refname kind will be included in push certificate in the
same way as would appear in the plain vanilla command packets in
unsigned pushes.
The user then is asked to sign this push certificate using GPG,
formatted in a way similar to how signed tag objects are signed,
and the result is sent to the other side (i.e. receive-pack).
In the protocol exchange, this step comes immediately before the
sender tells what the result of the push should be, which in
turn comes before it sends the pack data.
3. When the receiving end sees a push certificate, the certificate
is written out as a blob. The pre-receive hook can learn about
the certificate by checking GIT_PUSH_CERT environment variable,
which, if present, tells the object name of this blob, and make
the decision to allow or reject this push. Additionally, the
post-receive hook can also look at the certificate, which may be
a good place to log all the received certificates for later
audits.
Because a push certificate carry the same information as the usual
command packets in the protocol exchange, we can omit the latter
when a push certificate is in use and reduce the protocol overhead.
This however is not included in this patch to make it easier to
review (in other words, the series at this step should never be
released without the remainder of the series, as it implements an
interim protocol that will be incompatible with the final one).
As such, the documentation update for the protocol is left out of
this step.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The same steps are done as in when --update-shallow is not given. The
only difference is we now add all shallow commits in "ours" and
"theirs" to .git/shallow (aka "step 8").
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Cloning from a shallow repository does not follow the "8 steps for new
.git/shallow" because if it does we need to get through step 6 for all
refs. That means commit walking down to the bottom.
Instead the rule to create .git/shallow is simpler and, more
importantly, cheap: if a shallow commit is found in the pack, it's
probably used (i.e. reachable from some refs), so we add it. Others
are dropped.
One may notice this method seems flawed by the word "probably". A
shallow commit may not be reachable from any refs at all if it's
attached to an object island (a group of objects that are not
reachable by any refs).
If that object island is not complete, a new fetch request may send
more objects to connect it to some ref. At that time, because we
incorrectly installed the shallow commit in this island, the user will
not see anything after that commit (fsck is still ok). This is not
desired.
Given that object islands are rare (C Git never sends such islands for
security reasons) and do not really harm the repository integrity, a
tradeoff is made to surprise the user occasionally but work faster
everyday.
A new option --strict could be added later that follows exactly the 8
steps. "git prune" can also learn to remove dangling objects _and_ the
shallow commits that are attached to them from .git/shallow.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The auto-tag-following code in "git fetch" tries to reuse the same
transport twice when the serving end does not cooperate and does
not give tags that point to commits that are asked for as part of
the primary transfer. Unfortunately, Git-aware transport helper
interface is not designed to be used more than once, hence this
does not work over smart-http transfer.
* jc/transport-do-not-use-connect-twice-in-fetch:
builtin/fetch.c: Fix a sparse warning
fetch: work around "transport-take-over" hack
fetch: refactor code that fetches leftover tags
fetch: refactor code that prepares a transport
fetch: rename file-scope global "transport" to "gtransport"
t5802: add test for connect helper
|
|
A Git-aware "connect" transport allows the "transport_take_over" to
redirect generic transport requests like fetch(), push_refs() and
get_refs_list() to the native Git transport handling methods. The
take-over process replaces transport->data with a fake data that
these method implementations understand.
While this hack works OK for a single request, it breaks when the
transport needs to make more than one requests. transport->data
that used to hold necessary information for the specific helper to
work correctly is destroyed during the take-over process.
One codepath that this matters is "git fetch" in auto-follow mode;
when it does not get all the tags that ought to point at the history
it got (which can be determined by looking at the peeled tags in the
initial advertisement) from the primary transfer, it internally
makes a second request to complete the fetch. Because "take-over"
hack has already destroyed the data necessary to talk to the
transport helper by the time this happens, the second request cannot
make a request to the helper to make another connection to fetch
these additional tags.
Mark such a transport as "cannot_reuse", and use a separate
transport to perform the backfill fetch in order to work around
this breakage.
Note that this problem does not manifest itself when running t5802,
because our upload-pack gives you all the necessary auto-followed
tags during the primary transfer. You would need to step through
"git fetch" in a debugger, stop immediately after the primary
transfer finishes and writes these auto-followed tags, remove the
tag references and repack/prune the repository to convince the
"find-non-local-tags" procedure that the primary transfer failed to
give us all the necessary tags, and then let it continue, in order
to trigger the bug in the secondary transfer this patch fixes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This plugs the push_cas_option data collected by the command line
option parser to the transport system with a new function
apply_push_cas(), which is called after match_push_refs() has
already been called.
At this point, we know which remote we are talking to, and what
remote refs we are going to update, so we can fill in the details
that may have been missing from the command line, such as
(1) what abbreviated refname the user gave us matches the actual
refname at the remote; and
(2) which remote-tracking branch in our local repository to read
the value of the object to expect at the remote.
to populate the old_sha1_expect[] field of each of the remote ref.
As stated in the documentation, the use of remote-tracking branch
as the default is a tentative one, and we may come up with a better
logic as we gain experience.
Still nobody uses this information, which is the topic of the next
patch.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The definition of "struct ref" in "cache.h", a header file so
central to the system, always confused me. This structure is not
about the local ref used by sha1-name API to name local objects.
It is what refspecs are expanded into, after finding out what refs
the other side has, to define what refs are updated after object
transfer succeeds to what values. It belongs to "remote.h" together
with "struct refspec".
While we are at it, also move the types and functions related to the
Git transport connection to a new header file connect.h
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* ph/builtin-srcs-are-in-subdir-these-days:
fix "builtin-*" references to be "builtin/*"
|
|
Documentation and some comments still refer to files in builtin/
as 'builtin-*.[cho]'. Update these to show the correct location.
Signed-off-by: Phil Hord <hordp@cisco.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Assisted-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In order to make sure the cloned repository is good, we run "rev-list
--objects --not --all $new_refs" on the repository. This is expensive
on large repositories. This patch attempts to mitigate the impact in
this special case.
In the "good" clone case, we only have one pack. If all of the
following are met, we can be sure that all objects reachable from the
new refs exist, which is the intention of running "rev-list ...":
- all refs point to an object in the pack
- there are no dangling pointers in any object in the pack
- no objects in the pack point to objects outside the pack
The second and third checks can be done with the help of index-pack as
a slight variation of --strict check (which introduces a new condition
for the shortcut: pack transfer must be used and the number of objects
large enough to call index-pack). The first is checked in
check_everything_connected after we get an "ok" from index-pack.
"index-pack + new checks" is still faster than the current "index-pack
+ rev-list", which is the whole point of this patch. If any of the
conditions fail, we fall back to the good old but expensive "rev-list
..". In that case it's even more expensive because we have to pay for
the new checks in index-pack. But that should only happen when the
other side is either buggy or malicious.
Cloning linux-2.6 over file://
before after
real 3m25.693s 2m53.050s
user 5m2.037s 4m42.396s
sys 0m13.750s 0m16.574s
A more realistic test with ssh:// over wireless
before after
real 11m26.629s 10m4.213s
user 5m43.196s 5m19.444s
sys 0m35.812s 0m37.630s
This shortcut is not applied to shallow clones, partly because shallow
clones should have no more objects than a usual fetch and the cost of
rev-list is acceptable, partly to avoid dealing with corner cases when
grafting is involved.
This shortcut does not apply to unpack-objects code path either
because the number of objects must be small in order to trigger that
code path.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|