summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
authorLibravatar Junio C Hamano <gitster@pobox.com>2020-06-25 12:27:47 -0700
committerLibravatar Junio C Hamano <gitster@pobox.com>2020-06-25 12:27:47 -0700
commit34e849b05a454a2c6487f8fbfa68c39932d22730 (patch)
tree7a71a357e9f7e2c977849224ca18b2cb9a4461ea /Documentation
parentMerge branch 'ss/submodule-set-branch-in-c' (diff)
parentupload-pack: fix a sparse '0 as NULL pointer' warning (diff)
downloadtgif-34e849b05a454a2c6487f8fbfa68c39932d22730.tar.xz
Merge branch 'jt/cdn-offload'
The "fetch/clone" protocol has been updated to allow the server to instruct the clients to grab pre-packaged packfile(s) in addition to the packed object data coming over the wire. * jt/cdn-offload: upload-pack: fix a sparse '0 as NULL pointer' warning upload-pack: send part of packfile response as uri fetch-pack: support more than one pack lockfile upload-pack: refactor reading of pack-objects out Documentation: add Packfile URIs design doc Documentation: order protocol v2 sections http-fetch: support fetching packfiles by URL http-fetch: refactor into function http: refactor finish_http_pack_request() http: use --stdin when indexing dumb HTTP pack
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/git-http-fetch.txt9
-rw-r--r--Documentation/technical/packfile-uri.txt78
-rw-r--r--Documentation/technical/protocol-v2.txt48
3 files changed, 124 insertions, 11 deletions
diff --git a/Documentation/git-http-fetch.txt b/Documentation/git-http-fetch.txt
index 666b042679..4deb4893f5 100644
--- a/Documentation/git-http-fetch.txt
+++ b/Documentation/git-http-fetch.txt
@@ -9,7 +9,7 @@ git-http-fetch - Download from a remote Git repository via HTTP
SYNOPSIS
--------
[verse]
-'git http-fetch' [-c] [-t] [-a] [-d] [-v] [-w filename] [--recover] [--stdin] <commit> <url>
+'git http-fetch' [-c] [-t] [-a] [-d] [-v] [-w filename] [--recover] [--stdin | --packfile=<hash> | <commit>] <url>
DESCRIPTION
-----------
@@ -40,6 +40,13 @@ commit-id::
<commit-id>['\t'<filename-as-in--w>]
+--packfile=<hash>::
+ Instead of a commit id on the command line (which is not expected in
+ this case), 'git http-fetch' fetches the packfile directly at the given
+ URL and uses index-pack to generate corresponding .idx and .keep files.
+ The hash is used to determine the name of the temporary file and is
+ arbitrary. The output of index-pack is printed to stdout.
+
--recover::
Verify that everything reachable from target is fetched. Used after
an earlier fetch is interrupted.
diff --git a/Documentation/technical/packfile-uri.txt b/Documentation/technical/packfile-uri.txt
new file mode 100644
index 0000000000..318713abc3
--- /dev/null
+++ b/Documentation/technical/packfile-uri.txt
@@ -0,0 +1,78 @@
+Packfile URIs
+=============
+
+This feature allows servers to serve part of their packfile response as URIs.
+This allows server designs that improve scalability in bandwidth and CPU usage
+(for example, by serving some data through a CDN), and (in the future) provides
+some measure of resumability to clients.
+
+This feature is available only in protocol version 2.
+
+Protocol
+--------
+
+The server advertises the `packfile-uris` capability.
+
+If the client then communicates which protocols (HTTPS, etc.) it supports with
+a `packfile-uris` argument, the server MAY send a `packfile-uris` section
+directly before the `packfile` section (right after `wanted-refs` if it is
+sent) containing URIs of any of the given protocols. The URIs point to
+packfiles that use only features that the client has declared that it supports
+(e.g. ofs-delta and thin-pack). See protocol-v2.txt for the documentation of
+this section.
+
+Clients should then download and index all the given URIs (in addition to
+downloading and indexing the packfile given in the `packfile` section of the
+response) before performing the connectivity check.
+
+Server design
+-------------
+
+The server can be trivially made compatible with the proposed protocol by
+having it advertise `packfile-uris`, tolerating the client sending
+`packfile-uris`, and never sending any `packfile-uris` section. But we should
+include some sort of non-trivial implementation in the Minimum Viable Product,
+at least so that we can test the client.
+
+This is the implementation: a feature, marked experimental, that allows the
+server to be configured by one or more `uploadpack.blobPackfileUri=<sha1>
+<uri>` entries. Whenever the list of objects to be sent is assembled, all such
+blobs are excluded, replaced with URIs. The client will download those URIs,
+expecting them to each point to packfiles containing single blobs.
+
+Client design
+-------------
+
+The client has a config variable `fetch.uriprotocols` that determines which
+protocols the end user is willing to use. By default, this is empty.
+
+When the client downloads the given URIs, it should store them with "keep"
+files, just like it does with the packfile in the `packfile` section. These
+additional "keep" files can only be removed after the refs have been updated -
+just like the "keep" file for the packfile in the `packfile` section.
+
+The division of work (initial fetch + additional URIs) introduces convenient
+points for resumption of an interrupted clone - such resumption can be done
+after the Minimum Viable Product (see "Future work").
+
+Future work
+-----------
+
+The protocol design allows some evolution of the server and client without any
+need for protocol changes, so only a small-scoped design is included here to
+form the MVP. For example, the following can be done:
+
+ * On the server, more sophisticated means of excluding objects (e.g. by
+ specifying a commit to represent that commit and all objects that it
+ references).
+ * On the client, resumption of clone. If a clone is interrupted, information
+ could be recorded in the repository's config and a "clone-resume" command
+ can resume the clone in progress. (Resumption of subsequent fetches is more
+ difficult because that must deal with the user wanting to use the repository
+ even after the fetch was interrupted.)
+
+There are some possible features that will require a change in protocol:
+
+ * Additional HTTP headers (e.g. authentication)
+ * Byte range support
+ * Different file formats referenced by URIs (e.g. raw object)
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 3996d70891..5852f499a6 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -325,13 +325,26 @@ included in the client's request:
indicating its sideband (1, 2, or 3), and the server may send "0005\2"
(a PKT-LINE of sideband 2 with no payload) as a keepalive packet.
+If the 'packfile-uris' feature is advertised, the following argument
+can be included in the client's request as well as the potential
+addition of the 'packfile-uris' section in the server's response as
+explained below.
+
+ packfile-uris <comma-separated list of protocols>
+ Indicates to the server that the client is willing to receive
+ URIs of any of the given protocols in place of objects in the
+ sent packfile. Before performing the connectivity check, the
+ client should download from all given URIs. Currently, the
+ protocols supported are "http" and "https".
+
The response of `fetch` is broken into a number of sections separated by
delimiter packets (0001), with each section beginning with its section
-header.
+header. Most sections are sent only when the packfile is sent.
- output = *section
- section = (acknowledgments | shallow-info | wanted-refs | packfile)
- (flush-pkt | delim-pkt)
+ output = acknowledgements flush-pkt |
+ [acknowledgments delim-pkt] [shallow-info delim-pkt]
+ [wanted-refs delim-pkt] [packfile-uris delim-pkt]
+ packfile flush-pkt
acknowledgments = PKT-LINE("acknowledgments" LF)
(nak | *ack)
@@ -349,13 +362,17 @@ header.
*PKT-LINE(wanted-ref LF)
wanted-ref = obj-id SP refname
+ packfile-uris = PKT-LINE("packfile-uris" LF) *packfile-uri
+ packfile-uri = PKT-LINE(40*(HEXDIGIT) SP *%x20-ff LF)
+
packfile = PKT-LINE("packfile" LF)
*PKT-LINE(%x01-03 *%x00-ff)
acknowledgments section
- * If the client determines that it is finished with negotiations
- by sending a "done" line, the acknowledgments sections MUST be
- omitted from the server's response.
+ * If the client determines that it is finished with negotiations by
+ sending a "done" line (thus requiring the server to send a packfile),
+ the acknowledgments sections MUST be omitted from the server's
+ response.
* Always begins with the section header "acknowledgments"
@@ -406,9 +423,6 @@ header.
which the client has not indicated was shallow as a part of
its request.
- * This section is only included if a packfile section is also
- included in the response.
-
wanted-refs section
* This section is only included if the client has requested a
ref using a 'want-ref' line and if a packfile section is also
@@ -422,6 +436,20 @@ header.
* The server MUST NOT send any refs which were not requested
using 'want-ref' lines.
+ packfile-uris section
+ * This section is only included if the client sent
+ 'packfile-uris' and the server has at least one such URI to
+ send.
+
+ * Always begins with the section header "packfile-uris".
+
+ * For each URI the server sends, it sends a hash of the pack's
+ contents (as output by git index-pack) followed by the URI.
+
+ * The hashes are 40 hex characters long. When Git upgrades to a new
+ hash algorithm, this might need to be updated. (It should match
+ whatever index-pack outputs after "pack\t" or "keep\t".
+
packfile section
* This section is only included if the client has sent 'want'
lines in its request and either requested that no more