summaryrefslogtreecommitdiff
path: root/Documentation/technical
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/technical')
-rw-r--r--Documentation/technical/api-config.txt2
-rw-r--r--Documentation/technical/api-parse-options.txt4
-rw-r--r--Documentation/technical/api-ref-iteration.txt2
-rw-r--r--Documentation/technical/api-trace2.txt211
-rw-r--r--Documentation/technical/commit-graph-format.txt11
-rw-r--r--Documentation/technical/commit-graph.txt210
-rw-r--r--Documentation/technical/directory-rename-detection.txt4
-rw-r--r--Documentation/technical/pack-protocol.txt8
-rw-r--r--Documentation/technical/protocol-v2.txt52
9 files changed, 360 insertions, 144 deletions
diff --git a/Documentation/technical/api-config.txt b/Documentation/technical/api-config.txt
index fa39ac9d71..7d20716c32 100644
--- a/Documentation/technical/api-config.txt
+++ b/Documentation/technical/api-config.txt
@@ -229,7 +229,7 @@ A `config_set` can be used to construct an in-memory cache for
config-like files that the caller specifies (i.e., files like `.gitmodules`,
`~/.gitconfig` etc.). For example,
----------------------------------------
+----------------------------------------
struct config_set gm_config;
git_configset_init(&gm_config);
int b;
diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt
index 2b036d7838..2e2e7c10c6 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -198,8 +198,10 @@ There are some macros to easily define options:
The filename will be prefixed by passing the filename along with
the prefix argument of `parse_options()` to `prefix_filename()`.
-`OPT_ARGUMENT(long, description)`::
+`OPT_ARGUMENT(long, &int_var, description)`::
Introduce a long-option argument that will be kept in `argv[]`.
+ If this option was seen, `int_var` will be set to one (except
+ if a `NULL` pointer was passed).
`OPT_NUMBER_CALLBACK(&var, description, func_ptr)`::
Recognize numerical options like -123 and feed the integer as
diff --git a/Documentation/technical/api-ref-iteration.txt b/Documentation/technical/api-ref-iteration.txt
index 46c3d5c355..ad9d019ff9 100644
--- a/Documentation/technical/api-ref-iteration.txt
+++ b/Documentation/technical/api-ref-iteration.txt
@@ -54,7 +54,7 @@ this:
do not do this you will get an error for each ref that it does not point
to a valid object.
-Note: As a side-effect of this you can not safely assume that all
+Note: As a side-effect of this you cannot safely assume that all
objects you lookup are available in superproject. All submodule objects
will be available the same way as the superprojects objects.
diff --git a/Documentation/technical/api-trace2.txt b/Documentation/technical/api-trace2.txt
index 2de565fa3d..71eb081fed 100644
--- a/Documentation/technical/api-trace2.txt
+++ b/Documentation/technical/api-trace2.txt
@@ -22,21 +22,41 @@ Targets are defined using a VTable allowing easy extension to other
formats in the future. This might be used to define a binary format,
for example.
+Trace2 is controlled using `trace2.*` config values in the system and
+global config files and `GIT_TRACE2*` environment variables. Trace2 does
+not read from repo local or worktree config files or respect `-c`
+command line config settings.
+
== Trace2 Targets
Trace2 defines the following set of Trace2 Targets.
Format details are given in a later section.
-`GIT_TR2` (NORMAL)::
+=== The Normal Format Target
+
+The normal format target is a tradition printf format and similar
+to GIT_TRACE format. This format is enabled with the `GIT_TRACE2`
+environment variable or the `trace2.normalTarget` system or global
+config setting.
+
+For example
- a simple printf format like GIT_TRACE.
-+
------------
-$ export GIT_TR2=~/log.normal
+$ export GIT_TRACE2=~/log.normal
$ git version
git version 2.20.1.155.g426c96fcdb
------------
-+
+
+or
+
+------------
+$ git config --global trace2.normalTarget ~/log.normal
+$ git version
+git version 2.20.1.155.g426c96fcdb
+------------
+
+yields
+
------------
$ cat ~/log.normal
12:28:42.620009 common-main.c:38 version 2.20.1.155.g426c96fcdb
@@ -46,76 +66,86 @@ $ cat ~/log.normal
12:28:42.621250 trace2/tr2_tgt_normal.c:124 atexit elapsed:0.001265 code:0
------------
-`GIT_TR2_PERF` (PERF)::
+=== The Performance Format Target
+
+The performance format target (PERF) is a column-based format to
+replace GIT_TRACE_PERFORMANCE and is suitable for development and
+testing, possibly to complement tools like gprof. This format is
+enabled with the `GIT_TRACE2_PERF` environment variable or the
+`trace2.perfTarget` system or global config setting.
+
+For example
- a column-based format to replace GIT_TRACE_PERFORMANCE suitable for
- development and testing, possibly to complement tools like gprof.
-+
------------
-$ export GIT_TR2_PERF=~/log.perf
+$ export GIT_TRACE2_PERF=~/log.perf
$ git version
git version 2.20.1.155.g426c96fcdb
------------
-+
+
+or
+
+------------
+$ git config --global trace2.perfTarget ~/log.perf
+$ git version
+git version 2.20.1.155.g426c96fcdb
+------------
+
+yields
+
------------
$ cat ~/log.perf
12:28:42.620675 common-main.c:38 | d0 | main | version | | | | | 2.20.1.155.g426c96fcdb
-12:28:42.621001 common-main.c:39 | d0 | main | start | | | | | git version
+12:28:42.621001 common-main.c:39 | d0 | main | start | | 0.001173 | | | git version
12:28:42.621111 git.c:432 | d0 | main | cmd_name | | | | | version (version)
12:28:42.621225 git.c:662 | d0 | main | exit | | 0.001227 | | | code:0
12:28:42.621259 trace2/tr2_tgt_perf.c:211 | d0 | main | atexit | | 0.001265 | | | code:0
------------
-`GIT_TR2_EVENT` (EVENT)::
+=== The Event Format Target
+
+The event format target is a JSON-based format of event data suitable
+for telemetry analysis. This format is enabled with the `GIT_TRACE2_EVENT`
+environment variable or the `trace2.eventTarget` system or global config
+setting.
+
+For example
- a JSON-based format of event data suitable for telemetry analysis.
-+
------------
-$ export GIT_TR2_EVENT=~/log.event
+$ export GIT_TRACE2_EVENT=~/log.event
$ git version
git version 2.20.1.155.g426c96fcdb
------------
-+
-------------
-$ cat ~/log.event
-{"event":"version","sid":"1547659722619736-11614","thread":"main","time":"2019-01-16 17:28:42.620713","file":"common-main.c","line":38,"evt":"1","exe":"2.20.1.155.g426c96fcdb"}
-{"event":"start","sid":"1547659722619736-11614","thread":"main","time":"2019-01-16 17:28:42.621027","file":"common-main.c","line":39,"argv":["git","version"]}
-{"event":"cmd_name","sid":"1547659722619736-11614","thread":"main","time":"2019-01-16 17:28:42.621122","file":"git.c","line":432,"name":"version","hierarchy":"version"}
-{"event":"exit","sid":"1547659722619736-11614","thread":"main","time":"2019-01-16 17:28:42.621236","file":"git.c","line":662,"t_abs":0.001227,"code":0}
-{"event":"atexit","sid":"1547659722619736-11614","thread":"main","time":"2019-01-16 17:28:42.621268","file":"trace2/tr2_tgt_event.c","line":163,"t_abs":0.001265,"code":0}
-------------
-== Enabling a Target
+or
-A Trace2 Target is enabled when the corresponding environment variable
-(`GIT_TR2`, `GIT_TR2_PERF`, or `GIT_TR2_EVENT`) is set. The following
-values are recognized.
-
-`0`::
-`false`::
-
- Disables the target.
-
-`1`::
-`true`::
-
- Enables the target and writes stream to `STDERR`.
+------------
+$ git config --global trace2.eventTarget ~/log.event
+$ git version
+git version 2.20.1.155.g426c96fcdb
+------------
-`[2-9]`::
+yields
- Enables the target and writes to the already opened file descriptor.
+------------
+$ cat ~/log.event
+{"event":"version","sid":"sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.620713Z","file":"common-main.c","line":38,"evt":"1","exe":"2.20.1.155.g426c96fcdb"}
+{"event":"start","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621027Z","file":"common-main.c","line":39,"t_abs":0.001173,"argv":["git","version"]}
+{"event":"cmd_name","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621122Z","file":"git.c","line":432,"name":"version","hierarchy":"version"}
+{"event":"exit","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621236Z","file":"git.c","line":662,"t_abs":0.001227,"code":0}
+{"event":"atexit","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621268Z","file":"trace2/tr2_tgt_event.c","line":163,"t_abs":0.001265,"code":0}
+------------
-`<absolute-pathname>`::
+=== Enabling a Target
- Enables the target, opens and writes to the file in append mode.
+To enable a target, set the corresponding environment variable or
+system or global config value to one of the following:
-`af_unix:[<socket_type>:]<absolute-pathname>`::
+include::../trace2-target-values.txt[]
- Enables the target, opens and writes to a Unix Domain Socket
- (on platforms that support them).
-+
-Socket type can be either `stream` or `dgram`. If the socket type is
-omitted, Git will try both.
+If the target already exists and is a directory, the traces will be
+written to files (one per process) underneath the given directory. They
+will be named according to the last component of the SID (optionally
+followed by a counter to avoid filename collisions).
== Trace2 API
@@ -160,17 +190,23 @@ purposes.
These are concerned with the lifetime of the overall git process.
+`void trace2_initialize_clock()`::
+
+ Initialize the Trace2 start clock and nothing else. This should
+ be called at the very top of main() to capture the process start
+ time and reduce startup order dependencies.
+
`void trace2_initialize()`::
Determines if any Trace2 Targets should be enabled and
- initializes the Trace2 facility. This includes starting the
- elapsed time clocks and thread local storage (TLS).
+ initializes the Trace2 facility. This includes setting up the
+ Trace2 thread local storage (TLS).
+
This function emits a "version" message containing the version of git
and the Trace2 protocol.
+
This function should be called from `main()` as early as possible in
-the life of the process.
+the life of the process after essential process initialization.
`int trace2_is_enabled()`::
@@ -237,15 +273,16 @@ significantly affects program performance or behavior, such as
Emits a "def_param" messages for "important" configuration
settings.
+
-The environment variable `GIT_TR2_CONFIG_PARAMS` can be set to a
+The environment variable `GIT_TRACE2_CONFIG_PARAMS` or the `trace2.configParams`
+config value can be set to a
list of patterns of important configuration settings, for example:
`core.*,remote.*.url`. This function will iterate over all config
settings and emit a "def_param" message for each match.
`void trace2_cmd_set_config(const char *key, const char *value)`::
- Emits a "def_param" message for a specific configuration
- setting IFF it matches the `GIT_TR2_CONFIG_PARAMS` pattern.
+ Emits a "def_param" message for a new or updated key/value
+ pair IF `key` is considered important.
+
This is used to hook into `git_config_set()` and catch any
configuration changes and update a value previously reported by
@@ -412,9 +449,6 @@ recursive tree walk.
=== NORMAL Format
-NORMAL format is enabled when the `GIT_TR2` environment variable is
-set.
-
Events are written as lines of the form:
------------
@@ -431,8 +465,8 @@ Events are written as lines of the form:
Note that this may contain embedded LF or CRLF characters that are
not escaped, so the event may spill across multiple lines.
-If `GIT_TR2_BRIEF` is true, the `time`, `filename`, and `line` fields
-are omitted.
+If `GIT_TRACE2_BRIEF` or `trace2.normalBrief` is true, the `time`, `filename`,
+and `line` fields are omitted.
This target is intended to be more of a summary (like GIT_TRACE) and
less detailed than the other targets. It ignores thread, region, and
@@ -440,9 +474,6 @@ data messages, for example.
=== PERF Format
-PERF format is enabled when the `GIT_TR2_PERF` environment variable
-is set.
-
Events are written as lines of the form:
------------
@@ -502,8 +533,8 @@ This field is in anticipation of in-proc submodules in the future.
15:33:33.532712 wt-status.c:2331 | d0 | main | region_leave | r1 | 0.127568 | 0.001504 | status | label:print
------------
-If `GIT_TR2_PERF_BRIEF` is true, the `time`, `file`, and `line`
-fields are omitted.
+If `GIT_TRACE2_PERF_BRIEF` or `trace2.perfBrief` is true, the `time`, `file`,
+and `line` fields are omitted.
------------
d0 | main | region_leave | r1 | 0.011717 | 0.009122 | index | label:preload
@@ -514,9 +545,6 @@ during development and is quite noisy.
=== EVENT Format
-EVENT format is enabled when the `GIT_TR2_EVENT` environment
-variable is set.
-
Each event is a JSON-object containing multiple key/value pairs
written as a single line and followed by a LF.
@@ -534,11 +562,11 @@ The following key/value pairs are common to all events:
------------
{
"event":"version",
- "sid":"1547659722619736-11614",
+ "sid":"20190408T191827.272759Z-H9b68c35f-P00003510",
"thread":"main",
- "time":"2019-01-16 17:28:42.620713",
+ "time":"2019-04-08T19:18:27.282761Z",
"file":"common-main.c",
- "line":38,
+ "line":42,
...
}
------------
@@ -570,9 +598,9 @@ The following key/value pairs are common to all events:
`"repo":<repo-id>`::
when present, is the integer repo-id as described previously.
-If `GIT_TR2_EVENT_BRIEF` is true, the `file` and `line` fields are omitted
-from all events and the `time` field is only present on the "start" and
-"atexit" events.
+If `GIT_TRACE2_EVENT_BRIEF` or `trace2.eventBrief` is true, the `file`
+and `line` fields are omitted from all events and the `time` field is
+only present on the "start" and "atexit" events.
==== Event-Specific Key/Value Pairs
@@ -595,6 +623,7 @@ from all events and the `time` field is only present on the "start" and
{
"event":"start",
...
+ "t_abs":0.001227, # elapsed time in seconds
"argv":["git","version"]
}
------------
@@ -639,7 +668,7 @@ completed.)
"event":"signal",
...
"t_abs":0.001227, # elapsed time in seconds
- "signal":13 # SIGTERM, SIGINT, etc.
+ "signo":13 # SIGTERM, SIGINT, etc.
}
------------
@@ -882,7 +911,7 @@ visited.
The `category` field may be used in a future enhancement to
do category-based filtering.
+
-The `GIT_TR2_EVENT_NESTING` environment variable can be used to
+`GIT_TRACE2_EVENT_NESTING` or `trace2.eventNesting` can be used to
filter deeply nested regions and data events. It defaults to "2".
`"region_leave"`::
@@ -1010,8 +1039,8 @@ rev-list, and gc. This example also shows that fetch took
5.199 seconds and of that 4.932 was in ssh.
+
----------------
-$ export GIT_TR2_BRIEF=1
-$ export GIT_TR2=~/log.normal
+$ export GIT_TRACE2_BRIEF=1
+$ export GIT_TRACE2=~/log.normal
$ git fetch origin
...
----------------
@@ -1046,8 +1075,8 @@ its name as "gc", it also reports the hierarchy as "fetch/gc".
indented for clarity.)
+
----------------
-$ export GIT_TR2_BRIEF=1
-$ export GIT_TR2=~/log.normal
+$ export GIT_TRACE2_BRIEF=1
+$ export GIT_TRACE2=~/log.normal
$ git fetch origin
...
----------------
@@ -1105,14 +1134,14 @@ In this example, scanning for untracked files ran from +0.012568 to
+0.027149 (since the process started) and took 0.014581 seconds.
+
----------------
-$ export GIT_TR2_PERF_BRIEF=1
-$ export GIT_TR2_PERF=~/log.perf
+$ export GIT_TRACE2_PERF_BRIEF=1
+$ export GIT_TRACE2_PERF=~/log.perf
$ git status
...
$ cat ~/log.perf
d0 | main | version | | | | | 2.20.1.160.g5676107ecd.dirty
-d0 | main | start | | | | | git status
+d0 | main | start | | 0.001173 | | | git status
d0 | main | def_repo | r1 | | | | worktree:/Users/jeffhost/work/gfw
d0 | main | cmd_name | | | | | status (status)
...
@@ -1151,13 +1180,13 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
We can further investigate the time spent scanning for untracked files.
+
----------------
-$ export GIT_TR2_PERF_BRIEF=1
-$ export GIT_TR2_PERF=~/log.perf
+$ export GIT_TRACE2_PERF_BRIEF=1
+$ export GIT_TRACE2_PERF=~/log.perf
$ git status
...
$ cat ~/log.perf
d0 | main | version | | | | | 2.20.1.162.gb4ccea44db.dirty
-d0 | main | start | | | | | git status
+d0 | main | start | | 0.001173 | | | git status
d0 | main | def_repo | r1 | | | | worktree:/Users/jeffhost/work/gfw
d0 | main | cmd_name | | | | | status (status)
...
@@ -1207,13 +1236,13 @@ int read_index_from(struct index_state *istate, const char *path,
This example shows that the index contained 3552 entries.
+
----------------
-$ export GIT_TR2_PERF_BRIEF=1
-$ export GIT_TR2_PERF=~/log.perf
+$ export GIT_TRACE2_PERF_BRIEF=1
+$ export GIT_TRACE2_PERF=~/log.perf
$ git status
...
$ cat ~/log.perf
d0 | main | version | | | | | 2.20.1.156.gf9916ae094.dirty
-d0 | main | start | | | | | git status
+d0 | main | start | | 0.001173 | | | git status
d0 | main | def_repo | r1 | | | | worktree:/Users/jeffhost/work/gfw
d0 | main | cmd_name | | | | | status (status)
d0 | main | region_enter | r1 | 0.001791 | | index | label:do_read_index .git/index
@@ -1281,8 +1310,8 @@ Data events are tagged with the active thread name. They are used
to report the per-thread parameters.
+
----------------
-$ export GIT_TR2_PERF_BRIEF=1
-$ export GIT_TR2_PERF=~/log.perf
+$ export GIT_TRACE2_PERF_BRIEF=1
+$ export GIT_TRACE2_PERF=~/log.perf
$ git status
...
$ cat ~/log.perf
diff --git a/Documentation/technical/commit-graph-format.txt b/Documentation/technical/commit-graph-format.txt
index 16452a0504..a4f17441ae 100644
--- a/Documentation/technical/commit-graph-format.txt
+++ b/Documentation/technical/commit-graph-format.txt
@@ -44,8 +44,9 @@ HEADER:
1-byte number (C) of "chunks"
- 1-byte (reserved for later use)
- Current clients should ignore this value.
+ 1-byte number (B) of base commit-graphs
+ We infer the length (H*B) of the Base Graphs chunk
+ from this value.
CHUNK LOOKUP:
@@ -92,6 +93,12 @@ CHUNK DATA:
positions for the parents until reaching a value with the most-significant
bit on. The other bits correspond to the position of the last parent.
+ Base Graphs List (ID: {'B', 'A', 'S', 'E'}) [Optional]
+ This list of H-byte hashes describe a set of B commit-graph files that
+ form a commit-graph chain. The graph position for the ith commit in this
+ file's OID Lookup chunk is equal to i plus the number of commits in all
+ base graphs. If B is non-zero, this chunk must exist.
+
TRAILER:
H-byte HASH-checksum of all of the above.
diff --git a/Documentation/technical/commit-graph.txt b/Documentation/technical/commit-graph.txt
index 7805b0968c..729fbcb32f 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -127,22 +127,196 @@ Design Details
helpful for these clones, anyway. The commit-graph will not be read or
written when shallow commits are present.
-Future Work
------------
-
-- After computing and storing generation numbers, we must make graph
- walks aware of generation numbers to gain the performance benefits they
- enable. This will mostly be accomplished by swapping a commit-date-ordered
- priority queue with one ordered by generation number. The following
- operations are important candidates:
-
- - 'log --topo-order'
- - 'tag --merged'
-
-- A server could provide a commit-graph file as part of the network protocol
- to avoid extra calculations by clients. This feature is only of benefit if
- the user is willing to trust the file, because verifying the file is correct
- is as hard as computing it from scratch.
+Commit Graphs Chains
+--------------------
+
+Typically, repos grow with near-constant velocity (commits per day). Over time,
+the number of commits added by a fetch operation is much smaller than the
+number of commits in the full history. By creating a "chain" of commit-graphs,
+we enable fast writes of new commit data without rewriting the entire commit
+history -- at least, most of the time.
+
+## File Layout
+
+A commit-graph chain uses multiple files, and we use a fixed naming convention
+to organize these files. Each commit-graph file has a name
+`$OBJDIR/info/commit-graphs/graph-{hash}.graph` where `{hash}` is the hex-
+valued hash stored in the footer of that file (which is a hash of the file's
+contents before that hash). For a chain of commit-graph files, a plain-text
+file at `$OBJDIR/info/commit-graphs/commit-graph-chain` contains the
+hashes for the files in order from "lowest" to "highest".
+
+For example, if the `commit-graph-chain` file contains the lines
+
+```
+ {hash0}
+ {hash1}
+ {hash2}
+```
+
+then the commit-graph chain looks like the following diagram:
+
+ +-----------------------+
+ | graph-{hash2}.graph |
+ +-----------------------+
+ |
+ +-----------------------+
+ | |
+ | graph-{hash1}.graph |
+ | |
+ +-----------------------+
+ |
+ +-----------------------+
+ | |
+ | |
+ | |
+ | graph-{hash0}.graph |
+ | |
+ | |
+ | |
+ +-----------------------+
+
+Let X0 be the number of commits in `graph-{hash0}.graph`, X1 be the number of
+commits in `graph-{hash1}.graph`, and X2 be the number of commits in
+`graph-{hash2}.graph`. If a commit appears in position i in `graph-{hash2}.graph`,
+then we interpret this as being the commit in position (X0 + X1 + i), and that
+will be used as its "graph position". The commits in `graph-{hash2}.graph` use these
+positions to refer to their parents, which may be in `graph-{hash1}.graph` or
+`graph-{hash0}.graph`. We can navigate to an arbitrary commit in position j by checking
+its containment in the intervals [0, X0), [X0, X0 + X1), [X0 + X1, X0 + X1 +
+X2).
+
+Each commit-graph file (except the base, `graph-{hash0}.graph`) contains data
+specifying the hashes of all files in the lower layers. In the above example,
+`graph-{hash1}.graph` contains `{hash0}` while `graph-{hash2}.graph` contains
+`{hash0}` and `{hash1}`.
+
+## Merging commit-graph files
+
+If we only added a new commit-graph file on every write, we would run into a
+linear search problem through many commit-graph files. Instead, we use a merge
+strategy to decide when the stack should collapse some number of levels.
+
+The diagram below shows such a collapse. As a set of new commits are added, it
+is determined by the merge strategy that the files should collapse to
+`graph-{hash1}`. Thus, the new commits, the commits in `graph-{hash2}` and
+the commits in `graph-{hash1}` should be combined into a new `graph-{hash3}`
+file.
+
+ +---------------------+
+ | |
+ | (new commits) |
+ | |
+ +---------------------+
+ | |
+ +-----------------------+ +---------------------+
+ | graph-{hash2} |->| |
+ +-----------------------+ +---------------------+
+ | | |
+ +-----------------------+ +---------------------+
+ | | | |
+ | graph-{hash1} |->| |
+ | | | |
+ +-----------------------+ +---------------------+
+ | tmp_graphXXX
+ +-----------------------+
+ | |
+ | |
+ | |
+ | graph-{hash0} |
+ | |
+ | |
+ | |
+ +-----------------------+
+
+During this process, the commits to write are combined, sorted and we write the
+contents to a temporary file, all while holding a `commit-graph-chain.lock`
+lock-file. When the file is flushed, we rename it to `graph-{hash3}`
+according to the computed `{hash3}`. Finally, we write the new chain data to
+`commit-graph-chain.lock`:
+
+```
+ {hash3}
+ {hash0}
+```
+
+We then close the lock-file.
+
+## Merge Strategy
+
+When writing a set of commits that do not exist in the commit-graph stack of
+height N, we default to creating a new file at level N + 1. We then decide to
+merge with the Nth level if one of two conditions hold:
+
+ 1. `--size-multiple=<X>` is specified or X = 2, and the number of commits in
+ level N is less than X times the number of commits in level N + 1.
+
+ 2. `--max-commits=<C>` is specified with non-zero C and the number of commits
+ in level N + 1 is more than C commits.
+
+This decision cascades down the levels: when we merge a level we create a new
+set of commits that then compares to the next level.
+
+The first condition bounds the number of levels to be logarithmic in the total
+number of commits. The second condition bounds the total number of commits in
+a `graph-{hashN}` file and not in the `commit-graph` file, preventing
+significant performance issues when the stack merges and another process only
+partially reads the previous stack.
+
+The merge strategy values (2 for the size multiple, 64,000 for the maximum
+number of commits) could be extracted into config settings for full
+flexibility.
+
+## Deleting graph-{hash} files
+
+After a new tip file is written, some `graph-{hash}` files may no longer
+be part of a chain. It is important to remove these files from disk, eventually.
+The main reason to delay removal is that another process could read the
+`commit-graph-chain` file before it is rewritten, but then look for the
+`graph-{hash}` files after they are deleted.
+
+To allow holding old split commit-graphs for a while after they are unreferenced,
+we update the modified times of the files when they become unreferenced. Then,
+we scan the `$OBJDIR/info/commit-graphs/` directory for `graph-{hash}`
+files whose modified times are older than a given expiry window. This window
+defaults to zero, but can be changed using command-line arguments or a config
+setting.
+
+## Chains across multiple object directories
+
+In a repo with alternates, we look for the `commit-graph-chain` file starting
+in the local object directory and then in each alternate. The first file that
+exists defines our chain. As we look for the `graph-{hash}` files for
+each `{hash}` in the chain file, we follow the same pattern for the host
+directories.
+
+This allows commit-graphs to be split across multiple forks in a fork network.
+The typical case is a large "base" repo with many smaller forks.
+
+As the base repo advances, it will likely update and merge its commit-graph
+chain more frequently than the forks. If a fork updates their commit-graph after
+the base repo, then it should "reparent" the commit-graph chain onto the new
+chain in the base repo. When reading each `graph-{hash}` file, we track
+the object directory containing it. During a write of a new commit-graph file,
+we check for any changes in the source object directory and read the
+`commit-graph-chain` file for that source and create a new file based on those
+files. During this "reparent" operation, we necessarily need to collapse all
+levels in the fork, as all of the files are invalid against the new base file.
+
+It is crucial to be careful when cleaning up "unreferenced" `graph-{hash}.graph`
+files in this scenario. It falls to the user to define the proper settings for
+their custom environment:
+
+ 1. When merging levels in the base repo, the unreferenced files may still be
+ referenced by chains from fork repos.
+
+ 2. The expiry time should be set to a length of time such that every fork has
+ time to recompute their commit-graph chain to "reparent" onto the new base
+ file(s).
+
+ 3. If the commit-graph chain is updated in the base, the fork will not have
+ access to the new chain until its chain is updated to reference those files.
+ (This may change in the future [5].)
Related Links
-------------
@@ -170,3 +344,7 @@ Related Links
[4] https://public-inbox.org/git/20180108154822.54829-1-git@jeffhostetler.com/T/#u
A patch to remove the ahead-behind calculation from 'status'.
+
+[5] https://public-inbox.org/git/f27db281-abad-5043-6d71-cbb083b1c877@gmail.com/
+ A discussion of a "two-dimensional graph position" that can allow reading
+ multiple commit-graph chains at the same time.
diff --git a/Documentation/technical/directory-rename-detection.txt b/Documentation/technical/directory-rename-detection.txt
index 1c0086e287..844629c8c4 100644
--- a/Documentation/technical/directory-rename-detection.txt
+++ b/Documentation/technical/directory-rename-detection.txt
@@ -20,8 +20,8 @@ More interesting possibilities exist, though, such as:
* one side of history renames x -> z, and the other renames some file to
x/e, causing the need for the merge to do a transitive rename.
- * one side of history renames x -> z, but also renames all files within
- x. For example, x/a -> z/alpha, x/b -> z/bravo, etc.
+ * one side of history renames x -> z, but also renames all files within x.
+ For example, x/a -> z/alpha, x/b -> z/bravo, etc.
* both 'x' and 'y' being merged into a single directory 'z', with a
directory rename being detected for both x->z and y->z.
diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt
index 7a2375a55d..c73e72de0e 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -657,14 +657,14 @@ can be rejected.
An example client/server communication might look like this:
----
- S: 007c74730d410fcb6603ace96f1dc55ea6196122532d refs/heads/local\0report-status delete-refs ofs-delta\n
+ S: 006274730d410fcb6603ace96f1dc55ea6196122532d refs/heads/local\0report-status delete-refs ofs-delta\n
S: 003e7d1665144a3a975c05f1f43902ddaf084e784dbe refs/heads/debug\n
S: 003f74730d410fcb6603ace96f1dc55ea6196122532d refs/heads/master\n
- S: 003f74730d410fcb6603ace96f1dc55ea6196122532d refs/heads/team\n
+ S: 003d74730d410fcb6603ace96f1dc55ea6196122532d refs/heads/team\n
S: 0000
- C: 003e7d1665144a3a975c05f1f43902ddaf084e784dbe 74730d410fcb6603ace96f1dc55ea6196122532d refs/heads/debug\n
- C: 003e74730d410fcb6603ace96f1dc55ea6196122532d 5a3f6be755bbb7deae50065988cbfa1ffa9ab68a refs/heads/master\n
+ C: 00677d1665144a3a975c05f1f43902ddaf084e784dbe 74730d410fcb6603ace96f1dc55ea6196122532d refs/heads/debug\n
+ C: 006874730d410fcb6603ace96f1dc55ea6196122532d 5a3f6be755bbb7deae50065988cbfa1ffa9ab68a refs/heads/master\n
C: 0000
C: [PACKDATA]
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index ead85ce35c..03264c7d9a 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -1,5 +1,5 @@
- Git Wire Protocol, Version 2
-==============================
+Git Wire Protocol, Version 2
+============================
This document presents a specification for a version 2 of Git's wire
protocol. Protocol v2 will improve upon v1 in the following ways:
@@ -22,8 +22,8 @@ will be commands which a client can request be executed. Once a command
has completed, a client can reuse the connection and request that other
commands be executed.
- Packet-Line Framing
----------------------
+Packet-Line Framing
+-------------------
All communication is done using packet-line framing, just as in v1. See
`Documentation/technical/pack-protocol.txt` and
@@ -34,8 +34,8 @@ In protocol v2 these special packets will have the following semantics:
* '0000' Flush Packet (flush-pkt) - indicates the end of a message
* '0001' Delimiter Packet (delim-pkt) - separates sections of a message
- Initial Client Request
-------------------------
+Initial Client Request
+----------------------
In general a client can request to speak protocol v2 by sending
`version=2` through the respective side-channel for the transport being
@@ -43,22 +43,22 @@ used which inevitably sets `GIT_PROTOCOL`. More information can be
found in `pack-protocol.txt` and `http-protocol.txt`. In all cases the
response from the server is the capability advertisement.
- Git Transport
-~~~~~~~~~~~~~~~
+Git Transport
+~~~~~~~~~~~~~
When using the git:// transport, you can request to use protocol v2 by
sending "version=2" as an extra parameter:
003egit-upload-pack /project.git\0host=myserver.com\0\0version=2\0
- SSH and File Transport
-~~~~~~~~~~~~~~~~~~~~~~~~
+SSH and File Transport
+~~~~~~~~~~~~~~~~~~~~~~
When using either the ssh:// or file:// transport, the GIT_PROTOCOL
environment variable must be set explicitly to include "version=2".
- HTTP Transport
-~~~~~~~~~~~~~~~~
+HTTP Transport
+~~~~~~~~~~~~~~
When using the http:// or https:// transport a client makes a "smart"
info/refs request as described in `http-protocol.txt` and requests that
@@ -79,8 +79,8 @@ A v2 server would reply:
Subsequent requests are then made directly to the service
`$GIT_URL/git-upload-pack`. (This works the same for git-receive-pack).
- Capability Advertisement
---------------------------
+Capability Advertisement
+------------------------
A server which decides to communicate (based on a request from a client)
using protocol version 2, notifies the client by sending a version string
@@ -101,8 +101,8 @@ to be executed by the client.
key = 1*(ALPHA | DIGIT | "-_")
value = 1*(ALPHA | DIGIT | " -_.,?\/{}[]()<>!@#$%^&*+=:;")
- Command Request
------------------
+Command Request
+---------------
After receiving the capability advertisement, a client can then issue a
request to select the command it wants with any particular capabilities
@@ -137,8 +137,8 @@ command be executed or can terminate the connection. A client may
optionally send an empty request consisting of just a flush-pkt to
indicate that no more requests will be made.
- Capabilities
---------------
+Capabilities
+------------
There are two different types of capabilities: normal capabilities,
which can be used to to convey information or alter the behavior of a
@@ -153,8 +153,8 @@ management on the server side in order to function correctly. This
permits simple round-robin load-balancing on the server side, without
needing to worry about state management.
- agent
-~~~~~~~
+agent
+~~~~~
The server can advertise the `agent` capability with a value `X` (in the
form `agent=X`) to notify the client that the server is running version
@@ -168,8 +168,8 @@ printable ASCII characters except space (i.e., the byte range 32 < x <
and debugging purposes, and MUST NOT be used to programmatically assume
the presence or absence of particular features.
- ls-refs
-~~~~~~~~~
+ls-refs
+~~~~~~~
`ls-refs` is the command used to request a reference advertisement in v2.
Unlike the current reference advertisement, ls-refs takes in arguments
@@ -199,8 +199,8 @@ The output of ls-refs is as follows:
symref = "symref-target:" symref-target
peeled = "peeled:" obj-id
- fetch
-~~~~~~~
+fetch
+~~~~~
`fetch` is the command used to fetch a packfile in v2. It can be looked
at as a modified version of the v1 fetch where the ref-advertisement is
@@ -444,8 +444,8 @@ header.
2 - progress messages
3 - fatal error message just before stream aborts
- server-option
-~~~~~~~~~~~~~~~
+server-option
+~~~~~~~~~~~~~
If advertised, indicates that any number of server specific options can be
included in a request. This is done by sending each option as a