Diffstat (limited to 'Documentation/technical')
-rw-r--r-- Documentation/technical/api-argv-array.txt 46
-rw-r--r-- Documentation/technical/api-builtin.txt 2
-rw-r--r-- Documentation/technical/api-diff.txt 4
-rw-r--r-- Documentation/technical/api-gitattributes.txt 61
-rw-r--r-- Documentation/technical/api-merge.txt 104
-rw-r--r-- Documentation/technical/api-parse-options.txt 29
-rw-r--r-- Documentation/technical/api-ref-iteration.txt 81
-rw-r--r-- Documentation/technical/api-run-command.txt 57
-rw-r--r-- Documentation/technical/api-sha1-array.txt 79
-rw-r--r-- Documentation/technical/api-sigchain.txt 41
-rw-r--r-- Documentation/technical/api-string-list.txt 20
-rw-r--r-- Documentation/technical/api-tree-walking.txt 2
-rw-r--r-- Documentation/technical/index-format.txt 186
-rw-r--r-- Documentation/technical/pack-protocol.txt 113
-rw-r--r-- Documentation/technical/protocol-capabilities.txt 2
15 files changed, 745 insertions, 82 deletions
diff --git a/Documentation/technical/api-argv-array.txt b/Documentation/technical/api-argv-array.txt
new file mode 100644
index 0000000000..49b3d52952
--- /dev/null
+++ b/Documentation/technical/api-argv-array.txt
@@ -0,0 +1,46 @@
+argv-array API
+==============
+
+The argv-array API allows one to dynamically build and store
+NULL-terminated lists. An argv-array maintains the invariant that the
+`argv` member always points to a non-NULL array, and that the array is
+always NULL-terminated at the element pointed to by `argv[argc]`. This
+makes the result suitable for passing to functions expecting to receive
+argv from main(), or the link:api-run-command.html[run-command API].
+
+The link:api-string-list.html[string-list API] is similar, but cannot be
+used for these purposes; instead of storing a straight string pointer,
+it contains an item structure with a `util` field that is not compatible
+with the traditional argv interface.
+
+Each `argv_array` manages its own memory. Any strings pushed into the
+array are duplicated, and all memory is freed by argv_array_clear().
+
+Data Structures
+---------------
+
+`struct argv_array`::
+
+ A single array. This should be initialized by assignment from
+ `ARGV_ARRAY_INIT`, or by calling `argv_array_init`. The `argv`
+ member contains the actual array; the `argc` member contains the
+ number of elements in the array, not including the terminating
+ NULL.
+
+Functions
+---------
+
+`argv_array_init`::
+ Initialize an array. This is no different than assigning from
+ `ARGV_ARRAY_INIT`.
+
+`argv_array_push`::
+ Push a copy of a string onto the end of the array.
+
+`argv_array_pushf`::
+ Format a string and push it onto the end of the array. This is a
+ convenience wrapper combining `strbuf_addf` and `argv_array_push`.
+
+`argv_array_clear`::
+ Free all memory associated with the array and return it to the
+ initial, empty state.
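
The invariant described in api-argv-array.txt (a non-NULL `argv` that is always NULL-terminated at `argv[argc]`) can be illustrated with a small standalone sketch. This is not git's implementation: the `sketch_` names, the shared empty array, and the growth policy are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Shared empty array so that argv is non-NULL even before the first push. */
static const char *sketch_empty[] = { NULL };

/* Hypothetical stand-in for struct argv_array; field names follow the doc. */
struct sketch_argv {
	const char **argv;	/* never NULL, always NULL-terminated */
	int argc;		/* number of strings, excluding the NULL */
	int alloc;		/* slots allocated once argv lives on the heap */
};

#define SKETCH_ARGV_INIT { sketch_empty, 0, 0 }

/* Push a private copy of value and re-establish the NULL terminator. */
static void sketch_argv_push(struct sketch_argv *a, const char *value)
{
	size_t len;
	char *copy;

	assert(value != NULL);
	len = strlen(value) + 1;
	copy = malloc(len);
	if (!copy)
		abort();
	memcpy(copy, value, len);

	if (a->argv == sketch_empty) {
		/* first push: switch from the shared array to heap memory */
		a->alloc = 8;
		a->argv = malloc(a->alloc * sizeof(*a->argv));
	} else if (a->argc + 2 > a->alloc) {
		/* need room for the new string plus the terminating NULL */
		a->alloc *= 2;
		a->argv = realloc(a->argv, a->alloc * sizeof(*a->argv));
	}
	if (!a->argv)
		abort();

	a->argv[a->argc++] = copy;
	a->argv[a->argc] = NULL;
}

/* Free every copied string and return to the initial, empty state. */
static void sketch_argv_clear(struct sketch_argv *a)
{
	if (a->argv != sketch_empty) {
		int i;
		for (i = 0; i < a->argc; i++)
			free((char *)a->argv[i]);
		free(a->argv);
	}
	a->argv = sketch_empty;
	a->argc = 0;
	a->alloc = 0;
}
```

Because the terminator is rewritten on every push, `a.argv` can be handed to any consumer of a main()-style argv at any point between pushes.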
diff --git a/Documentation/technical/api-builtin.txt b/Documentation/technical/api-builtin.txt
index 5cb2b0590a..b0cafe87be 100644
--- a/Documentation/technical/api-builtin.txt
+++ b/Documentation/technical/api-builtin.txt
@@ -49,6 +49,8 @@ Additionally, if `foo` is a new command, there are 3 more things to do:
. Add an entry for `git-foo` to `command-list.txt`.
+. Add an entry for `/git-foo` to `.gitignore`.
+
How a built-in is called
------------------------
diff --git a/Documentation/technical/api-diff.txt b/Documentation/technical/api-diff.txt
index 20b0241d30..2d2ebc04b7 100644
--- a/Documentation/technical/api-diff.txt
+++ b/Documentation/technical/api-diff.txt
@@ -32,7 +32,7 @@ Calling sequence
* As you find different pairs of files, call `diff_change()` to feed
modified files, `diff_addremove()` to feed created or deleted files,
- or `diff_unmerged()` to feed a file whose state is 'unmerged' to the
+ or `diff_unmerge()` to feed a file whose state is 'unmerged' to the
API. These are thin wrappers to a lower-level `diff_queue()` function
that is flexible enough to record any of these kinds of changes.
@@ -50,7 +50,7 @@ Data structures
This is the internal representation for a single file (blob). It
records the blob object name (if known -- for a work tree file it
typically is a NUL SHA-1), filemode and pathname. This is what the
-`diff_addremove()`, `diff_change()` and `diff_unmerged()` synthesize and
+`diff_addremove()`, `diff_change()` and `diff_unmerge()` synthesize and
feed `diff_queue()` function with.
* `struct diff_filepair`
diff --git a/Documentation/technical/api-gitattributes.txt b/Documentation/technical/api-gitattributes.txt
index 9d97eaa9de..ce363b6305 100644
--- a/Documentation/technical/api-gitattributes.txt
+++ b/Documentation/technical/api-gitattributes.txt
@@ -11,27 +11,15 @@ Data Structure
`struct git_attr`::
An attribute is an opaque object that is identified by its name.
- Pass the name and its length to `git_attr()` function to obtain
- the object of this type. The internal representation of this
- structure is of no interest to the calling programs.
+ Pass the name to `git_attr()` function to obtain the object of
+ this type. The internal representation of this structure is
+ of no interest to the calling programs. The name of the
+ attribute can be retrieved by calling `git_attr_name()`.
`struct git_attr_check`::
This structure represents a set of attributes to check in a call
- to `git_checkattr()` function, and receives the results.
-
-
-Calling Sequence
-----------------
-
-* Prepare an array of `struct git_attr_check` to define the list of
- attributes you would want to check. To populate this array, you would
- need to define necessary attributes by calling `git_attr()` function.
-
-* Call git_checkattr() to check the attributes for the path.
-
-* Inspect `git_attr_check` structure to see how each of the attribute in
- the array is defined for the path.
+ to `git_check_attr()` function, and receives the results.
Attribute Values
@@ -57,6 +45,19 @@ If none of the above returns true, `.value` member points at a string
value of the attribute for the path.
+Querying Specific Attributes
+----------------------------
+
+* Prepare an array of `struct git_attr_check` to define the list of
+ attributes you would want to check. To populate this array, you would
+ need to define necessary attributes by calling `git_attr()` function.
+
+* Call `git_check_attr()` to check the attributes for the path.
+
+* Inspect the `git_attr_check` structures to see how each attribute in
+ the array is defined for the path.
+
+
Example
-------
@@ -72,18 +73,18 @@ static void setup_check(void)
{
if (check[0].attr)
return; /* already done */
- check[0].attr = git_attr("crlf", 4);
- check[1].attr = git_attr("ident", 5);
+ check[0].attr = git_attr("crlf");
+ check[1].attr = git_attr("ident");
}
------------
-. Call `git_checkattr()` with the prepared array of `struct git_attr_check`:
+. Call `git_check_attr()` with the prepared array of `struct git_attr_check`:
------------
const char *path;
setup_check();
- git_checkattr(path, ARRAY_SIZE(check), check);
+ git_check_attr(path, ARRAY_SIZE(check), check);
------------
. Act on `.value` member of the result, left in `check[]`:
@@ -108,4 +109,20 @@ static void setup_check(void)
}
------------
-(JC)
+
+Querying All Attributes
+-----------------------
+
+To get the values of all attributes associated with a file:
+
+* Call `git_all_attrs()`, which returns an array of `git_attr_check`
+ structures.
+
+* Iterate over the `git_attr_check` array to examine the attribute
+ names and values. The name of the attribute described by a
+ `git_attr_check` object can be retrieved via
+ `git_attr_name(check[i].attr)`. (Please note that no items will be
+ returned for unset attributes, so `ATTR_UNSET()` will return false
+ for all returned `git_attr_check` objects.)
+
+* Free the `git_attr_check` array.
diff --git a/Documentation/technical/api-merge.txt b/Documentation/technical/api-merge.txt
new file mode 100644
index 0000000000..9dc1bed768
--- /dev/null
+++ b/Documentation/technical/api-merge.txt
@@ -0,0 +1,104 @@
+merge API
+=========
+
+The merge API helps a program to reconcile two competing sets of
+improvements to some files (e.g., unregistered changes from the work
+tree versus changes involved in switching to a new branch), reporting
+conflicts if found. The library called through this API is
+responsible for a few things.
+
+ * determining which trees to merge (recursive ancestor consolidation);
+
+ * lining up corresponding files in the trees to be merged (rename
+ detection, subtree shifting), reporting edge cases like add/add
+ and rename/rename conflicts to the user;
+
+ * performing a three-way merge of corresponding files, taking
+ path-specific merge drivers (specified in `.gitattributes`)
+ into account.
+
+Data structures
+---------------
+
+* `mmbuffer_t`, `mmfile_t`
+
+These store data for use by the xdiff backend, for writing and
+for reading, respectively. See `xdiff/xdiff.h` for the definitions
+and `diff.c` for examples.
+
+* `struct ll_merge_options`
+
+This describes the set of options the calling program wants to affect
+the operation of a low-level (single file) merge. Some options:
+
+`virtual_ancestor`::
+ Behave as though this were part of a merge between common
+ ancestors in a recursive merge.
+ If a helper program is specified by the
+ `[merge "<driver>"] recursive` configuration, it will
+ be used (see linkgit:gitattributes[5]).
+
+`variant`::
+ Resolve local conflicts automatically in favor
+ of one side or the other (as in 'git merge-file'
+ `--ours`/`--theirs`/`--union`). Can be `0`,
+ `XDL_MERGE_FAVOR_OURS`, `XDL_MERGE_FAVOR_THEIRS`, or
+ `XDL_MERGE_FAVOR_UNION`.
+
+`renormalize`::
+ Resmudge and clean the "base", "theirs" and "ours" files
+ before merging. Use this when the merge is likely to have
+ overlapped with a change in smudge/clean or end-of-line
+ normalization rules.
+
+Low-level (single file) merge
+-----------------------------
+
+`ll_merge`::
+
+ Perform a three-way single-file merge in core. This is
+ a thin wrapper around `xdl_merge` that takes the path and
+ any merge backend specified in `.gitattributes` or
+ `.git/info/attributes` into account. Returns 0 for a
+ clean merge.
+
+Calling sequence:
+
+* Prepare a `struct ll_merge_options` to record options.
+ If you have no special requests, skip this and pass `NULL`
+ as the `opts` parameter to use the default options.
+
+* Allocate an mmbuffer_t variable for the result.
+
+* Allocate and fill variables with the file's original content
+ and two modified versions (using `read_mmfile`, for example).
+
+* Call `ll_merge()`.
+
+* Read the merged content from `result_buf.ptr` and `result_buf.size`.
+
+* Release buffers when finished. A simple
+ `free(ancestor.ptr); free(ours.ptr); free(theirs.ptr);
+ free(result_buf.ptr);` will do.
+
+If the modifications do not merge cleanly, `ll_merge` will return a
+nonzero value and `result_buf` will generally include a description of
+the conflict bracketed by markers such as the traditional `<<<<<<<`
+and `>>>>>>>`.
+
+The `ancestor_label`, `our_label`, and `their_label` parameters are
+used to label the different sides of a conflict if the merge driver
+supports this.
+
+Everything else
+---------------
+
+Talk about <merge-recursive.h> and merge_file():
+
+ - merge_trees() to merge with rename detection
+ - merge_recursive() for ancestor consolidation
+ - try_merge_command() for other strategies
+ - conflict format
+ - merge options
+
+(Daniel, Miklos, Stephan, JC)
diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt
index 50f9e9ac17..f6a4a361bd 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -115,13 +115,19 @@ There are some macros to easily define options:
`OPT__ABBREV(&int_var)`::
Add `\--abbrev[=<n>]`.
-`OPT__DRY_RUN(&int_var)`::
+`OPT__COLOR(&int_var, description)`::
+ Add `\--color[=<when>]` and `--no-color`.
+
+`OPT__DRY_RUN(&int_var, description)`::
Add `-n, \--dry-run`.
-`OPT__QUIET(&int_var)`::
+`OPT__FORCE(&int_var, description)`::
+ Add `-f, \--force`.
+
+`OPT__QUIET(&int_var, description)`::
Add `-q, \--quiet`.
-`OPT__VERBOSE(&int_var)`::
+`OPT__VERBOSE(&int_var, description)`::
Add `-v, \--verbose`.
`OPT_GROUP(description)`::
@@ -183,13 +189,22 @@ There are some macros to easily define options:
arguments. Short options that happen to be digits take
precedence over it.
+`OPT_COLOR_FLAG(short, long, &int_var, description)`::
+ Introduce an option that takes an optional argument that can
+ have one of three values: "always", "never", or "auto". If the
+ argument is not given, it defaults to "always". The `--no-` form
+ works like `--long=never`; it cannot take an argument. If
+ "always", set `int_var` to 1; if "never", set `int_var` to 0; if
+ "auto", set `int_var` to 1 if stdout is a tty or a pager,
+ 0 otherwise.
+
The last element of the array must be `OPT_END()`.
If not stated otherwise, interpret the arguments as follows:
* `short` is a character for the short option
- (e.g. `\'e\'` for `-e`, use `0` to omit),
+ (e.g. `{apostrophe}e{apostrophe}` for `-e`, use `0` to omit),
* `long` is a string for the long option
(e.g. `"example"` for `\--example`, use `NULL` to omit),
@@ -216,10 +231,10 @@ The function must be defined in this form:
The callback mechanism is as follows:
* Inside `func`, the only interesting member of the structure
- given by `opt` is the void pointer `opt->value`.
- `\*opt->value` will be the value that is saved into `var`, if you
+ given by `opt` is the void pointer `opt\->value`.
+ `\*opt\->value` will be the value that is saved into `var`, if you
use `OPT_CALLBACK()`.
- For example, do `*(unsigned long *)opt->value = 42;` to get 42
+ For example, do `*(unsigned long *)opt\->value = 42;` to get 42
into an `unsigned long` variable.
* Return value `0` indicates success and non-zero return
diff --git a/Documentation/technical/api-ref-iteration.txt b/Documentation/technical/api-ref-iteration.txt
new file mode 100644
index 0000000000..dbbea95db7
--- /dev/null
+++ b/Documentation/technical/api-ref-iteration.txt
@@ -0,0 +1,81 @@
+ref iteration API
+=================
+
+
+Iteration of refs is done by using an iterate function which will call a
+callback function for every ref. The callback function has this
+signature:
+
+ int handle_one_ref(const char *refname, const unsigned char *sha1,
+ int flags, void *cb_data);
+
+There are different kinds of iterate functions which all take a
+callback of this type. The callback is then called for each found ref
+until the callback returns nonzero. The returned value is then also
+returned by the iterate function.
+
+Iteration functions
+-------------------
+
+* `head_ref()` just iterates the head ref.
+
+* `for_each_ref()` iterates all refs.
+
+* `for_each_ref_in()` iterates all refs which have a defined prefix and
+ strips that prefix from the passed variable refname.
+
+* `for_each_tag_ref()`, `for_each_branch_ref()`, `for_each_remote_ref()`,
+ `for_each_replace_ref()` iterate refs from the respective area.
+
+* `for_each_glob_ref()` iterates all refs that match the specified glob
+ pattern.
+
+* `for_each_glob_ref_in()` combines the previous function with `for_each_ref_in()`.
+
+* `head_ref_submodule()`, `for_each_ref_submodule()`,
+ `for_each_ref_in_submodule()`, `for_each_tag_ref_submodule()`,
+ `for_each_branch_ref_submodule()`, `for_each_remote_ref_submodule()`
+ do the same as the functions described above but for a specified
+ submodule.
+
+* `for_each_rawref()` can be used to learn about broken refs and symrefs.
+
+* `for_each_reflog()` iterates each reflog file.
+
+Submodules
+----------
+
+If you want to iterate the refs of a submodule you first need to add the
+submodule's object database. You can do this with a code snippet like
+this:
+
+ const char *path = "path/to/submodule";
+ if (!add_submodule_odb(path))
+ die("Error submodule '%s' not populated.", path);
+
+`add_submodule_odb()` will return a non-zero value on success. If you
+do not do this you will get an error for each ref, saying that it does
+not point to a valid object.
+
+Note: As a side effect of this you cannot safely assume that all
+objects you look up are available in the superproject. All submodule
+objects will be available the same way as the superproject's objects.
+
+Example:
+--------
+
+----
+static int handle_remote_ref(const char *refname,
+ const unsigned char *sha1, int flags, void *cb_data)
+{
+ struct strbuf *output = cb_data;
+ strbuf_addf(output, "%s\n", refname);
+ return 0;
+}
+
+...
+
+ struct strbuf output = STRBUF_INIT;
+ for_each_remote_ref(handle_remote_ref, &output);
+ printf("%s", output.buf);
+----
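
The callback contract in api-ref-iteration.txt (call the callback for each ref until it returns nonzero, then return that value from the iterate function) can be sketched without any repository access. The in-memory ref table and all `sketch_` names below are hypothetical; only the callback signature is taken from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same shape as the callback signature quoted in the documentation. */
typedef int sketch_each_ref_fn(const char *refname, const unsigned char *sha1,
			       int flags, void *cb_data);

/* Hypothetical in-memory ref table; real refs come from the repository. */
struct sketch_ref {
	const char *name;
	unsigned char sha1[20];
};

/* Call fn for every ref; stop at the first nonzero return and pass it on. */
static int sketch_for_each_ref(const struct sketch_ref *refs, size_t nr,
			       sketch_each_ref_fn *fn, void *cb_data)
{
	size_t i;

	assert(fn != NULL);
	for (i = 0; i < nr; i++) {
		int ret = fn(refs[i].name, refs[i].sha1, 0, cb_data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback state: count visited refs and stop at a wanted name. */
struct sketch_find {
	const char *wanted;
	int seen;
};

static int sketch_find_ref(const char *refname, const unsigned char *sha1,
			   int flags, void *cb_data)
{
	struct sketch_find *find = cb_data;

	(void)sha1;	/* unused in this example */
	(void)flags;	/* unused in this example */
	find->seen++;
	return !strcmp(refname, find->wanted);	/* nonzero ends the iteration */
}
```

Returning nonzero from the callback is therefore the way to end an iteration early, and the caller can tell from the iterate function's return value whether that happened.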
diff --git a/Documentation/technical/api-run-command.txt b/Documentation/technical/api-run-command.txt
index 68bf4cad8b..f18b4f4817 100644
--- a/Documentation/technical/api-run-command.txt
+++ b/Documentation/technical/api-run-command.txt
@@ -64,8 +64,8 @@ The functions above do the following:
`start_async`::
Run a function asynchronously. Takes a pointer to a `struct
- async` that specifies the details and returns a pipe FD
- from which the caller reads. See below for details.
+ async` that specifies the details and returns a set of pipe FDs
+ for communication with the function. See below for details.
`finish_async`::
@@ -135,7 +135,7 @@ stderr as follows:
.in: The FD must be readable; it becomes child's stdin.
.out: The FD must be writable; it becomes child's stdout.
- .err > 0 is not supported.
+ .err: The FD must be writable; it becomes child's stderr.
The specified FD is closed by start_command(), even if it fails to
run the sub-process!
@@ -180,17 +180,47 @@ The caller:
struct async variable;
2. initializes .proc and .data;
3. calls start_async();
-4. processes the data by reading from the fd in .out;
-5. closes .out;
+4. communicates with proc through .in and .out;
+5. closes .in and .out;
6. calls finish_async().
+The members .in, .out are used to provide a set of fd's for
+communication between the caller and the callee as follows:
+
+. Specify 0 to have no file descriptor passed. The callee will
+ receive -1 in the corresponding argument.
+
+. Specify < 0 to have a pipe allocated; start_async() replaces
+ with the pipe FD in the following way:
+
+ .in: Returns the writable pipe end into which the caller
+ writes; the readable end of the pipe becomes the function's
+ in argument.
+
+ .out: Returns the readable pipe end from which the caller
+ reads; the writable end of the pipe becomes the function's
+ out argument.
+
+ The caller of start_async() must close the returned FDs after it
+ has completed reading from/writing to them.
+
+. Specify a file descriptor > 0 to be used by the function:
+
+ .in: The FD must be readable; it becomes the function's in.
+ .out: The FD must be writable; it becomes the function's out.
+
+ The specified FD is closed by start_async(), even if it fails to
+ run the function.
+
The function pointer in .proc has the following signature:
- int proc(int fd, void *data);
+ int proc(int in, int out, void *data);
-. fd specifies a writable file descriptor to which the function must
- write the data that it produces. The function *must* close this
- descriptor before it returns.
+. in, out specify a set of file descriptors from/to which the function
+ must read/write the data that it needs/produces. The function
+ *must* close these descriptors before it returns. A descriptor
+ may be -1 if the caller did not configure a descriptor for that
+ direction.
. data is the value that the caller has specified in the .data member
of struct async.
@@ -201,12 +231,13 @@ The function pointer in .proc has the following signature:
There are serious restrictions on what the asynchronous function can do
-because this facility is implemented by a pipe to a forked process on
-UNIX, but by a thread in the same address space on Windows:
+because this facility is implemented by a thread in the same address
+space on most platforms (when pthreads is available), but by a pipe to
+a forked process otherwise:
. It cannot change the program's state (global variables, environment,
- etc.) in a way that the caller notices; in other words, .out is the
- only communication channel to the caller.
+ etc.) in a way that the caller notices; in other words, .in and .out
+ are the only communication channels to the caller.
. It must not change the program's state that the caller of the
facility also uses.
diff --git a/Documentation/technical/api-sha1-array.txt b/Documentation/technical/api-sha1-array.txt
new file mode 100644
index 0000000000..4a4bae8109
--- /dev/null
+++ b/Documentation/technical/api-sha1-array.txt
@@ -0,0 +1,79 @@
+sha1-array API
+==============
+
+The sha1-array API provides storage and manipulation of sets of SHA1
+identifiers. The emphasis is on storage and processing efficiency,
+making them suitable for large lists. Note that the ordering of items is
+not preserved over some operations.
+
+Data Structures
+---------------
+
+`struct sha1_array`::
+
+ A single array of SHA1 hashes. This should be initialized by
+ assignment from `SHA1_ARRAY_INIT`. The `sha1` member contains
+ the actual data. The `nr` member contains the number of items in
+ the set. The `alloc` and `sorted` members are used internally,
+ and should not be needed by API callers.
+
+Functions
+---------
+
+`sha1_array_append`::
+ Add an item to the set. The sha1 will be placed at the end of
+ the array (but note that some operations below may lose this
+ ordering).
+
+`sha1_array_sort`::
+ Sort the elements in the array.
+
+`sha1_array_lookup`::
+ Perform a binary search of the array for a specific sha1.
+ If found, returns the offset (in number of elements) of the
+ sha1. If not found, returns a negative integer. If the array is
+ not sorted, this function has the side effect of sorting it.
+
+`sha1_array_clear`::
+ Free all memory associated with the array and return it to the
+ initial, empty state.
+
+`sha1_array_for_each_unique`::
+ Efficiently iterate over each unique element of the list,
+ executing the callback function for each one. If the array is
+ not sorted, this function has the side effect of sorting it.
+
+Examples
+--------
+
+-----------------------------------------
+void print_callback(const unsigned char sha1[20],
+ void *data)
+{
+ printf("%s\n", sha1_to_hex(sha1));
+}
+
+void some_func(void)
+{
+ struct sha1_array hashes = SHA1_ARRAY_INIT;
+ unsigned char sha1[20];
+
+ /* Read objects into our set */
+ while (read_object_from_stdin(sha1))
+ sha1_array_append(&hashes, sha1);
+
+ /* Check if some objects are in our set */
+ while (read_object_from_stdin(sha1))
+ if (sha1_array_lookup(&hashes, sha1) >= 0)
+ printf("it's in there!\n");
+
+ /*
+ * Print the unique set of objects. We could also have
+ * avoided adding duplicate objects in the first place,
+ * but we would end up re-sorting the array repeatedly.
+ * Instead, this will sort once and then skip duplicates
+ * in linear time.
+ */
+ sha1_array_for_each_unique(&hashes, print_callback, NULL);
+}
+-----------------------------------------
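
The "sort lazily, then binary-search or skip duplicates" behaviour described in api-sha1-array.txt can be sketched as follows. The struct layout, the `sketch_` names, and the growth policy are assumptions for illustration; the real `struct sha1_array` may differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_HASH_LEN 20

/* Hypothetical simplified layout with the members the doc mentions. */
struct sketch_sha1_array {
	unsigned char (*sha1)[SKETCH_HASH_LEN];
	int nr;
	int alloc;
	int sorted;
};

static int sketch_cmp(const void *a, const void *b)
{
	return memcmp(a, b, SKETCH_HASH_LEN);
}

/* Append at the end; this invalidates any previous sort. */
static void sketch_append(struct sketch_sha1_array *array,
			  const unsigned char *sha1)
{
	if (array->nr == array->alloc) {
		array->alloc = array->alloc ? 2 * array->alloc : 16;
		array->sha1 = realloc(array->sha1,
				      array->alloc * sizeof(*array->sha1));
		if (!array->sha1)
			abort();
	}
	memcpy(array->sha1[array->nr++], sha1, SKETCH_HASH_LEN);
	array->sorted = 0;
}

static void sketch_sort(struct sketch_sha1_array *array)
{
	if (array->sorted)
		return;
	if (array->nr)
		qsort(array->sha1, array->nr, sizeof(*array->sha1), sketch_cmp);
	array->sorted = 1;
}

/* Binary search; sorts first if needed. Returns the offset or -1. */
static int sketch_lookup(struct sketch_sha1_array *array,
			 const unsigned char *sha1)
{
	int lo = 0, hi;

	sketch_sort(array);
	assert(array->sorted);
	hi = array->nr;
	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int cmp = memcmp(sha1, array->sha1[mid], SKETCH_HASH_LEN);
		if (!cmp)
			return mid;
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -1;
}

typedef void sketch_each_fn(const unsigned char *sha1, void *data);

/* Sort once, then skip entries equal to their predecessor in linear time. */
static void sketch_for_each_unique(struct sketch_sha1_array *array,
				   sketch_each_fn *fn, void *data)
{
	int i;

	sketch_sort(array);
	for (i = 0; i < array->nr; i++) {
		if (i > 0 && !memcmp(array->sha1[i], array->sha1[i - 1],
				     SKETCH_HASH_LEN))
			continue;
		fn(array->sha1[i], data);
	}
}

/* Example callback: count the unique entries. */
static void sketch_count(const unsigned char *sha1, void *data)
{
	(void)sha1;
	(*(int *)data)++;
}
```

Deferring the sort to the first lookup or unique iteration keeps appends cheap, which is why the documentation warns that insertion order is not preserved across those operations.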
diff --git a/Documentation/technical/api-sigchain.txt b/Documentation/technical/api-sigchain.txt
new file mode 100644
index 0000000000..9e1189ef01
--- /dev/null
+++ b/Documentation/technical/api-sigchain.txt
@@ -0,0 +1,41 @@
+sigchain API
+============
+
+Code often wants to set a signal handler to clean up temporary files or
+other work-in-progress when we die unexpectedly. For multiple pieces of
+code to do this without conflicting, each piece of code must remember
+the old value of the handler and restore it either when:
+
+ 1. The work-in-progress is finished, and the handler is no longer
+ necessary. The handler should revert to the original behavior
+ (either another handler, SIG_DFL, or SIG_IGN).
+
+ 2. The signal is received. We should then do our cleanup, then chain
+ to the next handler (or die if it is SIG_DFL).
+
+Sigchain is a tiny library for keeping a stack of handlers. Your handler
+and installation code should look something like:
+
+------------------------------------------
+ void clean_foo_on_signal(int sig)
+ {
+ clean_foo();
+ sigchain_pop(sig);
+ raise(sig);
+ }
+
+ void other_func()
+ {
+ sigchain_push_common(clean_foo_on_signal);
+ mess_up_foo();
+ clean_foo();
+ }
+------------------------------------------
+
+Handlers are given the typedef of sigchain_fun. This is the same type
+that is given to signal() or sigaction(). It is perfectly reasonable to
+push SIG_DFL or SIG_IGN onto the stack.
+
+You can sigchain_push and sigchain_pop individual signals. For
+convenience, sigchain_push_common will push the handler onto the stack
+for many common signals.
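
The handler stack that api-sigchain.txt describes can be sketched for a single signal using only standard `signal()` and `raise()`. This is an illustration under assumptions, not the library's code: the fixed-depth stack, the global `sketch_term`, and every `sketch_` name are hypothetical.

```c
#include <assert.h>
#include <signal.h>

typedef void (*sketch_sigfun)(int);

#define SKETCH_MAX_DEPTH 16

/* Hypothetical fixed-size stack of previously installed handlers. */
struct sketch_sigstack {
	sketch_sigfun old[SKETCH_MAX_DEPTH];
	int n;
};

/* Install f and remember whatever was installed before it. */
static int sketch_push(struct sketch_sigstack *s, int sig, sketch_sigfun f)
{
	sketch_sigfun prev;

	if (s->n == SKETCH_MAX_DEPTH)
		return -1;
	prev = signal(sig, f);
	if (prev == SIG_ERR)
		return -1;
	s->old[s->n++] = prev;
	return 0;
}

/* Restore the handler that was active before the most recent push. */
static int sketch_pop(struct sketch_sigstack *s, int sig)
{
	if (s->n < 1)
		return 0;
	if (signal(sig, s->old[s->n - 1]) == SIG_ERR)
		return -1;
	s->n--;
	return 0;
}

static struct sketch_sigstack sketch_term;
static volatile sig_atomic_t sketch_cleaned;

/* Same shape as clean_foo_on_signal() above: clean up, pop, re-raise. */
static void sketch_cleanup_on_signal(int sig)
{
	sketch_cleaned = 1;		/* stands in for clean_foo() */
	sketch_pop(&sketch_term, sig);
	raise(sig);			/* chain to the previous handler */
}
```

If `SIG_IGN` is pushed for `SIGTERM` first and `sketch_cleanup_on_signal` second, a `raise(SIGTERM)` runs the cleanup, restores `SIG_IGN`, and the re-raised signal is then ignored rather than terminating the process.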
diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt
index 293bb15d20..ce24eb96f5 100644
--- a/Documentation/technical/api-string-list.txt
+++ b/Documentation/technical/api-string-list.txt
@@ -29,6 +29,9 @@ member (you need this if you add things later) and you should set the
. Can sort an unsorted list using `sort_string_list`.
+. Can remove individual items of an unsorted list using
+ `unsorted_string_list_delete_item`.
+
. Finally it should free the list using `string_list_clear`.
Example:
@@ -38,8 +41,8 @@ struct string_list list;
int i;
memset(&list, 0, sizeof(struct string_list));
-string_list_append("foo", &list);
-string_list_append("bar", &list);
+string_list_append(&list, "foo");
+string_list_append(&list, "bar");
for (i = 0; i < list.nr; i++)
printf("%s\n", list.items[i].string)
----
@@ -104,10 +107,21 @@ write `string_list_insert(...)->util = ...;`.
`unsorted_string_list_has_string`::
It's like `string_list_has_string()` but for unsorted lists.
+
+`unsorted_string_list_lookup`::
+
+ It's like `string_list_lookup()` but for unsorted lists.
+
-This function needs to look through all items, as opposed to its
+The above two functions need to look through all items, as opposed to their
counterpart for sorted lists, which performs a binary search.
+`unsorted_string_list_delete_item`::
+
+ Remove an item from a string_list. The `string` pointer of the items
+ will be freed in case the `strdup_strings` member of the string_list
+ is set. The third parameter controls if the `util` pointer of the
+ items should be freed or not.
+
Data structures
---------------
diff --git a/Documentation/technical/api-tree-walking.txt b/Documentation/technical/api-tree-walking.txt
index 55b728632c..14af37c3f1 100644
--- a/Documentation/technical/api-tree-walking.txt
+++ b/Documentation/technical/api-tree-walking.txt
@@ -42,6 +42,8 @@ information.
* `data` can be anything the `fn` callback would want to use.
+* `show_all_errors` tells whether to stop at the first error or not.
+
Initializing
------------
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
new file mode 100644
index 0000000000..8930b3fabc
--- /dev/null
+++ b/Documentation/technical/index-format.txt
@@ -0,0 +1,186 @@
+GIT index format
+================
+
+= The git index file has the following format
+
+ All binary numbers are in network byte order. Version 2 is described
+ here unless stated otherwise.
+
+ - A 12-byte header consisting of
+
+ 4-byte signature:
+ The signature is { 'D', 'I', 'R', 'C' } (stands for "dircache")
+
+ 4-byte version number:
+ The currently supported versions are 2 and 3.
+
+ 32-bit number of index entries.
+
+ - A number of sorted index entries (see below).
+
+ - Extensions
+
+ Extensions are identified by signature. Optional extensions can
+ be ignored if GIT does not understand them.
+
+ GIT currently supports cached tree and resolve undo extensions.
+
+ 4-byte extension signature. If the first byte is 'A'..'Z' the
+ extension is optional and can be ignored.
+
+ 32-bit size of the extension
+
+ Extension data
+
+ - 160-bit SHA-1 over the content of the index file before this
+ checksum.
+
+== Index entry
+
+ Index entries are sorted in ascending order on the name field,
+ interpreted as a string of unsigned bytes (i.e. memcmp() order, no
+ localization, no special casing of directory separator '/'). Entries
+ with the same name are sorted by their stage field.
+
+ 32-bit ctime seconds, the last time a file's metadata changed
+ this is stat(2) data
+
+ 32-bit ctime nanosecond fractions
+ this is stat(2) data
+
+ 32-bit mtime seconds, the last time a file's data changed
+ this is stat(2) data
+
+ 32-bit mtime nanosecond fractions
+ this is stat(2) data
+
+ 32-bit dev
+ this is stat(2) data
+
+ 32-bit ino
+ this is stat(2) data
+
+ 32-bit mode, split into (high to low bits)
+
+ 4-bit object type
+ valid values in binary are 1000 (regular file), 1010 (symbolic link)
+ and 1110 (gitlink)
+
+ 3-bit unused
+
+ 9-bit unix permission. Only 0755 and 0644 are valid for regular files.
+ Symbolic links and gitlinks have value 0 in this field.
+
+ 32-bit uid
+ this is stat(2) data
+
+ 32-bit gid
+ this is stat(2) data
+
+ 32-bit file size
+ This is the on-disk size from stat(2), truncated to 32-bit.
+
+ 160-bit SHA-1 for the represented object
+
+ A 16-bit 'flags' field split into (high to low bits)
+
+ 1-bit assume-valid flag
+
+ 1-bit extended flag (must be zero in version 2)
+
+ 2-bit stage (during merge)
+
+ 12-bit name length if the length is less than 0xFFF; otherwise 0xFFF
+ is stored in this field.
+
+ (Version 3) A 16-bit field, only applicable if the "extended flag"
+ above is 1, split into (high to low bits).
+
+ 1-bit reserved for future
+
+ 1-bit skip-worktree flag (used by sparse checkout)
+
+ 1-bit intent-to-add flag (used by "git add -N")
+
+ 13-bit unused, must be zero
+
+ Entry path name (variable length) relative to top level directory
+ (without leading slash). '/' is used as path separator. The special
+ path components ".", ".." and ".git" (without quotes) are disallowed.
+ Trailing slash is also disallowed.
+
+ The exact encoding is undefined, but the '.' and '/' characters
+ are encoded in 7-bit ASCII and the encoding cannot contain a NUL
+ byte (iow, this is a UNIX pathname).
+
+ 1-8 nul bytes as necessary to pad the entry to a multiple of eight bytes
+ while keeping the name NUL-terminated.
+
+== Extensions
+
+=== Cached tree
+
+ Cached tree extension contains pre-computed hashes for trees that can
+ be derived from the index. It helps speed up tree object generation
+ from index for a new commit.
+
+ When a path is updated in index, the path must be invalidated and
+ removed from tree cache.
+
+ The signature for this extension is { 'T', 'R', 'E', 'E' }.
+
+ A series of entries fill the entire extension; each of which
+ consists of:
+
+ - NUL-terminated path component (relative to its parent directory);
+
+ - ASCII decimal number of entries in the index that are covered by the
+ tree this entry represents (entry_count);
+
+ - A space (ASCII 32);
+
+ - ASCII decimal number that represents the number of subtrees this
+ tree has;
+
+ - A newline (ASCII 10); and
+
+ - 160-bit object name for the object that would result from writing
+ this span of index as a tree.
+
+ An entry can be in an invalidated state and is represented by having
+ -1 in the entry_count field. In this case, there is no object name
+ and the next entry starts immediately after the newline.
+
+ The entries are written out in the top-down, depth-first order. The
+ first entry represents the root level of the repository, followed by the
+ first subtree---let's call this A---of the root level (with its name
+ relative to the root level), followed by the first subtree of A (with
+ its name relative to A), ...
+
+=== Resolve undo
+
+ A conflict is represented in the index as a set of higher stage entries.
+ When a conflict is resolved (e.g. with "git add path"), these higher
+ stage entries will be removed and a stage-0 entry with the proper
+ resolution is added.
+
+ When these higher stage entries are removed, they are saved in the
+ resolve undo extension, so that conflicts can be recreated (e.g. with
+ "git checkout -m"), in case users want to redo a conflict resolution
+ from scratch.
+
+ The signature for this extension is { 'R', 'E', 'U', 'C' }.
+
+ A series of entries fill the entire extension; each of which
+ consists of:
+
+ - NUL-terminated pathname the entry describes (relative to the root of
+ the repository, i.e. full pathname);
+
+ - Three NUL-terminated ASCII octal numbers, entry mode of entries in
+ stage 1 to 3 (a missing stage is represented by "0" in this field);
+ and
+
+ - At most three 160-bit object names of the entry in stages from 1 to 3
+ (nothing is written for a missing stage).
+
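The resolve undo entry layout from the index-format text above can be sketched the same way. This is an illustrative reading of the description only, assuming the input is the extension payload without its signature and size header; `parse_resolve_undo` is a hypothetical name, not a Git function.

```python
from typing import List, Optional, Tuple

ResolveUndoEntry = Tuple[bytes, List[int], List[Optional[bytes]]]


def parse_resolve_undo(payload: bytes, hash_len: int = 20) -> List[ResolveUndoEntry]:
    """Parse a REUC payload into (path, modes, object_names) tuples.

    modes and object_names each have three slots, for stages 1 to 3.
    A missing stage has mode 0 and object name None.
    """
    entries = []
    pos = 0
    while pos < len(payload):
        # NUL-terminated full pathname.
        nul = payload.index(b"\0", pos)
        path = payload[pos:nul]
        pos = nul + 1
        # Three NUL-terminated ASCII octal entry modes.
        modes = []
        for _ in range(3):
            nul = payload.index(b"\0", pos)
            modes.append(int(payload[pos:nul].decode("ascii"), 8))
            pos = nul + 1
        # One object name per stage that is present (mode != 0).
        names: List[Optional[bytes]] = []
        for mode in modes:
            if mode == 0:
                names.append(None)
                continue
            name = payload[pos:pos + hash_len]
            if len(name) != hash_len:
                raise ValueError("truncated object name")
            names.append(name)
            pos += hash_len
        entries.append((path, modes, names))
    return entries
```

Because nothing is written for a missing stage, the modes have to be read before the object names: they are the only indication of how many names follow.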
diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt
index 9a5cdafa9c..a7004c63e7 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -36,7 +36,7 @@ Git Transport
The Git transport starts off by sending the command and repository
on the wire using the pkt-line format, followed by a NUL byte and a
-hostname paramater, terminated by a NUL byte.
+hostname parameter, terminated by a NUL byte.
0032git-upload-pack /project.git\0host=myserver.com\0
@@ -179,34 +179,36 @@ and descriptions.
Packfile Negotiation
--------------------
-After reference and capabilities discovery, the client can decide
-to terminate the connection by sending a flush-pkt, telling the
-server it can now gracefully terminate (as happens with the ls-remote
-command) or it can enter the negotiation phase, where the client and
-server determine what the minimal packfile necessary for transport is.
-
-Once the client has the initial list of references that the server
-has, as well as the list of capabilities, it will begin telling the
-server what objects it wants and what objects it has, so the server
-can make a packfile that only contains the objects that the client needs.
-The client will also send a list of the capabilities it wants to be in
-effect, out of what the server said it could do with the first 'want' line.
+After reference and capabilities discovery, the client can decide to
+terminate the connection by sending a flush-pkt, telling the server it can
+now gracefully terminate, and disconnect, when it does not need any pack
+data. This can happen with the ls-remote command, and also when the
+client is already up-to-date.
+
+Otherwise, it enters the negotiation phase, where the client and
+server determine what the minimal packfile necessary for transport is,
+by telling the server what objects it wants, its shallow objects
+(if any), and the maximum commit depth it wants (if any). The client
+will also send a list of the capabilities it wants to be in effect,
+out of what the server said it could do with the first 'want' line.
----
upload-request = want-list
- have-list
- compute-end
+ *shallow-line
+ *1depth-request
+ flush-pkt
want-list = first-want
*additional-want
- flush-pkt
+
+ shallow-line = PKT-LINE("shallow" SP obj-id)
+
+ depth-request = PKT-LINE("deepen" SP depth)
first-want = PKT-LINE("want" SP obj-id SP capability-list LF)
additional-want = PKT-LINE("want" SP obj-id LF)
- have-list = *have-line
- have-line = PKT-LINE("have" SP obj-id LF)
- compute-end = flush-pkt / PKT-LINE("done")
+ depth = 1*DIGIT
----
Clients MUST send all the obj-ids it wants from the reference
@@ -215,21 +217,64 @@ discovery phase as 'want' lines. Clients MUST send at least one
obj-id in a 'want' command which did not appear in the response
obtained through ref discovery.
-If client is requesting a shallow clone, it will now send a 'deepen'
-line with the depth it is requesting.
+The client MUST write all obj-ids which it only has shallow copies
+of (meaning that it does not have the parents of a commit) as
+'shallow' lines so that the server is aware of the limitations of
+the client's history. Clients MUST NOT mention an obj-id which
+it does not know exists on the server.
+
+The client now sends the maximum commit history depth it wants for
+this transaction, which is the number of commits it wants from the
+tip of the history, if any, as a 'deepen' line. A depth of 0 is the
+same as not making a depth request. The client does not want to receive
+any commits beyond this depth, nor objects needed only to complete
+those commits. Commits whose parents are not received as a result are
+defined as shallow and marked as such in the server. This information
+is sent back to the client in the next step.
+
+Once all the 'want's and 'shallow's (and optional 'deepen') are
+transferred, clients MUST send a flush-pkt, to tell the server side
+that it is done sending the list.
+
+If the client sent a positive depth request, the server now
+determines which commits will and will not be shallow and sends
+this information to the client. If the client did not request
+a positive depth, this step is skipped.
-Once all the "want"s (and optional 'deepen') are transferred,
-clients MUST send a flush-pkt. If the client has all the references
-on the server, client flushes and disconnects.
+----
+ shallow-update = *shallow-line
+ *unshallow-line
+ flush-pkt
-TODO: shallow/unshallow response and document the deepen command in the ABNF.
+ shallow-line = PKT-LINE("shallow" SP obj-id)
+
+ unshallow-line = PKT-LINE("unshallow" SP obj-id)
+----
+
+If the client has requested a positive depth, the server will compute
+the set of commits which are no deeper than the desired depth, starting
+at the client's wants. The server writes 'shallow' lines for each
+commit whose parents will not be sent as a result. The server writes
+an 'unshallow' line for each commit which the client has indicated is
+shallow, but is no longer shallow at the currently requested depth
+(that is, its parents will now be sent). The server MUST NOT mark
+as unshallow anything which the client has not indicated was shallow.
Now the client will send a list of the obj-ids it has using 'have'
-lines. In multi_ack mode, the canonical implementation will send up
-to 32 of these at a time, then will send a flush-pkt. The canonical
-implementation will skip ahead and send the next 32 immediately,
-so that there is always a block of 32 "in-flight on the wire" at a
-time.
+lines, so the server can make a packfile that only contains the objects
+that the client needs. In multi_ack mode, the canonical implementation
+will send up to 32 of these at a time, then will send a flush-pkt. The
+canonical implementation will skip ahead and send the next 32 immediately,
+so that there is always a block of 32 "in-flight on the wire" at a time.
+
+----
+ upload-haves = have-list
+ compute-end
+
+ have-list = *have-line
+ have-line = PKT-LINE("have" SP obj-id LF)
+ compute-end = flush-pkt / PKT-LINE("done")
+----
If the server reads 'have' lines, it then will respond by ACKing any
of the obj-ids the client said it had that the server also has. The
@@ -331,7 +376,7 @@ An incremental update (fetch) response might look like this:
C: 0009done\n
- S: 003aACK 74730d410fcb6603ace96f1dc55ea6196122532d\n
+ S: 0031ACK 74730d410fcb6603ace96f1dc55ea6196122532d\n
S: [PACKFILE]
----
@@ -488,7 +533,7 @@ An example client/server communication might look like this:
C: 0000
C: [PACKDATA]
- S: 000aunpack ok\n
- S: 0014ok refs/heads/debug\n
- S: 0026ng refs/heads/master non-fast-forward\n
+ S: 000eunpack ok\n
+ S: 0018ok refs/heads/debug\n
+ S: 002ang refs/heads/master non-fast-forward\n
----
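
The length corrections in the two examples above follow from how a pkt-line is framed: the four hex digits count the whole line, including the four digits themselves. The sketch below is an illustration under that reading, with hypothetical helper names `pkt_line` and `read_pkt_lines`; it is not code from Git.

```python
from typing import List, Optional

FLUSH_PKT = b"0000"  # a flush-pkt carries no payload


def pkt_line(payload: bytes) -> bytes:
    """Frame payload as a pkt-line: 4 hex digits of total length, then payload."""
    length = len(payload) + 4  # the prefix counts its own four bytes
    if length > 0xFFFF:
        raise ValueError("payload does not fit in a four-hex-digit length")
    return format(length, "04x").encode("ascii") + payload


def read_pkt_lines(stream: bytes) -> List[Optional[bytes]]:
    """Split a byte string into pkt-line payloads; None stands for a flush-pkt."""
    out: List[Optional[bytes]] = []
    pos = 0
    while pos < len(stream):
        length = int(stream[pos:pos + 4].decode("ascii"), 16)
        if length == 0:
            out.append(None)
            pos += 4
            continue
        if length < 4:
            raise ValueError("invalid pkt-line length")
        out.append(stream[pos + 4:pos + length])
        pos += length
    return out


if __name__ == "__main__":
    for line in (b"unpack ok\n",
                 b"ok refs/heads/debug\n",
                 b"ng refs/heads/master non-fast-forward\n"):
        print(pkt_line(line))
```

Running the helper on the status lines reproduces the corrected prefixes (`000e`, `0018`, `002a`), and the same arithmetic gives `0031` for the 45-byte ACK line.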
diff --git a/Documentation/technical/protocol-capabilities.txt b/Documentation/technical/protocol-capabilities.txt
index fd1a593149..b15517fa06 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -119,7 +119,7 @@ both.
ofs-delta
---------
-Server can send, and client understand PACKv2 with delta refering to
+Server can send, and client understand PACKv2 with delta referring to
its base by position in pack rather than by an obj-id. That is, they can
send/read OBJ_OFS_DELTA (aka type 6) in a packfile.