7 files changed, 322 insertions, 55 deletions
diff --git a/Documentation/technical/api-builtin.txt b/Documentation/technical/api-builtin.txt
index 5cb2b0590a..b0cafe87be 100644
--- a/Documentation/technical/api-builtin.txt
+++ b/Documentation/technical/api-builtin.txt
@@ -49,6 +49,8 @@ Additionally, if `foo` is a new command, there are 3 more things to do:
 
 . Add an entry for `git-foo` to `command-list.txt`.
 
+. Add an entry for `/git-foo` to `.gitignore`.
+
 
 How a built-in is called
 ------------------------
diff --git a/Documentation/technical/api-diff.txt b/Documentation/technical/api-diff.txt
index 20b0241d30..2d2ebc04b7 100644
--- a/Documentation/technical/api-diff.txt
+++ b/Documentation/technical/api-diff.txt
@@ -32,7 +32,7 @@ Calling sequence
 
 * As you find different pairs of files, call `diff_change()` to feed
   modified files, `diff_addremove()` to feed created or deleted files,
-  or `diff_unmerged()` to feed a file whose state is 'unmerged' to the
+  or `diff_unmerge()` to feed a file whose state is 'unmerged' to the
   API.  These are thin wrappers to a lower-level `diff_queue()` function
   that is flexible enough to record any of these kinds of changes.
 
@@ -50,7 +50,7 @@ Data structures
 This is the internal representation for a single file (blob).  It
 records the blob object name (if known -- for a work tree file it
 typically is a NUL SHA-1), filemode and pathname.  This is what the
-`diff_addremove()`, `diff_change()` and `diff_unmerged()` synthesize and
+`diff_addremove()`, `diff_change()` and `diff_unmerge()` synthesize and
 feed `diff_queue()` function with.
 
 * `struct diff_filepair`
diff --git a/Documentation/technical/api-merge.txt b/Documentation/technical/api-merge.txt
index a7e050bb7a..9dc1bed768 100644
--- a/Documentation/technical/api-merge.txt
+++ b/Documentation/technical/api-merge.txt
@@ -17,6 +17,40 @@ responsible for a few things.
    path-specific merge drivers (specified in `.gitattributes`)
    into account.
 
+Data structures
+---------------
+
+* `mmbuffer_t`, `mmfile_t`
+
+These store data usable for use by the xdiff backend, for writing and
+for reading, respectively.  See `xdiff/xdiff.h` for the definitions
+and `diff.c` for examples.
+
+* `struct ll_merge_options`
+
+This describes the set of options the calling program wants to affect
+the operation of a low-level (single file) merge.  Some options:
+
+`virtual_ancestor`::
+	Behave as though this were part of a merge between common
+	ancestors in a recursive merge.
+	If a helper program is specified by the
+	`[merge "<driver>"] recursive` configuration, it will
+	be used (see linkgit:gitattributes[5]).
+
+`variant`::
+	Resolve local conflicts automatically in favor
+	of one side or the other (as in 'git merge-file'
+	`--ours`/`--theirs`/`--union`).  Can be `0`,
+	`XDL_MERGE_FAVOR_OURS`, `XDL_MERGE_FAVOR_THEIRS`, or
+	`XDL_MERGE_FAVOR_UNION`.
+
+`renormalize`::
+	Resmudge and clean the "base", "theirs" and "ours" files
+	before merging.  Use this when the merge is likely to have
+	overlapped with a change in smudge/clean or end-of-line
+	normalization rules.
+
 Low-level (single file) merge
 -----------------------------
 
@@ -28,15 +62,24 @@ Low-level (single file) merge
 	`.git/info/attributes` into account.  Returns 0 for a
 	clean merge.
 
-The caller:
+Calling sequence:
 
-1. allocates an mmbuffer_t variable for the result;
-2. allocates and fills variables with the file's original content
-   and two modified versions (using `read_mmfile`, for example);
-3. calls ll_merge();
-4. reads the output from result_buf.ptr and result_buf.size;
-5. releases buffers when finished (free(ancestor.ptr); free(ours.ptr);
-   free(theirs.ptr); free(result_buf.ptr);).
+* Prepare a `struct ll_merge_options` to record options.
+  If you have no special requests, skip this and pass `NULL`
+  as the `opts` parameter to use the default options.
+
+* Allocate an mmbuffer_t variable for the result.
+
+* Allocate and fill variables with the file's original content
+  and two modified versions (using `read_mmfile`, for example).
+
+* Call `ll_merge()`.
+
+* Read the merged content from `result_buf.ptr` and `result_buf.size`.
+
+* Release buffers when finished.  A simple
+  `free(ancestor.ptr); free(ours.ptr); free(theirs.ptr);
+  free(result_buf.ptr);` will do.
 
 If the modifications do not merge cleanly, `ll_merge` will return a
 nonzero value and `result_buf` will generally include a description of
@@ -47,18 +90,6 @@ The `ancestor_label`, `our_label`, and `their_label` parameters are
 used to label the different sides of a conflict if the merge driver
 supports this.
 
-The `flag` parameter is a bitfield:
-
- - The `LL_OPT_VIRTUAL_ANCESTOR` bit indicates whether this is an
-   internal merge to consolidate ancestors for a recursive merge.
-
- - The `LL_OPT_FAVOR_MASK` bits allow local conflicts to be automatically
-   resolved in favor of one side or the other (as in 'git merge-file'
-   `--ours`/`--theirs`/`--union`).
-   They can be populated by `create_ll_flag`, whose argument can be
-   `XDL_MERGE_FAVOR_OURS`, `XDL_MERGE_FAVOR_THEIRS`, or
-   `XDL_MERGE_FAVOR_UNION`.
-
 Everything else
 ---------------
 
diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt
index c5d141cd63..f6a4a361bd 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -118,13 +118,16 @@ There are some macros to easily define options:
 `OPT__COLOR(&int_var, description)`::
 	Add `\--color[=<when>]` and `--no-color`.
 
-`OPT__DRY_RUN(&int_var)`::
+`OPT__DRY_RUN(&int_var, description)`::
 	Add `-n, \--dry-run`.
 
-`OPT__QUIET(&int_var)`::
+`OPT__FORCE(&int_var, description)`::
+	Add `-f, \--force`.
+
+`OPT__QUIET(&int_var, description)`::
 	Add `-q, \--quiet`.
 
-`OPT__VERBOSE(&int_var)`::
+`OPT__VERBOSE(&int_var, description)`::
 	Add `-v, \--verbose`.
 
 `OPT_GROUP(description)`::
diff --git a/Documentation/technical/api-sigchain.txt b/Documentation/technical/api-sigchain.txt
index 535cdff164..9e1189ef01 100644
--- a/Documentation/technical/api-sigchain.txt
+++ b/Documentation/technical/api-sigchain.txt
@@ -32,7 +32,7 @@ and installation code should look something like:
   }
 ------------------------------------------
 
-Handlers are given the typdef of sigchain_fun. This is the same type
+Handlers are given the typedef of sigchain_fun. This is the same type
 that is given to signal() or sigaction(). It is perfectly reasonable to
 push SIG_DFL or SIG_IGN onto the stack.
 
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
new file mode 100644
index 0000000000..8930b3fabc
--- /dev/null
+++ b/Documentation/technical/index-format.txt
@@ -0,0 +1,186 @@
+GIT index format
+================
+
+= The git index file has the following format
+
+  All binary numbers are in network byte order. Version 2 is described
+  here unless stated otherwise.
+
+   - A 12-byte header consisting of
+
+     4-byte signature:
+       The signature is { 'D', 'I', 'R', 'C' } (stands for "dircache")
+
+     4-byte version number:
+       The current supported versions are 2 and 3.
+
+     32-bit number of index entries.
+
+   - A number of sorted index entries (see below).
+
+   - Extensions
+
+     Extensions are identified by signature. Optional extensions can
+     be ignored if GIT does not understand them.
+
+     GIT currently supports cached tree and resolve undo extensions.
+
+     4-byte extension signature. If the first byte is 'A'..'Z' the
+     extension is optional and can be ignored.
+
+     32-bit size of the extension
+
+     Extension data
+
+   - 160-bit SHA-1 over the content of the index file before this
+     checksum.
+
+== Index entry
+
+  Index entries are sorted in ascending order on the name field,
+  interpreted as a string of unsigned bytes (i.e. memcmp() order, no
+  localization, no special casing of directory separator '/'). Entries
+  with the same name are sorted by their stage field.
+
+  32-bit ctime seconds, the last time a file's metadata changed
+    this is stat(2) data
+
+  32-bit ctime nanosecond fractions
+    this is stat(2) data
+
+  32-bit mtime seconds, the last time a file's data changed
+    this is stat(2) data
+
+  32-bit mtime nanosecond fractions
+    this is stat(2) data
+
+  32-bit dev
+    this is stat(2) data
+
+  32-bit ino
+    this is stat(2) data
+
+  32-bit mode, split into (high to low bits)
+
+    4-bit object type
+      valid values in binary are 1000 (regular file), 1010 (symbolic link)
+      and 1110 (gitlink)
+
+    3-bit unused
+
+    9-bit unix permission. Only 0755 and 0644 are valid for regular files.
+    Symbolic links and gitlinks have value 0 in this field.
+
+  32-bit uid
+    this is stat(2) data
+
+  32-bit gid
+    this is stat(2) data
+
+  32-bit file size
+    This is the on-disk size from stat(2), truncated to 32-bit.
+
+  160-bit SHA-1 for the represented object
+
+  A 16-bit 'flags' field split into (high to low bits)
+
+    1-bit assume-valid flag
+
+    1-bit extended flag (must be zero in version 2)
+
+    2-bit stage (during merge)
+
+    12-bit name length if the length is less than 0xFFF; otherwise 0xFFF
+    is stored in this field.
+
+  (Version 3) A 16-bit field, only applicable if the "extended flag"
+  above is 1, split into (high to low bits).
+
+    1-bit reserved for future
+
+    1-bit skip-worktree flag (used by sparse checkout)
+
+    1-bit intent-to-add flag (used by "git add -N")
+
+    13-bit unused, must be zero
+
+  Entry path name (variable length) relative to top level directory
+    (without leading slash). '/' is used as path separator. The special
+    path components ".", ".." and ".git" (without quotes) are disallowed.
+    Trailing slash is also disallowed.
+
+    The exact encoding is undefined, but the '.' and '/' characters
+    are encoded in 7-bit ASCII and the encoding cannot contain a NUL
+    byte (iow, this is a UNIX pathname).
+
+  1-8 nul bytes as necessary to pad the entry to a multiple of eight bytes
+  while keeping the name NUL-terminated.
+
+== Extensions
+
+=== Cached tree
+
+  Cached tree extension contains pre-computed hashes for trees that can
+  be derived from the index. It helps speed up tree object generation
+  from index for a new commit.
+
+  When a path is updated in index, the path must be invalidated and
+  removed from tree cache.
+
+  The signature for this extension is { 'T', 'R', 'E', 'E' }.
+
+  A series of entries fill the entire extension; each of which
+  consists of:
+
+  - NUL-terminated path component (relative to its parent directory);
+
+  - ASCII decimal number of entries in the index that is covered by the
+    tree this entry represents (entry_count);
+
+  - A space (ASCII 32);
+
+  - ASCII decimal number that represents the number of subtrees this
+    tree has;
+
+  - A newline (ASCII 10); and
+
+  - 160-bit object name for the object that would result from writing
+    this span of index as a tree.
+
+  An entry can be in an invalidated state and is represented by having
+  -1 in the entry_count field. In this case, there is no object name
+  and the next entry starts immediately after the newline.
+
+  The entries are written out in the top-down, depth-first order.  The
+  first entry represents the root level of the repository, followed by the
+  first subtree---let's call this A---of the root level (with its name
+  relative to the root level), followed by the first subtree of A (with
+  its name relative to A), ...
+
+=== Resolve undo
+
+  A conflict is represented in the index as a set of higher stage entries.
+  When a conflict is resolved (e.g. with "git add path"), these higher
+  stage entries will be removed and a stage-0 entry with proper resoluton
+  is added.
+
+  When these higher stage entries are removed, they are saved in the
+  resolve undo extension, so that conflicts can be recreated (e.g. with
+  "git checkout -m"), in case users want to redo a conflict resolution
+  from scratch.
+
+  The signature for this extension is { 'R', 'E', 'U', 'C' }.
+
+  A series of entries fill the entire extension; each of which
+  consists of:
+
+  - NUL-terminated pathname the entry describes (relative to the root of
+    the repository, i.e. full pathname);
+
+  - Three NUL-terminated ASCII octal numbers, entry mode of entries in
+    stage 1 to 3 (a missing stage is represented by "0" in this field);
+    and
+
+  - At most three 160-bit object names of the entry in stages from 1 to 3
+    (nothing is written for a missing stage).
+
diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt
index 369f91d3b9..a7004c63e7 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -179,34 +179,36 @@ and descriptions.
 
 Packfile Negotiation
 --------------------
-After reference and capabilities discovery, the client can decide
-to terminate the connection by sending a flush-pkt, telling the
-server it can now gracefully terminate (as happens with the ls-remote
-command) or it can enter the negotiation phase, where the client and
-server determine what the minimal packfile necessary for transport is.
-
-Once the client has the initial list of references that the server
-has, as well as the list of capabilities, it will begin telling the
-server what objects it wants and what objects it has, so the server
-can make a packfile that only contains the objects that the client needs.
-The client will also send a list of the capabilities it wants to be in
-effect, out of what the server said it could do with the first 'want' line.
+After reference and capabilities discovery, the client can decide to
+terminate the connection by sending a flush-pkt, telling the server it can
+now gracefully terminate, and disconnect, when it does not need any pack
+data. This can happen with the ls-remote command, and also can happen when
+the client already is up-to-date.
+
+Otherwise, it enters the negotiation phase, where the client and
+server determine what the minimal packfile necessary for transport is,
+by telling the server what objects it wants, its shallow objects
+(if any), and the maximum commit depth it wants (if any).  The client
+will also send a list of the capabilities it wants to be in effect,
+out of what the server said it could do with the first 'want' line.
 
 ----
   upload-request    =  want-list
-		       have-list
-		       compute-end
+		       *shallow-line
+		       *1depth-request
+		       flush-pkt
 
   want-list         =  first-want
 		       *additional-want
-		       flush-pkt
+
+  shallow-line      =  PKT_LINE("shallow" SP obj-id)
+
+  depth-request     =  PKT_LINE("deepen" SP depth)
 
   first-want        =  PKT-LINE("want" SP obj-id SP capability-list LF)
   additional-want   =  PKT-LINE("want" SP obj-id LF)
 
-  have-list         =  *have-line
-  have-line         =  PKT-LINE("have" SP obj-id LF)
-  compute-end       =  flush-pkt / PKT-LINE("done")
+  depth             =  1*DIGIT
 ----
 
 Clients MUST send all the obj-ids it wants from the reference
@@ -215,21 +217,64 @@ discovery phase as 'want' lines. Clients MUST send at least one
 obj-id in a 'want' command which did not appear in the response
 obtained through ref discovery.
 
-If client is requesting a shallow clone, it will now send a 'deepen'
-line with the depth it is requesting.
+The client MUST write all obj-ids which it only has shallow copies
+of (meaning that it does not have the parents of a commit) as
+'shallow' lines so that the server is aware of the limitations of
+the client's history. Clients MUST NOT mention an obj-id which
+it does not know exists on the server.
+
+The client now sends the maximum commit history depth it wants for
+this transaction, which is the number of commits it wants from the
+tip of the history, if any, as a 'deepen' line.  A depth of 0 is the
+same as not making a depth request. The client does not want to receive
+any commits beyond this depth, nor objects needed only to complete
+those commits. Commits whose parents are not received as a result are
+defined as shallow and marked as such in the server. This information
+is sent back to the client in the next step.
+
+Once all the 'want's and 'shallow's (and optional 'deepen') are
+transferred, clients MUST send a flush-pkt, to tell the server side
+that it is done sending the list.
+
+Otherwise, if the client sent a positive depth request, the server
+will determine which commits will and will not be shallow and
+send this information to the client. If the client did not request
+a positive depth, this step is skipped.
 
-Once all the "want"s (and optional 'deepen') are transferred,
-clients MUST send a flush-pkt. If the client has all the references
-on the server, client flushes and disconnects.
+----
+  shallow-update   =  *shallow-line
+		      *unshallow-line
+		      flush-pkt
 
-TODO: shallow/unshallow response and document the deepen command in the ABNF.
+  shallow-line     =  PKT-LINE("shallow" SP obj-id)
+
+  unshallow-line   =  PKT-LINE("unshallow" SP obj-id)
+----
+
+If the client has requested a positive depth, the server will compute
+the set of commits which are no deeper than the desired depth, starting
+at the client's wants. The server writes 'shallow' lines for each
+commit whose parents will not be sent as a result. The server writes
+an 'unshallow' line for each commit which the client has indicated is
+shallow, but is no longer shallow at the currently requested depth
+(that is, its parents will now be sent). The server MUST NOT mark
+as unshallow anything which the client has not indicated was shallow.
 
 Now the client will send a list of the obj-ids it has using 'have'
-lines.  In multi_ack mode, the canonical implementation will send up
-to 32 of these at a time, then will send a flush-pkt.  The canonical
-implementation will skip ahead and send the next 32 immediately,
-so that there is always a block of 32 "in-flight on the wire" at a
-time.
+lines, so the server can make a packfile that only contains the objects
+that the client needs. In multi_ack mode, the canonical implementation
+will send up to 32 of these at a time, then will send a flush-pkt. The
+canonical implementation will skip ahead and send the next 32 immediately,
+so that there is always a block of 32 "in-flight on the wire" at a time.
+
+----
+  upload-haves      =  have-list
+		       compute-end
+
+  have-list         =  *have-line
+  have-line         =  PKT-LINE("have" SP obj-id LF)
+  compute-end       =  flush-pkt / PKT-LINE("done")
+----
 
 If the server reads 'have' lines, it then will respond by ACKing any
 of the obj-ids the client said it had that the server also has. The