3 files changed, 456 insertions, 3 deletions
diff --git a/Documentation/howto/keep-canonical-history-correct.txt b/Documentation/howto/keep-canonical-history-correct.txt
new file mode 100644
index 0000000000..35d48ef714
--- /dev/null
+++ b/Documentation/howto/keep-canonical-history-correct.txt
@@ -0,0 +1,216 @@
+From: Junio C Hamano <gitster@pobox.com>
+Date: Wed, 07 May 2014 13:15:39 -0700
+Subject: Beginner question on "Pull is mostly evil"
+Abstract: This how-to explains a method for keeping a
+ project's history correct when using git pull.
+Content-type: text/asciidoc
+
+Keep authoritative canonical history correct with git pull
+==========================================================
+
+Sometimes a new project integrator will end up with project history
+that appears to be "backwards" from what other project developers
+expect. This howto presents a suggested integration workflow for
+maintaining a central repository.
+
+Suppose that that central repository has this history:
+
+------------
+    ---o---o---A
+------------
+
+which ends at commit `A` (time flows from left to right and each node
+in the graph is a commit, lines between them indicating parent-child
+relationship).
+
+Then you clone it and work on your own commits, which leads you to
+have this history in *your* repository:
+
+------------
+    ---o---o---A---B---C
+------------
+
+Imagine your coworker did the same and built on top of `A` in *his*
+repository in the meantime, and then pushed it to the
+central repository:
+
+------------
+    ---o---o---A---X---Y---Z
+------------
+
+Now, if you `git push` at this point, because your history that leads
+to `C` lacks `X`, `Y` and `Z`, it will fail.  You need to somehow make
+the tip of your history a descendant of `Z`.
+
+One suggested way to solve the problem is "fetch and then merge", aka
+`git pull`. When you fetch, your repository will have a history like
+this:
+
+------------
+    ---o---o---A---B---C
+		\
+		 X---Y---Z
+------------
+
+Once you run merge after that, while still on *your* branch, i.e. `C`,
+you will create a merge `M` and make the history look like this:
+
+------------
+    ---o---o---A---B---C---M
+		\         /
+		 X---Y---Z
+------------
+
+`M` is a descendant of `Z`, so you can push to update the central
+repository.  Such a merge `M` does not lose any commit in both
+histories, so in that sense it may not be wrong, but when people want
+to talk about "the authoritative canonical history that is shared
+among the project participants", i.e. "the trunk", they often view
+it as "commits you see by following the first-parent chain", and use
+this command to view it:
+
+------------
+    $ git log --first-parent
+------------
+
+For all other people who observed the central repository after your
+coworker pushed `Z` but before you pushed `M`, the commit on the trunk
+used to be `o-o-A-X-Y-Z`.  But because you made `M` while you were on
+`C`, `M`'s first parent is `C`, so by pushing `M` to advance the
+central repository, you made `X-Y-Z` a side branch, not on the trunk.
+
+You would rather want to have a history of this shape:
+
+------------
+    ---o---o---A---X---Y---Z---M'
+		\             /
+		 B-----------C
+------------
+
+so that in the first-parent chain, it is clear that the project first
+did `X` and then `Y` and then `Z` and merged a change that consists of
+two commits `B` and `C` that achieves a single goal.  You may have
+worked on fixing the bug #12345 with these two patches, and the merge
+`M'` with swapped parents can say in its log message "Merge
+fix-bug-12345". Having a way to tell `git pull` to create a merge
+but record the parents in reverse order may be a way to do so.
+
+Note that I said "achieves a single goal" above, because this is
+important.  "Swapping the merge order" only covers a special case
+where the project does not care too much about having unrelated
+things done on a single merge but cares a lot about first-parent
+chain.
+
+There are multiple schools of thought about the "trunk" management.
+
+ 1. Some projects want to keep a completely linear history without any
+    merges.  Obviously, swapping the merge order would not match their
+    taste.  You would need to flatten your history on top of the
+    updated upstream to result in a history of this shape instead:
++
+------------
+    ---o---o---A---X---Y---Z---B---C
+------------
++
+with `git pull --rebase` or something.
+
+ 2. Some projects tolerate merges in their history, but do not worry
+    too much about the first-parent order, and allow fast-forward
+    merges.  To them, swapping the merge order does not hurt, but
+    it is unnecessary.
+
+ 3. Some projects want each commit on the "trunk" to do one single
+    thing.  The output of `git log --first-parent` in such a project
+    would show either a merge of a side branch that completes a single
+    theme, or a single commit that completes a single theme by itself.
+    If your two commits `B` and `C` (or they may even be two groups of
+    commits) were solving two independent issues, then the merge `M'`
+    we made in the earlier example by swapping the merge order is
+    still not up to the project standard.  It merges two unrelated
+    efforts `B` and `C` at the same time.
+
+For projects in the last category (Git itself is one of them),
+individual developers would want to prepare a history more like
+this:
+
+------------
+		 C0--C1--C2     topic-c
+		/
+    ---o---o---A                master
+		\
+		 B0--B1--B2     topic-b
+------------
+
+That is, keeping separate topics on separate branches, perhaps like
+so:
+
+------------
+    $ git clone $URL work && cd work
+    $ git checkout -b topic-b master
+    $ ... work to create B0, B1 and B2 to complete one theme
+    $ git checkout -b topic-c master
+    $ ... same for the theme of topic-c
+------------
+
+And then
+
+------------
+    $ git checkout master
+    $ git pull --ff-only
+------------
+
+would grab `X`, `Y` and `Z` from the upstream and advance your master
+branch:
+
+------------
+		 C0--C1--C2     topic-c
+		/
+    ---o---o---A---X---Y---Z    master
+		\
+		 B0--B1--B2     topic-b
+------------
+
+And then you would merge these two branches separately:
+
+------------
+    $ git merge topic-b
+    $ git merge topic-c
+------------
+
+to result in
+
+------------
+		 C0--C1---------C2
+		/                 \
+    ---o---o---A---X---Y---Z---M---N
+		\             /
+		 B0--B1-----B2
+------------
+
+and push it back to the central repository.
+
+It is very much possible that while you are merging topic-b and
+topic-c, somebody again advanced the history in the central repository
+to put `W` on top of `Z`, and make your `git push` fail.
+
+In such a case, you would rewind to discard `M` and `N`, update the
+tip of your 'master' again and redo the two merges:
+
+------------
+    $ git reset --hard origin/master
+    $ git pull --ff-only
+    $ git merge topic-b
+    $ git merge topic-c
+------------
+
+The procedure will result in a history that looks like this:
+
+------------
+		 C0--C1--------------C2
+		/                     \
+    ---o---o---A---X---Y---Z---W---M'--N'
+		\                 /
+		 B0--B1---------B2
+------------
+
+See also http://git-blame.blogspot.com/2013/09/fun-with-first-parent-history.html
diff --git a/Documentation/howto/recover-corrupted-object-harder.txt b/Documentation/howto/recover-corrupted-object-harder.txt
index 6f33dac0e0..9c4cd0915f 100644
--- a/Documentation/howto/recover-corrupted-object-harder.txt
+++ b/Documentation/howto/recover-corrupted-object-harder.txt
@@ -38,7 +38,7 @@ zlib were failing).
 Reading the zlib source code, I found that "incorrect data check" means
 that the adler-32 checksum at the end of the zlib data did not match the
 inflated data. So stepping the data through zlib would not help, as it
-did not fail until the very end, when we realize the crc does not match.
+did not fail until the very end, when we realize the CRC does not match.
 The problematic bytes could be anywhere in the object data.
 
 The first thing I did was pull the broken data out of the packfile. I
@@ -195,7 +195,7 @@ halfway through:
 -------
 
 I let it run to completion, and got a few more hits at the end (where it
-was munging the crc to match our broken data). So there was a good
+was munging the CRC to match our broken data). So there was a good
 chance this middle hit was the source of the problem.
 
 I confirmed by tweaking the byte in a hex editor, zlib inflating the
@@ -240,3 +240,240 @@ But more importantly, git's hashing and checksumming noticed a problem
 that easily could have gone undetected in another system. The result
 still compiled, but would have caused an interesting bug (that would
 have been blamed on some random commit).
+
+
+The adventure continues...
+--------------------------
+
+I ended up doing this again! Same entity, new hardware. The assumption
+at this point is that the old disk corrupted the packfile, and then the
+corruption was migrated to the new hardware (because it was done by
+rsync or similar, and no fsck was done at the time of migration).
+
+This time, the affected blob was over 20 megabytes, which was far too
+large to do a brute-force on. I followed the instructions above to
+create the `zlib` file. I then used the `inflate` program below to pull
+the corrupted data from that. Examining that output gave me a hint about
+where in the file the corruption was. But now I was working with the
+file itself, not the zlib contents. So knowing the sha1 of the object
+and the approximate area of the corruption, I used the `sha1-munge`
+program below to brute-force the correct byte.
+
+Here's the inflate program (it's essentially `gunzip` but without the
+`.gz` header processing):
+
+--------------------------
+#include <stdio.h>
+#include <string.h>
+#include <zlib.h>
+#include <stdlib.h>
+
+int main(int argc, char **argv)
+{
+	/*
+	 * oversized so we can read the whole buffer in;
+	 * this could actually be switched to streaming
+	 * to avoid any memory limitations
+	 */
+	static unsigned char buf[25 * 1024 * 1024];
+	static unsigned char out[25 * 1024 * 1024];
+	int len;
+	z_stream z;
+	int ret;
+
+	len = read(0, buf, sizeof(buf));
+	memset(&z, 0, sizeof(z));
+	inflateInit(&z);
+
+	z.next_in = buf;
+	z.avail_in = len;
+	z.next_out = out;
+	z.avail_out = sizeof(out);
+
+	ret = inflate(&z, 0);
+	if (ret != Z_OK && ret != Z_STREAM_END)
+		fprintf(stderr, "initial inflate failed (%d)\n", ret);
+
+	fprintf(stderr, "outputting %lu bytes", z.total_out);
+	fwrite(out, 1, z.total_out, stdout);
+	return 0;
+}
+--------------------------
+
+And here is the `sha1-munge` program:
+
+--------------------------
+#include <stdio.h>
+#include <unistd.h>
+#include <string.h>
+#include <signal.h>
+#include <openssl/sha.h>
+#include <stdlib.h>
+
+/* eye candy */
+static int counter = 0;
+static void progress(int sig)
+{
+	fprintf(stderr, "\r%d", counter);
+	alarm(1);
+}
+
+static const signed char hexval_table[256] = {
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 00-07 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 08-0f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 10-17 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 18-1f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 20-27 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 28-2f */
+	  0,  1,  2,  3,  4,  5,  6,  7,		/* 30-37 */
+	  8,  9, -1, -1, -1, -1, -1, -1,		/* 38-3f */
+	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 40-47 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 48-4f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 50-57 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 58-5f */
+	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 60-67 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 68-67 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 70-77 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 78-7f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 80-87 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 88-8f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 90-97 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 98-9f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a0-a7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a8-af */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b0-b7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b8-bf */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c0-c7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c8-cf */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d0-d7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d8-df */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e0-e7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e8-ef */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f0-f7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f8-ff */
+};
+
+static inline unsigned int hexval(unsigned char c)
+{
+return hexval_table[c];
+}
+
+static int get_sha1_hex(const char *hex, unsigned char *sha1)
+{
+	int i;
+	for (i = 0; i < 20; i++) {
+		unsigned int val;
+		/*
+		 * hex[1]=='\0' is caught when val is checked below,
+		 * but if hex[0] is NUL we have to avoid reading
+		 * past the end of the string:
+		 */
+		if (!hex[0])
+			return -1;
+		val = (hexval(hex[0]) << 4) | hexval(hex[1]);
+		if (val & ~0xff)
+			return -1;
+		*sha1++ = val;
+		hex += 2;
+	}
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	/* oversized so we can read the whole buffer in */
+	static unsigned char buf[25 * 1024 * 1024];
+	char header[32];
+	int header_len;
+	unsigned char have[20], want[20];
+	int start, len;
+	SHA_CTX orig;
+	unsigned i, j;
+
+	if (!argv[1] || get_sha1_hex(argv[1], want)) {
+		fprintf(stderr, "usage: sha1-munge <sha1> [start] <file.in\n");
+		return 1;
+	}
+
+	if (argv[2])
+		start = atoi(argv[2]);
+	else
+		start = 0;
+
+	len = read(0, buf, sizeof(buf));
+	header_len = sprintf(header, "blob %d", len) + 1;
+	fprintf(stderr, "using header: %s\n", header);
+
+	/*
+	 * We keep a running sha1 so that if you are munging
+	 * near the end of the file, we do not have to re-sha1
+	 * the unchanged earlier bytes
+	 */
+	SHA1_Init(&orig);
+	SHA1_Update(&orig, header, header_len);
+	if (start)
+		SHA1_Update(&orig, buf, start);
+
+	signal(SIGALRM, progress);
+	alarm(1);
+
+	for (i = start; i < len; i++) {
+		unsigned char c;
+		SHA_CTX x;
+
+#if 0
+		/*
+		 * deletion -- this would not actually work in practice,
+		 * I think, because we've already committed to a
+		 * particular size in the header. Ditto for addition
+		 * below. In those cases, you'd have to do the whole
+		 * sha1 from scratch, or possibly keep three running
+		 * "orig" sha1 computations going.
+		 */
+		memcpy(&x, &orig, sizeof(x));
+		SHA1_Update(&x, buf + i + 1, len - i - 1);
+		SHA1_Final(have, &x);
+		if (!memcmp(have, want, 20))
+			printf("i=%d, deletion\n", i);
+#endif
+
+		/*
+		 * replacement -- note that this tries each of the 256
+		 * possible bytes. If you suspect a single-bit flip,
+		 * it would be much shorter to just try the 8
+		 * bit-flipped variants.
+		 */
+		c = buf[i];
+		for (j = 0; j <= 0xff; j++) {
+			buf[i] = j;
+
+			memcpy(&x, &orig, sizeof(x));
+			SHA1_Update(&x, buf + i, len - i);
+			SHA1_Final(have, &x);
+			if (!memcmp(have, want, 20))
+				printf("i=%d, j=%02x\n", i, j);
+		}
+		buf[i] = c;
+
+#if 0
+		/* addition */
+		for (j = 0; j <= 0xff; j++) {
+			unsigned char extra = j;
+			memcpy(&x, &orig, sizeof(x));
+			SHA1_Update(&x, &extra, 1);
+			SHA1_Update(&x, buf + i, len - i);
+			SHA1_Final(have, &x);
+			if (!memcmp(have, want, 20))
+				printf("i=%d, addition=%02x", i, j);
+		}
+#endif
+
+		SHA1_Update(&orig, buf + i, 1);
+		counter++;
+	}
+
+	alarm(0);
+	fprintf(stderr, "\r%d\n", counter);
+	return 0;
+}
+--------------------------
diff --git a/Documentation/howto/setup-git-server-over-http.txt b/Documentation/howto/setup-git-server-over-http.txt
index 6de4f3c487..f44e5e9458 100644
--- a/Documentation/howto/setup-git-server-over-http.txt
+++ b/Documentation/howto/setup-git-server-over-http.txt
@@ -181,7 +181,7 @@ On Debian:
 
    Most tests should pass.
 
-A command line tool to test WebDAV is cadaver. If you prefer GUIs, for
+A command-line tool to test WebDAV is cadaver. If you prefer GUIs, for
 example, konqueror can open WebDAV URLs as "webdav://..." or
 "webdavs://...".