1 files changed, 216 insertions, 14 deletions
diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index ef67b53f72..c6e536f180 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -1,5 +1,5 @@
 Like other projects, we also have some guidelines to keep to the
-code.  For Git in general, three rough rules are:
+code.  For Git in general, a few rough rules are:
 
  - Most importantly, we never say "It's in POSIX; we'll happily
    ignore your needs should your system not conform to it."
@@ -18,6 +18,14 @@ code.  For Git in general, three rough rules are:
    judgement call, the decision based more on real world
    constraints people face than what the paper standard says.
 
+ - Fixing style violations while working on a real change as a
+   preparatory clean-up step is good, but otherwise avoid useless code
+   churn for the sake of conforming to the style.
+
+   "Once it _is_ in the tree, it's not really worth the patch noise to
+   go and fix it up."
+   Cf. http://article.gmane.org/gmane.linux.kernel/943020
+
 Make your code readable and sensible, and don't try to be clever.
 
 As for more concrete guidelines, just imitate the existing code
@@ -34,7 +42,17 @@ For shell scripts specifically (not exhaustive):
 
  - We use tabs for indentation.
 
- - Case arms are indented at the same depth as case and esac lines.
+ - Case arms are indented at the same depth as case and esac lines,
+   like this:
+
+	case "$variable" in
+	pattern1)
+		do this
+		;;
+	pattern2)
+		do that
+		;;
+	esac
 
  - Redirection operators should be written with space before, but no
    space after them.  In other words, write 'echo test >"$file"'
@@ -43,6 +61,14 @@ For shell scripts specifically (not exhaustive):
    redirection target in a variable (as shown above), our code does so
    because some versions of bash issue a warning without the quotes.
 
+	(incorrect)
+	cat hello > world < universe
+	echo hello >$world
+
+	(correct)
+	cat hello >world <universe
+	echo hello >"$world"
+
  - We prefer $( ... ) for command substitution; unlike ``, it
    properly nests.  It should have been the way Bourne spelled
    it from day one, but unfortunately isn't.
@@ -81,23 +107,42 @@ For shell scripts specifically (not exhaustive):
    "then" should be on the next line for if statements, and "do"
    should be on the next line for "while" and "for".
 
+	(incorrect)
+	if test -f hello; then
+		do this
+	fi
+
+	(correct)
+	if test -f hello
+	then
+		do this
+	fi
+
  - We prefer "test" over "[ ... ]".
 
  - We do not write the noiseword "function" in front of shell
    functions.
 
- - We prefer a space between the function name and the parentheses. The
-   opening "{" should also be on the same line.
-   E.g.: my_function () {
+ - We prefer a space between the function name and the parentheses,
+   and no space inside the parentheses. The opening "{" should also
+   be on the same line.
+
+	(incorrect)
+	my_function(){
+		...
+
+	(correct)
+	my_function () {
+		...
 
  - As to use of grep, stick to a subset of BRE (namely, no \{m,n\},
-   [::], [==], nor [..]) for portability.
+   [::], [==], or [..]) for portability.
 
    - We do not use \{m,n\};
 
    - We do not use -E;
 
-   - We do not use ? nor + (which are \{0,1\} and \{1,\}
+   - We do not use ? or + (which are \{0,1\} and \{1,\}
      respectively in BRE) but that goes without saying as these
      are ERE elements not BRE (note that \? and \+ are not even part
      of BRE -- making them accessible from BRE is a GNU extension).
@@ -106,6 +151,19 @@ For shell scripts specifically (not exhaustive):
    interface translatable. See "Marking strings for translation" in
    po/README.
 
+ - We do not write our "test" command with "-a" and "-o" and use "&&"
+   or "||" to concatenate multiple "test" commands instead, because
+   the use of "-a/-o" is often error-prone.  E.g.
+
+     test -n "$x" -a "$a" = "$b"
+
+   is buggy and breaks when $x is "=", but
+
+     test -n "$x" && test "$a" = "$b"
+
+   does not have such a problem.
+
+
 For C programs:
 
  - We use tabs to indent, and interpret tabs as taking up to
@@ -126,6 +184,17 @@ For C programs:
    "char * string".  This makes it easier to understand code
    like "char *string, c;".
 
+ - Use whitespace around operators and keywords, but not inside
+   parentheses and not around functions. So:
+
+        while (condition)
+		func(bar + 1);
+
+   and not:
+
+        while( condition )
+		func (bar+1);
+
  - We avoid using braces unnecessarily.  I.e.
 
 	if (bla) {
@@ -138,7 +207,7 @@ For C programs:
    of "else if" statements, it can make sense to add braces to
    single line blocks.
 
- - We try to avoid assignments inside if().
+ - We try to avoid assignments in the condition of an "if" statement.
 
  - Try to make your code understandable.  You may put comments
    in, but comments invariably tend to stale out when the code
@@ -153,9 +222,101 @@ For C programs:
 	 * multi-line comment.
 	 */
 
+   Note however that a comment that explains a translatable string to
+   translators uses a convention of starting with a magic token
+   "TRANSLATORS: " immediately after the opening delimiter, even when
+   it spans multiple lines.  We do not add an asterisk at the beginning
+   of each line, either.  E.g.
+
+	/* TRANSLATORS: here is a comment that explains the string
+	   to be translated, that follows immediately after it */
+	_("Here is a translatable string explained by the above.");
+
  - Double negation is often harder to understand than no negation
    at all.
 
+ - There are two schools of thought when it comes to comparison,
+   especially inside a loop. Some people prefer to have the less stable
+   value on the left hand side and the more stable value on the right hand
+   side, e.g. if you have a loop that counts variable i down to the
+   lower bound,
+
+	while (i > lower_bound) {
+		do something;
+		i--;
+	}
+
+   Other people prefer to have the textual order of values match the
+   actual order of values in their comparison, so that they can
+   mentally draw a number line from left to right and place these
+   values in order, i.e.
+
+	while (lower_bound < i) {
+		do something;
+		i--;
+	}
+
+   Both are valid, and we use both.  However, the more "stable" the
+   stable side becomes, the more we tend to prefer the former
+   (comparison with a constant, "i > 0", is an extreme example).
+   Just do not mix styles in the same part of the code and mimic
+   existing styles in the neighbourhood.
+
+ - There are two schools of thought when it comes to splitting a long
+   logical line into multiple lines.  Some people push the second and
+   subsequent lines far enough to the right with tabs and align them:
+
+        if (the_beginning_of_a_very_long_expression_that_has_to ||
+		span_more_than_a_single_line_of ||
+		the_source_text) {
+                ...
+
+   while other people prefer to align the second and the subsequent
+   lines with the column immediately inside the opening parenthesis,
+   with tabs and spaces, following our "tabstop is always a multiple
+   of 8" convention:
+
+        if (the_beginning_of_a_very_long_expression_that_has_to ||
+	    span_more_than_a_single_line_of ||
+	    the_source_text) {
+                ...
+
+   Both are valid, and we use both.  Again, just do not mix styles in
+   the same part of the code and mimic existing styles in the
+   neighbourhood.
+
+ - When splitting a long logical line, some people change line before
+   a binary operator, so that the result looks like a parse tree when
+   you turn your head 90-degrees counterclockwise:
+
+        if (the_beginning_of_a_very_long_expression_that_has_to
+	    || span_more_than_a_single_line_of_the_source_text) {
+
+   while other people prefer to leave the operator at the end of the
+   line:
+
+        if (the_beginning_of_a_very_long_expression_that_has_to ||
+	    span_more_than_a_single_line_of_the_source_text) {
+
+   Both are valid, but we tend to use the latter more, unless the
+   expression gets fairly complex, in which case the former tends to
+   be easier to read.  Again, just do not mix styles in the same part
+   of the code and mimic existing styles in the neighbourhood.
+
+ - When splitting a long logical line, with everything else being
+   equal, it is preferable to split after the operator at higher
+   level in the parse tree.  That is, this is more preferable:
+
+	if (a_very_long_variable * that_is_used_in +
+	    a_very_long_expression) {
+		...
+
+   than
+
+	if (a_very_long_variable *
+	    that_is_used_in + a_very_long_expression) {
+		...
+
  - Some clever tricks, like using the !! operator with arithmetic
    constructs, can be extremely confusing to others.  Avoid them,
    unless there is a compelling reason to use them.
@@ -167,9 +328,14 @@ For C programs:
 
  - When you come up with an API, document it.
 
- - The first #include in C files, except in platform specific
-   compat/ implementations, should be git-compat-util.h or another
-   header file that includes it, such as cache.h or builtin.h.
+ - The first #include in C files, except in platform specific compat/
+   implementations, must be either "git-compat-util.h", "cache.h" or
+   "builtin.h".  You do not have to include more than one of these.
+
+ - A C file must directly include the header files that declare the
+   functions and the types it uses, except for the functions and types
+   that are made available to it by including one of the header files
+   it must include by the previous rule.
 
  - If you are planning a new command, consider writing it in shell
    or perl first, so that changes in semantics can be easily
@@ -243,6 +409,38 @@ For Python scripts:
    documentation for version 2.6 does not mention this prefix, it has
    been supported since version 2.6.0.
 
+Error Messages
+
+ - Do not end error messages with a full stop.
+
+ - Do not capitalize ("unable to open %s", not "Unable to open %s")
+
+ - Say what the error is first ("cannot open %s", not "%s: cannot open")
+
+
+Externally Visible Names
+
+ - For configuration variable names, follow the existing convention:
+
+   . The section name indicates the affected subsystem.
+
+   . The subsection name, if any, indicates which of an unbounded set
+     of things to set the value for.
+
+   . The variable name describes the effect of tweaking this knob.
+
+   The section and variable names that consist of multiple words are
+   formed by concatenating the words without punctuations (e.g. `-`),
+   and are broken using bumpyCaps in documentation as a hint to the
+   reader.
+
+   When choosing the variable namespace, do not use variable name for
+   specifying possibly unbounded set of things, most notably anything
+   an end user can freely come up with (e.g. branch names).  Instead,
+   use subsection names or variable values, like the existing variable
+   branch.<name>.description does.
+
+
 Writing Documentation:
 
  Most (if not all) of the documentation pages are written in the
@@ -271,6 +469,10 @@ Writing Documentation:
    --sort=<key>
    --abbrev[=<n>]
 
+ If a placeholder has multiple words, they are separated by dashes:
+   <new-branch-name>
+   --template=<template-directory>
+
  Possibility of multiple occurrences is indicated by three dots:
    <file>...
    (One or more of <file>.)
@@ -287,12 +489,12 @@ Writing Documentation:
    (Zero or more of <patch>.  Note that the dots are inside, not
    outside the brackets.)
 
- Multiple alternatives are indicated with vertical bar:
+ Multiple alternatives are indicated with vertical bars:
    [-q | --quiet]
    [--utf8 | --no-utf8]
 
  Parentheses are used for grouping:
-   [(<rev>|<range>)...]
+   [(<rev> | <range>)...]
    (Any number of either <rev> or <range>.  Parens are needed to make
    it clear that "..." pertains to both <rev> and <range>.)
 
@@ -324,7 +526,7 @@ Writing Documentation:
  `backticks around word phrases`, do so.
    `--pretty=oneline`
    `git rev-list`
-   `remote.pushdefault`
+   `remote.pushDefault`
 
  Word phrases enclosed in `backtick characters` are rendered literally
  and will not be further expanded. The use of `backticks` to achieve the