summaryrefslogtreecommitdiff
path: root/Documentation/gitattributes.txt
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/gitattributes.txt')
-rw-r--r--Documentation/gitattributes.txt250
1 files changed, 206 insertions, 44 deletions
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 643c1ba929..e0b66c1220 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -80,7 +80,7 @@ Attributes which should be version-controlled and distributed to other
repositories (i.e., attributes of interest to all users) should go into
`.gitattributes` files. Attributes that should affect all repositories
for a single user should be placed in a file specified by the
-`core.attributesfile` configuration option (see linkgit:git-config[1]).
+`core.attributesFile` configuration option (see linkgit:git-config[1]).
Its default value is $XDG_CONFIG_HOME/git/attributes. If $XDG_CONFIG_HOME
is either not set or empty, $HOME/.config/git/attributes is used instead.
Attributes for all users on a system should be placed in the
@@ -115,6 +115,7 @@ text file is normalized, its line endings are converted to LF in the
repository. To control what line ending style is used in the working
directory, use the `eol` attribute for a single file and the
`core.eol` configuration variable for all text files.
+Note that `core.autocrlf` overrides `core.eol`
Set::
@@ -130,8 +131,9 @@ Unset::
Set to string value "auto"::
When `text` is set to "auto", the path is marked for automatic
- end-of-line normalization. If Git decides that the content is
- text, its line endings are normalized to LF on checkin.
+ end-of-line conversion. If Git decides that the content is
+ text, its line endings are converted to LF on checkin.
+ When the file has been committed with CRLF, no conversion is done.
Unspecified::
@@ -146,7 +148,7 @@ unspecified.
^^^^^
This attribute sets a specific line-ending style to be used in the
-working directory. It enables end-of-line normalization without any
+working directory. It enables end-of-line conversion without any
content checks, effectively setting the `text` attribute.
Set to string value "crlf"::
@@ -180,60 +182,51 @@ While Git normally leaves file contents alone, it can be configured to
normalize line endings to LF in the repository and, optionally, to
convert them to CRLF when files are checked out.
-Here is an example that will make Git normalize .txt, .vcproj and .sh
-files, ensure that .vcproj files have CRLF and .sh files have LF in
-the working directory, and prevent .jpg files from being normalized
-regardless of their content.
-
-------------------------
-*.txt text
-*.vcproj eol=crlf
-*.sh eol=lf
-*.jpg -text
-------------------------
-
-Other source code management systems normalize all text files in their
-repositories, and there are two ways to enable similar automatic
-normalization in Git.
-
If you simply want to have CRLF line endings in your working directory
regardless of the repository you are working with, you can set the
-config variable "core.autocrlf" without changing any attributes.
+config variable "core.autocrlf" without using any attributes.
------------------------
[core]
autocrlf = true
------------------------
-This does not force normalization of all text files, but does ensure
+This does not force normalization of text files, but does ensure
that text files that you introduce to the repository have their line
endings normalized to LF when they are added, and that files that are
already normalized in the repository stay normalized.
-If you want to interoperate with a source code management system that
-enforces end-of-line normalization, or you simply want all text files
-in your repository to be normalized, you should instead set the `text`
-attribute to "auto" for _all_ files.
+If you want to ensure that text files that any contributor introduces to
+the repository have their line endings normalized, you can set the
+`text` attribute to "auto" for _all_ files.
------------------------
* text=auto
------------------------
-This ensures that all files that Git considers to be text will have
-normalized (LF) line endings in the repository. The `core.eol`
-configuration variable controls which line endings Git will use for
-normalized files in your working directory; the default is to use the
-native line ending for your platform, or CRLF if `core.autocrlf` is
-set.
+The attributes allow a fine-grained control, how the line endings
+are converted.
+Here is an example that will make Git normalize .txt, .vcproj and .sh
+files, ensure that .vcproj files have CRLF and .sh files have LF in
+the working directory, and prevent .jpg files from being normalized
+regardless of their content.
+
+------------------------
+* text=auto
+*.txt text
+*.vcproj text eol=crlf
+*.sh text eol=lf
+*.jpg -text
+------------------------
+
+NOTE: When `text=auto` conversion is enabled in a cross-platform
+project using push and pull to a central repository the text files
+containing CRLFs should be normalized.
-NOTE: When `text=auto` normalization is enabled in an existing
-repository, any text files containing CRLFs should be normalized. If
-they are not they will be normalized the next time someone tries to
-change them, causing unfortunate misattribution. From a clean working
-directory:
+From a clean working directory:
-------------------------------------------------
-$ echo "* text=auto" >>.gitattributes
+$ echo "* text=auto" >.gitattributes
$ rm .git/index # Remove the index to force Git to
$ git reset # re-scan the working directory
$ git status # Show files that will be normalized
@@ -300,7 +293,15 @@ checkout, when the `smudge` command is specified, the command is
fed the blob object from its standard input, and its standard
output is used to update the worktree file. Similarly, the
`clean` command is used to convert the contents of worktree file
-upon checkin.
+upon checkin. By default these commands process only a single
+blob and terminate. If a long running `process` filter is used
+in place of `clean` and/or `smudge` filters, then Git can process
+all blobs with a single filter command invocation for the entire
+life of a single Git command, for example `git add --all`. If a
+long running `process` filter is configured then it always takes
+precedence over a configured single blob filter. See section
+below for the description of the protocol used to communicate with
+a `process` filter.
One use of the content filtering is to massage the content into a shape
that is more convenient for the platform, filesystem, and the user to use.
@@ -374,6 +375,160 @@ substitution. For example:
smudge = git-p4-filter --smudge %f
------------------------
+Note that "%f" is the name of the path that is being worked on. Depending
+on the version that is being filtered, the corresponding file on disk may
+not exist, or may have different contents. So, smudge and clean commands
+should not try to access the file on disk, but only act as filters on the
+content provided to them on standard input.
+
+Long Running Filter Process
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If the filter command (a string value) is defined via
+`filter.<driver>.process` then Git can process all blobs with a
+single filter invocation for the entire life of a single Git
+command. This is achieved by using a packet format (pkt-line,
+see technical/protocol-common.txt) based protocol over standard
+input and standard output as follows. All packets, except for the
+"*CONTENT" packets and the "0000" flush packet, are considered
+text and therefore are terminated by a LF.
+
+Git starts the filter when it encounters the first file
+that needs to be cleaned or smudged. After the filter started
+Git sends a welcome message ("git-filter-client"), a list of supported
+protocol version numbers, and a flush packet. Git expects to read a welcome
+response message ("git-filter-server"), exactly one protocol version number
+from the previously sent list, and a flush packet. All further
+communication will be based on the selected version. The remaining
+protocol description below documents "version=2". Please note that
+"version=42" in the example below does not exist and is only there
+to illustrate how the protocol would look like with more than one
+version.
+
+After the version negotiation Git sends a list of all capabilities that
+it supports and a flush packet. Git expects to read a list of desired
+capabilities, which must be a subset of the supported capabilities list,
+and a flush packet as response:
+------------------------
+packet: git> git-filter-client
+packet: git> version=2
+packet: git> version=42
+packet: git> 0000
+packet: git< git-filter-server
+packet: git< version=2
+packet: git< 0000
+packet: git> capability=clean
+packet: git> capability=smudge
+packet: git> capability=not-yet-invented
+packet: git> 0000
+packet: git< capability=clean
+packet: git< capability=smudge
+packet: git< 0000
+------------------------
+Supported filter capabilities in version 2 are "clean" and
+"smudge".
+
+Afterwards Git sends a list of "key=value" pairs terminated with
+a flush packet. The list will contain at least the filter command
+(based on the supported capabilities) and the pathname of the file
+to filter relative to the repository root. Right after the flush packet
+Git sends the content split in zero or more pkt-line packets and a
+flush packet to terminate content. Please note, that the filter
+must not send any response before it received the content and the
+final flush packet. Also note that the "value" of a "key=value" pair
+can contain the "=" character whereas the key would never contain
+that character.
+------------------------
+packet: git> command=smudge
+packet: git> pathname=path/testfile.dat
+packet: git> 0000
+packet: git> CONTENT
+packet: git> 0000
+------------------------
+
+The filter is expected to respond with a list of "key=value" pairs
+terminated with a flush packet. If the filter does not experience
+problems then the list must contain a "success" status. Right after
+these packets the filter is expected to send the content in zero
+or more pkt-line packets and a flush packet at the end. Finally, a
+second list of "key=value" pairs terminated with a flush packet
+is expected. The filter can change the status in the second list
+or keep the status as is with an empty list. Please note that the
+empty list must be terminated with a flush packet regardless.
+
+------------------------
+packet: git< status=success
+packet: git< 0000
+packet: git< SMUDGED_CONTENT
+packet: git< 0000
+packet: git< 0000 # empty list, keep "status=success" unchanged!
+------------------------
+
+If the result content is empty then the filter is expected to respond
+with a "success" status and a flush packet to signal the empty content.
+------------------------
+packet: git< status=success
+packet: git< 0000
+packet: git< 0000 # empty content!
+packet: git< 0000 # empty list, keep "status=success" unchanged!
+------------------------
+
+In case the filter cannot or does not want to process the content,
+it is expected to respond with an "error" status.
+------------------------
+packet: git< status=error
+packet: git< 0000
+------------------------
+
+If the filter experiences an error during processing, then it can
+send the status "error" after the content was (partially or
+completely) sent.
+------------------------
+packet: git< status=success
+packet: git< 0000
+packet: git< HALF_WRITTEN_ERRONEOUS_CONTENT
+packet: git< 0000
+packet: git< status=error
+packet: git< 0000
+------------------------
+
+In case the filter cannot or does not want to process the content
+as well as any future content for the lifetime of the Git process,
+then it is expected to respond with an "abort" status at any point
+in the protocol.
+------------------------
+packet: git< status=abort
+packet: git< 0000
+------------------------
+
+Git neither stops nor restarts the filter process in case the
+"error"/"abort" status is set. However, Git sets its exit code
+according to the `filter.<driver>.required` flag, mimicking the
+behavior of the `filter.<driver>.clean` / `filter.<driver>.smudge`
+mechanism.
+
+If the filter dies during the communication or does not adhere to
+the protocol then Git will stop the filter process and restart it
+with the next file that needs to be processed. Depending on the
+`filter.<driver>.required` flag Git will interpret that as error.
+
+After the filter has processed a blob it is expected to wait for
+the next "key=value" list containing a command. Git will close
+the command pipe on exit. The filter is expected to detect EOF
+and exit gracefully on its own. Git will wait until the filter
+process has stopped.
+
+A long running filter demo implementation can be found in
+`contrib/long-running-filter/example.pl` located in the Git
+core repository. If you develop your own long running filter
+process then the `GIT_TRACE_PACKET` environment variables can be
+very helpful for debugging (see linkgit:git[1]).
+
+Please note that you cannot use an existing `filter.<driver>.clean`
+or `filter.<driver>.smudge` command with `filter.<driver>.process`
+because the former two use a different inter process communication
+protocol than the latter one.
+
Interaction between checkin/checkout attributes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -440,8 +595,8 @@ Unspecified::
A path to which the `diff` attribute is unspecified
first gets its contents inspected, and if it looks like
- text, it is treated as text. Otherwise it would
- generate `Binary files differ`.
+ text and is smaller than core.bigFileThreshold, it is treated
+ as text. Otherwise it would generate `Binary files differ`.
String::
@@ -525,8 +680,12 @@ patterns are available:
- `csharp` suitable for source code in the C# language.
+- `css` suitable for cascading style sheets.
+
- `fortran` suitable for source code in the Fortran language.
+- `fountain` suitable for Fountain documents.
+
- `html` suitable for HTML/XHTML documents.
- `java` suitable for source code in the Java language.
@@ -665,7 +824,7 @@ data by examining the beginning of the contents. However, sometimes you
may want to override its decision, either because a blob contains binary
data later in the file, or because the content, while technically
composed of text characters, is opaque to a human reader. For example,
-many postscript files contain only ascii characters, but produce noisy
+many postscript files contain only ASCII characters, but produce noisy
and meaningless diffs.
The simplest way to mark a file as binary is to unset the diff
@@ -680,7 +839,7 @@ patch, if binary patches are enabled) instead of a regular diff.
However, one may also want to specify other diff driver attributes. For
example, you might want to use `textconv` to convert postscript files to
-an ascii representation for human viewing, but otherwise treat them as
+an ASCII representation for human viewing, but otherwise treat them as
binary files. You cannot specify both `-diff` and `diff=ps` attributes.
The solution is to use the `diff.*.binary` config option:
@@ -774,7 +933,7 @@ To define a custom merge driver `filfre`, add a section to your
----------------------------------------------------------------
[merge "filfre"]
name = feel-free merge driver
- driver = filfre %O %A %B
+ driver = filfre %O %A %B %L %P
recursive = binary
----------------------------------------------------------------
@@ -800,6 +959,9 @@ merge between common ancestors, when there are more than one.
When left unspecified, the driver itself is used for both
internal merge and the final merge.
+The merge driver can learn the pathname in which the merged result
+will be stored via placeholder `%P`.
+
`conflict-marker-size`
^^^^^^^^^^^^^^^^^^^^^^