path: root/t/t4013/diff.diff-tree_--pretty_--patch-with-stat_initial
author: Karsten Blees <karsten.blees@gmail.com>, 2015-07-01 21:10:47 +0200
committer: Junio C Hamano <gitster@pobox.com>, 2015-07-01 14:55:53 -0700
commit: 3a59e5954ef19ac94522219c2f29d49a187d31d8
tree: b79952098c087d313e92af57fb8fe0f87bb7a696 /t/t4013/diff.diff-tree_--pretty_--patch-with-stat_initial
parent: Second half of seventh batch
Documentation/i18n.txt: clarify character encoding support
As a "distributed" VCS, git should better define the encodings of its core textual data structures, in particular those that are part of the network protocol.

That git is encoding agnostic is only really true for blob objects. E.g. the 'non-NUL bytes' requirement of tree and commit objects excludes UTF-16/32, and the special meaning of '/' in the index file as well as space and linefeed in commit objects eliminates EBCDIC and other non-ASCII encodings.

Git expects bytes < 0x80 to be pure ASCII, thus CJK encodings that partly overlap with the ASCII range are problematic as well. E.g. fmt_ident() removes trailing 0x5C from user names on the assumption that it is ASCII '\'. However, there are over 200 GBK double byte codes that end in 0x5C.

UTF-8 as default encoding on Linux and respective path translations in the Mac and Windows versions have established UTF-8 NFC as de-facto standard for path names.

Update the documentation in i18n.txt to reflect the current status-quo.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
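
The three encoding problems described in the message can be reproduced with a short sketch. It uses Python's built-in codecs rather than any git code. The helper strip_trailing_backslash is a hypothetical stand-in for the kind of trailing-0x5C cleanup the message attributes to fmt_ident(), not its actual implementation. Code page 037 is assumed as a representative EBCDIC variant, and the byte pair 0x81 0x5C is just one GBK code chosen for illustration.

```python
# Illustrative only: Python's standard codecs, not git's own code.

# 1. UTF-16 encodes ASCII-range characters with NUL bytes, which the
#    'non-NUL bytes' requirement of tree and commit objects rules out.
utf16 = "tree".encode("utf-16-le")
has_nul = b"\x00" in utf16

# 2. EBCDIC (code page 037 assumed here) gives '/', space and linefeed
#    byte values that differ from their ASCII ones.
ebcdic = "/ \n".encode("cp037")
ascii_bytes = "/ \n".encode("ascii")
ebcdic_differs = all(e != a for e, a in zip(ebcdic, ascii_bytes))

# 3. A GBK double-byte code whose second byte is 0x5C, the ASCII backslash.
gbk_char = bytes([0x81, 0x5C]).decode("gbk")
name = ("Name" + gbk_char).encode("gbk")


def strip_trailing_backslash(raw: bytes) -> bytes:
    """Hypothetical cleanup: drop trailing 0x5C bytes, assuming they are ASCII."""
    return raw.rstrip(b"\\")


stripped = strip_trailing_backslash(name)

# After stripping, the lead byte 0x81 is left dangling, so the name is
# no longer valid GBK.
try:
    stripped.decode("gbk")
    damaged = False
except UnicodeDecodeError:
    damaged = True
```

The third case is the one that silently corrupts data: the cleanup cannot tell a real backslash from the trail byte of a double-byte character without knowing the encoding.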
Diffstat (limited to 't/t4013/diff.diff-tree_--pretty_--patch-with-stat_initial')
0 files changed, 0 insertions, 0 deletions