diff options
Diffstat (limited to 'Documentation/git-fast-import.txt')
-rw-r--r-- | Documentation/git-fast-import.txt | 152 |
1 files changed, 125 insertions, 27 deletions
diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt index 3d3d219e58..7d9aad2a7e 100644 --- a/Documentation/git-fast-import.txt +++ b/Documentation/git-fast-import.txt @@ -9,7 +9,7 @@ git-fast-import - Backend for fast Git data importers SYNOPSIS -------- [verse] -frontend | 'git fast-import' [options] +frontend | 'git fast-import' [<options>] DESCRIPTION ----------- @@ -40,9 +40,10 @@ OPTIONS not contain the old commit). --quiet:: - Disable all non-fatal output, making fast-import silent when it - is successful. This option disables the output shown by - --stats. + Disable the output shown by --stats, making fast-import usually + be silent when it is successful. However, if the import stream + has directives intended to show user output (e.g. `progress` + directives), the corresponding messages will still be shown. --stats:: Display some basic statistics about the objects fast-import has @@ -50,6 +51,21 @@ OPTIONS memory used by fast-import during this run. Showing this output is currently the default, but can be disabled with --quiet. +--allow-unsafe-features:: + Many command-line options can be provided as part of the + fast-import stream itself by using the `feature` or `option` + commands. However, some of these options are unsafe (e.g., + allowing fast-import to access the filesystem outside of the + repository). These options are disabled by default, but can be + allowed by providing this option on the command line. This + currently impacts only the `export-marks`, `import-marks`, and + `import-marks-if-exists` feature commands. ++ + Only enable this option if you trust the program generating the + fast-import stream! This option is enabled automatically for + remote-helpers that use the `import` capability, as they are + already trusted to run their own code. + Options for Frontends ~~~~~~~~~~~~~~~~~~~~~ @@ -106,6 +122,26 @@ Locations of Marks Files Relative and non-relative marks may be combined by interweaving --(no-)-relative-marks with the --(import|export)-marks= options. +Submodule Rewriting +~~~~~~~~~~~~~~~~~~~ + +--rewrite-submodules-from=<name>:<file>:: +--rewrite-submodules-to=<name>:<file>:: + Rewrite the object IDs for the submodule specified by <name> from the values + used in the from <file> to those used in the to <file>. The from marks should + have been created by `git fast-export`, and the to marks should have been + created by `git fast-import` when importing that same submodule. ++ +<name> may be any arbitrary string not containing a colon character, but the +same value must be used with both options when specifying corresponding marks. +Multiple submodules may be specified with different values for <name>. It is an +error not to use these options in corresponding pairs. ++ +These options are primarily useful when converting a repository from one hash +algorithm to another; without them, fast-import will fail if it encounters a +submodule because it has no way of writing the object ID into the new hash +algorithm. + Performance and Compression Tuning ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -139,7 +175,7 @@ Performance and Compression Tuning fastimport.unpackLimit:: See linkgit:git-config[1] -Performance +PERFORMANCE ----------- The design of fast-import allows it to import large projects in a minimum amount of memory usage and processing time. Assuming the frontend @@ -155,7 +191,7 @@ faster if the source data is stored on a different drive than the destination Git repository (due to less IO contention). -Development Cost +DEVELOPMENT COST ---------------- A typical frontend for fast-import tends to weigh in at approximately 200 lines of Perl/Python/Ruby code. Most developers have been able to @@ -165,7 +201,7 @@ an ideal situation, given that most conversion tools are throw-away (use once, and never look back). -Parallel Operation +PARALLEL OPERATION ------------------ Like 'git push' or 'git fetch', imports handled by fast-import are safe to run alongside parallel `git repack -a -d` or `git gc` invocations, @@ -186,7 +222,7 @@ this only be used on an otherwise quiet repository. Using --force is not necessary for an initial import into an empty repository. -Technical Discussion +TECHNICAL DISCUSSION -------------------- fast-import tracks a set of branches in memory. Any branch can be created or modified at any point during the import process by sending a @@ -204,7 +240,7 @@ directory also allows fast-import to run very quickly, as it does not need to perform any costly file update operations when switching between branches. -Input Format +INPUT FORMAT ------------ With the exception of raw file data (which Git does not interpret) the fast-import input format is text (ASCII) based. This text based @@ -257,7 +293,14 @@ by users who are located in the same location and time zone. In this case a reasonable offset from UTC could be assumed. + Unlike the `rfc2822` format, this format is very strict. Any -variation in formatting will cause fast-import to reject the value. +variation in formatting will cause fast-import to reject the value, +and some sanity checks on the numeric values may also be performed. + +`raw-permissive`:: + This is the same as `raw` except that no sanity checks on + the numeric epoch and local offset are performed. This can + be useful when trying to filter or import an existing history + with e.g. bogus timezone values. `rfc2822`:: This is the standard email format as described by RFC 2822. @@ -336,6 +379,13 @@ and control the current import process. More detailed discussion `commit` command. This command is optional and is not needed to perform an import. +`alias`:: + Record that a mark refers to a given object without first + creating any new object. Using --import-marks and referring + to missing marks will cause fast-import to fail, so aliases + can provide a way to set otherwise pruned commits to a valid + value (e.g. the nearest non-pruned ancestor). + `checkpoint`:: Forces fast-import to close the current packfile, generate its unique SHA-1 checksum and index, and start a new packfile. @@ -384,11 +434,13 @@ change to the project. .... 'commit' SP <ref> LF mark? + original-oid? ('author' (SP <name>)? SP LT <email> GT SP <when> LF)? 'committer' (SP <name>)? SP LT <email> GT SP <when> LF + ('encoding' SP <encoding>)? data ('from' SP <commit-ish> LF)? - ('merge' SP <commit-ish> LF)? + ('merge' SP <commit-ish> LF)* (filemodify | filedelete | filecopy | filerename | filedeleteall | notemodify)* LF? .... @@ -420,7 +472,12 @@ However it is recommended that a `filedeleteall` command precede all `filemodify`, `filecopy`, `filerename` and `notemodify` commands in the same commit, as `filedeleteall` wipes the branch clean (see below). -The `LF` after the command is optional (it used to be required). +The `LF` after the command is optional (it used to be required). Note +that for reasons of backward compatibility, if the commit ends with a +`data` command (i.e. it has no `from`, `merge`, `filemodify`, +`filedelete`, `filecopy`, `filerename`, `filedeleteall` or +`notemodify` commands) then two `LF` commands may appear at the end of +the command instead of just one. `author` ^^^^^^^^ @@ -448,6 +505,12 @@ that was selected by the --date-format=<fmt> command-line option. See ``Date Formats'' above for the set of supported formats, and their syntax. +`encoding` +^^^^^^^^^^ +The optional `encoding` command indicates the encoding of the commit +message. Most commits are UTF-8 and the encoding is omitted, but this +allows importing commit messages into git without first reencoding them. + `from` ^^^^^^ The `from` command is used to specify the commit to initialize @@ -740,6 +803,19 @@ New marks are created automatically. Existing marks can be moved to another object simply by reusing the same `<idnum>` in another `mark` command. +`original-oid` +~~~~~~~~~~~~~~ +Provides the name of the object in the original source control system. +fast-import will simply ignore this directive, but filter processes +which operate on and modify the stream before feeding to fast-import +may have uses for this information + +.... + 'original-oid' SP <object-identifier> LF +.... + +where `<object-identifer>` is any string not containing LF. + `tag` ~~~~~ Creates an annotated tag referring to a specific commit. To create @@ -747,7 +823,9 @@ lightweight (non-annotated) tags see the `reset` command below. .... 'tag' SP <name> LF + mark? 'from' SP <commit-ish> LF + original-oid? 'tagger' (SP <name>)? SP LT <email> GT SP <when> LF data .... @@ -822,6 +900,7 @@ assigned mark. .... 'blob' LF mark? + original-oid? data .... @@ -884,6 +963,21 @@ a data chunk which does not have an LF as its last byte. + The `LF` after `<delim> LF` is optional (it used to be required). +`alias` +~~~~~~~ +Record that a mark refers to a given object without first creating any +new object. + +.... + 'alias' LF + mark + 'to' SP <commit-ish> LF + LF? +.... + +For a detailed description of `<commit-ish>` see above under `from`. + + `checkpoint` ~~~~~~~~~~~~ Forces fast-import to close the current packfile, start a new one, and to @@ -949,10 +1043,6 @@ might want to refer to in their commit messages. 'get-mark' SP ':' <idnum> LF .... -This command can be used anywhere in the stream that comments are -accepted. In particular, the `get-mark` command can be used in the -middle of a commit but not in the middle of a `data` command. - See ``Responses To Commands'' below for details about how to read this output safely. @@ -979,9 +1069,10 @@ Output uses the same format as `git cat-file --batch`: <contents> LF ==== -This command can be used anywhere in the stream that comments are -accepted. In particular, the `cat-blob` command can be used in the -middle of a commit but not in the middle of a `data` command. +This command can be used where a `filemodify` directive can appear, +allowing it to be used in the middle of a commit. For a `filemodify` +using an inline directive, it can also appear right before the `data` +directive. See ``Responses To Commands'' below for details about how to read this output safely. @@ -994,8 +1085,8 @@ printing a blob from the active commit (with `cat-blob`) or copying a blob or tree from a previous commit for use in the current one (with `filemodify`). -The `ls` command can be used anywhere in the stream that comments are -accepted, including the middle of a commit. +The `ls` command can also be used where a `filemodify` directive can +appear, allowing it to be used in the middle of a commit. Reading from the active commit:: This form can only be used in the middle of a `commit`. @@ -1131,7 +1222,7 @@ If the `--done` command-line option or `feature done` command is in use, the `done` command is mandatory and marks the end of the stream. -Responses To Commands +RESPONSES TO COMMANDS --------------------- New objects written by fast-import are not available immediately. Most fast-import commands have no visible effect until the next @@ -1160,7 +1251,7 @@ To avoid deadlock, such frontends must completely consume any pending output from `progress`, `ls`, `get-mark`, and `cat-blob` before performing writes to fast-import that might block. -Crash Reports +CRASH REPORTS ------------- If fast-import is supplied invalid input it will terminate with a non-zero exit status and create a crash report in the top level of @@ -1247,7 +1338,7 @@ An example crash: END OF CRASH REPORT ==== -Tips and Tricks +TIPS AND TRICKS --------------- The following tips and tricks have been collected from various users of fast-import, and are offered here as suggestions. @@ -1349,7 +1440,7 @@ Your users will feel better knowing how much of the data stream has been processed. -Packfile Optimization +PACKFILE OPTIMIZATION --------------------- When packing a blob fast-import always attempts to deltify against the last blob written. Unless specifically arranged for by the frontend, @@ -1379,8 +1470,15 @@ deltas are suboptimal (see above) then also adding the `-f` option to force recomputation of all deltas can significantly reduce the final packfile size (30-50% smaller can be quite typical). +Instead of running `git repack` you can also run `git gc +--aggressive`, which will also optimize other things after an import +(e.g. pack loose refs). As noted in the "AGGRESSIVE" section in +linkgit:git-gc[1] the `--aggressive` option will find new deltas with +the `-f` option to linkgit:git-repack[1]. For the reasons elaborated +on above using `--aggressive` after a fast-import is one of the few +cases where it's known to be worthwhile. -Memory Utilization +MEMORY UTILIZATION ------------------ There are a number of factors which affect how much memory fast-import requires to perform an import. Like critical sections of core @@ -1458,7 +1556,7 @@ and lazy loading of subtrees, allows fast-import to efficiently import projects with 2,000+ branches and 45,114+ files in a very limited memory footprint (less than 2.7 MiB per active branch). -Signals +SIGNALS ------- Sending *SIGUSR1* to the 'git fast-import' process ends the current packfile early, simulating a `checkpoint` command. The impatient |