summaryrefslogtreecommitdiff
path: root/vcs-svn/svndump.c
AgeCommit message (Collapse)AuthorFilesLines
2011-02-26Merge commit 'jn/svn-fe' of git://github.com/gitster/git into svn-feLibravatar Jonathan Nieder1-67/+142
* git://github.com/gitster/git: vcs-svn: Allow change nodes for root of tree (/) vcs-svn: Implement Prop-delta handling vcs-svn: Sharpen parsing of property lines vcs-svn: Split off function for handling of individual properties vcs-svn: Make source easier to read on small screens vcs-svn: More dump format sanity checks vcs-svn: Reject path nodes without Node-action vcs-svn: Delay read of per-path properties vcs-svn: Combine repo_replace and repo_modify functions vcs-svn: Replace = Delete + Add vcs-svn: handle_node: Handle deletion case early vcs-svn: Use mark to indicate nodes with included text vcs-svn: Unclutter handle_node by introducing have_props var vcs-svn: Eliminate node_ctx.mark global vcs-svn: Eliminate node_ctx.srcRev global vcs-svn: Check for errors from open() vcs-svn: Allow simple v3 dumps (no deltas yet) Conflicts: t/t9010-svn-fe.sh vcs-svn/svndump.c
2011-02-26vcs-svn: teach line_buffer to handle multiple input filesLibravatar Jonathan Nieder1-13/+16
Collect the line_buffer state in a newly public line_buffer struct. Callers can use multiple line_buffers to manage input from multiple files at a time. svn-fe's delta applier will use this to stream a delta from svnrdump and the preimage it applies to from fast-import at the same time. The tests don't take advantage of the new features, but I think that's okay. It is easier to find lingering examples of nonreentrant code by searching for "static" in line_buffer.c. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-01-18svndump.c: Fix a printf format compiler warningLibravatar Ramsay Jones1-1/+1
In particular, on systems that define uint32_t as an unsigned long, gcc complains as follows: CC vcs-svn/svndump.o vcs-svn/svndump.c: In function `svndump_read': vcs-svn/svndump.c:215: warning: int format, uint32_t arg (arg 2) In order to suppress the warning we use the C99 format specifier macro PRIu32 from <inttypes.h>. Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Acked-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-07vcs-svn: Allow change nodes for root of tree (/)Libravatar Jonathan Nieder1-1/+4
It is not uncommon for a svn repository to include change records for properties at the top level of the tracked tree: Node-path: Node-kind: dir Node-action: change Prop-delta: true Prop-content-length: 43 Content-length: 43 K 10 svn:ignore V 11 build-area PROPS-END Unfortunately a recent svn-fe change (vcs-svn: More dump format sanity checks, 2010-11-19) causes such nodes to be rejected with the error message fatal: invalid dump: path to be modified is missing The repo_tree module does not keep a dirent for the root of the tree. Add a block to the dump parser to take care of this case. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: Implement Prop-delta handlingLibravatar David Barr1-10/+44
The rules for what file is used as delta source for each file are not documented in dump-load-format.txt. Luckily, the Apache Software Foundation repository has rich enough examples to figure out most of the rules: Node-action: replace implies the empty property set and empty text as preimage for deltas. Otherwise, if a copyfrom source is given, that node is the preimage for deltas. Lastly, if none of the above applies and the node path exists in the current revision, then that version forms the basis. [jn: refactored, with tests] Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: Sharpen parsing of property linesLibravatar Jonathan Nieder1-11/+19
Prepare to add a new type of property line (the 'D' line) to handle property deltas. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: Split off function for handling of individual propertiesLibravatar Jonathan Nieder1-14/+19
The handle_property function is the part of read_props that would be interesting for most people: semantics of properties rather than the algorithm for parsing them. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: Make source easier to read on small screensLibravatar Jonathan Nieder1-8/+0
Remove some newlines from handle_node() that are not needed for clarity. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: More dump format sanity checksLibravatar Jonathan Nieder1-4/+14
Node-action: change is not appropriate when switching between file and directory or adding a new file. Current svn-fe silently accepts such nodes and the resulting tree has missing files in the "changed when meant to add" case. Node-action: add requires some content (text or directory); there is no such thing as an "intent to add" node in svn dumps. Current svn-fe accepts such contentless adds but produces an invalid fast-import stream that refers to nonexistent mark :0 in response. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: Reject path nodes without Node-actionLibravatar Jonathan Nieder1-2/+5
It would be better to flag such errors and let the import proceed anyway, but for now it is simpler not to worry about recovery from such weird cases. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: Delay read of per-path propertiesLibravatar Jonathan Nieder1-22/+18
The mode for each file in an svn-format dump is kept in the properties section. The properties section is read as soon as possible to allow the correct mode to be filled in when registering the file with the repo_tree lib. To support nodes with a missing properties section, svn-fe determines the mode in three stages: - The kind (directory or file) of the node is read from the dump and used to make an initial estimate (040000 or 100644). - Properties are read in and allowed to override this for symlinks and executables. - If there is no properties section, the mode from the previous content of the path is left alone, overriding the above considerations. This is a bit of a mess, and worse, it would get even more complicated once we start to support property deltas. If we could only register the file with a provisional value for mode and then change it later when properties say so, the procedure would be much simpler. ... oh, right, we can. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: Combine repo_replace and repo_modify functionsLibravatar Jonathan Nieder1-4/+4
There are two functions to change the staged content for a path in the svn importer's active commit: repo_replace, which changes the text and returns the mode, and repo_modify, which changes the text and mode and returns nothing. Worse, there are more subtle differences: - A mark of 0 passed to repo_modify means "use the existing content". repo_replace uses it as mark :0 and produces a corrupt stream. - When passed a path that is not part of the active commit, repo_replace returns without doing anything. repo_modify transparently adds a new directory entry. Get rid of both and introduce a new function with the best features of both: repo_modify_path modifies the mode, content, or both for a path, depending on which arguments are zero. If no such dirent already exists, it does nothing and reports the error by returning 0. Otherwise, the return value is the resulting mode. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: Replace = Delete + AddLibravatar Jonathan Nieder1-6/+7
Simplify by reducing the "Node-action: replace" case to "Node-action: add". This way, the main part of handle_node() only has to deal with "add" and "change" nodes. Functional change: replacing a symlink or executable without setting properties will reset the mode. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: handle_node: Handle deletion case earlyLibravatar Jonathan Nieder1-3/+8
Take care of "Node-action: delete" as soon as possible, so we can stop worrying about that case in the rest of the function. Functional change: catch deletion nodes with features that would not apply to them (text, properties, or origin data) and error out for those cases. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: Use mark to indicate nodes with included textLibravatar Jonathan Nieder1-8/+8
Allocate a mark if needed as soon as possible so later code can use "if (mark)" to check if this node has text attached rather than explicitly checking for Text-content-length. While at it, reject directory nodes with text attached; the presence of such a node would indicate a bug in the dump generator or svn-fe's understanding. In the long term, it would be nice to be able to continue parsing and save the error for later, but for now it is simpler to error out right away. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: Unclutter handle_node by introducing have_props varLibravatar Jonathan Nieder1-4/+5
It is possible for a path node in an SVN-format dump file to leave out the properties section. svn-fe handles this by carrying over the properties (in particular, file type) from the old version of that node. To support this, handle_node tests several times whether a Prop-content-length field is present. Ancient Subversion actually leaves out the Prop-content-length field even for nodes with properties, so that's not quite the right check. Besides, this detail of mechanism is distracting when the question at hand is instead what content the new node should have. So introduce a local have_props variable. The semantics are the same as before; the adaptations to support ancient streams that leave out the prop-content-length can wait until someone needs them. Signed-off-by: Jonathan Nieder <jrnieer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: Eliminate node_ctx.mark globalLibravatar Jonathan Nieder1-12/+11
The mark variable is only used in handle_node(). Its life is very short and simple: first, a new mark number is allocated if this node has text attached, then that mark is recorded in the in-core tree being built up, and lastly the mark is communicated to fast-import in the stream along with the associated text. A new reader may worry about interaction with other code, especially since mark is not initialized to zero in handle_node() itself. Disperse such worries by making it local. No functional change intended. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: Eliminate node_ctx.srcRev globalLibravatar Jonathan Nieder1-7/+8
The srcRev variable is only used in handle_node(); its purpose is to hold the old mode for a path, to only be used if properties are not being changed. Narrow its scope to make its meaningful lifetime more obvious. No functional change intended. Add some tests as a sanity-check for the simplest case (no renames). Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: Check for errors from open()Libravatar Jonathan Nieder1-2/+4
test-svn-fe segfaults when passed a bogus path. Simplify debugging by exiting with a meaningful error message instead. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: Allow simple v3 dumps (no deltas yet)Libravatar David Barr1-3/+18
Since the dumpfile version 1 days, the Subversion dump format gained some new fields: - a unique identifier for the repository (version 2 format) - whether the text and properties for a node should be interpreted as deltas - checksums for a delta's preimage - SHA-1 sums as alternatives to the existing MD5 checksums for copy source and the payload (delta). For now what is relevant to us is the Text-delta and Prop-delta fields, since not noticing these causes a dump file to be misinterpreted (see the previous commit). [jn: with tests] Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24vcs-svn: Error out for v3 dumpsLibravatar Jonathan Nieder1-3/+10
By ignoring the Text-Delta and Prop-Delta node fields, current svn-fe happily mistakes deltas for full text and instead of cleanly erroring out, it produces a valid but semantically bogus fast-import stream when fed a dump file in the modern "svnadmin dump --deltas" format. Dump file parsers are supposed to ignore header fields they don't understand (to allow for backward-compatible extensions), but they are also supposed to check the SVN-fs-dump-format-version header to prevent misinterpretation of non backward-compatible extensions. Do so. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-12vcs-svn: Fix some printf format compiler warningsLibravatar Ramsay Jones1-1/+1
In particular, on systems that define uint32_t as an unsigned long, gcc complains as follows: CC vcs-svn/fast_export.o vcs-svn/fast_export.c: In function `fast_export_modify': vcs-svn/fast_export.c:28: warning: unsigned int format, uint32_t arg (arg 2) vcs-svn/fast_export.c:28: warning: int format, uint32_t arg (arg 3) vcs-svn/fast_export.c: In function `fast_export_commit': vcs-svn/fast_export.c:42: warning: int format, uint32_t arg (arg 5) vcs-svn/fast_export.c:62: warning: int format, uint32_t arg (arg 2) vcs-svn/fast_export.c: In function `fast_export_blob': vcs-svn/fast_export.c:72: warning: int format, uint32_t arg (arg 2) vcs-svn/fast_export.c:72: warning: int format, uint32_t arg (arg 3) CC vcs-svn/svndump.o vcs-svn/svndump.c: In function `svndump_read': vcs-svn/svndump.c:260: warning: int format, uint32_t arg (arg 3) In order to suppress the warnings we use the C99 format specifier macros PRIo32 and PRIu32 from <inttypes.h>. Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Acked-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-14SVN dump parserLibravatar David Barr1-0/+302
svndump parses data that is in SVN dumpfile format produced by `svnadmin dump` with the help of line_buffer and uses repo_tree and fast_export to emit a git fast-import stream. Based roughly on com.hydrografix.svndump 0.92 from the SvnToCCase project at <http://svn2cc.sarovar.org/>, by Stefan Hegny and others. [rr: allow input from files other than stdin] [jn: with test, more error reporting] Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>