summaryrefslogtreecommitdiff
path: root/vcs-svn
AgeCommit message (Collapse)AuthorFilesLines
2011-05-26vcs-svn: implement text-delta handlingLibravatar David Barr3-5/+120
Handle input in Subversion's dumpfile format, version 3. This is the format produced by "svnrdump dump" and "svnadmin dump --deltas", and the main difference between v3 dumpfiles and the dumpfiles already handled is that these can include nodes whose properties and text are expressed relative to some other node. To handle such nodes, we find which node the text and properties are based on, handle its property changes, use the cat-blob command to request the basis blob from the fast-import backend, use the svndiff0_apply() helper to apply the text delta on the fly, writing output to a temporary file, and then measure that postimage file's length and write its content to the fast-import stream. The temporary postimage file is shared between delta-using nodes to avoid some file system overhead. The svn-fe interface needs to be more complicated to accomodate the backward flow of information from the fast-import backend to svn-fe. The backflow fd is not needed when parsing streams without deltas, though, so existing scripts using svn-fe on v2 dumps should continue to work. NEEDSWORK: generalize interface so caller sets the backflow fd, close temporary file before exiting Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-05-26Merge branch 'db/delta-applier' into db/text-deltaLibravatar Jonathan Nieder7-4/+418
* db/delta-applier: vcs-svn: let deltas use data from preimage vcs-svn: let deltas use data from postimage vcs-svn: verify that deltas consume all inline data vcs-svn: implement copyfrom_data delta instruction vcs-svn: read instructions from deltas vcs-svn: read inline data from deltas vcs-svn: read the preimage when applying deltas vcs-svn: parse svndiff0 window header vcs-svn: skeleton of an svn delta parser vcs-svn: make buffer_read_binary API more convenient vcs-svn: learn to maintain a sliding view of a file Makefile: list one vcs-svn/xdiff object or header per line Conflicts: Makefile vcs-svn/LICENSE
2011-05-26Merge branch 'db/svn-fe-code-purge' into svn-feLibravatar Jonathan Nieder12-636/+58
* db/svn-fe-code-purge: vcs-svn: drop obj_pool vcs-svn: drop treap vcs-svn: drop string_pool vcs-svn: pass paths through to fast-import Conflicts: vcs-svn/fast_export.c vcs-svn/fast_export.h vcs-svn/repo_tree.c vcs-svn/repo_tree.h vcs-svn/string_pool.c vcs-svn/svndump.c vcs-svn/trp.txt
2011-05-26Merge branch 'db/vcs-svn-incremental' into svn-feLibravatar Jonathan Nieder7-355/+247
This teaches svn-fe to incrementally import into an existing repository (at last!) at the expense of less convenient UI. Think of it as growing pains. This opens the door to many excellent things, and it would be a bad idea to discourage people from building on it for much longer. * db/vcs-svn-incremental: vcs-svn: avoid using ls command twice vcs-svn: use mark from previous import for parent commit vcs-svn: handle filenames with dq correctly vcs-svn: quote paths correctly for ls command vcs-svn: eliminate repo_tree structure vcs-svn: add a comment before each commit vcs-svn: save marks for imported commits vcs-svn: use higher mark numbers for blobs vcs-svn: set up channel to read fast-import cat-blob response Conflicts: t/t9010-svn-fe.sh vcs-svn/fast_export.c vcs-svn/fast_export.h vcs-svn/repo_tree.c vcs-svn/svndump.c
2011-04-13remove doubled words, e.g., s/to to/to/, and fix related typosLibravatar Jim Meyering1-1/+1
I found that some doubled words had snuck back into projects from which I'd already removed them, so now there's a "syntax-check" makefile rule in gnulib to help prevent recurrence. Running the command below spotted a few in git, too: git ls-files | xargs perl -0777 -n \ -e 'while (/\b(then?|[iao]n|i[fst]|but|f?or|at|and|[dt])\s+\1\b/gims)' \ -e '{$n=($` =~ tr/\n/\n/ + 1); ($v=$&)=~s/\n/\\n/g;' \ -e 'print "$ARGV:$n:$v\n"}' Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-30Merge branch 'svn-fe' of git://repo.or.cz/git/jrnLibravatar Junio C Hamano1-1/+2
* 'svn-fe' of git://repo.or.cz/git/jrn: tests: kill backgrounded processes more robustly vcs-svn: a void function shouldn't try to return something tests: make sure input to sed is newline terminated vcs-svn: add missing cast to printf argument
2011-03-29vcs-svn: a void function shouldn't try to return somethingLibravatar Michael Witten1-1/+2
As v1.7.4-rc0~184 (2010-10-04) and C99 ยง6.8.6.4.1 remind us, standard C does not permit returning an expression of type void, even for a tail call. Noticed with gcc -pedantic: vcs-svn/svndump.c: In function 'handle_node': vcs-svn/svndump.c:213:3: warning: ISO C forbids 'return' with expression, in function returning void [-pedantic] [jn: with simplified log message] Signed-off-by: Michael Witten <mfwitten@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-28vcs-svn: add missing cast to printf argumentLibravatar Jonathan Nieder1-1/+2
gcc -m32 correctly warns: vcs-svn/fast_export.c: In function 'fast_export_commit': vcs-svn/fast_export.c:54:2: warning: format '%llu' expects argument of type 'long long unsigned int', but argument 2 has type 'unsigned int' [-Wformat] Fix it. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-28vcs-svn: let deltas use data from preimageLibravatar Jonathan Nieder1-5/+23
The copyfrom_source instruction appends data from the preimage buffer to the end of output. Its arguments are a length and an offset relative to the beginning of the source view. With this change, the delta applier is able to reproduce all 5,636,613 blobs in the early history of the ASF repository. Tested with mkfifo backflow svn-fe <svn-asf-public-r0:940166 3<backflow | git fast-import --cat-blob-fd=3 3>backflow with svn-asf-public-r0:940166 produced by whatever version of Subversion the dumps in /dump/ on svn.apache.org use (presumably 1.6.something). Improved-by: Ramkumar Ramachandra <artagnon@gmail.com> Improved-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
2011-03-27vcs-svn: let deltas use data from postimageLibravatar Jonathan Nieder1-2/+26
The copyfrom_target instruction copies appends data that is already present in the current output view to the end of output. (The offset argument is relative to the beginning of output produced in the current window.) The region copied is allowed to run past the end of the existing output. To support that case, copy one character at a time rather than calling memcpy or memmove. This allows copyfrom_target to be used once to repeat a string many times. For example: COPYFROM_DATA 2 COPYFROM_OUTPUT 10, 0 DATA "ab" would produce the output "ababababababababababab". Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
2011-03-27vcs-svn: verify that deltas consume all inline dataLibravatar Jonathan Nieder1-0/+2
By constraining the format of deltas, we can more easily detect corruption and other breakage. Requiring deltas not to provide unconsumed data also opens the possibility of ignoring the declared amount of novel data and simply streaming the data as needed to fulfill copyfrom_data requests. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
2011-03-27vcs-svn: implement copyfrom_data delta instructionLibravatar Jonathan Nieder1-7/+108
The copyfrom_data instruction copies a few bytes verbatim from the novel text section of a window to the postimage. [jn: with memory leak fix from David] Improved-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
2011-03-27vcs-svn: read instructions from deltasLibravatar Jonathan Nieder1-2/+5
Buffer the instruction section upon encountering it for later interpretation. An alternative design would involve parsing the instructions at this point and buffering them in some processed form. Using the unprocessed form is simpler. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
2011-03-27vcs-svn: read inline data from deltasLibravatar Jonathan Nieder1-11/+35
Each window of an svndiff0-format delta includes a section for novel text to be copied to the postimage (in the order it appears in the window, possibly interspersed with other data). Slurp in this data when encountering it. It is not actually necessary to do so --- it would be just as easy to copy from delta to output as part of interpreting the relevant instructions --- but this way, the code that interprets svndiff0 instructions can proceed very quickly because it does not require I/O. Subversion's svndiff0 parser rejects deltas that do not consume all the novel text that was provided. Omit that check for now so we can test the new functionality right away, rather than waiting to learn instructions that consume data. Do check for truncated data sections. Subversion's parser rejects deltas that end in the middle of a declared novel-text section, so it should be safe for us to reject them, too. Improved-by: Ramkumar Ramachandra <artagnon@gmail.com> Improved-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
2011-03-27vcs-svn: read the preimage when applying deltasLibravatar Jonathan Nieder1-0/+2
The source view offset heading each svndiff0 window represents a number of bytes past the beginning of the preimage. Together with the source view length, it dictates to the delta applier what portion of the preimage instructions will refer to. Read that portion right away using the sliding window code. Maybe some day we will use mmap to read data more lazily. Subversion's implementation tolerates source view offsets pointing past the end of the preimage file but we do not, for simplicity. This does not teach the delta applier to read instructions or copy data from the source view. Deltas that could produce nonempty output will still be rejected. Improved-by: Ramkumar Ramachandra <artagnon@gmail.com> Improved-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
2011-03-27vcs-svn: parse svndiff0 window headerLibravatar Jonathan Nieder1-5/+87
Each window in a subversion delta (svndiff0-format file) starts with a window header, consisting of five integers with variable-length representation: source view offset source view length output length instructions length auxiliary data length Parse it. The result is not usable for deltas with nonempty postimage yet; in fact, this only adds support for deltas without any instructions or auxiliary data. This is a good place to stop, though, since that little support lets us add some simple passing tests concerning error handling to the test suite. Improved-by: Ramkumar Ramachandra <artagnon@gmail.com> Improved-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-27vcs-svn: skeleton of an svn delta parserLibravatar Jonathan Nieder2-0/+62
A delta in the subversion delta (svndiff0) format consists of the magic bytes SVN\0 followed by a sequence of windows of a certain well specified format (starting with five integers). Add an svndiff0_apply function and test-svn-fe -d commandline tool to parse such a delta in the special case of not including any windows. Later patches will add features to turn this into a fully functional delta applier for svn-fe to use to parse the streams produced by "svnrdump dump" and "svnadmin dump --deltas". The content of symlinks starts with the word "link " in Subversion's worldview, so we need to be able to prepend that text to input for the sake of delta application. So initialization of the input state of the delta preimage is left to the calling program, giving callers a chance to seed the buffer with text of their choice. Improved-by: Ramkumar Ramachandra <artagnon@gmail.com> Improved-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-27vcs-svn: make buffer_read_binary API more convenientLibravatar Jonathan Nieder2-4/+4
buffer_read_binary is a thin wrapper around fread, but its signature is wrong: - fread can fill an arbitrary in-memory buffer. buffer_read_binary is limited to buffers whose size is representable by a 32-bit integer. - The result from fread is the number of bytes actually read. buffer_read_binary only reports the number of bytes read by incrementing sb->len by that amount and returns void. Fix both: let buffer_read_binary accept a size_t instead of uint32_t for the number of bytes to read and as a convenience return the number of bytes actually read. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-27vcs-svn: learn to maintain a sliding view of a fileLibravatar Jonathan Nieder3-0/+96
Each section of a Subversion-format delta only requires examining (and keeping in random-access memory) a small portion of the preimage. At any moment, this portion starts at a certain file offset and has a well-defined length, and as the delta is applied, the portion advances from the beginning to the end of the preimage. Add a move_window function to keep track of this view into the preimage. You can use it like this: buffer_init(f, NULL); struct sliding_view window = SLIDING_VIEW_INIT(f); move_window(&window, 3, 7); /* (1) */ move_window(&window, 5, 5); /* (2) */ move_window(&window, 12, 2); /* (3) */ strbuf_release(&window.buf); buffer_deinit(f); The data structure is called sliding_view instead of _window to prevent confusion with svndiff0 Windows. In this example, (1) reads 10 bytes and discards the first 3; (2) discards the first 2, which are not needed any more; and (3) skips 2 bytes and reads 2 new bytes to work with. When move_window returns, the file position indicator is at position window->off + window->width and the data from positions window->off to the current file position are stored in window->buf. This function performs only sequential access from the input file and never seeks, so it can be safely used on pipes and sockets. On end-of-file, move_window silently reads less than the caller requested. On other errors, it prints a message and returns -1. Helped-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-27vcs-svn: add missing cast to printf argumentLibravatar Jonathan Nieder1-1/+2
gcc -m32 correctly warns: vcs-svn/fast_export.c: In function 'fast_export_commit': vcs-svn/fast_export.c:54:2: warning: format '%llu' expects argument of type 'long long unsigned int', but argument 2 has type 'unsigned int' [-Wformat] Fix it. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-26Merge branch 'svn-fe' of git://repo.or.cz/git/jrnLibravatar Junio C Hamano8-55/+39
* 'svn-fe' of git://repo.or.cz/git/jrn: vcs-svn: handle log message with embedded NUL vcs-svn: avoid unnecessary copying of log message and author vcs-svn: remove buffer_read_string vcs-svn: make reading of properties binary-safe
2011-03-26vcs-svn: avoid using ls command twiceLibravatar David Barr3-24/+6
Currently there are two functions to retrieve the mode and content at a path: const char *repo_read_path(const uint32_t *path); uint32_t repo_read_mode(const uint32_t *path) Replace them with a single function with two return values. This means we can use one round-trip to get the same information from fast-import that previously took two. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-26vcs-svn: handle log message with embedded NULLibravatar Jonathan Nieder5-13/+17
Pass the log message by strbuf instead of as a C-style string and use fwrite instead of printf to write it to fast-import so embedded '\0' bytes can be preserved. Currently "git log" doesn't show the embedded NULs but "git cat-file commit" can. While at it, stop including system headers from repo_tree.h. git source files need to include git-compat-util.h (or cache.h or builtin.h) sooner to ensure the appropriate feature test macros are defined. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-26vcs-svn: avoid unnecessary copying of log message and authorLibravatar Jonathan Nieder1-10/+10
Use strbuf_swap when storing the svn:log and svn:author properties, so pointers to rather than the contents of buffers get copied. The main effect should be to make the code a little easier to read. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-26vcs-svn: remove buffer_read_stringLibravatar Jonathan Nieder3-20/+4
All previous users of buffer_read_string have already been converted to use the more intuitive buffer_read_binary, so remove the old API to avoid some confusion. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-26vcs-svn: make reading of properties binary-safeLibravatar Jonathan Nieder1-14/+10
svn-fe errors out on revision 59151 of the ASF repository: fatal: invalid dump: unexpected end of file The proximate cause is a property with an embedded NUL character. Previously such anomalies were ignored but commit c9d1c8ba (2010-12-28) introduced a check strlen(val) == len to avoid reading uninitialized data when a property list ends early and unfortunately this test does not distinguish between "foo" followed by EOF and the string "foo\0bar\0baz". Fix it by using buffer_read_binary to read to a strbuf and checking the actual length read. Most consumers of properties still use C-style strings, so in practice an author or log message with embedded NULs will be truncated, but a least this way svn-fe won't error out (fixing the regression). Reported-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-22Merge branch 'svn-fe' of git://repo.or.cz/git/jrnLibravatar Junio C Hamano8-172/+265
* 'svn-fe' of git://repo.or.cz/git/jrn: vcs-svn: use strchr to find RFC822 delimiter vcs-svn: implement perfect hash for top-level keys vcs-svn: implement perfect hash for node-prop keys vcs-svn: use strbuf for author, UUID, and URL vcs-svn: use strbuf for revision log vcs-svn: improve reporting of input errors vcs-svn: make buffer_copy_bytes return length read vcs-svn: make buffer_skip_bytes return length read vcs-svn: improve support for reading large files vcs-svn: allow input errors to be detected promptly vcs-svn: simplify repo_modify_path and repo_copy vcs-svn: handle_node: use repo_read_path vcs-svn: introduce repo_read_path to check the content at a path
2011-03-22Merge branch 'db/length-as-hash' into svn-feLibravatar Jonathan Nieder1-69/+105
* db/length-as-hash: vcs-svn: use strchr to find RFC822 delimiter vcs-svn: implement perfect hash for top-level keys vcs-svn: implement perfect hash for node-prop keys Conflicts: vcs-svn/svndump.c
2011-03-22vcs-svn: drop obj_poolLibravatar David Barr1-61/+0
This reverts commit 4709455db3891f6cad9a96a574296b4926f70cbe (Add memory pool library, 2010-08-09). svn-fe uses strbufs to avoid memory allocation overhead nowadays. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-22vcs-svn: drop treapLibravatar David Barr3-349/+0
This reverts commit 951f316470acc7c785c460a4e40735b22822349f (Add treap implementation, 2010-08-09). The string_pool was trp.h's last user. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-22vcs-svn: drop string_poolLibravatar David Barr3-168/+0
This reverts commit 1d73b52f5ba4184de6acf474f14668001304a10c (Add string-specific memory pool, 2010-08-09). Now that svn-fe does not need to maintain a growing collection of strings (paths) over a long period of time, the string_pool is not needed. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-22vcs-svn: pass paths through to fast-importLibravatar David Barr5-62/+62
Now that there is no internal representation of the repo, it is not necessary to tokenise paths. Use strbuf instead and bypass string_pool. This means svn-fe can handle arbitrarily long paths (as long as a strbuf can fit them), with arbitrarily many path components. While at it, since we now treat paths in their entirety, only quote when necessary. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-22Merge branch 'db/strbufs-for-metadata' into db/svn-fe-code-purgeLibravatar Jonathan Nieder4-48/+47
* db/strbufs-for-metadata: vcs-svn: use strbuf for author, UUID, and URL vcs-svn: use strbuf for revision log Conflicts: vcs-svn/fast_export.c vcs-svn/fast_export.h vcs-svn/repo_tree.c vcs-svn/svndump.c
2011-03-22Merge branch 'db/length-as-hash' (early part) into db/svn-fe-code-purgeLibravatar Jonathan Nieder5-93/+161
* 'db/length-as-hash' (early part): vcs-svn: implement perfect hash for top-level keys vcs-svn: implement perfect hash for node-prop keys vcs-svn: improve reporting of input errors vcs-svn: make buffer_copy_bytes return length read vcs-svn: make buffer_skip_bytes return length read vcs-svn: improve support for reading large files Conflicts: vcs-svn/fast_export.c vcs-svn/svndump.c
2011-03-22vcs-svn: use strchr to find RFC822 delimiterLibravatar David Barr1-2/+5
This is a small optimisation (4% reduction in user time) but is the largest artifact within the parsing portion of svndump.c Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-22vcs-svn: implement perfect hash for top-level keysLibravatar David Barr1-50/+59
Instead of interning property names and comparing their string_pool keys, look them up in a table by string length, which should be about as fast. Another small step towards removing dependence on string_pool altogether. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-22vcs-svn: implement perfect hash for node-prop keysLibravatar David Barr1-19/+43
Instead of interning property names and comparing their string_pool keys, look them up in a table by string length, which should be about as fast. This is a small step towards removing dependence on string_pool. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-22vcs-svn: use strbuf for author, UUID, and URLLibravatar David Barr5-30/+43
Use strbufs and strings instead of interned strings for values of rev, dump, and node fields that happen to be strings. After this change, the only remaining string_pool use is for paths in the repo_tree API and internals. Functional change: treat an empty author, UUID, or URL as none at all. So for example, in repos where the first revision has an empty svn:author property, the first rev will be treated as by "nobody" rather than by a person with empty name and email address created by prepending an @ sign to the repository UUID. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-22vcs-svn: use strbuf for revision logLibravatar David Barr1-20/+8
obj_pool is overkill for this application: all that is needed is a buffer that can resize from rev to rev to accomodate differently-sized strings. In the spirit of commit deadcef4 (2010-11-06), use a strbuf instead. This is a small step towards removing dependence on obj_pool.h. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-22vcs-svn: improve reporting of input errorsLibravatar Jonathan Nieder2-5/+37
Catch input errors and exit early enough to print a reasonable diagnosis based on errno. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-22vcs-svn: make buffer_copy_bytes return length readLibravatar Jonathan Nieder2-10/+11
Currently buffer_copy_bytes does not report to its caller whether it encountered an early end of file. Add a return value representing the number of bytes read (but not the number of bytes copied). This way all three unusual conditions can be distinguished: input error with buffer_ferror, output error with ferror(outfile), early end of input by checking the return value. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-22vcs-svn: make buffer_skip_bytes return length readLibravatar Jonathan Nieder3-8/+10
Currently there is no way to detect when input ended if it ended early during buffer_skip_bytes. Tell the calling program how many bytes were actually skipped for easier debugging. Existing callers will still ignore early EOF. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-22vcs-svn: improve support for reading large filesLibravatar Jonathan Nieder2-4/+4
Move from uint32_t to off_t as the fundamental unit of length used by the line_buffer library. Performance would get worse if anything but I think it's worth it for support of deltas that need to skip large pieces (> 4 GiB). Exception: buffer_read_string still takes a uint32_t, since it keeps its result in an in-core obj_pool. Callers still have to be updated to take advantage of this. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-16vcs-svn: remove spurious semicolonsLibravatar Jonathan Nieder2-2/+2
trp_gen is not a statement or function call, so it should not be followed with a semicolon. Noticed by gcc -pedantic. vcs-svn/repo_tree.c:41:81: warning: ISO C does not allow extra ';' outside of a function [-pedantic] Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-07vcs-svn: use mark from previous import for parent commitLibravatar David Barr1-1/+1
With this patch, overlapping incremental imports work. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-07vcs-svn: handle filenames with dq correctlyLibravatar Jonathan Nieder1-10/+9
Quote paths passed to fast-import so filenames with double quotes are not misinterpreted. One might imagine this could help with filenames with newlines, too, but svn does not allow those. Helped-by: David Barr <daivd.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-07vcs-svn: quote paths correctly for ls commandLibravatar David Barr3-1/+13
This bug was found while importing rev 601865 of ASF. [jn: with test] Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-07vcs-svn: eliminate repo_tree structureLibravatar Jonathan Nieder7-340/+184
Rely on fast-import for information about previous revs. This requires always setting up backward flow of information, even for v2 dumps. On the plus side, it simplifies the code by quite a bit and opens the door to further simplifications. [db: adjusted to support final version of the cat-blob patch] [jn: avoiding hard-coding git's name for the empty tree for portability to other backends] Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-07vcs-svn: add a comment before each commitLibravatar Jonathan Nieder3-7/+28
Current svn-fe produces output like this: blob mark :7382321 data 5 hello blob mark :7382322 data 5 Hello commit mark :3 [...] M 100644 :7382321 hello.c M 100644 :7382322 hello2.c This means svn-fe has to keep track of the paths modified in each commit and the corresponding marks, instead of dealing with each file as it arrives in input and then forgetting about it. A better strategy would be to use inline blobs: commit mark :3 [...] M 100644 inline hello.c data 5 hello [...] As a first step towards that, teach svn-fe to notice when the collection of blobs for each commit starts and write a comment ("# commit 3.") there. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-07vcs-svn: save marks for imported commitsLibravatar Jonathan Nieder1-0/+1
This way, a person can use svnadmin dump $path | svn-fe | git fast-import --relative-marks --export-marks=svn-revs to get a list of what commit corresponds to each svn revision (plus some irrelevant blob names) in .git/info/fast-import/svn-revs. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>