summaryrefslogtreecommitdiff
path: root/vcs-svn
AgeCommit message (Collapse)AuthorFilesLines
2012-07-13Merge branch 'jn/vcs-svn'Libravatar Junio C Hamano7-40/+31
vcs-svn updates to clean-up compilation, lift 32-bit limitations, etc. * jn/vcs-svn: vcs-svn: allow 64-bit Prop-Content-Length vcs-svn: suppress a signed/unsigned comparison warning vcs-svn: suppress a signed/unsigned comparison warning vcs-svn: suppress signed/unsigned comparison warnings vcs-svn: use strstr instead of memmem vcs-svn: use constcmp instead of prefixcmp vcs-svn: simplify cleanup in apply_one_window vcs-svn: avoid self-assignment in dummy initialization of pre_off vcs-svn: drop no-op reset methods vcs-svn: suppress -Wtype-limits warning vcs-svn: allow import of > 4GiB files vcs-svn: rename check_overflow and its arguments for clarity
2012-07-05vcs-svn: allow 64-bit Prop-Content-LengthLibravatar Jonathan Nieder1-15/+18
Currently the vcs-svn/ library only pays attention to the presence of the Prop-Content-Length field and doesn't care about its value, but some day we might care about the value. Parse it as an off_t instead of arbitrarily limiting to 32 bits for intuitiveness. So now you can import from a dump with more than 2 GiB of properties for a node. In practice that isn't likely to happen often, and this is mostly meant as a cleanup. Based-on-patch-by: David Barr <davidbarr@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2012-07-05vcs-svn: suppress a signed/unsigned comparison warningLibravatar Jonathan Nieder1-2/+3
All callers pass a nonnegative delta_len, so the code is already safe. Add an assertion to ensure that remains so and add a cast to keep clang and gcc -Wsign-compare from worrying. Reported-by: David Barr <davidbarr@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2012-07-05vcs-svn: suppress a signed/unsigned comparison warningLibravatar David Barr1-1/+1
The preceding code checks that view->max_off is nonnegative and (off + width) fits in an off_t, so this code is already safe. Signed-off-by: David Barr <davidbarr@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2012-07-05vcs-svn: suppress signed/unsigned comparison warningsLibravatar David Barr1-2/+2
These are already safe because both sides of the comparison are nonnegative. This would normally not be important because Git is not -Wsign-compare clean anyway, but we like to keep the vcs-svn/ lib to a higher standard for convenience using it in other projects. Signed-off-by: David Barr <davidbarr@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2012-07-05vcs-svn: use strstr instead of memmemLibravatar David Barr1-1/+1
memmem is a GNU extension. Avoiding it makes the code clearer and makes it easier for projects that don't share git's compat/ code, such as the standalone svn-dump-fast-export project, to reuse the vcs-svn/ library. Signed-off-by: David Barr <davidbarr@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2012-07-05vcs-svn: use constcmp instead of prefixcmpLibravatar David Barr1-1/+1
Since the length of t is already known, we can simplify a little by using memcmp() instead of strncmp() to carry out a prefix comparison. All nearby code already does this. Noticed in the standalone svn-dump-fast-export project which has not needed to implement prefixcmp() yet. Signed-off-by: David Barr <davidbarr@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2012-07-05vcs-svn: simplify cleanup in apply_one_windowLibravatar David Barr1-4/+4
Currently the cleanup code looks like this: free resources return 0; error_out: free resources return -1; Avoid duplicating the "free resources" part by keeping the return value in a variable and sharing code between the success and exceptional case: ret = 0; out: free resources return ret; Noticed in the svn-dump-fast-export project, where using the error() macro in void context produces a warning. Signed-off-by: David Barr <davidbarr@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2012-07-05vcs-svn: avoid self-assignment in dummy initialization of pre_offLibravatar David Barr1-1/+1
Without this change, clang complains: vcs-svn/svndiff.c:298:3: warning: Assigned value is garbage or undefined off_t pre_off = pre_off; /* stupid GCC... */ ^ ~~~~~~~ This code uses an old and common idiom for suppressing an "uninitialized variable" warning, and clang is wrong to warn about it. The idiom tells the compiler to leave the variable uninitialized, which saves a few bytes of code size, and, more importantly, allows valgrind to check at runtime that the variable is properly initialized by the time it is used. But MSVC and clang do not know that idiom, so let's avoid it in vcs-svn/ code. Initialize pre_off to -1, a recognizably meaningless value, to allow future code changes that cause pre_off to be used before it is initialized to be caught early. Signed-off-by: David Barr <davidbarr@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2012-07-05vcs-svn: drop no-op reset methodsLibravatar David Barr5-13/+0
Since v1.7.5~42^2~6 (vcs-svn: remove buffer_read_string) buffer_reset() does nothing thus fast_export_reset() also. Signed-off-by: David Barr <davidbarr@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2012-04-30remove superfluous newlines in error messagesLibravatar Pete Wyckoff1-2/+2
The error handling routines add a newline. Remove the duplicate ones in error messages. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02vcs-svn: suppress a -Wtype-limits warningLibravatar Jonathan Nieder1-3/+3
On 32-bit architectures with 64-bit file offsets, gcc 4.3 and earlier produce the following warning: CC vcs-svn/sliding_window.o vcs-svn/sliding_window.c: In function `check_overflow': vcs-svn/sliding_window.c:36: warning: comparison is always false \ due to limited range of data type The warning appears even when gcc is run without any warning flags (this is gcc bug 12963). In later versions the same warning can be reproduced with -Wtype-limits, which is implied by -Wextra. On 64-bit architectures it really is possible for a size_t not to be representable as an off_t so the check this is warning about is not actually redundant. But even false positives are distracting. Avoid the warning by making the "len" argument to check_overflow a uintmax_t; no functional change intended. Reported-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02vcs-svn: allow import of > 4GiB filesLibravatar Jonathan Nieder3-14/+26
There is no reason in principle that an svn-format dump would not be able to represent a file whose length does not fit in a 32-bit integer. Use off_t consistently to represent file lengths (in place of using uint32_t in some contexts) so we can handle that. Most svn-fe code is already ready to do that without this patch and passes values of type off_t around. The type mismatch from stragglers was noticed with gcc -Wtype-limits. While at it, tighten the parsing of the Text-content-length field to make sure it is a number and does not overflow, and tighten other overflow checks as that value is passed around and manipulated. Inspired-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02vcs-svn: rename check_overflow arguments for clarityLibravatar Ramsay Jones1-7/+7
Code using the argument names a and b just doesn't look right (not sure why!). Use more explicit names "offset" and "len" to make their type and meaning clearer. Also rename check_overflow() to check_offset_overflow() to clarify that we are making sure that "len" bytes beyond "offset" still fits the type to represent an offset. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02vcs-svn: suppress -Wtype-limits warningLibravatar Jonathan Nieder1-3/+3
On 32-bit architectures with 64-bit file offsets, gcc 4.3 and earlier produce the following warning: CC vcs-svn/sliding_window.o vcs-svn/sliding_window.c: In function `check_overflow': vcs-svn/sliding_window.c:36: warning: comparison is always false \ due to limited range of data type The warning appears even when gcc is run without any warning flags (PR12963). In later versions it can be reproduced with -Wtype-limits, which is implied by -Wextra. On 64-bit architectures it really is possible for a size_t not to be representable as an off_t so the check being warned about is not actually redundant. But even false positives are distracting. Avoid the warning by making the "len" argument to check_overflow a uintmax_t; no functional change intended. Reported-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2012-02-02vcs-svn: allow import of > 4GiB filesLibravatar Jonathan Nieder3-14/+26
There is no reason in principle that an svn-format dump would not be able to represent a file whose length does not fit in a 32-bit integer. Use off_t consistently (instead of uint32_t) to represent file lengths so we can handle that. Most of our code is already ready to do that without this patch and already passes values of type off_t around. The type mismatch due to stragglers was noticed with gcc -Wtype-limits. Inspired-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2012-02-02vcs-svn: rename check_overflow and its arguments for clarityLibravatar Ramsay Jones1-7/+7
The canonical interpretation of a range a,b is as an interval [a,b), not [a,a+b), so this function taking argument names a and b feels unnatural. Use more explicit names "offset" and "len" to make the arguments' type and function clearer. While at it, rename the function to convey that we are making sure the sum of this offset and length do not overflow an off_t, not a size_t. [jn: split out from a patch from Ramsay Jones, then improved with advice from Thomas Rast, Dmitry Ivankov, and David Barr] Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Improved-by: Dmitry Ivankov <divanorama@gmail.com>
2012-01-27vcs-svn/svndiff.c: squelch false "unused" warning from gccLibravatar Junio C Hamano1-1/+1
Curiously, pre_len given to read_length() does not trigger the same warning even though the code structure is the same. Most likely this is because read_offset() is used only once and inlining it will make gcc realize that it has a chance to do more flow analysis. Alas, the analysis is flawed, so it does not help X-<. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-27Merge branch 'svn-fe' of git://repo.or.cz/git/jrn into jn/svn-feLibravatar Junio C Hamano18-938/+789
This simplifies svn-fe a great deal and fulfills a longstanding wish: support for dumps with deltas in them, and incremental imports. The cost is that commandline usage of the svn-fe tool becomes a little more complicated since it no longer keeps state itself but instead reads blobs back from fast-import in order to copy them between revisions and apply deltas to them. Also removes a couple of custom data structures and replaces them with strbufs like other parts of Git. * 'svn-fe' of git://repo.or.cz/git/jrn: (32 commits) vcs-svn: reset first_commit_done in fast_export_init vcs-svn: do not initialize report_buffer twice vcs-svn: avoid hangs from corrupt deltas vcs-svn: guard against overflow when computing preimage length vcs-svn: cap number of bytes read from sliding view test-svn-fe: split off "test-svn-fe -d" into a separate function vcs-svn: implement text-delta handling vcs-svn: let deltas use data from preimage vcs-svn: let deltas use data from postimage vcs-svn: verify that deltas consume all inline data vcs-svn: implement copyfrom_data delta instruction vcs-svn: read instructions from deltas vcs-svn: read inline data from deltas vcs-svn: read the preimage when applying deltas vcs-svn: parse svndiff0 window header vcs-svn: skeleton of an svn delta parser vcs-svn: make buffer_read_binary API more convenient vcs-svn: learn to maintain a sliding view of a file Makefile: list one vcs-svn/xdiff object or header per line vcs-svn: avoid using ls command twice ... Conflicts: Makefile contrib/svn-fe/svn-fe.txt
2011-12-21Fix a bitwise negation assignment issue spotted by Sun StudioLibravatar Ævar Arnfjörð Bjarmason2-3/+3
Change direct and indirect assignments of the bitwise negation of 0 to uint32_t variables to have a "U" suffix. I.e. ~0U instead of ~0. This eliminates warnings under Sun Studio 12 Update 1: "vcs-svn/string_pool.c", line 11: warning: initializer will be sign-extended: -1 (E_INIT_SIGN_EXTEND) "vcs-svn/string_pool.c", line 81: warning: initializer will be sign-extended: -1 (E_INIT_SIGN_EXTEND) "vcs-svn/repo_tree.c", line 112: warning: initializer will be sign-extended: -1 (E_INIT_SIGN_EXTEND) "vcs-svn/repo_tree.c", line 112: warning: initializer will be sign-extended: -1 (E_INIT_SIGN_EXTEND) "test-treap.c", line 34: warning: initializer will be sign-extended: -1 (E_INIT_SIGN_EXTEND) The semantics are still the same as demonstrated by this program: $ cat test.c && make test && ./test #include <stdio.h> #include <stdint.h> int main(void) { uint32_t foo = ~0; uint32_t bar = ~0U; printf("foo = <%u> bar = <%u>\n", foo, bar); return 0; } cc test.c -o test "test.c", line 5: warning: initializer will be sign-extended: -1 foo = <4294967295> bar = <4294967295> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-23vcs-svn: reset first_commit_done in fast_export_initLibravatar Dmitry Ivankov1-0/+1
first_commit_done has zero as a default value, but it is not reset back to zero in fast_export_init. Reset it back to zero so that each export will have proper initial state. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-06-21vcs-svn: do not initialize report_buffer twiceLibravatar Dmitry Ivankov1-12/+0
When importing from a dump with deltas, first fast_export_init calls buffer_fdinit, and then init_report_buffer calls fdopen once again when processing the first delta. The second initialization is redundant and leaks a FILE *. Remove the redundant on-demand initialization to fix this. Initializing directly in fast_export_init is simpler and lets the caller pass an int specifying which fd to use instead of hard-coding REPORT_FILENO. Signed-off-by: Dmitry Ivankov <divanorama@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-06-15vcs-svn: avoid hangs from corrupt deltasLibravatar Jonathan Nieder1-6/+9
A corrupt Subversion-format delta can request reads past the end of the preimage. Set sliding_view::max_off so such corruption is caught when it appears rather than blocking in an impossible-to-fulfill read() when input is coming from a socket or pipe. Inspired-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-06-15vcs-svn: guard against overflow when computing preimage lengthLibravatar Jonathan Nieder1-1/+14
Signed integer overflow produces undefined behavior in C and off_t is a signed type. For predictable behavior, add some checks to protect in advance against overflow. On 32-bit systems ftell as called by buffer_tmpfile_prepare_to_read is likely to fail with EOVERFLOW when reading the corresponding postimage, and this patch does not fix that. So it's more of a futureproofing measure than a complete fix. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-06-15Merge branch 'db/delta-applier' into db/text-deltaLibravatar Jonathan Nieder3-2/+5
* db/delta-applier: vcs-svn: cap number of bytes read from sliding view test-svn-fe: split off "test-svn-fe -d" into a separate function
2011-06-15vcs-svn: cap number of bytes read from sliding viewLibravatar Jonathan Nieder2-1/+4
Introduce a "max_off" field in struct sliding_view, roughly representing a maximum number of bytes that can be read from "file". If it is set to a nonnegative integer, a call to move_window() attempting to put the right endpoint beyond that offset will return an error instead. The idea is to use this when applying Subversion-format deltas to prevent reads past the end of the preimage (which has known length). Without such a check, corrupt deltas would cause svn-fe to block indefinitely when data in the input pipe is exhausted. Inspired-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-05-26vcs-svn: implement text-delta handlingLibravatar David Barr3-5/+120
Handle input in Subversion's dumpfile format, version 3. This is the format produced by "svnrdump dump" and "svnadmin dump --deltas", and the main difference between v3 dumpfiles and the dumpfiles already handled is that these can include nodes whose properties and text are expressed relative to some other node. To handle such nodes, we find which node the text and properties are based on, handle its property changes, use the cat-blob command to request the basis blob from the fast-import backend, use the svndiff0_apply() helper to apply the text delta on the fly, writing output to a temporary file, and then measure that postimage file's length and write its content to the fast-import stream. The temporary postimage file is shared between delta-using nodes to avoid some file system overhead. The svn-fe interface needs to be more complicated to accomodate the backward flow of information from the fast-import backend to svn-fe. The backflow fd is not needed when parsing streams without deltas, though, so existing scripts using svn-fe on v2 dumps should continue to work. NEEDSWORK: generalize interface so caller sets the backflow fd, close temporary file before exiting Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-05-26Merge branch 'db/delta-applier' into db/text-deltaLibravatar Jonathan Nieder7-4/+418
* db/delta-applier: vcs-svn: let deltas use data from preimage vcs-svn: let deltas use data from postimage vcs-svn: verify that deltas consume all inline data vcs-svn: implement copyfrom_data delta instruction vcs-svn: read instructions from deltas vcs-svn: read inline data from deltas vcs-svn: read the preimage when applying deltas vcs-svn: parse svndiff0 window header vcs-svn: skeleton of an svn delta parser vcs-svn: make buffer_read_binary API more convenient vcs-svn: learn to maintain a sliding view of a file Makefile: list one vcs-svn/xdiff object or header per line Conflicts: Makefile vcs-svn/LICENSE
2011-05-26Merge branch 'db/svn-fe-code-purge' into svn-feLibravatar Jonathan Nieder12-636/+58
* db/svn-fe-code-purge: vcs-svn: drop obj_pool vcs-svn: drop treap vcs-svn: drop string_pool vcs-svn: pass paths through to fast-import Conflicts: vcs-svn/fast_export.c vcs-svn/fast_export.h vcs-svn/repo_tree.c vcs-svn/repo_tree.h vcs-svn/string_pool.c vcs-svn/svndump.c vcs-svn/trp.txt
2011-05-26Merge branch 'db/vcs-svn-incremental' into svn-feLibravatar Jonathan Nieder7-355/+247
This teaches svn-fe to incrementally import into an existing repository (at last!) at the expense of less convenient UI. Think of it as growing pains. This opens the door to many excellent things, and it would be a bad idea to discourage people from building on it for much longer. * db/vcs-svn-incremental: vcs-svn: avoid using ls command twice vcs-svn: use mark from previous import for parent commit vcs-svn: handle filenames with dq correctly vcs-svn: quote paths correctly for ls command vcs-svn: eliminate repo_tree structure vcs-svn: add a comment before each commit vcs-svn: save marks for imported commits vcs-svn: use higher mark numbers for blobs vcs-svn: set up channel to read fast-import cat-blob response Conflicts: t/t9010-svn-fe.sh vcs-svn/fast_export.c vcs-svn/fast_export.h vcs-svn/repo_tree.c vcs-svn/svndump.c
2011-04-27Merge branch 'rj/sparse'Libravatar Junio C Hamano1-0/+1
* rj/sparse: sparse: Fix some "symbol not declared" warnings sparse: Fix errors due to missing target-specific variables sparse: Fix an "symbol 'merge_file' not decared" warning sparse: Fix an "symbol 'format_subject' not declared" warning sparse: Fix some "Using plain integer as NULL pointer" warnings sparse: Fix an "symbol 'cmd_index_pack' not declared" warning Makefile: Use cgcc rather than sparse in the check target
2011-04-22sparse: Fix some "symbol not declared" warningsLibravatar Ramsay Jones1-0/+1
In particular, sparse issues the "symbol 'a_symbol' was not declared. Should it be static?" warnings for the following symbols: attr.c:468:12: 'git_etc_gitattributes' attr.c:476:5: 'git_attr_system' vcs-svn/svndump.c:282:6: 'svndump_read' vcs-svn/svndump.c:417:5: 'svndump_init' vcs-svn/svndump.c:432:6: 'svndump_deinit' vcs-svn/svndump.c:445:6: 'svndump_reset' The symbols in attr.c only require file scope, so we add the static modifier to their declaration. The symbols in vcs-svn/svndump.c are external symbols, and they already have extern declarations in the "svndump.h" header file, so we simply include the header in svndump.c. Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-13remove doubled words, e.g., s/to to/to/, and fix related typosLibravatar Jim Meyering1-1/+1
I found that some doubled words had snuck back into projects from which I'd already removed them, so now there's a "syntax-check" makefile rule in gnulib to help prevent recurrence. Running the command below spotted a few in git, too: git ls-files | xargs perl -0777 -n \ -e 'while (/\b(then?|[iao]n|i[fst]|but|f?or|at|and|[dt])\s+\1\b/gims)' \ -e '{$n=($` =~ tr/\n/\n/ + 1); ($v=$&)=~s/\n/\\n/g;' \ -e 'print "$ARGV:$n:$v\n"}' Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-30Merge branch 'svn-fe' of git://repo.or.cz/git/jrnLibravatar Junio C Hamano1-1/+2
* 'svn-fe' of git://repo.or.cz/git/jrn: tests: kill backgrounded processes more robustly vcs-svn: a void function shouldn't try to return something tests: make sure input to sed is newline terminated vcs-svn: add missing cast to printf argument
2011-03-29vcs-svn: a void function shouldn't try to return somethingLibravatar Michael Witten1-1/+2
As v1.7.4-rc0~184 (2010-10-04) and C99 §6.8.6.4.1 remind us, standard C does not permit returning an expression of type void, even for a tail call. Noticed with gcc -pedantic: vcs-svn/svndump.c: In function 'handle_node': vcs-svn/svndump.c:213:3: warning: ISO C forbids 'return' with expression, in function returning void [-pedantic] [jn: with simplified log message] Signed-off-by: Michael Witten <mfwitten@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-28vcs-svn: add missing cast to printf argumentLibravatar Jonathan Nieder1-1/+2
gcc -m32 correctly warns: vcs-svn/fast_export.c: In function 'fast_export_commit': vcs-svn/fast_export.c:54:2: warning: format '%llu' expects argument of type 'long long unsigned int', but argument 2 has type 'unsigned int' [-Wformat] Fix it. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-28vcs-svn: let deltas use data from preimageLibravatar Jonathan Nieder1-5/+23
The copyfrom_source instruction appends data from the preimage buffer to the end of output. Its arguments are a length and an offset relative to the beginning of the source view. With this change, the delta applier is able to reproduce all 5,636,613 blobs in the early history of the ASF repository. Tested with mkfifo backflow svn-fe <svn-asf-public-r0:940166 3<backflow | git fast-import --cat-blob-fd=3 3>backflow with svn-asf-public-r0:940166 produced by whatever version of Subversion the dumps in /dump/ on svn.apache.org use (presumably 1.6.something). Improved-by: Ramkumar Ramachandra <artagnon@gmail.com> Improved-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
2011-03-27vcs-svn: let deltas use data from postimageLibravatar Jonathan Nieder1-2/+26
The copyfrom_target instruction copies appends data that is already present in the current output view to the end of output. (The offset argument is relative to the beginning of output produced in the current window.) The region copied is allowed to run past the end of the existing output. To support that case, copy one character at a time rather than calling memcpy or memmove. This allows copyfrom_target to be used once to repeat a string many times. For example: COPYFROM_DATA 2 COPYFROM_OUTPUT 10, 0 DATA "ab" would produce the output "ababababababababababab". Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
2011-03-27vcs-svn: verify that deltas consume all inline dataLibravatar Jonathan Nieder1-0/+2
By constraining the format of deltas, we can more easily detect corruption and other breakage. Requiring deltas not to provide unconsumed data also opens the possibility of ignoring the declared amount of novel data and simply streaming the data as needed to fulfill copyfrom_data requests. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
2011-03-27vcs-svn: implement copyfrom_data delta instructionLibravatar Jonathan Nieder1-7/+108
The copyfrom_data instruction copies a few bytes verbatim from the novel text section of a window to the postimage. [jn: with memory leak fix from David] Improved-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
2011-03-27vcs-svn: read instructions from deltasLibravatar Jonathan Nieder1-2/+5
Buffer the instruction section upon encountering it for later interpretation. An alternative design would involve parsing the instructions at this point and buffering them in some processed form. Using the unprocessed form is simpler. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
2011-03-27vcs-svn: read inline data from deltasLibravatar Jonathan Nieder1-11/+35
Each window of an svndiff0-format delta includes a section for novel text to be copied to the postimage (in the order it appears in the window, possibly interspersed with other data). Slurp in this data when encountering it. It is not actually necessary to do so --- it would be just as easy to copy from delta to output as part of interpreting the relevant instructions --- but this way, the code that interprets svndiff0 instructions can proceed very quickly because it does not require I/O. Subversion's svndiff0 parser rejects deltas that do not consume all the novel text that was provided. Omit that check for now so we can test the new functionality right away, rather than waiting to learn instructions that consume data. Do check for truncated data sections. Subversion's parser rejects deltas that end in the middle of a declared novel-text section, so it should be safe for us to reject them, too. Improved-by: Ramkumar Ramachandra <artagnon@gmail.com> Improved-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
2011-03-27vcs-svn: read the preimage when applying deltasLibravatar Jonathan Nieder1-0/+2
The source view offset heading each svndiff0 window represents a number of bytes past the beginning of the preimage. Together with the source view length, it dictates to the delta applier what portion of the preimage instructions will refer to. Read that portion right away using the sliding window code. Maybe some day we will use mmap to read data more lazily. Subversion's implementation tolerates source view offsets pointing past the end of the preimage file but we do not, for simplicity. This does not teach the delta applier to read instructions or copy data from the source view. Deltas that could produce nonempty output will still be rejected. Improved-by: Ramkumar Ramachandra <artagnon@gmail.com> Improved-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
2011-03-27vcs-svn: parse svndiff0 window headerLibravatar Jonathan Nieder1-5/+87
Each window in a subversion delta (svndiff0-format file) starts with a window header, consisting of five integers with variable-length representation: source view offset source view length output length instructions length auxiliary data length Parse it. The result is not usable for deltas with nonempty postimage yet; in fact, this only adds support for deltas without any instructions or auxiliary data. This is a good place to stop, though, since that little support lets us add some simple passing tests concerning error handling to the test suite. Improved-by: Ramkumar Ramachandra <artagnon@gmail.com> Improved-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-27vcs-svn: skeleton of an svn delta parserLibravatar Jonathan Nieder2-0/+62
A delta in the subversion delta (svndiff0) format consists of the magic bytes SVN\0 followed by a sequence of windows of a certain well specified format (starting with five integers). Add an svndiff0_apply function and test-svn-fe -d commandline tool to parse such a delta in the special case of not including any windows. Later patches will add features to turn this into a fully functional delta applier for svn-fe to use to parse the streams produced by "svnrdump dump" and "svnadmin dump --deltas". The content of symlinks starts with the word "link " in Subversion's worldview, so we need to be able to prepend that text to input for the sake of delta application. So initialization of the input state of the delta preimage is left to the calling program, giving callers a chance to seed the buffer with text of their choice. Improved-by: Ramkumar Ramachandra <artagnon@gmail.com> Improved-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-27vcs-svn: make buffer_read_binary API more convenientLibravatar Jonathan Nieder2-4/+4
buffer_read_binary is a thin wrapper around fread, but its signature is wrong: - fread can fill an arbitrary in-memory buffer. buffer_read_binary is limited to buffers whose size is representable by a 32-bit integer. - The result from fread is the number of bytes actually read. buffer_read_binary only reports the number of bytes read by incrementing sb->len by that amount and returns void. Fix both: let buffer_read_binary accept a size_t instead of uint32_t for the number of bytes to read and as a convenience return the number of bytes actually read. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-27vcs-svn: learn to maintain a sliding view of a fileLibravatar Jonathan Nieder3-0/+96
Each section of a Subversion-format delta only requires examining (and keeping in random-access memory) a small portion of the preimage. At any moment, this portion starts at a certain file offset and has a well-defined length, and as the delta is applied, the portion advances from the beginning to the end of the preimage. Add a move_window function to keep track of this view into the preimage. You can use it like this: buffer_init(f, NULL); struct sliding_view window = SLIDING_VIEW_INIT(f); move_window(&window, 3, 7); /* (1) */ move_window(&window, 5, 5); /* (2) */ move_window(&window, 12, 2); /* (3) */ strbuf_release(&window.buf); buffer_deinit(f); The data structure is called sliding_view instead of _window to prevent confusion with svndiff0 Windows. In this example, (1) reads 10 bytes and discards the first 3; (2) discards the first 2, which are not needed any more; and (3) skips 2 bytes and reads 2 new bytes to work with. When move_window returns, the file position indicator is at position window->off + window->width and the data from positions window->off to the current file position are stored in window->buf. This function performs only sequential access from the input file and never seeks, so it can be safely used on pipes and sockets. On end-of-file, move_window silently reads less than the caller requested. On other errors, it prints a message and returns -1. Helped-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-27vcs-svn: add missing cast to printf argumentLibravatar Jonathan Nieder1-1/+2
gcc -m32 correctly warns: vcs-svn/fast_export.c: In function 'fast_export_commit': vcs-svn/fast_export.c:54:2: warning: format '%llu' expects argument of type 'long long unsigned int', but argument 2 has type 'unsigned int' [-Wformat] Fix it. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2011-03-26Merge branch 'svn-fe' of git://repo.or.cz/git/jrnLibravatar Junio C Hamano8-55/+39
* 'svn-fe' of git://repo.or.cz/git/jrn: vcs-svn: handle log message with embedded NUL vcs-svn: avoid unnecessary copying of log message and author vcs-svn: remove buffer_read_string vcs-svn: make reading of properties binary-safe
2011-03-26vcs-svn: avoid using ls command twiceLibravatar David Barr3-24/+6
Currently there are two functions to retrieve the mode and content at a path: const char *repo_read_path(const uint32_t *path); uint32_t repo_read_mode(const uint32_t *path) Replace them with a single function with two return values. This means we can use one round-trip to get the same information from fast-import that previously took two. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>