Age | Commit message (Collapse) | Author | Files | Lines |
|
Shifting 'unsigned char' or 'unsigned short' left can result in sign
extension errors, since the C integer promotion rules means that the
unsigned char/short will get implicitly promoted to a signed 'int' due to
the shift (or due to other operations).
This normally doesn't matter, but if you shift things up sufficiently, it
will now set the sign bit in 'int', and a subsequent cast to a bigger type
(eg 'long' or 'unsigned long') will now sign-extend the value despite the
original expression being unsigned.
One example of this would be something like
unsigned long size;
unsigned char c;
size += c << 24;
where despite all the variables being unsigned, 'c << 24' ends up being a
signed entity, and will get sign-extended when then doing the addition in
an 'unsigned long' type.
Since git uses 'unsigned char' pointers extensively, we actually have this
bug in a couple of places.
I may have missed some, but this is the result of looking at
git grep '[^0-9 ][ ]*<<[ ][a-z]' -- '*.c' '*.h'
git grep '<<[ ]*24'
which catches at least the common byte cases (shifting variables by a
variable amount, and shifting by 24 bits).
I also grepped for just 'unsigned char' variables in general, and
converted the ones that most obviously ended up getting implicitly cast
immediately anyway (eg hash_name(), encode_85()).
In addition to just avoiding 'unsigned char', this patch also tries to use
a common idiom for the delta header size thing. We had three different
variations on it: "& 0x7fUL" in one place (getting the sign extension
right), and "& ~0x80" and "& 0x7f" in two other places (not getting it
right). Apart from making them all just avoid using "unsigned char" at
all, I also unified them to then use a simple "& 0x7f".
I considered making a sparse extension which warns about doing implicit
casts from unsigned types to signed types, but it gets rather complex very
quickly, so this is just a hack.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Delta indices, at least on 64-bit platforms, tend to be larger than
the actual uncompressed data. As such, keeping track of this storage
is important if you want to successfully limit the memory size of your
pack window.
Squirrel away the total allocation size inside the delta_index struct,
and add an accessor "sizeof_delta_index" to access it.
Signed-off-by: Brian Downing <bdowning@lavos.net>
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Avoid creating a delta index for objects with maximum depth since they
are not going to be used as delta base anyway. This also reduce peak
memory usage slightly as the current object's delta index is not useful
until the next object in the loop is considered for deltification. This
saves a bit more than 1% on CPU usage.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
This patch splits the diff-delta interface into index creation and delta
generation. A wrapper is provided to preserve the diff-delta() call.
This will allow for an optimization in pack-objects.c where the source
object could be fixed and a full window of objects tentatively tried
against
that same source object without recomputing the source index each time.
This patch only restructure things, plus a couple cleanups for good
measure. There is no performance change yet.
Signed-off-by: Nicolas Pitre <nico@cam.org>
|
|
Let's avoid going south with invalid delta data.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
My kernel work habit made me look at the generated assembly for the
delta code, and one obvious albeit small improvement is this patch.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
This is a wrap-up patch including all the cleanups I've done to the
delta code and its usage. The most important change is the
factorization of the delta header handling code.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Packed delta files created by git-pack-objects seems to be the
way to go, and existing "delta" object handling code has exposed
the object representation details to too many places. Remove it
while we refactor code to come up with a proper interface in
sha1_file.c.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Anything that generates a delta to see if two objects are close usually
isn't interested in the delta ends up being bigger than some specified
size, and this allows us to stop delta generation early when that
happens.
|
|
Make 'sha1' parameters const where possible
Signed-off-by: Jason McMullan <jason.mcmullan@timesys.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This adds knowledge of delta objects to fsck-cache and various object
parsing code. A new switch to git-fsck-cache is provided to display the
maximum delta depth found in a repository.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch adds the basic library functions to create and replay delta
information. Also included is a test-delta utility to validate the
code.
diff-delta was based on LibXDiff written by Davide Libenzi
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|