summaryrefslogtreecommitdiff
path: root/ewah/ewok.h
AgeCommit message (Collapse)AuthorFilesLines
2018-06-18ewah_read_mmap: bounds-check mmap readsLibravatar Jeff King1-1/+1
The on-disk ewah format tells us how big the ewah data is, and we blindly read that much from the buffer without considering whether the mmap'd data is long enough, which can lead to out-of-bound reads. Let's make sure we have data available before reading it, both for the ewah header/footer as well as for the bit data itself. In particular: - keep our ptr/len pair in sync as we move through the buffer, and check it before each read - check the size for integer overflow (this should be impossible on 64-bit, as the size is given as a 32-bit count of 8-byte words, but is possible on a 32-bit system) - return the number of bytes read as an ssize_t instead of an int, again to prevent integer overflow - compute the return value using a pointer difference; this should yield the same result as the existing code, but makes it more obvious that we got our computations right The included test is far from comprehensive, as it just picks a static point at which to truncate the generated bitmap. But in practice this will hit in the middle of an ewah and make sure we're at least exercising this code. Reported-by: Luat Nguyen <root@l4w.io> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-09Replace Free Software Foundation address in license noticesLibravatar Todd Zullinger1-2/+1
The mailing address for the FSF has changed over the years. Rather than updating the address across all files, refer readers to gnu.org, as the GNU GPL documentation now suggests for license notices. The mailing address is retained in the full license files (COPYING and LGPL-2.1). The old address is still present in t/diff-lib/COPYING. This is intentional, as the file is used in tests and the contents are not expected to change. Signed-off-by: Todd Zullinger <tmz@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22convert ewah/bitmap code to use xmallocLibravatar Jeff King1-10/+0
This code was originally written with the idea that it could be spun off into its own ewah library, and uses the overrideable ewah_malloc to do allocations. We plug in xmalloc as our ewah_malloc, of course. But over the years the ewah code itself has become more entangled with git, and the return value of many ewah_malloc sites is not checked. Let's just drop the level of indirection and use xmalloc and friends directly. This saves a few lines, and will let us adapt these sites to our more advanced malloc helpers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-24Merge branch 'es/osx-header-pollutes-mask-macro'Libravatar Junio C Hamano1-1/+1
* es/osx-header-pollutes-mask-macro: ewah: use less generic macro name ewah/bitmap: silence warning about MASK macro redefinition
2015-06-03ewah: use less generic macro nameLibravatar Jeff King1-1/+1
The ewah/ewok.h header pollutes the global namespace with "BITS_IN_WORD", without any specific notion that we are talking about the bits in an eword_t. We can give this the more specific name "BITS_IN_EWORD". Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12ewah: add convenient wrapper ewah_serialize_strbuf()Libravatar Nguyễn Thái Ngọc Duy1-0/+2
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-18Merge branch 'jk/pack-bitmap'Libravatar Junio C Hamano1-1/+2
The pack bitmap support did not build with older versions of GCC. * jk/pack-bitmap: ewah: fix building with gcc < 3.4.0
2015-02-04ewah: fix building with gcc < 3.4.0Libravatar Tom G. Christensen1-1/+2
The __builtin_ctzll function was added in gcc 3.4.0. This extends the check for gcc so that use of __builtin_ctzll is only enabled if gcc >= 3.4.0. Signed-off-by: Tom G. Christensen <tgc@statsbiblioteket.dk> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-29ewah: delete unused ewah_read_mmap_native declarationLibravatar Nguyễn Thái Ngọc Duy1-1/+0
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-29ewah: fix constness of ewah_read_mmapLibravatar Nguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30ewah: compressed bitmap implementationLibravatar Vicent Marti1-0/+233
EWAH is a word-aligned compressed variant of a bitset (i.e. a data structure that acts as a 0-indexed boolean array for many entries). It uses a 64-bit run-length encoding (RLE) compression scheme, trading some compression for better processing speed. The goal of this word-aligned implementation is not to achieve the best compression, but rather to improve query processing time. As it stands right now, this EWAH implementation will always be more efficient storage-wise than its uncompressed alternative. EWAH arrays will be used as the on-disk format to store reachability bitmaps for all objects in a repository while keeping reasonable sizes, in the same way that JGit does. This EWAH implementation is a mostly straightforward port of the original `javaewah` library that JGit currently uses. The library is self-contained and has been embedded whole (4 files) inside the `ewah` folder to ease redistribution. The library is re-licensed under the GPLv2 with the permission of Daniel Lemire, the original author. The source code for the C version can be found on GitHub: https://github.com/vmg/libewok The original Java implementation can also be found on GitHub: https://github.com/lemire/javaewah [jc: stripped debug-only code per Peff's $gmane/239768] Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>