summaryrefslogtreecommitdiff
path: root/sha1_file.c
AgeCommit message (Collapse)AuthorFilesLines
2017-11-27Merge branch 'tb/add-renormalize'Libravatar Junio C Hamano1-2/+14
"git add --renormalize ." is a new and safer way to record the fact that you are correcting the end-of-line convention and other "convert_to_git()" glitches in the in-repository data. * tb/add-renormalize: add: introduce "--renormalize"
2017-11-17add: introduce "--renormalize"Libravatar Torsten Bögershausen1-2/+14
Make it safer to normalize the line endings in a repository. Files that had been commited with CRLF will be commited with LF. The old way to normalize a repo was like this: # Make sure that there are not untracked files $ echo "* text=auto" >.gitattributes $ git read-tree --empty $ git add . $ git commit -m "Introduce end-of-line normalization" The user must make sure that there are no untracked files, otherwise they would have been added and tracked from now on. The new "add --renormalize" does not add untracked files: $ echo "* text=auto" >.gitattributes $ git add --renormalize . $ git commit -m "Introduce end-of-line normalization" Note that "git add --renormalize <pathspec>" is the short form for "git add -u --renormalize <pathspec>". While at it, document that the same renormalization may be needed, whenever a clean filter is added or changed. Helped-By: Junio C Hamano <gitster@pobox.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-15Merge branch 'jk/info-alternates-fix'Libravatar Junio C Hamano1-1/+3
We used to add an empty alternate object database to the system that does not help anything; it has been corrected. * jk/info-alternates-fix: link_alt_odb_entries: make empty input a noop
2017-11-13link_alt_odb_entries: make empty input a noopLibravatar Jeff King1-1/+3
If an empty string is passed to link_alt_odb_entries(), our loop finds no entries and we link nothing. But we still do some preparatory work to normalize the object directory path, even though we'll never look at the result. This triggers in basically every git process, since we feed the usually-empty ALTERNATE_DB_ENVIRONMENT to the function. Let's detect early that there's nothing to do and return. While we're at it, let's treat NULL the same as an empty string as a favor to our callers. That saves prepare_alt_odb() from having to cover this case. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-09Merge branch 'rs/hex-to-bytes-cleanup'Libravatar Junio C Hamano1-13/+11
Code cleanup. * rs/hex-to-bytes-cleanup: sha1_file: use hex_to_bytes() http-push: use hex_to_bytes() notes: move hex_to_bytes() to hex.c and export it
2017-11-06Merge branch 'bc/object-id'Libravatar Junio C Hamano1-16/+16
Conversion from uchar[20] to struct object_id continues. * bc/object-id: (25 commits) refs/files-backend: convert static functions to object_id refs: convert read_raw_ref backends to struct object_id refs: convert peel_object to struct object_id refs: convert resolve_ref_unsafe to struct object_id worktree: convert struct worktree to object_id refs: convert resolve_gitlink_ref to struct object_id Convert remaining callers of resolve_gitlink_ref to object_id sha1_file: convert index_path and index_fd to struct object_id refs: convert reflog_expire parameter to struct object_id refs: convert read_ref_at to struct object_id refs: convert peel_ref to struct object_id builtin/pack-objects: convert to struct object_id pack-bitmap: convert traverse_bitmap_commit_list to object_id refs: convert dwim_log to struct object_id builtin/reflog: convert remaining unsigned char uses to object_id refs: convert dwim_ref and expand_ref to struct object_id refs: convert read_ref and read_ref_full to object_id refs: convert resolve_refdup and refs_resolve_refdup to struct object_id Convert check_connected to use struct object_id refs: update ref transactions to use struct object_id ...
2017-11-06Merge branch 'ma/lockfile-fixes'Libravatar Junio C Hamano1-11/+8
An earlier update made it possible to use an on-stack in-core lockfile structure (as opposed to having to deliberately leak an on-heap one). Many codepaths have been updated to take advantage of this new facility. * ma/lockfile-fixes: read_cache: roll back lock in `update_index_if_able()` read-cache: leave lock in right state in `write_locked_index()` read-cache: drop explicit `CLOSE_LOCK`-flag cache.h: document `write_locked_index()` apply: remove `newfd` from `struct apply_state` apply: move lockfile into `apply_state` cache-tree: simplify locking logic checkout-index: simplify locking logic tempfile: fix documentation on `delete_tempfile()` lockfile: fix documentation on `close_lock_file_gently()` treewide: prefer lockfiles on the stack sha1_file: do not leak `lock_file`
2017-11-01sha1_file: use hex_to_bytes()Libravatar René Scharfe1-13/+11
The path of a loose object contains its hash value encoded into two substrings of 2 and 38 hexadecimal digits separated by a slash. The first part is handed to for_each_file_in_obj_subdir() in decoded form as subdir_nr. The current code builds a full hexadecimal representation of the hash in a temporary buffer, then uses get_oid_hex() to decode it. Avoid the intermediate step by taking subdir_nr as-is and using hex_to_bytes() directly on the second substring. That's shorter and easier. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16refs: convert resolve_gitlink_ref to struct object_idLibravatar brian m. carlson1-1/+1
Convert the declaration and definition of resolve_gitlink_ref to use struct object_id and apply the following semantic patch: @@ expression E1, E2, E3; @@ - resolve_gitlink_ref(E1, E2, E3.hash) + resolve_gitlink_ref(E1, E2, &E3) @@ expression E1, E2, E3; @@ - resolve_gitlink_ref(E1, E2, E3->hash) + resolve_gitlink_ref(E1, E2, E3) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16sha1_file: convert index_path and index_fd to struct object_idLibravatar brian m. carlson1-15/+15
Convert these two functions and the functions that underlie them to take pointers to struct object_id. This is a prerequisite to convert resolve_gitlink_ref. Fix a stray tab in the middle of the index_mem call in index_pipe by converting it to a space. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-11Merge branch 'jk/sha1-loose-object-info-fix'Libravatar Junio C Hamano1-2/+6
Leakfix and futureproofing. * jk/sha1-loose-object-info-fix: sha1_loose_object_info: handle errors from unpack_sha1_rest
2017-10-06sha1_loose_object_info: handle errors from unpack_sha1_restLibravatar Jeff King1-2/+6
When a caller of sha1_object_info_extended() sets the "contentp" field in object_info, we call unpack_sha1_rest() but do not check whether it signaled an error. This causes two problems: 1. We pass back NULL to the caller via the contentp field, but the function returns "0" for success. A caller might reasonably expect after a successful return that it can access contentp without a NULL check and segfault. As it happens, this is impossible to trigger in the current code. There is exactly one caller which uses contentp, read_object(). And the only thing it does after a successful call is to return the content pointer to its caller, using NULL as a sentinel for errors. So in effect it converts the success code from sha1_object_info_extended() back into an error! But this is still worth addressing avoid problems for future users of "contentp". 2. Callers of unpack_sha1_rest() are expected to close the zlib stream themselves on error. Which means that we're leaking the stream. The problem in (1) comes from from c84a1f3ed4 (sha1_file: refactor read_object, 2017-06-21), which added the contentp field. Before that, we called unpack_sha1_rest() via unpack_sha1_file(), which directly used the NULL to signal an error. But note that the leak in (2) is actually older than that. The original unpack_sha1_file() directly returned the result of unpack_sha1_rest() to its caller, when it should have been closing the zlib stream itself on error. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-06sha1_file: do not leak `lock_file`Libravatar Martin Ågren1-11/+8
There is no longer any need to allocate and leak a `struct lock_file`. Initialize it on the stack instead. Before this patch, we set `lock = NULL` to signal that we have already rolled back, and that we should not do any more work. We need to take another approach now that we cannot assign NULL. We could, e.g., use `is_lock_file_locked()`. But we already have another variable that we could use instead, `found`. Its scope is only too small. Bump `found` to the scope of the whole function and rearrange the "roll back or write?"-checks to a straightforward if-else on `found`. This also future-proves the code by making it obvious that we intend to take exactly one of these paths. Improved-by: Jeff King <peff@peff.net> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-03Merge branch 'jk/read-in-full'Libravatar Junio C Hamano1-3/+8
Code clean-up to prevent future mistakes by copying and pasting code that checks the result of read_in_full() function. * jk/read-in-full: worktree: check the result of read_in_full() worktree: use xsize_t to access file size distinguish error versus short read from read_in_full() avoid looking at errno for short read_in_full() returns prefer "!=" when checking read_in_full() result notes-merge: drop dead zero-write code files-backend: prefer "0" for write_in_full() error check
2017-09-27avoid looking at errno for short read_in_full() returnsLibravatar Jeff King1-3/+8
When a caller tries to read a particular set of bytes via read_in_full(), there are three possible outcomes: 1. An error, in which case -1 is returned and errno is set. 2. A short read, in which fewer bytes are returned and errno is unspecified (we never saw a read error, so we may have some random value from whatever syscall failed last). 3. The full read completed successfully. Many callers handle cases 1 and 2 together by just checking the result against the requested size. If their combined error path looks at errno (e.g., by calling die_errno), they may report a nonsense value. Let's fix these sites by having them distinguish between the two error cases. That avoids the random errno confusion, and lets us give more detailed error messages. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25Merge branch 'jk/info-alternates-fix'Libravatar Junio C Hamano1-20/+11
A regression fix for 2.11 that made the code to read the list of alternate object stores overrun the end of the string. * jk/info-alternates-fix: read_info_alternates: warn on non-trivial errors read_info_alternates: read contents into strbuf
2017-09-25Merge branch 'jk/write-in-full-fix'Libravatar Junio C Hamano1-1/+1
Many codepaths did not diagnose write failures correctly when disks go full, due to their misuse of write_in_full() helper function, which have been corrected. * jk/write-in-full-fix: read_pack_header: handle signed/unsigned comparison in read result config: flip return value of store_write_*() notes-merge: use ssize_t for write_in_full() return value pkt-line: check write_in_full() errors against "< 0" convert less-trivial versions of "write_in_full() != len" avoid "write_in_full(fd, buf, len) != len" pattern get-tar-commit-id: check write_in_full() return against 0 config: avoid "write_in_full(fd, buf, len) < len" pattern
2017-09-20read_info_alternates: warn on non-trivial errorsLibravatar Jeff King1-0/+1
When we fail to open $GIT_DIR/info/alternates, we silently assume there are no alternates. This is the right thing to do for ENOENT, but not for other errors. A hard error is probably overkill here. If we fail to read an alternates file then either we'll complete our operation anyway, or we'll fail to find some needed object. Either way, a warning is good idea. And we already have a helper function to handle this pattern; let's just call warn_on_fopen_error(). Note that technically the errno from strbuf_read_file() might be from a read() error, not open(). But since read() would never return ENOENT or ENOTDIR, and since it produces a generic "unable to access" error, it's suitable for handling errors from either. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-20Merge branch 'jk/info-alternates-fix-2.11' into jk/info-alternates-fixLibravatar Junio C Hamano1-20/+10
* jk/info-alternates-fix-2.11: read_info_alternates: read contents into strbuf
2017-09-20read_info_alternates: read contents into strbufLibravatar Jeff King1-20/+10
This patch fixes a regression in v2.11.1 where we might read past the end of an mmap'd buffer. It was introduced in cf3c635210. The link_alt_odb_entries() function has always taken a ptr/len pair as input. Until cf3c635210 (alternates: accept double-quoted paths, 2016-12-12), we made a copy of those bytes in a string. But after that commit, we switched to parsing the input left-to-right, and we ignore "len" totally, instead reading until we hit a NUL. This has mostly gone unnoticed for a few reasons: 1. All but one caller passes a NUL-terminated string, with "len" pointing to the NUL. 2. The remaining caller, read_info_alternates(), passes in an mmap'd file. Unless the file is an exact multiple of the page size, it will generally be followed by NUL padding to the end of the page, which just works. The easiest way to demonstrate the problem is to build with: make SANITIZE=address NO_MMAP=Nope test Any test which involves $GIT_DIR/info/alternates will fail, as the mmap emulation (correctly) does not add an extra NUL, and ASAN complains about reading past the end of the buffer. One solution would be to teach link_alt_odb_entries() to respect "len". But it's actually a bit tricky, since we depend on unquote_c_style() under the hood, and it has no ptr/len variant. We could also just make a NUL-terminated copy of the input bytes and operate on that. But since all but one caller already is passing a string, instead let's just fix that caller to provide NUL-terminated input in the first place, by swapping out mmap for strbuf_read_file(). There's no advantage to using mmap on the alternates file. It's not expected to be large (and anyway, we're copying its contents into an in-memory linked list). Nor is using git_open() buying us anything here, since we don't keep the descriptor open for a long period of time. Let's also drop the "len" parameter entirely from link_alt_odb_entries(), since it's completely ignored. That will avoid any new callers re-introducing a similar bug. Reported-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-19Merge branch 'rs/strbuf-leakfix'Libravatar Junio C Hamano1-3/+3
Many leaks of strbuf have been fixed. * rs/strbuf-leakfix: (34 commits) wt-status: release strbuf after use in wt_longstatus_print_tracking() wt-status: release strbuf after use in read_rebase_todolist() vcs-svn: release strbuf after use in end_revision() utf8: release strbuf on error return in strbuf_utf8_replace() userdiff: release strbuf after use in userdiff_get_textconv() transport-helper: release strbuf after use in process_connect_service() sequencer: release strbuf after use in save_head() shortlog: release strbuf after use in insert_one_record() sha1_file: release strbuf on error return in index_path() send-pack: release strbuf on error return in send_pack() remote: release strbuf after use in set_url() remote: release strbuf after use in migrate_file() remote: release strbuf after use in read_remote_branches() refs: release strbuf on error return in write_pseudoref() notes: release strbuf after use in notes_copy_from_stdin() merge: release strbuf after use in write_merge_heads() merge: release strbuf after use in save_state() mailinfo: release strbuf on error return in handle_boundary() mailinfo: release strbuf after use in handle_from() help: release strbuf on error return in exec_woman_emacs() ...
2017-09-14read_pack_header: handle signed/unsigned comparison in read resultLibravatar Jeff King1-1/+1
The result of read_in_full() may be -1 if we saw an error. But in comparing it to a sizeof() result, that "-1" will be promoted to size_t. In fact, the largest possible size_t which is much bigger than our struct size. This means that our "< sizeof(header)" error check won't trigger. In practice, we'd go on to read uninitialized memory and compare it to the PACK signature, which is likely to fail. But we shouldn't get there. We can fix this by making a direct "!=" comparison to the requested size, rather than "<". This means that errors get lumped in with short reads, but that's sufficient for our purposes here. There's no PH_ERROR tp represent our case. And anyway, this function reads from pipes and network sockets. A network error may racily appear as EOF to us anyway if there's data left in the socket buffers. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-10Merge branch 'sb/sha1-file-cleanup' into maintLibravatar Junio C Hamano1-1/+2
Code clean-up. * sb/sha1-file-cleanup: sha1_file: make read_info_alternates static
2017-09-10Merge branch 'rs/find-pack-entry-bisection' into maintLibravatar Junio C Hamano1-2/+2
Code clean-up. * rs/find-pack-entry-bisection: sha1_file: avoid comparison if no packed hash matches the first byte
2017-09-10Merge branch 'rs/unpack-entry-leakfix' into maintLibravatar Junio C Hamano1-2/+3
Memory leak in an error codepath has been plugged. * rs/unpack-entry-leakfix: sha1_file: release delta_stack on error in unpack_entry()
2017-09-07sha1_file: release strbuf on error return in index_path()Libravatar Rene Scharfe1-3/+3
strbuf_readlink() already frees the buffer for us on error. Clean up if write_sha1_file() fails as well instead of returning early. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06Merge branch 'po/read-graft-line'Libravatar Junio C Hamano1-1/+1
Conversion from uchar[20] to struct object_id continues; this is to ensure that we do not assume sizeof(struct object_id) is the same as the length of SHA-1 hash (or length of longest hash we support). * po/read-graft-line: commit: rewrite read_graft_line commit: allocate array using object_id size commit: replace the raw buffer with strbuf in read_graft_line sha1_file: fix definition of null_sha1
2017-08-26Merge branch 'jt/packmigrate'Libravatar Junio C Hamano1-1891/+15
Code movement to make it easier to hack later. * jt/packmigrate: (23 commits) pack: move for_each_packed_object() pack: move has_pack_index() pack: move has_sha1_pack() pack: move find_pack_entry() and make it global pack: move find_sha1_pack() pack: move find_pack_entry_one(), is_pack_valid() pack: move check_pack_index_ptr(), nth_packed_object_offset() pack: move nth_packed_object_{sha1,oid} pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry() pack: move unpack_object_header() pack: move get_size_from_delta() pack: move unpack_object_header_buffer() pack: move {,re}prepare_packed_git and approximate_object_count pack: move install_packed_git() pack: move add_packed_git() pack: move unuse_pack() pack: move use_pack() pack: move pack-closing functions pack: move release_pack_memory() pack: move open_pack_index(), parse_pack_index() ...
2017-08-26Merge branch 'po/object-id'Libravatar Junio C Hamano1-16/+16
* po/object-id: sha1_file: convert index_stream to struct object_id sha1_file: convert hash_sha1_file_literally to struct object_id sha1_file: convert index_fd to struct object_id sha1_file: convert index_path to struct object_id read-cache: convert to struct object_id builtin/hash-object: convert to struct object_id
2017-08-23pack: move for_each_packed_object()Libravatar Jonathan Tan1-40/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move has_pack_index()Libravatar Jonathan Tan1-8/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move has_sha1_pack()Libravatar Jonathan Tan1-6/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move find_pack_entry() and make it globalLibravatar Jonathan Tan1-53/+0
This function needs to be global as it is used by sha1_file.c and will be used by packfile.c. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move find_sha1_pack()Libravatar Jonathan Tan1-13/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move find_pack_entry_one(), is_pack_valid()Libravatar Jonathan Tan1-73/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move check_pack_index_ptr(), nth_packed_object_offset()Libravatar Jonathan Tan1-33/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move nth_packed_object_{sha1,oid}Libravatar Jonathan Tan1-31/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry()Libravatar Jonathan Tan1-663/+14
Both sha1_file.c and packfile.c now need read_object(), so a copy of read_object() was created in packfile.c. This patch makes both mark_bad_packed_object() and has_packed_and_bad() global. Unlike most of the other patches in this series, these 2 functions need to remain global. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move unpack_object_header()Libravatar Jonathan Tan1-26/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move get_size_from_delta()Libravatar Jonathan Tan1-39/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move unpack_object_header_buffer()Libravatar Jonathan Tan1-25/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move {,re}prepare_packed_git and approximate_object_countLibravatar Jonathan Tan1-214/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move install_packed_git()Libravatar Jonathan Tan1-9/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move add_packed_git()Libravatar Jonathan Tan1-61/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move unuse_pack()Libravatar Jonathan Tan1-9/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move use_pack()Libravatar Jonathan Tan1-285/+0
The function open_packed_git() needs to be temporarily made global. Its scope will be restored to static in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move pack-closing functionsLibravatar Jonathan Tan1-55/+0
The function close_pack_fd() needs to be temporarily made global. Its scope will be restored to static in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move release_pack_memory()Libravatar Jonathan Tan1-49/+0
The function unuse_one_window() needs to be temporarily made global. Its scope will be restored to static in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move open_pack_index(), parse_pack_index()Libravatar Jonathan Tan1-140/+0
alloc_packed_git() in packfile.c is duplicated from sha1_file.c. In a subsequent commit, alloc_packed_git() will be removed from sha1_file.c. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move pack_report()Libravatar Jonathan Tan1-24/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>