summaryrefslogtreecommitdiff
path: root/t
AgeCommit message (Collapse)AuthorFilesLines
2021-12-10Merge branch 'ab/checkout-branch-info-leakfix'Libravatar Junio C Hamano35-0/+43
Leakfix. * ab/checkout-branch-info-leakfix: checkout: fix "branch info" memory leaks
2021-12-10Merge branch 'jk/t5319-midx-corruption-test-deflake'Libravatar Junio C Hamano1-2/+4
Test fix. * jk/t5319-midx-corruption-test-deflake: t5319: corrupt more bytes of the midx checksum
2021-12-10Merge branch 'tw/var-default-branch'Libravatar Junio C Hamano1-0/+20
"git var GIT_DEFAULT_BRANCH" is a way to see what name is used for the newly created branch if "git init" is run. * tw/var-default-branch: var: add GIT_DEFAULT_BRANCH variable
2021-12-10Merge branch 'jk/strbuf-addftime-seconds-since-epoch'Libravatar Junio C Hamano1-0/+4
The "--date=format:<strftime>" gained a workaround for the lack of system support for a non-local timezone to handle "%s" placeholder. * jk/strbuf-addftime-seconds-since-epoch: strbuf_addftime(): handle "%s" manually
2021-12-10Merge branch 'if/redact-packfile-uri'Libravatar Junio C Hamano1-0/+51
Redact the path part of packfile URI that appears in the trace output. * if/redact-packfile-uri: http-fetch: redact url on die() message fetch-pack: redact packfile urls in traces
2021-12-10Merge branch 'gc/remote-with-fewer-static-global-variables'Libravatar Junio C Hamano1-0/+9
Code clean-up to eventually allow information on remotes defined for an arbitrary repository to be read. * gc/remote-with-fewer-static-global-variables: remote: die if branch is not found in repository remote: remove the_repository->remote_state from static methods remote: use remote_state parameter internally remote: move static variables into per-repository struct t5516: add test case for pushing remote refspecs
2021-12-10Merge branch 'vd/sparse-sparsity-fix-on-read'Libravatar Junio C Hamano2-2/+34
Ensure that the sparseness of the in-core index matches the index.sparse configuration specified by the repository immediately after the on-disk index file is read. * vd/sparse-sparsity-fix-on-read: sparse-index: update do_read_index to ensure correct sparsity sparse-index: add ensure_correct_sparsity function sparse-index: avoid unnecessary cache tree clearing test-read-cache.c: prepare_repo_settings after config init
2021-11-29Merge branch 'mc/clean-smudge-with-llp64'Libravatar Junio C Hamano3-4/+47
The clean/smudge conversion code path has been prepared to better work on platforms where ulong is narrower than size_t. * mc/clean-smudge-with-llp64: clean/smudge: allow clean filters to process extremely large files odb: guard against data loss checking out a huge file git-compat-util: introduce more size_t helpers odb: teach read_blob_entry to use size_t t1051: introduce a smudge filter test for extremely large files test-lib: add prerequisite for 64-bit platforms test-tool genzeros: generate large amounts of data more efficiently test-genzeros: allow more than 2G zeros in Windows
2021-11-29Merge branch 'tb/plug-pack-bitmap-leaks'Libravatar Junio C Hamano1-1/+2
Leakfix. * tb/plug-pack-bitmap-leaks: pack-bitmap.c: more aggressively free in free_bitmap_index() pack-bitmap.c: don't leak type-level bitmaps midx.c: write MIDX filenames to strbuf builtin/multi-pack-index.c: don't leak concatenated options builtin/repack.c: avoid leaking child arguments builtin/pack-objects.c: don't leak memory via arguments t/helper/test-read-midx.c: free MIDX within read_midx_file() midx.c: don't leak MIDX from verify_midx_file midx.c: clean up chunkfile after reading the MIDX
2021-11-29Merge branch 'tp/send-email-completion'Libravatar Junio C Hamano1-0/+3
The command line complation for "git send-email" options have been tweaked to make it easier to keep it in sync with the command itself. * tp/send-email-completion: send-email docs: add format-patch options send-email: programmatically generate bash completions
2021-11-29Merge branch 'jc/fix-ref-sorting-parse'Libravatar Junio C Hamano2-1/+37
Things like "git -c branch.sort=bogus branch new HEAD", i.e. the operation modes of the "git branch" command that do not need the sort key information, no longer errors out by seeing a bogus sort key. * jc/fix-ref-sorting-parse: for-each-ref: delay parsing of --sort=<atom> options
2021-11-29Merge branch 'so/stash-staged'Libravatar Junio C Hamano1-0/+11
"git stash" learned the "--staged" option to stash away what has been added to the index (and nothing else). * so/stash-staged: stash: get rid of unused argument in stash_staged() stash: implement '--staged' option for 'push' and 'save'
2021-11-29Merge branch 'ab/refs-errno-cleanup'Libravatar Junio C Hamano3-1/+89
The "remainder" of hn/refs-errno-cleanup topic. * ab/refs-errno-cleanup: (21 commits) refs API: post-migration API renaming [2/2] refs API: post-migration API renaming [1/2] refs API: don't expose "errno" in run_transaction_hook() refs API: make expand_ref() & repo_dwim_log() not set errno refs API: make resolve_ref_unsafe() not set errno refs API: make refs_ref_exists() not set errno refs API: make refs_resolve_refdup() not set errno refs tests: ignore ignore errno in test-ref-store helper refs API: ignore errno in worktree.c's find_shared_symref() refs API: ignore errno in worktree.c's add_head_info() refs API: make files_copy_or_rename_ref() et al not set errno refs API: make loose_fill_ref_dir() not set errno refs API: make resolve_gitlink_ref() not set errno refs API: remove refs_read_ref_full() wrapper refs/files: remove "name exist?" check in lock_ref_oid_basic() reflog tests: add --updateref tests refs API: make refs_rename_ref_available() static refs API: make parse_loose_ref_contents() not set errno refs API: make refs_read_raw_ref() not set errno refs API: add a version of refs_resolve_ref_unsafe() with "errno" ...
2021-11-29Merge branch 'ow/stash-count-in-status-porcelain-output'Libravatar Junio C Hamano1-0/+15
Allow "git status --porcelain=v2" to show the number of stash entries with --show-stash like the normal output does. * ow/stash-count-in-status-porcelain-output: status: print stash info with --porcelain=v2 --show-stash status: count stash entries in separate function
2021-11-29Merge branch 'jk/loosen-urlmatch'Libravatar Junio C Hamano1-1/+1
Treat "_" as any other URL-valid characters in an URL when matching the per-URL configuration variable names. * jk/loosen-urlmatch: urlmatch: add underscore to URL_HOST_CHARS
2021-11-24sparse-index: update do_read_index to ensure correct sparsityLibravatar Victoria Dye1-0/+31
Unless `command_requires_full_index` forces index expansion, ensure in-core index sparsity matches config settings on read by calling `ensure_correct_sparsity`. This makes the behavior of the in-core index more consistent between different methods of updating sparsity: manually changing the `index.sparse` config setting vs. executing `git sparse-checkout --[no-]sparse-index init` Although index sparsity is normally updated with `git sparse-checkout init`, ensuring correct sparsity after a manual `index.sparse` change has some practical benefits: 1. It allows for command-by-command sparsity toggling with `-c index.sparse=<true|false>`, e.g. when troubleshooting issues with the sparse index. 2. It prevents users from experiencing abnormal slowness after setting `index.sparse` to `true` due to use of a full index in all commands until the on-disk index is updated. Helped-by: Junio C Hamano <gitster@pobox.com> Co-authored-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Victoria Dye <vdye@github.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-24test-read-cache.c: prepare_repo_settings after config initLibravatar Victoria Dye1-2/+3
Move `prepare_repo_settings` after the git directory has been set up in `test-read-cache.c`. The git directory settings must be initialized to properly assign repo settings using the worktree-level git config. Signed-off-by: Victoria Dye <vdye@github.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-22Merge branch 'ds/add-rm-with-sparse-index'Libravatar Junio C Hamano1-0/+17
Regression fix for 2.34 * ds/add-rm-with-sparse-index: dir: revert "dir: select directories correctly"
2021-11-22dir: revert "dir: select directories correctly"Libravatar Derrick Stolee1-0/+17
This reverts commit f6526728f950cacfd5b5e42bcc65f2c47f3da654. The change in f652672 (dir: select directories correctly, 2021-09-24) caused a regression in directory-based matches with non-cone-mode patterns, especially for .gitignore patterns. A test is included to prevent this regression in the future. The commit ed495847 (dir: fix pattern matching on dirs, 2021-09-24) was reverted in 5ceb663 (dir: fix directory-matching bug, 2021-11-02) for similar reasons. Neither commit changed tests, and tests added later in the series continue to pass when these commits are reverted. Reported-by: Danial Alihosseini <danial.alihosseini@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-21Merge branch 'ev/pull-already-up-to-date-is-noop'Libravatar Junio C Hamano1-0/+6
"git pull" with any strategy when the other side is behind us should succeed as it is a no-op, but doesn't. * ev/pull-already-up-to-date-is-noop: pull: should be noop when already-up-to-date
2021-11-21Merge branch 'hm/paint-hits-in-log-grep'Libravatar Junio C Hamano1-48/+0
"git grep" looking in a blob that has non-UTF8 payload was completely broken when linked with certain versions of PCREv2 library in the latest release. * hm/paint-hits-in-log-grep: Revert "grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data"
2021-11-19Revert "grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data"Libravatar Junio C Hamano1-48/+0
This reverts commit ae39ba431ab861548eb60b4bd2e1d8b8813db76f, as it breaks "grep" when looking for a string in non UTF-8 haystack, when linked with certain versions of PCREv2 library. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-18t5516: add test case for pushing remote refspecsLibravatar Glen Choo1-0/+9
"git push remote-name" (that is, with no refspec given on the command line) should push the refspecs in remote.remote-name.push. There is no test case that checks this behavior in detached HEAD, so add one. Signed-off-by: Glen Choo <chooglen@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-18pull: should be noop when already-up-to-dateLibravatar Erwin Villejo1-0/+6
The already-up-to-date pull bug was fixed for --ff-only but it did not include the case where --ff or --ff-only are not specified. This updates the --ff-only fix to include the case where --ff or --ff-only are not specified in command line flags or config. Signed-off-by: Erwin Villejo <erwin.villejo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-18t5319: corrupt more bytes of the midx checksumLibravatar Jeff King1-2/+4
One of the tests in t5319 corrupts the checksum of the midx file by writing a single 0xff over the final byte, and then confirms that we detect the problem. This usually works fine, but would break if the actual checksum ended with that same byte already. It seems like this should happen in 1 out of 256 test runs, but it turns out to be less often in practice. The contents of the midx are mostly deterministic because it's based on the objects, and we remove most sources of randomness by setting GIT_COMMITTER_DATE, etc. However, there's still some randomness: some objects are duplicated between packs, and the midx must decide which to use, which can be based on timing. So very occasionally we can end up with a real 0xff byte, and the test fails. The most robust fix would be to read out the final byte and then change it to something else (e.g., adding 1 mod 256). But that's awkward to do in shell. Let's just blindly corrupt 10 bytes instead of 1, which reduces our chances of an accidental noop to 1 in 2^80. Reported-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-18checkout: fix "branch info" memory leaksLibravatar Ævar Arnfjörð Bjarmason35-0/+43
The "checkout" command is one of the main sources of leaks in the test suite, let's fix the common ones by not leaking from the "struct branch_info". Doing this is rather straightforward, albeit verbose, we need to xstrdup() constant strings going into the struct, and free() the ones we clobber as we go along. This also means that we can delete previous partial leak fixes in this area, i.e. the "path_to_free" accounting added by 96ec7b1e708 (Convert resolve_ref+xstrdup to new resolve_refdup function, 2011-12-13). There was some discussion about whether "we should retain the "const char *" here and cast at free() time, or have it be a "char *". Since this is not a public API with any sort of API boundary let's use "char *", as is already being done for the "refname" member of the same struct. The tests to mark as passing were found with: rm .prove; GIT_SKIP_TESTS=t0027 prove -j8 --state=save t[0-9]*.sh :: --immediate # apply & compile this change prove -j8 --state=failed :: --immediate I.e. the ones that were newly passing when the --state=failed command was run. I left out "t3040-subprojects-basic.sh" and "t4131-apply-fake-ancestor.sh" to to optimization-level related differences similar to the ones noted in[1], except that these would be something the current 'linux-leaks' job would run into. 1. https://lore.kernel.org/git/cover-v3-0.6-00000000000-20211022T175227Z-avarab@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-12Merge branch 'ab/fsck-unexpected-type'Libravatar Junio C Hamano1-0/+8
Regression fix. * ab/fsck-unexpected-type: object-file: free(*contents) only in read_loose_object() caller object-file: fix SEGV on free() regression in v2.34.0-rc2
2021-11-12Merge branch 'ps/connectivity-optim'Libravatar Junio C Hamano1-31/+0
Regression fix. * ps/connectivity-optim: Revert "connected: do not sort input revisions"
2021-11-11Revert "connected: do not sort input revisions"Libravatar Junio C Hamano1-31/+0
This reverts commit f45022dc2fd692fd024f2eb41a86a66f19013d43, as this is like breakage in the traversal more likely. In a history with 10 single strand of pearls, 1-->2-->3--...->7-->8-->9-->10 asking "rev-list --unsorted-input 1 10 --not 9 8 7 6 5 4" fails to paint the bottom 1 uninteresting as the traversal stops, without completing the propagation of uninteresting bit starting at 4 down through 3 and 2 to 1.
2021-11-11object-file: fix SEGV on free() regression in v2.34.0-rc2Libravatar Ævar Arnfjörð Bjarmason1-0/+8
Fix a regression introduced in my 96e41f58fe1 (fsck: report invalid object type-path combinations, 2021-10-01). When fsck-ing blobs larger than core.bigFileThreshold, we'd free() a pointer to uninitialized memory. This issue would have been caught by SANITIZE=address, but since it involves core.bigFileThreshold, none of the existing tests in our test suite covered it. Running them with the "big_file_threshold" in "environment.c" changed to say "6" would have shown this failure, but let's add a dedicated test for this scenario based on Han Xin's report[1]. The bug was introduced between v9 and v10[2] of the fsck series merged in 061a21d36d8 (Merge branch 'ab/fsck-unexpected-type', 2021-10-25). 1. https://lore.kernel.org/git/20211111030302.75694-1-hanxin.hx@alibaba-inc.com/ 2. https://lore.kernel.org/git/cover-v10-00.17-00000000000-20211001T091051Z-avarab@gmail.com/ Reported-by: Han Xin <chiyutianyi@gmail.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-11fetch-pack: redact packfile urls in tracesLibravatar Ivan Frade1-0/+51
In some setups, packfile uris act as bearer token. It is not recommended to expose them plainly in logs, although in special circunstances (e.g. debug) it makes sense to write them. Redact the packfile URL paths by default, unless the GIT_TRACE_REDACT variable is set to false. This mimics the redacting of the Authorization header in HTTP. Signed-off-by: Ivan Frade <ifrade@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-10Merge branch 'jk/ssh-signing-fix'Libravatar Junio C Hamano1-0/+6
Reject OpenSSH 8.7 whose "ssh-keygen -Y find-principals" is unusable from running the ssh signature tests. * jk/ssh-signing-fix: t/lib-gpg: avoid broken versions of ssh-keygen
2021-11-10Merge branch 'jc/fix-pull-ff-only-when-already-up-to-date'Libravatar Junio C Hamano1-1/+15
"git pull --ff-only" and "git pull --rebase --ff-only" should make it a no-op to attempt pulling from a remote that is behind us, but instead the command errored out by saying it was impossible to fast-forward, which may technically be true, but not a useful thing to diagnose as an error. This has been corrected. * jc/fix-pull-ff-only-when-already-up-to-date: pull: --ff-only should make it a noop when already-up-to-date
2021-11-10t/lib-gpg: avoid broken versions of ssh-keygenLibravatar Jeff King1-0/+6
The "-Y find-principals" option of ssh-keygen seems to be broken in Debian's openssh-client 1:8.7p1-1, whereas it works fine in 1:8.4p1-5. This causes several failures for GPGSSH tests. We fulfill the prerequisite because generating the keys works fine, but actually verifying a signature causes results ranging from bogus results to ssh-keygen segfaulting. We can find the broken version during the prereq check by feeding it empty input. This should result in it complaining to stderr, but in the broken version it triggers the segfault, causing the GPGSSH tests to be skipped. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-09Merge branch 'ad/ssh-signing-testfix'Libravatar Junio C Hamano1-0/+1
Fix ssh-signing test to work on a platform where the default ACL is overly loose to upset OpenSSH (reported on an installation of Cygwin). * ad/ssh-signing-testfix: t/lib-git.sh: fix ACL-related permissions failure
2021-11-05t/lib-git.sh: fix ACL-related permissions failureLibravatar Adam Dinwoodie1-0/+1
As well as checking that the relevant functionality is available, the GPGSSH prerequisite check creates the SSH keys that are used by the test functions it gates. If these keys are created in a directory that has a default Access Control List, the key files can inherit those permissions. This can result in a scenario where the private keys are created successfully, so the prerequisite check passes and the tests are run, but the key files have permissions that are too permissive, meaning OpenSSH will refuse to load them and the tests will fail. To avoid this happening, before creating the keys, clear any default ACL set on the directory that will contain them. This step allowed to fail; if setfacl isn't present, that's a very likely indicator that the filesystem in question simply doesn't support default ACLs. Helped-by: Fabian Stelzer <fs@gigacodes.de> Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-04strbuf_addftime(): handle "%s" manuallyLibravatar Jeff King1-0/+4
The strftime() function has a non-standard "%s" extension, which prints the number of seconds since the epoch. But the "struct tm" we get has already been adjusted for a particular time zone; going back to an epoch time requires knowing that zone offset. Since strftime() doesn't take such an argument, round-tripping to a "struct tm" and back to the "%s" format may produce the wrong value (off by tz_offset seconds). Since we're already passing in the zone offset courtesy of c3fbf81a85 (strbuf: let strbuf_addftime handle %z and %Z itself, 2017-06-15), we can use that same value to adjust our epoch seconds accordingly. Note that the description above makes it sound like strftime()'s "%s" is useless (and really, the issue is shared by mktime(), which is what strftime() would use under the hood). But it gets the two cases for which it's designed correct: - the result of gmtime() will have a zero offset, so no adjustment is necessary - the result of localtime() will be offset by the local zone offset, and mktime() and strftime() are defined to assume this offset when converting back (there's actually some magic here; some implementations record this in the "struct tm", but we can't portably access or manipulate it. But they somehow "know" whether a "struct tm" is from gmtime() or localtime()). This latter point means that "format-local:%s" actually works correctly already, because in that case we rely on the system routines due to 6eced3ec5e (date: use localtime() for "-local" time formats, 2017-06-15). Our problem comes when trying to show times in the author's zone, as the system routines provide no mechanism for converting in non-local zones. So in those cases we have a "struct tm" that came from gmtime(), but has been manipulated according to our offset. The tests cover the broken round-trip by formatting "%s" for a time in a non-system timezone. We use the made-up "+1234" here, which has two advantages. One, we know it won't ever be the real system zone (and so we're actually testing a case that would break). And two, since it has a minute component, we're testing the full decoding of the +HHMM zone into a number of seconds. Likewise, we test the "-1234" variant to make sure there aren't any sign mistakes. There's one final test, which covers "format-local:%s". As noted, this already passes, but it's important to check that we didn't regress this case. In particular, the caller in show_date() is relying on localtime() to have done the zone adjustment, independent of any tz_offset we compute ourselves. These should match up, since our local_tzoffset() is likewise built around localtime(). But it would be easy for a caller to forget to pass in a correct tz_offset to strbuf_addftime(). Fortunately show_date() does this correctly (it has to because of the existing handling of %z), and the test continues to pass. So this one is just future-proofing against a change in our assumptions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-04Merge branch 'pw/rebase-r-fixes' into maintLibravatar Junio C Hamano1-0/+29
Regression fix. * pw/rebase-r-fixes: rebase -i: fix rewording with --committer-date-is-author-date
2021-11-04Merge branch 'jk/log-warn-on-bogus-encoding' into maintLibravatar Junio C Hamano1-7/+0
Squelch over-eager warning message added during this cycle. * jk/log-warn-on-bogus-encoding: log: document --encoding behavior on iconv() failure Revert "logmsg_reencode(): warn when iconv() fails"
2021-11-04Merge branch 'ar/no-verify-doc'Libravatar Junio C Hamano1-0/+8
Doc update. * ar/no-verify-doc: Document positive variant of commit and merge option "--no-verify"
2021-11-04Merge branch 'ar/fix-git-pull-no-verify'Libravatar Junio C Hamano1-0/+24
"git pull --no-verify" did not affect the underlying "git merge". * ar/fix-git-pull-no-verify: pull: honor --no-verify and do not call the commit-msg hook
2021-11-03Merge branch 'pw/rebase-r-fixes'Libravatar Junio C Hamano1-0/+29
Regression fix. * pw/rebase-r-fixes: rebase -i: fix rewording with --committer-date-is-author-date
2021-11-03Merge branch 'ds/add-rm-with-sparse-index'Libravatar Junio C Hamano1-0/+26
Regression fix. * ds/add-rm-with-sparse-index: dir: fix directory-matching bug
2021-11-03var: add GIT_DEFAULT_BRANCH variableLibravatar Thomas Weißschuh1-0/+20
Introduce the logical variable GIT_DEFAULT_BRANCH which represents the the default branch name that will be used by "git init". Currently this variable is equivalent to git config init.defaultbranch || 'master' This however will break if at one point the default branch is changed as indicated by `default_branch_name_advice` in `refs.c`. By providing this command ahead of time users of git can make their code forward-compatible. Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-03clean/smudge: allow clean filters to process extremely large filesLibravatar Matt Cooper1-0/+11
The filter system allows for alterations to file contents when they're moved between the database and the worktree. We already made sure that it is possible for smudge filters to produce contents that are larger than `unsigned long` can represent (which matters on systems where `unsigned long` is narrower than `size_t`, most notably 64-bit Windows). Now we make sure that clean filters can _consume_ contents that are larger than that. Note that this commit only allows clean filters' _input_ to be larger than can be represented by `unsigned long`. This change makes only a very minute dent into the much larger project to teach Git to use `size_t` instead of `unsigned long` wherever appropriate. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Matt Cooper <vtbassmatt@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-03odb: teach read_blob_entry to use size_tLibravatar Matt Cooper1-1/+1
There is mixed use of size_t and unsigned long to deal with sizes in the codebase. Recall that Windows defines unsigned long as 32 bits even on 64-bit platforms, meaning that converting size_t to unsigned long narrows the range. This mostly doesn't cause a problem since Git rarely deals with files larger than 2^32 bytes. But adjunct systems such as Git LFS, which use smudge/clean filters to keep huge files out of the repository, may have huge file contents passed through some of the functions in entry.c and convert.c. On Windows, this results in a truncated file being written to the workdir. I traced this to one specific use of unsigned long in write_entry (and a similar instance in write_pc_item_to_fd for parallel checkout). That appeared to be for the call to read_blob_entry, which expects a pointer to unsigned long. By altering the signature of read_blob_entry to expect a size_t, write_entry can be switched to use size_t internally (which all of its callers and most of its callees already used). To avoid touching dozens of additional files, read_blob_entry uses a local unsigned long to call a chain of functions which aren't prepared to accept size_t. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Matt Cooper <vtbassmatt@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-03t1051: introduce a smudge filter test for extremely large filesLibravatar Matt Cooper1-0/+15
The filter system allows for alterations to file contents when they're added to the database or working tree. ("Smudge" when moving to the working tree; "clean" when moving to the database.) This is used natively to handle CRLF to LF conversions. It's also employed by Git-LFS to replace large files from the working tree with small tracking files in the repo and vice versa. Git reads the entire smudged file into memory to convert it into a "clean" form to be used in-core. While this is inefficient, there's a more insidious problem on some platforms due to inconsistency between using unsigned long and size_t for the same type of data (size of a file in bytes). On most 64-bit platforms, unsigned long is 64 bits, and size_t is typedef'd to unsigned long. On Windows, however, unsigned long is only 32 bits (and therefore on 64-bit Windows, size_t is typedef'd to unsigned long long in order to be 64 bits). Practically speaking, this means 64-bit Windows users of Git-LFS can't handle files larger than 2^32 bytes. Other 64-bit platforms don't suffer this limitation. This commit introduces a test exposing the issue; future commits make it pass. The test simulates the way Git-LFS works by having a tiny file checked into the repository and expanding it to a huge file on checkout. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Matt Cooper <vtbassmatt@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-03test-lib: add prerequisite for 64-bit platformsLibravatar Carlo Marcelo Arenas Belón1-0/+4
Allow tests that assume a 64-bit `size_t` to be skipped in 32-bit platforms and regardless of the size of `long`. This imitates the `LONG_IS_64BIT` prerequisite. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-03test-tool genzeros: generate large amounts of data more efficientlyLibravatar Johannes Schindelin1-2/+15
In this developer's tests, producing one gigabyte worth of NULs in a busy loop that writes out individual bytes, unbuffered, took ~27sec. Writing chunked 256kB buffers instead only took ~0.6sec This matters because we are about to introduce a pair of test cases that want to be able to produce 5GB of NULs, and we cannot use `/dev/zero` because of the HP NonStop platform's lack of support for that device. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-03test-genzeros: allow more than 2G zeros in WindowsLibravatar Carlo Marcelo Arenas Belón1-2/+2
d5cfd142ec (tests: teach the test-tool to generate NUL bytes and use it, 2019-02-14), add a way to generate zeroes in a portable way without using /dev/zero (needed by HP NonStop), but uses a long variable that is limited to 2^31 in Windows. Use instead a (POSIX/C99) intmax_t that is at least 64bit wide in 64-bit Windows to use in a future test. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>