summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-03-30bloom.c: core Bloom filter implementation for changed paths.Libravatar Garima Singh4-0/+172
Add the core implementation for computing Bloom filters for the paths changed between a commit and it's first parent. We fill the Bloom filters as (const char *data, int len) pairs as `struct bloom_filters" within a commit slab. Filters for commits with no changes and more than 512 changes, is represented with a filter of length zero. There is no gain in distinguishing between a computed filter of length zero for a commit with no changes, and an uncomputed filter for new commits or for commits with more than 512 changes. The effect on `git log -- path` is the same in both cases. We will fall back to the normal diffing algorithm when we can't benefit from the existence of Bloom filters. Helped-by: Jeff King <peff@peff.net> Helped-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Jakub Narębski <jnareb@gmail.com> Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-30bloom.c: introduce core Bloom filter constructsLibravatar Garima Singh4-1/+188
Introduce the constructs for Bloom filters, Bloom filter keys and Bloom filter settings. For details on what Bloom filters are and how they work, refer to Dr. Derrick Stolee's blog post [1]. It provides a concise explanation of the adoption of Bloom filters as described in [2] and [3]. Implementation specifics: 1. We currently use 7 and 10 for the number of hashes and the size of each entry respectively. They served as great starting values, the mathematical details behind this choice are described in [1] and [4]. The implementation, while not completely open to it at the moment, is flexible enough to allow for tweaking these settings in the future. Note: The performance gains we have observed with these values are significant enough that we did not need to tweak these settings. The performance numbers are included in the cover letter of this series and in the commit message of the subsequent commit where we use Bloom filters to speed up `git log -- path`. 2. As described in [1] and [3], we do not need 7 independent hashing functions. We use the Murmur3 hashing scheme, seed it twice and then combine those to procure an arbitrary number of hash values. 3. The filters will be sized according to the number of changes in each commit, in multiples of 8 bit words. [1] Derrick Stolee "Supercharging the Git Commit Graph IV: Bloom Filters" https://devblogs.microsoft.com/devops/super-charging-the-git-commit-graph-iv-Bloom-filters/ [2] Flavio Bonomi, Michael Mitzenmacher, Rina Panigrahy, Sushil Singh, George Varghese "An Improved Construction for Counting Bloom Filters" http://theory.stanford.edu/~rinap/papers/esa2006b.pdf https://doi.org/10.1007/11841036_61 [3] Peter C. Dillinger and Panagiotis Manolios "Bloom Filters in Probabilistic Verification" http://www.ccs.neu.edu/home/pete/pub/Bloom-filters-verification.pdf https://doi.org/10.1007/978-3-540-30494-4_26 [4] Thomas Mueller Graf, Daniel Lemire "Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters" https://arxiv.org/abs/1912.08258 Helped-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Jakub Narębski <jnareb@gmail.com> Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-30bloom.c: add the murmur3 hash implementationLibravatar Garima Singh7-0/+133
In preparation for computing changed paths Bloom filters, implement the Murmur3 hash algorithm as described in [1]. It hashes the given data using the given seed and produces a uniformly distributed hash value. [1] https://en.wikipedia.org/wiki/MurmurHash#Algorithm Helped-by: Derrick Stolee <dstolee@microsoft.com> Helped-by: Szeder Gábor <szeder.dev@gmail.com> Reviewed-by: Jakub Narębski <jnareb@gmail.com> Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-30commit-graph: define and use MAX_NUM_CHUNKSLibravatar Garima Singh1-2/+3
This is a minor cleanup to make it easier to change the number of chunks being written to the commit graph. Reviewed-by: Jakub Narębski <jnareb@gmail.com> Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-29Merge branch 'ds/default-pack-use-sparse-to-true'Libravatar Junio C Hamano7-16/+18
The 'pack.useSparse' configuration variable now defaults to 'true', enabling an optimization that has been experimental since Git 2.21. * ds/default-pack-use-sparse-to-true: pack-objects: flip the use of GIT_TEST_PACK_SPARSE config: set pack.useSparse=true by default
2020-03-26The second batch post 2.26 cycleLibravatar Junio C Hamano1-0/+53
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-26Merge branch 'ah/force-pull-rebase-configuration'Libravatar Junio C Hamano3-11/+65
"git pull" learned to warn when no pull.rebase configuration exists, and neither --[no-]rebase nor --ff-only is given (which would result a merge). * ah/force-pull-rebase-configuration: pull: warn if the user didn't say whether to rebase or to merge
2020-03-26Merge branch 'tg/retire-scripted-stash'Libravatar Junio C Hamano8-862/+31
"git stash" has kept an escape hatch to use the scripted version for a few releases, which got stale. It has been removed. * tg/retire-scripted-stash: stash: remove the stash.useBuiltin setting stash: get git_stash_config at the top level
2020-03-26Merge branch 'jc/describe-misnamed-annotated-tag'Libravatar Junio C Hamano2-7/+28
When "git describe C" finds an annotated tag with tagname A to be the best name to explain commit C, and the tag is stored in a "wrong" place in the refs/tags hierarchy, e.g. refs/tags/B, the command gave a warning message but used A (not B) to describe C. If C is exactly at the tag, the describe output would be "A", but "git rev-parse A^0" would not be equal as "git rev-parse C^0". The behavior of the command has been changed to use the "long" form i.e. A-0-gOBJECTNAME, which is correctly interpreted by rev-parse. * jc/describe-misnamed-annotated-tag: describe: force long format for a name based on a mislocated tag
2020-03-26Merge branch 'at/rebase-fork-point-regression-fix'Libravatar Junio C Hamano3-13/+34
The "--fork-point" mode of "git rebase" regressed when the command was rewritten in C back in 2.20 era, which has been corrected. * at/rebase-fork-point-regression-fix: rebase: --fork-point regression fix
2020-03-26Merge branch 'bc/filter-process'Libravatar Junio C Hamano20-75/+350
Provide more information (e.g. the object of the tree-ish in which the blob being converted appears, in addition to its path, which has already been given) to smudge/clean conversion filters. * bc/filter-process: t0021: test filter metadata for additional cases builtin/reset: compute checkout metadata for reset builtin/rebase: compute checkout metadata for rebases builtin/clone: compute checkout metadata for clones builtin/checkout: compute checkout metadata for checkouts convert: provide additional metadata to filters convert: permit passing additional metadata to filter processes builtin/checkout: pass branch info down to checkout_worktree
2020-03-26Merge branch 'hi/gpg-prefer-check-signature'Libravatar Junio C Hamano6-78/+201
The code to interface with GnuPG has been refactored. * hi/gpg-prefer-check-signature: gpg-interface: prefer check_signature() for GPG verification t: increase test coverage of signature verification output
2020-03-26Merge branch 'bc/sha-256-part-1-of-4'Libravatar Junio C Hamano28-141/+623
SHA-256 transition continues. * bc/sha-256-part-1-of-4: (22 commits) fast-import: add options for rewriting submodules fast-import: add a generic function to iterate over marks fast-import: make find_marks work on any mark set fast-import: add helper function for inserting mark object entries fast-import: permit reading multiple marks files commit: use expected signature header for SHA-256 worktree: allow repository version 1 init-db: move writing repo version into a function builtin/init-db: add environment variable for new repo hash builtin/init-db: allow specifying hash algorithm on command line setup: allow check_repository_format to read repository format t/helper: make repository tests hash independent t/helper: initialize repository if necessary t/helper/test-dump-split-index: initialize git repository t6300: make hash algorithm independent t6300: abstract away SHA-1-specific constants t: use hash-specific lookup tables to define test constants repository: require a build flag to use SHA-256 hex: add functions to parse hex object IDs in any algorithm hex: introduce parsing variants taking hash algorithms ...
2020-03-26Merge branch 'pb/recurse-submodules-fix'Libravatar Junio C Hamano3-25/+51
Fix "git checkout --recurse-submodules" of a nested submodule hierarchy. * pb/recurse-submodules-fix: t/lib-submodule-update: add test removing nested submodules unpack-trees: check for missing submodule directory in merged_entry unpack-trees: remove outdated description for verify_clean_submodule t/lib-submodule-update: move a test to the right section t/lib-submodule-update: remove outdated test description t7112: remove mention of KNOWN_FAILURE_SUBMODULE_RECURSIVE_NESTED
2020-03-25The first batch post 2.26 cycleLibravatar Junio C Hamano3-2/+43
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-25Merge branch 'ss/submodule-foreach-cb'Libravatar Junio C Hamano1-4/+4
Code clean-up. * ss/submodule-foreach-cb: submodule--helper.c: Rename 'cb_foreach' to 'foreach_cb'
2020-03-25Merge branch 'jc/config-tar'Libravatar Junio C Hamano3-7/+8
Improve the structure of the documentation source a bit. * jc/config-tar: separate tar.* config to its own source file
2020-03-25Merge branch 'en/oidset-uninclude-hashmap'Libravatar Junio C Hamano1-1/+0
Code clean-up. * en/oidset-uninclude-hashmap: oidset: remove unnecessary include
2020-03-25Merge branch 'ds/check-connected-reprepare-packed-git'Libravatar Junio C Hamano1-0/+4
Corner case "git fetch" fix. * ds/check-connected-reprepare-packed-git: connected.c: reprepare packs for corner cases
2020-03-25Merge branch 'rs/doc-passthru-fetch-options'Libravatar Junio C Hamano1-3/+7
Doc update. * rs/doc-passthru-fetch-options: pull: document more passthru options
2020-03-25Merge branch 'pw/advise-rebase-skip'Libravatar Junio C Hamano8-46/+233
The mechanism to prevent "git commit" from making an empty commit or amending during an interrupted cherry-pick was broken during the rewrite of "git rebase" in C, which has been corrected. * pw/advise-rebase-skip: commit: give correct advice for empty commit during a rebase commit: encapsulate determine_whence() for sequencer commit: use enum value for multiple cherry-picks sequencer: write CHERRY_PICK_HEAD for reword and edit cherry-pick: check commit error messages cherry-pick: add test for `--skip` advice in `git commit` t3404: use test_cmp_rev
2020-03-25Merge branch 'yz/p4-py3'Libravatar Junio C Hamano2-95/+146
Update "git p4" to work with Python 3. * yz/p4-py3: ci: use python3 in linux-gcc and osx-gcc and python2 elsewhere git-p4: use python3's input() everywhere git-p4: simplify regex pattern generation for parsing diff-tree git-p4: use dict.items() iteration for python3 compatibility git-p4: use functools.reduce instead of reduce git-p4: fix freezing while waiting for fast-import progress git-p4: use marshal format version 2 when sending to p4 git-p4: open .gitp4-usercache.txt in text mode git-p4: convert path to unicode before processing them git-p4: encode/decode communication with git for python3 git-p4: encode/decode communication with p4 for python3 git-p4: remove string type aliasing git-p4: change the expansion test from basestring to list git-p4: make python2.7 the oldest supported version
2020-03-25Merge branch 'am/real-path-fix'Libravatar Junio C Hamano16-75/+107
The real_path() convenience function can easily be misused; with a bit of code refactoring in the callers' side, its use has been eliminated. * am/real-path-fix: get_superproject_working_tree(): return strbuf real_path_if_valid(): remove unsafe API real_path: remove unsafe API set_git_dir: fix crash when used with real_path()
2020-03-25Merge branch 'sg/commit-slab-clarify-peek'Libravatar Junio C Hamano1-1/+6
In-code comment update. * sg/commit-slab-clarify-peek: commit-slab: clarify slabname##_peek()'s return value
2020-03-25Merge branch 'jc/maintain-doc'Libravatar Junio C Hamano1-13/+39
Doc update. * jc/maintain-doc: update how-to-maintain-git
2020-03-25Merge branch 'js/https-proxy-config'Libravatar Junio C Hamano2-5/+90
A handful of options to configure SSL when talking to proxies have been added. * js/https-proxy-config: http: add environment variable support for HTTPS proxies http: add client cert support for HTTPS proxies
2020-03-25Merge branch 'hw/advise-ng'Libravatar Junio C Hamano9-12/+200
Revamping of the advise API to allow more systematic enumeration of advice knobs in the future. * hw/advise-ng: tag: use new advice API to check visibility advice: revamp advise API advice: change "setupStreamFailure" to "setUpstreamFailure" advice: extract vadvise() from advise()
2020-03-22Git 2.26Libravatar Junio C Hamano1-1/+1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-21Merge branch 'en/rebase-backend'Libravatar Junio C Hamano1-1/+1
Test fix. * en/rebase-backend: t3419: prevent failure when run with EXPENSIVE
2020-03-21Merge tag 'l10n-2.26.0-rnd2.1' of git://github.com/git-l10n/git-po.gitLibravatar Junio C Hamano13-26190/+53932
l10n-2.26.0-rnd2.1 * tag 'l10n-2.26.0-rnd2.1' of https://github.com/git-l10n/git-po: (28 commits) l10n: tr.po: change file mode to 644 l10n: de.po: Update German translation for Git 2.26.0 l10n: de.po: add missing space l10n: tr: Fix a couple of ambiguities l10n: Update Catalan translation l10n: sv.po: Update Swedish translation (4839t0f0u) l10n: zh_CN: Revise v2.26.0 translation l10n: zh_CN: for git v2.26.0 l10n round 1 and 2 l10n: vi(4839t): Updated Vietnamese translation for v2.26.0 l10n: vi: fix translation + grammar l10n: zh_TW.po: v2.26.0 round 2 (0 untranslated) l10n: zh_TW.po: v2.26.0 round 1 (11 untranslated) l10n: it.po: update the Italian translation for Git 2.26.0 round 2 l10n: es: 2.26.0 round#2 l10n: bg.po: Updated Bulgarian translation (4839t) l10n: tr: v2.26.0 round 2 l10n: fr : v2.26.0 rnd 2 l10n: git.pot: v2.26.0 round 2 (7 new, 2 removed) l10n: tr: Add glossary for Turkish translations l10n: sv.po: Update Swedish translation (4835t0f0u) ...
2020-03-21l10n: tr.po: change file mode to 644Libravatar Jiang Xin1-0/+0
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2020-03-20t3419: prevent failure when run with EXPENSIVELibravatar brian m. carlson1-1/+1
This test runs a function which itself runs several assertions. The last of these assertions cleans up the .git/rebase-apply directory, since when run with EXPENSIVE set, the function is invoked a second time to run the same tests with a larger data set. However, as of 2ac0d6273f ("rebase: change the default backend from "am" to "merge"", 2020-02-15), the default backend of rebase has changed, and cleaning up the rebase-apply directory has no effect: it no longer exists, since we're using rebase-merge instead. Since we don't really care which rebase backend is in use, let's just use the command "git rebase --quit", which will do the right thing regardless. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-20pack-objects: flip the use of GIT_TEST_PACK_SPARSELibravatar Derrick Stolee3-5/+6
The environment variable GIT_TEST_PACK_SPARSE was previously used to allow testing the --sparse option for "git pack-objects" in the test suite. This allowed interesting cases of "git push" to also test this algorithm. Since pack.useSparse is now true by default, we do not need this variable to _enable_ the --sparse option, but instead to _disable_ it. This flips how we work with the variable a bit. When checking for the variable, default to a value of -1 for "unset". If unset, then take the default from the repo settings, which is currently 1. Then, the --[no-]sparse command-line option will override either of these settings. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-20config: set pack.useSparse=true by defaultLibravatar Derrick Stolee5-11/+12
The pack.useSparse config option was introduced by 3d036eb0 (pack-objects: create pack.useSparse setting, 2019-01-19) and was first available in v2.21.0. When enabled, the pack-objects process during 'git push' will use a sparse tree walk when deciding which trees and blobs to send to the remote. The algorithm was introduced by d5d2e93 (revision: implement sparse algorithm, 2019-01-16) and has been in production use by VFS for Git since around that time. The features.experimental config option also enabled pack.useSparse, so hopefully that has also increased exposure. It is worth noting that pack.useSparse has a possibility of sending more objects across a push, but requires a special arrangement of exact _copies_ across directories. There is a test in t5322-pack-objects-sparse.sh that demonstrates this possibility. This test uses the --sparse option to "git pack-objects" but we can make it implied by the config value to demonstrate that the default value has changed. While updating that test, I noticed that the documentation did not include an option for --no-sparse, which is now more important than it was before. Since the downside is unlikely but the upside is significant, set the default value of pack.useSparse to true. Remove it from the set of options implied by features.experimental. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-20l10n: de.po: Update German translation for Git 2.26.0Libravatar Matthias Rüster1-2726/+2906
Signed-off-by: Matthias Rüster <matthias.ruester@gmail.com> Reviewed-by: Ralf Thielow <ralf.thielow@gmail.com> Reviewed-by: Phillip Szelat <phillip.szelat@gmail.com>
2020-03-20l10n: de.po: add missing spaceLibravatar Ralf Thielow1-1/+1
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
2020-03-19Merge https://github.com/prati0100/git-guiLibravatar Junio C Hamano10-2793/+3854
* 'master' of https://github.com/prati0100/git-gui: git-gui: create a new namespace for chord script evaluation git-gui: reduce Tcl version requirement from 8.6 to 8.5 git-gui--askpass: coerce answers to UTF-8 on Windows git-gui: fix error popup when doing blame -> "Show History Context" git-gui: add missing close bracket git-gui: update German translation git-gui: extend translation glossary template with more terms git-gui: update pot template and German translation to current source code
2020-03-20l10n: tr: Fix a couple of ambiguitiesLibravatar Emir Sarı1-9/+9
Signed-off-by: Emir Sarı <bitigchi@me.com>
2020-03-19Merge branch 'py/remove-tcloo'Libravatar Pratyush Yadav3-35/+35
Reduce the Tcl version requirement to 8.5 to allow git-gui to run on MacOS distributions like High Sierra. While here, fix a potential variable name collision. * py/remove-tcloo: git-gui: create a new namespace for chord script evaluation git-gui: reduce Tcl version requirement from 8.6 to 8.5
2020-03-18RelNotes/2.26.0: fix various typosLibravatar Elijah Newren1-4/+4
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-18submodule--helper.c: Rename 'cb_foreach' to 'foreach_cb'Libravatar Shourya Shukla1-4/+4
In 'submodule--helper.c', the structures and macros for callbacks belonging to any subcommand are named in the format: 'subcommand_cb' and 'SUBCOMMAND_CB_INIT' respectively. This was an exception for the subcommand 'foreach' of the command 'submodule'. Rename the aforementioned structures and macros: 'struct cb_foreach' to 'struct foreach_cb' and 'CB_FOREACH_INIT' to 'FOREACH_CB_INIT'. Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-18separate tar.* config to its own source fileLibravatar Junio C Hamano3-7/+8
Even though there is only one configuration variable in the namespace, it is not quite right to have tar.umask described among the variables for tag.* namespace. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-18l10n: Update Catalan translationLibravatar Jordi Mas1-80/+71
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2020-03-17Sync with Git 2.25.2Libravatar Junio C Hamano3-40/+88
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-17Git 2.25.2Libravatar Junio C Hamano3-2/+62
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-17unicode: update the width tables to Unicode 13.0Libravatar Beat Bolli1-16/+27
Now that Unicode 13.0 has been announced[0], update the character width tables to the new version. [0] https://home.unicode.org/announcing-the-unicode-standard-version-13-0/ Signed-off-by: Beat Bolli <dev+git@drbeat.li> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-17Merge branch 'js/ci-windows-update' into maintLibravatar Junio C Hamano10-73/+93
Updates to the CI settings. * js/ci-windows-update: Azure Pipeline: switch to the latest agent pools ci: prevent `perforce` from being quarantined t/lib-httpd: avoid using macOS' sed
2020-03-17Merge branch 'jk/run-command-formatfix' into maintLibravatar Junio C Hamano1-1/+1
Code style cleanup. * jk/run-command-formatfix: run-command.h: fix mis-indented struct member
2020-03-17Merge branch 'jk/doc-credential-helper' into maintLibravatar Junio C Hamano2-91/+88
Docfix. * jk/doc-credential-helper: doc: move credential helper info into gitcredentials(7)
2020-03-17Merge branch 'js/mingw-open-in-gdb' into maintLibravatar Junio C Hamano2-0/+23
Dev support. * js/mingw-open-in-gdb: mingw: add a helper function to attach GDB to the current process