Commit Graph

456 Commits (24f09bd016fc7310bf6983b81f4730ab881554c2)
 

Author SHA1 Message Date
Stefano Rivera 24f09bd016 Share implementation with github workflow
Signed-off-by: Stefano Rivera <stefano@rivera.za.net>
3 years ago
Stefano Rivera 26e3f8c52e Exit non-zero if the tests fail
Signed-off-by: Stefano Rivera <stefano@rivera.za.net>
3 years ago
Stefano Rivera 34b26f4026 Break the actual test runner into its own script
So that we don't have to run with coverage if we don't want to.

Additionally, don't require being in the t directory to run tests

Signed-off-by: Stefano Rivera <stefano@rivera.za.net>
3 years ago
Elijah Newren 75e67bcd44 git-filter-repo.txt: link to GitHub docs on purging old history
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren 12743def48 git-filter-repo.txt: add some clarifications around replace refs
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren 8683d6fe48 Merge branch 'js/windows-fixes'
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Johannes Schindelin fbaab1704c lint-history: do decode bytes
This fixes the "TypeError: a bytes-like object is required, not 'str'"
problem on Windows, letting t9391 pass.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
3 years ago
Johannes Schindelin e0a3df8c62 Fix the Python path on Windows
On Windows, we want to run with a native Python, i.e. the separator is a
semicolon, and the paths should be Windows paths (although they're
allowed to have forward slashes instead of backslashes).

Since we're most likely running this in an MSYS2 Bash, allow for
`$TEST_DIRECTORY` to pretend to be a Unix path, and translate it via
`cygpath` into a Windows path.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
3 years ago
Elijah Newren 3f181531df README.md: link to external formatting of user manual
Some people don't like htmlpreview.github.io.  I once or twice saw a
case where it appeared to be affected by load limits.  Since external
sites are making the manual available, and it's unlikely there are too
many changes between the last release and the current manual, just link
to it as an alternative for folks.

Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren d2fdc89ff3 filter-repo: avoid depending on `wc` binary being present
rev-list already has --count option anyway, so piping output to wc -l to
count the number of lines was a total waste of time.  Plus, it might
cause failures for the testsuite on some Windows boxes.

Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren cf67ccd978 filter-repo: improve invalid repository error message
Even though the repository is encoded as a bytestring, we want error
messages to be UTF-8.

Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren 7500fb7c5a t9390: add a testcase for --path-rename with no colon
Commit 28b479b7 (Fix bug in --path-rename argument without colon,
2021-03-12) added a new conditional error message, with no corresponding
testcase to ensure the line was covered.  I forgot to check the coverage
before merging the change.  Add a relevant test now.

Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren 97a1613f81 lint-history: fix binary blob detection
We had a lingering issue in the conversion from python2 to python3; as
reported by @thebrandre on GitHub:

    any(x==b'1' for x in b"123")
    # returns True in Python2 and False in Python3 because different
    # types are returned on iteration:
    [type(x) for x in b"123"]
    # Python2: [<type 'str'>, <type 'str'>, <type 'str'>]
    # Python3: [<class 'int'>, <class 'int'>, <class 'int'>]

Replace the
    any(x==b"0" for x in blob.data[0:8192])
construct with
    b"\0" in blob.data[0:8192]
to fix this.

Suggested-by: @thebrandre on GitHub
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
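The Python 2 versus Python 3 difference described in 97a1613f81 can be reproduced with a minimal sketch. The function names `looks_binary_buggy` and `looks_binary` are hypothetical and are not taken from lint-history. They only contrast the element-wise comparison with the membership test the commit switches to.

```python
# Illustrative only: these helper names are hypothetical, not lint-history code.

def looks_binary_buggy(data: bytes) -> bool:
    # In Python 3, iterating over bytes yields ints, and an int never
    # compares equal to a bytes literal, so this is always False.
    return any(x == b"\0" for x in data[0:8192])


def looks_binary(data: bytes) -> bool:
    # Membership on a bytes object is a substring search, so a NUL byte
    # anywhere in the first 8192 bytes is found.
    return b"\0" in data[0:8192]


sample = b"abc\0def"
print(looks_binary_buggy(sample))  # False
print(looks_binary(sample))        # True
```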
Elijah Newren cf84943982 Merge branch 'lk/path-rename-colon-count'
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Lassi Kortela 28b479b79d Fix bug in --path-rename argument without colon
The --path-rename flag expected an argument with a colon
character (':') in it, which it assumed without checking. If the user
gave an argument with no colon in it, this backtrace would be shown:

  File "/usr/local/bin/git-filter-repo", line 1626, in __call__
    if values[0] and values[1] and not (
IndexError: list index out of range

Add a real error message in place of the backtrace.

Also check that there's exactly one colon; show an error message if
there's more than one, as that syntax has no interpretation that is
obviously the right one.

Signed-off-by: Lassi Kortela <lassi@lassi.io>
3 years ago
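The check described in 28b479b79d (require exactly one colon, report an error otherwise) might look roughly like the following sketch. The function name `parse_path_rename`, the use of `ValueError`, and the wording of the message are assumptions made for illustration; the actual git-filter-repo code and its error text may differ.

```python
from typing import Tuple


def parse_path_rename(value: bytes) -> Tuple[bytes, bytes]:
    # Hypothetical helper: reject arguments with zero colons or with more
    # than one colon instead of failing later with an IndexError.
    if value.count(b":") != 1:
        raise ValueError(
            "--path-rename expects exactly one ':' separating old and new names"
        )
    old, new = value.split(b":")
    return old, new


print(parse_path_rename(b"old/:new/"))
```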
Elijah Newren 4987e0f6e3 filter-repo: fix --use-mailmap
--use-mailmap was defined as `--mailmap .mailmap` except that it would
set args.mailmap to ".mailmap" rather than b".mailmap" (in other words,
it accidentally set it to a string rather than a bytestring).  Since
the --mailmap parameter is always passed as a bytestring, we ran into
errors with calling unknown functions due to the type mismatch.

Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
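The type mismatch described in 4987e0f6e3 can be illustrated with a small `argparse` sketch. The parser below is a simplified assumption, not the real git-filter-repo option table: it converts `--mailmap` values to bytes with `os.fsencode` and gives `--use-mailmap` a bytestring constant so both options store the same type.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical, reduced parser for illustration only.
    parser = argparse.ArgumentParser()
    # Explicit file name, converted from str to bytes.
    parser.add_argument("--mailmap", type=os.fsencode)
    # Shorthand for "--mailmap .mailmap"; the constant must already be
    # bytes, because `type` is not applied to `const`.
    parser.add_argument("--use-mailmap", dest="mailmap",
                        action="store_const", const=b".mailmap")
    return parser


args = build_parser().parse_args(["--use-mailmap"])
print(type(args.mailmap).__name__)
```

Writing `const=".mailmap"` instead would store a `str`, which is the kind of mismatch the commit message describes.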
Elijah Newren 407d15dd29 Merge pull request #167 from dscho/meaow
Add a GitHub workflow for continuous testing

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Johannes Schindelin d28b2a7346 Add a GitHub workflow to test this thing
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin d0dcece202 t9391: guard `dos2unix` use behind a prereq
Not all setups have `dos2unix`. Most notably, the Ubuntu and macOS
agents of GitHub Actions don't.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin 85afdf9da9 t9391: don't rely on the system gitconfig defining core.autocrlf=false
The test case t9391.12 specifically wants to test LF vs CR/LF line
ending issues, expecting `core.autoCRLF` to default to `false`. This is
true on Linux and macOS and pretty much everywhere else, except on
Windows.

Let's make sure that the test operates with the `core.autoCRLF` value it
assumes to operate under.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin fe79ec9912 t9390: work around yet another Unix<->Win32 path issue
On Windows, there is no absolute path `/fake/path`, but MSYS2 (which Git
for Windows uses e.g. for running Bash scripts) pretends that it exists.
This only works within MSYS2 applications, of course, so... when MSYS2
sees that we hand a parameter to a non-MSYS2 application in a shell
script, it helpfully converts it to the full path (prepending MSYS2's
pseudo root directory).

Let's work around that by using a Win32-compatible path to begin with:
`$(pwd)` produces that on Windows. On other platforms, it still works.

As a bonus, this safe-guards our test against a setup where `/fake/path`
_actually exists_. Stranger things have been seen in the wild, after
all.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin 848cd652f0 t9390: work around clash with MSYS2's Unix<->Win32 path conversion
MSYS2 tries to be very helpful, and in most cases it even works, by
converting parameters passed from inside an MSYS2 Bash to a non-MSYS2
application (such as `git.exe`) if they look like Unix-style paths or
path lists.

Sometimes, however, this automatic path conversion is unhelpful, e.g.
when passing the parameter `foo:.` to Git, which MSYS2 will readily
convert to a Windows-style path list: `foo;bar` (i.e. using a semicolon
instead of a colon).

Happily, there is a way to avoid that: the `MSYS_NO_PATHCONV` variable.
Let's use it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin 6967fad156 t9390: avoid using `colrm`
While it is true that `colrm` is available on macOS by default, and even
in Ubuntu (thanks to the `bsdmainutils` package), it is not available on
Windows.

Let's use `cut` instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin e6ffeded2e t9390: avoid using Bash-ism `<(...)`
The problem with this is that on Windows, we use the MSYS2 Bash which
uses the POSIX emulation layer called "MSYS2 runtime" that pretends that
there _is_ something like the `/dev/fd/` namespace, and tells `git.exe`
about it, but `git.exe` does not use the POSIX emulation layer, and
hence has no idea what Bash is talking about.

Besides, we should avoid pipes, just as we do in the Git project.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin 8bc195673c t9390: close link of broken &&-chain
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin f1ee28d78f t9390: expect the correct line count in `--strip-blobs-with-ids`
In that test case, we expect the line count to be 5, but it is actually
6 lines that we should expect:

	numbers/medium.num
	numbers/small.num
	sequence/know
	whatever
	words/know

Note the empty line at the top: this list is generated via `git log
--format=%n`, and that `%n` stands for "newline", meaning that we _must_
expect an empty line.

This expectation seems to have been broken already in the commit that
added the test case: b6a35f8 (filter-repo: implement
--strip-blobs-with-ids, 2019-05-30). It was hidden for such a long time
by a broken &&-chain, which we will fix next.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin 6c475a7e09 t9390: use the correct prereq when using "funny" file names
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin 580e0f0395 Test data and scripts must have Unix line endings
The tests will otherwise fail.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin 453128fff7 Ignore the generated Python cache
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Elijah Newren 74ea810872 INSTALL.md: add notes about common installation issues
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren 9282a33a02 git-filter-repo.txt: regexes & globs apply to entire file, not to lines
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren 93ee4ae907 Merge branch 'mw/empty-author-name' into main
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Martin Wilck 282f8ddb9b filter-repo: only set author from committer if author email not set
Some commits may have a valid author email, but no valid author name.
Old versions of git didn't enforce a non-empty name.
Setting the author data from the committer is wrong in this case.

Also add a test case for this to t9390.

Example: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c6295cdf656de63d6d1123def71daba6cd91939c

(en: replaced with a dedicated test instead of tweaking existing ones)

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
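The rule from 282f8ddb9b, falling back to the committer only when the author email is missing, can be sketched as follows. The `Ident` class and the `effective_author` function are hypothetical names introduced for this example and do not mirror git-filter-repo's internal data structures.

```python
from dataclasses import dataclass


@dataclass
class Ident:
    # Hypothetical container for a name/email pair.
    name: bytes
    email: bytes


def effective_author(author: Ident, committer: Ident) -> Ident:
    # Only a missing email triggers the fallback; an empty author name
    # with a valid email is left untouched.
    if not author.email:
        return committer
    return author


committer = Ident(b"C O Mitter", b"committer@example.com")
nameless = Ident(b"", b"author@example.com")
print(effective_author(nameless, committer).email)
```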
Elijah Newren 7eaaf191de filter-repo: correctly prune nested tags not matching filtering criteria
When the user specifies some kind of criteria to filter commits by (e.g.
--subdirectory-filter mysubdir), we rewrite parent commits that are
entirely filtered out to the most recent ancestor that still exists, or
just prune the parent if there isn't one.  That works great when the
parent is a commit, but nested tags have parents that are tags.  If we
only prune the first tag (i.e. the tag of a commit), then letting any
tags through that had that tag as a parent will result in a fast-import
crash with a message of the form

   fatal: mark :35390 not declared

Ensure that when a tag gets pruned, the pruning is recorded as such...so
that any children tags will get pruned as well.

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
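The idea in 7eaaf191de, recording a pruned tag so that tags pointing at it are pruned too, can be sketched in a simplified form. The function `surviving_tags` and its data layout (a list of `(mark, target_mark)` pairs in stream order) are assumptions for illustration; the real code works on fast-export objects and also handles rewriting to surviving ancestors, which this sketch omits.

```python
from typing import Iterable, List, Set, Tuple


def surviving_tags(tags: Iterable[Tuple[int, int]],
                   pruned_marks: Set[int]) -> List[int]:
    # Copy so the caller's set of pruned commit marks is not modified.
    pruned = set(pruned_marks)
    kept = []
    for mark, target in tags:
        if target in pruned:
            # Record the pruning so that nested tags referring to this
            # tag are dropped as well.
            pruned.add(mark)
        else:
            kept.append(mark)
    return kept


# Commit 1 was filtered out; tag 2 points at it, tag 3 points at tag 2,
# and tag 4 points at a commit (5) that still exists.
print(surviving_tags([(2, 1), (3, 2), (4, 5)], {1}))  # [4]
```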
Elijah Newren b1606ba8ac Merge branch 'mr/fix-filter-lamely-name-error' into main
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren f9a54f36d9 Merge branch 'tm/fix-typo' into main
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Marius Renner 70f83c2526 filter-lamely: fix NameError because of forgotten fr module prefix
In repositories with annotated tags filter-lamely crashes with the
message: "NameError: name 'Reset' is not defined".

This is because of a missing "fr" module prefix in the code, which this
commit adds.

Signed-off-by: Marius Renner <marius@mariusrenner.de>
4 years ago
Tom Matthews 96959d1174 converting-from-bfg-repo-cleaner.md: fix typo
Signed-off-by: Tom Matthews <trcm@pm.me>
4 years ago
Elijah Newren 7b3e714b94 filter-repo (README): remove outdated 2.28.0-not-yet-released comment
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren d79ea709b7 filter-repo: fix crash from assuming parent is an int
When filtering with --refs, parents can be a hash rather than an
integer.  There was a code path in RepoFilter._prunable() that was
written assuming the first parent would always be an integer; fix it to
handle a hash as well.

Reported-by: Niklas Hambüchen <mail@nh2.me>
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren 4b452da4ef Merge branch 'jb/ignore-generated-docs' into main
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren e4960a53f8 Fix undefined variable names
Reported-by: Christian Clauss <cclauss@me.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Jonas Bernoulli 8a8278701f .gitignore: ignore the generated documentation
Signed-off-by: Jonas Bernoulli <jonas@bernoul.li>
4 years ago
Elijah Newren ed6f410088 Contributing.md: link to Nicolai Hähnle's code review comments
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren cefeef1c0a filter-repo: use new --date-format=raw-permissive fast-import option
fast-import gained a new raw-permissive date format explicitly for
allowing people to import repositories as-is.  Make use of the flag, and
stop rewriting the bogus timezone found in rails.git.

If users do not like these bogus times, they can of course write a
filter to fix them (or even make them bogus in a different way).  For
example:

    git filter-repo ... --commit-callback '
      if commit.author_date.endswith(b"+051800"):
        # bytes.replace returns a new object, so assign the result back
        commit.author_date = commit.author_date.replace(b"+051800", b"+0261")
    '

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
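Because Python `bytes` objects are immutable, `replace` returns a new value rather than modifying the original, so a callback along the lines of the one in cefeef1c0a needs to assign the result back to `commit.author_date`. The standalone sketch below shows the same substitution outside of filter-repo. The function name `fix_author_date` and the sample timestamp are made up for illustration.

```python
def fix_author_date(author_date: bytes) -> bytes:
    # Hypothetical helper: swap one bogus timezone suffix for another and
    # return the new value, since bytes cannot be changed in place.
    if author_date.endswith(b"+051800"):
        return author_date.replace(b"+051800", b"+0261")
    return author_date


original = b"1234567890 +051800"  # made-up raw date for the example
print(fix_author_date(original))  # b'1234567890 +0261'
print(original)                   # unchanged: b'1234567890 +051800'
```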
Elijah Newren debe52000d contrib: rename no-op-example to barebones-example
"no-op" might suggest that it doesn't do anything, when in reality it
does exactly what filter-repo does.  Rename it to barebones-example.

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren 2f26e4bce5 INSTALL.md: wording clarification on what repology.org tracks
Homebrew and scoop are both package managers and package repositories.
Fedora 32 is not a package manager, but does map to a package
repository.  Clarify wording that the list from repology.org is a list
of package repositories, not package managers.

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren b74eb6b69d Merge branch 'jr/document-commit-and-ref-map' into main
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
James Ramsay f867bb6ad7 git-filter-repo.txt: document mapping output
Useful commit and reference mappings are created on every run. These are
helpful in a number of situations, and should be documented so that
end-users and Git hosts can understand how to use the output.

The commit-map is particularly useful for Git hosts to override
retention mechanisms, like hidden refs. This allows end-users to purge
large files and sensitive data.

Signed-off-by: James Ramsay <james@jramsay.com.au>
4 years ago
Elijah Newren 1e0c3ab3ae filter-repo: make fresh clone warning scarier
Apparently, despite the fact that *overwrite* *repo* *history* are three
important words that each individually convey a lot of important
meaning, people ignore it and instinctively add --force.  Insert the
word "destructively" to get people to pause.

Further, change the end of the warning: instead of explaining how to
get around the warning in the current repository, suggest operating on
a fresh clone, and only then mention as a side comment that the --force
flag can be used to override.

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago