The commit order from fast-export --first-parent changed in git 2.35;
see 726a228dfb.
This change will break the same tests on older git releases.
Fixes: #344
Signed-off-by: Stefano Rivera <stefano@rivera.za.net>
The wording "exact paths" does not appear to be clear enough for folks: I
keep getting bug reports about filter-repo not following renames. Make it
very explicit.
Signed-off-by: Elijah Newren <newren@gmail.com>
--analyze is hardcoded to write to a subdirectory inside GIT_DIR.
When practicing filtering runs on a large repo, it is desirable to keep
an unchanged read-only copy to reduce the chance of user error, and to be
able to analyze that read-only repo without having to clone it. This
saves a lot of time and space.
Add a --report-dir option to set a non-default destination directory for
the analysis output.
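A minimal sketch of how the destination could be chosen. The helper name
`analysis_dir` and the default subdirectory `filter-repo/analysis` are
assumptions for illustration and are not taken from this commit:

```python
import os

def analysis_dir(git_dir, report_dir=None):
    """Return the directory that analysis output should be written to.

    Hypothetical helper: `report_dir` stands in for the value passed to
    --report-dir, and the default location is assumed for illustration.
    """
    if report_dir is not None:
        # An explicit destination lets the repository itself stay read-only.
        return os.path.abspath(report_dir)
    return os.path.join(git_dir, "filter-repo", "analysis")
```

With no `report_dir`, the result stays inside `git_dir`; with one, nothing
needs to be written to the repository at all.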
Signed-off-by: rndbit <rndbit@filter.bitman.net>
[en: fixed existing regression test broken by now not overwriting the
analysis directory unconditionally, and also added a new test of
the new behavior for code coverage.]
Signed-off-by: Elijah Newren <newren@gmail.com>
--replace-text failed to detect binary blobs and was incorrectly
applied to all blobs.
Prior to the switch from python2 to python3, it incorrectly treated blobs
containing a 0 character, rather than a NUL byte, as binary. As a result,
text replacements were applied to binary files and were not applied to
text files containing a 0 character.
Add regression tests with blobs containing: a 0 character, a NUL byte, and
both a 0 character and a NUL byte.
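The three cases can be sketched as follows. `looks_binary` is a
hypothetical helper written for this illustration, and the 8192-byte
window is an assumption here; this is not the regression test itself:

```python
def looks_binary(data: bytes, window: int = 8192) -> bool:
    """Treat a blob as binary if a NUL byte occurs in its first `window` bytes."""
    return b"\0" in data[:window]

# The character "0" (0x30) is ordinary text; only the NUL byte (0x00) counts.
cases = {
    b"version 1.0\n": False,      # 0 character only
    b"\x89PNG\0\0": True,         # NUL byte only
    b"size=10\0payload": True,    # both 0 character and NUL byte
}
for blob, expected in cases.items():
    assert looks_binary(blob) is expected
```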
Signed-off-by: rndbit <rndbit@filter.bitman.net>
Detection of whether a blob is binary for the purpose of --replace-text
always fails, so text replacement is applied to all blobs. This behavior
changed with the move to python3. With python2 the same code would still
be wrong but would manifest differently.
In the construct 'for x in b"..."' the x is
- of type <int> in python3
- of type <str> in python2
thus in python3 the condition 'x == b"\0"' cannot be true for any x due
to the type difference.
Further, the search was supposed to look for a NUL byte and not a 0
character, thus change to b"\0" instead of b"0".
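The type difference is easy to reproduce in a Python 3 interpreter. The
variable names below are only illustrative:

```python
data = b"a\0b"

# Iterating over bytes yields ints in Python 3, so comparing an element
# with a bytes object is always False.
elementwise = any(x == b"\0" for x in data)

# A containment test compares byte sequences and finds the NUL byte.
containment = b"\0" in data

# Comparing each element against the integer value 0 works as well.
by_value = any(x == 0 for x in data)
```

Only the last two forms detect the NUL byte; the first one silently never
matches.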
Signed-off-by: rndbit <rndbit@filter.bitman.net>
Like --replace-text, add an option --replace-message which replaces text
in commit/tag message bodies, so that users can easily replace text
without constructing a --message-callback.
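For comparison, a rough sketch of the kind of logic a hand-written message
callback would need. The replacement pairs and the name
`replace_in_message` are made up for illustration and are not part of
filter-repo:

```python
# Hypothetical replacement table: bytes to find -> bytes to substitute.
REPLACEMENTS = {
    b"hunter2": b"***REMOVED***",
    b"old-project": b"new-project",
}

def replace_in_message(message: bytes) -> bytes:
    """Apply every literal replacement to a commit or tag message."""
    for old, new in REPLACEMENTS.items():
        message = message.replace(old, new)
    return message
```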
Signed-off-by: Gwyneth Morgan <gwymor@tilde.club>
Signed-off-by: Elijah Newren <newren@gmail.com>
When git-filter-repo is installed, sys.argv[0] will be an entry-point
stub, not the relevant Python module.
Signed-off-by: Stefano Rivera <stefano@rivera.za.net>
Someone was surprised by my claim that someone else had reported that
Microsoft provided a stub or stripped-down python. Link to where it was
reported in case others hit the same problem.
Vilius Šumskas reported that the need to edit the shebang line has been
corrected with the newest Git for Windows, so update the text to note
this. It's possible other users may still have problems given the
variety of Windows versions and the number of reports I had about this,
so I want to still leave links there for at least a little while.
Be more explicit about how pip is lame and provides virtually no benefit
since it leaves you to fix your $PATH yourself, which was the only step
that was needed in installing the whole package anyway.
Signed-off-by: Elijah Newren <newren@gmail.com>
This should make the installation via pip more robust.
On Windows the usage of entry_points will install a wrapper executable
for the script that chooses the proper python executable. This
essentially makes the script run correctly when called via `git
filter-repo` (direct execution via `git-filter-repo` was already fine
before).
This fixes an issue on Windows, where the git-installation will choose a
different python executable than the one indicated by the installation
via `pip{x,3} install`.
Signed-off-by: Benjamin Motz <benjamin.motz@mailbox.org>
It appears that python will usually write out files even if we do not
explicitly close them, but other tweaks to the code can make this not
happen. Explicitly close the files to be safe.
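One way to guarantee the close is a `with` block, sketched here with a
hypothetical `write_report` helper; the commit itself may simply call
close() on the existing file objects:

```python
import os
import tempfile

def write_report(path, lines):
    """Write each line to `path`; the with-block flushes and closes the file."""
    with open(path, "wb") as f:
        for line in lines:
            f.write(line + b"\n")
    # The file is closed here, even if an exception was raised in the block.
    return f.closed

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "report.txt")
    closed = write_report(target, [b"a", b"b"])
    with open(target, "rb") as f:
        contents = f.read()
```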
Signed-off-by: Elijah Newren <newren@gmail.com>
The "Filtering based on many paths" section includes this code snippet,
```
regex:^.*/.*/[0-9]{4}-[0-9]{2}-[0-9]{2}.txt$
```
and this text
```
files whose name
was of the form YYYY.MM-DD.txt at least two subdirectories deep
```
Update the text to YYYY-MM-DD.txt to correctly match the regex
in the code snippet.
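The mismatch can be checked directly. Python's `re` module is used here
only as an illustration, and the sample paths are invented:

```python
import re

# Pattern copied from the snippet above. Note that the unescaped "." before
# "txt" matches any character, not only a literal dot.
pattern = re.compile(rb"^.*/.*/[0-9]{4}-[0-9]{2}-[0-9]{2}.txt$")

matches = bool(pattern.match(b"docs/notes/2021-03-12.txt"))      # YYYY-MM-DD
too_shallow = bool(pattern.match(b"notes/2021-03-12.txt"))       # one directory
old_wording = bool(pattern.match(b"docs/notes/2021.03-12.txt"))  # YYYY.MM-DD
```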
Signed-off-by: Cody Martin <codytylermartin@gmail.com>
`git shortlog` ignores case when matching email addresses. As
such, `git filter-repo` should do the same.
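A minimal sketch of case-insensitive lookup, assuming a plain dict keyed
by email address. `build_map` and `lookup` are hypothetical names, not
filter-repo's mailmap code:

```python
def build_map(entries):
    """Store each (old_email, new_email) pair under a lowercased key."""
    return {old.lower(): new for old, new in entries}

def lookup(mailmap, email: bytes) -> bytes:
    """Return the mapped address, or the original one if there is no entry."""
    return mailmap.get(email.lower(), email)

mailmap = build_map([(b"Jane.Doe@Example.com", b"jane@example.org")])
```

Lowercasing on both insertion and lookup means that neither the mailmap
file nor the commit metadata has to use a particular case.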
Signed-off-by: Stefan Lietzau <lietzaustefan@gmail.com>
[en: fixed a small logic error, tweaked the commit message, and rebased]
Signed-off-by: Elijah Newren <newren@gmail.com>
Dscho made fixes to msys2, cygwin, git-for-windows, and contributed
several improvements to git-filter-repo that were merged in
js/windows-fixes. Reference some of the fixes so that those who had
issues with git-filter-repo in the past may be willing to retry, and
update the installation instructions with relevant pointers.
Signed-off-by: Elijah Newren <newren@gmail.com>
For anything more complicated than a few lines, it's easier to write the
callback body in a file and let filter-repo load the file as a string.
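A sketch of the idea, under the assumption that an argument naming an
existing file is treated as the body's location. `load_body` and
`make_callback` are hypothetical helpers, not filter-repo's internals:

```python
import os
import tempfile
import textwrap

def load_body(body_or_path: str) -> str:
    """If the argument names an existing file, use its contents as the body."""
    if os.path.isfile(body_or_path):
        with open(body_or_path) as f:
            return f.read()
    return body_or_path

def make_callback(arg_name: str, body: str):
    """Wrap a callback body in a function definition and return the function."""
    source = "def callback({}):\n{}".format(arg_name, textwrap.indent(body, "  "))
    namespace = {}
    exec(source, namespace)
    return namespace["callback"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "rename.py")
    with open(path, "w") as f:
        f.write("if filename.endswith(b'.txt'):\n"
                "  return filename[:-4] + b'.md'\n"
                "return filename\n")
    callback = make_callback("filename", load_body(path))
```

A multi-line body like this one is awkward to quote on a command line but
reads naturally in a file.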
Signed-off-by: Shezan Baig <sbaig1@bloomberg.net>
[en: added a testcase for code coverage]
Signed-off-by: Elijah Newren <newren@gmail.com>
So that we don't have to run with coverage if we don't want to.
Additionally, don't require being in the t directory to run tests.
Signed-off-by: Stefano Rivera <stefano@rivera.za.net>
Some users may want to take advantage of setting TMPDIR to another
location that might be faster for the linting process.
Reported-by: @ruv on GitHub
Signed-off-by: Elijah Newren <newren@gmail.com>
It was not clear to some users that the filenames would be relative
paths from the toplevel of the repository. Add some text to explain
this.
Reported-by: @ruv on GitHub
Signed-off-by: Elijah Newren <newren@gmail.com>
Users could specify relative paths on the command line, and then also
provide a directory other than '.' for the repo. Since we did an
unconditional os.chdir() to move into the repo, that would invalidate
the original relative paths. Fix that by changing the relative paths
into absolute paths.
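The fix amounts to resolving paths before the working directory changes.
A minimal sketch, with a hypothetical `absolutize` helper:

```python
import os

def absolutize(paths, start=None):
    """Resolve user-supplied paths against the directory they were given in."""
    start = start or os.getcwd()
    # os.path.join leaves paths that are already absolute untouched.
    return [os.path.normpath(os.path.join(start, p)) for p in paths]

# Intended order of operations (the names here are illustrative):
#   files = absolutize(files_from_command_line)
#   os.chdir(repo_dir)          # the resolved paths are still valid afterwards
```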
Signed-off-by: 林博仁(Buo-ren Lin) <Buo.Ren.Lin@gmail.com>
[en: tweaked commit message to explain the problem]
Signed-off-by: Elijah Newren <newren@gmail.com>
This fixes the "TypeError: a bytes-like object is required, not 'str'"
problem on Windows, letting t9391 pass.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows, we want to run with a native Python, i.e. the path separator
is a semicolon, and the paths should be Windows paths (although they're
allowed to have forward slashes instead of backslashes).
Since we're most likely running this in an MSYS2 Bash, allow for
`$TEST_DIRECTORY` to pretend to be a Unix path, and translate it via
`cygpath` into a Windows path.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Some people don't like htmlpreview.github.io. I once or twice saw a
case where it appeared to be affected by load limits. Since external
sites are making the manual available, and it's unlikely there are too
many changes between the last release and the current manual, just link
to it as an alternative for folks.
Signed-off-by: Elijah Newren <newren@gmail.com>
rev-list already has a --count option, so piping its output to wc -l to
count the number of lines was a total waste of time. Plus, it might
cause failures for the testsuite on some Windows boxes.
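The same idea expressed in Python, as a sketch only: the affected code is
in the test suite, and both the `count_commits` helper and the use of
`--all` are assumptions for illustration:

```python
import subprocess

def count_commits(repo=".", run=subprocess.check_output):
    """Return the number of commits reachable from any ref in `repo`."""
    # --count makes git print a single number, so no line counting is needed.
    out = run(["git", "-C", repo, "rev-list", "--count", "--all"])
    return int(out.strip())
```

The `run` parameter exists only so the function can be exercised without a
real repository.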
Signed-off-by: Elijah Newren <newren@gmail.com>
Commit 28b479b7 (Fix bug in --path-rename argument without colon,
2021-03-12) added a new conditional error message, with no corresponding
testcase to ensure the line was covered. I forgot to check the coverage
before merging the change. Add a relevant test now.
Signed-off-by: Elijah Newren <newren@gmail.com>
We had a lingering issue in the conversion from python2 to python3; as
reported by @thebrandre on GitHub:
    any(x==b'1' for x in b"123")
    # returns True in Python2 and False in Python3 because different
    # types are returned on iteration:
    [type(x) for x in b"123"]
    # Python2: [<type 'str'>, <type 'str'>, <type 'str'>]
    # Python3: [<class 'int'>, <class 'int'>, <class 'int'>]
Replace the
    any(x==b"0" for x in blob.data[0:8192])
construct with
    b"\0" in blob.data[0:8192]
to fix this.
Suggested-by: @thebrandre on GitHub
Signed-off-by: Elijah Newren <newren@gmail.com>
The --path-rename flag expected an argument with a colon
character (':') in it, which it assumed without checking. If the user
gave an argument with no colon in it, this backtrace would be shown:
  File "/usr/local/bin/git-filter-repo", line 1626, in __call__
    if values[0] and values[1] and not (
IndexError: list index out of range
Add a real error message in place of the backtrace.
Also check that there's exactly one colon; show an error message if
there's more than one, as that syntax has no interpretation that is
obviously the right one.
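The check can be sketched as follows. `parse_path_rename` is a
hypothetical stand-in, and the error text is illustrative rather than the
message this commit adds:

```python
def parse_path_rename(value: bytes):
    """Split an OLD:NEW argument, rejecting anything without exactly one colon."""
    if value.count(b":") != 1:
        raise ValueError(
            "--path-rename expects exactly one ':' between the old and "
            "new names, got %r" % value)
    old, new = value.split(b":")
    return old, new
```

Counting colons up front covers both failure modes: zero colons (the case
that caused the IndexError) and two or more (the ambiguous case).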
Signed-off-by: Lassi Kortela <lassi@lassi.io>
--use-mailmap was defined as `--mailmap .mailmap` except that it would
set args.mailmap to ".mailmap" rather than b".mailmap" (in other words,
it accidentally set it to a string rather than a bytestring). Since
the --mailmap parameter is always passed as a bytestring, we ran into
errors with calling unknown functions due to the type mismatch.
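A sketch of the mismatch and its fix using plain argparse. These are not
filter-repo's actual argument definitions, and the use of `os.fsencode`
for the conversion is an assumption:

```python
import argparse
import os

parser = argparse.ArgumentParser()
# --mailmap converts its argument to a bytestring ...
parser.add_argument("--mailmap", type=os.fsencode)
# ... so the shortcut must store a bytes constant in the same destination.
# Using const=".mailmap" (a str) here would reproduce the type mismatch.
parser.add_argument("--use-mailmap", dest="mailmap",
                    action="store_const", const=b".mailmap")

shortcut = parser.parse_args(["--use-mailmap"]).mailmap
explicit = parser.parse_args(["--mailmap", ".mailmap"]).mailmap
```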
Signed-off-by: Elijah Newren <newren@gmail.com>