442 Commits (debug-github-workflows)
 

Author SHA1 Message Date
Elijah Newren 1931ae5f28 .github/workflows/test.yml: show git version being used
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren 848f369160 Make skipped tests much less verbose
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren e9e0308df1 Merge pull request #167 from dscho/meaow
Add a GitHub workflow for continuous testing
4 years ago
Johannes Schindelin d28b2a7346 Add a GitHub workflow to test this thing
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin d0dcece202 t9391: guard `dos2unix` use behind a prereq
Not all setups have `dos2unix`. Most notably, the Ubuntu and macOS
agents of GitHub Actions don't.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin 85afdf9da9 t9391: don't rely on the system gitconfig defining core.autocrlf=false
The test case t9391.12 specifically wants to test LF vs CR/LF line
ending issues, expecting `core.autoCRLF` to default to `false`. This is
true on Linux and macOS and pretty much everywhere else, except on
Windows.

Let's make sure that the test operates with the `core.autoCRLF` value it
assumes to operate under.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin fe79ec9912 t9390: work around yet another Unix<->Win32 path issue
On Windows, there is no absolute path `/fake/path`, but MSYS2 (which Git
for Windows uses e.g. for running Bash scripts) pretends that it exists.
This only works within MSYS2 applications, of course, so... when MSYS2
sees that we hand a parameter to a non-MSYS2 application in a shell
script, it helpfully converts it to the full path (prepending MSYS2's
pseudo root directory).

Let's work around that by using a Win32-compatible path to begin with:
`$(pwd)` produces that on Windows. On other platforms, it still works.

As a bonus, this safeguards our test against a setup where `/fake/path`
_actually exists_. Stranger things have been seen in the wild, after
all.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin 848cd652f0 t9390: work around clash with MSYS2's Unix<->Win32 path conversion
MSYS2 tries to be very helpful, and in most cases it even works, by
converting parameters passed from inside an MSYS2 Bash to a non-MSYS2
application (such as `git.exe`) if they look like Unix-style paths or
path lists.

Sometimes, however, this automatic path conversion is unhelpful, e.g.
when passing the parameter `foo:.` to Git, which MSYS2 will readily
convert to a Windows-style path list: `foo;.` (i.e. using a semicolon
instead of a colon).

Happily, there is a way to avoid that: the `MSYS_NO_PATHCONV` variable.
Let's use it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin 6967fad156 t9390: avoid using `colrm`
While it is true that `colrm` is available on macOS by default, and even
in Ubuntu (thanks to the `bsdmainutils` package), it is not available on
Windows.

Let's use `cut` instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin e6ffeded2e t9390: avoid using Bash-ism `<(...)`
The problem with this is that on Windows, we use the MSYS2 Bash which
uses the POSIX emulation layer called "MSYS2 runtime" that pretends that
there _is_ something like the `/dev/fd/` namespace, and tells `git.exe`
about it, but `git.exe` does not use the POSIX emulation layer, and
hence has no idea what Bash is talking about.

Besides, we should avoid pipes, just as we do in the Git project.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin 8bc195673c t9390: close link of broken &&-chain
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin f1ee28d78f t9390: expect the correct line count in `--strip-blobs-with-ids`
In that test case, we expect the line count to be 5, but it is actually
6 lines that we should expect:

	numbers/medium.num
	numbers/small.num
	sequence/know
	whatever
	words/know

Note the empty line at the top: this list is generated via `git log
--format=%n`, and that `%n` stands for "newline", meaning that we _must_
expect an empty line.

This expectation seems to have been broken already in the commit that
added the test case: b6a35f8 (filter-repo: implement
--strip-blobs-with-ids, 2019-05-30). It was hidden for such a long time
by a broken &&-chain, which we will fix next.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin 6c475a7e09 t9390: use the correct prereq when using "funny" file names
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin 580e0f0395 Test data and scripts must have Unix line endings
The tests will otherwise fail.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin 453128fff7 Ignore the generated Python cache
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Elijah Newren 74ea810872 INSTALL.md: add notes about common installation issues
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren 9282a33a02 git-filter-repo.txt: regexes & globs apply to entire file, not to lines
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren 93ee4ae907 Merge branch 'mw/empty-author-name' into main
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Martin Wilck 282f8ddb9b filter-repo: only set author from committer if author email not set
Some commits may have a valid author email, but no valid author name.
Old versions of git didn't enforce a non-empty name.
Setting the author data from the committer is wrong in this case.

Also add a test case for this to t9390.

Example: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c6295cdf656de63d6d1123def71daba6cd91939c

(en: replaced with a dedicated test instead of tweaking existing ones)

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
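As an illustration of the rule described in 282f8ddb9b above, the following Python sketch falls back to the committer only when the author email is missing, so a commit with an empty author name but a valid email keeps its author data. The function name and the tuple layout are hypothetical and are not taken from filter-repo's source.

```python
def resolve_author(author, committer):
    """Pick the identity to record as author.

    author and committer are (name, email) tuples of bytes; either field
    may be empty. This layout is an assumption made for the sketch.
    """
    author_name, author_email = author
    if not author_email:
        # No usable author email: fall back to the committer identity.
        return committer
    # An empty name with a valid email is left alone.
    return (author_name, author_email)
```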
Elijah Newren 7eaaf191de filter-repo: correctly prune nested tags not matching filtering criteria
When the user specifies some kind of criteria to filter commits by (e.g.
--subdirectory-filter mysubdir), we rewrite parent commits that are
entirely filtered out to the most recent ancestor that still exists, or
just prune the parent if there isn't one.  That works great when the
parent is a commit, but nested tags have parents that are tags.  If we
only prune the first tag (i.e. the tag of a commit), then letting any
tags through that had that tag as a parent will result in a fast-import
crash with a message of the form

   fatal: mark :35390 not declared

Ensure that when a tag gets pruned, the pruning is recorded as such...so
that any children tags will get pruned as well.

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
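The bookkeeping described in 7eaaf191de above can be sketched in Python as follows. This is a simplified model, not filter-repo's implementation: the function name, the `(tag_id, target_id)` pairs, and the `pruned` set are all hypothetical. The point is only that a pruned tag is itself recorded as pruned, so tags pointing at it are dropped too.

```python
def prune_nested_tags(tags, pruned):
    """Drop tags whose target was pruned, recording each dropped tag as pruned.

    tags:   list of (tag_id, target_id) pairs in stream order, so a tag's
            target always appears before the tag itself
    pruned: set of ids that have already been pruned; updated in place
    """
    kept = []
    for tag_id, target_id in tags:
        if target_id in pruned:
            # Record the pruning so that tags of this tag get pruned as well.
            pruned.add(tag_id)
        else:
            kept.append((tag_id, target_id))
    return kept
```

With `pruned = {5}` and the tags `[(10, 5), (11, 10), (12, 6)]`, tag 10 is dropped because its target is gone, and tag 11 is dropped because tag 10 was recorded as pruned; only `(12, 6)` survives.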
Elijah Newren b1606ba8ac Merge branch 'mr/fix-filter-lamely-name-error' into main
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren f9a54f36d9 Merge branch 'tm/fix-typo' into main
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Marius Renner 70f83c2526 filter-lamely: fix NameError because of forgotten fr module prefix
In repositories with annotated tags filter-lamely crashes with the
message: "NameError: name 'Reset' is not defined".

This is because of a missing "fr" module prefix in the code, which this
commit adds.

Signed-off-by: Marius Renner <marius@mariusrenner.de>
4 years ago
Tom Matthews 96959d1174 converting-from-bfg-repo-cleaner.md: fix typo
Signed-off-by: Tom Matthews <trcm@pm.me>
4 years ago
Elijah Newren 7b3e714b94 filter-repo (README): remove outdated 2.28.0-not-yet-released comment
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren d79ea709b7 filter-repo: fix crash from assuming parent is an int
When filtering with --refs, parents can be a hash rather than an
integer.  There was a code path in RepoFilter._prunable() that was
written assuming the first parent would always be an integer; fix it to
handle a hash as well.

Reported-by: Niklas Hambüchen <mail@nh2.me>
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren 4b452da4ef Merge branch 'jb/ignore-generated-docs' into main
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren e4960a53f8 Fix undefined variable names
Reported-by: Christian Clauss <cclauss@me.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Jonas Bernoulli 8a8278701f .gitignore: ignore the generated documentation
Signed-off-by: Jonas Bernoulli <jonas@bernoul.li>
4 years ago
Elijah Newren ed6f410088 Contributing.md: link to Nicolai Hähnle's code review comments
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren cefeef1c0a filter-repo: use new --date-format=raw-permissive fast-import option
fast-import gained a new raw-permissive date format explicitly for
allowing people to import repositories as-is.  Make use of the flag, and
stop rewriting the bogus timezone found in rails.git.

If users do not like these bogus times, they can of course write a
filter to fix them (or even make them bogus in a different way).  For
example:

    git filter-repo ... --commit-callback '
      if commit.author_date.endswith(b"+051800"):
        # bytes.replace() returns a new object, so assign the result back
        commit.author_date = commit.author_date.replace(b"+051800", b"+0261")
    '

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren debe52000d contrib: rename no-op-example to barebones-example
"no-op" might suggest that it doesn't do anything, when in reality it
does exactly what filter-repo does.  Rename it to barebones-example.

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren 2f26e4bce5 INSTALL.md: wording clarification on what repology.org tracks
Homebrew and scoop are both package managers and package repositories.
Fedora 32 is not a package manager, but does map to a package
repository.  Clarify wording that the list from repology.org is a list
of package repositories, not package managers.

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren b74eb6b69d Merge branch 'jr/document-commit-and-ref-map' into main
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
James Ramsay f867bb6ad7 git-filter-repo.txt: document mapping output
Useful commit and reference mappings are created on every run. These are
helpful in a number of situations, and should be documented so that
end-users and Git hosts can understand how to use the output.

The commit-map is particularly useful for Git hosts to override
retention mechanisms, like hidden refs. This allows end-users to purge
large files and sensitive data.

Signed-off-by: James Ramsay <james@jramsay.com.au>
4 years ago
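For readers who want to consume the commit mapping mentioned in f867bb6ad7 above, here is a minimal Python sketch. It assumes the map is a text file with a header line reading `old new` followed by one pair of whitespace-separated hashes per line; that layout, and the function name, are assumptions of this sketch, so check the generated documentation before relying on them.

```python
def read_commit_map(lines):
    """Build an {old_hash: new_hash} dict from the lines of a commit map.

    Assumes two whitespace-separated columns and an "old new" header line.
    """
    mapping = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2 or fields == ["old", "new"]:
            # Skip the header and anything that is not a hash pair.
            continue
        old, new = fields
        mapping[old] = new
    return mapping
```

A caller would typically pass an open file object, for example `read_commit_map(open(path))`, where `path` points at the map file written by the run.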
Elijah Newren 1e0c3ab3ae filter-repo: make fresh clone warning scarier
Apparently, despite the fact that *overwrite* *repo* *history* are three
important words that each individually convey a lot of important
meaning, people ignore it and instinctively add --force.  Insert the
word "destructively" to get people to pause.

Further, change the end of the warning: rather than explaining how to get
around the warning in the current repository, suggest operating on a
fresh clone instead, and only then mention as an aside that the --force
flag can be used to override.

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren 8abf8faec8 git-filter-repo.txt: be more forceful on the wording of --force
Online blogs/articles/Q&A as well as direct feedback suggests that
people use the --force flag rather cavalierly.  Add words like
"irreversible" and "immediate pruning" to discourage such blithe
application of this flag.  I hope this encourages folks to either learn
the ramifications of irreversible full-repository entire history
rewrites first, or to follow the recommendation of only operating on a
fresh clone.

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren f8c14d159c git-filter-repo.txt: point people at the generated documentation
People keep trying to read this file, unaware that it is the source code
for generating the documentation, not the generated documentation.  Add
a comment at the top that explains this and points people in the right
direction.

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren 38e70b69e8 filter-repo: ignore comment lines in --paths-from-file
Allow lines starting with '#' to be treated as a comment and be ignored.
Update the documentation to note that both blank lines and comment lines
are ignored, and mention how filenames starting with '#' can be matched
(namely, the same way that filenames starting with 'regex:', 'glob:',
or 'literal:' can be -- by prefixing the filename with 'literal:').

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
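The line handling described in 38e70b69e8 above can be sketched in Python like this. It is an illustration under stated assumptions rather than filter-repo's parser: the function name and the `(kind, pattern)` result are hypothetical, unprefixed lines are treated as literal paths, and any other syntax the real file format may support is ignored.

```python
def parse_path_lines(lines):
    """Turn the lines of a paths file into (kind, pattern) pairs.

    Blank lines and lines starting with '#' are skipped. A 'literal:'
    prefix lets a filename that itself starts with '#' be matched.
    """
    entries = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # blank line or comment
        for prefix in ("regex:", "glob:", "literal:"):
            if line.startswith(prefix):
                entries.append((prefix[:-1], line[len(prefix):]))
                break
        else:
            # No prefix: treat the whole line as a literal path.
            entries.append(("literal", line))
    return entries
```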
Elijah Newren 771404d656 filter-repo: allow globs to match file or directory names
I added special code to filter-repo so that --path expressions could
match filenames or some leading directory name.  --path-regex, since it
does not implicitly add anchorings, can also match a leading path, and
can thus be used to match against directories.  --path-glob could not be
used to match a leading directory of a path, since fnmatch.fnmatch()
requires the full string to match.  But users like being able to specify
directory names, such as '*/bin', so let's take any glob expression and
treat it as two: '<glob>' and '<glob>/*' and try to match against either
one; this will allow it to match against file or directory names like
the other two types of path matching.

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
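The two-pattern trick described in 771404d656 above can be sketched with Python's `fnmatch` module. The helper name is hypothetical and plain strings are used for brevity; the sketch only shows why trying both `<glob>` and `<glob>/*` lets a glob match a leading directory.

```python
import fnmatch


def glob_matches(path, glob):
    """Return True if glob matches path itself or a leading directory of it.

    fnmatch.fnmatch() requires the whole string to match, so a directory
    glob such as '*/bin' is also tried as '*/bin/*'.
    """
    return fnmatch.fnmatch(path, glob) or fnmatch.fnmatch(path, glob + "/*")
```

With the glob `*/bin`, the path `src/bin` matches directly and `src/bin/tool` matches through the `*/bin/*` variant, while `src/binary/x` matches neither.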
Elijah Newren 25b226b1de t9390: make tests individually re-runnable
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren eb9ea17629 INSTALL.md: fix missing trailing backquote
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren 34f761734b INSTALL.md: simplify manual installation instructions
Make use of `git --man-path` and `git --html-path` to simplify the
manual installation instructions a bit.  Also, there appears to be a
site.getsitepackages() call in python to give similar information about
where git_filter_repo.py can be installed.

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
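The lookups mentioned in 34f761734b above can be gathered from Python as a rough sketch. It relies only on `git --man-path`, `git --html-path`, and `site.getsitepackages()`, all named in the commit message; the helper names and the returned dictionary are hypothetical, and `None` is returned for the git paths when git is not on `PATH`.

```python
import shutil
import site
import subprocess


def git_path(option):
    """Return the directory git prints for option, or None if unavailable."""
    if shutil.which("git") is None:
        return None
    result = subprocess.run(["git", option], capture_output=True, text=True)
    return result.stdout.strip() if result.returncode == 0 else None


def install_locations():
    """Collect candidate directories for a manual installation."""
    # Some older virtualenv setups lack getsitepackages(); fall back to [].
    get_site = getattr(site, "getsitepackages", lambda: [])
    return {
        "man": git_path("--man-path"),
        "html": git_path("--html-path"),
        "python": get_site(),
    }
```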
Elijah Newren a238e3b7e6 git-filter-repo.txt: discourage use of random clone flags
Flags like --local, --shared, --reference (and --dissociate), and
--origin would all mess up the fresh clone checker.  Attempting to
defend against all of them would not only be costly, but make it harder
to draw the line about guesses as to whether a repository is a fresh
clone or not.  --origin also has problems in that filter-repo has
special handling for the 'origin' remote that I don't want to apply to
other random remotes.

Flags like --depth, --single-branch, and --no-tags could prevent enough
data from being downloaded to do a full rewrite and result in a
partially rewritten or possibly even corrupt history (no idea how
shallow clones interact; probably badly).  --filter would also make the
repo start without enough info though it'd at least be downloaded on
demand; it'd still be a really slow way to do it, though, so it's a bad
idea.

filter-repo doesn't really provide an easy mechanism to rewrite a repo
and its submodule simultaneously, so recursing submodules seems useless
and unhelpful.  --shallow-submodules would be bad for at least the same
reasons --depth is for the parent module, assuming we handled
submodules.  --remote-submodules just provides a way to make the repo
dirty to start, which is counter-productive.  --jobs could be useful, if
recursing submodules was.

--no-checkout might be safe to use and --sparse might also be okay as
long as it only affects the working tree, but in both cases why not
go --bare or --mirror if you're doing that?  Likewise, --no-hardlinks is
useless given that we're already saying people need to use --no-local.

-b would be okay to use, but why wouldn't you just change the default
branch on the server rather than just within this one clone used for
rewriting the history?  Whether you push back to the original repository
or to a new repo, you'd have to take a separate step to change it in
that remote repo.  And if you really will use this new local repository
as the official source, then you can switch branches at the end of the
rewrite just as easily.

--separate-git-dir and --template might be okay to use, I haven't
tested.  If either doesn't work now, or breaks at any point in the
future, I feel much better being able to say, "I told you to only use
these three flags to git clone."

-u only affects the ability to receive the clone; it's fine to use.
Also, -q only affects the console output during the clone operation, so
you could use it.

There will probably be more flags added to git-clone over time.  Testing
against all of them is insanity.  Recommend people only use --no-local,
--bare, and --mirror, with the first only needed when cloning from a
local filesystem, and the other two never needed but allowed for those
that prefer.

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
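To make the recommendation in a238e3b7e6 above concrete, here is a small Python sketch that builds a `git clone` command line using only the three recommended flags. The helper and its parameters are hypothetical; the caller decides whether the source is on the local filesystem, which is the only case where `--no-local` is needed.

```python
def clone_command(source, dest, local=False, bare=False, mirror=False):
    """Build a git clone argv restricted to --no-local, --bare and --mirror."""
    cmd = ["git", "clone"]
    if local:
        # Only needed when cloning from a local filesystem path.
        cmd.append("--no-local")
    if mirror:
        cmd.append("--mirror")  # --mirror implies --bare
    elif bare:
        cmd.append("--bare")
    cmd += [source, dest]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` to produce the fresh clone.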
Elijah Newren 49d6f02ff8 filter-repo: clarify interactions between path filtering and path renaming
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren 3e1bff264c Revert "filter-repo: fix ugly bug with mixing path filtering and renaming"
This reverts commit df6c8652a2.  The
motivating example was wrong; path renaming should not be involved in
path filtering, it only says how paths should be renamed if they happen
to be selected.  A subsequent commit will improve the documentation.

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
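The separation described in 3e1bff264c above, where renaming plays no part in selecting paths, can be modelled with a short Python sketch. It is a simplified illustration with prefix-only matching and hypothetical names, not filter-repo's code: paths are selected by their original names, and renames are then applied only to the selected paths.

```python
def filter_then_rename(paths, keep_prefixes, renames):
    """Select paths by their original names, then rename the selected ones.

    keep_prefixes: prefixes a path must start with to be kept
    renames:       list of (old_prefix, new_prefix) pairs, applied in order
    """
    result = []
    for path in paths:
        if not any(path.startswith(prefix) for prefix in keep_prefixes):
            continue  # not selected; renames never rescue a path
        for old, new in renames:
            if path.startswith(old):
                path = new + path[len(old):]
        result.append(path)
    return result
```

Keeping `src/` and renaming `src/` to `lib/` turns `src/a.c` into `lib/a.c`, while an existing `lib/c.c` is still dropped because it was never selected.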
Elijah Newren a4c12253a8 git-filter-repo.txt: briefly explain steps for pushing to original url
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren b8ebda97dd contrib: avoid applying --replace-text to binary files in bfg-ish
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren 86569ee7ac Contributing.md: add a small clarification about line coverage
Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Elijah Newren 23bec32283 contrib, docs: make discovery of code formatting and linting easier
The desire to format or lint code throughout history has arisen several
times.  It's more natural to do this in filter-branch, since it somewhat
forces people to run external commands.  However, we have a contrib demo
that shows how to run an external command on each file in history; I
created it before any of these requests came in, and yet I still
periodically get requests about it.

Make lint-history ever-so-slightly easier to apply to a subset of
filenames, and include its usage as an extra cheat sheet comparison for
filter-branch-vs-filter-repo commands.

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago