The prerequisites and installation docs were not quite detailed enough,
and no code of conduct or contribution guidelines were included. Flesh
out the docs to cover these issues.
Signed-off-by: Elijah Newren <newren@gmail.com>
Some of the systems I ran on had a 'python3-coverage' program and some
had a 'coverage3' program. More systems had the latter name, but more
importantly, the upstream tarball only creates the latter name;
apparently the former was just added by some distros. So, switch to the
more official name of the program.
Signed-off-by: Elijah Newren <newren@gmail.com>
- correct paths to include the missing "Documentation/" prefix
- use fully specified "origin/docs" branch in case the "docs" branch is
not checked out locally
Signed-off-by: Benoit Fouletier <bennews@free.fr>
It appears that in addition to Windows requiring cwd be a string (and
not a bytestring), it also requires the command line arguments to be
unicode strings. This appears to be a python-on-Windows issue at the
surface (attempts to quote things that assume the arguments are all
strings), but whether it's solely a python-on-Windows issue or there is
also a deeper Windows issue, we can workaround this brain-damage by
extending the SubprocessWrapper slightly. As with the cwd changes, only
apply this on Windows and not elsewhere because there are perfectly
legitimate reasons to pass non-unicode parameters (e.g. filenames that
are not valid unicode).
Signed-off-by: Elijah Newren <newren@gmail.com>
Unfortunately, it appears that Windows does not allow the 'cwd' argument
of various subprocess calls to be a bytestring. That restriction may be
workable on Windows, since Windows filesystems are allowed to require
that all file and directory names be valid unicode, but not all platforms
enforce such restrictions. As such, I certainly cannot change
cwd=directory
to
cwd=decode(directory)
because that could break on other platforms (and perhaps even on Windows
if someone is trying to read a non-native filesystem). Instead, create
a SubprocessWrapper class that will always call decode on the cwd
argument before passing along to the real subprocess class. Use these
wrappers on Windows, and do not use them elsewhere.
Signed-off-by: Elijah Newren <newren@gmail.com>
During the python3 transition, StringIO was renamed to io -- but the
import wasn't moved to preserve appropriate sorting.
Signed-off-by: Elijah Newren <newren@gmail.com>
Note that this isn't a version *number* or even the more generalized
version string that folks are used to seeing, but a version hash (or
leading portion thereof).
A few important points:
* These version hashes are not strictly monotonically increasing
values. Like I said, these aren't version numbers. If that
bothers you, read on...
* This scheme has incredibly nice semantics satisfying a pair of
properties that most version schemes would assume are mutually
incompatible:
This scheme works even if the user doesn't have a clone of
filter-repo and doesn't require any build step to inject the
version into the program; it works even if people just download
git-filter-repo.py off GitHub without any of the other sources.
And:
This scheme means that a user is running precisely version X of
the code, with the version not easily faked or misrepresented
when third parties edit the code.
Given the wonderful semantics provided by satisfying this pair of
properties that all other versioning schemes seem to miss out on, I
think I should name this scheme. How about "Semantic Versioning"?
(Hehe...)
* The version hash is super easy to use; I just go to my own clone of
filter-repo and run either:
git show $VERSION_HASH
or
git describe $VERSION_HASH
* A human-consumable version might suggest to folks that this software
is something they might frequently use and upgrade. This program
should only be used in exceptional cases (because rewriting history
is not for the faint of heart).
* A human-consumable version (i.e. a version number or even the
more relaxed version strings in more common use) might suggest to
folks that they can rely on strict backward compatibility. It's
nice to subtly undercut any such assumption.
* Despite all that, I will make releases (downloadable tarballs with
real version numbers in the tarball name; I'm just going to re-use
whatever version git is released with at the time). But those
version numbers won't be used by the --version option; instead the
version hash will.
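For illustration, a file's git blob hash can be computed without any
clone at all, which is what makes a self-describing version hash
possible. This is a sketch of the general mechanism, not necessarily
how --version is implemented:

```python
import hashlib

def blob_hash(path):
    """Compute the git blob hash of a file: the same value that
    `git hash-object <path>` prints.  Anyone with a clone can then run
    `git show` or `git describe` on this hash to identify the version;
    no build step is needed to inject it into the program."""
    with open(path, 'rb') as f:
        data = f.read()
    header = b'blob %d\x00' % len(data)
    return hashlib.sha1(header + data).hexdigest()
```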
Signed-off-by: Elijah Newren <newren@gmail.com>
A few changes:
* Include notes about git-2.24.0 changes
* Make it clearer that messing with the first parent could have
negative side-effects if the file_changes aren't also updated.
* Fix wrapping of a line that was too long.
Also, update the README.md:
* Note the upstream improvements made in (not yet released) git-2.24.0
Signed-off-by: Elijah Newren <newren@gmail.com>
Partial history rewrites were possible before with the (previously
hidden) --refs flag, but the defaults were wrong. That could be worked
around with the --source or --target flags, but that disabled --no-data
for fast-export and thus slowed things down, and also would require
overriding --replace-refs. And the defaults for --source and --target
may diverge further from what is wanted/needed for partial history
rewrites in the future.
So, add --partial as a first-class supported option with scary
documentation about how it permits mixing new and old history. Make
--refs imply that flag. Make the behavioral similarities (with regard
to which steps are skipped) between --source, --target, and --partial
clearer. Add relevant documentation to round it out.
Signed-off-by: Elijah Newren <newren@gmail.com>
In order to build the correct tree for a commit, git-fast-import always
takes a list of file changes for a merge commit relative to the first
parent.
When the entire first-parent history of a merge commit is pruned away
and the merge had paths with no difference relative to the first parent
but which differed relative to later parents, then we really need to
generate a new list of file changes in order to have one of those other
parents become the new first parent. An example might help clarify...
Let's say that there is a merge commit, and:
* it resolved differences in pathA between its two parents by taking
the version of pathA from the first parent.
* pathB was added in the history of the second parent (it is not
present in the first parent) and is NOT included in the merge commit
(either being deleted, or via rename treated as deleted and added as
something else)
For this merge commit, neither pathA nor pathB differ from the first
parent, and thus wouldn't appear in the list of file changes shown by
fast-export. However, when our filtering rules determine that the first
parent (and all its parents) should be pruned away, then the second
parent has to become the new first parent of the merge commit. But to
end up with the right files in the merge commit despite using a
different parent, we need a list of file changes that specifies the
changes for both pathA and pathB.
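The example can be sketched with trees modeled as simple
path-to-content dicts; this is a toy illustration, not filter-repo's
real data structures:

```python
def file_changes(parent_tree, merge_tree):
    """Compute the fast-import style change list for a commit whose
    (possibly new) first parent has tree `parent_tree`.  Trees are
    modeled as {path: content} dicts, a toy stand-in for git trees."""
    changes = []
    for path, content in merge_tree.items():
        if parent_tree.get(path) != content:
            changes.append(('M', path, content))   # modify/add
    for path in parent_tree:
        if path not in merge_tree:
            changes.append(('D', path))            # delete
    return changes

# The example from above: the merge kept pathA from the first parent
# and dropped pathB (present only in the second parent's history).
parent1 = {'pathA': 'v1'}
parent2 = {'pathA': 'v2', 'pathB': 'new'}
merge   = {'pathA': 'v1'}

# Relative to the original first parent: no changes at all...
assert file_changes(parent1, merge) == []
# ...but if parent2 becomes the new first parent, both paths need entries.
assert file_changes(parent2, merge) == [('M', 'pathA', 'v1'), ('D', 'pathB')]
```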
Signed-off-by: Elijah Newren <newren@gmail.com>
git.git wants to move more towards core-only rather than batteries
included, and as such, filter-repo will not be part of the git
distribution. Since the projects will be kept apart, there will
need to be separate translation files (assuming filter-repo ever gains
any translations) and as such we will need a different textdomain
definition.
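A textdomain definition of the kind described is a one-liner with
Python's gettext module; the domain name here is illustrative:

```python
import gettext

# A project distributed separately from git needs its own message
# catalog, so it must not reuse git's "git" textdomain.
gettext.textdomain('git-filter-repo')
```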
Signed-off-by: Elijah Newren <newren@gmail.com>
Allow folks to periodically update the export of a live repo without
re-exporting from the beginning. This is a performance improvement, but
can also be important for collaboration. For example, for sensitivity
reasons, folks might want to export a subset of a repo and update the
export periodically. While this could be done by re-exporting the
repository anew each time, the paths used to specify the wanted subset
might need to change in the future. Making the user verify that their
paths (including globs or regexes) don't also pick up anything from
previously excluded history, which would produce a divergent history,
is not very user friendly. Allowing them to just export what is new
since the last export works much better for them.
Signed-off-by: Elijah Newren <newren@gmail.com>
Since we now have a separate user manual and it does not make sense to
duplicate information in multiple places, restructure the README:
* Refer to the actual manual early on
* Limit the README to mostly be about why I wrote it and why folks
might want to consider it instead of existing tools
* Include a new section on upstream improvements, especially since it
looks like inclusion of git-filter-repo in git.git is unlikely.
Signed-off-by: Elijah Newren <newren@gmail.com>
This is a re-implementation of git-filter-branch that is nearly
perfectly bug compatible (it can replace git-filter-branch and still
pass the git testsuite). It deviates in one minor way that should not
matter to real-world use cases, but allows it to run a few times faster
than filter-branch.
Signed-off-by: Elijah Newren <newren@gmail.com>
Being able to find the new commit hash for either an abbreviated commit
hash or a full commit hash is much more useful than only working for a
full commit hash.
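A sketch of that prefix lookup against a rename map; the names are
hypothetical, not filter-repo's internal API:

```python
def lookup_new_hash(commit_renames, old_hash):
    """Map an old commit hash (full or abbreviated) to its new hash.
    `commit_renames` maps full old hashes to new hashes."""
    if old_hash in commit_renames:
        return commit_renames[old_hash]
    # Abbreviated: succeed only if the prefix is unambiguous.
    matches = [full for full in commit_renames if full.startswith(old_hash)]
    if len(matches) == 1:
        return commit_renames[matches[0]]
    return None

renames = {'1234abcd': 'aaaa', '1234ffff': 'bbbb'}
assert lookup_new_hash(renames, '1234abcd') == 'aaaa'  # full hash
assert lookup_new_hash(renames, '1234a') == 'aaaa'     # unambiguous prefix
assert lookup_new_hash(renames, '1234') is None        # ambiguous prefix
```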
Signed-off-by: Elijah Newren <newren@gmail.com>
When I wrote git_fast_filter.py, I was unaware of and did not foresee
libgit2. So, although the license said the project could be used under
whatever license git.git was, there was still a potential barrier for
usage by libgit2. I'm not sure if libgit2 will ever want to use
filter-repo, but I don't want the barrier there and I would like to
avoid a repeat of this problem. (Also, since filter-repo is for the
most part a one-shot usage tool, I doubt that the normal copyleft
provisions could provide much value.)
MIT is widely used, compatible with just about everything, and is
preferred by Palantir (my current employer) for open source
contributions. So, I contacted all other contributors (Jim is still at
Sandia) and got permission to relicense.
Signed-off-by: Elijah Newren <newren@gmail.com>
Commit 346f2ba891 (filter-repo: make reencoding of commit messages
togglable, 2019-05-11) made reencoding of commit messages togglable but
forgot to add parsing and outputting of the encoding header itself. Add
such ability now.
Signed-off-by: Elijah Newren <newren@gmail.com>
Now that we are tracking exported and imported refs, we no longer need
to rely on _orig_refs and _seen_refs for deletion of "unused" refs at
the end of the run. Verify that we correctly tracked exported and
imported refs by using them instead for the post-run ref deletion. This
removes the last use of _seen_refs, which will be removed in a
subsequent commit.
Signed-off-by: Elijah Newren <newren@gmail.com>
We previously nuked all refs not seen in the import using _seen_refs, by
comparing to a full list of original refs. That works okay when doing a
full repository rewrite, but fails for partial history rewrites.
Further, external rewriting tools that want to implement a tweak of
this behavior would have had to access the internal _seen_refs field,
but might not be able to rely on _orig_refs if they were doing a partial
history rewrite. Fix both by tracking both which refs were exported
from the source repository, and which were ultimately imported into the
target repository (they may differ due to pruned commits, renamed
branches or tags, etc.). Make both available via a new public API,
get_exported_and_imported_refs().
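A toy model of what such an API enables (the real method name is
get_exported_and_imported_refs(); everything else here is illustrative):

```python
def refs_to_delete(exported_refs, imported_refs):
    """Refs that were exported from the source repo but never imported
    into the target (e.g. branches whose commits were all pruned)
    should be deleted at the end of the run."""
    return sorted(exported_refs - imported_refs)

exported = {'refs/heads/master', 'refs/heads/empty-after-filtering'}
imported = {'refs/heads/master'}
assert refs_to_delete(exported, imported) == ['refs/heads/empty-after-filtering']
```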
Signed-off-by: Elijah Newren <newren@gmail.com>
External rewrite tools using filter-repo as a library may want to add
additional objects into the stream. Some examples in t/t9391 did this
using an internal _output field and using syntax that did not seem so
clear. Provide an insert() method for doing this, and convert existing
cases over to it.
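A hypothetical sketch of the insert() convenience: callers hand in an
object that knows how to dump itself to the output stream, instead of
reaching into an internal _output field. Names and the exact stream
format are illustrative:

```python
import io

class FilterSketch:
    """Toy stand-in for a filter object exposing insert()."""
    def __init__(self):
        self._output = io.BytesIO()   # stands in for fast-import's stdin
    def insert(self, obj):
        obj.dump(self._output)

class Blob:
    """Toy element with the dump(file) shape stream elements use."""
    def __init__(self, data):
        self.data = data
    def dump(self, out):
        out.write(b'blob\ndata %d\n' % len(self.data))
        out.write(self.data + b'\n')

f = FilterSketch()
f.insert(Blob(b'hello'))
```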
Signed-off-by: Elijah Newren <newren@gmail.com>
When we prune a commit for being empty, there is no update to the branch
associated with the commit in the fast-import stream. If the parent
commit had been associated with a different branch, then the branch
associated with the pruned commit would not be updated without
additional measures. In the past, we resolved this by recording that
the branch needed an update in _seen_refs. While this works, it is a
bit more complicated than just issuing an immediate Reset. Also, note
that we need to avoid calling callbacks on that Reset because those
could rename branches (again, if the commit-callback already renamed
once) causing us to not update the intended branch.
There was actually one testcase where the old method didn't work: when a
branch was pruned away to nothing. A testcase accidentally encoded the
wrong behavior, hiding this problem. Fix the testcase to check for
correct behavior.
Signed-off-by: Elijah Newren <newren@gmail.com>
We previously did this incorrectly, but due to our assumptions of
full-history rewriting and deleting of unseen refs, we got away with it.
Fix this for partial history rewrites.
Signed-off-by: Elijah Newren <newren@gmail.com>
Commit 1f0e57bada ("filter-repo: avoid pruning annotated tags that we
have seen", 2019-03-07) left behind the setting of a variable,
full_ref, that is no longer used. Remove it.
Signed-off-by: Elijah Newren <newren@gmail.com>