This is by far the largest python3 change; it consists basically of
* using b'<str>' instead of '<str>' in lots of places
* adding a .encode() if we really do work with a string but need to
get it converted to a bytestring
* replace uses of .format() with interpolation via the '%' operator,
since bytestrings don't have a .format() method.
Signed-off-by: Elijah Newren <newren@gmail.com>
Use UTF-8 chars in user names, filenames, branch names, tag names, and
file contents. Also include invalid UTF-8 in file contents; should be
able to handle binary data.
Signed-off-by: Elijah Newren <newren@gmail.com>
The sorting order of entries written to files in the analysis directory
didn't specify a secondary sort, thus making the order dependent on the
random-ish sorting order of dictionaries and making it inconsistent
between python versions. While the secondary order didn't matter much,
having a defined order makes it slightly easier to define a single
testcase that can work across versions.
Signed-off-by: Elijah Newren <newren@gmail.com>
Assuming filter-repo will be merged into git.git, use "git" for the
TEXTDOMAIN, and assume its build system will replace "@@LOCALEDIR@@"
appropriately.
Note that the xgettext command used to grab string translations is
nearly identical to the one for C files in git.git; just use
--language=python instead and add --join-existing to avoid overwriting
the po/git.pot file. In other words, use the command:
xgettext -o../git/po/git.pot --join-existing --force-po \
--add-comments=TRANSLATORS: \
--msgid-bugs-address="Git Mailing List <git@vger.kernel.org>" \
--from-code=UTF-8 --language=python \
--keyword=_ --keyword=N_ --keyword="Q_:1,2" \
git-filter-repo
To create or update the translation, go to git.git/po and run either of:
msginit --locale=XX
msgmerge --add-location --backup=off -U XX.po git.pot
Once you've updated the translation, within git.git just build as
normal. That's all that's needed.
Signed-off-by: Elijah Newren <newren@gmail.com>
The AncestryGraph setup assumed we had previously seen all commits which
would be used as parents; that interacted badly with doing an
incremental import. Add a function which can be used to record external
commits, each of which we'll treat like a root commit (i.e. depth 1 and
having no parents of its own). Add a test to prevent regressions.
Signed-off-by: Elijah Newren <newren@gmail.com>
There are a number of things not present in "normal" imports that we
nevertheless support and need to be tested:
* broken timezone adjustment (+051800->+0261; observed in the wild
in real repos, and adjustment prevents fast-import from dying)
* commits missing an author (observed in the wild in a real repo;
just sets author to committer)
* optional additional linefeeds in the input allowed by
git-fast-import but usually not written by git-fast-export
* progress and checkpoint objects
* progress, checkpoint, and 'everything' callbacks
Signed-off-by: Elijah Newren <newren@gmail.com>
The test does check the exact output for the report, meaning if the
output is changed at all this test will need to be updated, but it at
least makes sure we are getting all the right kinds of information. I
do not expect the output format will change very often.
Signed-off-by: Elijah Newren <newren@gmail.com>
Pruning of commits which become empty can result in a variety of
topology changes: a merge may have lost all its ancestors corresponding
to one of (or more) of its parents, a merge may end up merging a commit
with itself, or a merge may end up merging a commit with its own
ancestor. Merging a commit with itself makes no sense, so we'd rather
prune down to one parent and hopefully prune the merge commit, but we do
need to worry about whether the are changes in the commit and whether
the original merge commit also merged something with itself. We have
similar cases for dealing with a merge of some commit with its own
ancestor: if the original topology did the same, or the merge commit has
additional file changes, then we cannot remove the commit. But,
otherwise, the commit can be pruned.
Add testcases covering the variety of changes that can occur to make
sure we get them right.
Signed-off-by: Elijah Newren <newren@gmail.com>
There are several cases to worry about with commit pruning; commits
that start empty and had no parent, commits that start empty and
had a parent which may or may not get pruned, commits which had
changes but became empty, commits which were merges but lost a line
of ancestry and have no changes of their own, etc. Add testcases
covering these cases, though most topology related ones will be
deferred to a later set of tests.
Signed-off-by: Elijah Newren <newren@gmail.com>
Make it easy for users to search and replace text throughout the
repository history. Instead of inventing some new syntax, reuse the
same syntax used by BFG repo filter's --replace-text option, namely,
a file with one expression per line of the form
[regex:|glob:|literal:]$MATCH_EXPR[==>$REPLACEMENT_EXPR]
Where "$MATCH_EXPR" is by default considered to be literal text, but
could be a regex or a glob if the appropriate prefix is used. Also,
$REPLACEMENT_EXPR defaults to '***REMOVED***' if not specified. If
you want a literal '==>' to be part of your $MATCH_EXPR, then you
must also manually specify a replacement expression instead of taking
the default. Some examples:
sup3rs3kr3t
(replaces 'sup3rs3kr3t' with '***REMOVED***')
HeWhoShallNotBeNamed==>Voldemort
(replaces 'HeWhoShallNotBeNamed' with 'Voldemort')
very==>
(replaces 'very' with the empty string)
regex:(\d{2})/(\d{2})/(\d{4})==>\2/\1/\3
(replaces '05/17/2012' with '17/05/2012', and vice-versa)
The format for regex is as from
re.sub(<pattern>, <repl>, <string>) from
https://docs.python.org/2/library/re.html
The <string> comes from file contents of the repo, and you specify
the <pattern> and <repl>.
glob:Copy*t==>Cartel
(replaces 'Copyright' or 'Copyleft' or 'Copy my st' with 'Cartel')
Signed-off-by: Elijah Newren <newren@gmail.com>