Filtering input from multiple repositories can still be done; however, to
avoid overloading of mark numbers, one should pass --export-marks=<file>
to the first git fast-export and pass --import-marks=<file> to the second.
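
A rough sketch of the two invocations, in Python. The helper name
export_commands and the use of --all are my assumptions for illustration;
only the two marks options come from the text above. It just builds the
argument lists. The first export has to finish before the second starts,
since the marks file is written when the exporter exits. Whether a given
git version accepts a marks file written from a different repository is
not checked here.

```python
import os


def export_commands(first_repo, second_repo, marks_file):
    """Build argv lists for two fast-export runs that share one marks file.

    Hypothetical helper: run the first command to completion before the second.
    """
    # Use an absolute path so both runs see the same file despite `git -C`.
    marks = os.path.abspath(marks_file)
    first = ["git", "-C", first_repo, "fast-export", "--all",
             "--export-marks=" + marks]
    second = ["git", "-C", second_repo, "fast-export", "--all",
              "--import-marks=" + marks]
    return first, second
```

Each list can be handed to subprocess.Popen with stdout=subprocess.PIPE to
read the resulting stream.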
Signed-off-by: Elijah Newren <newren@gmail.com>
git-fast-import requires that file changes listed in a merge commit be
relative to the first parent. Thus, if new files were added on a branch
that is merged in as the second or a later parent, I need to manually add
them to the list of files in the merge commit as well. In order to do that,
as soon as I splice in any commit, I have to record the list of new files
for both that commit and every descendant it has.
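
The bookkeeping could look roughly like the sketch below. The function
names and the dict-based graph (children maps a commit id to its child
ids) are assumptions made for illustration, not the actual code.

```python
from collections import deque


def record_new_files(children, spliced_commit, added_files, new_files=None):
    """Record added_files for spliced_commit and every descendant of it."""
    new_files = {} if new_files is None else new_files
    queue = deque([spliced_commit])
    seen = {spliced_commit}
    while queue:
        commit = queue.popleft()
        new_files.setdefault(commit, set()).update(added_files)
        for child in children.get(commit, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return new_files


def files_to_add_to_merge(parents, new_files):
    """Files a merge must list because its first parent does not have them."""
    first, *others = parents
    known_to_first = new_files.get(first, set())
    missing = set()
    for parent in others:
        missing |= new_files.get(parent, set()) - known_to_first
    return sorted(missing)
```

With a history where S is spliced in on a side branch (A -> S -> T) and M
merges T into the main line (A -> B -> M), recording a file for S also
records it for T and M, and files_to_add_to_merge(["B", "T"], new_files)
reports that file as one M has to list.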
Signed-off-by: Elijah Newren <newren@gmail.com>
Also, provide an OutputStream class, to make it easy to still direct all
output to some file rather than always sending to git fast-import.
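
To illustrate the idea only: the sketch below is not the OutputStream
class itself, whose interface is not shown here. BlobWriter and
start_fast_import are hypothetical names. The point is that the writer
takes any binary file-like object, so the same code can target a file on
disk or the stdin of a git fast-import process.

```python
import subprocess


class BlobWriter:
    """Hypothetical writer; `sink` is any binary file-like object."""

    def __init__(self, sink):
        self.sink = sink

    def write_blob(self, mark, data):
        # fast-import blob syntax: 'blob', an optional mark, 'data <size>',
        # then the raw bytes followed by a newline.
        self.sink.write(b"blob\nmark :%d\ndata %d\n" % (mark, len(data)))
        self.sink.write(data)
        self.sink.write(b"\n")


def start_fast_import(repo_dir):
    """Start `git fast-import` in repo_dir; its stdin can serve as a sink."""
    return subprocess.Popen(["git", "fast-import", "--quiet"],
                            stdin=subprocess.PIPE, cwd=repo_dir)
```

BlobWriter(open("stream.out", "wb")) would send everything to a file,
while BlobWriter(start_fast_import(path).stdin) would feed an importer.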
Signed-off-by: Elijah Newren <newren@gmail.com>
Automatically do renaming of references to commits that were skipped, and
automatically remove skipped blobs from the output of commits that
reference them.
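
A minimal sketch of both steps, under assumptions of mine: a skipped
commit is renamed to some replacement id (for instance its first parent)
kept in a dict, and file changes are (path, blob_id) pairs. Neither the
names nor the data layout are taken from the real code.

```python
def resolve(commit_id, renames):
    """Follow renames until reaching a commit that was not skipped."""
    while commit_id in renames:
        commit_id = renames[commit_id]
    return commit_id


def drop_skipped_blobs(file_changes, skipped_blobs):
    """Keep only the file changes whose blob was not skipped."""
    return [(path, blob_id) for path, blob_id in file_changes
            if blob_id not in skipped_blobs]
```

If commits 5 and 4 were both skipped, with 5 renamed to 4 and 4 renamed
to 3, then resolve(5, {5: 4, 4: 3}) yields 3.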
Signed-off-by: Elijah Newren <newren@gmail.com>
Explicitly specify --topo-order; git-fast-export fails on some topologies
unless it traverses in topological order.
Signed-off-by: Elijah Newren <newren@gmail.com>
pyparsing reads a whole file into memory and only then parses it, which
is really bad in this case since the output from git-fast-export is huge.
I entered disk swapping madness pretty easily. So, now I just do my own
manual parsing.
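
For illustration, incremental parsing can be sketched as below. This is
not the parser from this commit. iter_blobs is a hypothetical name, and
it assumes the exact-byte-count form "data <size>" and no extra lines
(such as original-oid) between "blob" and "data". Only one line or one
payload is held in memory at a time.

```python
def iter_blobs(stream):
    """Yield (mark, data) for each blob command in a binary stream."""
    line = stream.readline()
    while line:
        if line == b"blob\n":
            mark = None
            line = stream.readline()
            if line.startswith(b"mark :"):
                mark = int(line[len(b"mark :"):])
                line = stream.readline()
            if not line.startswith(b"data "):
                raise ValueError("expected a data line, got %r" % line)
            size = int(line[len(b"data "):])
            yield mark, stream.read(size)
        elif line.startswith(b"data "):
            # Skip payloads of other commands so their contents are never
            # mistaken for commands.
            stream.read(int(line[len(b"data "):]))
        line = stream.readline()
```

The stream can be the stdout pipe of a running git fast-export, so
nothing needs to be buffered beyond the current blob.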
Signed-off-by: Elijah Newren <newren@gmail.com>
* Allow hooking up (and filtering) multiple git fast-export processes to
  one import
* Allow user callbacks to force dumping of an object so that subsequently
  inserted objects can reference it
* Put the separate callbacks and global vars in the calling program into a
  combined class
Signed-off-by: Elijah Newren <newren@gmail.com>
My prior handling of marks only worked if nothing was added to or removed
from the fast-export stream. Further, I referred to these as marks even
though I really only accept idnum values, not sha1s or anything else. So,
now I refer to these as ids everywhere, and I am much more careful in how
I handle them.
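
One way to picture the problem is the sketch below. IdMap and its methods
are hypothetical, not the code from this commit. Once objects are
inserted or dropped, ids in the incoming stream no longer line up with
ids in the output, so each incoming id has to be translated.

```python
class IdMap:
    """Hand out sequential output ids and remember old -> new translations."""

    def __init__(self):
        self._next = 1
        self._translation = {}

    def new(self):
        """Allocate an id for an object inserted by the filter."""
        allocated = self._next
        self._next += 1
        return allocated

    def record(self, old_id):
        """Allocate an output id for an object kept from the input stream."""
        new_id = self.new()
        self._translation[old_id] = new_id
        return new_id

    def translate(self, old_id):
        """Return the output id for old_id, or None if it was dropped."""
        return self._translation.get(old_id)
```

If an object is inserted between incoming ids 1 and 2, incoming id 2
becomes output id 3, and an incoming id that was never recorded
translates to None.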
Signed-off-by: Elijah Newren <newren@gmail.com>
The commit_callback call was trying to pass a Reset object, which was
not defined. Copy-n-paste-n-forget-to-replace isn't good. Now it passes
a Commit object.
Signed-off-by: Elijah Newren <newren@gmail.com>
We still only parse a single blob, but this should put the infrastructure
in place for parsing more output from git-fast-export.
Signed-off-by: Elijah Newren <newren@gmail.com>