filter-repo: only conditionally heed the already_ran file
When users run filter-repo, we first check if they already did a
previous run.  If they recently ran filter-repo previously, then they
are probably just trying to do their history rewrite in multiple steps,
so we bypass the fresh clone check for these subsequent runs.  However,
it is possible that the user finishes their rewrite, continues as normal
for months or years, and then decides to do another rewrite.  It may be
that there is still a $GIT_DIR/filter-repo/already_ran file present from
the repository filtering from long ago, and we do not want the presence
of that file to result in the fresh clone check being bypassed.

Implement a middle ground -- if the already_ran file is older than a
day, ask the user whether to consider the current run a continuation of
an existing rewrite or if it should be considered a completely new
rewrite.  Add documentation explaining this as well.

Signed-off-by: Elijah Newren <[email protected]>
newren committed Oct 21, 2024
1 parent 8bd8803 commit 8a243ae
Showing 4 changed files with 217 additions and 1 deletion.
27 changes: 27 additions & 0 deletions Documentation/git-filter-repo.txt
@@ -405,6 +405,33 @@ references were changed.
* An all-zeros hash, or null SHA, represents a non-existent object.
When in the "new" column, this means the ref was removed entirely.

Already Ran
~~~~~~~~~~~

The `$GIT_DIR/filter-repo/already_ran` file records that
git-filter-repo has been run. When this file is present, future runs will
be treated as an extension of the previous filtering operation.

Concretely, this means:
* The "Fresh Clone" check is bypassed

This is done because past runs would cause the repository to no longer
look like a fresh clone, and thus fail the fresh clone check, but doing
filtering via multiple invocations of git-filter-repo is an intended
and supported use case. You already passed or bypassed the "Fresh Clone"
check on your initial run.

However, if the already_ran file exists but is more than one day old when
git-filter-repo is invoked, the user is prompted for whether the new run
should be considered a continuation of the previous run. If the user does
not answer in the affirmative, then the above bullet will not apply.
This prompt exists because users might do a history rewrite in a repository,
forget about it and leave the $GIT_DIR/filter-repo directory around, and
then some months or years later need to do another rewrite. If commits
have been made public and shared from the previous rewrite, then the next
filter-repo run should not be considered a continuation of the previous
filtering run.

[[FRESHCLONE]]
FRESH CLONE SAFETY CHECK AND --FORCE
------------------------------------
16 changes: 15 additions & 1 deletion git-filter-repo
@@ -2942,7 +2942,21 @@ class RepoFilter(object):

     # Determine if this is second or later run of filter-repo
     tmp_dir = self.results_tmp_dir(create_if_missing=False)
-    already_ran = os.path.isfile(os.path.join(tmp_dir, b'already_ran'))
+    ran_path = os.path.join(tmp_dir, b'already_ran')
+    already_ran = os.path.isfile(ran_path)
+    if already_ran:
+      current_time = time.time()
+      file_mod_time = os.path.getmtime(ran_path)
+      file_age = current_time - file_mod_time
+      if file_age > 86400: # file older than a day
+        msg = (f"The previous run is older than a day ({decode(ran_path)} already exists).\n"
+               f"See \"Already Ran\" section in the manual for more information.\n"
+               f"Treat this run as a continuation of filtering in the previous run (Y/N)? ")
+        response = input(msg)
+
+        if response.lower() != 'y':
+          os.remove(ran_path)
+          already_ran = False

     # Default for --replace-refs
     if not self._args.replace_refs:
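
The check added in this hunk can be read in isolation as the sketch below. It is not code from git-filter-repo: the helper name `treat_as_continuation` and its `ask` and `now` parameters are hypothetical, introduced only so the decision can be exercised without a real repository or terminal. The one-day threshold (86400 seconds) and the removal of the marker on a non-affirmative answer follow the patch.

```python
import os
import time

ONE_DAY_SECONDS = 86400  # same threshold as the patch


def treat_as_continuation(marker_path, ask=input, now=None):
    """Return True if this run should continue a previous filtering run.

    Hypothetical helper for illustration; not part of git-filter-repo.
    """
    if not os.path.isfile(marker_path):
        return False  # no previous run recorded
    now = time.time() if now is None else now
    age = now - os.path.getmtime(marker_path)
    if age <= ONE_DAY_SECONDS:
        return True  # recent marker: continuation, no prompt
    answer = ask("Treat this run as a continuation of the previous run (Y/N)? ")
    if answer.lower() == "y":
        return True
    # Stale marker and the user declined: forget the old run entirely.
    os.remove(marker_path)
    return False
```

As in the patch, only a bare `y` or `Y` counts as an affirmative answer; any other reply, including `yes`, is treated as declining and removes the marker.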
79 changes: 79 additions & 0 deletions t/t9393-rerun.sh
@@ -0,0 +1,79 @@
#!/bin/bash

test_description='filter-repo tests with reruns'

. ./test-lib.sh

export PATH=$(dirname $TEST_DIRECTORY):$PATH # Put git-filter-repo in PATH

DATA="$TEST_DIRECTORY/t9393"
DELETED_SHA="0000000000000000000000000000000000000000" # FIXME: sha256 support

test_expect_success 'a re-run that is treated as a clean slate' '
	test_create_repo clean_slate_rerun &&
	(
		cd clean_slate_rerun &&
		git fast-import --quiet <$DATA/simple &&
		FIRST_ORPHAN=$(git rev-parse orphan-me~1) &&
		FINAL_ORPHAN=$(git rev-parse orphan-me) &&
		FILE_A_CHANGE=$(git rev-list -1 HEAD -- fileA) &&
		FILE_B_CHANGE=$(git rev-list -1 HEAD -- fileB) &&
		FILE_C_CHANGE=$(git rev-list -1 HEAD -- fileC) &&
		FILE_D_CHANGE=$(git rev-list -1 HEAD -- fileD) &&
		ORIGINAL_TAG=$(git rev-parse v1.0) &&
		git filter-repo --invert-paths --path fileB --force &&
		NEW_FILE_C_CHANGE=$(git rev-list -1 HEAD -- fileC) &&
		NEW_FILE_D_CHANGE=$(git rev-list -1 HEAD -- fileD) &&
		FINAL_TAG=$(git rev-parse v1.0) &&
		cat <<-EOF | sort >sha-expect &&
		${FIRST_ORPHAN} ${FIRST_ORPHAN}
		${FINAL_ORPHAN} ${FINAL_ORPHAN}
		${FILE_A_CHANGE} ${FILE_A_CHANGE}
		${FILE_B_CHANGE} ${DELETED_SHA}
		${FILE_C_CHANGE} ${NEW_FILE_C_CHANGE}
		${FILE_D_CHANGE} ${NEW_FILE_D_CHANGE}
		EOF
		printf "%-40s %s\n" old new >expect &&
		cat sha-expect >>expect &&
		test_cmp <(sort expect) <(sort .git/filter-repo/commit-map) &&
		cat <<-EOF | sort -k 3 >sha-expect &&
		${FILE_D_CHANGE} ${NEW_FILE_D_CHANGE} $(git symbolic-ref HEAD)
		${FINAL_ORPHAN} ${FINAL_ORPHAN} refs/heads/orphan-me
		${ORIGINAL_TAG} ${FINAL_TAG} refs/tags/v1.0
		EOF
		printf "%-40s %-40s %s\n" old new ref >expect &&
		cat sha-expect >>expect &&
		test_cmp expect .git/filter-repo/ref-map &&
		touch -t 197001010000 .git/filter-repo/already_ran &&
		echo no | git filter-repo --invert-paths --path fileC --force &&
		FINAL_FILE_D_CHANGE=$(git rev-list -1 HEAD -- fileD) &&
		REALLY_FINAL_TAG=$(git rev-parse v1.0) &&
		cat <<-EOF | sort >sha-expect &&
		${FIRST_ORPHAN} ${FIRST_ORPHAN}
		${FINAL_ORPHAN} ${FINAL_ORPHAN}
		${FILE_A_CHANGE} ${FILE_A_CHANGE}
		${NEW_FILE_C_CHANGE} ${DELETED_SHA}
		${NEW_FILE_D_CHANGE} ${FINAL_FILE_D_CHANGE}
		EOF
		printf "%-40s %s\n" old new >expect &&
		cat sha-expect >>expect &&
		test_cmp <(sort expect) <(sort .git/filter-repo/commit-map) &&
		cat <<-EOF | sort -k 3 >sha-expect &&
		${NEW_FILE_D_CHANGE} ${FINAL_FILE_D_CHANGE} $(git symbolic-ref HEAD)
		${FINAL_ORPHAN} ${FINAL_ORPHAN} refs/heads/orphan-me
		${FINAL_TAG} ${REALLY_FINAL_TAG} refs/tags/v1.0
		EOF
		printf "%-40s %-40s %s\n" old new ref >expect &&
		cat sha-expect >>expect &&
		test_cmp expect .git/filter-repo/ref-map
	)
'

test_done
96 changes: 96 additions & 0 deletions t/t9393/simple
@@ -0,0 +1,96 @@
feature done
# Simple repo with a few files, and two branches with no common history.
# Note that the original-oid directives are very fake, but make it easy to
# track things.
blob
mark :1
original-oid 0000000000000000000000000000000000000001
data 16
file 1 contents

blob
mark :2
original-oid 0000000000000000000000000000000000000002
data 16
file 2 contents

blob
mark :3
original-oid 0000000000000000000000000000000000000003
data 16
file 3 contents

blob
mark :4
original-oid 0000000000000000000000000000000000000004
data 16
file 4 contents

reset refs/heads/orphan-me
commit refs/heads/orphan-me
mark :5
original-oid 0000000000000000000000000000000000000009
author Little O. Me <[email protected]> 1535228562 -0700
committer Little O. Me <[email protected]> 1535228562 -0700
data 8
Initial
M 100644 :1 nuke-me

commit refs/heads/orphan-me
mark :6
original-oid 000000000000000000000000000000000000000A
author Little 'ol Me <me@laptop.(none)> 1535229544 -0700
committer Little 'ol Me <me@laptop.(none)> 1535229544 -0700
data 9
Tweak it
from :5
M 100644 :4 nuke-me

reset refs/heads/master
commit refs/heads/master
mark :7
original-oid 000000000000000000000000000000000000000B
author Little O. Me <[email protected]> 1535229523 -0700
committer Little O. Me <[email protected]> 1535229523 -0700
data 15
Initial commit
M 100644 :1 fileA

commit refs/heads/master
mark :8
original-oid 000000000000000000000000000000000000000C
author Lit.e Me <[email protected]> 1535229559 -0700
committer Lit.e Me <[email protected]> 1535229580 -0700
data 10
Add fileB
from :7
M 100644 :2 fileB

commit refs/heads/master
mark :9
original-oid 000000000000000000000000000000000000000D
author Little Me <[email protected]> 1535229601 -0700
committer Little Me <[email protected]> 1535229601 -0700
data 10
Add fileC
from :8
M 100644 :3 fileC

commit refs/heads/master
mark :10
original-oid 000000000000000000000000000000000000000E
author Little Me <[email protected]> 1535229618 -0700
committer Little Me <[email protected]> 1535229618 -0700
data 10
Add fileD
from :9
M 100644 :4 fileD

tag v1.0
from :10
original-oid 000000000000000000000000000000000000000F
tagger Little John <[email protected]> 1535229637 -0700
data 5
v1.0

done
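
In the fixture above, each `data <n>` line gives the exact byte length of the payload that follows, including its trailing newline (for example, `file 1 contents` plus a newline is 16 bytes). The sketch below shows one way such a blob record could be generated; `blob_record` is a hypothetical helper for illustration, not something shipped with git-filter-repo or its test suite.

```python
def blob_record(mark: int, original_oid: str, content: bytes) -> bytes:
    """Format one fast-import blob record like those in the fixture.

    Hypothetical helper for illustration only.
    """
    header = "blob\nmark :{}\noriginal-oid {}\ndata {}\n".format(
        mark, original_oid, len(content)  # the count is in bytes, not characters
    )
    # The extra newline after the payload is the blank separator line.
    return header.encode("ascii") + content + b"\n"


record = blob_record(1, "0" * 39 + "1", b"file 1 contents\n")
```

Getting the count wrong makes fast-import misread the stream, which is why hand-edited fixtures like this one need the lengths recomputed whenever a payload changes.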
