Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New hardlinks test failed - false positive? #735

Open
elboulangero opened this issue Mar 5, 2025 · 9 comments · May be fixed by #737
Open

New hardlinks test failed - false positive? #735

elboulangero opened this issue Mar 5, 2025 · 9 comments · May be fixed by #737

Comments

@elboulangero
Copy link

During the build of latest rsync version 3.4.1 on the Debian builders, there was one failure on the architecture ppc64el: https://buildd.debian.org/status/fetch.php?pkg=rsync&arch=ppc64el&ver=3.4.1%2Bds1-1&stamp=1741147156&raw=0

The test that failed was added a short while ago in commit dc34990 (hence CC @rosorio)

I'll reproduce the interesting part of the logs below:

        makepath /build/reproducible-path/rsync-3.4.1+ds1/testtmp/hardlinks/from/sym
        makepath /build/reproducible-path/rsync-3.4.1+ds1/testtmp/hardlinks/to
Running: "/build/reproducible-path/rsync-3.4.1+ds1/rsync  -aH '/build/reproducible-path/rsync-3.4.1+ds1/testtmp/hardlinks/from/sym' '/build/reproducible-path/rsync-3.4.1+ds1/testtmp/hardlinks/to'"
-------------
check how the directory listings compare with diff:

--- /build/reproducible-path/rsync-3.4.1+ds1/testtmp/hardlinks/ls-from	2025-03-05 03:58:55.047862241 +0000
+++ /build/reproducible-path/rsync-3.4.1+ds1/testtmp/hardlinks/ls-to	2025-03-05 03:58:55.051862074 +0000
@@ -1,2 +1,2 @@
-drwxr-xr-x               0    999.999         3 2025-03-05 03:58:54 .
+drwxr-xr-x               0    999.999         3 2025-03-05 03:58:55 .
 drwxr-xr-x               0    999.999         2 2025-03-05 03:58:54 ./sym
-------------
check how the files compare with diff:

-------------
Failed:  dir-diff
----- hardlinks log ends

My understanding is that it's a false positive: makepath created from/sym at 03:58:54, and to at 03:58:55, and so it necessarily shows in the ls diff?

Thanks

@rosorio
Copy link
Contributor

rosorio commented Mar 5, 2025

I dont have this kind of issues building rsync on FreeBSD,
does adding the -N flag (preserve create times) to the rsync command helps ?

@elboulangero
Copy link
Author

in case it was not clear, this is not always reproducible, you have to be pretty unlucky (ie. run the testsuite a bunch of times, enough, until you hit it). And yet I reproduced it yesterday during a build on my machine -- I must be the unlucky type 😢

@elboulangero: My understanding is that it's a false positive: makepath created from/sym at 03:58:54, and to at 03:58:55, and so it necessarily shows in the ls diff?

I was not entirely correct, now I realize that what makes the test flaky is the delay between the moment the directories are created, and the moment the test runs (ie. the checkit function).

To clarify, here's a patch that makes the test fails 100% of the times for me:

--- a/testsuite/hardlinks.test
+++ b/testsuite/hardlinks.test
@@ -81,6 +81,7 @@ diff $diffopt "$name1" "$todir" || test_fail "solo copy of name1 failed"
 # enabled (this has broken in 3.4.0 so far, so we need this test).
 rm -rf "$fromdir" "$todir"
 makepath "$fromdir/sym" "$todir"
+sleep 5
 checkit "$RSYNC -aH '$fromdir/sym' '$todir'" "$fromdir" "$todir"
 
 # The script would have aborted on error, so getting here means we've won.

And here are a few lines of script that I use to reproduce the test manually:

cd <<RSYNC_SOURCE>>
rm -fr foo/ bar/; mkdir -p foo/sym bar
date +'%F %T'
for d in foo bar; do echo "==== $d"; ( cd $d; find . -print | sort | xargs ../tls -l -L ); done
sleep 5; echo
./rsync -aH $(pwd)/foo/sym $(pwd)/bar
date +'%F %T'
for d in foo bar; do echo "==== $d"; ( cd $d; find . -print | sort | xargs ../tls -l -L ); done

Output:

2025-03-06 03:33:58
==== foo
drwxr-xr-x               0      0.0           3 2025-03-06 03:33:58 .
drwxr-xr-x               0      0.0           2 2025-03-06 03:33:58 ./sym
==== bar
drwxr-xr-x               0      0.0           2 2025-03-06 03:33:58 .

2025-03-06 03:34:03
==== foo
drwxr-xr-x               0      0.0           3 2025-03-06 03:33:58 .
drwxr-xr-x               0      0.0           2 2025-03-06 03:33:58 ./sym
==== bar
drwxr-xr-x               0      0.0           3 2025-03-06 03:34:03 .
drwxr-xr-x               0      0.0           2 2025-03-06 03:33:58 ./sym

Above we can see that the mtime for the destination directory is updated by the rsync run, so if there's a delay between the moment the directories are created, and the moment rsync runs, it will mismatch and fail the test.

As for a fix, changing the line:

checkit "$RSYNC -aH '$fromdir/sym' '$todir'" "$fromdir" "$todir"

To:

checkit "$RSYNC -aH '$fromdir/' '$todir/'" "$fromdir" "$todir"

Is enough to fix it. But I don't know if, after changing it this way, it still tests for the regression that it's meant to test.


@rosorio: does adding the -N flag (preserve create times) to the rsync command helps ?

I tried:

rsync: This rsync does not support --crtimes (-N)

Is it specific to BSD? I'm on Linux.

elboulangero added a commit to elboulangero/rsync that referenced this issue Mar 6, 2025
The test was added in dc34990, it turns out that it's flaky. It failed
once on the Debian build infra, cf. [1].

The problem is that the command `rsync -aH '$fromdir/sym' '$todir'`
updates the mod time of `$todir`, so there might be a diff between the
output of `rsync_ls_lR $fromdir` and `rsync_ls_lR $todir`, if ever rsync
runs 1 second (or more) after the directories were created.

To clarify: it's easy to make the test fails 100% of the times with this
change:

```
 makepath "$fromdir/sym" "$todir"
+sleep 5
 checkit "$RSYNC -aH '$fromdir/sym' '$todir'" "$fromdir" "$todir"
```

The fix proposed here is in line with other tests in hardlinks.test, as
far as I can see all the tests use a trailing slash for the `$fromdir`
and `$todir` arguments.

I tested that, after this commit, the test still catches the regression
in rsync 3.4.0.

Fixes: RsyncProject#735

[1]: https://buildd.debian.org/status/fetch.php?pkg=rsync&arch=ppc64el&ver=3.4.1%2Bds1-1&stamp=1741147156&raw=0
@elboulangero
Copy link
Author

elboulangero commented Mar 6, 2025

I don't know if, after changing it this way, it still tests for the regression that it's meant to test.

Answering myself:

I grabbed a 3.4.0 rsync source, applied @rosorio patch dc34990, and I can confirm that the testsuite fails, as expected:

Internal hashtable error: illegal key supplied!

Then I applied #736, and I can confirm that the test still catches the regression:

Internal hashtable error: illegal key supplied!

@rosorio
Copy link
Contributor

rosorio commented Mar 6, 2025

[....]

rsync: This rsync does not support --crtimes (-N)


Is it specific to BSD? I'm on Linux.

It's not completely clear for me, but according with the discussion pointed bellow, on linux, userland does not have write access to inode ctime/crtime. So this option is probably disabled in your build system.

rsync and --crtimes support? -- linuxquestions.org

@rosorio
Copy link
Contributor

rosorio commented Mar 6, 2025

I don't know if, after changing it this way, it still tests for the regression that it's meant to test.

Answering myself:

I grabbed a 3.4.0 rsync source, applied @rosorio patch dc34990, and I can confirm that the testsuite fails, as expected:

Internal hashtable error: illegal key supplied!

Then I applied #736, and I can confirm that the test still catches the regression:

Internal hashtable error: illegal key supplied!

Excellent, I will do the same test to see if it still working on my side.

@rosorio
Copy link
Contributor

rosorio commented Mar 6, 2025

Hi,

I'm sorry but doing the test on rsync-3.4.0 with #376 patch, the breakage with -H is not tested properly and the test pass.

preserve_scratch=yes whichtests=hardlinks.test ./runtests.sh
============================================================
./runtests.sh running in /wrkdirs/usr/ports/net/rsync/work/rsync-3.4.0
    rsync_bin=/wrkdirs/usr/ports/net/rsync/work/rsync-3.4.0/rsync
    srcdir=/wrkdirs/usr/ports/net/rsync/work/rsync-3.4.0
    TLS_ARGS= -l -L
    testuser=root
    os=FreeBSD 1402amd64-head 14.2-RELEASE FreeBSD 14.2-RELEASE amd64
    preserve_scratch=yes
    scratchbase=/wrkdirs/usr/ports/net/rsync/work/rsync-3.4.0/testtmp
PASS    hardlinks
------------------------------------------------------------
----- overall results:
      1 passed
------------------------------------------------------------

And my scratch directories has the following content :

/wrkdirs/usr/ports/net/rsync/work/rsync-3.4.0/testtmp/hardlinks/from
/wrkdirs/usr/ports/net/rsync/work/rsync-3.4.0/testtmp/hardlinks/from/sym
/wrkdirs/usr/ports/net/rsync/work/rsync-3.4.0/testtmp/hardlinks/to
/wrkdirs/usr/ports/net/rsync/work/rsync-3.4.0/testtmp/hardlinks/to/sym

Reverting the patch, the test breaks and scratch has the following content

/wrkdirs/usr/ports/net/rsync/work/rsync-3.4.0/testtmp/hardlinks/from
/wrkdirs/usr/ports/net/rsync/work/rsync-3.4.0/testtmp/hardlinks/from/sym
/wrkdirs/usr/ports/net/rsync/work/rsync-3.4.0/testtmp/hardlinks/to

Edit: fix typo

@elboulangero
Copy link
Author

I'm sorry but doing the test on rsync-3.4.0 with #376 patch, the breakage with -H is not tested properly and the test pass.

You mean #736 I think, not #376 :)

In any case: you're right... I tested again, I can confirm, after the patch #736 the test on rsync-3.4.0 passes, the -H regression is not detected. Sorry about that, something went wrong on my side... Thank you for testing it.

I'm testing another approach.

elboulangero added a commit to elboulangero/rsync that referenced this issue Mar 6, 2025
The test was added in dc34990, it turns out that it's flaky. It failed
once on the Debian build infra, cf. [1].

The problem is that the command `rsync -aH '$fromdir/sym' '$todir'`
updates the mod time of `$todir`, so there might be a diff between the
output of `rsync_ls_lR $fromdir` and `rsync_ls_lR $todir`, if ever rsync
runs 1 second (or more) after the directories were created.

To clarify: it's easy to make the test fails 100% of the times with this
change:

```
 makepath "$fromdir/sym" "$todir"
+sleep 5
 checkit "$RSYNC -aH '$fromdir/sym' '$todir'" "$fromdir" "$todir"
```

With the fix proposed here, we don't use `checkit` anymore, instead we
just run the rsync command, then a simple `diff` to compare the two
directories. This is exactly what the other `-H` test just above does.

In case there's some doubts, `diff` fails if `sym` is missing:

```
$ mkdir -p foo/sym bar
$ diff foo bar || echo KO!
Only in foo: sym
KO!
```

I tested that, after this commit, the test still catches the `-H`
regression in rsync 3.4.0.

Fixes: RsyncProject#735

[1]: https://buildd.debian.org/status/fetch.php?pkg=rsync&arch=ppc64el&ver=3.4.1%2Bds1-1&stamp=1741147156&raw=0
@elboulangero elboulangero linked a pull request Mar 6, 2025 that will close this issue
@elboulangero
Copy link
Author

Hi again, I opened #737. This time it should be good.

@rosorio
Copy link
Contributor

rosorio commented Mar 6, 2025

Hi again, I opened #737. This time it should be good.

Hi,

Yes you're right, thanks for reordering the issue digits :) .

Patch #737 looks good to me, the hardlinks test now breaks on rsync-3.4.0 as expected and passes on 3.4.1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants