Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Something wrong with integer hashes and/or set grow, Cuis is 1000x faster #129

Open
dtlewis290 opened this issue Jan 14, 2025 · 3 comments

Comments

@dtlewis290
Copy link
Member

Reported by Eliot on squeak-dev:
https://lists.squeakfoundation.org/archives/list/[email protected]/thread/ABGLWILW6YJ2NPAHICWSLCOUIAKXKLFL/

Copied from the original report:

Hi All,

   recently the identityHash implementation in the VM has been upgraded. 

Open Smalltalk Cog[Spur] VM [CoInterpreterPrimitives VMMaker.oscog-eem.3500] 5.20250103.0325
platform sources revision VM: 202501030325 [email protected]:oscogvm Date: Thu Jan 2 19:25:19 2025 CommitHash: 5e05e38

It now uses an xorshift register PNG that cycles through all possible 2^22 - 1 valid identity hashes in 2^22 - 1 iterations. One way to test this is to create a set, and loop creating 2^22-1 objects, adding their hashes to the set, and answering the size of the set.

So let's do this and see how long it takes at the same time. e.g. this is on my 2021 M1 Max MBP

| size |
{ [| hashes |
    hashes := Set new: (2 raisedTo: 22).
    1 to: (2 raisedTo: 22) - 1 do:
        [:ign| hashes add: Object new identityHash].
    size := hashes size] timeToRun.
    size.
    (2 raisedTo: 22) - 1 }. #(450 4194303 4194303)

also #(483 4194303 4194303)

So avoiding grows this takes less than half a second.

However, if we just use Set new and rely on it growing we get approximately a 1,500 to 1,800 fold slowdown:

| size |
{ [| hashes |
    hashes := Set new.
    1 to: (2 raisedTo: 22) - 1 do:
        [:ign| hashes add: Object new identityHash].
    size := hashes size] timeToRun.
    size.
    (2 raisedTo: 22) - 1 }. #(768992 4194303 4194303)

also #(800874 4194303 4194303).

768992 / 483.0 = 1592
800874 / 450.0 1780

Now Cuis has moved away from allowing stretchy collections to have their capacity initialized with new:. One has to use newWithRoomForMoreThan:. So

| size |
{ [| hashes |
hashes := Set newWithRoomForMoreThan: (2 raisedTo: 22).
1 to: (2 raisedTo: 22) - 1 do:
[:ign| hashes add: Object new identityHash].
size := hashes size] timeToRun.
size.
(2 raisedTo: 22) - 1 }. #(506 4194303 4194303)

BUT!!!! If we just use new and allow Cuis to grow the set we get e.g.

| size |
{ [| hashes |
hashes := Set new.
1 to: (2 raisedTo: 22) - 1 do:
[:ign| hashes add: Object new identityHash].
size := hashes size] timeToRun.
size.
(2 raisedTo: 22) - 1 }. #(725 4194303 4194303) .

This is at least 1,000 times faster than Squeak.  Something is seriously wrong with the Squeak code.

768992 / 725.0 = 1061

,,,^..^,,,
Happy New Year! Eliot

@squeak-smalltalk-bot
Copy link

squeak-smalltalk-bot commented Jan 14, 2025 via email

@squeak-smalltalk-bot
Copy link

squeak-smalltalk-bot commented Jan 14, 2025 via email

@squeak-smalltalk-bot
Copy link

squeak-smalltalk-bot commented Jan 14, 2025 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants