Something wrong with integer hashes and/or set grow, Cuis is 1000x faster #129
That reminds me of this thread:
https://lists.cuis.st/mailman/archives/cuis-dev/2024-March/008555.html
On Tue, Jan 14, 2025 at 20:38, David T Lewis via Squeak-dev wrote:
Reported by Eliot on squeak-dev:
https://lists.squeakfoundation.org/archives/list/[email protected]/thread/ABGLWILW6YJ2NPAHICWSLCOUIAKXKLFL/
Copied from the original report:
Hi All,
recently the identityHash implementation in the VM has been upgraded.
Open Smalltalk Cog[Spur] VM [CoInterpreterPrimitives
VMMaker.oscog-eem.3500] 5.20250103.0325
platform sources revision VM: 202501030325 [email protected]:oscogvm
Date: Thu Jan 2 19:25:19 2025 CommitHash: 5e05e38
It now uses an xorshift register PRNG that cycles through all possible 2^22
- 1 valid identity hashes in 2^22 - 1 iterations. One way to test this is
to create a set, and loop creating 2^22-1 objects, adding their hashes to
the set, and answering the size of the set.
So let's do this and see how long it takes at the same time, e.g. this is
on my 2021 M1 Max MBP:
| size |
{ [| hashes |
   hashes := Set new: (2 raisedTo: 22).
   1 to: (2 raisedTo: 22) - 1 do:
      [:ign | hashes add: Object new identityHash].
   size := hashes size] timeToRun.
  size.
  (2 raisedTo: 22) - 1 }. #(450 4194303 4194303)
also #(483 4194303 4194303)
So, avoiding grows, this takes less than half a second.
However, if we just use Set new and rely on it growing, we get
approximately a 1,500- to 1,800-fold slowdown:
| size |
{ [| hashes |
   hashes := Set new.
   1 to: (2 raisedTo: 22) - 1 do:
      [:ign | hashes add: Object new identityHash].
   size := hashes size] timeToRun.
  size.
  (2 raisedTo: 22) - 1 }. #(768992 4194303 4194303)
also #(800874 4194303 4194303).
768992 / 483.0 = 1592
800874 / 450.0 = 1780
Now Cuis has moved away from allowing stretchy collections to have their
capacity initialized with new:. One has to use newWithRoomForMoreThan:. So
| size |
{ [| hashes |
   hashes := Set newWithRoomForMoreThan: (2 raisedTo: 22).
   1 to: (2 raisedTo: 22) - 1 do:
      [:ign | hashes add: Object new identityHash].
   size := hashes size] timeToRun.
  size.
  (2 raisedTo: 22) - 1 }. #(506 4194303 4194303)
BUT!!!! If we just use new and allow Cuis to grow the set we get e.g.
| size |
{ [| hashes |
   hashes := Set new.
   1 to: (2 raisedTo: 22) - 1 do:
      [:ign | hashes add: Object new identityHash].
   size := hashes size] timeToRun.
  size.
  (2 raisedTo: 22) - 1 }. #(725 4194303 4194303).
This is at least 1,000 times faster than Squeak. Something is seriously
wrong with the Squeak code.
768992 / 725.0 = 1061
,,,^..^,,,
Happy New Year! Eliot
This time it seems a bit different, though: the bench is not dominated by
GC, but by hash collisions.
Cuis's Set uses the old-fashioned capacity computation:

sizeFor: nElements
   "Large enough size to hold nElements with some slop (see fullCheck)"
   nElements <= 0 ifTrue: [^ 1].
   ^ nElements + 1 * 4 // 3
Squeak's Set uses the more advanced #goodPrimeAtLeast: to determine the
capacity. So we would expect Squeak to be better... But no, it isn't.
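For reference, the load factor the Cuis rule implies is easy to compute. Here is a small Python sketch (a hypothetical translation of sizeFor: above, not code from either image):

```python
def size_for(n_elements: int) -> int:
    """Cuis-style capacity rule, translated from Set class>>sizeFor: above.

    Note that the Smalltalk expression nElements+1*4//3 evaluates binary
    operators left to right, i.e. ((nElements + 1) * 4) // 3, which leaves
    roughly 1/3 of the table as slop.
    """
    if n_elements <= 0:
        return 1
    return (n_elements + 1) * 4 // 3

print(size_for(9))        # 13, so the table is at most about 3/4 full
print(size_for(1 << 22))  # 5592406
```

So Cuis keeps the table at most about three-quarters full, with no primality requirement on the capacity.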
First observe that we have:
   (hashes sorted = (1 to: hashes size) asArray)
Hence, we are trying to put consecutive SmallIntegers into a Set.
If I run Eliot's benchmark, but with gradually growing sizes (1<<18-1),
(1<<19-1), (1<<20-1), (1<<21-1), I get:
'6.87 per second. 146 milliseconds per run. 13.48116 % GC time.'
'3.23 per second. 310 milliseconds per run. 14.60311 % GC time.'
'1.56 per second. 641 milliseconds per run. 15.1462 % GC time.'
'0.087 per second. 11.5 seconds per run. 0.04346 % GC time.'
So the cost grows linearly with size up to 2^20, but 2^21 costs us a
factor of x20 instead of the expected x2, that is 10 times more than
expected, and the GC cost drops as a percentage...
I won't even try (1<<22-1) on my slow machine...
What happens?
Simple: the answer is in SmallInteger>>hash, which is just ^self.
The identity hashes vary from 1 to 1<<22-1.
Up to 1<<20-1 elements, statistically we do not have that many consecutive
hashes. But we have far more consecutive hashes with 1<<21-1 elements
(among 1<<22-1 possible hashes).
And since the candidate index in scanFor: is simply object hash \\ array
size + 1, once we have a collision (and we will, until the array size
grows above 1<<22-2), and since we have already filled consecutive slots
with consecutive hashes, we get long collision chains.
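The effect can be imitated outside Smalltalk. The Python sketch below is not Squeak's actual Set code (the table size, the keys, and the 28-bit multiplicative constant are illustrative assumptions); it only shows how linear probing, as in scanFor:, degenerates when raw hashes pile onto the same slots, which happens while the table is still far smaller than the 2^22 identity-hash range:

```python
def insert_all(keys, table_size, h):
    """Open addressing with linear probing; returns total slot probes."""
    table = [None] * table_size
    probes = 0
    for k in keys:
        i = h(k) % table_size
        probes += 1
        while table[i] is not None:       # walk the collision chain
            i = (i + 1) % table_size
            probes += 1
        table[i] = k
    return probes

def identity_hash(x):
    return x                              # like SmallInteger>>hash ^self

def hash_multiply(x):
    # Multiplicative scrambling in the spirit of hashMultiply; the 28-bit
    # mask and the constant 1664525 are assumptions for this sketch.
    return (x * 1664525) & ((1 << 28) - 1)

n, m = 1000, 2003                         # 2003 is prime; table ends half full
keys = [k * m for k in range(n)]          # every identity hash targets slot 0
bad = insert_all(keys, m, identity_hash)  # 1 + 2 + ... + n = n*(n+1)//2 probes
good = insert_all(keys, m, hash_multiply) # scrambled hashes spread the keys
print(bad, good)
```

With the raw hash, every insertion rescans the whole existing chain, so the total work is quadratic in the number of elements; the multiplicative hash keeps the chains short at the same load factor.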
So why is Cuis better behaved?
Because Integer>>hash returns self hashMultiply...
Note that above 1<<54, consecutive Integer hashes will show quite a few
collisions, because the receiver is first converted with asFloat, and many
consecutive LargeIntegers convert to the same Float.
Just redefine SmallInteger>>hash as ^self hashMultiply and you'll get a
more decent benchmark. Except that you can't do it interactively, because
you have to rehash all hashed collections before Morphic takes back control.
Nicolas
The note about the hash of Integers larger than 1<<54 does apply to Cuis;
I'm not sure that was clear enough.
Hence, if you want to attack Cuis, try:
   [(1<<100+1 to: 1<<100+65536) asSet] bench.
Since all those Integers will have the same hash, the cost of adding is
O(N^2). That's the tribute paid for having (anInteger asFloat = anInteger)
and maintaining equal hashes for equal objects...
In Squeak, we don't have (anInteger asFloat = anInteger), except when
(anInteger isAnExactFloat).
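The collapse is a plain property of IEEE 754 doubles (53-bit significand), so it can be checked in any language. A Python illustration with the same numbers as the bench above:

```python
big = 1 << 100

# Near 2^100, adjacent doubles are 2^48 apart, so all 65536 consecutive
# integers in the interval round to the very same float. A hash that goes
# through asFloat therefore gives them all the same hash value.
floats = {float(big + i) for i in range(1, 65537)}
print(len(floats))                        # 1
print(float(big + 1) == float(big + 2))   # True
```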