Add robot name exercise #123

Open
wants to merge 9 commits into
base: main

ageron
Contributor

@ageron ageron commented Sep 28, 2024

This exercise had no test case specifications, so I didn't create a template.j2 this time and wrote the test file manually instead. The tests cover the core workflow (creating a factory, using it to create a robot, booting the robot, resetting it to factory defaults, etc.). They also check that the generated names have the right format and look sufficiently random, using a few limited statistical tests. Hopefully this will reject very basic "random" name generators, such as one that just increments a counter.
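To illustrate the kind of limited statistical check described above, here is a minimal Python sketch (not the Roc test code itself) of an R²-style coefficient computed between consecutive values, mirroring the r2Coeff helper and the -0.25 threshold quoted later in this thread. The function names and sample data are assumptions made for illustration.

```python
import random

R2_THRESHOLD = -0.25  # same threshold as r2Threshold in the Roc code


def r2_coeff(xs, ys):
    """Return 1 - RSS/TSS, where RSS compares xs to ys element-wise."""
    n = min(len(xs), len(ys))
    xs, ys = xs[:n], ys[:n]
    mean = sum(xs) / n
    tss = sum((x - mean) ** 2 for x in xs)            # total sum of squares
    rss = sum((x - y) ** 2 for x, y in zip(xs, ys))   # residual sum of squares
    return 1.0 - rss / (tss + 1e-10)                  # epsilon avoids dividing by zero


def seems_independent(xs, ys):
    """Independent draws give R² near -1; closely tracking lists give R² near +1."""
    return r2_coeff(xs, ys) < R2_THRESHOLD


# A counter: each value is the previous one plus 1, so consecutive values track closely.
counter = list(range(200))
r2_counter = r2_coeff(counter[:-1], counter[1:])

# Independent draws from a seeded generator (26 possible values, like letters).
rng = random.Random(0)
draws = [rng.randint(0, 25) for _ in range(200)]
r2_random = r2_coeff(draws[:-1], draws[1:])
```

For independent draws with the same variance, the expected squared difference between two values is twice the variance, so RSS/TSS tends to 2 and the coefficient tends to -1. A counter yields a value close to +1 and fails the check.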

I pushed exercism/roc-test-runner#12 to add the roc-random package to the roc-test-runner. The tests for this exercise will only work once that PR has been merged.

@ageron
Contributor Author

ageron commented Oct 2, 2024

It looks like there's a Roc bug causing crashes on Ubuntu (but not on my MacBook), so the tests fail. If I replace the roc-random package with this basic pseudorandom number generator (a Linear Congruential Generator kindly suggested by ChatGPT), everything works fine:

# Random.roc
module [State, u32, seed, Generator]

State a := { state : a }

u32 = \min, max ->
    a = 1664525 # multiplier
    c = 1013904223 # increment
    \@State { state } ->
        newState = (((state |> Num.toU64) * a + c) % 0x100000000) |> Num.toU32
        {
            value: newState % (max - min + 1) + min,
            state: @State { state: newState },
        }

seed : U32 -> State U32
seed = \startSeed -> @State { state: startSeed }

Generator a b : State a -> { value : b, state : State a }

So it looks like the issue is triggered by the roc-random package somehow. I'll file a Roc issue.
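For reference, the recurrence in the Random.roc snippet above is state' = (a * state + c) mod 2^32, with the value mapped into [min, max] by a modulo. A minimal Python sketch of the same arithmetic follows. The name lcg_u32 is an assumption made for illustration, and this is not the algorithm used by roc-random.

```python
A = 1664525       # multiplier, as in the Roc snippet
C = 1013904223    # increment, as in the Roc snippet
M = 2 ** 32       # modulus (0x100000000)


def lcg_u32(lo, hi):
    """Return a generator function: state -> (value in [lo, hi], new_state)."""
    def generate(state):
        new_state = (state * A + C) % M
        value = new_state % (hi - lo + 1) + lo
        return value, new_state
    return generate


# Usage: draw two uppercase letters, threading the state through each call.
letter = lcg_u32(ord("A"), ord("Z"))
first, state1 = letter(0)
second, state2 = letter(state1)
```

The generator is a pure function of its input state, so the same seed always produces the same sequence.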

@ageron
Contributor Author

ageron commented Oct 2, 2024

I've reproduced the issue using the official Roc image based on Ubuntu 22.04. I updated the package to roc-random 0.2.2 and sadly I still get the issue.

As explained above, replacing roc-random with a very basic pseudorandom number generator makes everything work, so the roc-random package seems to be what triggers the issue. From its source code, it appears to use only the Num builtin module. These are the functions it uses:

Num.addChecked
Num.addWrap
Num.bitwiseXor
Num.mulWrap
Num.shiftLeftBy
Num.shiftRightZfBy

Num.intCast

Num.add
Num.sub
Num.rem

Num.toI16
Num.toI32
Num.toI64
Num.toI8
Num.toU16
Num.toU32
Num.toU8

Num.maxI16
Num.maxI32
Num.maxI8
Num.minI16
Num.minI32
Num.minI8

If there's a bug in any of them, I would bet it's in the first group (the checked, wrapping, bitwise, and shift operations), because those are not used that often.

The other possibility is that there's an overflow somewhere that's mishandled. The code seems to deal with values close to the integer bounds.
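For readers unfamiliar with the distinction between the wrapping and checked operations listed above, here is a minimal Python sketch of the two behaviours for 32-bit unsigned addition. It emulates the semantics as I understand them (Num.addWrap wraps around, Num.addChecked reports overflow instead of returning a wrapped value). The Python function names are assumptions for illustration, and None stands in for Roc's error result.

```python
U32_MAX = 2 ** 32 - 1


def add_wrap_u32(a, b):
    """Wrapping add: on overflow the result wraps around modulo 2**32."""
    return (a + b) & U32_MAX


def add_checked_u32(a, b):
    """Checked add: return None on overflow instead of a wrapped result."""
    total = a + b
    return total if total <= U32_MAX else None


# Near the upper bound, the two operations disagree.
wrapped = add_wrap_u32(U32_MAX, 1)       # wraps to 0
checked = add_checked_u32(U32_MAX, 1)    # None: overflow detected
```

A bug in either path would only show up for operands near the integer bounds, which is the region the paragraph above points at.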

Otherwise, the only vaguely "exotic" thing I see is this:

Generator uint value : State uint -> Generation uint value

## A pseudorandom value, paired with its [Generator]'s output state (for chaining)
Generation uint value : { value : value, state : State uint }

## Internal state for [Generator]s
State uint := { s : uint, c : AlgorithmConstants uint }

The Generator type alias is a function type with two type parameters, one of which is also passed as the parameter of an opaque type (State). This might be a combination that the compiler struggles with somehow?
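The shape described above, a function from a state to a value paired with the next state, can be sketched in Python as follows. This is only an illustration of the state-threading pattern. The counting generator is a deliberately trivial, hypothetical stand-in, not roc-random's algorithm, and all names are assumptions.

```python
from typing import Callable, Tuple

State = int
Generator = Callable[[State], Tuple[int, State]]  # state -> (value, next state)


def counting(lo: int, hi: int) -> Generator:
    """Toy generator: cycles through [lo, hi] and increments the state."""
    span = hi - lo + 1

    def generate(state: State) -> Tuple[int, State]:
        return lo + state % span, state + 1

    return generate


def random_string(state: State, generator: Generator, length: int) -> Tuple[str, State]:
    """Call the generator `length` times, threading the state through each call."""
    chars = []
    for _ in range(length):
        value, state = generator(state)
        chars.append(chr(value))
    return "".join(chars), state


letters, state = random_string(0, counting(ord("A"), ord("Z")), 2)
digits, state = random_string(state, counting(ord("0"), ord("9")), 3)
name = letters + digits  # "AB234" with this toy generator
```

Each call returns the updated state explicitly, which is why the final state has to be stored back after every generated name.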

Any help resolving this issue would be much appreciated. 🙏

@Anton-4
Contributor

Anton-4 commented Oct 4, 2024

Quick minimization:

app [main] {
    pf: platform "https://github.com/roc-lang/basic-cli/releases/download/0.15.0/SlwdbJ-3GR7uBWQo6zlmYWNYOxnvo8r6YABXD-45UOw.tar.br",
    rand: "https://github.com/lukewilliamboswell/roc-random/releases/download/0.2.2/cfMw9d_uxoqozMTg7Rvk-By3k1RscEDoR1sZIPVBRKQ.tar.br",
}

import rand.Random
import pf.Stdout

main =
    # Characters in consecutive names should not be correlated
    # we truncate the list to speed up the tests
    truncatedNames0 = manyNames0 |> List.takeFirst correlationSampleSize
    truncatedNames1 = manyNames0 |> List.dropFirst 1 |> List.takeFirst correlationSampleSize
    [0, 1, 2, 3, 4]
    |> List.joinMap \index1 -> [0, 1, 2, 3, 4] |> List.map \index2 -> (index1, index2)
    |> List.all \(index1, index2) ->
        maybeChars = truncatedNames0 |> List.mapTry \chars -> chars |> List.get index1
        maybeCharsNext = truncatedNames1 |> List.mapTry \chars -> chars |> List.get index2
        maybeChars |> seemsIndependentEnoughFrom maybeCharsNext
    |> Inspect.toStr
    |> Stdout.line

## A factory is used to create robots, and hold state such as the existing robot
## names and the current random state
Factory := {
    existingNames : Set Str,
    state : Random.State U32,
}

## A robot must either have no name or a name composed of two letters followed
## by three digits
Robot := {
    maybeName : Result Str [NoName],
    factory : Factory,
}

createFactory : { seed : U32 } -> Factory
createFactory = \{ seed } ->
    @Factory { state: Random.seed seed, existingNames: Set.empty {} }

createRobot : Factory -> Robot
createRobot = \factory ->
    @Robot { maybeName: Err NoName, factory }

boot : Robot -> Robot
boot = \robot ->
    when robot |> getName is
        Ok _ -> robot
        Err NoName -> robot |> generateRandomName

getName : Robot -> Result Str [NoName]
getName = \@Robot { maybeName } ->
    maybeName

getFactory : Robot -> Factory
getFactory = \@Robot { factory } ->
    factory

generateRandomName : Robot -> Robot
generateRandomName = \@Robot { maybeName, factory } ->
    (@Factory { state, existingNames }) = factory
    { updatedState, string: twoLetters } = randomString { state, generator: Random.u32 'A' 'Z', length: 2 }
    { updatedState: updatedState2, string: threeDigits } = randomString { state: updatedState, generator: Random.u32 '0' '9', length: 3 }
    possibleName = "$(twoLetters)$(threeDigits)"

    if existingNames |> Set.contains possibleName then
        numberOfPossibleNames = 26 * 26 * 10 * 10 * 10
        if existingNames |> Set.len == numberOfPossibleNames then
            # better crash than run into an infinite loop
            crash "Too many robots, we have run out of possible names!"
        else
            updatedFactory = @Factory { existingNames, state: updatedState2 }
            generateRandomName (@Robot { maybeName, factory: updatedFactory })
    else
        updatedFactory = @Factory {
            existingNames: existingNames |> Set.insert possibleName,
            state: updatedState2,
        }
        @Robot { maybeName: Ok possibleName, factory: updatedFactory }

randomString : { state : Random.State U32, generator : Random.Generator U32 U32, length : U64 } -> { updatedState : Random.State U32, string : Str }
randomString = \{ state, generator, length } ->
    List.range { start: At 0, end: Before length }
    |> List.walk { state, characters: [] } \walk, _ ->
        random = generator walk.state
        updatedState = random.state
        characters = walk.characters |> List.append (random.value |> Num.toU8)
        { state: updatedState, characters }
    |> \{ state: updatedState, characters } ->
        when characters |> Str.fromUtf8 is
            Ok string -> { updatedState, string }
            Err (BadUtf8 _ _) -> crash "Unreachable: characters are all ASCII"


### Next we will try to ensure that the random names are sufficiently diverse.
### For this, we will first create many robot names.

## Create many robots using a given random seed, and return their names
## encoded using Str.toUtf8.
## The default quantity is 1,000, which is enough to offer strong statistical
## guarantees in the tests below, for example the probability that any letter
## or digit is absent from all names is negligible.
generateRobotNames : { seed : U32, quantity ? U64 } -> List (List U8)
generateRobotNames = \{ seed, quantity ? 1000 } ->
    factory = createFactory { seed }
    List.range { start: At 0, end: Before quantity }
    |> List.walk { names: [], factory } \state, _ ->
        robot = state.factory |> createRobot |> boot
        nameUtf8 =
            when robot |> getName is
                Ok name -> name |> Str.toUtf8
                Err NoName -> crash "A robot must have a name after the first boot"
        {
            names: state.names |> List.append nameUtf8,
            factory: robot |> getFactory,
        }
    |> .names

## many random robot names based on seed 0
manyNames0 : List (List U8)
manyNames0 = generateRobotNames { seed: 0 }

### Finally, we will try to ensure that the characters are not linearly
### correlated within each name or across consecutive names. This does not
### guarantee that the names are truly random, but at least it should rule out
### many types of non-random sequences (e.g., such as simply incrementing a
### counter).

## Convert a list of integers to F64s
toFloats : List (Num *) -> List F64
toFloats = \numbers ->
    numbers |> List.map Num.toF64

## The R² coefficient, also known as the coefficient of determination,
## measures how closely numbers2 tracks numbers1 element-wise.
## It ranges from -∞ to +1.0.
## When the differences between the lists are small relative to the spread
## of numbers1, R² approaches +1.0.
## When both lists are long and independently drawn from the same random
## distribution, R² approaches -1.0.
r2Coeff : List F64, List F64 -> F64
r2Coeff = \numbers1, numbers2 ->
    length = numbers1 |> List.len |> Num.toF64
    mean = numbers1 |> List.sum |> Num.div length
    subtractMean = \val -> val - mean
    square = \val -> val * val
    # Total sum of squares (TSS)
    tss = numbers1 |> List.map subtractMean |> List.map square |> List.sum
    # Residual sum of squares (RSS)
    rss = numbers1 |> List.map2 numbers2 Num.sub |> List.map square |> List.sum
    epsilon = 1e-10 # to avoid division by zero
    1.0 - rss / (tss + epsilon)

# To speed up the correlation tests, we truncate the list of names
correlationSampleSize = 200

# It's not impossible for the random characters to be correlated by chance,
# but given 200 letters or digits, the probability that the correlation
# coefficient ends up greater than this threshold is negligible
r2Threshold = -0.25

seemsIndependentEnoughFrom = \maybeChars1, maybeChars2 ->
    when (maybeChars1, maybeChars2) is
        (Ok chars1, Ok chars2) ->
            r2Coeff (chars1 |> toFloats) (chars2 |> toFloats) < r2Threshold

        _ -> Bool.false # unreachable if names are 5 chars long

Next I ran ./target/debug/roc build examples/helloWorld.roc --linker=legacy, then ran the resulting binary under Valgrind:

❯ valgrind ./examples/helloWorld 
==66227== Memcheck, a memory error detector
==66227== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==66227== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==66227== Command: ./examples/helloWorld
==66227== 
==66227== Invalid read of size 8
==66227==    at 0x141EC4: ??? (roc_app:0)
==66227==    by 0x142BDB: #Attr_#generic_rc_by_ref_3_dec (roc_app:0)
==66227==    by 0x142BDB: ??? (roc_app:0)
==66227==    by 0x142D2C: ??? (roc_app:0)
==66227==    by 0x1369E4: List_iterHelp_7f755070a828d5b0991fd43175b6036d1cb7ecf871d1b823a8797b553d92bc (roc_app:0)
==66227==    by 0x1412DD: List_iterate_5eb6a2599d3097c754d93922b84522fd22c626afbfca9a48d724fa1945e3ca9 (roc_app:0)
==66227==    by 0x136798: List_all_c1becff895d98252d67f432f42e71e6c36db1525fb1c2935030e1f76862 (roc_app:0)
==66227==    by 0x133FB9: #UserApp_main_4c76596ba8468090c17ec41cffb26165180de983edd4ec9ae381e395d43ed6 (roc_app:0)
==66227==    by 0x13C5E5: _mainForHost_74e539aad6ec9b39b5d11cddb1f5bc7530ae1c63102d2ee576fb29be2b2bf8cf (roc_app:0)
==66227==    by 0x13C609: roc__mainForHost_1_exposed_generic (roc_app:0)
==66227==    by 0x1A5343: rust_main (in /home/username/gitrepos/roc-with-nix3/roc/examples/helloWorld)
==66227==    by 0x488110D: (below main) (in /nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5/lib/libc.so.6)
==66227==  Address 0x4c216b0 is 0 bytes inside a block of size 72 free'd
==66227==    at 0x48469E4: free (in /nix/store/78zxvxfy6klvwfc2s95y71y0b284fd6v-valgrind-3.22.0/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==66227==    by 0x141F02: ??? (roc_app:0)
==66227==    by 0x13A222: #Attr_#generic_rc_by_ref_3_dec (roc_app:0)
==66227==    by 0x13A222: List_sublistLowlevel_131fc9d292b7c25af42a6c6deb3979c2144f1a7423d39eb46aef237b8f774b (roc_app:0)
==66227==    by 0x133EC8: #UserApp_main_4c76596ba8468090c17ec41cffb26165180de983edd4ec9ae381e395d43ed6 (roc_app:0)
==66227==    by 0x13C5E5: _mainForHost_74e539aad6ec9b39b5d11cddb1f5bc7530ae1c63102d2ee576fb29be2b2bf8cf (roc_app:0)
==66227==    by 0x13C609: roc__mainForHost_1_exposed_generic (roc_app:0)
==66227==    by 0x1A5343: rust_main (in /home/username/gitrepos/roc-with-nix3/roc/examples/helloWorld)
==66227==    by 0x488110D: (below main) (in /nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5/lib/libc.so.6)
==66227==  Block was alloc'd at
==66227==    at 0x484376B: malloc (in /nix/store/78zxvxfy6klvwfc2s95y71y0b284fd6v-valgrind-3.22.0/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==66227==    by 0x13F322: List_reserve_5d84da6abaf677d342986d45e3605cfd5bd1528ee5196616226adfb513950 (roc_app:0)
==66227==    by 0x1318D0: List_append_9d949d48d48bd9629a662ead3e2fc9728ebe6544f0834563102ca492bac0 (roc_app:0)
==66227==    by 0x13F00A: #UserApp_69_dcb273fb5185f78cd65b7adda9732f8bba62aec766939b09e25f2a158a74543 (roc_app:0)
==66227==    by 0x1413E3: List_walkHelp_405afdd54e1519581be52a8c8268ff856d1e183e22d746e36b8e1e892557df7 (roc_app:0)
==66227==    by 0x132FCC: List_walk_1e4d2f1e6b4984301a1489b71481ade3a818d1fae80b8f87ea525c7bff923 (roc_app:0)
==66227==    by 0x135276: #UserApp_randomString_505ee8ebb90affa4bd2deabd9435fb89b29e5cff14c44acccc1c3ac721361 (roc_app:0)
==66227==    by 0x135504: #UserApp_generateRandomName_b96090d699017c67f1c33d23f5f5b60732db46124d7bead69858245e5f4 (roc_app:0)
==66227==    by 0x13C2DC: #UserApp_boot_8e6f801bf53c86630ef8be4d8c3ee15f80c78c69d3ce8031062a6e529ee (roc_app:0)
==66227==    by 0x13B108: #UserApp_84_1d346aa605d9beef3d3c8b59d397858b899a3c7efdcede8d0efd4953334f936 (roc_app:0)
==66227==    by 0x13DFBE: List_walkHelp_28b81340646419744ffe2153acaa8e39d3c2d10c2a51eb5702318112c7c5 (roc_app:0)
==66227==    by 0x135E70: List_walk_dba168b77b24c6e793bab73fde9e33d53bc4c848cefe7b7fe931876bedf73d (roc_app:0)
==66227== 
==66227== Invalid write of size 8
==66227==    at 0x141EE3: ??? (roc_app:0)
==66227==    by 0x142BDB: #Attr_#generic_rc_by_ref_3_dec (roc_app:0)
==66227==    by 0x142BDB: ??? (roc_app:0)
==66227==    by 0x142D2C: ??? (roc_app:0)
==66227==    by 0x1369E4: List_iterHelp_7f755070a828d5b0991fd43175b6036d1cb7ecf871d1b823a8797b553d92bc (roc_app:0)
==66227==    by 0x1412DD: List_iterate_5eb6a2599d3097c754d93922b84522fd22c626afbfca9a48d724fa1945e3ca9 (roc_app:0)
==66227==    by 0x136798: List_all_c1becff895d98252d67f432f42e71e6c36db1525fb1c2935030e1f76862 (roc_app:0)
==66227==    by 0x133FB9: #UserApp_main_4c76596ba8468090c17ec41cffb26165180de983edd4ec9ae381e395d43ed6 (roc_app:0)
==66227==    by 0x13C5E5: _mainForHost_74e539aad6ec9b39b5d11cddb1f5bc7530ae1c63102d2ee576fb29be2b2bf8cf (roc_app:0)
==66227==    by 0x13C609: roc__mainForHost_1_exposed_generic (roc_app:0)
==66227==    by 0x1A5343: rust_main (in /home/username/gitrepos/roc-with-nix3/roc/examples/helloWorld)
==66227==    by 0x488110D: (below main) (in /nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5/lib/libc.so.6)
==66227==  Address 0x4c216b0 is 0 bytes inside a block of size 72 free'd
==66227==    at 0x48469E4: free (in /nix/store/78zxvxfy6klvwfc2s95y71y0b284fd6v-valgrind-3.22.0/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==66227==    by 0x141F02: ??? (roc_app:0)
==66227==    by 0x13A222: #Attr_#generic_rc_by_ref_3_dec (roc_app:0)
==66227==    by 0x13A222: List_sublistLowlevel_131fc9d292b7c25af42a6c6deb3979c2144f1a7423d39eb46aef237b8f774b (roc_app:0)
==66227==    by 0x133EC8: #UserApp_main_4c76596ba8468090c17ec41cffb26165180de983edd4ec9ae381e395d43ed6 (roc_app:0)
==66227==    by 0x13C5E5: _mainForHost_74e539aad6ec9b39b5d11cddb1f5bc7530ae1c63102d2ee576fb29be2b2bf8cf (roc_app:0)
==66227==    by 0x13C609: roc__mainForHost_1_exposed_generic (roc_app:0)
==66227==    by 0x1A5343: rust_main (in /home/username/gitrepos/roc-with-nix3/roc/examples/helloWorld)
==66227==    by 0x488110D: (below main) (in /nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5/lib/libc.so.6)
==66227==  Block was alloc'd at
==66227==    at 0x484376B: malloc (in /nix/store/78zxvxfy6klvwfc2s95y71y0b284fd6v-valgrind-3.22.0/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==66227==    by 0x13F322: List_reserve_5d84da6abaf677d342986d45e3605cfd5bd1528ee5196616226adfb513950 (roc_app:0)
==66227==    by 0x1318D0: List_append_9d949d48d48bd9629a662ead3e2fc9728ebe6544f0834563102ca492bac0 (roc_app:0)
==66227==    by 0x13F00A: #UserApp_69_dcb273fb5185f78cd65b7adda9732f8bba62aec766939b09e25f2a158a74543 (roc_app:0)
==66227==    by 0x1413E3: List_walkHelp_405afdd54e1519581be52a8c8268ff856d1e183e22d746e36b8e1e892557df7 (roc_app:0)
==66227==    by 0x132FCC: List_walk_1e4d2f1e6b4984301a1489b71481ade3a818d1fae80b8f87ea525c7bff923 (roc_app:0)
==66227==    by 0x135276: #UserApp_randomString_505ee8ebb90affa4bd2deabd9435fb89b29e5cff14c44acccc1c3ac721361 (roc_app:0)
==66227==    by 0x135504: #UserApp_generateRandomName_b96090d699017c67f1c33d23f5f5b60732db46124d7bead69858245e5f4 (roc_app:0)
==66227==    by 0x13C2DC: #UserApp_boot_8e6f801bf53c86630ef8be4d8c3ee15f80c78c69d3ce8031062a6e529ee (roc_app:0)
==66227==    by 0x13B108: #UserApp_84_1d346aa605d9beef3d3c8b59d397858b899a3c7efdcede8d0efd4953334f936 (roc_app:0)
==66227==    by 0x13DFBE: List_walkHelp_28b81340646419744ffe2153acaa8e39d3c2d10c2a51eb5702318112c7c5 (roc_app:0)
==66227==    by 0x135E70: List_walk_dba168b77b24c6e793bab73fde9e33d53bc4c848cefe7b7fe931876bedf73d (roc_app:0)
==66227== 
Bool.true
==66227== 
==66227== HEAP SUMMARY:
==66227==     in use at exit: 1,024 bytes in 1 blocks
==66227==   total heap usage: 12,375 allocs, 12,374 frees, 40,025,152 bytes allocated
==66227== 
==66227== LEAK SUMMARY:
==66227==    definitely lost: 0 bytes in 0 blocks
==66227==    indirectly lost: 0 bytes in 0 blocks
==66227==      possibly lost: 0 bytes in 0 blocks
==66227==    still reachable: 1,024 bytes in 1 blocks
==66227==         suppressed: 0 bytes in 0 blocks
==66227== Rerun with --leak-check=full to see details of leaked memory
==66227== 
==66227== For lists of detected and suppressed errors, rerun with: -s
==66227== ERROR SUMMARY: 1598 errors from 2 contexts (suppressed: 0 from 0)

@Anton-4
Contributor

Anton-4 commented Oct 4, 2024

I've got some things to take care of before I can get to this, but I recommend:

  • further minimization of code that produces the core dump
  • try to find a small change that prevents the core dump
  • compare both versions using IDA free, as described here.
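The first suggestion, further minimization, can be partly automated. Below is a minimal Python sketch of a greedy line-removal reducer. It assumes a hypothetical predicate still_fails that, in practice, would write the candidate lines to a file, build it, run it, and report whether the memory error still occurs. That predicate is not implemented here and is replaced by a toy stand-in.

```python
def minimize_lines(lines, still_fails):
    """Greedily drop one line at a time while still_fails(candidate) stays true."""
    current = list(lines)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                current = candidate   # keep the smaller failing version
                changed = True
                break                 # restart the scan on the reduced input
    return current


# Toy stand-in: the "failure" needs both of these lines to be present.
def toy_still_fails(candidate):
    return "List.takeFirst" in candidate and "List.all" in candidate


source = ["import rand.Random", "List.takeFirst", "Set.insert", "List.all", "Stdout.line"]
reduced = minimize_lines(source, toy_still_fails)
```

The result is minimal only in the sense that removing any single remaining line makes the failure disappear, and a real predicate would also need to reject candidates that no longer compile.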

@ageron
Contributor Author

ageron commented Oct 5, 2024

That's great, thanks @Anton-4! It's especially useful to see how you proceed in such cases, as I've never used Valgrind. I'll follow your suggestions. 👍
