help says that node_list is an option for the scheduler section but it doesn't exist #772

Open
cadejager opened this issue Jun 14, 2024 · 1 comment

Comments

@cadejager
Collaborator

ro-rfe3:cpu-testing/tests:cpu-testing$ pav show sched --vars slurm
 Variables for the slurm scheduler plugin.
-----------------+----------+---------------------------------------+------------------------------------------------------------------
 Name            | Deferred | Example                               | Help
-----------------+----------+---------------------------------------+------------------------------------------------------------------
 chunk_ids       | False    | ['0', '1', '2', '3']                  | A list of indices of the available chunks.
 chunk_size      | False    |                                       | The size of each chunk.
 errors          | False    | ['oh no, there was an error.']        | Return the list of retrieval errors encountered when using this
                 |          |                                       | var_dict. Key errors are not included.
 min_cpus        | False    | 4                                     | Get a minimum number of cpus available on each (filtered) noded.
                 |          |                                       | Defaults to 1 if unknown.
 min_mem         | False    | 1000                                  | Get a minimum for any node across each (filtered) nodes. Returns
                 |          |                                       | a value in bytes (4 GB if unknown).
 node_list       | False    | ['node01', 'node03', 'node04']        | The list of node names that the test could run on, after
                 |          |                                       | filtering, as per the 'nodes' variable.
 node_list_id    | False    |                                       | Return the node list id, if available. This is meaningless to
                 |          |                                       | test configs, but is used internally by Pavilion.
 nodes           | False    | 1                                     | The number of nodes that a test may run on, after filtering
                 |          |                                       | according to the test's 'schedule' section. The actual nodes
                 |          |                                       | selected for the test, whether selected by Pavilion or the
                 |          |                                       | scheduler itself, will be in 'test_nodes'.
 partition       | False    |                                       | This variable provides extra status info for a test. It is
                 |          |                                       | particularly meant to be overridden by plugins.
 requested_nodes | False    |                                       | Number of requested nodes.
 tasks_per_node  | True     | 5                                     | The number of tasks to create per node. If the scheduler does
                 |          |                                       | not support node info, just returns 1.
 tasks_total     | True     | 180                                   | The total number of tasks for the job, either as defined by
                 |          |                                       | 'tasks' or by the tasks_per_node and number of nodes.
 test_cmd        | True     | srun -N 5 -w node[05-10],node23 -n 20 | Calls the actual test command and then wraps the return with the
                 |          |                                       | wrapper provided in the schedule section of the configuration.
 test_min_cpus   | True     | 4                                     | The min cpus for each node in the chunk. Defaults to 1 if no
                 |          |                                       | info is available.
 test_min_mem    | True     | 32                                    | The min memory for each node in the chunk in bytes. Defaults to
                 |          |                                       | 4 GB if no info is available.
 test_node_list  | True     | ['node02', 'node04']                  | The list of nodes by name allocated for this test. Note that
                 |          |                                       | more nodes than this may exist in the allocation.
 test_nodes      | True     | 45                                    | The number of nodes for this specific test, determined once the
                 |          |                                       | test has an allocation. Note that the allocation size may be
                 |          |                                       | larger than this number.
ro-rfe3:cpu-testing/tests:cpu-testing$ pav run dgemm.ex
Created Test Series cmdline.
Error loading test configs for test set 'dgemm.ex'
dgemm.ex - Test 'ex' in suite '/users/dejager/test/cpu-testing/tests/dgemm.yaml' has an error.
See 'pav show test_config' for the pavilion test config format.
  Invalid config key 'node_list' given under 'schedule'.
  Did you mean one of these?
    include_nodes - "Nodes to always include in every allocation on which this te..."
    min_nodes - "The minimum number of nodes to allocate. This is only suppor..."
    node_state - "Filter nodes based on their current state. Options are 'up' ..."
  Config elements under this one have similar keys:
  schedule:
    chunking:
      node_selection:     # Determines how Pavilion chooses nodes for each chunk. Chunks...
Error making tests for series 's141'.
  Error creating tests for test set dgemm.ex.
    Test 'ex' in suite '/users/dejager/test/cpu-testing/tests/dgemm.yaml' has an error.
    See 'pav show test_config' for the pavilion test config format.
      Invalid config key 'node_list' given under 'schedule'.
      Did you mean one of these?
        include_nodes - "Nodes to always include in every allocation on which this te..."
        min_nodes - "The minimum number of nodes to allocate. This is only suppor..."
        node_state - "Filter nodes based on their current state. Options are 'up' ..."
      Config elements under this one have similar keys:
      schedule:
        chunking:
          node_selection:     # Determines how Pavilion chooses nodes for each chunk. Chunks...
ro-rfe3:cpu-testing/tests:cpu-testing$
@cadejager
Collaborator Author

In my test config I added the following:

  schedule:
    nodes: 1
    node_list: ['nid001216','nid001217']
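
For comparison, the table printed by `pav show sched --vars slurm` lists scheduler variables rather than keys accepted under `schedule`, which may be why `node_list` is rejected there. Going by the suggestions in the error output, `include_nodes` looks like the closest valid key. A minimal sketch, assuming `include_nodes` accepts a list of node names (the truncated help text does not confirm this) and assuming scheduler variables are referenced as `{{sched.<name>}}` in the run section:

```yaml
ex:
  schedule:
    nodes: 1
    # Assumption: include_nodes takes a list of node names.
    # 'node_list' is not a valid key here, per the error output.
    include_nodes: ['nid001216', 'nid001217']

  run:
    cmds:
      # Assumption: scheduler variables are read, not set, via 'sched.'
      - 'echo "Allocated nodes: {{sched.test_node_list}}"'
```

Running `pav show test_config`, as the error message suggests, should show which of these keys the installed version actually supports.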
