Notes from meeting in Aachen, April 2016

Dear friends of recog,

in the first week of April, there was a small meeting in Aachen where the future of recog was discussed. Present were

Alice Niemeyer
Frank Lübeck
Max Horn (who wrote these notes)
Sergio Siccha (Alice' PhD student)

Our goal was to figure out the current state of the GAP recog package, and to discuss how the project can continue. During the discussions, we were joined by Alice' PhD student Sergio Siccha.

Below the outcome of our discussion is summarized, and a long list of open tasks is given. Our hope is that with this email, and some additional activities, we can infuse this currently dormant project with some new life, motivating some of you to actively contribute.

DISCLAIMER: This summary was written by me (Max Horn). I tried my best to make the information in this document accurate, and to adequately represent what was said, but I may have made mistakes. My apologies for sending it out only now: This was meant to go out in late April / early May, but I only now (July 31) realized that I never sent it (when trying to talk to people about it Oberwolfach).

What is there?

The GAP packages recog and recogbase (from now on collectively referred to as "recog") already implement some important aspects of group recognition, including the following:

a generic, highly flexible framework for constructive and non-constructive recognition of matrix, permutation and black box groups;
constructive recognition of SL(n,F_q);
non-constructive recognition of classical groups, and sporadics;
functionality to use the result of a recognition run to perform things like membership tests and computing the group size

On the downside, many things are missing, including

constructive recognition of other groups, e.g.
- orthogonal groups
- symplectic groups
- exceptional groups
comprehensive documentation on various aspects.
verification, and a framework to implement it (more on that below)
better introspection into the result of the recognition. E.g. if a recognition step is about applying a base change, make it possible to extract the base change
and on the long run, the ability to rewrite the recognition info record to follow a "nice" series (chief series) to enable further algorithms, like the soluble radical method

Both of the above lists are certainly incomplete. Indeed, one of the tasks on the big TODO list is to identify what exactly is there and what is missing...

Goals, and how to help

In any case, we strongly believe that "completing" recog is highly important for computational group theory.

The following lists describe many open tasks that need to be done. If you are interested in helping out with any of them, please do so! But please first refer to the How to contribute page for some additional technical remarks.

A critical task here is to improve the documentation for adding new recognition methods. This should become as easy as possible, to allow people to contribute here without having to understand all of recog perfectly.

Improve documentation for users and developers

Goal: The recog manual should cater for both users and developers
- Users who just want to use it to work with groups, and who don't really want to be bothered by all the nitty-gritty details.
- Developers need to know how everything fits together

Some specific thoughts:

User documentation:

at the start of the manual, describe only the high level commands we think are interesting for "casual" users. Follow this by several illuminating examples. Chapter 5 of the recogbase manual goes a good way towards this goal, but one might argue that it should better be inside of the recog manual. Or perhaps in both?
During our discussions, it became quite clear that we had quite different views of what a "typical" user of recog might want to do. We identified two distinct use cases, and there are probably more. We would like to document what people expect from recog, as that helps deciding how to continue development. So please let us know if you have other wishes that are not covered here.
- Max needs to recognize millions of matrix groups that are the output of some other computation. He needs to perform orbit-stabilizer computations with the groups. In order to make these efficient, he needs to be able to perform membership tests, and compute sizes of groups. All of this is already perfectly covered by recog (though it could be faster... but that's always true for everything, I guess ;-)
- Alice would like much more mathematical information about the outcome of the recognition process. E.g. what Aschbacher class is a given group in? Depending on the class, provided a documented interface to extract relevant information, e.g.:
  - if a base change is involved, retrieve the base change matrices;
  - retrieve invariant subspaces
  - allow extracting "proof" that the group is in the given class; and/or that "why" a given node in the composition tree is "correct"
    - e.g. for reducible case, give proper non-trivial subspace
    - if group contains SL, give matrices in the group which generate one;
    - ...
  - provide "nice" user interface to this proof machinery (say, an "Explain()" method)

"Developer" documentation

as part of the manual:
- explain the bigger picture, i.e. how everything fits together, and
- document how to implement a new recognition method; ideally, this would be illustrated with one or two examples.
- Note: the recog and recogbase manuals actually already cover quite a bit of this, but it's often not easy to find the information one needs, short of reading everything.
- Clearly improving this is important to make it easier to contribute to recog.
as part of the source:
- recognition methods should be commented carefully, in particular:
  - if there are papers dealing with the implemented method, they should be listed in a source comment
  - a rough high level description in a few words or lines of what the algorithm does can also be a tremendous help
  - also: who implemented the algorithm? in case of questions...
- individual functions, even (and esp.) internal ones should have a source comment which states what kind of objects the various parameters are supposed to be. That is a tremendous help while debugging.

Determine which algorithms are implemented, and what is perhaps missing

we should have a list (in the code, and perhaps even the manual?!) of algorithms which are implemented; in particular proper citation and a bibliography.
- TODO: add this list here?!
similar, a list of algorithms that are not yet there, but which should be implemented! These include:
- constructive recognition of orthogonal and symplectic groups (in all variations, i.e. natural and non-natural representations; for matrix, permutation, black box groups; ...)
- constructive recognition of exceptional groups (in all variations)
- constructive recognition of sporadic groups (???)
besides this "technical" overview, we should also approach this from a mathematical angle:
- which (families of) groups are currently supported, which are not?
- note that sometimes groups are recognized, but only because they were small enough to be handled by some fallback method. E.g. small orthogonal groups "work" because recog can fallback to an isomorphism with a permutation group, and then applies Schreier-Sims. So, for families, it's a good idea to test multiple members, and in particular large ones
are all Aschbacher classes correctly covered?
- generate test inputs for which the structure is known, and then test whether recog handles
- Frank Lübeck and und Jürgen Müller provided examples, which are bundled with recog as examples/example*.g; these show some problem areas.
- Frank has the tools to generate these and more...

Verification

Currently, no verification is implemented at all.

One reason for that is of course the lack of more constructive recognition methods. So implementing more of that is of course highly important.

In addition, however, there is currently no infrastructure in recogbase which does this. We need to design and implement that. In particular:

Design a way how a recognition method can output a presentation for the recognized group.
- There is an attribute StdPresentation for that, but it currently is never set nor used, and completely undocumented. In particular: What data format should the presentations use? A GAP presentation object? Something else?
Change leaf recognition methods which perform constructive recognition to return such presentations (possibly extracted from Magma?)
Change non-leaf recognition methods to correctly assemble presentations from presentations for their factor and kernel children.

Test suite

We should add more tests, and improve existing ones.

make sure every method is triggered at least once by some test;
- we already have some tests in the tst directory; check what is there and what is missing, and add complementing tests
possibly import the Magma test suite for group recognition?!
generate test cases for all the (classes of) possible leaves
- small cases could be taken from AtlasRep
- Frank has some code to generate larger examples

Besides more and better tests, some meta level things should be done:

select a "quick" subset of the test suite that can be run in, say, under a minute on a typical developer machine, and make it very easy to run it; otherwise, people won't do that.
setup a build server which regularly runs the full test suite, to make sure we notice regressions quickly
- perhaps we can use the St Andrews Jenkins
- but there are many alternatives, e.g. Travis-CI; or perhaps Aachen or Gießen can host a build server
besides correctness, also setup tests for performance, and track that over time
- this way, we can determine when a change slows things down too much
- for bonus points, setup something like http://speed.pypy.org/
  - this is in fact based on https://github.com/tobami/codespeed
- it would also be interesting to compare examples between Magma and recog systematically, and note differences in (relative) performance. This can be helpful in identifying and fixing bottlenecks

Ensure (and "prove") correctness and termination

The test suite helps us to verify correctness and termination in practical examples. In addition, it would be good to ensure that recog is correct and terminates in general. Some thoughts

Max Neunhöffer's habilitation is highly relevant. Is it published? Ask Max about that, and if he gives permission, make it available in an easily accessible location
Correctness: Of course once we implement verification (via presentations), the results will in the end be correct.

But there are additional pitfalls here. See e.g. issue #1 on the recog issue tracker.

Termination is less clear; and potentially prone to breakage. In particular, what happens if the order of certain checks is changed (by accident... if one is not aware of the consequences...)? Then it is conceivable that the algorithm might e.g. run into a "loop" (converting back and forth between two different reps); or simply incorrect (because one method assumes that some other method was run before)

to this end, any dependencies between methods should/must be made explicit
for other method pairs, prove that there is no dependency...
thus, we require that every method always checks for every precondition it requires explicitly. If there is a potential penalty involved with this, then simply add a member to ri! E.g. set ri!.IsDiagonalGroup to true or false
- task: ensure this is the case for all existing methods
- also document for method authors what they need to do... including examples...
- allow disabling the generic fallback to orbit enumeration, so that we "directly" see which cases are missing
- write isolated test cases for individual methods (using TryFindHomMethod)
to avoid loops, perhaps even define a precise notion of "smaller" group, and prove that every method always outputs "smaller" results (compared to the input). However, that seems to be a quite difficult problem, and perhaps is not really necessary.

Random thoughts and ideas

Some immediate tasks
- revise meaning the return values for recognition methods: right now they are true, false, fail, NotApplicable. It is very easy to get confused about which means what, and indeed, the recog documentation and implementation partially contradict themselves or each other. Better names might be:
  - Success
  - NeverApplicable
  - NotYetSuccessful or TemporaryFailure
  - NotYetApplicable or NotYetEnoughInformation
- check whether we can get rid of "donotrecurse" (obsolete)
- More generally: we stumbled over function names which started with small letters or which were confusing to us because we wrongly guessed what they would do. Some overall renaming may be sensible.
- ...
determine status of recog/misc and recog/contrib
- what does that code do? do we still need it?
user interface needs to be improved (and documented)
- what exactly needs to be improved, resp. what exactly is missing or bad? ???
Implement "Mandarins" as "advertised" by Charles Leedham-Green
Alexander Hulpke wrote a partial "Magma to GAP" converter for a subset of Magma. This could potentially be used to help with various things:
- import Magma test suite
- import standard presentations from Magma
- ...
improve "basic" GAP functionality which is needed for recog (and of course everybody else)

for some recog computations, the GAP MeatAxe is a bottleneck some possible options:
- improve the Smash-Meataxe in GAP (-> Max made some progress on this)
- look at the Chop package https://bitbucket.org/tomandklaus/chop.git and whether we can use it (for some stuff, it is supposed to be very fast, but it does not implement "everything" -- but maybe enough for us????
- can we integrate the C-meataxe into GAP again?
  - either as a kernel extension...
  - ... or by invoking subprocesses?
    - AtlasRep already has a lot (all?) of the necessary code for this
- computation of minimal and characteristic polynomials
- better linear algebra; in particular, finish the MatrixObj project
  - finish the interface
  - document them
  - change ALL places / functions that work with matrices to work with MatrixObj
  - e.g. the MetaAxe
    - for this, we would also need test cases which exercise all its functions
- finite fields:
  - fast discrete log
  - fast root computation
  - efficient large finite fields (with more than 2^16 elements)

Find more isolated projects?
- for bachelor / master theses?
- for more advanced people
- for students payed to implement them?

Bugs

A list of currently known bugs can be found on the issue trackers for recog and recogbase:

If you are aware of any bugs not reported there, please submit an issue. If for some reason you can not or do not want to do this, you can also report issue on the recog mailing list.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly