Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to implement "compilation check" for AutoFormalization (only if the Lean 4 is syntactically valid, has any bug) #71

Open
brando90 opened this issue Jan 28, 2025 · 1 comment
Labels
help wanted Extra attention is needed

Comments

@brando90
Copy link
Collaborator

I wrote this:

def check_lean_compiles_strict(snippet: str, server: Server, require_no_goals: bool = True) -> bool:
    """
    Checks if a snippet is valid Lean 4 code, *strictly*, by:
      1) server.load_sorry(file=snippet) -> returns list[CompilationUnit]
      2) For each CompilationUnit:
         - If it has any `messages` containing "error", or
         - If it has leftover goals (goal_state != None) and we require full proof => fail
    If no errors or leftover goals, we declare success.

    Args:
        snippet: Lean 4 code snippet
        server:  Pantograph Server
        require_no_goals: if True, any leftover goals => fail

    Returns:
        bool: True if fully compiled with no leftover errors or goals
    """
    try:
        units = server.load_sorry(snippet)
        # If "error" in result => already raises ServerError, so we skip the except block
    except Exception:
        # Hard parse error or something
        return False

    # Now check each CompilationUnit for leftover messages or goals
    for cu in units:
        # If there's any "error" or "error:" mention in cu.messages => fail
        for msg in cu.messages:
            lower_msg = msg.lower()
            # A typical check: if "error" is in the string => fail
            if "error" in lower_msg:
                return False

        # If user wants no leftover goals, but .goal_state is not None => we see that as partial
        if require_no_goals and cu.goal_state is not None:
            # Means there's at least one sorry/hole or leftover type error, so fail
            return False

    # If we pass all checks => success
    return True

which checks if there is any type of compilation bug. I think that should work. Right?

Context: this is for the crucial pass @ k metric for predictable evaluations of LLMs for AutoFormalization. I will add this later.

@brando90 brando90 added the help wanted Extra attention is needed label Jan 28, 2025
@lenianiva
Copy link
Member

yeah that should work

Just need some fail-safe in case a user names a variable error or something

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants