How to implement "compilation check" for AutoFormalization (only if the Lean 4 is syntactically valid, has any bug) #71

brando90 · 2025-01-28T06:26:37Z

I wrote this:

def check_lean_compiles_strict(snippet: str, server: Server, require_no_goals: bool = True) -> bool:
    """
    Checks if a snippet is valid Lean 4 code, *strictly*, by:
      1) server.load_sorry(file=snippet) -> returns list[CompilationUnit]
      2) For each CompilationUnit:
         - If it has any `messages` containing "error", or
         - If it has leftover goals (goal_state != None) and we require full proof => fail
    If no errors or leftover goals, we declare success.

    Args:
        snippet: Lean 4 code snippet
        server:  Pantograph Server
        require_no_goals: if True, any leftover goals => fail

    Returns:
        bool: True if fully compiled with no leftover errors or goals
    """
    try:
        units = server.load_sorry(snippet)
        # If "error" in result => already raises ServerError, so we skip the except block
    except Exception:
        # Hard parse error or something
        return False

    # Now check each CompilationUnit for leftover messages or goals
    for cu in units:
        # If there's any "error" or "error:" mention in cu.messages => fail
        for msg in cu.messages:
            lower_msg = msg.lower()
            # A typical check: if "error" is in the string => fail
            if "error" in lower_msg:
                return False

        # If user wants no leftover goals, but .goal_state is not None => we see that as partial
        if require_no_goals and cu.goal_state is not None:
            # Means there's at least one sorry/hole or leftover type error, so fail
            return False

    # If we pass all checks => success
    return True

which checks if there is any type of compilation bug. I think that should work. Right?

Context: this is for the crucial pass @ k metric for predictable evaluations of LLMs for AutoFormalization. I will add this later.

lenianiva · 2025-01-29T01:48:22Z

yeah that should work

Just need some fail-safe in case a user names a variable error or something

brando90 added the help wanted Extra attention is needed label Jan 28, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to implement "compilation check" for AutoFormalization (only if the Lean 4 is syntactically valid, has any bug) #71

How to implement "compilation check" for AutoFormalization (only if the Lean 4 is syntactically valid, has any bug) #71

brando90 commented Jan 28, 2025

lenianiva commented Jan 29, 2025

How to implement "compilation check" for AutoFormalization (only if the Lean 4 is syntactically valid, has any bug) #71

How to implement "compilation check" for AutoFormalization (only if the Lean 4 is syntactically valid, has any bug) #71

Comments

brando90 commented Jan 28, 2025

lenianiva commented Jan 29, 2025