🤗 Transformers-CFG

Transformers-CFG is a fork of the Hugging Face Transformers library that adds support for context-free grammar (CFG) based constrained generation. The fork aims to stay up to date with the main branch of the 🤗 Transformers library and to offer an interface compatible with the llama-cpp project.

Installation

pip install git+https://github.com/epfl-dlab/transformers-CFG

QuickStart: Force an LLM to generate a valid JSON object

The example below can be found in examples/pytorch/text-generation/grammar_constrained_generation.py

from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation.grammar_utils import IncrementalGrammarConstraint
from transformers.generation.logits_process import GrammarConstrainedLogitsProcessor


if __name__ == "__main__":

    # Load model and tokenizer
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # Load json grammar
    with open("examples/grammars/json.gbnf", "r") as file:
        grammar_str = file.read()
    grammar = IncrementalGrammarConstraint(grammar_str, "root", tokenizer)
    grammar_processor = GrammarConstrainedLogitsProcessor(grammar)


    # Generate
    prefix1 = "This is a valid json string for http request:"
    prefix2 = "This is a valid json string for shopping cart:"
    input_ids = tokenizer([prefix1, prefix2], add_special_tokens=False, return_tensors="pt", padding=True)["input_ids"]

    output = model.generate(
        input_ids,
        do_sample=False,
        max_length=50,
        num_beams=2,
        logits_processor=[grammar_processor],
        repetition_penalty=5.0,
        num_return_sequences=1,
    )
    # decode output
    generations = tokenizer.batch_decode(output, skip_special_tokens=True)
    print(generations)

    """
    'This is a valid json string for http request:{ "request": { "method": "GET", "headers": [], "content": "Content","type": "application" }}
    'This is a valid json string for shopping cart:This is a valid json string for shopping cart:{ "name": "MyCart", "price": 0, "value": 1 }
    """
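
To make the role of the grammar processor in the example above more concrete, here is a minimal, pure-Python sketch of the general idea behind grammar-constrained decoding: at each step, the scores of tokens that would break the grammar are set to negative infinity before the next token is chosen. This is an illustration only, not the implementation used by GrammarConstrainedLogitsProcessor. The names VOCAB, allowed_tokens, mask_scores, greedy_decode and fake_model are hypothetical, and the "grammar" is a hand-coded check for balanced braces rather than a parsed .gbnf file.

```python
import math

# Toy vocabulary (assumption for illustration; a real tokenizer is far larger).
VOCAB = ["{", "}", "a", "<eos>"]


def allowed_tokens(prefix):
    """Return the tokens that keep `prefix` extendable to a balanced-brace string.

    Hand-coded stand-in for the grammar S -> "{" S "}" S | "" (empty string).
    """
    depth = prefix.count("{") - prefix.count("}")
    allowed = {"{"}              # opening a new brace is always valid
    if depth > 0:
        allowed.add("}")         # only close a brace that is currently open
    else:
        allowed.add("<eos>")     # only stop once every brace is closed
    return allowed


def mask_scores(prefix, scores):
    """Set the score of every disallowed token to -inf; keep the others as-is."""
    allowed = allowed_tokens(prefix)
    return [s if tok in allowed else -math.inf for tok, s in zip(VOCAB, scores)]


def greedy_decode(score_fn, max_steps=10):
    """Pick the highest-scoring allowed token at each step until <eos>."""
    prefix = []
    for _ in range(max_steps):
        scores = mask_scores(prefix, score_fn(prefix))
        best = VOCAB[max(range(len(VOCAB)), key=scores.__getitem__)]
        if best == "<eos>":
            break
        prefix.append(best)
    return "".join(prefix)


def fake_model(prefix):
    """Fake 'model' whose favourite token is always "a", which the grammar forbids."""
    if len(prefix) < 2:
        return [1.0, 0.5, 2.0, 0.0]   # scores for "{", "}", "a", "<eos>"
    return [0.1, 1.0, 2.0, 0.5]


print(greedy_decode(fake_model))  # prints {{}}
```

Without the mask, the fake model would emit "a" at every step. With it, only tokens permitted by the grammar can be selected, so the output is always a valid string.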

Grammar Collection

We provide a collection of grammars in the examples/grammars folder, most of which are identical to the grammars in the llama-cpp project. We try to keep them in sync with the originals, but we cannot yet guarantee that every grammar from the llama-cpp project can be used directly in transformers-CFG.

The list of grammars contains:

json.gbnf: A grammar for generating valid JSON objects.
c.gbnf: A grammar for generating valid C programs.
chess.gbnf: A grammar for generating valid chess moves.
arithmetic.gbnf: A grammar for generating valid arithmetic expressions.
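
To switch between the grammars listed above, a small helper along the following lines could be used. It is a sketch under stated assumptions: load_grammar is a hypothetical function, not part of transformers-CFG, and the default folder is assumed to be examples/grammars relative to the current working directory.

```python
from pathlib import Path


def load_grammar(name, grammar_dir="examples/grammars"):
    """Read `<grammar_dir>/<name>.gbnf` and return its contents as a string.

    Hypothetical helper: raises FileNotFoundError listing the grammars that
    are actually present in `grammar_dir` if the requested one is missing.
    """
    path = Path(grammar_dir) / f"{name}.gbnf"
    if not path.is_file():
        available = sorted(p.stem for p in Path(grammar_dir).glob("*.gbnf"))
        raise FileNotFoundError(f"No grammar at {path}; available: {available}")
    return path.read_text(encoding="utf-8")
```

The returned string can then be passed to IncrementalGrammarConstraint in the same way as grammar_str in the QuickStart, for example with load_grammar("chess"). Whether the start rule of every grammar is named "root", as it is for json.gbnf in the QuickStart, is an assumption to check against the grammar file.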

Why should I use transformers-CFG?

We offer the same grammar interface as the llama-cpp project, so transformers-CFG can serve as a drop-in replacement for llama-cpp.
You can use any model in the 🤗 Transformers library, including models that llama-cpp does not support.
