Upgrade to latest graphrag library release #213

Merged 45 commits on Jan 30, 2025

@jgbradley1 (Collaborator) commented Jan 2, 2025

This PR is currently in progress. The intent is to upgrade the codebase to use the latest release of the graphrag library (v1.2.0), as well as make additional refactoring changes to simplify the code and improve readability.

The following list of changes in this PR will be updated as they are made:

Bicep Updates:

  • refactored the bicep code to be more generic and readable
  • consolidated all RBAC role assignments into separate bicep files under /infra/core/rbac/* to help clarify what RBAC roles are required in case manual application of RBAC roles is needed. RBAC role assignments are made to the AKS workload identity and the AKS service itself.
  • moved the logic that creates graphrag-specific cosmosdb databases/containers out of bicep and into the backend docker image for better portability
  • fixed an RBAC deployment issue that caused sporadic deployment failures. RBAC role assignments require a globally unique name. When guid() is used to name a role assignment, its inputs must carry enough information to make the generated identifier truly globally unique; otherwise, name collisions can cause intermittent deployment failures.
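
The uniqueness point in the last bullet can be illustrated outside of Bicep. The sketch below is a Python analogy, not Bicep's actual guid() algorithm: it derives a deterministic name from a set of seed strings with uuid.uuid5 and shows that seeding with only the role definition collides across principals, while also including the scope and principal ID does not. The function name and all identifiers are hypothetical.

```python
import uuid


def role_assignment_name(*seeds: str) -> str:
    """Return a deterministic GUID-shaped name derived from the seed strings.

    Illustration only: Bicep's guid() uses its own hashing scheme.
    """
    return str(uuid.uuid5(uuid.NAMESPACE_URL, "|".join(seeds)))


# Placeholder identifiers, not real Azure resources.
scope = "/subscriptions/sub-a/resourceGroups/rg-graphrag"
role_definition_id = "role-def-1"
principal_a = "principal-aks-workload-identity"
principal_b = "principal-aks-service"

# Too little information: both principals map to the same assignment name.
weak_a = role_assignment_name(role_definition_id)
weak_b = role_assignment_name(role_definition_id)

# Scope + principal + role: each assignment gets its own stable name.
strong_a = role_assignment_name(scope, principal_a, role_definition_id)
strong_b = role_assignment_name(scope, principal_b, role_definition_id)
```

Because the name is a pure function of its seeds, redeploying with the same inputs reproduces the same name, which is why the seeds need to distinguish every assignment that can coexist.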

Web App Updates - the backend graphrag API server

  • simplified the folder structure under backend/src - the web API code is now packaged up as a proper python package for easier deployment
  • extracted the core indexing logic out of the API code and into a separate script
  • refactored and cleaned up logging code to be simpler
  • fixed a logging issue that prevented complete log messages from being sent to Application Insights
  • updated the auto-prompt tuning REST API endpoint to return prompts as a json response instead of a zip file
  • updated all code references and pytests to use v1.2.0 of the graphrag library
  • updated the web app code to hook into the graphrag library using the library's API layer - this allowed for the removal of a lot of redundant code
  • updated pyproject.toml to use newer versions of python library dependencies
  • temporarily removed query streaming capability (will be added back in a future PR)
  • temporarily removed multi-index query capability (will be added back in a future PR)
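
As a rough illustration of the auto-prompt tuning change listed above, the sketch below returns generated prompts in a single JSON body rather than packaging them as files in a zip archive. It uses only the standard library instead of the project's web framework, and the function name and prompt keys are assumptions, not the endpoint's actual schema.

```python
import json


def build_prompt_response(prompts: dict[str, str]) -> str:
    """Serialize generated prompts as one JSON object keyed by prompt name."""
    return json.dumps(prompts, indent=2, sort_keys=True)


# Placeholder prompt text; the real endpoint's keys and content may differ.
body = build_prompt_response(
    {
        "entity_extraction_prompt": "Extract entities from the text ...",
        "summarize_description_prompt": "Summarize the descriptions ...",
    }
)

# A client can read the prompts directly, with no archive to unpack.
parsed = json.loads(body)
```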

CI/CD Updates

  • updated dependabot settings

@jgbradley1 requested a review from a team as a code owner January 2, 2025 19:02
@jgbradley1 changed the title from "Upgrade to graphrag v1.0.1" to "Upgrade to latest graphrag library release" Jan 16, 2025
@jgbradley1 merged commit 83316b2 into main Jan 30, 2025
8 checks passed
@jgbradley1 deleted the joshbradley/upgrade-to-graphrag-v1.0.1 branch January 30, 2025 18:59