Updating Databricks Available Models #129

Open · wants to merge 4 commits into `main`
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -16,7 +16,7 @@ BugReports: https://github.com/mlverse/chattr/issues
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.1
RoxygenNote: 7.3.2
Imports:
rstudioapi,
lifecycle,
1 change: 1 addition & 0 deletions NEWS.md
@@ -1,5 +1,6 @@
# chattr (dev)

* Updating support for Databricks to use newer Meta Llama models
* Prevents OpenAI 4o from showing as an option if no token is found

# chattr 0.2.0
7 changes: 3 additions & 4 deletions R/chattr-use.R
@@ -1,7 +1,7 @@
#' Sets the LLM model to use in your session
#' @param x The label of the LLM model to use, or the path of a valid YAML
#' default file. Valid values are 'copilot', 'gpt4', 'gpt35', 'llamagpt',
#' 'databricks-dbrx', 'databricks-meta-llama3-70b', and 'databricks-mixtral8x7b'.
#' 'databricks-meta-llama-3-1-405b-instruct', and 'databricks-meta-llama-3-3-70b-instruct'.
#' The value 'test' is also acceptable, but it is meant for package examples,
#' and internal testing.
#' @param ... Default values to modify.
@@ -97,9 +97,8 @@ ch_get_ymls <- function(menu = TRUE) {
}

if (!dbrx_exists) {
prep_files$`databricks-dbrx` <- NULL
prep_files$`databricks-meta-llama3-70b` <- NULL
prep_files$`databricks-mixtral8x7b` <- NULL
prep_files$`databricks-meta-llama-3-1-405b` <- NULL
prep_files$`databricks-meta-llama-3-3-70b` <- NULL
}

if (!llama_exists) {
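The filtering above can be sketched as a small standalone helper (the helper name is hypothetical; the list entry names are taken from the diff): when no Databricks credentials are detected, the corresponding entries are dropped from the menu of available model configs.

```r
# Standalone sketch of the menu-filtering idea in ch_get_ymls():
# when Databricks credentials are absent, remove those entries from
# the named list of model configs. drop_databricks_entries() is a
# hypothetical name used for illustration only.
drop_databricks_entries <- function(prep_files, dbrx_exists) {
  if (!dbrx_exists) {
    prep_files$`databricks-meta-llama-3-1-405b` <- NULL
    prep_files$`databricks-meta-llama-3-3-70b` <- NULL
  }
  prep_files
}

configs <- list(
  gpt4 = "gpt4.yml",
  `databricks-meta-llama-3-1-405b` = "databricks-meta-llama-3-1-405b.yml"
)

# With no Databricks token, only the non-Databricks entries remain.
drop_databricks_entries(configs, dbrx_exists = FALSE)
```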
133 changes: 64 additions & 69 deletions README.md
@@ -1,3 +1,9 @@
---
editor_options:
markdown:
wrap: 72
---

# chattr

<!-- badges: start -->
@@ -10,6 +16,7 @@ status](https://www.r-pkg.org/badges/version/chattr.png)](https://CRAN.R-project
[![](man/figures/lifecycle-experimental.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)

<!-- badges: end -->

<!-- toc: start -->

- [Intro](#intro)
@@ -25,7 +32,7 @@ status](https://www.r-pkg.org/badges/version/chattr.png)](https://CRAN.R-project

<!-- toc: end -->

## Intro
## Intro {#intro}

`chattr` is an interface to LLMs (Large Language Models). It enables
interaction with the model directly from the RStudio IDE. `chattr`
@@ -37,7 +44,7 @@ tasks. The additional information appended to your request, provides a
sort of “guard rails”, so that the packages and techniques we usually
recommend as best practice, are used in the model’s responses.

## Install
## Install {#install}

Since this is an early version of the package, install it from
GitHub:
@@ -46,74 +53,56 @@ from GitHub:
remotes::install_github("mlverse/chattr")
```

## Available models
## Available models {#available-models}

`chattr` integrates with two main LLM back-ends. Each back-end
provides access to multiple LLM types. The plan is to add more
back-ends as time goes by:

<table style="width:100%;">
<colgroup>
<col style="width: 28%" />
<col style="width: 46%" />
<col style="width: 24%" />
</colgroup>
<thead>
<tr class="header">
<th style="text-align: center;">Provider</th>
<th style="text-align: center;">Models</th>
<th style="text-align: center;">Setup Instructions</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td style="text-align: center;">OpenAI</td>
<td style="text-align: center;">GPT Models accessible via the OpenAI’s
REST API. <code>chattr</code> provides a convenient way to interact with
GPT 4, and 3.5.</td>
<td style="text-align: center;"><a
href="https://mlverse.github.io/chattr/articles/openai-gpt.html">Interact
with OpenAI GPT models</a></td>
</tr>
<tr class="even">
<td style="text-align: center;"><a
href="https://github.com/kuvaus/LlamaGPTJ-chat">LLamaGPT-Chat</a></td>
<td style="text-align: center;">LLM models available in your computer.
Including GPT-J, LLaMA, and MPT. Tested on a <a
href="https://gpt4all.io/index.html">GPT4ALL</a> model.
<strong>LLamaGPT-Chat</strong> is a command line chat program for models
written in C++.</td>
<td style="text-align: center;"><a
href="https://mlverse.github.io/chattr/articles/backend-llamagpt.html">Interact
with local models</a></td>
</tr>
<tr class="odd">
<td style="text-align: center;"><a
href="https://docs.posit.co/ide/user/ide/guide/tools/copilot.html">GitHub
Copilot</a></td>
<td style="text-align: center;">AI pair programmer that offers
autocomplete-style suggestions as you code</td>
<td style="text-align: center;"><a
href="https://mlverse.github.io/chattr/articles/copilot-chat.html">Interact
with GitHub Copilot Chat</a></td>
</tr>
<tr class="even">
<td style="text-align: center;"><a
href="https://docs.databricks.com/en/machine-learning/foundation-models/index.html#databricks-foundation-model-apis">Databricks</a></td>
<td style="text-align: center;">DBRX, Meta Llama 3 70B, and Mixtral 8x7B
via <a
href="https://docs.databricks.com/en/machine-learning/foundation-models/index.html#pay-per-token-foundation-model-apis">Databricks
foundational model REST API</a>.</td>
<td style="text-align: center;"><a
href="https://mlverse.github.io/chattr/articles/backend-databricks.html">Interact
with Databricks foundation chat models</a></td>
</tr>
</tbody>
</table>

## Using

### The App
| Provider | Models | Setup Instructions |
|:--:|:--:|:--:|
| OpenAI | GPT Models accessible via the OpenAI’s REST API. `chattr` provides a convenient way to interact with GPT 4, and 3.5. | [Interact with OpenAI GPT models](https://mlverse.github.io/chattr/articles/openai-gpt.html) |
| [LLamaGPT-Chat](https://github.com/kuvaus/LlamaGPTJ-chat) | LLM models available in your computer, including GPT-J, LLaMA, and MPT. Tested on a [GPT4ALL](https://gpt4all.io/index.html) model. **LLamaGPT-Chat** is a command line chat program for models written in C++. | [Interact with local models](https://mlverse.github.io/chattr/articles/backend-llamagpt.html) |
| [GitHub Copilot](https://docs.posit.co/ide/user/ide/guide/tools/copilot.html) | AI pair programmer that offers autocomplete-style suggestions as you code | [Interact with GitHub Copilot Chat](https://mlverse.github.io/chattr/articles/copilot-chat.html) |
| [Databricks](https://docs.databricks.com/en/machine-learning/foundation-models/index.html#databricks-foundation-model-apis) | Meta Llama 3.1 405B and 3.3 70B via [Databricks foundational model REST API](https://docs.databricks.com/en/machine-learning/foundation-models/index.html#pay-per-token-foundation-model-apis). | [Interact with Databricks foundation chat models](https://mlverse.github.io/chattr/articles/backend-databricks.html) |

## Using {#using}

### The App {#the-app}

The main way to use `chattr` is through the Shiny Gadget app. By
default, in RStudio the app will run inside the Viewer pane. `chattr`
@@ -192,13 +181,13 @@ The screen that opens will contain the following:

![Screenshot of the Shiny gadget options](man/figures/readme/chat2.png)

### Additional ways to interact
### Additional ways to interact {#additional-ways-to-interact}

Apart from the Shiny app, `chattr` provides two more ways to interact
with the LLM. For details, see: [Other
interfaces](https://mlverse.github.io/chattr/articles/other-interfaces.html)

## How it works
## How it works {#how-it-works}

`chattr` enriches your request with additional instructions, name and
structure of data frames currently in your environment, the path for the
@@ -255,7 +244,7 @@ chattr(preview = TRUE)
#> [Your future prompt goes here]
```

## Keyboard Shortcut
## Keyboard Shortcut {#keyboard-shortcut}

The best way to access `chattr`’s app is by setting up a keyboard
shortcut for it. This package includes an RStudio Addin that gives us
@@ -264,23 +253,29 @@ to be assigned to the addin. The name of the addin is: “Open Chat”. If
you are not familiar with how to assign a keyboard shortcut see the next
section.

### How to setup the keyboard shortcut
### How to setup the keyboard shortcut {#how-to-setup-the-keyboard-shortcut}

- Select *Tools* in the top menu, and then select *Modify Keyboard
Shortcuts*

```{=html}
<img src="man/figures/readme/keyboard-shortcuts.png" width="700"
alt="Screenshot that shows where to find the option to modify the keyboard shortcuts" />
```

- Search for the `chattr` addin by typing “open chat” in the search
  box

```{=html}
<img src="man/figures/readme/addin-find.png" width="500"
alt="Screenshot that shows where to input the addin search" />
```

- To select a key combination for your shortcut, click on the Shortcut
  box and then press the key combination on your keyboard. In my case,
  I chose *Ctrl+Shift+C*

```{=html}
<img src="man/figures/readme/addin-assign.png" width="500"
alt="Screenshot that shows what the interface looks like when a shortcut has been selected" />
```
1 change: 1 addition & 0 deletions chattr.Rproj
@@ -1,4 +1,5 @@
Version: 1.0
ProjectId: 539355de-b8e8-4dff-931c-49b5461c9f07

RestoreWorkspace: Default
SaveWorkspace: Default
@@ -3,8 +3,8 @@ default:
{readLines(system.file('prompt/base.txt', package = 'chattr'))}
provider: Databricks
path: serving-endpoints
model: databricks-meta-llama-3-70b-instruct
label: Meta Llama 3 70B (Databricks)
model: databricks-meta-llama-3-1-405b-instruct
label: Meta Llama 3.1 405B (Databricks)
max_data_files: 0
max_data_frames: 0
include_doc_contents: FALSE
@@ -3,8 +3,8 @@ default:
{readLines(system.file('prompt/base.txt', package = 'chattr'))}
provider: Databricks
path: serving-endpoints
model: databricks-dbrx-instruct
label: DBRX (Databricks)
model: databricks-meta-llama-3-3-70b-instruct
label: Meta Llama 3.3 70B (Databricks)
max_data_files: 0
max_data_frames: 0
include_doc_contents: FALSE
33 changes: 0 additions & 33 deletions inst/configs/databricks-mixtral8x7b.yml

This file was deleted.

2 changes: 1 addition & 1 deletion man/chattr_use.Rd

Some generated files are not rendered by default.

4 changes: 2 additions & 2 deletions tests/testthat/_snaps/backend-databricks.md
@@ -49,8 +49,8 @@
Message
* Provider: Databricks
* Path/URL: serving-endpoints
* Model: databricks-meta-llama-3-70b-instruct
* Label: Meta Llama 3 70B (Databricks)
* Model: databricks-meta-llama-3-3-70b-instruct
* Label: Meta Llama 3.3 70B (Databricks)
! A list of the top 10 files will be sent externally to Databricks with every request
To avoid this, set the number of files to be sent to 0 using `chattr::chattr_defaults(max_data_files = 0)`
! A list of the top 10 data.frames currently in your R session will be sent externally to Databricks with every request
8 changes: 4 additions & 4 deletions tests/testthat/test-backend-databricks.R
@@ -4,7 +4,7 @@ test_that("Submit method works", {
return("test return")
}
)
def <- test_simulate_model("databricks-meta-llama3-70b.yml")
def <- test_simulate_model("databricks-meta-llama-3-3-70b.yml")
expect_equal(
ch_submit(def, "test"),
"test return"
@@ -32,7 +32,7 @@ test_that("Completion function works", {
x
}
)
def <- test_simulate_model("databricks-meta-llama3-70b.yml")
def <- test_simulate_model("databricks-meta-llama-3-3-70b.yml")
expect_null(
ch_databricks_complete(
prompt = "test",
@@ -57,7 +57,7 @@ test_that("Error when status is not 200", {
x
}
)
def <- test_simulate_model("databricks-meta-llama3-70b.yml")
def <- test_simulate_model("databricks-meta-llama-3-3-70b.yml")
expect_error(
ch_databricks_complete(
prompt = "test",
@@ -84,7 +84,7 @@ test_that("Missing host returns error", {


test_that("Init messages work", {
def <- test_simulate_model("databricks-meta-llama3-70b.yml")
def <- test_simulate_model("databricks-meta-llama-3-3-70b.yml")
def$max_data_files <- 10
def$max_data_frames <- 10
expect_snapshot(app_init_message(def))
6 changes: 5 additions & 1 deletion tests/testthat/test-chattr-use.R
@@ -31,7 +31,11 @@ test_that("Missing token prevents showing the option", {
test_that("Menu works", {
skip_on_cran()
withr::with_envvar(
new = c("OPENAI_API_KEY" = "test", "DATABRICKS_TOKEN" = NA),
new = c(
"OPENAI_API_KEY" = "test",
"DATABRICKS_HOST" = NA,
"DATABRICKS_TOKEN" = NA
),
{
local_mocked_bindings(
menu = function(...) {
22 changes: 10 additions & 12 deletions vignettes/backend-databricks.Rmd
@@ -22,17 +22,16 @@ knitr::opts_chunk$set(
[Databricks](https://docs.databricks.com/en/introduction/index.html)
customers have access to [foundation model
APIs](https://docs.databricks.com/en/machine-learning/foundation-models/index.html)
like DBRX, Meta Llama 3 70B, and Mixtral 8x7B. Databricks also provides
the ability to train and [deploy custom
like Meta Llama 3.3 70B and 3.1 405B. Databricks also provides the
ability to train and [deploy custom
models](https://docs.databricks.com/en/machine-learning/foundation-models/deploy-prov-throughput-foundation-model-apis.html).

`chattr` supports the following models on Databricks by default:

| Model | Databricks Model Name | `chattr` Name |
| Model | Databricks Model Name | `chattr` Name |
|---------------------|------------------------------|---------------------|
| DBRX Instruct | `databricks-dbrx-instruct` | `databricks-dbrx` |
| Meta-Llama-3-70B-Instruct | `databricks-meta-llama-3-70b-instruct` | `databricks-meta-llama3-70b` |
| Mixtral-8x7B Instruct | `databricks-mixtral-8x7b-instruct` | `databricks-mixtral8x7b` |
| Meta Llama 3.3 70B Instruct | `databricks-meta-llama-3-3-70b-instruct` | `databricks-meta-llama-3-3-70b` |
| Meta Llama 3.1 405B Instruct | `databricks-meta-llama-3-1-405b-instruct` | `databricks-meta-llama-3-1-405b` |

: [Supported Databricks pay-per-token foundation
models](https://docs.databricks.com/en/machine-learning/foundation-models/index.html#pay-per-token-foundation-model-apis)
@@ -84,12 +83,12 @@ DATABRICKS_TOKEN = ####################
### Supported Models

By default, `chattr` is set up to interact with GPT 4 (`gpt-4`). To
switch to Meta Llama 3 70B you can run:
switch to Meta Llama 3.3 70B you can run:

```{r}
library(chattr)
chattr_use("databricks-meta-llama3-70b")
chattr_use("databricks-meta-llama-3-3-70b")
```
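For the call above to work, the Databricks host and token environment variables described earlier must be visible to the R session. A minimal base-R sketch (the host URL and token are placeholders, not real credentials):

```r
# Supply Databricks credentials for the current session (placeholder
# values shown); chattr reads DATABRICKS_HOST and DATABRICKS_TOKEN.
Sys.setenv(
  DATABRICKS_HOST  = "https://example.cloud.databricks.com",
  DATABRICKS_TOKEN = "dapi0123456789abcdef"
)

# Quick check that both variables are set and non-empty:
nzchar(Sys.getenv("DATABRICKS_HOST")) && nzchar(Sys.getenv("DATABRICKS_TOKEN"))
```

For anything beyond a throwaway session, placing these values in `.Renviron` is preferable to setting them in scripts.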

#### Custom Models
@@ -107,9 +106,9 @@ endpoint name of `"CustomLLM"`:
```{r}
library(chattr)
# use any existing databricks foundation model name (e.g. datarbicks-dbrx)
# then adjust the default model name to 'CustomMixtral'
chattr_use(x = "databricks-dbrx", model = "CustomLLM")
# use any existing databricks foundation model name (e.g. databricks-meta-llama-3-1-405b)
# then adjust the default model name to 'CustomLLM'
chattr_use(x = "databricks-meta-llama-3-1-405b", model = "CustomLLM")
```

## Data files and data frames
@@ -139,4 +138,3 @@ time you start the Shiny app:
! A list of the top 10 files will be sent externally to Databricks with every request
To avoid this, set the number of files to be sent to 0 using chattr::chattr_defaults(max_data_files = 0)
```
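The limits mentioned in that warning can be lowered before the app is launched. A minimal sketch, assuming the `chattr` package is installed and that `chattr_defaults()` accepts both arguments in one call (the `max_data_files` usage is taken from the warning text above; passing `max_data_frames` alongside it is an assumption):

```r
library(chattr)

# Keep file and data frame listings out of the requests sent to
# Databricks with each prompt.
chattr_defaults(max_data_files = 0, max_data_frames = 0)
```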