[WIP] JSONB support #710

Draft
wants to merge 10 commits into
base: main

Contributor

@madejejej madejejej commented Jan 16, 2025

This PR adds very basic JSONB support:

  • it can parse part of SQLite's JSONB format
  • it does not implement any new functions
  • however, we can use the existing JSON functions to work with the JSONB format:
➜  limbo git:(jsonb) ✗ sqlite3 jsonb.db
SQLite version 3.47.2 2024-12-07 20:39:59
Enter ".help" for usage hints.
sqlite> create table jsonbs (key Text, val Blob);
sqlite> insert into jsonbs (key, val) VALUES ('int', jsonb(5)), ('text', jsonb('"abcdef"')), ('array', jsonb('[1,2,3]')), ('object', jsonb('{"a":1,"b":2}'));
sqlite> select json(val) from jsonbs;
5
"abcdef"
[1,2,3]
{"a":1,"b":2}

➜  limbo git:(jsonb) ✗ target/debug/limbo jsonb.db 
Limbo v0.0.12
Enter ".help" for usage hints.
limbo> select * from jsonbs;
int|5
text|gabcdef
array|k123
object|�a1b2
limbo> select json(val) from jsonbs;
5
"abcdef"
[1,2,3]
{"a":1,"b":2}
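
For anyone puzzled by the raw `select *` output (`gabcdef`, `k123`): each JSONB element starts with a header byte whose low nibble is the element type and whose high nibble is either the payload size (0 to 11) or a marker (12 to 15) saying the size follows in 1, 2, 4 or 8 big-endian bytes. Below is a minimal Python sketch of that header decoding, based on my reading of SQLite's JSONB format notes. `decode_header` and `JSONB_TYPES` are illustrative names, not anything from this PR or from Limbo.

```python
# Element type codes from the low nibble of the first header byte.
JSONB_TYPES = [
    "NULL", "TRUE", "FALSE", "INT", "INT5", "FLOAT", "FLOAT5",
    "TEXT", "TEXTJ", "TEXT5", "TEXTRAW", "ARRAY", "OBJECT",
]


def decode_header(blob, offset=0):
    """Return (type_name, payload_start, payload_size) for the element at offset."""
    first = blob[offset]
    type_code = first & 0x0F
    size_nibble = first >> 4
    if type_code >= len(JSONB_TYPES):
        raise ValueError(f"reserved element type {type_code}")
    if size_nibble <= 11:
        # Small payloads store their size directly in the high nibble.
        extra, size = 0, size_nibble
    else:
        # 12, 13, 14, 15 -> size stored in the next 1, 2, 4, 8 bytes.
        extra = 1 << (size_nibble - 12)
        size_bytes = blob[offset + 1:offset + 1 + extra]
        if len(size_bytes) != extra:
            raise ValueError("truncated header")
        size = int.from_bytes(size_bytes, "big")
    start = offset + 1 + extra
    if start + size > len(blob):
        raise ValueError("payload runs past the end of the blob")
    return JSONB_TYPES[type_code], start, size


print(decode_header(b"gabcdef"))           # 'g' is 0x67 -> ('TEXT', 1, 6)
print(decode_header(b"k\x131\x132\x133"))  # 'k' is 0x6B -> ('ARRAY', 1, 6)
```

So `g` and `k` in the output are the header bytes printed as text, and the `0x13` headers of the integers inside the array are unprintable, which is why the array shows up as `k123`.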

I'd like to get it merged to the main branch so we can let more people contribute towards JSONB.

My suggestion towards JSON/JSONB support is that we first make it work and then make it performant.


Refs:

@seridescent

hey! i'd be interested in helping out with this, if thats okay with you. FWIW, i was also looking into JSONB support, so i'd like to think i could help out.

do you have a strong idea of how this would be broken down? anything you'd be interested in delegating?

@madejejej
Contributor Author

@seridescent I see a few possible ways of splitting this up:

  1. By functionality: serialization/deserialization
  2. By splitting into deep/broad:
  • deep: implementing differences between Text / TextJ / TextRaw ; Int / Int5 ; Float / Float5
  • broad: high-level skeleton
  3. Any suggestions from your end?
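
To make the "deep" part concrete, here is a rough Python sketch of how the scalar variants might differ when rendered back to canonical JSON text. It assumes the payload semantics described in SQLite's JSONB notes (TEXT and TEXTJ can be quoted as-is, TEXTRAW needs escaping, INT5 may be hexadecimal, FLOAT5 may have a leading or trailing dot). `scalar_to_json` is a hypothetical helper, TEXT5 is left out on purpose, and special values such as `Infinity` are not handled.

```python
import json


def scalar_to_json(type_name, payload):
    """Render one scalar JSONB element as canonical JSON text (sketch only)."""
    text = payload.decode("utf-8")
    if type_name in ("NULL", "TRUE", "FALSE"):
        return type_name.lower()
    if type_name in ("INT", "FLOAT"):
        # Already canonical JSON number text.
        return text
    if type_name == "INT5":
        # JSON5 integers may carry a leading '+' or use hexadecimal notation.
        sign = -1 if text.startswith("-") else 1
        digits = text.lstrip("+-")
        base = 16 if digits[:2].lower() == "0x" else 10
        return str(sign * int(digits, base))
    if type_name == "FLOAT5":
        # JSON5 floats such as ".5" or "5." are normalised via float().
        return repr(float(text))
    if type_name in ("TEXT", "TEXTJ"):
        # TEXT needs no escapes; TEXTJ already contains JSON escapes.
        return '"' + text + '"'
    if type_name == "TEXTRAW":
        # Raw text has to be escaped before it is valid JSON.
        return json.dumps(text, ensure_ascii=False)
    raise NotImplementedError(type_name)


print(scalar_to_json("INT5", b"0x1F"))     # 31
print(scalar_to_json("FLOAT5", b".5"))     # 0.5
print(scalar_to_json("TEXTRAW", b'a"b'))   # "a\"b"
```

Each branch is small, which suggests the "deep" work could be split per type family (text, int, float) without much overlap.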

What would you rather do?

I'm in the phase of exploration right now, so I'm not yet sure what I'm doing 😂

I believe an important part of this task would be to understand how we want to pass the JSONB value around. SQLite has a JsonParse struct which is used internally by the JSON functions.

I'm not yet sure how SQLite converts it to sqlite3_value later on (that would be the equivalent of converting to OwnedValue in Limbo). We'd use OwnedValue::Blob, but do we have to introduce a JSONB subtype the same way I did in this PR? https://github.com/tursodatabase/limbo/pull/504/files#diff-30db18ff60c94c1b484132d8b04503416e7e34d3ff4cfa306d08318be3ef69aeR63

@madejejej
Contributor Author

We might also want to write an extensive Python test that:

  1. Uses SQLite to write many possible JSONB variants into a file database
  2. Uses Limbo to read the JSONB back from the SQLite database and compare results (using the hex function?)
  3. Then does the reverse: Limbo as the writer and SQLite as the reader
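
A rough sketch of what steps 1 and 2 could look like in Python. Several things here are assumptions: the helper names are made up, `jsonb()` needs a SQLite build that has it (3.45 or newer, as far as I know), and `read_hex_limbo` assumes the Limbo shell accepts SQL on stdin, supports `hex()`, and prints `key|value` rows as in the session in the PR description. The binary path is just the one from that session.

```python
import sqlite3
import subprocess

# A few JSONB variants to start with; the real test would cover many more.
CASES = [
    ("int", "5"),
    ("text", '"abcdef"'),
    ("array", "[1,2,3]"),
    ("object", '{"a":1,"b":2}'),
]


def write_fixture(conn, cases, encoder="jsonb"):
    """Step 1: let SQLite encode each JSON text and store the result."""
    conn.execute("create table jsonbs (key text, val blob)")
    conn.executemany(
        f"insert into jsonbs (key, val) values (?, {encoder}(?))", cases
    )
    conn.commit()


def read_hex(conn):
    """Reference result: hex dump of every stored value, read by SQLite."""
    return dict(conn.execute("select key, hex(val) from jsonbs"))


def read_hex_limbo(db_path, limbo_bin="target/debug/limbo"):
    """Step 2 (assumed CLI behaviour): run the same query through Limbo."""
    out = subprocess.run(
        [limbo_bin, db_path],
        input="select key, hex(val) from jsonbs;\n",
        capture_output=True, text=True, check=True,
    ).stdout
    rows = {}
    for line in out.splitlines():
        key, sep, value = line.partition("|")
        if sep:  # skip banner and prompt lines without a column separator
            rows[key.strip()] = value.strip()
    return rows


def mismatches(expected, actual):
    """Keys whose hex dumps differ, mapped to (expected, actual)."""
    return {
        key: (value, actual.get(key))
        for key, value in expected.items()
        if actual.get(key) != value
    }
```

Usage would be roughly: open `sqlite3.connect("jsonb.db")`, call `write_fixture(conn, CASES)`, then assert that `mismatches(read_hex(conn), read_hex_limbo("jsonb.db"))` is empty. Step 3 would swap the roles once Limbo can write JSONB.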

@seridescent

i don't have a preference right now. interested in helping but i haven't contributed to limbo yet, so you probably have a better idea of how collaborating might work than i do.

re: JSONB IR -> sqlite3_value
do you mean this function or something else? at first glance, it doesn't look like the blob would also have a JSONB type, but i didn't look that hard.

> I believe an important part of this task would be to understand how we want to pass the JSONB value around. SQLite has a JsonParse struct which is used internally by the JSON functions.

ah yeah, that makes a lot of sense. i wrote some preliminary thoughts in the other PR (sorry to split discussion threads) wrt usable vs. performant internal representations, but best to discuss with others as well.

@madejejej
Contributor Author

> re: JSONB IR -> sqlite3_value do you mean this function or something else? at first glance, it doesn't look like the blob would also have a JSONB type, but i didn't look that hard.

Yeah, this one! For TEXT, the function sets a JSON subtype using a flag:

sqlite3_result_subtype(ctx, JSON_SUBTYPE);

If it doesn't do anything special for blobs, I'm wondering how it optimizes chained JSON function calls, e.g. SELECT jsonb('{"a":[1,2,3]}') -> 'a' -> 1. Would SQLite run parsing for jsonb('{"a":[1,2,3]}') -> 'a' again, or is there some magic to prevent it?
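
One possible reason a blob might not need special treatment, which I have not checked against SQLite's source: a JSONB blob can be navigated by walking element headers, so each step in the chain can slice out a sub-blob without building a parse tree. A Python sketch of that idea follows. All function names are hypothetical, and the key comparison only works for labels stored without escapes.

```python
ARRAY, OBJECT = 11, 12  # type codes in the low nibble of the header byte


def header(blob, pos):
    """Return (type_code, payload_start, payload_size) for the element at pos."""
    first = blob[pos]
    nibble = first >> 4
    if nibble <= 11:
        return first & 0x0F, pos + 1, nibble
    extra = 1 << (nibble - 12)  # size stored in the next 1, 2, 4 or 8 bytes
    size = int.from_bytes(blob[pos + 1:pos + 1 + extra], "big")
    return first & 0x0F, pos + 1 + extra, size


def children(blob, pos):
    """Yield the offset of each direct child of the container at pos."""
    _, start, size = header(blob, pos)
    end = start + size
    while start < end:
        yield start
        _, child_start, child_size = header(blob, start)
        start = child_start + child_size


def element_bytes(blob, pos):
    """Slice out the complete element (header plus payload) at pos."""
    _, start, size = header(blob, pos)
    return blob[pos:start + size]


def get_key(blob, key):
    """Rough equivalent of `blob -> 'key'` for a top-level object."""
    if header(blob, 0)[0] != OBJECT:
        return None
    offsets = list(children(blob, 0))
    # Object payloads alternate label, value, label, value, ...
    for label_pos, value_pos in zip(offsets[0::2], offsets[1::2]):
        _, start, size = header(blob, label_pos)
        if blob[start:start + size] == key.encode("utf-8"):
            return element_bytes(blob, value_pos)
    return None


def get_index(blob, index):
    """Rough equivalent of `blob -> index` for a top-level array."""
    if header(blob, 0)[0] != ARRAY:
        return None
    for i, child_pos in enumerate(children(blob, 0)):
        if i == index:
            return element_bytes(blob, child_pos)
    return None
```

With this, `get_index(get_key(doc, "a"), 1)` mirrors `doc -> 'a' -> 1`: the intermediate result is itself a valid JSONB blob, so the second step starts from bytes rather than from re-parsed text.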

> > I believe an important part of this task would be to understand how we want to pass the JSONB value around. SQLite has a JsonParse struct which is used internally by the JSON functions.
>
> ah yeah, that makes a lot of sense. i wrote some preliminary thoughts in the other PR (sorry to split discussion threads) wrt usable vs. performant internal representations, but best to discuss with others as well.

Replied there 👍

> i don't have a preference right now. interested in helping but i haven't contributed to limbo yet, so you probably have a better idea of how collaborating might work than i do.

I'd say feel free to do whatever interests you most :) I'm on Limbo's Discord so we can chat there :)

@madejejej madejejej force-pushed the jsonb branch 2 times, most recently from 3bb338e to 478ae1b Compare February 10, 2025 08:03