Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom transforms for creating arbitrarily complex POROs with identity maps #76

Open
noteflakes opened this issue Jun 13, 2024 · 6 comments

Comments

@noteflakes
Copy link
Contributor

Right now Extralite supports custom transforms by providing a transform proc that takes the raw values and can return custom objects. A related idea is the ability to turn query results into arbitrarily complex PORO objects, with support for identity maps. This can be especially useful when performing joins where we want to turn the results into a bunch of related objects.

Some preliminary work towards this goal was done in commit bbcf12e.

Example query:

SELECT books.id, books.title, authors.id, authors.name
  FROM books LEFT OUTER JOIN authors ON authors.id = books.author_id

What we want is an identity map expressing relations as nested objects:

# The author object is the same for both book objects.
[
  {
    id: 1,
    title: "foo",
    author: {
      id: 1,
      name: "bar"
    }
  },
  {
    id: 2,
    title: "baz",
    author: {
      id: 1,
      name: "bar"
    }
  }
]

The projection transform is expressed using a DSL:

transform = Extralite.project do
  # the identity modifier is used to express which field is used as key to the identity map
  id.identity
  title
  author {
    id.identity
    name
  }
end

transform.class #=> Extralite::Transform

books = DB.query(transform, sql)

So instead of passing a proc, we pass a transform object. The transform is performed at the C-level, accompanied by identity maps for both book and author objects.

An important limitation for this is that if identity maps are used, the entire result set must be converted before any rows are returned to the application code.

@noteflakes
Copy link
Contributor Author

noteflakes commented Jun 13, 2024

Internal reperesentation of the transform:

SPEC = {
  identity_idx: 0,
  columns: [
    :id,
    :title,
    {
      name: :author,
      dentity_idx: 0,
      columns: [:id, :name]
    }
  ]
}

@noteflakes
Copy link
Contributor Author

Algorithm (without identity map):

class Transform
  def process(row_hash, spec)
    spec[:columns].each do |c|
      case c
      when Hash
        relation_hash = row_hash[c[:name]] = {}
        process(relation_hash, c)
      else
        row_hash[c] = get_next_column_value
      end
    end
  end
end

@noteflakes
Copy link
Contributor Author

With identity map:

def prepare_spec(spec)
  if spec[:identity_idx]
    spec[:id_map] = {}
  end
  spec[:columns].each do |c|
    prepare_spec(c) if c.is_a?(Hash)
  end
end

def process(spec)
  id = get_column_value(spec[:identity_idx])
  if (row = spec[:id_map][id])
    return row
  end

  row = {}
  populate_row(row, spec)
  row
end

def populate_row(row, spec)
  spec[:columns].each_with_index do |c, i|
    case c
    when Hash
      row[c[:name]] = process(c)
    else
      value = row[c] = get_next_column_value
      if i == spec[:identity_idx]
        spec[:id_map][value] = row
      end
    end
  end
end

@noteflakes
Copy link
Contributor Author

noteflakes commented Jun 13, 2024

Working example here: https://gist.github.com/noteflakes/3653ec742d76fe5b6e3557315fe8f503

I'm really pleased with how relatively simple it is. Remains to be seen how useful that can be... There are some additional features to add:

  • Support for one to one and one to many relationships.
  • Support for cyclical references. Using the example above, the author object might contain a list of all the relevant books.
  • Support for automatic "symmetric" relation where the relation is done on both sides, so automatic book[:author][:books]
  • Support for specifying column index.
  • Support for transforming individual values, e.g. use value as key for hash lookup, fetching some auxiliary object, etc.

@noteflakes
Copy link
Contributor Author

noteflakes commented Jun 16, 2024

Some more possibilities:

# custom value conversion
t = Extralite.transform do
  stamp(Time) # Time.at(v)
  value(Float) # v.to_f
  flag(:bool) # (v != 0)
  flag(MyClass) # MyClass.new(v)
  tag { |v| "(#{v})" } # custom conversion proc
end

# JSON object
t = Extralite.transform do
  __set(:json)
end

# Merge JSON
t = Extralite.transform do
  id.identity
  name
  __merge(:json)
end

# return identity map instead of array
t = Extralite.transform do
  emit_map
  id.identity
  name
end

# one_to_many relationship
t = Extralite.transform do
  id.identity
  title
  tags [{
    id.identity
    name
  }]
end

# one_to_many emitted as single values
t = Extralite.transform do
  id.identity
  title
  tags [{
    emit_single_value name
    id.identity
    name
  }]
end

@noteflakes
Copy link
Contributor Author

How are transforms used for actually fetching values:

t = Extralite.transform do
  id.identity
  title
  tags [{
    id.identity
    name
  }]
end

sql = <<~SQL
  select
    articles.id, articles.title,
    tags.id, tags.name
  from
    articles
  join
    tags on tags.article_id = articles.id
SQL

db.query(t, sql)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant