Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalize multiplicity witgen #2043

Merged
merged 26 commits into from
Nov 8, 2024
Merged

Generalize multiplicity witgen #2043

merged 26 commits into from
Nov 8, 2024

Conversation

georgwiese
Copy link
Collaborator

(mostly cherry-picked from #1958)

This PR generalizes the multiplicity witness generation we had in FixedLookup in the following ways:

  • Instead of searching for a column of a specific name, takes the witness column from the phantom lookups
  • As a result, it is also able to handle arbitrarily many lookups
  • The witness generation should work for all machine types which can be connected via lookups: Fixed lookup, block machines, and secondary VMs.

A common use-case that is not yet covered is global range constraints, as they are not represented as machine calls in witness generation. This should be fixed in a follow-up PR.

@georgwiese georgwiese changed the title [WIP] Generalize multiplicity witgen Generalize multiplicity witgen Nov 5, 2024
@georgwiese georgwiese marked this pull request as ready for review November 5, 2024 18:31
@Schaeff Schaeff self-requested a review November 7, 2024 08:44
})
.collect::<BTreeMap<_, _>>();

// Multiplicity columns are not connected to any machine, because we removed identities that reference
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that the higher-stage polynomial identities involving multiplicity columns are removed, but what about the phantom lookup? Shouldn't the RHS columns include the multiplicity column and thus be connected to the rest of the columns?

So I think we should change all_row_connected_witnesses to include the multiplicity column on the RHS columns.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the multiplicity does not appear in the RHS! It is an extra field in the phantom lookup that is the ignored by the rest of witgen (which treats phantom lookups and native ones the same).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But shouldn't we change all_row_connected_witnesses?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the multiplicities to lookup_witnesses, which is the starting point of all_row_connected_witnesses. Now, they don't need to be explicitly removed anymore.

@@ -186,16 +229,41 @@ pub fn split_out_machines<'a, T: FieldElement>(
// Note that this machine comes last, because some machines do a fixed lookup
// in their take_witness_col_values() implementation.
// TODO: We should also split this up and have several instances instead.

// Compute sizes of fixed lookup multiplicity columns.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to move the creation of the fixed lookup machine to a helper function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to FixedLookup::new


/// A map poly_id -> (index -> count).
/// Has a (possibly entry) entry for each value in `identity_id_to_multiplicity_column`.
counts: BTreeMap<PolyID, BTreeMap<usize, usize>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be better to use Vec instead of a BTreeMap for the inner type. Did you test that?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, a Vec that would be resized on demand (because we don't know the number of rows a priori)?

I have not tested this, but:

  • The counts should be accessed at most once per machine call, so I wouldn't expect a BTreeMap access to be a bottleneck.
  • I'd expect the vector to be sparse in practice. We typically use permutations to connect machines, so in practice I'd expect LogUp to be used only for fixed lookups (typically of size $2^{16}$ to $2^{18}$) and global range constraints. Only the former is handled in this PR (range constraints will need another pass over the data in an upcoming PR).

executor/src/witgen/generator.rs Outdated Show resolved Hide resolved

/// For a given identity ID, increments the count of the corresponding multiplicity column at the given index.
pub fn increment(&mut self, identity_id: u64, index: usize) {
if let Some(poly_id) = self.identity_id_to_multiplicity_column.get(&identity_id) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this panic if the identity is not in there instead of currently silently continuing?

Copy link
Collaborator Author

@georgwiese georgwiese Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The increment function is called for any machine call, but unless the machine is connected via a phantom lookup, no multiplicity needs to be tracked and this is a no-op, because identity_id_to_multiplicity_column would not have a key for identity_id.

executor/src/witgen/machines/block_machine.rs Outdated Show resolved Hide resolved
Comment on lines 40 to 71
pub fn generate_columns_single_size<T: FieldElement>(
&self,
size: DegreeType,
) -> BTreeMap<PolyID, Vec<T>> {
self.counts
.iter()
.map(|(poly_id, counts)| {
let mut column = vec![T::zero(); size as usize];
for (index, count) in counts {
column[*index] = T::from(*count as u64);
}
(*poly_id, column)
})
.collect()
}

/// Materializes the multiplicity columns, using different sizes for each of them.
pub fn generate_columns_different_sizes<T: FieldElement>(
&self,
sizes: BTreeMap<PolyID, DegreeType>,
) -> BTreeMap<PolyID, Vec<T>> {
self.counts
.iter()
.map(|(poly_id, counts)| {
let size: DegreeType = sizes[poly_id];
let mut column = vec![T::zero(); size as usize];
for (index, count) in counts {
column[*index] = T::from(*count as u64);
}
(*poly_id, column)
})
.collect()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way we can reduce duplication here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I can call generate_columns_different_sizes in generate_columns_single_size.

@georgwiese
Copy link
Collaborator Author

Nice, adding the multiplicity to Connection instead of maintaining another map improved the code a lot I think! Now, the code to create the fixed lookup is outrageously verbose and contains many duplications, planning to fix that before the next round of review.

@georgwiese
Copy link
Collaborator Author

OK, I think this is ready for another look :)

.insert(connection.id, *connection)
.is_none());
if let Some(multiplicity) = connection.multiplicity_column {
remaining_witnesses.remove(&multiplicity);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the multiplicity column occurs in a row-connected identity? I think we should at least check that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this branch, we're talking about a fixed lookup, right? So the entire RHS is fixed columns. Do you mean that there could be constraints that connect those to witness columns?

If yes, I think that would be quite esoteric (those witness columns would not depend on an input and could be derived from the fixed columns at compile time), and it was also not checked before this PR (we continued for any connection without witness columns on the RHS).


for connection in &all_connections {
// If the RHS only consists of fixed columns, record the connection and continue.
if connection.is_lookup()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe extract the condition into a function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I added a FixedLookup::is_responsible. In the near future, it should just have a try_new() like any other machine...


// Extract all witness columns in the RHS of the lookup.
let lookup_witnesses = {
let callee_columns = refs_in_selected_expressions(connection.right)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not introduce a function refs_in_connection_rhs that includes the multiplicity?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! I think this actually fixed a bug: If there are multiple phantom lookups with distinct multiplicities, we now include all multiplicities in the machine witness.

@chriseth chriseth added this pull request to the merge queue Nov 8, 2024
Merged via the queue into main with commit 2c016d5 Nov 8, 2024
14 checks passed
@chriseth chriseth deleted the refactor-multiplicity-witgen branch November 8, 2024 12:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants