Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Equivalence class projection does not find new equivalent classes correctly #14326

Open
askalt opened this issue Jan 27, 2025 · 2 comments · May be fixed by #14327
Open

Equivalence class projection does not find new equivalent classes correctly #14326

askalt opened this issue Jan 27, 2025 · 2 comments · May be fixed by #14327

Comments

@askalt
Copy link
Contributor

askalt commented Jan 27, 2025

Consider the following case:

We have an exec with schema = [a,b,c] and an equivalent class: a == b. We project [a+c,b+c] from it. As a==b we can conclude that a+c==b+c, but that case does not work for now, i.e. the next test fails:

#[tokio::test]
async fn test_stat_projection_equivalent_classes() -> Result<()> {
    // - columns: [a, b, c].
    // - "a" and "b" in the same equivalence class.
    // - then after a+c, b+c projection col(0) and col(1) must be
    // in the same class too.
    let schema = Arc::new(Schema::new(vec![
        Field::new("a", DataType::Int32, false),
        Field::new("b", DataType::Int32, false),
        Field::new("c", DataType::Int32, false),
    ]));
    let mut eq_properties = EquivalenceProperties::new(Arc::clone(&schema));
    eq_properties.add_equal_conditions(&col("a", &schema)?, &col("b", &schema)?)?;

    let properties = PlanProperties::new(
        eq_properties,
        Partitioning::UnknownPartitioning(1),
        EmissionType::Both,
        Boundedness::Bounded,
    );

    let projection = ProjectionExec::try_new(
        vec![
            (
                binary(
                    col("a", &schema)?,
                    Operator::Plus,
                    col("c", &schema)?,
                    &schema,
                )?,
                "a+c".to_owned(),
            ),
            (
                binary(
                    col("b", &schema)?,
                    Operator::Plus,
                    col("c", &schema)?,
                    &schema,
                )?,
                "b+c".to_owned(),
            ),
        ],
        Arc::new(PropertiesExec {
            properties,
            schema: Arc::clone(&schema),
        }),
    )?;

    let actual_properties = projection.properties();
    let eq_group = actual_properties.eq_properties.eq_group();

    assert!(!eq_group.is_empty());
    Ok(())
}

It is because current code that builds new classes just compares source expressions:

let mut new_classes: IndexMap<Arc<dyn PhysicalExpr>, EquivalenceClass> =
IndexMap::new();
mapping.iter().for_each(|(source, target)| {
new_classes
.entry(Arc::clone(source))
.or_insert_with(EquivalenceClass::new_empty)
.push(Arc::clone(target));
});

Instead, I propose to build normalized mapping (where each source expression is normalized against existing equivalence classes) and then and then make a comparison.

@askalt
Copy link
Contributor Author

askalt commented Jan 27, 2025

Please, see the possible patch #14327

@alamb
Copy link
Contributor

alamb commented Jan 28, 2025

Thanks @askalt -- we'll check it out

FYI @ozankabak and @berkaysynnada

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants