
IPPM graph refactor #391

Open · caiw wants to merge 83 commits into main from ippm-graph-refactor
Conversation

@caiw (Member) commented Oct 25, 2024:

Main changes

Note

I'M SORRY that this is such a big diff. I'm aware it makes it hard to review. If I had had more time I'd have written a shorter letter, etc.

Also, I'm sorry that I have renamed some classes/files in a way which prevents GitHub from showing diffs properly. In particular, builder.py and IPPMBuilder are more or less replaced by grapy.py and IPPMGraph. data_tools.py was something of a collection of miscellaneous functionality, which has been moved into other classes.

Little fixes

A few issues I created and fixed in the course of the refactor.

Unblocks

Issues removed from scope, but which can now be tackled, hopefully more easily.

@caiw added the ⚙️ refactor (changes which relate to code structure rather than functionality) and IPPM generation labels · Oct 25, 2024
@caiw self-assigned this · Oct 25, 2024
@caiw force-pushed the ippm-graph-refactor branch from f91409f to b949c97 · November 1, 2024 13:19
@caiw force-pushed the ippm-graph-refactor branch from cd9e56b to 26d14ec · November 1, 2024 16:14
Inline review thread on these parameters:

should_merge_hemis: bool = False,
should_exclude_insignificant: bool = True,
should_shuffle: bool = True,
should_normalise: bool,

Collaborator:

It might be worthwhile to add some default values for these parameters. It comes down to which parameters you think have high variance across runs and which ones will remain constant. The constant ones can be given optional/default values.

Collaborator:

I think should_normalise, should_cluster_only_latency, should_max_pool will be False for most runs unless someone wants to experiment.

should_shuffle will be True. As for the threshold for significance, I'm not sure, although estimating it from the data seems better than an absolute threshold (exclude_points_above_n_sigma > exclude_logp_vals_above).

@caiw (Member, author) commented Feb 7, 2025:

I'd left the existing default values in the subclasses, but it sounds like it makes logical sense for them to be set in the base class too.

@caiw (Member, author) commented Feb 7, 2025:

For now I've left the defaults for the should_exclude_above_n_sigma params as 5, which is what it was originally, but we can revisit that.
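For concreteness, here's roughly what the base-class signature could look like, with defaults following the discussion above (a sketch only; the exact parameter set is whatever DenoisingStrategy ends up taking):

from abc import ABC


class DenoisingStrategy(ABC):
    def __init__(
        self,
        should_merge_hemis: bool = False,
        should_exclude_insignificant: bool = True,
        should_shuffle: bool = True,
        should_normalise: bool = False,
        should_cluster_only_latency: bool = False,
        should_max_pool: bool = False,
        exclude_points_above_n_sigma: float = 5,
    ):
        ...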

@neukym (Member) commented Feb 12, 2025:

demo_ippm.ipynb isn't working quite right for me: This seems ... partially denoised?
[Screenshot: partially denoised output from demo_ippm.ipynb]

@caiw (Member, author) commented Feb 12, 2025:

@neukym Yeah I saw this, it is related to the non-passing tests. Something trivial I think but I've not tracked down what yet.

@caiw (Member, author) commented Feb 14, 2025:

Here's a record of where I got to before signing off this evening. @anirudh1666 perhaps this will help as you look at the tests; and I'll pick up again here next week.

I've been stepping through demo_ippm.ipynb on both main and this branch. As @neukym noted above, the denoising "isn't working" on this branch.

In particular, here's how the full "Kymata2023Q3" dataset looks when denoised on main:

[screenshot: denoised dataset on main]

and on branch:

[screenshot: denoised dataset on branch]

My plan was to simplify this comparison by just filtering down to one transform at the top of the notebook using

expression_set: HexelExpressionSet = KymataMirror2023Q3Dataset().to_expressionset()
expression_set = expression_set[[
    "TVL loudness (short-term)",
]]

Now here's something interesting.

On main, this looks as expected:
[screenshot: single-transform plot on main]

And now on branch:

[screenshot: single-transform plot on branch]

So when filtering down to one function, the branch works just like main.

Now moving up to two functions.

expression_set: HexelExpressionSet = KymataMirror2023Q3Dataset().to_expressionset()
expression_set = expression_set[[
    "TVL loudness (instantaneous)",
    "TVL loudness (short-term)",
]]

On main:

[screenshot: two-transform plot on main]

On branch:

[screenshot: two-transform plot on branch]

So somehow the denoising is changing when adding multiple functions, on branch.

Next comment is one difference I've found, but it needs some discussion/thinking.

@caiw (Member, author) commented Feb 14, 2025:

TL;DR

  1. @neukym @anirudh1666 Is the Šidák correction on branch out by an order of magnitude, and if so, would this change the recommendations of the paper?
  2. @neukym Šidák correction on main currently doesn't take into account the number of transforms being included in the IPPM — but should it?

How are points filtered for significance?

On main, points are filtered for significance using a threshold set in the strategy init:

class DenoisingStrategy(ABC):
    """
    Superclass for unsupervised clustering algorithms.
    Strategies should conform to this interface.
    """

    def __init__(self, ..., should_exclude_insignificant: bool = True, ...):
        ...
        self._threshold_for_significance = (
            DenoisingStrategy._estimate_threshold_for_significance(
                normal_dist_threshold  # defaults to 5; i.e. 5-sigma thresholding
            )
        )

Here's the entirety of estimate_threshold_for_significance(), minus the docstring:

    @staticmethod
    def _estimate_threshold_for_significance(x: float) -> float:
        def __calculate_bonferroni_corrected_alpha(one_minus_probability):
            return 1 - pow(
                (1 - one_minus_probability), (1 / (2 * TIMEPOINTS * NUMBER_OF_HEXELS))
            )

        probability_X_gt_x = 1 - NormalDist(mu=0, sigma=1).cdf(x)
        threshold_for_significance = __calculate_bonferroni_corrected_alpha(
            probability_X_gt_x
        )
        return threshold_for_significance

A quick note here that this is called "bonferroni" because the code was copied from an old version of the codebase before @ollieparish rightly pointed out this is in fact Šidák correction not Bonferroni correction.
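(Aside, to make the difference concrete: with m comparisons at familywise level alpha, Bonferroni uses alpha / m per comparison, while Šidák uses 1 - (1 - alpha)**(1/m); for small alpha the two are nearly identical. A toy check, unrelated to our actual constants:)

alpha, m = 0.05, 100
print(alpha / m)                   # Bonferroni: 5.0e-4
print(1 - (1 - alpha) ** (1 / m))  # Šidák:      ≈ 5.128e-4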

Questions about this correction for @neukym

Is this correct? Is this correcting probability, or one-minus-probability as suggested in the variable name? It looks to me like probability_X_gt_x should be the upper tail of the normal distribution, which is to say it's (depending on how you look at it) the p threshold not 1-p threshold. So I think the variable name is a little misleading, and that the signature is otherwise correct.

As another point of comparison, on main currently there's also a snippet in plot/plot.py doing sidak correction for the alpha threshold:

    sidak_corrected_alpha = 1 - (
        (1 - alpha)
        ** np.longdouble(1 / (2 * len(expression_set.latencies) * n_channels * len(show_only)))
    )

(This code was recently touched by @young-x-skyee, but only to change the float to a longdouble.) So here it looks like the correction is applied to the alpha level, not 1-p. On branch, this is refactored into a single function in one place, assuming that it's working on p.

How much correction to apply

Of particular note however is that the number of timepoints and hexels are stored in constants in constants.py:

TIMEPOINTS = 201
NUMBER_OF_HEXELS = 200000

(These were previously stored in vars in a few places, so I just put them in one place.)

My understanding is that the denominator in the exponent, everything after the 2 * (i.e. TIMEPOINTS * NUMBER_OF_HEXELS in the IPPM code and len(expression_set.latencies) * n_channels * len(show_only) in the expression plot code) is the number of comparisons we're correcting for.

(Bonus question for @neukym: what's that 2 doing there? If that's 2 hemispheres I don't think we need it if we just count all the hexels. Thoughts?)

It would be nice if these values were derived from the data rather than stored as constants, and I've tried to do this on branch. However, if this is true, then I think the constants are wrong...

len(expression_set.latencies)  # 201 - that's correct
len(expression_set.hexels_left) + len(expression_set.hexels_right)  # 20484

So 20,484 is out from NUMBER_OF_HEXELS = 200_000 by a factor of 10.
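To put rough numbers on what that does to the corrected threshold (a back-of-envelope sketch with plain floats, not the longdouble used in the real code):

from statistics import NormalDist

alpha = 1 - NormalDist(mu=0, sigma=1).cdf(5)  # 5-sigma p-threshold, ≈ 2.87e-7

m_constants = 2 * 201 * 200_000               # as in constants.py (including the 2)
m_data      = 201 * 20_484                    # counts taken from the data, one transform

print(1 - (1 - alpha) ** (1 / m_constants))   # ≈ 3.6e-15
print(1 - (1 - alpha) ** (1 / m_data))        # ≈ 7.0e-14, i.e. roughly 20x less strict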

Another point arises when comparing to the Šidák correction in the expression plotting code: it includes the number of transforms in the correction factor (that's len(show_only) in that code). In trying to derive the IPPM threshold from the data, I have included that logic too. Here's the IPPM threshold on branch:

            def get_threshold(es: ExpressionSet) -> float:
                """Get Šidák n-sigma threshold"""
                n_comparisons = len(es.transforms) * len(es.latencies) + get_n_channels(es)
                return p_to_logp(sidak_correct(p_threshold_for_sigmas(exclude_points_above_n_sigma),
                                               n_comparisons=n_comparisons))
def sidak_correct(alpha: float, n_comparisons: int) -> float:
    """Applies Šidák correction to an alpha threshold."""
    return 1 - (
        (1 - alpha)
        ** longdouble(1 / (2 * n_comparisons))
    )

(just converting to logp values here because that's how probability is stored in expression sets).

But then my question is: am I right to do this? On main, the IPPM code doesn't adjust the threshold depending on how many transforms are in play - but should it? Or should we treat each transform entirely independently?

On main, about 86% of the points are removed using this threshold before denoising even begins, so it makes a big difference to what's going on.

Conclusion

The branch code is obviously doing something wrong. But given that, by some argument, the threshold should change depending on the number of transforms included in the IPPM, and that this interacts with the behaviour observed on branch, I think we should resolve this question before continuing.

@caiw (Member, author) commented Feb 14, 2025:

(And of course I hasten to add that if there is an error on main, it's entirely something which @neukym and I should have picked up in review — that's on us!)

@neukym (Member) commented Feb 15, 2025:

Yes I agree.

Bonus question for @neukym: what's that 2 doing there? If that's 2 hemispheres I don't think we need it if we just count all the hexels. Thoughts?

Yes, you are right, the 2 is there for the number of hemispheres.

Is this correct? Is this correcting probability, or one-minus-probability as suggested in the variable name?

Good catch. The correct one is in plot.py:

  sidak_corrected_alpha = 1 - (
        (1 - alpha)
        ** np.longdouble(1 / (2 * len(expression_set.latencies) * n_channels * len(show_only)))
    )

...although, as you point out, the 2 is erroneous here if n_channels can be either the total hexels or the total sensors; it is a legacy from when this was 'number of hexels per hemisphere'. Your sidak_correct in probability.py is a good solution, I think; plot.py, ippm.py, etc. should all refer back to this one function:

def sidak_correct(alpha: float, n_comparisons: int) -> float:
    """Applies Šidák correction to an alpha threshold."""
    return 1 - (
        (1 - alpha)
        ** longdouble(1 / (n_comparisons))
    )

where n_comparisons = n_timepoints * n_transforms * n_channels. Note I have just removed the erroneous 2 from this function myself in commit 300662e.

@neukym Šidák correction on main currently doesn't take into account the number of transforms being included in the IPPM — but should it?

Yes I think it should - we take account of this in main's plot.py, and this is the one we should be using.

@neukym @anirudh1666 Is the Šidák correction on branch out by an order of magnitude, and if so, would this change the recommendations of the paper?

Yes, you are right that the following logic is correct:

len(expression_set.latencies)  # 201 - that's correct
len(expression_set.hexels_left) + len(expression_set.hexels_right)  # 20484

NUMBER_OF_HEXELS = 200000 is incorrect, and you are right to discard it. 👍

@neukym Šidák correction on main currently doesn't take into account the number of transforms being included in the IPPM — but should it?

Yes, I think it should. To standardise things, perhaps we should add this function to probability.py, and then always use this?:

def calculate_alpha_threshold(n_timepoints, n_channels, n_transforms):
    n_comparisons = n_timepoints * n_channels * n_transforms
    alpha = p_threshold_for_sigmas(5)  # use 5-sigma
    alpha_threshold = sidak_correct(alpha, n_comparisons)
    return alpha_threshold

Where

n_timepoints = len(expression_set.latencies)  # 201
n_channels = len(expression_set.hexels_left) + len(expression_set.hexels_right)  # 20484
n_transforms = len(show_only)

This then standardises everything into a single function, and it forces everyone to pass in the number of time points, channels and transforms explicitly. (I'm worried that if we don't make this function explicit, there's a risk that one of us will forget to put in the number of transforms, as has been happening so far.)

The problem

I think I am right in saying the denoiser uses this threshold to do an initial filter, before it starts clustering. If we were to use the above logic this would mean that the denoised expression dataset is specific to the scope of some specific set of transforms in show_only, which doesn't seem quite right. We perhaps need to chat about this - I hadn't thought of this situation - that the denoising procedure would be reliant on the value in show_only.

One safe hack would be not to include n_transforms in the denoiser, so for that, in DenoisingStrategy.py the input values for calculate_alpha_threshold would be:

n_timepoints = len(expression_set.latencies)  # 201
n_channels = len(expression_set.hexels_left) + len(expression_set.hexels_right)  # 20484
n_transforms = 1

We then reapply:

n_timepoints = len(expression_set.latencies)  # 201
n_channels = len(expression_set.hexels_left) + len(expression_set.hexels_right)  # 20484
n_transforms = len(show_only)

in the IPPM plotting step? This would ensure that we only see the clusters that are significant at this higher threshold. Bit hacky though.

The other, cleaner, solution is that denoising simply isn't callable unless you are plotting an IPPM (in which case show_only will be available). This would require the denoiser to work quite quickly, so it can be done on the fly each time you plot an IPPM.

@anirudh1666 (Collaborator) commented:

@caiw @neukym
Sidak Correction Response

Would the Sidak correction change the recommendations of the paper?
My understanding is that with the new threshold, the denominator will increase, so sidak_corrected_alpha will be lower. Which in turn should raise the threshold for significance? (I may be wrong.) In this case, it may have some impact on the results but not significantly. The results show that taking the spike with the largest log p-value is optimal, so this change may remove some spikes that were previously maximally significant. Consequently, some transforms which had significant nodes may no longer have any significant nodes. It wouldn’t change the location of nodes or the most significant node. With this in mind, I don’t think the results will be significantly impacted, but we should definitely verify this once this branch is merged into main.

Response to other questions

I agree with your corrections and Andy’s responses. Putting this in a central location to make it consistent across the Kymata pipeline also gets a +1 for me.

The problem highlighted by Andy

I just want to confirm my understanding of this first. So, you are saying that when we generate the expression plot for all of the transforms, we have a threshold. But when we do denoising, the threshold for significance changes based on the subset size. This intuitively feels wrong because selecting a subset of the results should not change the results? If my understanding is correct, I think it is better to use the threshold for all of the transforms even for the subset. I don’t like the hacky solution because the plotting doesn’t accurately represent what is input into the denoiser. We can quickly discuss this if necessary.

Update on DenoisingStrategies integration/unit tests and denoising bug

I’ve been tinkering with the new denoising strategies and ExpressionSets this past week. Unfortunately, I think it might take some time to fix these bugs and write the tests, but I will continue working on it whenever I have some time. So far, I’ve been able to create a simple integration test that passes successfully, and I’ve identified a few areas of concern.

There may be some problems in ExpressionSet._clear_points. For example, it does not clear the insignificant points, which is why the denoised time series has the small black spikes. The fix is to tag the points to retain and remove the rest, rather than tag the points to remove and remove them (the points to remove never include the insignificant points, so those are never removed). There are other problems too: for example, shuffling the points changes the coordinates in the sparse array, but we don’t reflect this in _clear_points, which assumes the points are ordered the same. I think there are some other problems as well, based on my experiments, but I haven’t found the root cause for them all yet.

At the moment the code I have written is mainly for my own understanding/prototyping rather than to commit, so I don’t have anything to commit yet. I will continue to look into this, with the following deliverables:

  • Refactor ExpressionSet, and maybe some of DenoisingStrategy. The ExpressionSet class is quite big, so I think it would be good to refactor it into small modules that abstract away some of the complexity, which would make it more readable and debuggable. I will also add some pretty-printing functionality, as it can be hard to trace the state of ExpressionSets using print statements alone. This will take time, though, as I need to build an in-depth understanding of ExpressionSet and find good cut points.
  • Fix the denoising bug(s). The goal is to have the demo_ippm.ipynb notebook working again. I can’t really estimate a bug fix in advance, but I will continue working on this until denoising is functional again. As discussed with Cai, the lack of denoising is what is causing the time complexity to explode, because generating the IPPM has combinatorial time complexity, and having ~2004084 spikes leads to massive latency.
  • Integration tests for DenoisingStrategies. My testing philosophy now is to test the public interfaces extensively with a variety of inputs instead of painstakingly testing each function: testing every function individually makes refactoring/maintaining the tests tedious, and the private functions are tested indirectly through the public interface anyway. Also, integration tests > unit tests (unit tests break too often with changes). As a result, I will write some integration tests for DenoisingStrategy that do not mock out the clustering functionality, which should make them easier to maintain for people unfamiliar with mocking. The trick is to cover as many of the input combinations a user can provide as possible (or at least the most important ones). There are exceptions: if there is a particularly complex private function, I will test it directly to ensure it is working as intended.

I will make my changes on ippm-graph-refactor-anirudh to avoid clashing with Cai’s changes.
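To illustrate the retain-vs-remove distinction in _clear_points mentioned above, here is a toy sketch on a scipy COO matrix (not the actual ExpressionSet internals, which I am still mapping out):

import numpy as np
from scipy.sparse import coo_matrix


def clear_points_by_keeping(mat: coo_matrix, keep: np.ndarray) -> coo_matrix:
    """Rebuild the sparse matrix from ONLY the retained entries.

    Everything not explicitly kept is dropped, including insignificant points
    that were never candidates for the denoiser (which a "points to remove"
    list built from the denoiser's output would miss).
    """
    return coo_matrix(
        (mat.data[keep], (mat.row[keep], mat.col[keep])),
        shape=mat.shape,
    )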

PS (unasked-for programming principles, but they may help with refactoring/implementing features in the future)

The most useful (also funniest) guide I have found for software engineering is the following: https://grugbrain.dev/. Personally, I think we could have done this refactoring a bit better by doing it in smaller increments and testing after each step. Debugging becomes really difficult with large changes and in the longer run, takes more time than doing it in smaller increments. The saying for this is “Don’t outrun your headlights”. I would have also preferred starting with a simpler version of ExpressionSet that is correct and works, then iteratively optimise and generalise it, making sure the tests/demo_ippm are working after each iteration. As the saying goes, “Premature optimisation is the root of all evil”. Another advantage of starting simple then optimising is that the simpler version is more flexible and amenable to change based on feedback. The most optimised code makes the strongest assumptions about the context it will be operating in, which can make it brittle. Of course, you may have valid reasons for your decisions, which is fine as well, in which case, you can always disregard these tips.

In the link provided, the grug developer explains this in the “Refactoring” section.

@caiw (Member, author) commented Feb 21, 2025:

Hey, I've not been able to get to this properly today for half-term childcare reasons, but I have some thoughts. I'll get back to it properly next week.

Alpha-level correction

@neukym said:

Note I have just removed the erroneous 2 from this function myself in commit 300662e.

Great, I'll use this, and I've added some tests for specific values, using an online Sidak calculator to verify the correct results.

@caiw said:

Šidák correction on main currently doesn't take into account the number of transforms being included in the IPPM — but should it?

@neukym said:

Yes, I think it should. To standardise things, perhaps we should add this function to probability.py, and then always use this?

We could, but I think we should push this to another PR. In particular, we describe Šidák correction in the plot axis name for expression plots, so I'm slightly unsure about also fixing that choice elsewhere. If some reviewer at some point demands Bonferroni correction, or we decide that one correction or another makes more sense, I wonder if we should keep it a bit more procedural.

I don't feel strongly on this point.

@anirudh1666 said:

But when we do denoising, the threshold for significance changes based on the subset size. This intuitively feels wrong because selecting a subset of the results should not change the results?

I know what you mean, but it's all about controlling the chance that we find false signal in the noise. The more things we test, the more likely something is to appear significant by chance. The alpha level is precisely the level of "false discoveries" we are willing to accept. So the more transforms we let into a given "omnibus test" (testing multiple things all at the same time), the more likely one of them is to pass a threshold by chance, and the more we have to restrict ("correct") the threshold for significance to control the overall false-discovery rate — I think each transform we add (and every latency and hexel) is like an extra jellybean colour, which is why we have to control the alpha level so aggressively.
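(To put rough numbers behind the jellybean point: with a fixed per-comparison threshold, the chance of at least one false positive grows with every transform you add. A toy calculation with a made-up per-test alpha:)

alpha_per_test = 1e-8                    # made-up fixed per-comparison threshold
for n_transforms in (1, 2, 5):
    m = n_transforms * 201 * 20_484      # comparisons: transforms x latencies x hexels
    print(n_transforms, 1 - (1 - alpha_per_test) ** m)
# 1 -> ~0.040, 2 -> ~0.079, 5 -> ~0.186: the familywise error rate inflates as
# transforms are added, which is why the corrected threshold has to tighten.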

@neukym is the expert on the stats comparisons made in the kymata gridsearch, and under certain circumstances we can control the FDR without correcting a threshold in this brute-force way, but doing the correction is the safest/most conservative way. Unless @neukym would argue the model comparison we do is somehow automatically controlling the familywise error rate (I've not thought about this in full), I think we should do full correction for all comparisons.

@neukym said:

I think I am right in saying the denoiser uses this threshold to do an initial filter, before it starts clustering. If we were to use the above logic this would mean that the denoised expression dataset is specific to the scope of some specific set of transforms in show_only, which doesn't seem quite right. We perhaps need to chat about this - I hadn't thought of this situation - that the denoising procedure would be reliant on the value in show_only.

So I've changed my mind on this. We're using an alpha threshold to apply an initial filter to points passed to the denoiser, but we're not actually using it for a formal omnibus NHST, so in that sense we don't need to control the FWER as strictly as we do when presenting an expression plot (where the coloured points are RESULTS, so we need to control the FWER in those results). So long as the nodes in the final IPPM survive a fully corrected NHST, I think we are fine applying an independent threshold to each transform going into the denoiser.

To be clear, my recommendation is the same as @neukym's:

  1. Apply a fixed (n_transforms = 1) threshold to the ExpressionSet — this is essentially to help the denoiser, so it doesn't matter if it's not fully corrected.
  2. Apply the denoiser.
  3. Apply a corrected (n_transforms = len(transforms)) threshold to the surviving clusters to ensure we don't have uncontrolled FWER.

If this happens within the IPPM-building step, it can be documented in code, and the thresholds should* be quick to calculate and apply.

(*As @anirudh1666 points out there is currently a bug in this, but that's easily fixable)
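Roughly, as a sketch (calculate_alpha_threshold is @neukym's proposed helper above; the denoise and exclude_insignificant calls are placeholders for whatever the DenoisingStrategy / ExpressionSet API ends up looking like):

n_timepoints = len(expression_set.latencies)
n_channels = len(expression_set.hexels_left) + len(expression_set.hexels_right)
n_transforms = len(expression_set.transforms)

# 1. Loose, per-transform threshold (n_transforms = 1): a pre-filter to help the denoiser.
alpha_loose = calculate_alpha_threshold(n_timepoints, n_channels, 1)

# 2. Denoise each transform independently at the loose threshold.
denoised = strategy.denoise(expression_set, alpha=alpha_loose)            # placeholder API

# 3. Fully corrected threshold over all transforms, applied to the surviving clusters,
#    so the nodes that make it into the IPPM have controlled FWER.
alpha_strict = calculate_alpha_threshold(n_timepoints, n_channels, n_transforms)
denoised = denoised.exclude_insignificant(alpha=alpha_strict)             # placeholder API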

@neukym said:

The other, cleaner, solution is that denoising simply isn't callable unless you are plotting an IPPM (in which case show_only will be available). This would require the denoiser to work quite quickly, so it can be done on the fly each time you plot an IPPM.

I don't favour this exactly — there are other reasons we may want to denoise multiple transforms in future which don't directly result in IPPMs (e.g. see what Tianyi is doing with thousands of related transforms from ANN nodes). As previously discussed, I think we should formalise denoising as a process on ExpressionSets, and decouple it from the process of drawing an IPPM from a denoised ExpressionSet.
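Something like the following, purely illustrative (none of these names or signatures are final):

# Denoising as a transformation ExpressionSet -> ExpressionSet...
denoised: HexelExpressionSet = strategy.denoise(expression_set)   # placeholder API

# ...with IPPM construction/plotting as a separate, decoupled step.
graph = IPPMGraph(denoised)   # class from this PR; exact constructor illustrative only
plot_ippm(graph)              # hypothetical plotting entry point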

@anirudh1666 said:

My understanding is that with the new threshold, the denominator will increase, so sidak_corrected_alpha will be lower. Which in turn should raise the threshold for significance? (May be wrong)

It's the other way: on main we're correcting for ~10x the comparisons, which means the threshold will be lower, because α << 1. (Lower in absolute terms — always confusing with p-values, -log p-values, etc.) So on main the number of points going into the denoiser is much smaller than it will be once the correction bug is fixed on branch.

@anirudh1666 said:

Consequently, some transforms which had significant nodes may no longer have any significant nodes. It wouldn’t change the location of nodes or the most significant node

Yeah, that makes sense to me — good point.

Refactoring ExpressionSet

@anirudh1666 said:

  • Refactor ExpressionSet, maybe some of DenoisingStrategy. The expression set class is quite big, so I think it would be good to refactor it into small modules that abstract away some of the complexity, which makes it more readable and debuggable. Also, will add some pretty printing functionality as it can be hard to trace the state of ExpressionSets using solely print statements. This will take time though as I need to build an in-depth understanding of ExpressionSet and find good cut points.

I'm loath to do this lightly. We spent quite some time building this class to be performant in terms of space and time, and also future-proof in terms of possible additional data sources, e.g. OPM, EEG, ECoG.

In particular, compared to what the kymata.org API serves, which is what the IPPM functions were originally coded against, ExpressionSets contain a lot more information (i.e. the p-values of all the non-best transforms at each datapoint), but we do use that additional information in our analyses.

In particular ExpressionSets do model selection — i.e. always showing the best model at each datapoint from the available set — automatically, as part of the data model. In the API version, this model selection is done prior to the data model, and as such ad-hoc model selection is impossible — a batch of results from the API is what it is and can't be explored in the same way an ExpressionSet can.

Similarly...

@anirudh1666 said:

 I would have also preferred starting with a simpler version of ExpressionSet that is correct and works, then iteratively optimise and generalise it, making sure the tests/demo_ippm are working after each iteration.

Yeah, I fully mea-culpa that this is a monster refactor and that's not nice for anyone and has made the testing hard — sorry! The points made in the blogpost are well taken!

I will say though that ExpressionSet has been around for ages as a class and is used for a lot of disk i/o as well as the main kymata gridsearch, so one goal of this PR was rather to align the IPPM code with the core data model of the rest of kymata-core, as opposed to maintaining a separate, parallel data model for purposes of denoising.

We previously (in the Matlab version of the code, around for years) had a much simpler/more modular/single-user version of ExpressionSet called "endtables", which caused us a lot of problems. In particular, the ability to split and combine ExpressionSets computed in different experimental runs, have them validate the legitimacy of this operation, and perform flexible model selection as part of these split/combine operations is really important, and in our previous more modular approach this required a lot of user bookkeeping and was a source of errors and mental effort. It was up to the user to keep track of which functions had been added to and removed from the endtable, especially when saving and loading data to disk, rather than the class itself automatically tracking this.

IMO, the mathematical framework we're trying to describe in the class here is just inherently rather complex, and that complexity is admittedly reflected in the class. However, previously, when we had a simpler class, it actually just pushed that complexity onto the user, requiring them to do things in a certain order and keep track of state using things like the choice of file names (😱) and the top-level script, which meant a fairly deep understanding of the underlying maths was required just to write a top-level script which behaved correctly. What we pay for in class complexity here we get back, to some degree, in "automatic correctness" of the behaviour.

I guess what I'm saying is that I'm not sure the optimisation is as premature as it may appear :)

But

I'm absolutely not saying that these classes can't be improved! Maybe even radically, and I do appreciate your insights into writing maintainable code - this isn't my background. What I'm saying is I would recommend we consider this as a separate issue, and not a requirement of aligning the IPPM code with the ExpressionSet class as-is (i.e. this PR).

Having said that...

@anirudh1666 said:

There may be some problems in ExpressionSet._clear_points

and

There are other problems too, for example, shuffling the points changes the coordinates in the sparse array

This .clear_points functionality is new as of this PR, so it is not as battle-tested as the rest of the class, and I am sure you're right about these bugs. This is the priority for me to fix, and I think it is the problem.

Next

Rather than @anirudh1666 spending ages wrestling with the unfamiliar ExpressionSet class, I think I should take another shot at stepping through the tests on main vs branch to see where the broken integration tests are actually failing. I believe Anirudh is right that it's in .clear_points and shuffling, and clearly the behaviour is wrong, so that's where to look. @neukym, happy to chat about this in terms of general priorities, but maybe that's the next step for this PR?

@caiw (Member, author) commented Feb 21, 2025:

Anyway, long comment finally posted now I have internet access again. Thanks @neukym and @anirudh1666 for the thoughtful comments!

@neukym (Member) commented Feb 21, 2025:

Thanks both - great discussion.

@neukym, happy to chat about this in terms of general priorities, but maybe that's the next step for this PR?

Yes, completely agree.
