Add several subclasses of Step/Generic - Make interactive usage succinct #30

Open
PeterDSteinberg opened this issue Oct 19, 2017 · 1 comment
@PeterDSteinberg

I made this notebook in elm with Landsat data. It shows usage patterns of Step from xarray_filters and some improvements we could use.

Implement and test the following ideas:

Step using a "for_each_array" decorated callable

class ForEachStep(Step):
    keep_attrs = True
    func = None
    pass_attrs = False
    def transform(self, X, y=None, **kw):
        kw = kw.copy()
        kw.update(self.get_params(deep=True).copy())
        # TODO here should we filter args and kwargs (see func_signatures.py in xarray_filters)
        # Pop the step's own options first so they are not forwarded to func
        func = kw.pop('func')
        keep_attrs = kw.pop('keep_attrs', True)
        if kw.pop('pass_attrs', False):
            kw['attrs'] = X.attrs
        dset = func(X, **kw)
        if keep_attrs:
            dset.attrs.update(X.attrs)
        return dset

Would be used with a function like this:

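import numpy as np

@for_each_array
def set_nans(arr):
    arr = arr.copy(deep=True)
    # Cast to float so the array can hold NaN
    arr.values = arr.values.astype(np.float32)
    # np.nan is the portable spelling (the np.NaN alias was removed in NumPy 2.0)
    arr.values[arr.values <= 1] = np.nan
    arr.values[arr.values == 2**16] = np.nan
    return arr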

And then using ForEachStep as a base class with parameters:

class SetNaNs(ForEachStep):
    func = set_nans    # <---- currently fails

Or by calling the ForEachStep constructor directly:

ForEachStep(func=set_nans).fit_transform(dset)
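
One plain-Python pitfall may be related to the failure noted above, although the issue does not confirm the cause: a function stored as a class attribute is bound as a method when read through an instance, so `self` is silently passed as the first argument. The sketch below uses a hypothetical `MiniStep` stand-in (not the real `Step`) and a hypothetical `double` function to show the binding, and that wrapping the callable in `staticmethod` avoids it.

```python
class MiniStep:
    """Hypothetical stand-in for Step; not the xarray_filters class."""
    func = None

    def transform(self, X):
        # Reading self.func goes through Python's descriptor protocol.
        return self.func(X)


def double(values):
    return [v * 2 for v in values]


class Binds(MiniStep):
    func = double                 # plain function: becomes a bound method


class NoBind(MiniStep):
    func = staticmethod(double)   # staticmethod: returned without binding


def try_transform(step, X):
    # Return the result, or a short message if the call raises TypeError.
    try:
        return step.transform(X)
    except TypeError as exc:
        return "TypeError: {}".format(exc)


bound_result = try_transform(Binds(), [1, 2])   # double() receives self as well
plain_result = try_transform(NoBind(), [1, 2])  # double() receives only X
```

If this is what happens in Step, reading the raw attribute from the class `__dict__` while building the parameter list would sidestep the binding without asking users to write `staticmethod` themselves.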

Step using a "data_vars_func" decorated callable

Make something like this:

class DataVarsStep(Step):
    # func should have signature of **data_vars (expecting data_vars of "X")
    func = None
    def transform(self, X, y=None, **kw):
        kw = kw.copy()
        kw.update(self.get_params(deep=True).copy())
        # TODO here should we filter args and kwargs (see func_signatures.py in xarray_filters)
        return kw.pop('func')(dset=X)

To be used with functions like normalized_diffs:

def normed_diff(a, b):
    return (a - b) / (a + b)

@data_vars_func
def normalized_diffs(**dset):
    print('Called with ', dset.keys())
    dset['ndwi'] = normed_diff(dset['layer_4'], dset['layer_5'])
    dset['ndvi'] = normed_diff(dset['layer_5'], dset['layer_4'])
    dset['ndsi'] = normed_diff(dset['layer_2'], dset['layer_6'])
    dset['nbr']  = normed_diff(dset['layer_4'], dset['layer_7'])
    return dset

Use as follows:

class NormedDiff(DataVarsStep):
    func = normalized_diffs                       # currently fails for reason mentioned above

Or like this with constructor:

DataVarsStep(func=normalized_diffs).fit_transform(dset)

Fix the descriptor pattern in Generic/Step

  • First, decide whether we need both Generic and Step in xarray.pipeline.py, or how their differences should be explained to the end user.
  • Make sure Generic / Step can use any type of data in the descriptor pattern that builds the parameter list for the step. For example, the snippet below fails because layers is a list and func is a callable, but it runs fine if both are set to None. Allowing any data type in the descriptor pattern would make usage easier, because the user would not have to pass func=something, layers=something on initialization (ChooseBands is just an example specific to a Landsat notebook).

class ChooseBands(Generic):                   # TODO - this section should work but currently doesn't
    include_normed_diffs = True
    layers = DEFAULT_LAYERS
    func = choose_bands

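
A minimal sketch of one way class-level defaults of any type could be collected into a parameter dict, assuming a hypothetical `MiniGeneric` base class rather than the real Generic/Step. The `DEFAULT_LAYERS` values and the body of `choose_bands` are placeholders invented for illustration, and the input is a plain dict instead of an xarray dataset. Attributes are read from each class `__dict__` via `vars()`, so a callable such as `func` is stored as-is and never bound as a method.

```python
import copy


class MiniGeneric:
    """Hypothetical stand-in for Generic/Step; not the elm implementation."""

    def __init__(self, **overrides):
        reserved = set(vars(MiniGeneric))      # names defined on the base class
        params = {}
        # Walk the MRO from base to leaf so subclass defaults win.
        for klass in reversed(type(self).__mro__):
            if klass in (object, MiniGeneric):
                continue
            for name, value in vars(klass).items():
                if name.startswith('_') or name in reserved:
                    continue
                # vars() returns the raw attribute, so functions are not bound.
                # Deep-copy so mutable defaults (lists) are not shared.
                params[name] = copy.deepcopy(value)
        unknown = set(overrides) - set(params)
        if unknown:
            raise TypeError('unknown parameters: {}'.format(sorted(unknown)))
        params.update(overrides)
        self._params = params

    def get_params(self, deep=True):
        return dict(self._params)

    def transform(self, X, y=None, **kw):
        params = self.get_params()
        func = params.pop('func')
        return func(X, **params)


DEFAULT_LAYERS = ['layer_1', 'layer_2']        # placeholder values


def choose_bands(X, layers=None, include_normed_diffs=True):
    # Placeholder body: keep only the requested keys of a plain dict.
    return {name: X[name] for name in layers if name in X}


class ChooseBands(MiniGeneric):
    include_normed_diffs = True
    layers = DEFAULT_LAYERS
    func = choose_bands


step = ChooseBands()                            # no func=..., layers=... needed
out = step.transform({'layer_1': 1, 'layer_2': 2, 'layer_3': 3})
custom = ChooseBands(layers=['layer_3'])        # defaults can still be overridden
```

A limitation of this sketch is that any new public method added by a subclass would also be treated as a parameter, so a real implementation would need a clearer rule for telling methods apart from callable parameters.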
@PeterDSteinberg changed the title from "Add several subclasses of Step" to "Add several subclasses of Step/Generic - Make interactive usage easier" on Dec 9, 2017
@PeterDSteinberg changed the title from "Add several subclasses of Step/Generic - Make interactive usage easier" to "Add several subclasses of Step/Generic - Make interactive usage succinct" on Dec 9, 2017
@PeterDSteinberg

One TODO to add to this issue:

  • If I create a Step or Generic subclass and define transform or fit_transform but not both, it should be assumed that fit_transform and transform are the same function (i.e. fit_transform = transform, or vice versa, is set automatically as needed).
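
The aliasing described in this TODO could be sketched with `__init_subclass__`, assuming Python 3.6+ and a hypothetical `MiniStep` base class rather than the real Step/Generic. `AddOne` and `Negate` are illustrative subclasses operating on plain lists.

```python
class MiniStep:
    """Hypothetical stand-in for Step; not the elm implementation."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        has_transform = 'transform' in vars(cls)
        has_fit_transform = 'fit_transform' in vars(cls)
        # Alias only when exactly one of the two is defined on this class.
        if has_transform and not has_fit_transform:
            cls.fit_transform = cls.transform
        elif has_fit_transform and not has_transform:
            cls.transform = cls.fit_transform

    def transform(self, X, y=None, **kw):
        raise NotImplementedError

    def fit_transform(self, X, y=None, **kw):
        raise NotImplementedError


class AddOne(MiniStep):
    # Defines only transform; fit_transform is aliased to it.
    def transform(self, X, y=None, **kw):
        return [x + 1 for x in X]


class Negate(MiniStep):
    # Defines only fit_transform; transform is aliased to it.
    def fit_transform(self, X, y=None, **kw):
        return [-x for x in X]


added = AddOne().fit_transform([1, 2])
negated = Negate().transform([1, 2])
```

A subclass that defines both methods, or neither, is left untouched, so explicit implementations always take precedence over the alias.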
