Possible addition of diffusion kernel over discrete/categorical input spaces #1411
-
This seems like a good kernel to have! I'd be open to a PR. Make sure to add unit tests (see https://github.com/cornellius-gp/gpytorch/blob/master/test/kernels/test_rq_kernel.py for an example).
-
Agreed that this would be a great kernel to have! It looks like this implementation works directly with the graph Laplacian, which has cubic complexity w.r.t. the number of vertices in the graph. It would be great to extend this (maybe at a later point) to work with the graph Cartesian product and exploit the resulting Kronecker structure, as is done in COMBO. This is particularly interesting for Bayesian optimization with categorical variables. It would also make a lot of sense to support a horseshoe prior on the inverse lengthscale.
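To make the Kronecker point concrete, here is a small NumPy sketch. It is not taken from COMBO or from the proposed implementation. It assumes each categorical variable is modeled as a complete graph (every category adjacent to every other), and the helper names `complete_graph_laplacian` and `diffusion_kernel` are made up for illustration. Under that assumption, the diffusion kernel of the Cartesian product graph should equal the Kronecker product of the per-variable kernels, so only the small per-variable Laplacians ever need to be decomposed.

```python
import numpy as np


def complete_graph_laplacian(num_categories):
    """Laplacian L = D - A of the complete graph on `num_categories` vertices."""
    return num_categories * np.eye(num_categories) - np.ones(
        (num_categories, num_categories)
    )


def diffusion_kernel(laplacian, beta):
    """exp(-beta * L) via a symmetric eigendecomposition (cubic in the vertex count)."""
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    # Scale each eigenvector column by exp(-beta * eigenvalue), then recombine.
    return (eigvecs * np.exp(-beta * eigvals)) @ eigvecs.T


# Two categorical variables with 3 and 4 categories and separate (ARD) rates.
L1 = complete_graph_laplacian(3)
L2 = complete_graph_laplacian(4)
beta1, beta2 = 0.7, 0.3

# Laplacian of the Cartesian product graph, with the ARD rates folded in
# (a weighted Kronecker sum). Decomposing it directly costs O((3 * 4)^3).
L_prod = beta1 * np.kron(L1, np.eye(4)) + beta2 * np.kron(np.eye(3), L2)
K_full = diffusion_kernel(L_prod, 1.0)

# The same kernel from two small decompositions, O(3^3 + 4^3).
K_kron = np.kron(diffusion_kernel(L1, beta1), diffusion_kernel(L2, beta2))

kronecker_matches = np.allclose(K_full, K_kron)
print(kronecker_matches)
```

The identity holds because the two terms of the Kronecker sum commute, so the exponential of the sum factorizes into a Kronecker product of exponentials.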
-
Hi Folks, I am thinking of it as "the categorical features only contribute to the distance between data points". I didn't find any. Any hints about where to look or read are highly appreciated ;-) Regards,
-
Did this get anywhere? I'm trying to understand, more generally, the best practices for mixed features (float and categorical/int). I haven't found examples yet. I mean situations with high-arity categoricals, where naive one-hot encoding is not a good approach.
-
Hi, I am Aryan. I am a PhD student at Washington State University. I like GPyTorch a lot and regularly use it in my own research on combinatorial Bayesian optimization. Thanks for the great library! I think a diffusion kernel for discrete/categorical input spaces would be a nice addition to the library. It is a very useful kernel (an extension of the RBF kernel to discrete spaces) for Bayesian optimization over discrete/combinatorial spaces, and potentially a good fit for BoTorch.
I am providing my implementation of the diffusion kernel in GPyTorch format below, hoping it will be useful. It is a batch-compatible implementation of the ARD version of the kernel that supports an arbitrary number of categories in each dimension.
Thanks and Happy new year!
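For readers who want something concrete, here is a minimal NumPy sketch of an ARD diffusion kernel over categorical inputs. It is not the code from the post above. It assumes each dimension is a complete graph over its categories, in which case the normalized diffusion kernel has a closed form per dimension: 1 when the categories match, and (1 - exp(-beta_i * c_i)) / (1 + (c_i - 1) * exp(-beta_i * c_i)) when they differ, with c_i the number of categories and beta_i the rate for dimension i. The function name `categorical_diffusion_kernel` and its arguments are hypothetical. Porting it into the `forward` method of a `gpytorch.kernels.Kernel` subclass would mean swapping the NumPy calls for their PyTorch equivalents.

```python
import numpy as np


def categorical_diffusion_kernel(x1, x2, num_categories, beta):
    """Normalized ARD diffusion kernel, assuming a complete graph per dimension.

    x1: (n, d) integer array of category indices.
    x2: (m, d) integer array of category indices.
    num_categories: length-d sequence, number of categories per dimension.
    beta: length-d sequence of positive diffusion rates (one per dimension).
    Returns an (n, m) kernel matrix with k(x, x) = 1.
    """
    x1 = np.asarray(x1)
    x2 = np.asarray(x2)
    c = np.asarray(num_categories, dtype=float)
    beta = np.asarray(beta, dtype=float)

    # Per-dimension kernel value when the two categories differ.
    decay = np.exp(-beta * c)
    off_diag = (1.0 - decay) / (1.0 + (c - 1.0) * decay)

    # (n, m, d) mask of dimensions where the two points disagree.
    differs = x1[:, None, :] != x2[None, :, :]

    # Product over dimensions: 1 where they agree, off_diag where they differ.
    return np.where(differs, off_diag, 1.0).prod(axis=-1)


# Three points with two categorical features (3 and 4 categories).
X = np.array([[0, 1], [2, 1], [0, 3]])
K = categorical_diffusion_kernel(X, X, num_categories=[3, 4], beta=[0.7, 0.3])
print(K)
```

Because the kernel is a product of per-dimension terms, the cost is linear in the number of dimensions and never touches the Laplacian of the full product graph. A larger beta_i makes differing categories in dimension i look more similar, which is why the rates play the role of inverse lengthscales.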