Questions about How-To Material #13
Replies: 3 comments
-
Hello Daniel, thanks for your great questions!
The choice of circuit family for the quantum kernel can have a significant impact on the machine learning model. For these kernel-based methods, we know that a necessary, but not always sufficient, condition for any computational advantage is that the quantum kernel be hard to estimate classically. The quantum kernel built from the ZZFeatureMap in the Qiskit library is conjectured to be one such example. (More recently, a quantum kernel and learning problem based on the hardness of the discrete logarithm problem was shown to offer a provable quantum advantage.) It is known that, in principle, kernels that incorporate knowledge about the target problem can outperform problem-agnostic kernels. The discrete logarithm kernel mentioned above is an extreme example of this. In practice, it is not always clear how to encode problem-specific information into the quantum feature map; this is an ongoing research challenge. By introducing trainable parameters, we can explore broad families of quantum feature maps in search of ones that capture useful structure in our target problem.
Think of a parameterized quantum kernel as analogous to a classical neural network. The neural network has inputs and trainable parameters (weights), and this parameterized model represents a family of possible functions. Once you assign specific values to all the parameters of your model, it represents one specific function on the inputs, rather than a family of possible functions. We invoke precisely the same idea when we discuss binding parameters to a quantum circuit. Binding values to each user parameter in our quantum feature map is the analogue of specifying the weights of our neural network: it selects one particular quantum feature map from a parameterized family of feature maps. As our classical optimizer searches over this family, it repeatedly binds new values to the user parameters of the trainable feature map. This yields specific feature maps drawn from the family, and we can evaluate their performance so that the optimizer can learn progressively better user parameters for the given problem.
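To make the analogy concrete, here is a minimal sketch of a trainable feature map built with core Qiskit. The specific circuit (one layer of RY rotations composed with a ZZFeatureMap) is illustrative, not the exact construction from the how-to guide:

```python
from qiskit.circuit import QuantumCircuit, ParameterVector
from qiskit.circuit.library import ZZFeatureMap

# Trainable layer: the user parameters play the role of
# neural-network weights.
user_params = ParameterVector("θ", 2)
trainable_layer = QuantumCircuit(2)
trainable_layer.ry(user_params[0], 0)
trainable_layer.ry(user_params[1], 1)

# Data-encoding layer: its parameters are the inputs (data features).
encoding_layer = ZZFeatureMap(feature_dimension=2)

# The full trainable feature map represents a parameterized family
# of feature maps.
feature_map = trainable_layer.compose(encoding_layer)

# Binding values to the user parameters selects one specific feature
# map from the family -- the analogue of fixing the network's weights.
# The data parameters remain free, to be bound at evaluation time.
bound_map = feature_map.assign_parameters(
    {user_params[0]: 0.1, user_params[1]: 0.2}
)
```

During training, the classical optimizer simply repeats this binding step with new candidate values at every iteration.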
The broad goal of kernel training is to find a quantum kernel that captures some underlying structure of our data. Phrased differently: we are less concerned with a quantum kernel's ability to precisely learn the labels of a specific training dataset, and are primarily interested in its ability to generalize, i.e., to infer the labels of unseen data. For this reason, it is natural to optimize a trainable quantum kernel according to how well it captures the structure of your data, rather than optimizing it simply for training accuracy.

The SVCLoss (or 'svc_loss'), for example, returns the value of the SVC's optimized objective function (which has been trained using a specific quantum kernel on a specific dataset). This quantity is explicitly related to bounds on the generalization error of your underlying model, and it is therefore a useful tool when seeking to build a quantum kernel that performs well on unseen data. The study of generalization bounds for quantum and classical models is an ongoing area of research, and these bounds are often known to be quite loose. The important takeaway is that optimizing the SVCLoss in place of training accuracy is a step in the right direction; however, we may yet find loss functions that lead to even stronger generalization, and dataset-specific loss functions may perform better still. For this reason, we encourage users to define their own kernel loss functions, and we hope this will facilitate research on quantum kernel training strategies.

It is also worth noting that we currently focus on using quantum kernels for binary classification; however, kernel methods extend to other problem areas such as regression, clustering, and more. We have designed the interface to accommodate users who are interested in optimizing kernels for these tasks and would like to design suitable loss functions. I hope these answers make things a bit clearer!
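As a rough sketch of both workflows, assuming the TrainableFidelityQuantumKernel, QuantumKernelTrainer, SVCLoss, and KernelLoss interfaces from qiskit-machine-learning (import paths vary slightly across versions); the KernelAlignmentLoss class at the end is a hypothetical example of a user-defined loss, not part of the library:

```python
import numpy as np
from qiskit.circuit import QuantumCircuit, ParameterVector
from qiskit.circuit.library import ZZFeatureMap
from qiskit_machine_learning.kernels import TrainableFidelityQuantumKernel
from qiskit_machine_learning.kernels.algorithms import QuantumKernelTrainer
from qiskit_machine_learning.utils.loss_functions import SVCLoss, KernelLoss

# Trainable feature map: one rotation layer of user parameters
# followed by a data-encoding layer (same pattern as the sketch above).
user_params = ParameterVector("θ", 2)
layer = QuantumCircuit(2)
layer.ry(user_params[0], 0)
layer.ry(user_params[1], 1)
feature_map = layer.compose(ZZFeatureMap(feature_dimension=2))

kernel = TrainableFidelityQuantumKernel(
    feature_map=feature_map, training_parameters=user_params
)

# Optimize the kernel against the SVC objective rather than raw
# training accuracy; passing the string "svc_loss" is shorthand
# for SVCLoss().
trainer = QuantumKernelTrainer(
    quantum_kernel=kernel, loss=SVCLoss(C=1.0), initial_point=[0.1, 0.1]
)

# Toy binary-classification data, for illustration only.
X_train = np.array([[0.1, 0.2], [0.4, 0.3], [0.8, 0.9], [0.7, 0.6]])
y_train = np.array([1, 1, -1, -1])
result = trainer.fit(X_train, y_train)
trained_kernel = result.quantum_kernel  # ready to plug into an SVC


# Hypothetical user-defined loss: negative kernel-target alignment.
# Any KernelLoss subclass can be passed to QuantumKernelTrainer
# via its `loss` argument.
class KernelAlignmentLoss(KernelLoss):
    def evaluate(self, parameter_values, quantum_kernel, data, labels):
        quantum_kernel.assign_training_parameters(parameter_values)
        kmatrix = quantum_kernel.evaluate(data)
        y = np.asarray(labels, dtype=float).reshape(-1, 1)
        target = y @ y.T  # ideal kernel induced by the labels
        alignment = np.sum(kmatrix * target) / (
            np.linalg.norm(kmatrix) * np.linalg.norm(target)
        )
        return -alignment  # minimizing this maximizes alignment
```

Kernel-target alignment is one well-studied choice for this kind of custom loss, but the same pattern accommodates losses tailored to regression, clustering, or a specific dataset.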
-
Bryce,
Thank you for your reply, and my apologies for the delay; I am still digesting it. I will likely have some more questions once I get a chance to experiment with what you sent over on live data. I have already tried the QKT methods on multiple datasets, created custom kernels for those datasets, and found good results.
Daniel Beaulieu
Specialist Master
-
I'm glad to hear it!