Classical optimizers: PSO and BO

D. Zhu, N. M. Linke, M. Benedetti, K. A. Landsman, N. H. Nguyen, C. H. Alderete, A. Perdomo-Ortiz, N. Korda, A. Garfoot, C. Brecque, L. Egan, O. Perdomo, C. Monroe

This protocol is extracted from research article:

Training of quantum circuits on a hybrid quantum computer

**Sci Adv**, Oct 18, 2019; DOI: 10.1126/sciadv.aaw9918


Procedure

We explored two different classical optimizers in this study: PSO and BO.

PSO is a gradient-free optimization method inspired by the social behavior of some animals. Each particle represents a candidate solution and moves within the solution space according to its current performance and the performance of the swarm. Three hyperparameters control the dynamics of the swarm: a cognition coefficient *c*_{1}, a social coefficient *c*_{2}, and an inertia coefficient *w* (*24*).

Concretely, each particle consists of a position vector θ_{i} and a velocity vector *v*_{i}. At iteration *t*, the velocity of particle *i* along dimension *d* is updated as

$${v}_{i,d}^{(t+1)} = w\,{v}_{i,d}^{(t)} + {c}_{1}{r}_{1}\left({p}_{i,d}^{(t)}-{\mathrm{\theta}}_{i,d}^{(t)}\right) + {c}_{2}{r}_{2}\left({g}_{i,d}^{(t)}-{\mathrm{\theta}}_{i,d}^{(t)}\right)$$ (1)

where ${p}_{i}^{(t)}$ is the best position found so far by particle *i*, ${g}_{i}^{(t)}$ is the best position found by the swarm, and *r*_{1}, *r*_{2} are random numbers drawn uniformly from [0, 1]. The position is then updated as

$${\mathrm{\theta}}_{i,d}^{(t+1)} = {\mathrm{\theta}}_{i,d}^{(t)} + {v}_{i,d}^{(t+1)}$$ (2)

In our problem, each particle corresponds to a point in parameter space of the quantum circuit. For example, in the fully connected circuit with two layers, each particle consists of an instance of the 14 parameters. Recall, however, that parameters are angles and therefore periodic; we customized the PSO updates above to use this information. In Eq. 1, ${p}_{i,d}^{(t)}$ and ${\mathrm{\theta}}_{i,d}^{(t)}$ can be thought of as two points on a circle. Instead of using the standard displacement ${p}_{i,d}^{(t)}-{\mathrm{\theta}}_{i,d}^{(t)}$, we used the angular displacement, that is, the signed length of the minor arc on the unit circle. We used the same definition of displacement for the swarm’s best position ${g}_{i,d}^{(t)}$. Last, in Eq. 2, we made sure to express angles always using their principal values.
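The angular-displacement trick can be sketched as follows; `angular_displacement` and `principal_value` are hypothetical helper names (a minimal sketch assuming angles in radians), not the authors' implementation:

```python
import numpy as np

def angular_displacement(target, theta):
    """Signed length of the minor arc from theta to target on the unit circle.

    Used in place of the plain difference (target - theta) in Eq. 1, so that
    moving from 350 deg to 10 deg counts as a +20 deg step, not a -340 deg one.
    """
    return (target - theta + np.pi) % (2 * np.pi) - np.pi

def principal_value(theta):
    """Express angles by their principal values in [-pi, pi), as in Eq. 2."""
    return (theta + np.pi) % (2 * np.pi) - np.pi
```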

In our experiments, we set the number of particles to twice the number of parameters of the circuit. Position and velocity vectors of each particle were initialized from the uniform distribution. For the coefficients, we used *c*_{1} = *c*_{2} = 1 and *w* = 0.5.
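Putting these choices together, a minimal periodic-PSO loop might look like the sketch below. The iteration count, toy cost function, and helper names are illustrative assumptions, not the authors' code; the swarm size, initialization, and coefficients follow the settings above.

```python
import numpy as np

rng = np.random.default_rng(0)

def wrap(theta):
    """Map angles to principal values in [-pi, pi)."""
    return (theta + np.pi) % (2 * np.pi) - np.pi

def arc(a, b):
    """Signed minor-arc displacement from b to a (angular analogue of a - b)."""
    return wrap(a - b)

def pso_minimize(cost, n_params, iters=100, c1=1.0, c2=1.0, w=0.5):
    # Twice as many particles as circuit parameters, as in the protocol.
    n_particles = 2 * n_params
    theta = rng.uniform(-np.pi, np.pi, (n_particles, n_params))  # positions
    vel = rng.uniform(-np.pi, np.pi, (n_particles, n_params))    # velocities
    pbest = theta.copy()                               # per-particle bests
    pcost = np.array([cost(x) for x in theta])
    gbest = pbest[pcost.argmin()].copy()               # swarm best
    for _ in range(iters):
        r1 = rng.uniform(size=theta.shape)
        r2 = rng.uniform(size=theta.shape)
        # Velocity update (Eq. 1) with angular displacements.
        vel = w * vel + c1 * r1 * arc(pbest, theta) + c2 * r2 * arc(gbest, theta)
        # Position update (Eq. 2), kept at principal values.
        theta = wrap(theta + vel)
        c = np.array([cost(x) for x in theta])
        better = c < pcost
        pbest[better], pcost[better] = theta[better], c[better]
        if pcost.min() < cost(gbest):
            gbest = pbest[pcost.argmin()].copy()
    return gbest
```

For instance, minimizing a periodic toy cost such as `lambda th: np.sum(1 - np.cos(th - 0.5))` drives the swarm toward θ = 0.5 in every dimension.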

BO is a powerful global optimization paradigm. It is best suited to finding optima of multimodal objective functions that are expensive to evaluate. There are two main features that characterize the BO process: the surrogate model and an acquisition function.

The surrogate model is a nonparametric model of the objective function. At each iteration, the surrogate model is updated using the sampled points in parameter space. The package used in this study is OPTaaS by Mind Foundry. It implements the surrogate model as Gaussian process regression (*36*). The Gaussian process is characterized by a kernel (or correlation function); we used a Matérn 5/2 kernel, as it provides the most flexibility.
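OPTaaS itself is proprietary, but the same kind of surrogate can be sketched with scikit-learn's Gaussian process regressor and a Matérn 5/2 kernel; the toy objective below is a hypothetical stand-in for the measured circuit cost:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    """Hypothetical stand-in for the (expensive) measured circuit cost."""
    return np.sin(3 * x) + 0.5 * x

# Points in parameter space sampled so far.
X = np.linspace(0.0, 2.0, 8).reshape(-1, 1)
y = objective(X).ravel()

# Matern 5/2 kernel corresponds to nu = 2.5 in scikit-learn's parameterization.
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
gp.fit(X, y)

# Posterior mean and standard deviation at an unsampled point.
mu, sigma = gp.predict(np.array([[1.1]]), return_std=True)
```

The posterior standard deviation is what the acquisition function uses to quantify uncertainty at unexplored points.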

The acquisition function is computed from the surrogate model. It is used to select points for evaluation during the optimization, trading off exploration against exploitation. The acquisition function at a point takes a high value if, according to the surrogate model, the cost function is expected to improve notably over historically sampled points or if the uncertainty at that point is high. A simple and well-known acquisition function, Expected Improvement (*37*), is used here.
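For a minimization problem, Expected Improvement at a candidate point has a closed form in terms of the surrogate's posterior mean and standard deviation; this is a generic textbook sketch, not OPTaaS's internal code:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best):
    """EI for minimization: E[max(best - f(x), 0)] with f(x) ~ N(mu, sigma^2).

    Large when the predicted mean beats the best cost seen so far
    (exploitation) or when the predictive uncertainty is high (exploration).
    """
    sigma = np.maximum(sigma, 1e-12)  # guard against zero predictive variance
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
```

The next point to evaluate on the quantum device is then the maximizer of this function over the parameter space.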

In our case, OPTaaS also leverages the cyclic symmetry of the angles by embedding the parameter space into a metric space with the appropriate topology, effectively allowing the Gaussian process surrogate model to be placed over a hypertorus rather than a hypercube. This greatly alleviates the so-called curse of dimensionality (*38*) and allows for much more efficient use of samples of the objective function.
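One standard way to realize this topology (the details of OPTaaS's embedding are not public, so this construction is an assumption) is to measure distances between parameter vectors along the torus, taking the minor arc in each angular dimension:

```python
import numpy as np

def torus_distance(a, b):
    """Geodesic distance on the hypertorus between two angle vectors.

    Each coordinate contributes the length of the minor arc between the
    two angles, so 0.1 and 2*pi - 0.1 are close, unlike on a hypercube.
    """
    d = np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)) % (2 * np.pi)
    d = np.minimum(d, 2 * np.pi - d)
    return float(np.linalg.norm(d))
```

A stationary kernel such as the Matérn can then be evaluated on this distance instead of the Euclidean one, so points that differ by full rotations are treated as identical.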

Adequately optimizing the acquisition function at each iteration is key in BO. OPTaaS devotes considerable computational resources to this nonconvex optimization problem.

There are two major reasons why BO outperforms PSO in our specific case. First, PSO spends a significant amount of computational resources exploring trajectories far from the optimum, whereas BO mitigates this through the acquisition function. Second, maintaining the surrogate model makes much better use of the information from the historical exploration of the parameter space.

