Calculation of the probability of achievement

Kyohei Hanaoka


Concise Method


This method is extracted from research article: iScience, Jun 2021

Bayesian optimization for goal-oriented multi-objective inverse material design

DOI: 10.1016/j.isci.2021.102781


The probability of achievement (PA) was optimized after a logarithmic transformation. Given a design parameter X, the log of the PA for M objective properties is obtained as follows:

where g_m, μ_m and σ_m are the predefined goal, predicted mean and predicted standard deviation for the m-th objective property, respectively, and Φ is the cumulative distribution function of the standard normal distribution. Note that the Nakazawa method, a classical experiment navigation method for robust product design with noisy measurements, also uses a similar scoring function based on the joint probability of goal achievement (Inage, 2019). Bayesian optimization with the PA can therefore be regarded as a machine learning-based sequential implementation of this classical design method.
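As an illustration of the definitions above, the log PA can be sketched in a few lines of Python. The sketch assumes that the objectives are treated as independent and that each goal is an upper bound (the goal counts as achieved when the property is at or below g_m), which gives log PA(X) = sum over m of log Φ((g_m - μ_m(X)) / σ_m(X)). For a lower-bound goal the argument of Φ would change sign. The function names and the numerical values are hypothetical.

```python
import math


def log_norm_cdf(z):
    """log of the standard normal CDF, computed via the complementary error function."""
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))


def log_pa(goals, means, stds):
    """Log probability of achievement for M objectives.

    Assumption: independent objectives, and a goal is achieved when y_m <= g_m.
    """
    return sum(
        log_norm_cdf((g - mu) / sigma)
        for g, mu, sigma in zip(goals, means, stds)
    )


# Hypothetical predictions for two objective properties at one design point X.
goals = [1.0, 0.5]   # g_m
means = [1.0, 0.0]   # predicted mean mu_m(X)
stds = [0.2, 0.5]    # predicted standard deviation sigma_m(X)

print(log_pa(goals, means, stds))
```

Working with the log turns the product of the per-objective probabilities into a sum. This simple version raises an error when a goal is so far from the prediction that Φ underflows to zero; a dedicated log-CDF routine would be more robust in that regime.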

Given the predefined goal g_m for each objective property y_m, the achievement function for M objectives is defined as follows:

where w_m and ρ are predefined parameters. The weights w_m are needed because the objective properties differ in scale. Following previous studies in the operations research field (Hakanen and Knowles, 2017), ρ was set to 0.05 and each w_m was calculated as the reciprocal of the difference between the maximum and minimum values of the m-th objective among the Pareto optimal designs within the design parameters explored.
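A minimal sketch of such an achievement function is given here. It assumes the commonly used augmented achievement scalarizing form, s(y) = max over m of w_m (y_m - g_m) + ρ × sum over m of w_m (y_m - g_m), with all objectives minimized. This form is an assumption consistent with the parameters w_m and ρ described above, not a transcription of the original equation. The function names and data are hypothetical.

```python
def range_weights(pareto_objectives):
    """w_m = 1 / (max - min) of objective m over a set of Pareto optimal points."""
    columns = list(zip(*pareto_objectives))
    return [1.0 / (max(col) - min(col)) for col in columns]


def achievement(y, goals, weights, rho=0.05):
    """Augmented achievement scalarizing function (assumed form, minimization)."""
    terms = [w * (yi - g) for yi, g, w in zip(y, goals, weights)]
    return max(terms) + rho * sum(terms)


# Hypothetical Pareto optimal objective vectors (two objectives).
pareto = [(0.0, 10.0), (1.0, 5.0), (2.0, 0.0)]
weights = range_weights(pareto)

# Score one observed objective vector against a goal.
score = achievement(y=(1.5, 4.0), goals=(1.0, 5.0), weights=weights)
print(weights, score)
```

Lower scores are better under this convention: the max term penalizes the objective that is furthest from its goal, and the small ρ term breaks ties between points with the same worst-case deviation.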

Bayesian optimization with the lower confidence bound (LCB) acquisition function, as implemented in GPyOpt, was used to optimize the achievement function. In GPyOpt, the LCB acquisition is implemented as Equation 4, and the default value of the parameter a (a = 2) was used.
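To make the role of the parameter a concrete, the following sketch uses the textbook LCB form, μ(x) - a σ(x), to rank a few candidate designs. This form is assumed for illustration; the sign convention used internally by a given library may differ. The candidate names and predictions are hypothetical.

```python
def lcb(mean, std, a=2.0):
    """Lower confidence bound: predicted mean minus a times predicted std."""
    return mean - a * std


# Hypothetical surrogate predictions (mean, std) of the achievement function.
candidates = {
    "x1": (0.30, 0.05),
    "x2": (0.40, 0.20),
    "x3": (0.25, 0.01),
}

scores = {name: lcb(m, s) for name, (m, s) in candidates.items()}
best = min(scores, key=scores.get)  # the smallest LCB is the most promising point
print(scores, best)
```

With a = 2, a candidate with a worse predicted mean but a large uncertainty can still be preferred, which is how the acquisition trades exploitation against exploration.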

Acquisition function optimizations were performed using the default protocol implemented in GPyOpt. In this protocol, the putative global minimum (maximum for the PA) of the acquisition function is searched for by 1000 initial random evaluations, followed by local optimization of the top five candidates using the quasi-Newton method L-BFGS-B, as implemented in SciPy (Virtanen et al., 2020). Finally, the design parameter with the lowest value found is selected for the next experiment.
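The two-stage search described above can be sketched with NumPy and SciPy. This is an illustrative re-implementation of the idea (random sampling, then L-BFGS-B refinement from the best samples), not GPyOpt's own code. The function `optimize_acquisition` and the toy acquisition function are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def optimize_acquisition(acq, bounds, n_random=1000, n_starts=5, seed=0):
    """Random search followed by L-BFGS-B refinement of the best samples."""
    rng = np.random.default_rng(seed)
    lower = np.array([b[0] for b in bounds], dtype=float)
    upper = np.array([b[1] for b in bounds], dtype=float)

    # Stage 1: evaluate the acquisition at uniformly random points in the box.
    samples = rng.uniform(lower, upper, size=(n_random, len(bounds)))
    values = np.array([acq(x) for x in samples])

    # Stage 2: start a bounded quasi-Newton search from the n_starts best samples.
    starts = samples[np.argsort(values)[:n_starts]]
    results = [
        minimize(acq, x0, method="L-BFGS-B", bounds=bounds) for x0 in starts
    ]

    # Keep the refined point with the lowest acquisition value.
    best = min(results, key=lambda r: r.fun)
    return best.x, float(best.fun)


def toy_acquisition(x):
    """Hypothetical smooth acquisition with its minimum at (0.3, -0.5)."""
    return float((x[0] - 0.3) ** 2 + (x[1] + 0.5) ** 2)


x_next, value = optimize_acquisition(toy_acquisition, [(-1.0, 1.0), (-1.0, 1.0)])
print(x_next, value)
```

For the PA, which is maximized, the same routine can be applied to the negated acquisition.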

Multi-objective optimization methods that efficiently and thoroughly find Pareto optimal solutions have been well studied in the field of operations research. Among them, the Non-dominated Sorting Genetic Algorithm II (NSGA-II) (Deb et al., 2002) is one of the standard methods for optimization problems with a few objectives. To obtain the Pareto optimal solutions, NSGA-II as implemented in the Platypus library was used with a sufficient number of optimization steps. The number of Pareto optimal solutions obtained was set to 1000 for the six mathematical benchmark problems and to 10000 for the virtual inverse material design problem, in order to accurately evaluate the experimental cost of finding Pareto optimal solutions.

For the virtual material design experiment, regression models constructed from experimental data were used as a substitute for time-consuming real-world experiments. The regression models can be obtained from the work of Wang et al. (Wang et al., 2020a; https://pubs.acs.org/doi/abs/10.1021/acsami.0c11667).

For the virtual material design experiment, the average number of experiments required to find a first Pareto optimal solution was evaluated (Figure 9C). A set of objective values is judged Pareto optimal when it is not Pareto-dominated by any of the 10000 Pareto optimal solutions obtained by NSGA-II. Because finding exact Pareto optimal solutions is unnecessarily difficult, a small value δ was added to each objective value obtained by the Bayesian optimization before the judgment. For each objective, δ was calculated as the difference between the maximum and minimum values of that objective among the true Pareto optimal solutions, multiplied by 0.005.
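The judgment with the tolerance δ can be sketched as follows. The sketch assumes that every objective is maximized, so that adding δ relaxes the test; for minimized objectives the sign of δ and the direction of the dominance comparison would flip. The function names and the small reference front are hypothetical stand-ins for the 10000 NSGA-II solutions.

```python
def dominates(a, b):
    """True if a Pareto-dominates b when every objective is maximized."""
    no_worse = all(ai >= bi for ai, bi in zip(a, b))
    strictly_better = any(ai > bi for ai, bi in zip(a, b))
    return no_worse and strictly_better


def tolerances(front, fraction=0.005):
    """delta_m = fraction * (max - min) of objective m over the reference front."""
    columns = list(zip(*front))
    return [fraction * (max(col) - min(col)) for col in columns]


def is_pareto_optimal(candidate, front, fraction=0.005):
    """Judge a candidate after adding delta to each of its objective values."""
    delta = tolerances(front, fraction)
    relaxed = [c + d for c, d in zip(candidate, delta)]
    return not any(dominates(p, relaxed) for p in front)


# Hypothetical reference front with two objectives.
front = [(0.0, 10.0), (5.0, 5.0), (10.0, 0.0)]

print(is_pareto_optimal((4.98, 4.98), front))  # within the tolerance of (5, 5)
print(is_pareto_optimal((4.0, 4.0), front))    # clearly dominated by (5, 5)
```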

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
