The probability of achievement (PA) was optimized after a logarithmic transformation. Given a design parameter X, the log of the PA for M objective properties is obtained as follows:
where gm, μm, and σm are the predefined goal, the predicted mean, and the predicted standard deviation for the m-th objective property, respectively, and Φ is the cumulative distribution function of the standard normal distribution. Note that the Nakazawa method, a classical experiment navigation method for robust product design with noisy measurements, also uses a similar scoring function based on the joint probability of goal achievement (Inage, 2019). Bayesian optimization with the PA can therefore be regarded as a machine learning-based sequential implementation of this classical design method.
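The log-transformed PA can be illustrated with a minimal Python sketch. It assumes that each goal is an upper bound (the m-th objective should satisfy ym ≤ gm) and that the objectives are treated as independent, so the joint probability is a product of per-objective terms and its log is a sum. The function name `log_pa` and its arguments are hypothetical and are not taken from the original implementation.

```python
import numpy as np
from scipy.stats import norm


def log_pa(goals, means, stds):
    """Log probability that every objective meets its goal.

    Assumption: goal m is achieved when y_m <= g_m, and the M objectives
    are independent, so log PA = sum_m log Phi((g_m - mu_m) / sigma_m).
    """
    goals = np.asarray(goals, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    z = (goals - means) / stds
    # logcdf is numerically safer than log(cdf) when the probability is tiny.
    return float(np.sum(norm.logcdf(z)))
```

If a goal is instead a lower bound (ym ≥ gm), the sign of the corresponding z term would be flipped.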
Given the predefined goal gm for each objective property ym, the achievement function for M objectives is defined as follows:
where wm and ρ are predefined parameters. The weights wm are needed because the objective properties differ in scale. Following previous studies in the operations research field (Hakanen and Knowles, 2017), ρ was set to 0.05, and wm was calculated as the reciprocal of the difference between the maximum and minimum values of the m-th objective among the Pareto optimal designs within the explored design parameters.
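As an illustration, the sketch below assumes the augmented achievement scalarizing function commonly used in operations research, max_m wm(ym − gm) + ρ Σ_m wm(ym − gm), with all objectives minimized. Whether this matches the exact form used in the study is an assumption. The helper names `range_weights` and `achievement` are hypothetical.

```python
import numpy as np


def range_weights(pareto_objectives):
    """w_m = 1 / (max - min) of objective m over a set of Pareto optimal designs."""
    front = np.asarray(pareto_objectives, dtype=float)
    return 1.0 / (front.max(axis=0) - front.min(axis=0))


def achievement(y, goals, weights, rho=0.05):
    """Assumed augmented achievement scalarizing function (lower is better)."""
    d = np.asarray(weights, dtype=float) * (
        np.asarray(y, dtype=float) - np.asarray(goals, dtype=float)
    )
    # The max term targets the worst weighted shortfall; the rho term breaks ties.
    return float(np.max(d) + rho * np.sum(d))
```

The weights put objectives with different units on a comparable scale, so that no single objective dominates the max term.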
Bayesian optimization with the lower confidence bound (LCB) acquisition function, as implemented in GPyOpt, was used to optimize the achievement function. In GPyOpt, the LCB acquisition is implemented as Equation 4, and the default value of the parameter a (a = 2) was used.
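A minimal sketch of the LCB score, assuming the standard form μ(X) − a·σ(X) for a quantity to be minimized, where μ and σ are the predicted mean and standard deviation of the surrogate model. The function `lcb` is a hypothetical helper written with NumPy only and does not call GPyOpt.

```python
import numpy as np


def lcb(mean, std, a=2.0):
    """Lower confidence bound; smaller values are more promising."""
    return np.asarray(mean, dtype=float) - a * np.asarray(std, dtype=float)


# Illustrative values: the second candidate has a worse mean but a larger
# uncertainty, so its LCB is lower and it would be preferred.
scores = lcb([1.0, 1.2], [0.1, 0.5])
next_index = int(np.argmin(scores))
```

A larger a favors exploration of uncertain regions, while a smaller a favors exploitation of the current best predictions.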
The acquisition functions were optimized using the default protocol implemented in GPyOpt. In this protocol, the putative global minimum (maximum for the PA) of the acquisition function is searched for by evaluating 1000 random initial points and then refining the five best of them with the quasi-Newton method L-BFGS-B, as implemented in SciPy (Virtanen et al., 2020). Finally, the design parameter with the lowest value found is selected for the next experiment.
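The two-stage search described above can be sketched as follows. This is not GPyOpt's internal code. It is a simplified stand-in that assumes uniform random sampling inside box bounds and uses `scipy.optimize.minimize` with `method="L-BFGS-B"` for the local refinement. The function name `minimize_acquisition` and its arguments are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def minimize_acquisition(acq, bounds, n_random=1000, n_starts=5, seed=0):
    """Random search followed by L-BFGS-B refinement of the best candidates."""
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds], dtype=float)
    hi = np.array([b[1] for b in bounds], dtype=float)

    # Stage 1: evaluate the acquisition function at random points.
    candidates = rng.uniform(lo, hi, size=(n_random, len(lo)))
    values = np.array([acq(x) for x in candidates])
    starts = candidates[np.argsort(values)[:n_starts]]

    # Stage 2: refine the best candidates and keep the lowest value found.
    best_x, best_val = None, np.inf
    for x0 in starts:
        res = minimize(acq, x0, method="L-BFGS-B", bounds=list(zip(lo, hi)))
        if res.fun < best_val:
            best_x, best_val = res.x, float(res.fun)
    return best_x, best_val
```

To maximize a quantity such as the log PA, the negated function can be passed as `acq`.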
Multi-objective optimization methods that efficiently and thoroughly find Pareto optimal solutions have been well studied in the field of operations research. Among them, the Non-dominated Sorting Genetic Algorithm II (NSGA-II) (Deb et al., 2002) is one of the standard methods for optimization problems with a few objectives. To obtain the Pareto optimal solutions, NSGA-II as implemented in the Platypus library was run with a sufficient number of optimization steps. The number of Pareto optimal solutions obtained was set to 1000 for each of the six mathematical benchmark problems and to 10000 for the virtual inverse material design problem, so that the experimental cost of finding Pareto optimal solutions could be evaluated accurately.
For the virtual material design experiment, regression models constructed from experimental data were used as a substitute for time-consuming real-world experiments. The regression models are available from the work of Wang et al. (Wang et al., 2020a; https://pubs.acs.org/doi/abs/10.1021/acsami.0c11667).
For the virtual material design experiment, the average number of experiments needed to find the first Pareto optimal solution was evaluated (Figure 9C). A set of objectives is judged to be Pareto optimal when it is not Pareto-dominated by any of the 10000 Pareto optimal solutions obtained by NSGA-II. Because finding exact Pareto optimal solutions is unnecessarily difficult, a small value δ was added to each objective value obtained by Bayesian optimization before the judgment. For each objective, δ was calculated as the difference between the maximum and minimum values of that objective among the true Pareto optimal solutions, multiplied by 0.005.
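The judgment with the tolerance δ can be sketched as below. The sketch assumes that larger objective values are better, so that adding δ relaxes the test. For objectives that are minimized, the sign of δ and the direction of the inequalities would be reversed. The function names `tolerance` and `is_pareto_optimal` are hypothetical.

```python
import numpy as np


def tolerance(pareto_front, fraction=0.005):
    """delta_m = fraction * (max - min) of objective m over the reference front."""
    front = np.asarray(pareto_front, dtype=float)
    return fraction * (front.max(axis=0) - front.min(axis=0))


def is_pareto_optimal(y, pareto_front, fraction=0.005):
    """True if y + delta is not dominated by any reference solution.

    Assumption: all objectives are maximized.
    """
    front = np.asarray(pareto_front, dtype=float)
    relaxed = np.asarray(y, dtype=float) + tolerance(front, fraction)
    # A reference point dominates if it is at least as good in every
    # objective and strictly better in at least one.
    dominated = np.all(front >= relaxed, axis=1) & np.any(front > relaxed, axis=1)
    return not bool(dominated.any())
```

With the reference front [[0, 10], [5, 5], [10, 0]], δ is 0.05 for each objective, so a point such as [4.99, 4.99] is accepted even though [5, 5] dominates it exactly.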