This work assumes that the result party wishes to compute a logistic regression model over data collected by different data owners. Each data owner splits its sensitive data into multiple shares (one per computation party) and sends each share separately to the corresponding computation party. Note that each computation party receives an equal number of dependent and independent variables. Each computation party should append the received shares and their corresponding dependent variables in the correct order. Finally, the computation parties send their computed shares of the logistic regression coefficients to the result party, which then simply sums these shares to obtain the final result.
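The data flow above can be illustrated with a minimal additive secret-sharing sketch. It rests on assumptions not stated in the text: values are non-negative integers, shares live modulo a public prime `MODULUS` chosen here purely for illustration, and the helper names `make_shares`, `distribute`, and `reconstruct` are hypothetical. It shows only sharing and reconstruction, not the secure computation performed between them.

```python
import secrets

# Assumption: shares are integers modulo a public prime. The text does not
# state the modulus, so this value is illustrative only.
MODULUS = 2**61 - 1


def make_shares(value, num_parties):
    """Split an integer into num_parties additive shares modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(num_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Sum the shares, as the result party does, to recover the value."""
    return sum(shares) % MODULUS


def distribute(records, num_parties):
    """Share every value of every record; party i appends its i-th shares in order."""
    inboxes = [[] for _ in range(num_parties)]
    for record in records:
        shared = [make_shares(v, num_parties) for v in record]
        for i in range(num_parties):
            inboxes[i].append([s[i] for s in shared])
    return inboxes
```

Because every party appends rows in the same order, the shares at position (row, column) across all parties sum to the original value, while any single party sees only uniformly random-looking numbers.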
We now present our privacy-preserving logistic regression training algorithms, which employ the previously mentioned approaches. These algorithms summarize the crucial steps of the proposed protocols under both the honest-majority and dishonest-majority security assumptions. In our proposed algorithms, each data owner provides a share of its data to each computation party as input. The only output of the algorithm is the computed model coefficients. Notably, to avoid unnecessarily revealing information about the input, we do not perform a convergence check after each iteration. Instead, a fixed parameter specifies the upper bound on the number of iterations needed for convergence.
In Algorithm 1, we propose a highly accurate privacy-preserving logistic regression model training protocol. In this algorithm, we employ only highly accurate approximations, namely the matrix inversion algorithm and the fixed Hessian, which have a negligible effect on the accuracy of the computation output. Moreover, instead of approximating the Sigmoid function, we use the approach introduced in the “Gradient” section to compute its exact value (lines 8-12).
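As a plaintext point of reference (not the secure protocol itself), the sketch below shows what a fixed-Hessian update with an exact Sigmoid and a fixed iteration count might look like. It assumes the common fixed-Hessian bound of one quarter of X^T X, labels in {0, 1}, and a direct Gauss-Jordan solve in place of the approximate matrix inversion used under secret sharing; the names `solve` and `train_fixed_hessian` are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable exact Sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def train_fixed_hessian(x, y, num_iterations):
    """Run a fixed number of updates with a Hessian bound computed once."""
    d = len(x[0])
    # Assumed fixed Hessian bound: (1/4) X^T X, independent of the coefficients.
    h = [[0.25 * sum(row[i] * row[j] for row in x) for j in range(d)]
         for i in range(d)]
    beta = [0.0] * d
    for _ in range(num_iterations):  # fixed count, no convergence check
        residual = [yi - sigmoid(sum(b * v for b, v in zip(beta, row)))
                    for row, yi in zip(x, y)]
        grad = [sum(r * row[j] for r, row in zip(residual, x)) for j in range(d)]
        step = solve(h, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta
```

Because the Hessian bound does not depend on the current coefficients, it is inverted once rather than in every iteration, which is what makes the fixed-Hessian choice attractive when each matrix inversion is costly.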
The primary purpose of Algorithm 2 is to achieve a highly efficient privacy-preserving logistic regression model training protocol. To this end, it employs several approximation approaches: the fixed Hessian matrix, a least-squares approximation of the Sigmoid function, and the matrix inversion algorithm. This training algorithm demonstrates how the introduced approximation approaches can be combined efficiently to compute the logistic regression coefficients in a privacy-preserving manner.
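To make the least-squares idea concrete, the following sketch fits a cubic polynomial to the Sigmoid function on a symmetric interval. The degree (3), the interval [-8, 8], the grid size, and the function names are all assumptions chosen for illustration; the text does not specify which polynomial the protocol uses.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_cubic_sigmoid(half_width=8.0, num_points=1601):
    """Least-squares fit of sigmoid(z) ~ 0.5 + a1*z + a3*z**3 on a symmetric grid."""
    step = 2.0 * half_width / (num_points - 1)
    zs = [-half_width + i * step for i in range(num_points)]
    # sigmoid(z) - 0.5 is an odd function, so on a symmetric grid the even
    # coefficients of the fit vanish and only z and z**3 need to be fitted.
    s2 = sum(z**2 for z in zs)
    s4 = sum(z**4 for z in zs)
    s6 = sum(z**6 for z in zs)
    t1 = sum(z * (sigmoid(z) - 0.5) for z in zs)
    t3 = sum(z**3 * (sigmoid(z) - 0.5) for z in zs)
    # Solve the 2x2 normal equations with Cramer's rule.
    det = s2 * s6 - s4 * s4
    a1 = (t1 * s6 - t3 * s4) / det
    a3 = (s2 * t3 - s4 * t1) / det
    return a1, a3


def approx_sigmoid(z, a1, a3):
    """Evaluate the fitted polynomial using only additions and multiplications."""
    return 0.5 + a1 * z + a3 * z**3
```

A polynomial of this form needs only additions and multiplications, which are the operations that secret-sharing protocols evaluate cheaply. The price is an approximation error that grows quickly outside the fitted interval.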