Training setting

Yuchen Wang, Xingjian Chen, Zetian Zheng, Lei Huang, Weidun Xie, Fuzhou Wang, Zhaolei Zhang, Ka-Chun Wong

To optimize scGREAT, the goal is to minimize the loss function with respect to all trainable parameters, denoted Θ. These include the decoder parameters θ, the positional-embedding parameters, the matrices W_Q, W_K, W_V, and W_O of the attention module, the parameters δ of the FFN layer, and the parameters Φ of the Dense layer. We trained scGREAT with the Adam stochastic gradient descent method. Taking the cell-type-specific ChIP-seq network ground truth as an example, we used the default hyperparameters with a learning rate of 0.00001 and a decay rate of 0.999 applied every 10 steps. Training continued until convergence or for at most 80 epochs, using a mini-batch size of 32. Model training was conducted on an NVIDIA GeForce RTX 3080 GPU with 10 GB of memory.
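For readers who want a concrete picture of this configuration, the sketch below assembles an equivalent training loop in PyTorch. It is an illustration only: the model, dataset, and loss function are hypothetical placeholders rather than the actual scGREAT modules, and interpreting the 0.999 decay every 10 steps as a step-based learning-rate schedule is our assumption, not the authors' exact code.

```python
# Minimal sketch of the training setting described above (assumptions noted inline).
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical stand-ins for the scGREAT model and training data.
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1)).to(device)
train_dataset = TensorDataset(torch.randn(256, 64),
                              torch.randint(0, 2, (256, 1)).float())

loader = DataLoader(train_dataset, batch_size=32, shuffle=True)      # mini-batch size 32
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)            # learning rate 0.00001
# Assumed scheduler: decay by a factor of 0.999 every 10 optimizer steps.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.999)
criterion = nn.BCEWithLogitsLoss()                                    # placeholder loss

for epoch in range(80):                        # at most 80 epochs (or stop at convergence)
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        scheduler.step()                       # counted per step; rate drops every 10 steps
```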
