To optimize scGREAT, the goal is to minimize the loss function over all trainable parameters, including the parameters θ of the decoder, the positional-embedding parameters, the projection matrices of the attention module, the parameters δ of the FFN layer, and the parameters of the Dense layer. We trained scGREAT with the Adam stochastic gradient descent method. Taking the cell-type-specific ChIP-seq network ground truth as an example, we used the default hyperparameters with a learning rate of 0.00001 and a weight decay rate of 0.999 applied every 10 steps. Training continued until convergence or until 80 epochs were completed. We used a mini-batch approach with a batch size of 32. Model training was conducted on an NVIDIA GeForce RTX 3080 GPU with 10 GB of memory.
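As a rough illustration of this training setup, the sketch below wires the stated hyperparameters (Adam, learning rate 0.00001, batch size 32, at most 80 epochs) into a generic PyTorch training loop. The model, dataset, loss function, and convergence tolerance are placeholders rather than the actual scGREAT implementation, and the "0.999 decay rate every 10 steps" is interpreted here as a step-wise learning-rate schedule; that interpretation is an assumption, not confirmed by the source.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for the scGREAT model; the real architecture
# (decoder, positional embedding, attention module, FFN, Dense layer) is defined elsewhere.
model = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 1))

# Placeholder data: gene-pair features with binary ChIP-seq ground-truth labels.
features = torch.randn(1024, 256)
labels = torch.randint(0, 2, (1024, 1)).float()
loader = DataLoader(TensorDataset(features, labels), batch_size=32, shuffle=True)

criterion = nn.BCEWithLogitsLoss()  # assumed binary-classification loss
optimizer = optim.Adam(model.parameters(), lr=1e-5)
# Assumption: the 0.999-per-10-steps decay is modeled as a learning-rate schedule.
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.999)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

prev_loss, tol = float("inf"), 1e-6
for epoch in range(80):  # train for at most 80 epochs
    epoch_loss = 0.0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        scheduler.step()  # learning rate multiplied by 0.999 every 10 optimizer steps
        epoch_loss += loss.item() * x.size(0)
    epoch_loss /= len(loader.dataset)
    if abs(prev_loss - epoch_loss) < tol:  # simple convergence check
        break
    prev_loss = epoch_loss
```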