3.3. Adaptive Fine-Tuning Module

Drawing on the experience of meta-learning-based pretraining methods, we propose the AFT module to obtain the hybrid fine-tuning strategy. AFT first performs adaptive epoch learning on the validation classes of the base dataset using the idea of "chunk by chunk": it evaluates the model's performance for each chunk and applies an adaptive termination rule to output the adaptive epoch to be used at the fine-tuning stage. Then, the better of the FSLDA model and the adaptive fine-tuned model is retained, and the optimal hybrid epoch is acquired. Finally, the above procedure is executed on massive pseudo-fine-tuning tasks to output the final hybrid fine-tuning strategy, ensuring that most tasks converge to higher performance.

Specifically, massive pseudo-fine-tuning tasks, each consisting of a support set and a query set, are randomly sampled from the validation classes of the base dataset to imitate the fine-tuning task. As in metric-based meta-learning, the support set follows the C-way K-shot style. All remaining samples in the selected classes are used as the query set to evaluate the performance of the model. As shown in Figure 3, we first use the support set to obtain the FSLDA model and measure its accuracy $\mathrm{mAP}_m^0$ on the query set. During adaptive epoch learning, we divide the maximum allowable epochs into N chunks, each containing c nodes. To speed up learning, only the model at the last epoch of each node is evaluated on the query set. We take the mean of all node accuracies in a chunk as the representation of that chunk's performance, which captures the macro-level trend of the accuracy curve. For the m-th pseudo-fine-tuning task, we thus obtain its "chunk by chunk" performance series $\{\mathrm{mAP}_m^0, \dots, \mathrm{mAP}_m^{b}, \mathrm{mAP}_m^{b+1}, \dots, \mathrm{mAP}_m^{n}, \dots\}$, where b is the starting evaluation chunk index used to avoid disturbances at the initial fine-tuning stage. The process terminates when the accuracy gain becomes negligible and outputs the adaptive chunk index.

Figure 3. Illustration of adaptive epoch learning. Once the FSLDA model is available, we fine-tune it on the support set using the idea of "chunk by chunk" and record the corresponding sequential mAP on the query set. The fine-tuning process terminates when the accuracy gain becomes negligible. Note that adaptive epoch learning runs on the validation classes of the base dataset.
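To make the chunk-by-chunk procedure concrete, the sketch below implements adaptive epoch learning for a single pseudo-fine-tuning task in Python. The helper callables fine_tune_one_epoch and evaluate_map, the chunk/node sizes, and the stopping threshold eps are illustrative assumptions rather than the authors' exact implementation.

```python
import numpy as np

def adaptive_epoch_learning(fine_tune_one_epoch, evaluate_map,
                            model, support_set, query_set,
                            n_chunks=10, nodes_per_chunk=5, epochs_per_node=2,
                            start_chunk=2, eps=1e-3):
    """Chunk-by-chunk fine-tuning with an adaptive termination rule (sketch).

    fine_tune_one_epoch(model, support_set): runs one fine-tuning epoch in place.
    evaluate_map(model, query_set): returns the current mAP on the query set.
    """
    chunk_scores = []  # mean node mAP per chunk
    for chunk in range(1, n_chunks + 1):
        node_scores = []
        for _ in range(nodes_per_chunk):
            for _ in range(epochs_per_node):
                fine_tune_one_epoch(model, support_set)
            # only the model at the last epoch of each node is evaluated
            node_scores.append(evaluate_map(model, query_set))
        chunk_scores.append(float(np.mean(node_scores)))

        # start checking after `start_chunk` chunks (index b in the text) to
        # skip the noisy early phase; stop once the accuracy gain is negligible
        if chunk > start_chunk and chunk_scores[-1] - chunk_scores[-2] < eps:
            return chunk, chunk_scores  # adaptive chunk index and score trace
    return n_chunks, chunk_scores
```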

Then, we combine the advantages of the FSLDA model and adaptive epoch learning to set the optimal hybrid epoch: if the FSLDA model is not outperformed by the adaptive fine-tuned model, the optimal hybrid epoch is set to zero; otherwise it is derived from the adaptive chunk index, where a is the number of epochs contained in a chunk.
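A minimal sketch of the hybrid-epoch selection, under one plausible reading of the text: the adaptive chunk index is converted to an epoch count by multiplying with a, and the FSLDA model is kept (epoch = 0) whenever it is at least as accurate as the adaptive fine-tuned model. The exact formula in the paper may differ.

```python
def optimal_hybrid_epoch(map_fslda, map_adaptive, chunk_index, epochs_per_chunk):
    """Keep the better of the FSLDA model and the adaptive fine-tuned model.

    map_fslda:        mAP of the FSLDA model (mAP_m^0).
    map_adaptive:     mAP of the adaptive fine-tuned model.
    chunk_index:      adaptive chunk index from adaptive_epoch_learning.
    epochs_per_chunk: the quantity `a`, i.e. the epochs contained in one chunk.
    """
    if map_fslda >= map_adaptive:
        return 0  # FSLDA alone is already best; no fine-tuning needed
    return chunk_index * epochs_per_chunk  # epoch budget implied by the chunk index
```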

When the optimal hybrid epochs for M pseudo-fine-tuning tasks are ready, the optimal hybrid fine-tuning strategy is finally acquired, where $M' = \sum_{m=1}^{M} \mathbb{1}(\mathrm{epoch}_m > 0)$ indicates the number of tasks needing to be fine-tuned and $\mathbb{1}$ is the indicator function. When most pseudo-fine-tuning tasks do not need the fine-tuning stage (epoch = 0), the optimal hybrid fine-tuning strategy adopts FSLDA as the final strategy. Otherwise, it uses the 0.9 quantile of the optimal hybrid epochs to ensure that most tasks converge. In the latter case, the optimal hybrid fine-tuning strategy performs both FSLDA and AFT.
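The aggregation over the M pseudo-fine-tuning tasks might look as follows; the majority test (M′ ≤ M/2) is an assumed interpretation of "most tasks do not need the fine-tuning stage", while the 0.9 quantile follows the text.

```python
import numpy as np

def hybrid_finetuning_strategy(hybrid_epochs):
    """Derive the final strategy from the per-task optimal hybrid epochs."""
    epochs = np.asarray(hybrid_epochs)
    m = len(epochs)
    m_prime = int(np.sum(epochs > 0))  # number of tasks needing fine-tuning (M')
    if m_prime <= m / 2:               # most tasks converge with FSLDA alone
        return 0                       # final strategy: FSLDA only, no fine-tuning
    # otherwise fine-tune for the 0.9 quantile of the optimal hybrid epochs,
    # so that most tasks are given enough epochs to converge
    return int(np.quantile(epochs, 0.9))
```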

The pipeline of AFT is summarized in Algorithm 1.

Algorithm 1. Pseudocode for the AFT module.
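Under the same assumptions as the sketches above, Algorithm 1 could be driven roughly as follows. The callables sample_pseudo_tasks and train_fslda are hypothetical stand-ins for the task sampler and the FSLDA training step, and the other helpers are the ones defined in the previous snippets.

```python
def aft_pipeline(sample_pseudo_tasks, train_fslda, fine_tune_one_epoch, evaluate_map,
                 num_tasks=100, epochs_per_chunk=10):
    """Rough driver for the AFT module (sketch): pseudo-fine-tuning tasks are
    sampled from the validation classes, and the final hybrid epoch is returned."""
    hybrid_epochs = []
    for support_set, query_set in sample_pseudo_tasks(num_tasks):
        model, map_fslda = train_fslda(support_set, query_set)   # FSLDA model and mAP_m^0
        chunk_idx, chunk_scores = adaptive_epoch_learning(
            fine_tune_one_epoch, evaluate_map, model, support_set, query_set)
        hybrid_epochs.append(optimal_hybrid_epoch(
            map_fslda, max(chunk_scores), chunk_idx, epochs_per_chunk))
    return hybrid_finetuning_strategy(hybrid_epochs)
```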
