Selecting reliable negative associations using positive unlabelled learning

Qiuhao Chen
Liyuan Zhang
Yaojia Liu
Zhonghao Qin
Tianyi Zhao

In our dataset, unlabelled associations may contain hidden positives that were never labelled as such, for reasons such as cost or technological limitations. It is therefore unsafe to assume that all unlabelled examples are negative, especially when positive examples are rare compared with all possible examples. Treating all unlabelled examples as negatives biases the learning process, because the resulting negative samples are not truly representative of the negative class. This skews the distribution of the dataset and can cause the learned model to perform poorly.

We therefore used positive unlabelled (PU) learning to identify reliable negative associations, which helps avoid the issues associated with treating all unlabelled data as negative associations [34]. Many techniques have been proposed to address this problem. For example, iPiDi-PUL, iPiDi-sHN and piRDA use a bagging strategy to select high-quality negative associations [15–17]. In this study, we combined three different methods (PU bagging, the two-step technique and the spy technique) to select reliable negative associations, and compared their performance in the prediction task. The steps of these three PUL methods are as follows.

The final reliable negative associations set is the union of reliable negative associations obtained from these three methods.
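
As an illustration only, the sketch below shows one common form of PU bagging followed by the union step described above. It is a minimal sketch under stated assumptions, not the authors' implementation: the nearest-centroid score is a simple stand-in for whatever classifier is actually used, the toy feature vectors are invented, and the names `pu_bagging_scores`, `reliable_negatives`, `two_step_negatives` and `spy_negatives` are hypothetical. The two-step and spy results are hard-coded placeholders so that the union can be shown.

```python
import random
from statistics import mean


def centroid(rows):
    """Column-wise mean of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pu_bagging_scores(positives, unlabelled, n_rounds=200, seed=0):
    """Average out-of-bag score per unlabelled example.

    Higher scores mean "more positive-like". The scoring rule is an
    assumed stand-in classifier (nearest centroid), chosen for brevity.
    """
    rng = random.Random(seed)
    n = len(unlabelled)
    pos_c = centroid(positives)
    totals, counts = [0.0] * n, [0] * n
    for _ in range(n_rounds):
        # Bootstrap as many unlabelled examples as there are positives
        # and treat them as provisional negatives for this round.
        bag = set(rng.choices(range(n), k=len(positives)))
        neg_c = centroid([unlabelled[i] for i in bag])
        for i, x in enumerate(unlabelled):
            if i not in bag:  # score only out-of-bag examples
                totals[i] += sq_dist(x, neg_c) - sq_dist(x, pos_c)
                counts[i] += 1
    # Examples that were never out-of-bag get no score.
    return [t / c if c else None for t, c in zip(totals, counts)]


def reliable_negatives(scores, fraction=0.5):
    """Indices of the lowest-scoring fraction of scored examples."""
    scored = sorted((s, i) for i, s in enumerate(scores) if s is not None)
    keep = max(1, int(fraction * len(scored)))
    return {i for _, i in scored[:keep]}


# Toy data (invented): indices 0 and 4 play the role of hidden positives.
positives = [[1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
unlabelled = [
    [1.0, 0.9], [-1.0, -1.0], [-0.9, -1.1],
    [-1.1, -0.9], [0.95, 1.05], [-1.0, -0.8],
]

bagging_negatives = reliable_negatives(pu_bagging_scores(positives, unlabelled))

# Placeholder outputs standing in for the two-step and spy methods.
two_step_negatives = {1, 3}
spy_negatives = {2, 5}

# Final reliable negative set: the union of the three methods' selections.
final_negatives = bagging_negatives | two_step_negatives | spy_negatives
```

Taking the union keeps every association that at least one method considers a reliable negative, so the final set is at least as large as the largest individual selection. An intersection would be stricter but is not what the text describes.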
