Details
Multi-class classification based on self-training semi-supervised weighted sphere structured support vector machine
Document type: Journal article
Chinese title: 自训练半监督加权球结构支持向量机多分类方法
English title: Multi-class classification based on self-training semi-supervised weighted sphere structured support vector machine
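Authors: Zhao Fang (赵芳)[1]; Ma Yulei (马玉磊)[2]
First author: Zhao Fang (赵芳)
Affiliations: [1] School of Computer and Information Engineering, Xinxiang University; [2] School of Continuing Education, Xinxiang University
First affiliation: School of Computer and Information Engineering, Xinxiang University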
Year: 2014
Volume: 26
Issue: 3
Pages: 404-408
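Journal (Chinese name): 重庆邮电大学学报:自然科学版 (Journal of Chongqing University of Posts and Telecommunications, Natural Science Edition)
Indexed in: CSTPCD; PKU Core (北大核心): [PKU Core 2011]; CSCD: [CSCD_E2013_2014]
Funding: Henan Province Science and Technology Research Program Development Project (112300410266)
Language: Chinese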
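Chinese keywords: 半监督学习 (semi-supervised learning); 自训练 (self-training); 多分类 (multi-class classification); 球结构支持向量机 (sphere structured support vector machine); 加权球结构支持向量机 (weighted sphere structured support vector machine)
English keywords: semi-supervised; self-training; multi-classification; sphere structured support vector machine; weighted sphere structured support vector machine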
Abstract: To address the low efficiency of the self-training semi-supervised support vector machine (S3VM), this paper replaces the traditional support vector machine with a weighted sphere structured support vector machine and proposes a self-training semi-supervised weighted sphere structured support vector machine. A traditional support vector machine must solve a convex quadratic programming problem, which consumes large amounts of memory and computing time on large-scale data, and the difficulty is greater still for multi-class problems. Using a sphere structured support vector machine for multi-class classification greatly shortens training time and reduces algorithmic complexity. When the classes contain unequal numbers of samples, however, the training classification errors of a sphere structured support vector machine tend to fall on the classes with fewer samples. Introducing weights reduces this sensitivity to class imbalance and compensates for the adverse effect of class-size differences on generalization performance. Experiments on artificial datasets and UCI (University of California, Irvine) datasets show that the method is robust to the labeled samples and improves both efficiency and classification accuracy, the latter significantly.
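The self-training idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. Each class sphere here is simply the class mean with the largest member distance as its radius, whereas a sphere structured support vector machine obtains the center and radius by solving an optimization problem. The per-class weights described in the abstract are not modelled. The function names, the margin-based confidence rule, and the `batch` parameter are all assumptions made for illustration.

```python
import math


def fit_spheres(points, labels):
    """One sphere per class: center = class mean, radius = largest distance to it.

    Stand-in only: a real sphere structured SVM optimizes center and radius.
    """
    by_class = {}
    for p, lab in zip(points, labels):
        by_class.setdefault(lab, []).append(p)
    spheres = {}
    for lab, pts in by_class.items():
        center = tuple(sum(c) / len(pts) for c in zip(*pts))
        radius = max(math.dist(p, center) for p in pts)
        spheres[lab] = (center, max(radius, 1e-12))  # guard against a zero radius
    return spheres


def relative_distances(spheres, p):
    """(distance to center / radius, label) for every class, smallest first."""
    return sorted((math.dist(p, c) / r, lab) for lab, (c, r) in spheres.items())


def predict(spheres, p):
    """Assign p to the class whose sphere it is relatively closest to."""
    return relative_distances(spheres, p)[0][1]


def self_train(points, labels, unlabeled, batch=10):
    """Self-training loop (assumes at least two labeled classes).

    Each round refits the spheres, pseudo-labels the `batch` unlabeled points
    with the largest gap between best and second-best class (a hypothetical
    confidence rule), and moves them into the labeled set.
    """
    points, labels, pool = list(points), list(labels), list(unlabeled)
    while pool:
        spheres = fit_spheres(points, labels)
        scored = []
        for i, p in enumerate(pool):
            ranked = relative_distances(spheres, p)
            margin = ranked[1][0] - ranked[0][0]
            scored.append((margin, i, ranked[0][1]))
        scored.sort(reverse=True)
        chosen = scored[:batch]
        for _, i, lab in chosen:
            points.append(pool[i])
            labels.append(lab)
        taken = {i for _, i, _ in chosen}
        pool = [p for i, p in enumerate(pool) if i not in taken]
    return fit_spheres(points, labels)
```

Calling `self_train(labeled_points, labeled_classes, unlabeled_points)` returns the final spheres, which `predict` can then apply to new points. Because every class needs only one center and one radius, adding a class adds one sphere rather than a set of pairwise classifiers, which is the property the abstract credits for the shorter multi-class training time.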