
# Introduction to Machine Learning (7): Multiclass Learning with ECOC Encoding

Posted on 2016-11-12

ECOC stands for Error-Correcting Output Codes. It can be used to reduce a multiclass learning problem to a set of binary classification problems, and this article introduces the method.

E. Allwein, R. Schapire, and Y. Singer. Reducing multiclass to binary: A unifying approach for margin classifiers. Journal of Machine Learning Research, 1:113-141, 2000.
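To make the reduction concrete, here is a minimal sketch of two common codebooks, one-vs-rest (OVR) and one-vs-one (OVO). Each row is the codeword of a class and each column defines one binary problem. The function names `ovr_codebook` and `ovo_codebook` are hypothetical helpers written for this illustration and are not part of any library.

```python
from itertools import combinations


def ovr_codebook(num_classes):
    """One-vs-rest: column j labels class j as +1 and every other class as -1."""
    return [[1 if i == j else -1 for j in range(num_classes)]
            for i in range(num_classes)]


def ovo_codebook(num_classes):
    """One-vs-one: one column per class pair (a, b); a is +1, b is -1, the rest 0."""
    pairs = list(combinations(range(num_classes), 2))
    return [[1 if i == a else -1 if i == b else 0 for (a, b) in pairs]
            for i in range(num_classes)]


# Code length = number of binary classifiers that must be trained.
ovr = ovr_codebook(10)
ovo = ovo_codebook(10)
print(len(ovr[0]), len(ovo[0]))
```

With K classes, OVR needs K binary classifiers and OVO needs K(K-1)/2; an entry of 0 means the class is ignored when training that column's classifier.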

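$$
\begin{aligned}
Q(0) &= \frac{K-1}{K} P(0) \\
Q(+1) &= \frac{1}{K} + \frac{K-2}{K} P(+1) \\
Q(-1) &= \frac{1}{K} + \frac{K-2}{K} P(-1)
\end{aligned}
$$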

```python
import re
import time

from scipy.io import loadmat

import shogun.Classifier as Classifier
from shogun.Classifier import LibLinear, L2R_L2LOSS_SVC
from shogun.Classifier import ECOCStrategy, LinearMulticlassMachine
from shogun.Features import RealFeatures, MulticlassLabels
from shogun.Evaluation import MulticlassAccuracy

# Load the data: each column is a sample and the last row holds the labels.
mat = loadmat('uci-20070111-optdigits.mat')['int0'].astype(float)
X = mat[:-1, :]
Y = mat[-1, :]

# First half of the samples for training, second half for testing.
isplit = X.shape[1] // 2
traindat = X[:, :isplit]
label_traindat = Y[:isplit]
testdat = X[:, isplit:]
label_testdat = Y[isplit:]


def nonabstract_class(name):
    """Return True if the named class in shogun.Classifier can be instantiated."""
    try:
        getattr(Classifier, name)()
    except TypeError:
        return False
    return True


# Collect every concrete ECOC encoder and decoder exposed by the module.
encoders = [x for x in dir(Classifier)
            if re.match(r'ECOC.+Encoder', x) and nonabstract_class(x)]
decoders = [x for x in dir(Classifier)
            if re.match(r'ECOC.+Decoder', x) and nonabstract_class(x)]

fea_train = RealFeatures(traindat)
fea_test = RealFeatures(testdat)
gnd_train = MulticlassLabels(label_traindat)
if label_testdat is None:
    gnd_test = None
else:
    gnd_test = MulticlassLabels(label_testdat)

base_classifier = LibLinear(L2R_L2LOSS_SVC)

print('Testing with %d encoders and %d decoders' % (len(encoders), len(decoders)))
print('-' * 70)
format_str = '%%15s + %%-10s %%-10%s %%-10%s %%-10%s'
print((format_str % ('s', 's', 's')) % ('encoder', 'decoder', 'codelen', 'time', 'accuracy'))


def run_ecoc(ier, idr):
    """Train and evaluate one encoder/decoder pair; return (code length, accuracy)."""
    encoder = getattr(Classifier, encoders[ier])()
    decoder = getattr(Classifier, decoders[idr])()

    # Data-dependent encoders need the training labels and features.
    if hasattr(encoder, 'set_labels'):
        encoder.set_labels(gnd_train)
        encoder.set_features(fea_train)

    strategy = ECOCStrategy(encoder, decoder)
    classifier = LinearMulticlassMachine(strategy, fea_train, base_classifier, gnd_train)
    classifier.train()
    label_pred = classifier.apply(fea_test)
    if gnd_test is not None:
        evaluator = MulticlassAccuracy()
        acc = evaluator.evaluate(label_pred, gnd_test)
    else:
        acc = None

    return (classifier.get_num_machines(), acc)


for ier in range(len(encoders)):
    for idr in range(len(decoders)):
        t_begin = time.time()
        (codelen, acc) = run_ecoc(ier, idr)
        if acc is None:
            acc_fmt = 's'
            acc = 'N/A'
        else:
            acc_fmt = '.4f'

        t_elapse = time.time() - t_begin
        # Strip the 'ECOC' prefix and the 'Encoder'/'Decoder' suffix for display.
        print((format_str % ('d', '.3f', acc_fmt)) %
              (encoders[ier][4:-7], decoders[idr][4:-7], codelen, t_elapse, acc))
```

```
Testing with 6 encoders and 5 decoders
----------------------------------------------------------------------
        encoder + decoder    codelen    time       accuracy
   Discriminant + AED        9          2.830      0.0875
   Discriminant + ED         9          2.960      0.1000
   Discriminant + HD         9          2.820      0.0993
   Discriminant + IHD        9          2.770      0.1050
   Discriminant + LLB        9          2.780      0.1007
         Forest + AED        27         9.380      0.0982
         Forest + ED         27         9.290      0.1000
         Forest + HD         27         8.610      0.0975
         Forest + IHD        27         8.680      0.1046
         Forest + LLB        27         8.730      0.1004
            OVO + AED        45         2.920      0.0975
            OVO + ED         45         2.920      0.0975
            OVO + HD         45         2.860      0.0961
            OVO + IHD        45         2.770      0.0961
            OVO + LLB        45         2.690      0.0975
            OVR + AED        10         1.030      0.0954
            OVR + ED         10         1.000      0.0954
            OVR + HD         10         1.000      0.0925
            OVR + IHD        10         0.970      0.0947
            OVR + LLB        10         0.990      0.0954
    RandomDense + AED        23         12.370     0.0947
    RandomDense + ED         23         13.230     0.0975
    RandomDense + HD         23         13.580     0.0982
    RandomDense + IHD        23         14.390     0.0979
    RandomDense + LLB        23         15.270     0.0968
   RandomSparse + AED        35         6.330      0.1032
   RandomSparse + ED         35         8.200      0.1004
   RandomSparse + HD         35         7.780      0.0957
   RandomSparse + IHD        35         6.240      0.0972
   RandomSparse + LLB        35         7.780      0.0975
```

$$f(x) = (f_1(x), \ldots, f_L(x))$$

$$d(B_i, f(x)) = \sum_{j=1}^{L} \ell(M_{ij}, f_j(x))$$

$$\ell(y, f) = -yf$$

$$f_k - \sum_{j \neq k} f_j > f_{k'} - \sum_{j \neq k'} f_j, \quad \forall k' \neq k$$
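Loss-based decoding can be sketched in a few lines: the distance from a class codeword to the vector of real-valued classifier outputs is the sum of a per-bit loss, and the class with the smallest distance wins. The sketch below assumes a one-vs-rest codebook and the linear loss, taken here as -yf. The names `linear_loss` and `loss_based_decode` are hypothetical and the output values are made up for illustration.

```python
def linear_loss(y, f):
    """Linear loss: small when the output f agrees in sign with the code bit y."""
    return -y * f


def loss_based_decode(codebook, outputs, loss=linear_loss):
    """Return the index of the codeword with the smallest total loss."""
    distances = [sum(loss(m, f) for m, f in zip(row, outputs))
                 for row in codebook]
    return min(range(len(codebook)), key=distances.__getitem__)


# One-vs-rest codebook for 3 classes (rows are classes, columns are classifiers).
codebook = [[1, -1, -1],
            [-1, 1, -1],
            [-1, -1, 1]]
outputs = [0.2, 1.5, -0.7]  # made-up outputs of the 3 binary classifiers

predicted = loss_based_decode(codebook, outputs)
argmax = max(range(len(outputs)), key=outputs.__getitem__)
print(predicted, argmax)
```

Under these assumptions the sum over the other classifiers differs between two classes only by their own outputs, so the decoded class coincides with the classifier that produced the largest output.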
