Reference answer:
Consider a multiclass classification problem in which the target variable y can take any one of k discrete values. For example, a mail classifier might sort messages into personal mail, work mail, and spam. y is still discrete; it simply has more classes than in binary logistic regression. We model it with a multinomial distribution.
Suppose the samples fall into k classes with probabilities <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780754032_55AD4EF88F7E3AD26F2CD92B856E3CE8"">. Because <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780734055_FECFD8E308BD37D59C0485500E9F903E"">, k-1 parameters <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780710620_E75398371BAA46411285BA02E9B31529""> are usually enough:
<img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780686297_B9A323E4B63969BABB432C94241872AC"">,<img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780670060_07223BB17E158C69D5CC16AB69EA1089"">
To carry out the derivation, introduce the expression:
<img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780653625_EF2B4EA5EADF1C9C886AD245EE3F3085"">
Here T(y) is a (k-1)-dimensional column vector, where y = 1, 2, …, k.
T(y)_i denotes the i-th element of the vector T(y).
We also introduce the indicator function <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780638345_394803B5E9198D011F6406B3BE3E3951"">: it equals 1 if the statement inside the braces is true and 0 otherwise. For example, 1{2=3} = 0 and 1{3=3} = 1.
The k vectors above can then be written as <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780618356_8A3CEDBA782BECC1F781F704839E1B1E"">
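The encoding T(y) can be sketched in a few lines of Python. This is only an illustration: the function name `T` mirrors the notation in the text, and the use of NumPy and 1-based labels are assumptions made here, not part of the original answer.

```python
import numpy as np

def T(y: int, k: int) -> np.ndarray:
    """Return the (k-1)-dimensional vector T(y) for a label y in {1, ..., k}."""
    if not 1 <= y <= k:
        raise ValueError("y must be in 1..k")
    # Element i (1-based) is the indicator 1{y = i}, for i = 1..k-1.
    return np.array([1.0 if y == i else 0.0 for i in range(1, k)])

print(T(1, 4))  # [1. 0. 0.]
print(T(4, 4))  # [0. 0. 0.]  (class k maps to the all-zero vector)
```

Note that the last class needs no slot of its own: it is identified by every indicator being 0.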
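Because y can belong to only one class, at most one element of T(y) is 1 and all the others are 0 (for y = k every element is 0). The expectations of the k-1 elements are therefore: <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780585157_9A2AFCAA427EE8D613BEAB8823AABE80"">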
Define: <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780566695_6303D5194694BAB414C25ED0D950084A"">
where i = 1, 2, …, k. Then:
<img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780545108_75847EAFDFBEF2B1F7293B5EB9EA93F8"">
It then follows easily that <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780530817_BF7C4D4701A89F54B81141C7C39F59F7"">. Combining this with the earlier identity <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780514381_9F97E100D5E05908A40E8643FA7028F2""> gives <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780497399_E58AD253BCF4ED985E901176E56D8EB1"">. This function is the softmax function.
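For readers who prefer code, here is a minimal sketch of the softmax function, assuming the standard definition (exponentiate each score, then normalize). The function name `softmax` and the example scores are chosen here for illustration. Subtracting the maximum before exponentiating is a common numerical safeguard added in this sketch; it is not part of the derivation.

```python
import numpy as np

def softmax(eta: np.ndarray) -> np.ndarray:
    """Map a vector of k scores eta to k probabilities that sum to 1."""
    # Subtracting the max leaves the result unchanged but avoids overflow in exp.
    shifted = eta - np.max(eta)
    exp = np.exp(shifted)
    return exp / exp.sum()

# Example scores; the last one is 0, as is conventional for class k.
phi = softmax(np.array([2.0, 1.0, 0.0]))
print(phi, phi.sum())
```

Larger scores receive larger probabilities, and the outputs always sum to 1.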
Next, assume that <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780480942_A4C083330C29D979912BA2C60DF61410""> and <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780459453_6BD4A2027B1D0DA5BFDBA09C07EDFB72""> are linearly related, i.e. <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780442539_24D412E5E6994ECE25683D4A27FA9F64"">
From the probabilistic point of view, we then have:
<img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780427162_84262429E420B08486929D2DA76549A0"">
where <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780407386_982395D6BA00A790D46FB3B006F9295C"">. This model is softmax regression, a generalization of logistic regression.
The model's output is then:
<img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780380844_93B3D62F538DCD81B879D415FF8AFB44"">
That is, the model outputs the probability that x belongs to each of the classes 1, 2, …, k-1. The probability of class k is then: <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780363300_31F8F724F33941669B4EDB4B912451FC"">
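The prediction step can be sketched as follows. The sketch assumes the usual convention that the score of class k is fixed at 0, so only k-1 parameter vectors are stored. The names `predict_proba`, `theta`, and `x`, and the numeric values, are hypothetical and chosen only for illustration.

```python
import numpy as np

def predict_proba(theta: np.ndarray, x: np.ndarray) -> np.ndarray:
    """theta has shape (k-1, n); x has shape (n,). Returns k class probabilities."""
    # eta_i = theta_i^T x for i = 1..k-1, and eta_k = 0 by convention.
    eta = np.append(theta @ x, 0.0)
    exp = np.exp(eta - eta.max())  # shift for numerical stability
    return exp / exp.sum()

theta = np.array([[1.0, -1.0], [0.5, 0.5]])  # hypothetical parameters: k = 3, n = 2
x = np.array([2.0, 1.0])
p = predict_proba(theta, x)
print(p[:-1])  # probabilities of classes 1..k-1
print(p[-1])   # probability of class k, equal to 1 minus the sum of the others
```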
Next we fit the parameters.
As before, we maximize the log-likelihood with respect to the parameters θ:
<img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190317/311436_1552780344713_55B9170B6E66F63328892E74B55F18CE"">
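For reference, the log-likelihood of softmax regression is usually written in the form below. This is the standard textbook form, given here on the assumption that it matches the formula shown in the image. The symbol m (the number of training examples) and the superscript (i) for the i-th example are notation introduced here.

```latex
\ell(\theta)
  = \sum_{i=1}^{m} \log p\bigl(y^{(i)} \mid x^{(i)}; \theta\bigr)
  = \sum_{i=1}^{m} \log \prod_{l=1}^{k}
    \left( \frac{e^{\theta_l^{T} x^{(i)}}}{\sum_{j=1}^{k} e^{\theta_j^{T} x^{(i)}}} \right)^{1\{y^{(i)} = l\}}
```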
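Either a gradient method (gradient ascent on the log-likelihood, or equivalently gradient descent on its negative) or Newton's method can be used here.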
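A minimal sketch of the gradient-based option, written as batch gradient ascent on the log-likelihood. It assumes the score of class k is fixed at 0 and uses the standard gradient, the sum over examples of (1{y = l} - p_l) x. The function names, learning rate, step count, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np

def softmax_rows(eta: np.ndarray) -> np.ndarray:
    """Row-wise softmax of an (m, k) score matrix."""
    exp = np.exp(eta - eta.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

def fit_softmax(X, y, k, lr=0.1, steps=500):
    """X: (m, n) features; y: (m,) integer labels in 1..k. Returns theta, shape (k-1, n)."""
    m, n = X.shape
    theta = np.zeros((k - 1, n))
    Y = np.eye(k)[y - 1]  # (m, k) one-hot labels
    for _ in range(steps):
        # Scores for classes 1..k-1, plus a zero column for class k.
        eta = np.hstack([X @ theta.T, np.zeros((m, 1))])
        P = softmax_rows(eta)
        # Gradient of the log-likelihood with respect to theta_1..theta_{k-1}.
        grad = (Y - P)[:, :k - 1].T @ X
        theta += lr * grad / m  # ascent step on the average log-likelihood
    return theta

# Synthetic data: three Gaussian clusters plus an intercept column.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc, 0.5, size=(30, 2)) for loc in (-2.0, 0.0, 2.0)])
X = np.hstack([X, np.ones((90, 1))])
y = np.repeat([1, 2, 3], 30)
theta = fit_softmax(X, y, k=3)
print(theta.shape)  # (2, 3)
```

Newton's method would replace the fixed learning rate with a step scaled by the inverse Hessian; it typically needs fewer iterations but each one costs more.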