Reference answer:
Consider a multiclass classification problem, in which the response variable y can take any one of k discrete values. For example, an email classifier might sort messages into personal mail, work mail, and spam. y is still discrete; it simply has more categories than in binary logistic regression. We therefore model it with a multinomial distribution.
Suppose the samples fall into k classes with probabilities <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552912332812_49DEA3A986F227D201CCC75A9F898196"">. Since <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552912316380_DBF51BEB257F892842BFE9D8EA63B67A"">, k-1 parameters <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552912296547_51EBF30C85A21C7DEE5E7210782179B9""> are usually enough:
<img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552912277088_72EF8A8B451D26666AF2B0948D72B8B7"">,<img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552912263846_04DBFCB5E6C1EFE47914374A2FD6E31B"">
To carry out the derivation, introduce the expression:
<img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552912251158_C962D59D15587B82F9369EE90B9F5F0E"">
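Here T(y) is a (k-1)-dimensional column vector, where y = 1, 2, …, k.
T(y)_i denotes the i-th element of the vector T(y).
We also introduce the indicator function <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552912237198_90DF233EC01F187D2C39849CF206BE7D"">: it equals 1 if the statement inside the braces is true and 0 otherwise. For example, 1{2=3} = 0 and 1{3=3} = 1.
The k vectors above can then be written as <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552912218885_2FB3BE5FD28C64BE8F3722D9ADA39CD5"">
Because y belongs to exactly one class, at most one element of T(y) is 1 (none when y = k, since T(y) has only k-1 elements) and all the others are 0. The expectations of the k-1 elements are therefore: <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552912204968_040957ECCA47B5DBA52C4AD11631D00C"">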
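The encoding above can be sketched in a few lines of Python. This is only an illustration: the function names `indicator`, `T`, and `expected_T` are made up here, and the sketch assumes the usual convention (the formula itself is only available as an image) that the i-th element of T(y) is 1{y = i}, so that T(k) is the all-zero vector.

```python
def indicator(condition: bool) -> int:
    """1{condition}: 1 if the statement is true, 0 otherwise."""
    return 1 if condition else 0


def T(y: int, k: int) -> list[int]:
    """(k-1)-dimensional vector whose i-th element is 1{y = i}, i = 1..k-1.

    Assumed convention: for y = k every element is 0.
    """
    if not 1 <= y <= k:
        raise ValueError("y must be one of 1, 2, ..., k")
    return [indicator(y == i) for i in range(1, k)]


def expected_T(phi: list[float]) -> list[float]:
    """E[T(y)] when P(y = i) = phi[i - 1] for i = 1..k."""
    k = len(phi)
    expectation = [0.0] * (k - 1)
    for y in range(1, k + 1):
        for i, t in enumerate(T(y, k)):
            expectation[i] += phi[y - 1] * t
    return expectation
```

With three classes, `T(1, 3)` has a single 1 in the first position and `T(3, 3)` is all zeros, while `expected_T` returns the first k-1 class probabilities, which is the sense in which the expectation of each element recovers the probability of the corresponding class.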
Define: <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552912191552_EDDC1007CDBC89BF6FC0FFF61FBC5173"">
where i = 1, 2, …, k. Then:
<img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552912178118_8A678117375294A622DF9EF02883AEC6"">
It follows readily that <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552912162976_0392BA677073060A89D408E926ABAED9"">. Combining this with the earlier equation <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552912142676_7D5FE97B6B95A23709D7BA87566D8173""> gives <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552912020684_7230EFBD6A12F70A5C4AE71A61421A15"">. This function is the softmax function.
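A minimal sketch of the softmax function follows. It assumes the definition in the images is the standard one, eta_i = log(phi_i / phi_k), so that eta_k = 0 and phi_i = exp(eta_i) / sum_j exp(eta_j); the names `softmax` and `natural_params` are chosen here for illustration. Subtracting the maximum before exponentiating is a common numerical safeguard that is not part of the derivation and does not change the result.

```python
import math


def natural_params(phi: list[float]) -> list[float]:
    """eta_i = log(phi_i / phi_k) for i = 1..k (assumed definition), so eta_k = 0."""
    phi_k = phi[-1]
    return [math.log(p / phi_k) for p in phi]


def softmax(eta: list[float]) -> list[float]:
    """phi_i = exp(eta_i) / sum_j exp(eta_j)."""
    shift = max(eta)  # subtracting a constant leaves the ratio unchanged
    exps = [math.exp(e - shift) for e in eta]
    total = sum(exps)
    return [e / total for e in exps]
```

Applying `softmax` to `natural_params(phi)` should return `phi` again up to rounding, which is a quick way to check that the two mappings are inverses of each other.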
Next, assume that <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552912001310_B469096FC45F5EE52E99FD0AE9BF49F8""> and <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552911987122_5324B720996E57EEC63F94FDF3155C0B""> are linearly related, i.e. <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552911965946_60229DCD5261960716C722C07147A653"">
From a probabilistic point of view, we then have:
<img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552911950771_AE7E660593807F2457EC2DEE2DE93E13"">
where <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552911933261_2859076CD1212DAD72F6C04A67F723D8"">. This model is softmax regression, a generalization of logistic regression.
Our output is therefore:
<img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552911911951_FCE07DCC5BB5009B1504A8568F025CCA"">
This gives the probability that x belongs to each of the classes 1, 2, …, k-1; the probability of the k-th class is then: <img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552911897289_9B9FCDA45372DC22697AC68B771918EF"">
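The prediction step can be sketched as follows. This assumes the linear relation is eta_i = theta_i^T x for i = 1..k-1 with theta_k fixed to the zero vector (so eta_k = 0), which is the usual parameterization; `predict_proba` and its argument layout are hypothetical names for this example.

```python
import math


def predict_proba(thetas: list[list[float]], x: list[float]) -> list[float]:
    """Class probabilities P(y = i | x) for i = 1..k.

    thetas holds the k-1 parameter vectors theta_1..theta_{k-1};
    theta_k is assumed to be the zero vector, so eta_k = 0.
    """
    etas = [sum(t_j * x_j for t_j, x_j in zip(theta, x)) for theta in thetas]
    etas.append(0.0)  # eta_k = theta_k^T x = 0
    shift = max(etas)  # numerical safeguard only
    exps = [math.exp(e - shift) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]
```

The last entry of the returned list is the probability of class k, and it equals one minus the sum of the first k-1 entries.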
Next we fit the parameters.
As before, we maximize the log-likelihood with respect to the parameters θ:
<img alt=""img"" referrerpolicy=""no-referrer"" src=""https://uploadfiles.nowcoder.com/images/20190318/311436_1552911884751_B46F06E75A30F53AF5C22717DF0D38C2"">
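Either gradient ascent (equivalently, gradient descent on the negative log-likelihood) or Newton's method can be used here.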
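As a rough illustration of the fitting step, the sketch below runs plain batch gradient ascent on the log-likelihood, which is the same as gradient descent on its negative. It assumes the parameterization with theta_k = 0 and uses the standard gradient, the sum over samples of (1{y = l} - P(y = l | x)) x. The function names, the learning rate, and the number of steps are arbitrary choices for this example, not values taken from the answer.

```python
import math


def predict_proba(thetas, x):
    """P(y = i | x) for i = 1..k, with theta_k assumed to be zero."""
    etas = [sum(t_j * x_j for t_j, x_j in zip(theta, x)) for theta in thetas]
    etas.append(0.0)
    shift = max(etas)
    exps = [math.exp(e - shift) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]


def log_likelihood(thetas, X, y):
    """Sum over samples of log P(y_i | x_i); labels are 1..k."""
    return sum(math.log(predict_proba(thetas, x)[label - 1])
               for x, label in zip(X, y))


def fit(X, y, k, lr=0.01, steps=2000):
    """Batch gradient ascent on the log-likelihood (illustrative defaults)."""
    n_features = len(X[0])
    thetas = [[0.0] * n_features for _ in range(k - 1)]
    for _ in range(steps):
        grads = [[0.0] * n_features for _ in range(k - 1)]
        for x, label in zip(X, y):
            p = predict_proba(thetas, x)
            for l in range(k - 1):
                # gradient term: (1{y = l+1} - P(y = l+1 | x)) * x
                err = (1.0 if label == l + 1 else 0.0) - p[l]
                for j in range(n_features):
                    grads[l][j] += err * x[j]
        for l in range(k - 1):
            for j in range(n_features):
                thetas[l][j] += lr * grads[l][j]  # ascent step
    return thetas
```

For a toy data set such as `X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]]` with labels `[1, 1, 2, 2, 3, 3]`, calling `fit(X, y, k=3)` should yield a higher log-likelihood than the all-zero starting point. Newton's method would replace the fixed step with one scaled by the inverse Hessian and typically needs far fewer iterations.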