Reference answer:
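Logistic regression is, at its core, linear regression with a logistic function g(z) added to the mapping from features to output: the features are first combined in a linear sum, and g(z) applied to that sum serves as the hypothesis function used for prediction. g(z) maps any continuous value into the interval (0, 1). g(z) is the sigmoid function.
<img alt="img" referrerpolicy="no-referrer" src="https://uploadfiles.nowcoder.com/images/20190315/311436_1552624004848_E7CE042FBEFB1843DBDF7CBC11F63358">
<img alt="img" referrerpolicy="no-referrer" src="https://uploadfiles.nowcoder.com/images/20190315/311436_1552624039328_D93C5C8BB77DECD3743FD4B449A66E23">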
The derivative of the sigmoid function is as follows:
<img alt="img" referrerpolicy="no-referrer" src="https://uploadfiles.nowcoder.com/images/20190315/311436_1552624079107_25A6F4938FAF4F5021AA43DF2C167899">
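Logistic regression is used for 0/1 classification, that is, binary classification problems in which the predicted outcome is either 0 or 1. It assumes that the binary label follows a Bernoulli distribution, namely
<img alt="img" referrerpolicy="no-referrer" src="https://uploadfiles.nowcoder.com/images/20190315/311436_1552624133501_38402DF2A4F39DCE4B47E0F0058EB22B">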
This can also be written in the following form:
<img alt="img" referrerpolicy="no-referrer" src="https://uploadfiles.nowcoder.com/images/20190315/311436_1552624169461_C10597590C1F80BF897CFAAB01AAE3C0">
For a training set with feature data x = {x1, x2, …, xm} and corresponding class labels y = {y1, y2, …, ym}, and assuming the m samples are mutually independent, the likelihood function is:
<img alt="img" referrerpolicy="no-referrer" src="https://uploadfiles.nowcoder.com/images/20190315/311436_1552624194159_4C51583919397A517272634BF6F96B98">
The log-likelihood is:
<img alt="img" referrerpolicy="no-referrer" src="https://uploadfiles.nowcoder.com/images/20190315/311436_1552624220293_576FE1589F3DEF2EEACA159D7FA1C5A7">
How do we maximize it? As with linear regression, we use gradient ascent (gradient descent is used when minimizing), so <img alt="img" referrerpolicy="no-referrer" src="https://uploadfiles.nowcoder.com/images/20190315/311436_1552624301854_7456736DC4C6ABC5CD67D04AFCA7236B">.
<img alt="img" referrerpolicy="no-referrer" src="https://uploadfiles.nowcoder.com/images/20190315/311436_1552624393303_26E2F3F1A0183E71954E3F7BCF248419">
If only a single training example (x, y) is used at a time, following the stochastic gradient ascent rule, the update rule is:
<img alt="img" referrerpolicy="no-referrer" src="https://uploadfiles.nowcoder.com/images/20190315/311436_1552624437888_433A2D48F01B5C99162784250B8FDFEA">