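import org.apache.spark.ml.classification.LogisticRegression;
import org.apache.spark.ml.classification.LogisticRegressionModel;
import org.apache.spark.ml.classification.LogisticRegressionTrainingSummary;

LogisticRegression lr = new LogisticRegression()
    .setMaxIter(10)
    .setRegParam(0.3)
    // Elastic-net mixing parameter: sets the balance between L1 and L2 regularization
    // (the two weights sum to 1). The default 0 uses L2 only; 1 uses L1 only.
    // See the regularization settings later for details.
    .setElasticNetParam(0.8);
// .setThreshold(0.8)  // Classification threshold, default 0.5: a predicted probability
//                     // above the threshold gives class 1.0, otherwise class 0.0.
LogisticRegressionModel lrModel = lr.fit(df);
LogisticRegressionTrainingSummary trainingSummary = lrModel.summary();
// Obtain the loss per iteration.
double[] objectiveHistory = trainingSummary.objectiveHistory();
for (double lossPerIteration : objectiveHistory) {
    System.out.println("loss-- " + lossPerIteration);
}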
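
To make the two commented parameters concrete, here is a minimal plain-Java sketch (no Spark needed) of how I understand them: the elastic-net penalty as regParam * (alpha * L1 + (1 - alpha) / 2 * L2 squared), and the threshold rule for turning a probability into a class. The class and method names (ElasticNetSketch, penalty, classify) are made up for illustration and are not part of the Spark API.

```java
// Illustrative only: hypothetical helper, not Spark code.
class ElasticNetSketch {

    // Elastic-net penalty for a weight vector w:
    // regParam * (alpha * ||w||_1 + (1 - alpha) / 2 * ||w||_2^2)
    static double penalty(double[] w, double regParam, double alpha) {
        double l1 = 0.0;
        double l2Squared = 0.0;
        for (double wi : w) {
            l1 += Math.abs(wi);
            l2Squared += wi * wi;
        }
        return regParam * (alpha * l1 + (1.0 - alpha) / 2.0 * l2Squared);
    }

    // Threshold rule: class 1.0 if the probability is above the threshold, else 0.0.
    static double classify(double probability, double threshold) {
        return probability > threshold ? 1.0 : 0.0;
    }

    public static void main(String[] args) {
        double[] w = {1.0, -2.0};
        // Same settings as in the training code: regParam = 0.3, elasticNetParam = 0.8
        System.out.println("penalty = " + penalty(w, 0.3, 0.8));
        System.out.println("class at threshold 0.5 = " + classify(0.6, 0.5));
        System.out.println("class at threshold 0.8 = " + classify(0.6, 0.8));
    }
}
```

With alpha = 0 only the L2 term remains, and with alpha = 1 only the L1 term remains, which matches the comment on setElasticNetParam.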
Training with the LogisticRegression code above works fine on 30 rows of data. With a large dataset it fails with:
18/01/30 14:48:52 INFO OWLQN: Step Size: NaN
18/01/30 14:48:52 INFO OWLQN: Val and Grad Norm: NaN (rel: NaN) NaN
18/01/30 14:48:52 ERROR OWLQN: Failure! Resetting history: breeze.optimize.NaNHistory:
18/01/30 14:48:52 INFO OWLQN: Step Size: 1.000
18/01/30 14:48:52 INFO OWLQN: Val and Grad Norm: NaN (rel: NaN) NaN
18/01/30 14:48:52 ERROR OWLQN: Failure! Resetting history: breeze.optimize.NaNHistory:
Loss function log:
loss-- NaN
loss-- NaN
loss-- NaN
loss-- NaN
What is going on here, and how can I fix it?
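
One thing that may be worth ruling out first (this is an assumption, since the input data is not shown here): NaN or infinite values in the feature or label columns of the larger dataset, which can make the objective evaluate to NaN. A minimal plain-Java sketch of such a check on raw rows follows. The class and method names (NonFiniteCheck, countBadRows) are hypothetical, and it works on plain arrays rather than on a Spark Dataset.

```java
// Illustrative only: hypothetical helper for sanity-checking raw numeric rows.
class NonFiniteCheck {

    // Counts rows that contain at least one NaN or infinite value.
    static int countBadRows(double[][] rows) {
        int bad = 0;
        for (double[] row : rows) {
            for (double v : row) {
                if (!Double.isFinite(v)) { // true for NaN, +Infinity and -Infinity
                    bad++;
                    break; // count each row at most once
                }
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        double[][] rows = {
            {1.0, 2.0},
            {Double.NaN, 0.0},
            {3.0, Double.POSITIVE_INFINITY},
        };
        System.out.println("rows with non-finite values: " + countBadRows(rows));
    }
}
```

If the count is zero, the data is not the culprit in this particular way and the cause lies elsewhere.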