Research on a Corporate Dishonesty Recognition Model Based on Data Mining
161203105328
2020
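This thesis studies corporate dishonesty data by means of data mining. Missing values
are first counted with Python's pandas; indicators with a missing rate above 30% are
dropped, then a further 19 indicators [...] corporate dishonesty, leaving 12. The
remaining missing data are filled with R's DMwR package, based on kNN. Four indicators
of the corporate dishonesty data are [...]. [...] models are built to recognize
corporate dishonesty and are compared by ROC. The recognition rates are 90.9%, 92%
and 92.57%; the models' AUC values are 0.51, 0.61 and [...]. For the MLP model, the
rate is 92% and the AUC is 0.89.
Keywords: corporate dishonesty; data mining; ROC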
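Because the models above are compared by AUC as well as by recognition rate, it may help to recall what AUC measures: the probability that a randomly chosen positive (dishonest) sample receives a higher score than a randomly chosen negative one, so a value near 0.5 means ranking no better than chance. The following is a minimal sketch of that definition in plain Python; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not taken from the thesis.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of model scores, higher meaning "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical data): 3 of the 4 positive/negative pairs
# are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise form is quadratic in the sample size, which is fine for illustration; a rank-based computation would be the usual choice on a full data set.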
Research on the Model of Corporate Dishonesty Recognition Based
on Data Mining
Abstract
How to detect whether an enterprise has broken the law has become a problem in the era
of big data. This paper uses data mining methods to predict whether an enterprise is
dishonest. First, we use Python's pandas package to compute statistics on the missing
values in the data and remove the indicators with a missing rate of more than 30%; we
then remove a further 19 indicators that have no impact on enterprise dishonesty,
leaving 12 indicators. Then, following the principle of the KNN algorithm, the DMwR
package of R is used to fill in the missing data. Secondly, this paper presents a data
visualization analysis of four indicators of the data, namely enterprise type,
registration authority, enterprise status and jurisdiction authority. Finally, the
decision tree model, random forest model and gradient boosting decision tree model are
selected to establish
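
The two preprocessing steps described above (dropping indicators with a missing rate above 30%, then filling the remaining gaps with a KNN-based method) can be sketched as follows. This is an illustrative sketch only: the thesis performs the imputation with R's DMwR package, whereas the code below swaps in a simplified unweighted nearest-neighbour mean written in Python; the function names, the column names and the toy data are all assumptions, and the imputation assumes numeric columns and at least k complete rows.

```python
import numpy as np
import pandas as pd


def drop_sparse_columns(df: pd.DataFrame, max_missing: float = 0.30) -> pd.DataFrame:
    """Keep only columns whose share of missing values is at most max_missing."""
    missing_rate = df.isna().mean()  # per-column fraction of NaN
    return df.loc[:, missing_rate <= max_missing]


def knn_impute(df: pd.DataFrame, k: int = 3) -> pd.DataFrame:
    """Fill NaNs with the mean of the k nearest complete rows (numeric data only)."""
    values = df.to_numpy(dtype=float)
    complete = values[~np.isnan(values).any(axis=1)]  # donor rows without gaps
    filled = values.copy()
    for i, row in enumerate(values):
        gaps = np.isnan(row)
        if not gaps.any():
            continue
        # Euclidean distance over the columns that are observed in this row.
        dist = np.sqrt(((complete[:, ~gaps] - row[~gaps]) ** 2).sum(axis=1))
        nearest = complete[np.argsort(dist)[:k]]
        filled[i, gaps] = nearest[:, gaps].mean(axis=0)
    return pd.DataFrame(filled, index=df.index, columns=df.columns)


# Hypothetical toy data: one column is 80% missing, one has a single gap.
toy = pd.DataFrame({
    "capital": [1.0, 2.0, np.nan, 10.0, 11.0],
    "employees": [1.0, 2.0, 1.5, 10.0, 11.0],
    "mostly_missing": [np.nan, np.nan, np.nan, 4.0, np.nan],
})
kept = drop_sparse_columns(toy)   # "mostly_missing" is dropped
imputed = knn_impute(kept, k=2)   # the gap in "capital" is filled from 2 neighbours
print(imputed)
```

Unlike this sketch, a production imputation would normally scale the columns before computing distances, so that indicators measured in large units do not dominate the neighbour search.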