Cancer Data Prediction Based on Random Forest (Computer Science and Technology major).doc
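Cancer Data Prediction Based on the Random Forest Algorithm
Abstract (translated from the Chinese)
In recent years, the application of machine learning and deep learning across industries has driven the intelligent development of those fields. Disease prediction technology has also had a profound effect on people's daily work, study, and life. In medicine, using computer technology to predict disease has become an active research topic: medical data is growing explosively, and the large medical databases that have been built have potential practical value. As computer technologies represented by deep learning continue to develop and mature, big data analysis has become closely integrated with medicine and healthcare. The main task of this project is to design, in a Python development environment, a cancer data prediction system based on the random forest algorithm. The system collects parameters of a patient's tumor such as perimeter, radius, and area, builds and trains a random forest model, and uses it to predict the patient's condition in real time. The implemented system can predict a patient's risk of death and comprehensively analyze the hidden relationships within the data, playing a key role in disease prevention for cancer patients.
Keywords: random forest; data mining; cancer prediction; machine learning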
ABSTRACT
In recent years, the application of machine learning and deep learning across all walks of life has promoted the intelligent development of these fields. Disease prediction technology has also had a significant impact on people's daily life, work, and study. In medicine, applying computer technology to disease prediction has become a focus of current research. The explosive growth of medical data has produced huge medical databases with potential practical value. With the continuous development and maturation of computer technologies represented by deep learning, big data analysis has become closely integrated with the medical and health field. The main task of this project is to design, in a Python development environment, a cancer data prediction system based on the random forest algorithm. The system collects parameters such as the perimeter, radius, and area of a patient's tumor, then builds and trains a random forest model to predict the patient's condition in real time. The system can predict a patient's risk of death and comprehensively analyze the hidden relationships within the data, playing a key role in disease prevention for cancer patients.
Keywords: random forest; data mining; cancer prediction; machine learning
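The abstract describes the pipeline only in outline: tumor measurements go in, a random forest is trained, and a prediction comes out. The following is a minimal pure-Python sketch of that idea, not the thesis's implementation. Everything specific in it is an assumption made for illustration: the data are synthetic toy values generated in the script (not patient data), the labels are a toy benign/malignant flag rather than the death-risk target the abstract mentions, and all function names and parameter values (`train_forest`, 20 trees, depth 4, and so on) are hypothetical. A real system would more likely rely on an established library such as scikit-learn's `RandomForestClassifier`.

```python
import math
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values()) if n else 0.0

def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, rng, max_depth=4, depth=0):
    """Grow one tree; every node considers a random subset of the features."""
    n_features = len(rows[0])
    k = max(1, int(math.sqrt(n_features)))
    split = None
    if depth < max_depth and len(set(labels)) > 1:
        split = best_split(rows, labels, rng.sample(range(n_features), k))
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    children = [
        build_tree([r for r, _ in part], [y for _, y in part], rng, max_depth, depth + 1)
        for part in (left, right)
    ]
    return (f, t, children[0], children[1])

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are labels."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def train_forest(rows, labels, n_trees=20, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]

def make_synthetic_tumors(n, rng):
    """Toy data only: (radius, perimeter, area) with arbitrary class means."""
    rows, labels = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = "malignant", 0 = "benign" (toy labels)
        radius = max(1.0, rng.gauss(19.0 if label else 12.0, 2.0))
        perimeter = 2 * math.pi * radius * rng.uniform(0.95, 1.05)
        area = math.pi * radius ** 2 * rng.uniform(0.9, 1.1)
        rows.append((radius, perimeter, area))
        labels.append(label)
    return rows, labels

X, y = make_synthetic_tumors(300, random.Random(42))
X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]
forest = train_forest(X_train, y_train)
hits = sum(predict_forest(forest, r) == t for r, t in zip(X_test, y_test))
accuracy = hits / len(y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The two ingredients that make this a random forest rather than a single decision tree are the bootstrap sample drawn in `train_forest` and the random feature subset drawn at each node in `build_tree`; the final prediction is a majority vote across trees.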
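Contents
Abstract (Chinese)
ABSTRACT
Preface
1 Introduction
1.1 Research Background and Significance
1.2 Research Status
1.2.1 Big Data Mining
1.3 Main Work of This Project
2 Overview of Related Technologies
2.1 The Python Language
2.2 Data Mining
2.2.1 Data Mining Methods
2.2.2 Data Mining Process
2.3 Machine Learning
2.3.1 Support Vector Machines
2.3.2 Random Forest Algorithm
3 System Analysis
3.1 Feasibility Analysis
3.1.1 Technical Feasibility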