Cancer Prediction using Data Mining technique



Introduction


Today, data mining is increasingly important in health care as well as in industry. It is mainly used to predict outcomes by analyzing data, and collecting, processing, and analyzing data are now an integral part of business methodology. Because data mining is widely used for analysis and modeling, this article applies it to cancer prediction. The article focuses on the back-end technologies needed to implement the project. If you are interested in this domain, read on. Skyfi Labs helps you learn more technologies and boost your career.


Project Description

This project covers the guidelines and basic knowledge required to implement cancer prediction using machine learning (ML) and data mining. It uses the history of the disease together with its conditions and criteria. Because the approach relies on predictive analysis, we need training and testing datasets; a single dataset can also be split to serve both purposes. It is a simple project, suitable for second-year engineering students who want to learn the practical implementation of the concept by building such a project. Follow this article for the primary guidelines of the project.


Implementation Guidelines

  1. The first step is to find a dataset. You can get one from Kaggle, one of the largest public platforms for sharing datasets. Many cancer risk prediction datasets are available, from sources such as the World Health Organization or private organizations.
  2. Depending on the dataset you choose, it may contain age, gender, calories, BP (blood pressure), and other health-related attributes.
  3. Import the required libraries, such as numpy, pandas, matplotlib, and seaborn.
  4. Here we use the Colab notebook by Google, which makes it easy to upload the dataset you have already downloaded. There is no need to install Python; just open the Colab console in your browser.
  5. The dataset has a diagnosis column containing the values M and B: M for malignant and B for benign.
  6. Encode the categorical values using the sklearn library. This converts M to 1 and B to 0 so that the model can work with numeric labels.
  7. Plot graphs with the seaborn library to understand the data better, and compute the correlation between the columns of the dataset.
  8. Split the dataset 75% and 25% for training and testing respectively, then scale the train and test data.
  9. Here we use the decision tree model, which gave the highest accuracy on the training and testing data. You can also choose the random forest model, but check its accuracy first.
  10. The last step is to print the predictions of the most accurate model. Print both the predicted results and the actual results so that you can compare them.
  11. Note that you can check the accuracy of each model on both the training and testing data and choose the most accurate model for better results.
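
The steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the project's actual notebook: it uses scikit-learn's bundled breast cancer dataset as a stand-in for the Kaggle CSV and rebuilds the M/B diagnosis column by hand. The variable names, the fixed `random_state`, the `stratify` option, and the choice to fit the scaler on the training rows only are illustrative choices that do not come from the article. Plotting with seaborn is left out so that the sketch runs anywhere.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumption: stand-in data. With a Kaggle CSV uploaded to Colab you would
# load your own file with pd.read_csv instead.
raw = load_breast_cancer(as_frame=True)
df = raw.data.copy()
# scikit-learn stores 0 = malignant, 1 = benign; rebuild the M/B column.
df["diagnosis"] = raw.target.map({0: "M", 1: "B"})

# Step 6: encode the labels. LabelEncoder sorts classes, so B -> 0, M -> 1.
encoder = LabelEncoder()
df["diagnosis"] = encoder.fit_transform(df["diagnosis"])

# Step 7: correlation of every feature with the encoded diagnosis.
correlations = df.corr()["diagnosis"].drop("diagnosis")
print(correlations.sort_values(ascending=False).head())

# Step 8: 75% / 25% split, then scaling (fitted on the training rows only).
X = df.drop(columns="diagnosis")
y = df["diagnosis"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Steps 9 and 11: fit both models and record train and test accuracy.
models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = {
        "train": accuracy_score(y_train, model.predict(X_train)),
        "test": accuracy_score(y_test, model.predict(X_test)),
    }
    print(name, scores[name])

# Step 10: print predicted and actual labels for the best model on test data.
best_name = max(scores, key=lambda n: scores[n]["test"])
predictions = models[best_name].predict(X_test)
print("best model:", best_name)
print("predicted:", predictions[:10])
print("actual:   ", y_test.to_numpy()[:10])
```

Which model comes out ahead depends on the dataset and the split, so compare the printed test accuracies rather than assuming one model is better. A training accuracy far above the test accuracy usually suggests the model is overfitting.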
Benefits

Model Understanding

Practice on Colab platform

These are some basic guidelines for the cancer risk prediction project. You can research further to improve the project. Stay connected with Skyfi Labs.



