Paper Title
Risk Classification for NSCLC Survival Using Microarray and Clinical Data

Abstract
Lung cancer is one of the most common cancers worldwide, and Non-Small Cell Lung Cancer (NSCLC) is its most dangerous and most common type. Predicting NSCLC survival is therefore of paramount importance, so that suitable treatments can be chosen. However, conventional risk classification of cancer survival relies solely on histopathology data, and its predictions are unreliable in many cases. In this paper, we propose a risk classification model that uses high-throughput gene expression data together with clinical data to predict NSCLC survival. We use the Gain Ratio (GR) and Improved Gene Expression Programming (IGEP) algorithms for attribute selection, and a Support Vector Machine (SVM) evaluated with 10-fold cross-validation for classification. The results demonstrate the effectiveness of the proposed model, which achieves an average accuracy of 90.7%, higher than that of other representative models. We identified three genes, LCK, DUSP6, and ERBB3, which together with the clinical factors T_stage and N_stage yield good prediction results.

Index Terms- Lung cancer, risk classification, microarray dataset, clinical data.
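
To make the Gain Ratio criterion named in the abstract concrete, the following is a minimal sketch of how such a score could be computed for a single attribute. It is an illustration under stated assumptions, not the implementation used in this paper: the function names `entropy` and `gain_ratio`, the discretization of expression levels into "up"/"down", and the toy risk labels are all hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Gain ratio = information gain / split information.

    Assumes the attribute has already been discretized; continuous
    expression values would need binning first (an assumption here).
    """
    n = len(labels)
    # Group class labels by the attribute value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the attribute.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: four patients with a binary risk label.
risk = ["high", "high", "low", "low"]
informative_gene = ["up", "up", "down", "down"]    # separates the classes
uninformative_gene = ["up", "down", "up", "down"]  # independent of the class

scores = {
    "informative_gene": gain_ratio(informative_gene, risk),
    "uninformative_gene": gain_ratio(uninformative_gene, risk),
}
```

Attributes would then be ranked by score and the top-ranked ones passed to the next selection stage; how many attributes to keep is a design choice that this sketch does not fix.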