Applying Machine Learning Techniques to Identify Top Stocks
In this paper, we apply three machine learning techniques to stock data: hierarchical clustering, principal component analysis, and logistic regression. The objectives are, first, to identify the most important technical factors associated with “good” stocks and, second, to identify a cluster of stocks that may be classified as “good”. Using only technical data, two principal components were identified, “Price High” and “Price Growth”, which explained 46% and 29% of the variability in the stock data, respectively. Hierarchical clustering was used to determine which stocks were similar. Notably, stocks that scored high on the factor analysis were also grouped together by the hierarchical clustering method. Further, a logistic regression model predicted the top five stocks to go up in price; these same five stocks were identified by the hierarchical clustering technique. The logistic regression model had a sensitivity of 80%.
Keywords- Machine learning model; logistic regression; principal component analysis; hierarchical clustering; stocks; stock trading
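
For readers unfamiliar with the sensitivity figure quoted in the abstract, the following minimal sketch shows how sensitivity (the true positive rate) is conventionally computed for a binary classifier, where 1 marks a stock whose price went up. The function name `sensitivity` and the label vectors are hypothetical and chosen only for illustration; they are not the paper's data or code.

```python
def sensitivity(actual, predicted):
    """Return TP / (TP + FN) for binary labels (1 = price went up)."""
    pairs = list(zip(actual, predicted))
    # True positives: stocks that rose and were predicted to rise.
    true_pos = sum(1 for a, p in pairs if a == 1 and p == 1)
    # False negatives: stocks that rose but were predicted not to.
    false_neg = sum(1 for a, p in pairs if a == 1 and p == 0)
    if true_pos + false_neg == 0:
        raise ValueError("no positive cases in `actual`")
    return true_pos / (true_pos + false_neg)


# Illustrative labels only (assumed, not taken from the paper).
actual = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]

print(sensitivity(actual, predicted))  # 0.8
```

In this made-up example, four of the five stocks that actually rose are flagged by the classifier, giving a sensitivity of 0.8. Note that sensitivity ignores false positives, so it says nothing about how many stocks were wrongly predicted to rise.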