Spatio-Temporal Stream Analysis Based On Hadoop And R Cooperating Via Relational Database
This paper presents how Hadoop, MySQL, and R cooperate on different machines to analyze the diverse data streams generated in smart grids, with particular attention to integrating geographic information. The analysis scenarios benefit from the big data processing of Hadoop, the information sharing of MySQL, and the rich statistical analysis of R. Specifically, 1) R alone plots the geographic information obtained from an ESRI shapefile, which is not shared intensively. 2) R combines the location information with district-by-district statistics downloaded from MySQL. 3) Hadoop first summarizes a potentially massive amount of data into daily reports and stores them in MySQL. R then downloads the reports and conducts trend pattern analysis using its neural network library. For these procedures, R uses the RMySQL, GISTools, and neuralnet packages. The framework can host additional components for more sophisticated analysis of ever-growing smart grid entities such as electric vehicles.
Keywords: Smart Grid, Massive Data Stream, Hadoop, R Package, MySQL Database.
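
The third scenario in the abstract (summarize in a map/reduce step, store the daily report in a relational table, then download it for analysis) can be illustrated with a minimal sketch. It is not the authors' implementation: it is written in Python rather than Hadoop and R, uses the standard-library sqlite3 module as a stand-in for MySQL, and assumes a hypothetical record layout (timestamp, district, kWh) and a hypothetical table named daily_summary.

```python
import sqlite3
from collections import defaultdict

# Hypothetical meter readings: (ISO timestamp, district, energy in kWh).
# The record layout is an assumption made for illustration only.
READINGS = [
    ("2013-05-01T00:15:00", "north", 1.5),
    ("2013-05-01T12:30:00", "north", 2.5),
    ("2013-05-01T08:00:00", "south", 4.0),
    ("2013-05-02T09:45:00", "north", 3.0),
]


def map_reading(record):
    """Map step: emit ((day, district), kWh) for one raw reading."""
    timestamp, district, kwh = record
    return (timestamp[:10], district), kwh


def reduce_daily(pairs):
    """Reduce step: sum the kWh values that share a (day, district) key."""
    totals = defaultdict(float)
    for key, kwh in pairs:
        totals[key] += kwh
    return dict(totals)


def store_summary(conn, totals):
    """Write the daily report into the shared relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_summary ("
        "day TEXT, district TEXT, total_kwh REAL, "
        "PRIMARY KEY (day, district))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO daily_summary VALUES (?, ?, ?)",
        [(day, district, total)
         for (day, district), total in sorted(totals.items())],
    )
    conn.commit()


def load_district_series(conn, district):
    """Analysis side: download one district's daily series, ordered by day."""
    rows = conn.execute(
        "SELECT day, total_kwh FROM daily_summary "
        "WHERE district = ? ORDER BY day",
        (district,),
    )
    return rows.fetchall()


# In-memory database used here in place of a MySQL server.
conn = sqlite3.connect(":memory:")
store_summary(conn, reduce_daily(map(map_reading, READINGS)))
north_series = load_district_series(conn, "north")
print(north_series)
```

The relational table is the only interface between the two sides in this sketch: the summarizing side writes one row per day and district, and the analysis side reads an ordered series that could then be fed to a trend model.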