A Data Mining Infrastructure For Cheminformatics

The rapid growth of data sources for chemical and biological information calls for new methodologies for mining useful information. Such data sources give us an opportunity to use computational tools to extract useful information and to find new patterns in data sets that explain scientific phenomena not yet known. It is also important that non-expert users can access the latest cheminformatics methodologies and models, so that new discoveries spread. We present our previous developments in cheminformatics procedures and infrastructure that provide an appropriate approach to mining large chemical datasets. We also discuss the limitations and challenges of that previous work and propose a new infrastructure built on state-of-the-art techniques that are expected to improve performance.

Index Terms - Cheminformatics, Workflow, Web service, Big Data.