Emerging Technology: Big Data-Hadoop over Data Warehousing ETL
Over the last few years, organizations across the public and private sectors have made a strategic decision to turn big data into a competitive advantage. The challenge of extracting value from big data is similar in many ways to the age-old problem of distilling business intelligence from transactional data. At the heart of this challenge is a process that extracts data from multiple sources, transforms it to fit analytical needs, and loads it into a data warehouse for subsequent analysis. This process is known as “Extract, Transform and Load” (ETL). Many traditional ETL tools, such as Informatica, Ab Initio and DataStage, are available on the market and popular in industry. Why, then, has open source Apache Hadoop gained so much importance, and why is it changing the industry? In fact, many companies have started research on big data with Apache Hadoop. This has raised a common question: “Is Hadoop replacing the relational database and becoming the new data warehouse?” This paper clears up the confusion that beginners who want to learn and work with Hadoop commonly face. It also sheds light on the differences between Hadoop and the data warehouse, and helps architects identify when to deploy Hadoop and when it is best to use a data warehouse. The paper also introduces a new way of performing the ETL process for business intelligence decisions in Apache Hadoop, using ELT (Extract, Load, Transform) and ELTL (Extract, Load, Transform, Load). A thorough analysis was carried out of various research papers, and opinions were collected from researchers at multinational companies including IBM, Intel, Teradata Corporation, Infosys, DataStax, Cloudera and many others.
Keywords: ETL, Big Data-Hadoop, Informatica, Ab Initio, DataStage, Data Warehousing, ELT, ELTL