Introduction to Hadoop - adopting the Big Data platform in your organization
12:45 - 13:15, 18th of May (Thursday) 2017 / TECH STAGE
With more than 50 Hadoop-related projects at the Apache Software Foundation alone, getting started with the technology can be hard. Understanding the key concepts and motivation behind Hadoop and its distributed file system is fundamental to effective development on the platform. After covering these basics in the talk, I will show you, based on my experience, what the technical adoption of Hadoop in an organization may look like. To help you understand the process, I will use a made-up example: a simplified data model of a bank transaction system. Let’s take it from there and go through Big Data platform architecture design, tools for data processing and retrieval, defining workflows on Hadoop, and their orchestration. The Apache projects from the Hadoop ecosystem that I will refer to during the talk include Hive, Sqoop, Oozie, Spark, Avro, and Parquet.