Big Data: Introduction and Its Ecosystem


Karim Benzidane

Big Data is a term that has been around for quite some time, yet there is still a lot to understand about what it is and what to do with it. The concept continues to evolve and is being adopted more and more widely. It is the fuel behind many of the major shifts under way in IT, including AI, Data Science, Security, and IoT. In this talk, we will get the lay of the land regarding Big Data, survey its various use cases and research directions, and showcase some of the most widely used tools and frameworks for it.
Karim Benzidane is a researcher at heart with more than 5 years' experience as an IT Security and Infrastructure Consultant, and an avid interest in Cloud Computing and Big Data through his research. He is a member of the OpenStack Foundation, as well as of the "Security as a Service" group of the Cloud Security Alliance, where he contributed to the writing of the SECaaS guide V1.0. His research focuses on the security issues of Cloud Computing, specifically at the Cloud infrastructure level, where he works on intrusion management through various Big Data techniques and mechanisms. He also holds several IBM certifications in Big Data, Cloud Computing, and Security, and delivers training in these areas.

Big Data Training Program

This tutorial is for those who are new to data science and want to understand how Big Data became what it is today. It introduces the basic concepts of Big Data and answers common questions about it. It is also a chance to get acquainted with the most common frameworks in the Big Data world, such as Hadoop, Spark, Storm, and HBase, and to learn how to handle the variety, velocity, and volume aspects of data.
Big Data Introduction
     • Big Data: definition and taxonomy
     • Big Data Use Cases
     • Data Science
Hadoop ecosystem
     • Introduction to the Hadoop framework
     • Map/Reduce
     • HDFS
     • YARN
Querying in Big Data
     • SQL in Big Data
     • Hive and HiveQL
     • Data access over Hadoop via Hive
Data Storage
     • NoSQL
     • HBase
Data Movement
     • Sqoop and Flume
     • Kafka
Real-time/Streaming in Big Data
     • Spark
     • Storm
     • Flink