More and more applications need scalable, high-throughput, fault-tolerant processing of live data streams. Internet of Things (IoT) applications are a prime example. This in-depth workshop gives you a jump-start on scaling distributed computing: you will take a sample time series application and code your way through different aspects of working with such a dataset.
What you will learn
The entire workshop takes 3 hours. You will get up to speed on emerging techniques and technologies by analyzing case studies, developing new technical skills, and sharing emerging best practices in big data (Hadoop, Spark, IoT, the Kafka API, and Apache HBase).
We will cover building an end-to-end distributed processing pipeline that uses various distributed stream input sources to rapidly ingest, process, and store large volumes of high-speed data.
Given the depth and breadth of the material, basic knowledge of Scala and Java is required for the exercises. They are intended to teach you the features of Spark Streaming for processing live data streams ingested from sources such as Apache Kafka, sockets, or files, and for storing the processed data in HBase.
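To make the ingest, process, and store stages described above more concrete, here is a minimal, dependency-free Java sketch. It deliberately does not use the Spark Streaming, Kafka, or HBase APIs, and it is not taken from the workshop material. The record format ("sensorId,epochMillis,value"), the sensor names, and the row key layout are all assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SensorPipelineSketch {

    /** One time series sample. The field layout is an assumption for this sketch. */
    static final class Reading {
        final String sensorId;
        final long epochMillis;
        final double value;

        Reading(String sensorId, long epochMillis, double value) {
            this.sensorId = sensorId;
            this.epochMillis = epochMillis;
            this.value = value;
        }
    }

    /** Ingest step: parse a hypothetical "sensorId,epochMillis,value" line. */
    static Reading parse(String line) {
        String[] fields = line.split(",");
        return new Reading(fields[0].trim(),
                Long.parseLong(fields[1].trim()),
                Double.parseDouble(fields[2].trim()));
    }

    /** Process step: highest value seen per sensor within one batch of lines. */
    static Map<String, Double> maxPerSensor(List<String> batch) {
        Map<String, Double> max = new LinkedHashMap<>();
        for (String line : batch) {
            Reading r = parse(line);
            max.merge(r.sensorId, r.value, Math::max);
        }
        return max;
    }

    /** Store step: composite row key, zero-padded so keys for one sensor sort by time. */
    static String rowKey(Reading r) {
        return r.sensorId + "_" + String.format("%013d", r.epochMillis);
    }

    public static void main(String[] args) {
        // Hypothetical batch of sensor lines standing in for a live stream.
        List<String> batch = Arrays.asList(
                "pump-1,1458000000000,71.5",
                "pump-2,1458000000000,64.0",
                "pump-1,1458000001000,73.2");
        for (String line : batch) {
            System.out.println(rowKey(parse(line)));
        }
        System.out.println(maxPerSensor(batch));
    }
}
```

In a real pipeline, the batch would arrive from a streaming source and the row key would be used when writing to a table. Here a plain list and printed output stand in for both, so the sketch runs on JDK 8 with no extra libraries.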
Additional information
https://www.mapr.com/services/mapr-academy/big-data-hadoop-online-training
https://www.mapr.com/blog/getting-started-sample-programs-apache-kafka-09
https://www.mapr.com/blog/getting-started-sample-programs-mapr-streams
https://www.mapr.com/blog/high-speed-kafka-api-publish-subscribe-streaming-architecture-how-works-message-level
https://www.mapr.com/blog/spark-streaming-hbase
https://www.mapr.com/blog/guidelines-hbase-schema-design
NB!
Participants should bring a laptop with internet access (a Wi-Fi connection will be available) and the following software installed:
- JDK 8
- Git
- Maven 3.x or later
- VirtualBox
About master-class instructors
Tugdual Grall
Technical Evangelist, MapR
Tugdual Grall is the Technical Evangelist at MapR, where he helps developers and enterprises adopt big data technologies. He is a big data and Spark expert and is frequently invited to speak at big data-related conferences. He is passionate about designing application architecture, building new products, and big data analytics.
Over the last 20 years, Tugdual has successfully led the development of several innovative technology products from concept to release. Prior to joining MapR, he was the CTO of eXo, after working at Oracle for nine years. Before eXo, he worked at a number of high-tech companies, leading new product development.
Crystal Valentine
VP Technology Strategy, MapR
Crystal Valentine is Vice President of Technology Strategy at MapR, a Silicon Valley-based big data company. She has an extensive background in big data research and practice. Before joining MapR, she was a professor of computer science at Amherst College. She is the author of several academic publications in the areas of algorithms, high-performance computing, and computational biology and holds a patent for Extreme Virtual Memory.
As a former consultant at Ab Initio Software working with Fortune 500 companies to design and implement high-throughput, mission-critical applications and as a tech expert consulting for equity investors focused on technology, she has developed significant business experience in the enterprise computing industry.
Dr. Valentine received her doctorate in Computer Science from Brown University and was a Fulbright Scholar to Italy.
LinkedIn: https://www.linkedin.com/in/crystal-valentine-29003a53