
Kafka: Building a scalable search infrastructure with Kafka Connect and Kafka Streams

09:30 - 13:00, Tuesday, 7th of May 2019 / WORKSHOP 4
Amber Expo

Trainer: Pere Urbon-Bayes (Confluent)

Workshop description:

In this workshop we will explore the best practices used in the industry to build scalable and reliable search architectures, and see how Apache Kafka, Kafka Connect and Kafka Streams can be used for this purpose.

After a quick review of the why and how, we’ll jump into a real-life e-commerce example of using Kafka Connect and Kafka Streams to build world-class search. Based on this example, we’ll practise building scalable indexing pipelines.

Although this workshop focuses on search, the same ideas apply to other data engineering tasks where the main objective is moving data from system A to system B, with some data transformation and preparation along the way.

In this workshop, we’ll explore hands-on:

  • Apache Kafka best practices as a streaming platform to connect your data.
  • Kafka Connect as a powerful tool to ingest your search documents into your search engine.
  • Kafka Streams as a valuable asset to enrich your data points before they reach your search engine.
  • Best practices for data enrichment, aggregation, cleansing, filtering and consolidation to scale your search solution.

To get the most out of this workshop you will need:

  • Previous basic experience of Apache Kafka as a streaming platform.
  • Previous knowledge of Elasticsearch or Solr as a search engine.
  • Basic familiarity with Docker, as it is the platform we will be using to run the workshop.
  • No previous knowledge of Kafka Connect or Kafka Streams is required, though it would certainly help.

Environment:

  • A laptop running a Linux distribution or macOS. If you have a Windows machine, you will need virtualization to run a Linux machine; the workshop materials have not been tested on Windows.
  • Latest version of Docker for Linux or Mac (18.06.0-ce | 1.22.0).
  • Java 8.
  • Your Java IDE of choice (Eclipse or IntelliJ).

Prework will be sent via email closer to the workshop.

Language: English.

Location: AmberExpo (Gdańsk, Żaglowa 11)

TOPICS:
Bigdata DataTech Kafka Workshop
