We are happy to announce our third event for 2017! This time, we welcome Konstantine Karantasis (Software Engineer at Confluent Inc), who will talk about how to build streaming pipelines with Apache Kafka & Confluent Open Source tools. Our second speaker is Alex Palamides (Data Scientist at Clayton Euro Risk), who will present the Microsoft R Server solution and show how to perform big data operations with it. At the end, Landoop engineers will show us their new powerful framework, Lenses for Apache Kafka ™, in a mini presentation.

1st Talk:

Title: Riding the Streaming Wave DIY style: Using & Building Kafka Connect Plugins with Confluent Open Source

Stream processing is changing the way companies organize their data systems architecture and respond to events critical to their business. In this talk, we’ll review how software available with Confluent Open Source can help you hit the ground running when integrating your data systems with Apache Kafka. We’ll see how the Kafka Connect API can be leveraged to do the heavy lifting at scale and how new tools in Confluent Open Source help you use, test and even develop Kafka Connect plugins.

Konstantine Karantasis is a Software Engineer at Confluent, Inc. working from Palo Alto, CA. He’s the main contributor to open source projects such as the Confluent S3 Connector, classloading isolation in Apache Kafka Connect, Confluent CLI and many more. Previously, he built open source web-services for big data at Yahoo and did HPC research at the University of Illinois at Urbana-Champaign. Konstantine holds a Ph.D. from the University of Patras.

2nd Talk:

Title: Analytics Beyond RAM Capacity: The Microsoft R Server Solution

R is a language and environment for statistical computing and graphics, similar to the S language developed at Bell Laboratories, and is considered one of the default choices for a data scientist. However, because by design all computations take place in RAM, it runs into memory limits in big data applications. Microsoft R Server (MSR), on the other hand, uses the capabilities of the RevoScaleR package to follow a different approach: datasets are stored on disk and computations are performed on chunks of data, so the data is inherently distributed. However, as most open-source R algorithms require the whole data frame to be loaded into RAM, the first challenge is to process distributed data indirectly using open-source R algorithms. On the other hand, in MSR the most common data operations (manipulation and analysis) are supported by counterpart functions. Moreover, the inherently parallel processing makes deployment to a production environment such as SQL Server or HDFS relatively easy. MSR runs either in standalone mode or within SQL Server, where it is branded as R Services.
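To make the chunked approach in this abstract concrete, here is a minimal Python sketch of the general idea: a statistic is accumulated one chunk at a time, so the full dataset never has to fit in RAM. It is an illustration only, not RevoScaleR or any Microsoft R Server API; the function name `chunked_mean`, the `amount` column and the sample data are hypothetical.

```python
import csv
import io
from itertools import islice


def chunked_mean(rows, column, chunk_size=1000):
    """Mean of `column`, holding at most `chunk_size` rows in memory at once.

    `rows` must be an iterator of dict-like records (e.g. csv.DictReader).
    """
    total = 0.0
    count = 0
    while True:
        # Pull the next chunk from the iterator; earlier chunks are discarded.
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            break
        total += sum(float(row[column]) for row in chunk)
        count += len(chunk)
    if count == 0:
        raise ValueError("no rows to average")
    return total / count


# Hypothetical in-memory sample; in practice `rows` would wrap a large file on disk.
sample = io.StringIO("amount\n10\n20\n30\n40\n")
mean_amount = chunked_mean(csv.DictReader(sample), "amount", chunk_size=2)
print(mean_amount)
```

Only running totals survive between chunks, which is why statistics such as sums, counts and means adapt easily to this style, while algorithms that need the whole data frame at once do not.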

Dr. Alex Palamides is a data scientist at Clayton Euro Risk, deploying risk and marketing models in the banking sector and programming mainly in R. Previously he was with IRI (EU and US), with the European Space Agency and in various consulting roles. He holds a BSc in Electronic and Computer Engineering from the Technical University of Crete and a PhD in Computational Statistics from the University of Peloponnese.

Mini presentation:

Landoop will give a short introduction and presentation.

Last week, during the Kafka Summit (San Francisco), Landoop announced their new powerful framework, Lenses for Apache Kafka ™ (a visual interface for interactive queries on Kafka topics via Kafka SQL).

Landoop is a company based in London, Amsterdam and Athens.


