Creating a real time analytical database using Kafka, CDC events and Kudu
Niklas Nylund / PAF

Intro
PAF
● PAF, or Ålands Penningautomatförening
● Founded in 1967, first internet business launched 1999
● PAF provides online and offline gambling & casino services
● Main markets are Finland, Sweden, Estonia, Spain

Key figures
● Revenue 113,5 MEUR (2016)
● Profit 15,2 MEUR (2016)
● Number of employees 376
● Development office in Helsinki with ~90 employees

Data pipeline & Kafka?

Requirements
Non functional requirements
● Provide a near real time data platform that can serve current and future needs for analytics, reporting and machine learning
● Provide a setup with less spaghetti and improve development speed

Functional requirements
● No hard real time requirements. Lag preferably < 10 seconds, temporary spikes allowed
● Reliably handle financial transactions, millions of records per 24 hours. Expected to increase
● Handle logs, metrics, file imports etc. from around 100 services
● Handle at least 10x more data compared to the current DW, including combining different data sets such as financial transactions, click streams, data from both online and offline business, application metrics and logs

Key decisions
Architectural choices
● Exactly once semantics or deduplication
● Idempotent writes
● Lambda architecture - streaming / batch
● Late arriving events - event time vs processing time
● Event sourcing
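
A minimal sketch of how deduplication, idempotent writes and late arriving events can interact, assuming at-least-once delivery and a keyed target table. The `Event` fields, the `upsert` helper and the sample data are hypothetical and only illustrate the idea; they are not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    key: str         # business key, e.g. an account id (hypothetical)
    event_time: int  # when the change happened at the source, epoch millis
    value: int       # payload, e.g. a balance (hypothetical)


def upsert(table: dict, event: Event) -> bool:
    """Apply an event to a keyed table; return True if the row changed.

    Idempotent: applying the same event twice leaves the table unchanged.
    Late arrivals: an event that is not newer than the stored row is ignored.
    """
    current = table.get(event.key)
    if current is not None and current.event_time >= event.event_time:
        return False  # duplicate or late event, keep the stored row
    table[event.key] = event
    return True


events = [
    Event("acct-1", 1000, 50),
    Event("acct-1", 2000, 70),
    Event("acct-1", 2000, 70),  # duplicate delivery (at-least-once)
    Event("acct-1", 1500, 60),  # late arrival, older than the stored row
]

table: dict = {}
applied = [upsert(table, e) for e in events]
print(applied)
print(table["acct-1"].value)
```

Comparing on event time rather than arrival order is what makes replays and out-of-order delivery harmless here. Two different events with the same event time would need an extra tie-breaker, such as a source sequence number, which this sketch leaves out.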

Kafka specifics
● Use Kafka Connect vs roll your own?
● No global ordering across topics
● Kafka Streams vs Kafka client library?
● Which serializer? Avro, JSON, Protobuf, custom?
● Use the Schema Registry from the Kafka stack?
● Reliable producers & consumers

Other things to consider
Downstream database
● Good UPSERT performance needed for maintaining snapshot tables
● We needed a JDBC interface for downstream reporting tools
● Columnar storage format preferred for fast analytical queries
● Apache Kudu vs AWS offering

Operational aspects
● Monitoring
● Disaster recovery

Final solution
Producing into Kafka
● In-house microservice sets up triggers for CDC style tables
● Retry log kept in SQL DB in the same transactional scope as the business logic
● Use the Kafka stack's Schema Registry and Avro
● Convert SQL to Avro schema using metadata from the JDBC driver
● One topic per source view
● Multiple tables joined together into views where necessary
● One JVM thread per source view, run in a Docker container with X threads per container
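
The SQL to Avro conversion could look roughly like the sketch below. It assumes the column metadata has already been read from the JDBC driver into plain dictionaries; the `columns` list, the `transaction_view` name and the selection of mapped types are illustrative assumptions, not the mapping used in the talk. The type names on the left are `java.sql.Types` constants, and the Avro side uses the primitive and logical types from the Avro specification.

```python
import json

# java.sql.Types name -> Avro primitive type (a partial, assumed mapping).
PRIMITIVES = {
    "BOOLEAN": "boolean",
    "INTEGER": "int",
    "BIGINT": "long",
    "REAL": "float",
    "DOUBLE": "double",
    "VARCHAR": "string",
    "VARBINARY": "bytes",
}


def avro_type(col: dict):
    """Return the Avro type for one column description."""
    sql_type = col["type"]
    if sql_type == "TIMESTAMP":
        base = {"type": "long", "logicalType": "timestamp-millis"}
    elif sql_type == "DECIMAL":
        base = {
            "type": "bytes",
            "logicalType": "decimal",
            "precision": col["precision"],
            "scale": col["scale"],
        }
    elif sql_type in PRIMITIVES:
        base = PRIMITIVES[sql_type]
    else:
        raise ValueError(f"unmapped SQL type: {sql_type}")
    # Nullable columns become a union with "null" listed first.
    return ["null", base] if col["nullable"] else base


def avro_schema(name: str, columns: list) -> dict:
    """Build an Avro record schema from a list of column descriptions."""
    fields = []
    for col in columns:
        field = {"name": col["name"], "type": avro_type(col)}
        if col["nullable"]:
            field["default"] = None  # serialized as JSON null
        fields.append(field)
    return {"type": "record", "name": name, "fields": fields}


# Hypothetical metadata for one source view.
columns = [
    {"name": "id", "type": "BIGINT", "nullable": False},
    {"name": "amount", "type": "DECIMAL", "nullable": False,
     "precision": 18, "scale": 2},
    {"name": "created", "type": "TIMESTAMP", "nullable": True},
]

schema = avro_schema("transaction_view", columns)
print(json.dumps(schema, indent=2))
```

Raising on an unmapped type, rather than falling back to `string`, keeps a schema change in the source database from silently producing a weaker Avro schema.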

Consuming from Kafka
● In-house microservice that can read any Avro topic and write it to Apache Kudu
● Convert Avro schema to Kudu schema
● Sets up tables, hash & range columns, partitions etc.
● Manual Kafka offset handling to deal with errors thrown from Apache Kudu
● Maintains a changelog table of all Kafka records and a snapshot/compacted table based on Kafka record key
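
The manual offset handling and the changelog/snapshot pair can be illustrated with a small in-memory simulation. This is a sketch under stated assumptions: `InMemorySink`, `SinkError` and `consume` are hypothetical stand-ins, no Kafka or Kudu client is used, and the retry policy is invented for the example. The point it shows is that the offset is only advanced after a successful write, and that both writes are upserts so re-reading a record is harmless.

```python
class SinkError(Exception):
    """Stands in for an error raised by the downstream database."""


class InMemorySink:
    """Hypothetical stand-in for the changelog and snapshot tables."""

    def __init__(self, fail_on_offsets=()):
        self.changelog = {}  # (partition, offset) -> (key, value): every record
        self.snapshot = {}   # key -> value: latest record per key
        self._fail_once = set(fail_on_offsets)

    def write(self, partition, offset, key, value):
        if offset in self._fail_once:
            self._fail_once.discard(offset)  # fail only the first attempt
            raise SinkError(f"write failed at offset {offset}")
        # Both writes are upserts, so replaying an offset changes nothing.
        self.changelog[(partition, offset)] = (key, value)
        self.snapshot[key] = value


def consume(log, sink, partition=0, max_attempts=3):
    """Write every record in `log` to `sink`; return the committed offset."""
    committed = 0  # next offset to read; advanced only after a good write
    attempts = 0
    while committed < len(log):
        key, value = log[committed]
        try:
            sink.write(partition, committed, key, value)
        except SinkError:
            attempts += 1
            if attempts >= max_attempts:
                raise  # give up and let an operator or supervisor react
            continue  # offset not committed: the same record is read again
        attempts = 0
        committed += 1  # "commit" the offset only now
    return committed


# One partition of a topic as a list of (key, value); the index is the offset.
log = [("u1", "a"), ("u2", "b"), ("u1", "c")]
sink = InMemorySink(fail_on_offsets={1})

committed = consume(log, sink)
print(committed)
print(len(sink.changelog))
print(sink.snapshot)
```

Keying the changelog on partition and offset is one way to make replays safe; the snapshot only needs the record key, since the last write per key wins.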

The end

Kafka Meetup 18 Oct 2017-1.pdf
