Creating a real time analytical database using Kafka, CDC events and Kudu
Niklas Nylund / PAF
Intro
PAF
● PAF, or Ålands Penningautomatförening
● Founded in 1967, first internet business launched 1999
● PAF provides online and offline gambling & casino services
● Main markets are Finland, Sweden, Estonia, Spain
Key figures
● Revenue 113,5 MEUR (2016)
● Profit 15,2 MEUR (2016)
● Number of employees 376
● Development office in Helsinki with ~90 employees
Data pipeline & Kafka?
Requirements
Non functional requirements
● Provide a near real time data platform that can serve current and future needs for analytics, reporting and machine learning
● Provide a setup with less spaghetti and improve development speed
Functional requirements
● No hard real time requirements. Lag preferably < 10 seconds, temporary spikes allowed
● Reliably handle financial transactions, millions of records per 24 hours. Expected to increase
● Handle logs, metrics, file imports etc. from around 100 services
● Handle at least 10x more data compared to the current DW, including combining different data sets such as financial transactions, click streams, data from both online and offline business, application metrics and logs
Key decisions
Architectural choices
● Exactly once semantics or deduplication
● Idempotent writes
● Lambda architecture - streaming / batch
● Late arriving events - event time vs processing time
● Event sourcing
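The first two choices are related: with at-least-once delivery, a consumer can still get an exactly-once effect by making writes idempotent or by dropping records it has already applied. Below is a minimal Python sketch of that idea, assuming each record carries a Kafka-style (partition, offset) position. The `Record` and `DedupingSink` names are hypothetical and not from the talk.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Record:
    partition: int
    offset: int
    key: str
    value: dict


@dataclass
class DedupingSink:
    """Applies each record at most once per partition, using the offset as a high-water mark."""
    rows: Dict[str, dict] = field(default_factory=dict)
    last_offset: Dict[int, int] = field(default_factory=dict)

    def apply(self, rec: Record) -> bool:
        # A redelivered record has an offset at or below the high-water mark: skip it.
        if rec.offset <= self.last_offset.get(rec.partition, -1):
            return False
        # Keyed overwrite: writing the same value twice leaves the same state (idempotent).
        self.rows[rec.key] = rec.value
        self.last_offset[rec.partition] = rec.offset
        return True


sink = DedupingSink()
batch = [
    Record(0, 0, "tx-1", {"amount": 10}),
    Record(0, 1, "tx-2", {"amount": 25}),
]
applied_first = [sink.apply(r) for r in batch]
# Simulate a redelivery of the whole batch after a consumer restart.
applied_again = [sink.apply(r) for r in batch]
```

Either mechanism alone would keep the final state correct here. The offset check additionally avoids redundant writes, while the keyed overwrite keeps the result safe even if the high-water mark is lost.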
Kafka specifics
● Use Kafka-connect vs roll your own?
● No global ordering across topics
● Kafka Streams vs Kafka client library?
● Which serializer? AVRO, JSON, Protobuf, custom?
● Use Schema registry from Kafka stack?
● Reliable producers & consumers
Other things to consider
Downstream database
● Good UPSERT performance needed for maintaining snapshot tables
● We needed a JDBC interface for downstream reporting tools
● Columnar storage format preferred for fast analytical queries
● Apache Kudu vs AWS offering
Operational aspects
● Monitoring
● Disaster recovery
Final solution
Producing into Kafka
● In-house micro service sets up triggers for CDC style tables
● Retry log kept in SQL DB in the same transactional scope as the business logic
● Use Kafka's schema registry and AVRO
● Convert SQL to AVRO schema using metadata from the JDBC driver
● One topic per source view
● Multiple tables joined together into views where necessary
● One JVM thread per source view, run in a Docker container with X threads per container
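The SQL-to-AVRO conversion step can be sketched as a type mapping over column metadata. This is a minimal Python illustration, not the in-house service. The `Column` tuple stands in for what JDBC's `DatabaseMetaData.getColumns` reports (column name, type name, nullability). The type mapping, the `sql_to_avro_schema` helper and the example view are assumptions.

```python
import json
from typing import List, NamedTuple


class Column(NamedTuple):
    name: str
    sql_type: str   # e.g. the TYPE_NAME reported by the driver
    nullable: bool


# Assumed mapping; a real service would cover more types (DECIMAL, DATE, ...).
SQL_TO_AVRO = {
    "INTEGER": "int",
    "BIGINT": "long",
    "DOUBLE": "double",
    "VARCHAR": "string",
    "BOOLEAN": "boolean",
    "TIMESTAMP": {"type": "long", "logicalType": "timestamp-millis"},
}


def sql_to_avro_schema(view_name: str, columns: List[Column]) -> dict:
    """Build an Avro record schema (as a dict) from column metadata."""
    fields = []
    for col in columns:
        avro_type = SQL_TO_AVRO[col.sql_type.upper()]
        if col.nullable:
            # Nullable columns become a union with "null" first so the default can be null.
            fields.append({"name": col.name, "type": ["null", avro_type], "default": None})
        else:
            fields.append({"name": col.name, "type": avro_type})
    return {"type": "record", "name": view_name, "fields": fields}


# Hypothetical source view with three columns.
schema = sql_to_avro_schema(
    "transaction_view",
    [
        Column("id", "BIGINT", False),
        Column("customer", "VARCHAR", False),
        Column("settled_at", "TIMESTAMP", True),
    ],
)
print(json.dumps(schema, indent=2))
```

Deriving the schema from driver metadata means a new source view needs no hand-written schema, which fits the one-topic-per-view layout above.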
Consuming from Kafka
● In-house micro service that can read any AVRO topic and write it to Apache Kudu
● Convert AVRO schema to Kudu schema
● Sets up tables, hash & range columns, partitions etc
● Manual Kafka offset handling to deal with errors thrown from Apache Kudu
● Maintains a changelog table of all Kafka records and a snapshot/compacted table based on Kafka record key
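The last two points can be sketched together. The consumer appends every record to a changelog, upserts the latest value per key into a snapshot, and advances its offset only after the write succeeds, so a failed write is retried instead of skipped. This is a minimal in-memory Python sketch under stated assumptions: plain lists and dicts stand in for Kudu tables and for the Kafka consumer, and `SinkError`, `write_to_sink`, `flaky_write` and `consume` are hypothetical names.

```python
from typing import Callable, Dict, List, Tuple

KafkaRecord = Tuple[int, str, dict]  # (offset, key, value)


class SinkError(Exception):
    """Stand-in for an error thrown by the downstream database."""


changelog: List[KafkaRecord] = []   # every record, in arrival order
snapshot: Dict[str, dict] = {}      # latest value per record key


def write_to_sink(rec: KafkaRecord) -> None:
    _, key, value = rec
    changelog.append(rec)
    snapshot[key] = value  # upsert: insert a new key or overwrite an existing one


def consume(records: List[KafkaRecord], committed: int,
            write: Callable[[KafkaRecord], None]) -> int:
    """Process records after `committed`; return the new committed offset."""
    for rec in records:
        if rec[0] <= committed:
            continue  # already written in an earlier attempt
        try:
            write(rec)
        except SinkError:
            break  # do not advance past the failed record; retry it on the next poll
        committed = rec[0]
    return committed


fail_once = {"pending": True}


def flaky_write(rec: KafkaRecord) -> None:
    # Fail the first attempt at offset 1 to simulate a transient sink error.
    if rec[0] == 1 and fail_once["pending"]:
        fail_once["pending"] = False
        raise SinkError("transient failure")
    write_to_sink(rec)


topic = [
    (0, "user-1", {"balance": 100}),
    (1, "user-2", {"balance": 50}),
    (2, "user-1", {"balance": 80}),
]

committed_after_first = consume(topic, -1, flaky_write)
committed = consume(topic, committed_after_first, flaky_write)
```

After the retry, the changelog holds all three records while the snapshot holds one row per key, with `user-1` at its latest value.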
Source file: Kafka Meetup 18 Oct 2017-1.pdf