We now live in a world with data at its heart. The amount of data being produced every day is growing exponentially and a large amount of this data is in the form of events. Whether it be updates from sensors, clicks on a website or even tweets, applications are bombarded with a never-ending stream of new events. So, how can we architect our applications to be more reactive and resilient to these fluctuating loads and better manage our thirst for data? In this session explore how Kafka and Reactive application architecture can be combined in applications to better handle our modern data needs.
Katherine Stanley: Hello everyone. Thank you so much for joining us today. It's morning for us, but we're so excited to be presenting to you about reacting to an event-driven world. My name's Kate Stanley, I'm a software engineer at IBM, and I work on a product called IBM Event Streams, which is a fully supported Apache Kafka offering. And today, I'm going to be presenting with Grace Jansen.
Grace Jansen: Yep. Hi. So, nice to meet you all. Thanks for joining us today. My name is Grace Jansen. I also work in the UK with Kate, working in our Hursley development site for IBM. I'm a developer advocate and I work mostly with reactive technologies on the Java ecosystem.
So, without further ado, let's get started. So, we're here to talk to you today about sort of reacting to an event-driven world, and how can we better build our applications to be able to deal with this never ending growth of data that we have. I've seen a few comments in the event chat talking about this and just the huge quantities of data that our applications are going to have to be dealing with and are dealing with now and are going to have to deal with in the future. The graph doesn't necessarily represent any particular numbers, it's just to show the exponential growth of that data.
And it's not just the growth that our applications have to deal with, it's also the fluctuating load to our systems, depending on the day of the week, the time of the day, even the season, it depends. So, our applications are having to deal with these fluctuating loads and these huge quantities of data, so how can we better architecture applications to really put this at the heart of our systems? Well, this is where event-driven architecture really comes into play.
So, you can see here's a very basic architectural diagram of an event-driven architecture. In event driven architecture, we have events, so state changes to the system at the heart of our applications. It's really putting those data changes, those events, right at the very center of our systems. And then, in event-driven architecture, we have microservices that are able to publish and subscribe to these events.
So, you can see here, we've got microservice number one, which is publishing events into that messaging backbone, that event-driven backbone there. And we've got microservices like service number two here, which is consuming, but microservices can also do both, so they can publish and consume and do some processing in the middle.
So, that's sort of what event driven architectures really are about, and it's all about putting events at the heart of our system. And in terms of why this is beneficial, there's plenty of examples that are out there that will show you what you can gain by having this event-driven messaging background at the center of your architecture.
So, let's look at one such example. We're going to get some coffee. So, this example is actually on GitHub, it's created by Clement Escoffier. You can go and check out the code if you want to, but we're just going to use this as an example to talk through this scenario.
So, let's say we have our coffee lovers, and they all want to go and get coffee. It would make sense to build an application where they come in and they make HTTP request to the coffee shop to place that order. Now, what we could do is we could then have the coffee shop do another HTTP request on to the barista to tell them, “Okay, now we need to make some coffee.” We can set this up, and if we just have a couple of coffee lovers coming to the coffee shop, then this works pretty well because the coffee lover arrives, they make their request to the coffee shop, say “I'd like some coffee.” They wait for a response. The coffee shop goes onto the barista. Barista makes the coffee and returns, and then it goes all the way back up the chain and the coffee lover gets their coffee.
But as soon as we start to have more users of our coffee shop, we run into a little bit of a problem, which is that, actually, the coffee lovers are having to wait. They're having to wait not only for the coffee shop to get the order and then forward it to their barista, but also for the barista to actually make the coffee and then come all the way back. And the barista and the coffee shop aren't available to react to or make any other orders, while they're working on that one particular order. So, if we had lots and lots of coffee lovers come into the shop, they would all get stacked behind, and we find these quite long waiting times, and it doesn't really give us a very responsive experience.
Look at what happens if we introduce this event backbone. So, it makes sense for us to continue the HTTP request from the coffee lovers into the coffee shop, because the coffee lovers want to know whether their requests to make coffee was successful. They might want to know, for example, you know, is the coffee shop open? If it isn't, then maybe they'll get a response back straight away that says, “No, the coffee shop’s not open. You have to come back another time”, so we can keep this here. But by adding an event backbone, in between the coffee shop and the barista, we can quite quickly get an increase in orders. So, the new flow would be: a coffee lover comes in and they make their request, and the coffee shop would create an order and put it onto the orders area of the event backbone, so in a little category. The coffee shop’s then free to go back to the coffee lover and say, “I've got your order. You can go, you wait for it. I'll let you know later once it's ready.” So then the barista can asynchronously process all of these orders and put them up into the queue to say, “This is the queue of orders that are now completed”, and those can then be published onto a board so that the coffee lovers can see that their coffee is ready and go and get it.
And actually, if you have a look online, we've got a link at the end, there's comparisons where people run the left hand side diagram versus the right and see the increase in performance. And if you run the example in GitHub and start putting lots of requests for coffee, you can really see that you start to get a bottleneck on the left hand side, whereas on the right it's all much more free flowing and you get a much more responsive experience for your application.
We've essentially looked at where we're encouraged to switch to this event-driven architecture, and a lot of people sort of ask the question once they've switched to this event-driven architecture, “Are my microservices and is my microservice-based system now nonblocking and highly responsive?” And it's because we have all these great examples that say, “Yeah, it's great. You just put an event backbone in the middle and it solves all your problems.” It's really tempting to answer: “Yes, I'm using Kafka! Therefore, it must be nonblocking and highly responsive!”
No! Although Kafka is an amazing tool to use when we're building sort of resilient and reactive systems, it's not everything you need. It doesn't cover all of the aspects that we need to make our applications truly nonblocking and highly responsive, which is what we're aiming for with our users. So, how do we actually achieve that?
So, to make a highly responsive app that's really resilient and really responsive, we need to think about every layer of the application. So, Kafka is a really amazing tool when we're looking at the data layer of our application, but our applications aren't just made up of data. So, what we need to look at is the different microservices and how they all interact.
Let's have an example application here with three different microservices. Now,
your enterprise applications probably have a lot more than that three, but this is just a very basic example, sort of to lead you through it. So, the first stage is “Okay, what about internally, in our microservices? How do we make those reactive and responsive?” So that's where we can introduce these concepts like futures, we can introduce reactive programming, libraries, and concepts like reactive stream specification. And we basically call this reactive programming.
So, this is all about controlling the internal logic of your microservices to allow things like asynchronous processing. So, it's all about making the internal logic in your microservice reactive and responsive and nonblocking. But it's not just about an individual microservice because, as I mentioned, our applications are not just made up of one microservice. So, we need to think about how those microservices interact with each other, interact with our data layer and interact with third party components or other applications. And to look at that, we have to look at reactive systems, architectural patterns. So, we have to look at, “Okay, how am I going to create a nonblocking communication between all of these different components?”
So, let's take a closer look at reactive programming. So, here's a basic definition of what reactive programming is, and this does differ, so this is just one example of a definition.
So, it's essentially all about, as I mentioned, asynchronicity, so making sure it's nonblocking and it's about pushing the availability of new information and that's what drives the logic forward within those microservices. So, as I said, it's a logic management within your microservice, and to enable that you can use these different patterns, so we have futures, reactive programming libraries and the reactive stream specification.
So, if you haven't come across futures before, that's essentially a promise to be able to hold the result of an operation once the operation completes. It's all about an asynchronous response, essentially.
We've got reactive programming libraries, like RxJava and SmallRye Mutiny, which are used in some of the frameworks that we'll go over later on in the presentation. And then we've got the reactive stream specification, which is all about handling asynchronous data in a nonblocking manner whilst providing backpressure, which is an important part of reactive systems. But we won't focused too much on reactive programming.
Let's take a look at reactive systems as well. In order to create reactive systems, something called the Reactive Manifesto was created back in 2013. It's essentially, it's a manifesto that lays out four key cornerstones that you need to be thinking about if you want to make your application truly nonblocking and reactive.
So, the first of those is message driven. So, it's all about having an asynchronous messaging backbone within our systems, and that's sort of the bottom cornerstone here. So, message driven is all about having this asynchronicity and enabling our different components to communicate without having blocking in there. That is the sort of underlying cornerstone for the other cornerstones in the Reactive Manifesto.
The two middle ones are elasticity and resiliency. Elasticity is all about being able to both scale up and scale back down the different components and resources in our applications, and this enables us to better manage load and manage those fluctuating loads depending on which aspects of our systems have most traffic at the moment, whether we have more visitors to our site than we usually do, for example, and being able to do that in a cost efficient manner, so not just buying huge amounts of resource and only using a minuscule amount for the just in case scenario that we have lots and lots and lots of load on our system. So, being able to scale that up and down.
Then we have resiliency. So, resiliency is all about being able to respond gracefully to failure. So, not letting our entire application crash just because one tiny little microservice failed. Being able to build-in that resiliency because, as software engineers, no one is able to predict every failure that could potentially happen to our application. It’s just not possible, because new failures pop up every day. And so, building in that resiliency into our application means that we can gracefully respond to those and still have most of our application up and running, and we can then go and reinstantiate the microservice that failed or give more resource to it, etcetera.
Resiliency is arguably one of the most important cornerstones here because without resiliency, our applications are probably going to fail and crash. And without our application, we can't make a reactive application. So, resiliency is a very important cornerstone here.
Now, these three cornerstones all lead up to the last cornerstone, which is responsiveness. This is really what being reactive means. It means being able to respond to events, to state changes and to any actions the user may take on our applications as soon as possible, as quickly and efficiently as we can.
So, these are the different aspects of the Reactive Manifesto, but how do we actually achieve this when we're building and designing our applications? So, there are several reactive architectures or design patterns that you can use. Here are just four of some of the most commonly used ones. So, we have CQRS, event sourcing, saga and sharding.
Sorry, I'll move my mic closer, I can see some comments saying my voice isn't clear. Hopefully this is better. Let me know if it's not.
So, CQRS is all about sort of splitting out that read and write APIs. So, CQRS actually stands for Command Query Responsibility Segregation, and it's, yeah, just spitting out those read and write APIs.
Then we've got event sourcing. So, you can see here, we've got the Kafka symbol, we'll go onto that in a bit more detail later. So, event sourcing is all about persisting the state, such as orders or customer requests as a sequence, so as an immutable log of events, so sort of like I mentioned before.
And then we've got saga. So, saga is all about taking traditional transactions that we would have normally done on a monolithic architecture, but instead of doing them like that, we do it in a distributed manner. So, we create sort of micro transactions that have fullback behavior to account for when things potentially go wrong part way through. So, it's all about having these micro local transactions.
Then we’ve got sharding. So, sharding is all about distributing and replicating the data across a pool of databases that don't share the same hardware or software.
So, it helps to enable greater parallelism within our databases.
So, these are just some of the architectural patterns you can use in your own application when you're trying to create reactive nonblocking applications. We're going to go into more detail on the event sourcing and specifically how we can use Kafka to achieve this.
So, let's have a look at Kafka. For those of you who aren't aware, Kafka is becoming the de facto event streaming platform. That's how it gets described. And it's a platform that is being used more and more in these responsive and reactive applications, but we've already said that it's not good enough just to put Kafka in your architecture diagram and then say “I'm done. That's it.”
So, how can we actually configure Kafka for a reactive system? Well, let's have a look at the Reactive Manifesto and the different elements and how they applied to Kafka. We'll start with message driven. So, the core system relies on asynchronous message passing. So, we need Kafka to provide us with this capability. Kafka is an asynchronous messaging system. So, it does get the asynchronous bit sorted, but it's worth just drilling a little bit into what a message is, and how that's different to an event, because both of those words are used often interchangeably.
So, from the point of view of the Reactive Manifesto, they described messages as an item of data that's sent to a specific location, whereas an event is a signal that's emitted upon reaching a given state. So, you could imagine an event being something like “An order has been placed”, whereas a message could be, “I want you to place this order”. A message can contain an encoded event in its payload. So, it's not that you have to have one or the other, you could have a message that has an event inside it.
So, how does this even apply to Kafka? Well, I already said that Kafka is often described as the de facto event streaming platform, but if we look at what Kafka does, and its main purposes, it has nothing to do with specifically messages or events. So, Kafka is an open source, distributed streaming platform, and there are three main characteristics that they have on their landing page:
Using Kafka, you can publish and subscribe to streams of records, You can store records in a durable way, and you can process streams of records as they occur. It doesn't say “message” or “event” anywhere on here, just says “record”, so the messages or events that we flow through Kafka are described as “Kafka records”, and from Kafka’s point of view, it doesn't mind whether you send in events or messages, that's completely up to you. So, from the message driven angle, we can use Apache Kafka, even though when it was originally created, it was aimed more at distributing these events and being distributed [inaudible].
So, let's have a look next at resiliency in Kafka, and we'll start with Kafka itself. And the great news here is that Kafka is designed to be resilient. That's part of the reason it's distributed. It gives better resiliency and we'll see exactly how that works.
So, for a particular Kafka cluster, it's made up of one or more brokers. So, these are individual containers or machines that are running Kafka and normally you would start with at least three and you'll see why for the resiliency in a minute. So, we have these machines that are brokers, and this is where all of our records are going to go.
And in Kafka, groups of like records are separated down into what's called topics. So, if we think back to our coffee shop demo, a specific topic would be the orders topic or the queue topic. And for a particular topic that's broken down again into partitions, we'll come back to partitions a little bit more later on, but we have the topic is the main grouping and then partitions within a topic, but for a specific topic and a specific partition, there is one broker, which is designated as the leader for that topic, in that partition, all of the other brokers can then be referred to you as followers.
And the way that Kafka works is that any Kafka clients that come in and want to create records or take records off of Kafka and read them off, they will just talk to the leader. So, all of the records are going directly to the leader and everyone's reading them in front of the leader. But what Kafka will do under the covers, every record that is created on the leader will get replicated to the followers. This is being done by Kafka automatically. You don't have to do anything. It just happens. The only thing you have to do is tell Kafka how many times you want to have your topic replicated. And what this means is that if I've got these running in different machines, if one of them were to go down, I've not lost my leader.
But the crucial thing here for resiliency is I haven't actually lost any data because it was already being replicated onto the other brokers. What Kafka will do is it will automatically trigger a leader election. And the two remaining brokers here will between them decide a new leader. So, our original leader is now offline and all of the Kafka clients can just swap over and start talking to the new leader. So, in this way, Kafka provides resiliency built in and it means that it can quite happily handle failures. And I said earlier that we want to have at least three. And the reason for that is because if we've got one going down, we've now got two left. That you can continue running and actually, with Kafka, you may find that at times you want to restart the broker.
So, if you've got at least three replicas, then it means that in this scenario, when one of your leaders has gone down, you could still, if you need it to, take down one more, for maintenance or something like that, so it means you're being able to update your Kafka configuration and handle unexpected failures at the same time.
Just a quick question, Kate, because it relates to the slide. We've got a question saying: How will Kafka clients be notified about a new leader?
So, it's a great question. Kafka will tell them. So basically, when the connection comes in, the Kafka clients are configured to know about all three brokers. So, all of the Kafka clients are configured so that they will react appropriately and talk to the right place. So normally, when the client connects, it won't actually know in advance which leader it’s going to talk to. It will just pick one of the brokers and it will say, “Hi, I'm trying to talk to this topic, this partition”, and the broker will actually tell it which particular broker it needs if it's not the one it's currently talking to. So, the clients will, under the covers, find they can't talk to their current broker anymore, and because they have a list of all of the available brokers, they can just go talk to another one and then get directed to the correct leader. So, you and your application logic don't have to do anything, because the Kafka clients are already configured to understand when Kafka has changed leader and how to find the new leader and where to go next.
So, we've talked about the core of the brokers, but what about producers and consumers? We need to make sure that our records are resiliently getting into Kafka and then being consumed there. So, we'll start with producers. When you're producing to Kafka, generally there's two kinds of guarantees that people talk about, which is at most once, or at least once.
You can get some exactly once in Kafka, but you have to be very careful about how you configure it, and there are only certain use cases where it really makes sense. So, at most once, at least once is normally what you start off with, and you get to at least once or at most once by using acks and retries. If you want a system that is really resilient and you want to make sure your records definitely make it to Kafka, then you need to aim for at least once delivery.
This does mean that you could be at risk of duplicates, so you'll have to think about how you're going to handle that. The configuration options that you can use for that, “acks” is short for “acknowledgements”, and that basically means at what point do you want to wait for the acknowledgement from Kafka? So, you can just do a far and forget, you send the record to Kafka and don't worry if it really got there, you can wait for the leaders to respond and say that the leader got it, or you can wait for the leader and all the replicas to replicate and then get the response. So, depending on how sure you want to be that they get in there, you can then choose your acks accordingly.
The other thing is retries. So, this comes in both from the point of view of making sure that it wasn't a failure sending in the first place, but also, if a leader election happened, you might have to retry to a different location. So often, the Kafka clients will just have a configuration option that you set, and under the covers they'll do the retries for you, but you will need to work out how to handle errors if it's retried a few times and then failed.
So, that's producers, let's have a look at consumers. So, we've looked at, as Kate said, how we can introduce resiliency into producers, and we've looked at the resiliency within the Kafka brokers themselves. What about these resilient consumers? So, as we were mentioning, there were several configuration options you need to consider when you're looking at these to make better use of the resiliency within your application. With consumers, it's a configuration value called the “committing offset”. So, an offset is essentially a value assigned to a record along a topic. It goes from zero upwards in order, and what this basically means is that, as your consumer is consuming the records in this topic, it is able to commit those offsets, saying it has received those, and if it goes down, for example, and fails, when it comes back up, it knows where to start off again, so it doesn't replicate or duplicate the records that it's already processed.
So, there are two different ways of committing this offset. You can either choose to do a manual configuration of committing the offset or automatic. So, to take a better look at this, we'll look at an example again with coffee because it's early in the morning and we all need coffee, and we'll look at what the difference is between manual and automatic committing offset.
So, we've got our barista and we've got three coffee orders, so we've got three records in our topic for coffee. So, our barista, this is with the auto commit. So, this is basically using sort of- Usually, auto commit is based on a timer, so it just automatically commits the offset after a certain amount of time of reading it.
So, our barista takes its first order, the coffee, and it starts processing that coffee order, so it starts making that coffee. Now, because we're in an automatic configuration, because it's based on a timer, we commit that offset to say, “Yep, we've done the coffee order. Next time, if we go down, we can continue to cappuccino.
But unfortunately, as the barista, so as our broker is essentially processing this record, so we're making the coffee, our barista drops the coffee, and so this is essentially the equivalent of our broker going down before finishing processing that record. And so, what that means is that we no longer have a coffee to serve.
So, when our broker comes back up, when our barista relooks at the topic, it's going to start a cappuccino because it's already committed the offset for coffee. So now, our barista goes on to make a cappuccino and then a latte. So, at the end of it all, we only have a cappuccino and a latte. We no longer have that coffee order because we lost it. You can see here how we don't have full resiliency if we are using auto commit.
Let's take a look at manual commit. So, with manual commit, it gives you the option to choose when you decide to commit that offset. And so, in this case, we're not going to commit the offset until we've actually served up the coffee rather than basing it on a timer basis, because, unfortunately, our barista is a little slow. So, as we make the coffee and we can meet the coffee and we don't commit that offset until we've served up the coffee. So, our barista is able to go through them all and say, if he accidentally was to drop one, it means that he wouldn't have already committed that offset, and when he goes back to the list, he'd restart at coffee again. So, we have greater resiliency by using a manual commit offset for our consumers, so we can have greater resiliency in our consumers.
Cool, and I think Grace misspoke there. So, the barista going down represents the consumer going down right here. Cool, so we've got our producers and our consumers’ resiliency in Kafka, so let's talk about elasticity. Being elastic is normally described as scaling up and then back down again, based on load. So, in Kafka, it's actually built to be very configurable, but also to be able to handle a large amount of [inaudible] without having to scale too much. Now, in Kafka, a lot of the focus on elasticity is around being able to scale up and scale down your producers and your consumers, because Kafka is built to be able to handle large fluctuations without you having to change how many brokers you have and things like that. Obviously, you can add more brokers and you can add more topics. Deleting topics means you’re deleting data. And again, with deleting brokers, you have to think about where your records sit and whether you're going to lose any data if you get rid of those brokers.
But let's look at specifically how the producers and the consumers can be easily and elastically scaled up and down. So, I mentioned earlier that for a particular topic, there are one or more partitions and you can decide how many petitions are going to make up your topic.
And this affects both the way that producers talk to Kafka and then consumers as well. So, we'll start with producers. So, for a particular record in Kafka, it's represented as a key and a value, where the key is optional and the value would be your event or your message. If I, as a producer, choose not to have a key, then records will round-robin between the different partitions, so they would go around like this.
Now, I don't have to decide which partition I don't- As a producer, I don't have control over which partition is going to, if I haven't provided a key, it will just, Kafka will sort out the round-robining for me. But if I provide a key, what Kafka will do is it will make sure that all of the records with the same key will go to the same partition.
So, instead of going round, I would, for example, always produce two partitions 0, if I provide the same key. This comes in very useful with things like event sourcing, because you can represent a particular object, so say, an order, with a unique ID and have that as the key, and they'll all go into the same partition.
This is important from the point of view of ordering, because Kafka only guarantees ordering within a specific partition, and the guarantees around being able to produce to the same partition only hold while you have the same number of partitions, because it's based on a hash of the partitions. So if you want ordering and you want to be certain that you can consume, in a minute, all of the records from the specific partition in the correct order, you need to make sure that you don't add partitions all the time, so you need to stick with a certain number, and you need to use keys.
So, how do we then consume from those specific partitions? Well, that's where consumer groups come in. So, Kafka is designed to allow you to have many instances of a single consumer, but also to have them work together, because you don't want to have to scale up and have multiple consumers, but then they're all just reading the same records as each other, because one of the great things about Kafka is you can have consumers reading from all different places, but that means you can have them getting the same records if you don't use consumer groups.
So, let's look at how a consumer group works. I've got two consumer groups, A and B, and when I started up my applications, all I had to do was provide a consumer group ID, and as long as consumers have the same ID, they'll be grouped into the same group.
So we've got consumer group A, which has three consumers, and group B, which has two. And in this case, my topic, I've decided to just have three partitions. Now, what Kafka will do is when the consumers connect and get sent records, it will make sure to separate the partitions across the consumers, and the guarantee you get with the consumer groups is basically between all of the consumers in that group, they will see all the records, but you won't have two consumers seeing the same record.
So, here we've got the three consumers and they get one partition each. In consumer group B, I got two, so in that case, one of them reads from one partition and the other one reads from two. What this does mean, however, is that if I added one more consumer to consumer group A, it can’t read anything. So, it won't receive any records. And the reason for this is it's part of the consumer group, so you've said “I want the consumers between them to get all of the records”. If this consumer connected to a partition, then you suddenly wouldn't have the consumer seeing different records and you would also lose that ordering guarantee.
So, using partitions, you can scale up your producers and have them not overwhelm a particular broker because they're putting to different partitions that are normally spread across the brokers.
And you can also scale up and down your consumers and have them work together in consumer groups. But it does mean that picking the number of partitions that you want is quite crucial, so you need to think about that when you're initially designing. And if you care about ordering, you need to get that right kind of the first time. If you don't mind about ordering, then you have more flexibility to be able to scale up the number of partitions, because if you add more, you don’t care if the records go into different partitions, but different keys.
Great, so I think we've pretty much covered the sort of how we can introduce greater resiliency, greater elasticity, and use Kafka as that messaging backbone to really achieve the responsiveness that we want in reactive applications.
So, how do we get started with all of this? So, if you've not used Kafka before, then there's a great quick start you can use, which helps you get started with Zookeeper and Kafka in literally about 10 minutes. Zookeeper is still needed at the moment with the latest version of Kafka, because it helps for storing metadata, but in the future that metadata will be stored in a topic in Kafka. I've seen a few comments about that in the channel. Right now you do still need to start up Zookeeper before Kafka. So, use that quick start, there's a link at the bottom there, If you'd like to have a go with it and have a play.
Now, when you start that Kafka, you essentially get the standard Java Kafka clients, which are the producer and the consumer classes. Now, these are great for getting started, but they don't necessarily lend themselves that well to building a reactive application, but don't worry. You're not alone. We're not expecting you to go out and make your own version of this.
There are loads of open source frameworks that you can use when you're getting started, if you want to build a Kafka reactive application. So, here are four of the sort of most commonly used Kafka clients for reactive frameworks. They include Reactor Kafka, MicroProfile Reactive Messaging with connectors to Kafka, Alpakka Kafka Connector and the Vert.x Kafka Client. So, all of them are based on different frameworks and so work in slightly different ways. So, we'll go through them just quickly now for you to take a look at.
So, Reactor Kafka is essentially a thin layer of reactive streams over the Kafka client. It helps to hide some of the complexity of Kafka, and it provides that built-in backpressure support. It was produced by some of the Spring developers, although it is an open source project. So, if you're already using a Spring architecture, this might be a great framework for you to try because it fits quite nicely in with the Spring architecture framework.
So then, we've got MicroProfile Reactive Messaging. This is part of the MicroProfile specification, which is an opensource specification built by multiple different vendors in the Java ecosystem, and the reactive messaging specifically is an API that provides asynchronous messaging support based on that reactive stream specification we mentioned earlier.
So you can see here, there's some implementations here for you to be able to use it, because MicroProfile is just a specification, so you do need to look at the implementations of that to be able to use it in production.
We've got SmallRye Reactive Messaging, Open Liberty, Quarkus and WebSphere Liberty. It uses something called channels, so you annotate the beans within the application with either outgoing or incoming annotations, and then they're matched up by their channel name. So, in this case, channel foo matches up method A’s outgoing with method B's incoming. There is a guide later on if you'd like to have a go using MicroProfile Reactive Messaging.
The third option we had was Alpakka Kafka Connector. Now, if you guys have ever used some of the Lightbend open source stack, so Play, Akka, Lagom, this is based off the same actor-based system, so it fits really nicely into Akka streams and sort of the Akka framework, so it enables that connection between Apache Kafka and Akka streams. Again, helps to hide the complexity of Kafka and provide some of that built-in backpressure support. It is based on the actor-based framework, so if you haven't looked at actors before, then you can take a look at sort of what is the actor-based framework. It’s sort of a different level to microservices, so it's not necessarily a one-to-one mapping, so it can be quite a shift in thinking, but if you're already using the actor-based framework, this could be an excellent framework for you guys to try to connect into Kafka in a reactive way.
Then the last option we talked about was Vert.x Kafka Client. So, Vert.x is an Eclipse project. Again, open source. It's polyglot, so you can use it on Java, but you can also use it on Scala, Groovy and a bunch of other languages. It's based on something called the Reactor pattern, but it does run on the JVM. It's nonblocking, and it includes this distributed event-bus, which is quite specific to the Vert.x framework, and it does mean that the code is actually single-threaded within the application.
We actually used Vert.x when we were rearchitecturing one of our own applications that was using Kafka, so Kate's going to go into that in a bit more detail now.
We started looking at applications and why we might want to move to reactive. And there were some certain things that we were hoping to get out of it. So, one of those was simplifying the Kafka APIs and getting a Kafka API that was reactive in nature. We were also interested in built-in backpressure support, I'll kind of explain what that is in a minute, and then also getting some sort of separation between polling and processing.
So, from our background, what we were looking at basically is, we wanted to create an application that would produce and consume to Kafka and then have the UI to display those records. It's not necessarily an example that you would then lift and put directly into a real-life scenario, but what we were really looking for was an application that we could use to kind of test out and see “Is my Kafka working?” “If I've turned security on, have I configured my application correctly to talk to it?”, and also to start to see some of those pieces around partitions and offsets and what's going on in the internals of Kafka.
Now, originally, we wrote this application just using the normal Kafka plans, and then we started to have a look at reactive and there were a few things that we found a little bit tricky that we thought moving to this reactive framework might help with.
So, the main ones were around how we talk to Kafka and how we control that flow. So, in the UI, you can click to start producing, stop producing and the same with consuming. So, we wanted a mechanism to be able to produce, then stop, and the same with consuming. Producing this is pretty straightforward, but with consuming, we ran into some problems.
So, let's have a look in more detail. This is the code that you will often find if you go and look at examples for how to consume from Kafka. So, we have a while (consuming) or while (true) loop, pretty much, and we call this poll method. So Kafka consumers are actually sort of poll based, they do a poll to Kafka and they request new records to come down. Once we've polled and received a set of records, we then have to loop, run the records and process them.
The upshot of this is that if we want to pause, we can pause in such a way that we can poll, and then while we're processing the records, we're not polling again. So we kind of pause there, but all of the flow of talking to Kafka is maintained in this small set of code. So, if we want to break out of this loop and then go back in again and manipulate the flow in that way, it means we suddenly have to think about different threads and sort of having access to the consuming to turn on and off, and we found the code got quite complicated.
We also found that if we wanted to separate the underlying poll from when we're processing and how we process, we couldn't do that because we’d do the poll and then we have to put these records on there. We have to process them immediately.
If we look at the Vert.x Client, as a comparison, here there isn't a call to poll, which has created a Kafka consumer, and then we create a handler. So, all of these reactive frameworks, one of the key things they give you, is they provide a separation between the logic that’s polling and talking to Kafka and the logic that's actually doing this handling.
Kafka does have pause and resume for consumers, and if you have a separation between the logic that you're using to handle any records coming in and the actual logic that's controlling flow, and when you want to consume, and when you want to pause, it makes it much easier to handle this flow and have backpressure where your consumer can stop consuming when it has too many records.
So, you can go, have a look at the code for this application, but I am actually going to demo it. So, this is the URL, if you want to have a look, but I will show you it running.
So, I should've made this big already. So here, I'm just going to run java -jar taget/demo-all, that's my app. I've already started Zookeeper and Kafka, and I've just used the Quickstart that Grace mentioned earlier because that's kind of the easiest way. So, if I start this up, you can see that it has started.
If we go to here, this is the UI for our application, so you can see I already ran it earlier just to check if it was working. Let's hope it is still, and the demo gods are smiling on me.
There we go! So, we can see we're now producing and consuming through Kafka. So, you can start to see the partitions. So, because I've only got one broker, I'm only allowed one partition because I don't have any more, I could add more partitions, but they will end up on the same broker, so from a scaling point of view, it doesn't really help me.
You can see the offsets coming in as well, so you can see that they then link up between the two, so when I'm consuming, I can kind of see where I've gotten to. And you can actually change and put in different messages here, so you can say “hello world”.
Kate, we've got some messages that say- Can everyone see the application running? Some people say they only see the reactive slide. Two seconds. Yes. Okay. Yep. They can see the page, they just need to reload the page. And Kate, are you able to zoom in at all? Thank you. Thanks for your comments, guys.
So, yeah, so we've got producing and consuming here, so I can actually, change the messages coming through, so if you have a look, you can see “hello world” is coming through here.
So, this is a pretty basic application, as I said, but the great thing is you can sit and have a go at this. You can dive into the Java code and see what's going on. And if you want to, you can even try taking out Vert.x and putting in like MicroProfile or Akka or something instead, just to have a go. But what we really found writing this application was it is straightforward to use the basic Kafka clients to begin with.
But actually, when we started moving across to Vert.x, even with this really basic application, it suddenly felt like it fit a lot better with the way that Kafka is designed to work.
So, going back to the presentation, this is where you can go and check it out. If you load it here, the UI and everything is open source, so you can just try the whole thing. And we've also created a blog post, which you can go and have a look at. So, in here we delve into a little bit more detail of exactly what were the changes that we made, and some of the other features of Vert.x that we found really interesting and exciting when we were writing this application.
So, hopefully we've shown you throughout this presentation, how building in a reactive manner can really help you to provide your applications with an inbuilt resiliency, elasticity, and overall better responsiveness to actions and event changes and data changes in your system. And that using Kafka can be a really useful tool to achieve that, but you need to think carefully when you're designing your system, in terms of the configuration values that you're using, when you set Kafka up and sort of design and build your own system.
But, as we said, you are not alone in this. There are these reactive frameworks that are open source. All of the reactive frameworks we’ve shown today are open source and they can help you along the way to really build reactive applications that connect into Kafka and are built for this increased responsiveness.
So, if we haven't bored you enough about reactive systems, I've written a book with one of our colleagues and it's only 26 pages long. So, if you want all of this sort of condensed into a short book, this is free, you can get it using the link here in this slide and I'll paste it in the chat later on.
So, that's a free e-book that you can grab if you'd like to. And if you'd like to get hands on with some of this technology, we've obviously pasted the URL into the chat for our GitHub Repo that Kate showed you with our working application, but you can also take a look at some of our guides, which take you through using MicroProfile Reactive Messaging and sort of testing reactive applications.
So, they're available at this link here and you can get hands on with that code. It only takes 15 to 20 minutes. So, if you'd like sort of a fun activity to do, and get hands on with reactive, then you can check this out.
If you want to have a go with Kafka and explore it a little bit more, you can also check out IBM Event Stream, so that's the product that I work on. It's a fully supported Kafka offering, as I mentioned at the beginning. It also includes an award-winning UI and we have tools like schema registry to your application and a connect to catalog to enable you to run Kafka in a truly enterprise manner. The most recent release is actually based on a Kubernetes operator as well, so it's sort of Kubernetes native in the way that it works. So, feel free to check out Event Streams, if you're interested.
Thank you so much for having us. We really hope this was useful. We will open for questions in a minute, but we've also included a whole set of links here. We'll probably tweet a link to the slides at the end.
If you want to sort of learn more about reactive or Kafka or check out the sites, then you can go and tweet us. But we've got a whole set of links here, including reactive resources for you to read the Reactive Manifesto, and you can kind of read some of the history behind that event driven versus message driven.
And also, we have a whole series of sort of getting started with reactive and different exercises that you can flow through. If you want to check out Kafka, we've got the quick start guide and of course, Event Streams, but Even Streams is actually based on the open source Kubernetes operator, which is called Strimzi. So, if you're interested in running Kafka on Kubernetes and you want to look at something that's open source first, then check out Strimzi, that's a great tool.
And then we've got the links to all of those different libraries, because there are so many people that are doing great work in the Kafka ecosystem at the moment.
It's such a vibrant community, so definitely check out some of those pieces. And we've also got a link to an article that does this sort of comparison between synchronous and asynchronous processing with Vert.x.
So, that's it from us. Thank you so much. Let's open to Q&A. We've got a bunch of questions already coming in. Let me just scroll back up to find some of them.
Grace Jansen: We've got a recent one here from Sihata, and it says: If all microservices are the same application and you want to implement reactive system, is Kafka still required?
Katherine Stanley: If it's just one application and you're not having to have any communication in between different apps then Kafka is probably not what you want because Kafka is designed to pass records sort of out, and then back to another application. It might be that you want to use it for persistence, so instead of using something like a database, you could store information in Kafka and then pull back from there, if your application goes down. But if you're just looking at building an application that's very modular, but still has some of that reactive programming we're talking about, then checking out Vert.x is a great shot, and things like Akka Streams, because they have sort of more inbuilt event pieces, like event driven architecture pieces.
And what you can actually do, so we talked about the internal event-bus in Vert.x, you can actually set that up and have the different parts of your application use that. And then in future, if you want to, you can separate your application fully and put them into different applications and then still have them use Kafka instead of that event-bus, so that gives you kind of a lot of flexibility. So, it might be that to begin with, Kafka isn’t what you need, but you might find later on, you can then introduce it really easily if you use Vert.x.
Grace Jansen: Absolutely. I think we might have covered this question from Neeraj. They've actually asked: Are there any good cloud managed Kafka, or hosting in-house make more sense for a startup? So, I think that was covered in the second to last slide.
Katherine Stanley: I didn't mention this, because I actually work on the version of Event Streams that is Kubernetes based and designed for you to run. But there is a fully managed offering from Event Streams from IBM, it’s also called Event Streams.
And we've actually been running that since 2014, 15, something like that. So, IBM has a lot of experience and the cloud team are very experienced at running Kafka and managing it for you. So, if that's what you want, you can use that and they have a light option, so you can actually try it. Like completely for free.
You just get access to one topic, but it's enough for you to kind of start playing around.
Grace Jansen: Fantastic, we've got another question here: Is it possible to have event generating other events with some events doing transactions using Kafka?
Katherine Stanley: There's a tool called Kafka Streams, which is often used for sort of transforming events and creating new events off the back of that. So, Kafka Streams is a really cool library that basically makes it really easy for you to transform and process streams of events, so you could do things like transforming an event, so when it comes in, maybe filtering and taking out certain ones and then putting a certain set of them onto a new topic.
They've also got some really cool features around like windowing, if you've got like particular values coming through and things like that. So, Kafka Streams is definitely a cool thing to look at.
In terms of transactions Kafka does provide some sort of transaction support, but it does work slightly differently to some of the more traditional messaging tools that you will have seen before. My colleague, Andrew Schofield, has a really good Medium article about it. I'll see if I can find it and link it in the chat.
Kafka does provide some transaction support, but if you're running like fully reactive and fully distributed with microservices, you might find that doing distributed transactions isn’t what you want, because distributed transactions are not great. But yeah, they kind of do transaction IDs and things. I'll see if I can find that link and put it in the chat.
Grace Jansen: Here's a question from Amit: Can we rename a topic?
Katherine Stanley: Oh, no, I don't think you can, actually. What you could do is create a new topic and then use something like Kafka Streams just to flow all of the records across into that topic.
There's also something called Mirrormaker 2, which is part of Kafka now, and that's what we use in Event Streams for geo replication. We're using geo replication as a mechanism for our customers to have one system running in, say, one data center and then like a backup somewhere else, so if they have a disaster and everything goes down, which you would hope wouldn't happen, but just in case, they have a backup, and what Mirrormaker 2 will do is basically stream all those records across into the new cluster.
So, you could do something like that to process and move the topic. One of the things to bear in mind is I'm pretty sure with Mirrormaker 2, when it does the streaming, it adds a prefix to the new topic, so you might find that it doesn't do quite what you want. So, it might be that writing your own application, while that's pretty time consuming, would be the right thing.
But I haven't seen that many people have a request for renaming topics, so it might be worth having a look. there's something called KIPs, which is Kafka Improvement Proposals, I think it stands for. If you search on the Apache Kafka website, you can see all of the KIPs that people have raised and you can raise your own as well, if you have a particular feature you're interested in. It's a really active community.
Grace Jansen: Great. We got another question here from Tammy: When do you suggest to break a topic into subtopics? Example: one separate topic for each coffee and latte and anything else.
Katherine Stanley: Deciding on your topic separation is quite tricky. It depends on your use case and often you've kind of got the overhead of, do you want to have and manage multiple topics or can you have everything in one topic, but then you might have to filter. So, it does unfortunately depend on your use case.
If you're doing something like event sourcing, then that can force it, because if you want ordering, then obviously all of those records will have to be in the same topic. So, you can kind of tune it based on what your use case is. To a certain extent I think you have to work out what your use case is, and then there's quite a lot of different examples online where people talk about different use cases and how they've done it in Kafka.
The other thing, actually, that we should tell people about is the Kafka Summit is coming up. It's on the 25th of August, I think, 24th, 25th. And the great thing about Kafka Summit is there's loads of companies that will come to Kafka Summit, virtually this time, but they'll do talks about their specific use case and how they're using Kafka. So, if you're interested in a specific use case and want to see how other people are doing it, that's a really good place to find out that kind of information.
Grace Jansen: Here's another question: Can we use Kafka for synchronous transactions? This is from Anhinga.
Katherine Stanley: I think that's potentially on about the same topic of the previous transactions was talking about. I found, and I will link here, this is the Medium article, I put it in the chat, that talks about exactly how Kafka does transactions and it contrasts it with queuing systems like IBM MQ, which is what often people are familiar with when they look at transactions in messaging.
So, that's a really good article. I'd recommend people looking for transaction, have a read of that. And hopefully that will answer your questions. And if not, then you can tweet me afterwards or something. But yeah, Kafka is, I guess, newer to transactions than things like MQ, so they do have it, but it works slightly differently in terms of ordering and when things actually appear in the log, because Kafka works fundamentally different from something like MQ, where in Kafka you can reread records.
Grace Jansen: I think we've got time for one last question. We got one here from Nevine: Can replicas be on another hardware, so the hardware failure is also covered?
Katherine Stanley: Where you run your brokers is up to you. Generally, in most of the systems, when you run Kafka you would run all of your brokers on the same hardware. So, for example, if I'm running on Kubernetes or something, but I would run it all in the same Kubernetes cluster.
I think there are some people that maybe run like cross cluster, but I haven't seen that as much, and generally, what we recommend to our customers is if you're worried about failure, having different availability zones and having different brokers in different zones works really well.
And then having Mirrormaker 2 or geo replication happening to a different place. So, if that goes down you have a backup system is the best way to go. Kafka relies on resiliency for the broker, so if one goes down, the others are replicated, but you can still use persistence in Kafka, so that if all your brokers go down at once and then they come back, you can still have like block storage or something that is storing this data and Kafka can pick up from there. So, using a combination of that you can get a high level of resiliency.
Grace Jansen: Great. So, I don't want to rush into anyone else's session time. It is only an hour. So, thank you so much for all of your comments and questions and participation in this. We really appreciate it. And thanks for coming along to our session.
If we haven't gotten around to answering your question, we will try our best to scroll through the chat and try and answer them for you, but if you want to go to another session, we completely understand, but we will answer them all on here.
If you didn't manage to get time to ask a question, or we didn't get around to it, or we just missed it, then as we said, we're both on Twitter. You can message either of us. Send us a direct tweet, for example, or just tweet at us, and we will try and answer you that way, but thank you very much for coming.
“Once again Wurreka has knocked it out of the park with interesting speakers, engaging content and challenging ideas. No jetlag fog at all, which counts for how interesting the whole thing was."
Cybersecurity Lead, PwC
“Very much looking forward to next year. I will be keeping my eye out for the date so I can make sure I lock it in my calendar"
Software Engineering Specialist, Intuit
“Best conference I have ever been to with lots of insights and information on next generation technologies and those that are the need of the hour."
Software Architect, GroupOn
“Happy to meet everyone who came from near and far. Glad to know you've discovered some great lessons here, and glad you joined us for all the discoveries great and small."
Scott Davis, Web Architect & Principal Engineer, ThoughtWorks
“What a buzz! The events have been instrumental in bringing the whole software community together. There has been something for everyone from developers to architects to business to vendors. Thanks everyone!"
Voltaire Yap, Global Events Manager, Oracle Corp.
“Wonderful set of conferences, well organized, fantastic speakers, and an amazingly interactive set of audience. Thanks for having me at the events!"
Dr. Venkat Subramaniam, Founder - Agile Developer Inc.