Stream processing software resources and links

Stream processing is useful for many things, including real-time suggestions to users about content that may interest them based on their actions. You can also feed a stream of events to an event processor and have it search the text of each event as it arrives. Designing Data-Intensive Applications by Martin Kleppmann covers streaming in great depth in Chapter 11, Stream Processing. If you are a software engineer and don’t own that book, you absolutely should. The book doesn’t cover microservices as such, but it covers the kinds of software you use to build them and explains each in depth. If you are engineering any kind of modern app without it, you risk choosing software haphazardly because it is hyped and sounds cool. It explains the internals of most databases, NoSQL and RDBMS alike, and gives you the information you need to make sound decisions. It will save you hundreds of hours of raw research.

Stream processor list

There are many stream processors; below I list a few.

  1. Apache Flink – accepts input from a message queue or file system and can save output to a message queue, database, file system and more. Easily connects to Kafka and Cassandra. Processes streams event by event.
  2. Apache Spark – Spark website. Processes streams in mini batches. An interesting point from the Wikipedia article: “Spark Streaming has support built-in to consume from Kafka, Flume, Twitter, ZeroMQ, Kinesis, and TCP/IP sockets”. Another interesting point from Wikipedia is Spark MLlib, a machine learning framework built on top of Spark Core.
  3. Apache Storm – Apache Storm website. Integrates with Kafka, databases and other message queues.
  4. Apache Kafka – This excellent piece of software functions as a stream processor, message broker and more.
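
The difference between processing a stream event by event (the Flink model) and in mini batches (the Spark Streaming model) can be sketched in a few lines. This is plain Python for illustration only. It uses no Flink or Spark API, and the function names and sample events are made up. Real Spark Streaming cuts batches by time interval; batching by count here just keeps the sketch deterministic.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def process_event_by_event(events: Iterable[str],
                           handle: Callable[[str], str]) -> Iterator[str]:
    """Handle each event as soon as it arrives."""
    for event in events:
        yield handle(event)


def process_in_mini_batches(events: Iterable[str], batch_size: int,
                            handle_batch: Callable[[List[str]], List[str]]
                            ) -> Iterator[List[str]]:
    """Collect events into small batches and handle each batch as one unit."""
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield handle_batch(batch)


# Hypothetical stream of user actions.
clicks = ["view:book", "view:laptop", "buy:book", "view:phone", "buy:phone"]

one_by_one = list(process_event_by_event(clicks, str.upper))
batched = list(process_in_mini_batches(
    clicks, 2, lambda batch: [event.upper() for event in batch]))
```

Both approaches produce the same results in the end. The event-by-event version can react to each event immediately, while the mini-batch version waits until a batch is full (or the stream ends) before doing any work.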



Event Storming Resources

Event storming is a way of modelling software that works well with domain driven design principles. During an event storming session all involved parties get together and help model a domain. This helps everyone, from the developers to the business decision makers, understand the system and form a ubiquitous language (a common language). Event storming is just another modelling tool to help with quick design and development.


Event Storming 101: Accelerating Your Software Development in Domain-Driven Design. This article covers the absolute basics of Event Storming: what it is and how it is used.

Here is another short article about event sourcing. This article includes many useful links at the bottom.


Alberto Brandolini – 50,000 Orange Stickies Later

This video is from the man who created the concept of Event Storming. In this video Alberto discusses how he invented the technique with Post-it notes. He describes and demonstrates how this sort of brainstorming session is a great way to get all involved parties together. This technique will help you model your software more quickly and completely. I’ll be writing about this technique in my articles about microservices with Scala.

Lagom notes and facts

This is just a list of notes and facts I have gathered about Lagom. I’m the kind of person who asks a million questions and likes to know why things are done.

Why does Lagom use Cassandra as an event store system?

Lagom uses Cassandra as its event store. I wanted to know why they would choose Cassandra, so I did some digging. I found the following video helpful in understanding why anyone would use Cassandra for an event store instead of EventStore itself. The video is not about Lagom; it is about Cassandra, CQRS and Event Sourcing, and why you would choose Cassandra as an event store. Basically, Cassandra has the features that Event Sourcing requires, which makes it a good fit.
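
To make it clearer what an event store has to do, here is a minimal sketch of the two core operations: append events for an entity in order, and read them back to rebuild state. It is plain Python with an in-memory dictionary standing in for the database. It is not Lagom's or Cassandra's API, and the event names and the account example are made up.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    entity_id: str
    sequence_nr: int  # position of the event in this entity's history
    kind: str         # "Deposited" or "Withdrawn" (hypothetical event names)
    amount: int


class InMemoryEventStore:
    """Append-only log of events, grouped by entity id."""

    def __init__(self) -> None:
        self._log: Dict[str, List[Event]] = defaultdict(list)

    def append(self, entity_id: str, kind: str, amount: int) -> Event:
        history = self._log[entity_id]
        event = Event(entity_id, len(history) + 1, kind, amount)
        history.append(event)  # events are only ever added, never updated
        return event

    def read(self, entity_id: str) -> List[Event]:
        """Return a copy of the entity's events in the order they were written."""
        return list(self._log[entity_id])


def replay_balance(events: List[Event]) -> int:
    """Rebuild the current state by folding over the event history."""
    balance = 0
    for event in events:
        if event.kind == "Deposited":
            balance += event.amount
        elif event.kind == "Withdrawn":
            balance -= event.amount
    return balance


store = InMemoryEventStore()
store.append("account-1", "Deposited", 100)
store.append("account-1", "Withdrawn", 30)
store.append("account-2", "Deposited", 5)

balance = replay_balance(store.read("account-1"))
```

The store never updates or deletes a row. The current balance is not stored anywhere; it is derived by replaying the events for one entity in sequence order.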

Why does Lagom use Kafka as a Message Broker?

The documentation mentions using Kafka as a message broker, and I wanted to know why. This article helped clarify the difference between a message bus and a message broker. Basically, a broker gives a little less coupling along with some other nice features. A message bus is stricter about schema, like a traditional RDBMS, while a message broker is freer, like a NoSQL database with no schema. Read the article for the other benefits and differences.
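
The looser coupling a broker gives can be sketched with a toy in-memory broker. This is plain Python, not Kafka's API, and the class, topic names and messages are made up for illustration. Unlike Kafka, this toy does not store messages; it only shows that publishers and subscribers know a topic name and nothing about each other.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class InMemoryBroker:
    """Toy broker: routes messages by topic name and imposes no schema."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver the message to every subscriber; return how many got it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = InMemoryBroker()
emails: List[str] = []
audit_log: List[dict] = []

# Two independent consumers; the publisher knows about neither of them.
broker.subscribe("user-registered", lambda msg: emails.append(msg["email"]))
broker.subscribe("user-registered", audit_log.append)

delivered = broker.publish("user-registered", {"email": "ann@example.com"})
undelivered = broker.publish("order-placed", {"order_id": 1})
```

Adding a third consumer means one more `subscribe` call. The code that publishes the message does not change.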

Codecentric: CQRS and Event Sourcing Applications with Cassandra

This video explains how to handle event sourcing with Cassandra.

I also found this link to be helpful in understanding why Lagom would use Cassandra instead of EventStore. That article has a nice comparison of Cassandra vs other similar software used for event storage. A quick look at the features is all it takes.

Notes, facts and resources about domain driven design aggregates

DDD Domain Driven Design Aggregates

When learning about Domain Driven Design, one of the harder things is wrapping your head around what aggregates are. In this post I list some facts. Some may be redundant, but redundancy helps reinforce concepts.

Here is my personal list of facts and notes about domain driven design aggregates, in no particular order. This list will be updated as time goes on.

List of Aggregate facts

  1. An aggregate is a consistency boundary that decomposes large models into smaller clusters of domain objects that are technically easier to manage. I refer to these as modules/services; modules should be easily hot swappable.
  2. An aggregate is a DDD pattern: a cluster of related domain objects that work together as a single unit.
  3. An aggregate has one component object, known as the aggregate root, that acts as the interface to the aggregate. All outside interaction with the aggregate should go through the aggregate root. The aggregate root is the contract for the entire aggregate, the API for the aggregate/module/service.
  4. Aggregates are the inner bounded contexts that build the complete bounded context of the entire domain, like building blocks.
  5. An aggregate is a way to group objects inside of a specific domain bounded context. They are a way to modularize things that have to work together to perform a specific task of the system
  6. Aggregates are collections of objects that perform the functions of a particular bounded context.
  7. Aggregates are stored in and retrieved through a repository, which is backed by a data store such as a database.
  8. Aggregates are the only thing that can be persisted and retrieved (re-hydrated).
  9. Repositories are used to manage the persistence of aggregates and ensure a clear separation between the data and the domain
  10. Aggregates communicate with other aggregates via events, creating event driven systems.
  11. Try to make aggregates as small and specific as possible. This leaves less room for bugs and makes the system easier to maintain. Stick to the Single Responsibility Principle.
  12. Large aggregates can suffer from performance issues when they span multiple entities, data stores or database tables. Only make aggregates as large as they absolutely have to be, and eliminate all unrelated code and actions.
  13. An aggregate lives in a bounded context. Each bounded context is made up of multiple aggregates. Each aggregate within a bounded context should have a single responsibility.
  14. When two or more objects/entities need to interact with each other to perform a task, make them part of the same aggregate.
  15. Aggregates represent concepts in the larger domain.
  16. Aggregates should be behavior focused
  17. Each object within an aggregate should be needed for some action the aggregate performs; otherwise it should not be included. Don’t have redundant aggregates within other bounded contexts. Redundancy is a clue that a new aggregate needs to be born into its own bounded context: create another module and make the redundant aggregate its own.
  18. An aggregate root should be the only entry point for an aggregate. The aggregate root is like an application programming interface (API) for the aggregate.
  19. The aggregate root is the coordinator for the aggregate. It handles all of the events by calling the aggregate objects to perform the required tasks/actions.
  20. The aggregate root should expose only the behaviors required by other aggregates. The aggregate root is the contract of the aggregate. Removing behaviors is a breaking change and can destroy a system. Adding new behaviors, however, is safe at any time.
  21. Objects within aggregates should not hold references to objects in other aggregates. This creates tight coupling, exactly what we are trying to prevent. It means no object in one module/service/microservice should directly call or refer to an object inside another module/service/microservice. There is no direct communication from one bounded context into another; the aggregate root is the only way to communicate.
  22. An aggregate can span several database tables when persisted, but this can cause problems.
  23. Domain objects within an aggregate can have direct object references to each other. This means that aggregates (objects) within a single bounded context (module/service) can reference each other: within a service you can call other objects in that service.
  24. Repositories should hold aggregates. Aggregates can be reconstructed at any time from a repository.
  25. DDD repositories are not the same as code repositories like GitHub.
  26. In an Akka-based system such as Lagom, the aggregate roots are Akka actors that should accept commands and produce events.
  27. Repositories are used to manage domain object/aggregate root persistence
  28. A repository manages the retrieval of aggregate/domain objects while ensuring a separation between the domain and data models
  29. A repository is a pattern for storing and retrieving the pieces of an aggregate in a database
  30. A repository enforces the aggregate root’s contract by providing the interface to store and retrieve the aggregate parts.
  31. Repositories map to the aggregates, not to the actual data store such as database tables.
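
Several of the facts above can be seen together in one small sketch: an aggregate root as the only entry point, a command that enforces a rule and produces an event, and a repository that stores and retrieves the whole aggregate. This is plain Python for illustration, not Lagom or Akka code. The order example and every class and method name are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class ItemAdded:
    """Event produced by the aggregate (hypothetical name)."""
    order_id: str
    sku: str
    quantity: int


@dataclass
class OrderLine:
    """Internal object; code outside the aggregate never touches it directly."""
    sku: str
    quantity: int


@dataclass
class Order:
    """Aggregate root: the only entry point to the order and its lines."""
    order_id: str
    max_items: int = 10
    _lines: Dict[str, OrderLine] = field(default_factory=dict)

    def add_item(self, sku: str, quantity: int) -> ItemAdded:
        """Command: check the rules, change state, then emit an event."""
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        if self.total_items() + quantity > self.max_items:
            raise ValueError("order would exceed its item limit")
        line = self._lines.setdefault(sku, OrderLine(sku, 0))
        line.quantity += quantity
        return ItemAdded(self.order_id, sku, quantity)

    def total_items(self) -> int:
        return sum(line.quantity for line in self._lines.values())


class OrderRepository:
    """Stores and retrieves whole aggregates by the root's id."""

    def __init__(self) -> None:
        self._orders: Dict[str, Order] = {}  # stand-in for a real data store

    def save(self, order: Order) -> None:
        self._orders[order.order_id] = order

    def find(self, order_id: str) -> Optional[Order]:
        return self._orders.get(order_id)


repository = OrderRepository()
order = Order("order-1", max_items=3)
event = order.add_item("book", 2)
repository.save(order)

try:
    order.add_item("pen", 5)  # would break the item limit
    rejected = False
except ValueError:
    rejected = True
```

The repository deals only in `Order` objects, never in individual order lines, and callers can only change an order through `add_item`. The rejected command leaves the aggregate unchanged because the rule is checked before any state is touched.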


Developing microservices with aggregates – Chris Richardson
I found this to be one of the best explanations of aggregates and how they relate in microservice based systems. Here is a link to the site mentioned in the video, where you can find a lot more useful information about microservices.

Lagom resources


Introduction to Lagom

This is a video introduction and quick explanation of Lagom.

Lagom Developer Setup

This short video describes how to set up a Lagom project.

Events-First Microservices with Lagom by Gideon de Kok

Tom Peck – Building Better Microservices with Lagom

What can Lagom Do for you?

CQRS and Event Sourcing with Lagom by Miel Donkers

From CRUD to Event Sourcing: Why CRUD is the wrong approach for microservices

The title doesn’t mention it, but this video is about Lagom and how it is used to create microservices. This is a video I will watch more than once and write notes about. It comes with some code examples, found here on GitHub. This is a very helpful video if you are wondering how the hell Lagom works overall.


Your first microservices using Scala and Lagom. This is an interesting short article about Lagom. It is more or less a quick explanation of Lagom and how it works.

Deploying a Lagom Application to OpenShift. A Lightbend article about how to deploy an application to OpenShift.

Lagom Migration guide. This guide covers the changes from Lagom 1.4 to 1.5, including the improvements. See the sections about Docker images, deployment specs and kustomize.

Here is a useful article about CQRS and Event Sourcing and how they relate to Lagom: CQRS and Event Sourcing with Lagom.

Persistent read side in Lagom explained

Data persistence in Lagom. An article about saving your data in a Lagom Framework system.

Here is a really good article about Lagom. If you are wondering why you should use Lagom, especially since it uses Akka and Play and you could just use those yourself, then this is the article for you.