Building a PHP CLI tool using DDD and Event Sourcing: software design
Last updated: 2023-08-21 :: Published: 2023-04-05
You can also subscribe to the RSS or Atom feed, or follow me on Twitter.
When I started this series, I wasn't sure whether this post would be necessary.
What I'm referring to as software design here is arguably still part of the model, and when searching for inspiration online, I did find many examples of models featuring some rather specific implementation details.
As per Domain-Driven Design (DDD), however, the model is also supposed to be accessible to domain experts, so that they can validate it. But should we expect an accountant to know what a repository or a value object is?
That led me to draw the line between what developers and non-developers would understand, and to publish two separate articles – one covering the abstract model (the previous post), and the other the implementation model (this post).
This also meant I could go as technical as necessary in the latter.
In this series
- Why?
- The domain
- The model
- Software design ⬅️ you are here
- Setting up Laravel Zero
- Getting started with EventSauce
- Distribution
Domain-Driven Design
Before I get into the nitty-gritty of the implementation, I need to get some base concepts out of the way. I'm going to start with DDD, so feel free to skip this section if you're already familiar with the topic.
Domain-Driven Design is both an approach to software design and a set of tools that includes several object-oriented patterns. DDD and Object-Oriented Programming (OOP) work well together because they share a common goal of representing real-world concepts and relationships as code.
This is also why you will most likely recognise some of the patterns described below, even if you're not familiar with DDD:
- Entity: An entity is a domain object that has a unique identity and a lifecycle that spans multiple interactions with the system. Entities are often referred to as models in non-DDD contexts.
- Value Object: A value object is a domain object that has no identity and is defined entirely by its properties. Unlike entities, value objects are immutable and can be freely shared and copied between different parts of the system without any side effects.
- Service: A service can refer to a domain object that encapsulates a specific behaviour or process within the domain. Not all services belong to the domain, however – a service can also provide utility for other layers (see Layered Architecture further down). Services are stateless and typically operate on one or more entities and/or value objects to accomplish a specific goal.
- Factory: A factory is a domain object that encapsulates the creation of other domain objects, such as entities or value objects. Factories are often used to simplify the creation of complex objects or to ensure that objects are created consistently and according to specific rules or constraints.
- Repository: A repository is a domain object used to save and retrieve entities. A repository doesn't create entities (that's what factories are for) but persists them in the storage layer. It also encapsulates the logic related to retrieving objects from that layer (e.g. filters). Repositories are a way to decouple the domain from the underlying storage infrastructure.
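To make the difference between the first two patterns more tangible, here is a minimal sketch in plain PHP (8.1 or later, for readonly properties). The FiatAmount and Wallet classes are hypothetical and only serve the illustration: the value object is compared by its properties and never changes, whereas the entity is compared by its identity and can change over time.

```php
<?php

// Hypothetical value object: no identity, immutable, compared by value.
final class FiatAmount
{
    public function __construct(
        public readonly int $pence,
        public readonly string $currency,
    ) {
    }

    public function plus(FiatAmount $other): self
    {
        if ($other->currency !== $this->currency) {
            throw new InvalidArgumentException('Currencies must match.');
        }

        // Return a new instance instead of mutating the current one.
        return new self($this->pence + $other->pence, $this->currency);
    }

    public function equals(FiatAmount $other): bool
    {
        return $this->pence === $other->pence
            && $this->currency === $other->currency;
    }
}

// Hypothetical entity: identified by its ID, whatever its other properties.
final class Wallet
{
    public function __construct(
        public readonly string $id,
        private string $label,
    ) {
    }

    public function rename(string $label): void
    {
        $this->label = $label;
    }

    public function label(): string
    {
        return $this->label;
    }

    public function isSameAs(Wallet $other): bool
    {
        return $this->id === $other->id;
    }
}

$a = new FiatAmount(1000, 'GBP');
$b = $a->plus(new FiatAmount(250, 'GBP')); // $a is left untouched

$wallet = new Wallet('wallet-1', 'Cold storage');
$sameWallet = new Wallet('wallet-1', 'Ledger'); // same identity, different label
```

Two FiatAmount instances holding the same values are interchangeable, while two Wallet instances are "the same" only if they share an ID.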
There are other DDD patterns that I need to address but they belong to a larger pattern that is worth covering in a separate section – Event Sourcing.
Event Sourcing
Event Sourcing is a software design pattern whereby the state of a system is stored as a sequence of events, rather than as the latest snapshot of that state. Each event represents a change to the system's state and can be used to reconstruct that state at any point in time.
A typical example is a bank statement – rather than displaying your account's currently available amount only, a bank statement will list all the operations performed in the covered period, leading to the current amount.
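The bank statement analogy can be sketched in a few lines of PHP (8.1+). The event classes and the balanceFrom() function below are made up for the example: the balance is never stored, it is derived by replaying the events in order.

```php
<?php

// Hypothetical events for the bank statement analogy (amounts in pence).
final class MoneyDeposited
{
    public function __construct(public readonly int $amount)
    {
    }
}

final class MoneyWithdrawn
{
    public function __construct(public readonly int $amount)
    {
    }
}

/**
 * Rebuild the balance by replaying the events in order.
 *
 * @param list<MoneyDeposited|MoneyWithdrawn> $events
 */
function balanceFrom(array $events): int
{
    $balance = 0;

    foreach ($events as $event) {
        $balance += match (true) {
            $event instanceof MoneyDeposited => $event->amount,
            $event instanceof MoneyWithdrawn => -$event->amount,
        };
    }

    return $balance;
}

$statement = [
    new MoneyDeposited(10000),
    new MoneyWithdrawn(2500),
    new MoneyDeposited(500),
];

// Replaying everything gives the current state...
$currentBalance = balanceFrom($statement);

// ...and replaying only part of the history gives an earlier state.
$balanceAfterTwoOperations = balanceFrom(array_slice($statement, 0, 2));
```

Because the full history is kept, any past state can be reconstructed by stopping the replay earlier.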
Event Sourcing is considered a DDD pattern with its own sub-patterns, some of which are described below:
- Aggregate: An aggregate is a group of domain objects that are treated as a single, consistent unit within the domain. Aggregates typically consist of one or more entities and/or value objects, and are responsible for enforcing business rules (invariants) within the domain.
- Aggregate Root: An aggregate root is the entity within an aggregate that serves as the entry point for all operations on that aggregate. The aggregate root is responsible for ensuring the consistency of the entire aggregate, and is the only object through which the state of the aggregate can be modified.
- Invariant: An invariant is a business rule. It's a condition or set of conditions that must always be true within a domain. Invariants are typically enforced by aggregates or aggregate roots and are checked and maintained whenever an event is applied to the system.
- Event: An event is a record of a change to the system's state. Events are typically immutable and represent a historical record of everything that has happened within the system.
- Projector: A projector is an object that transforms a stream of events into a separate read model, or projection. Projectors typically subscribe to events and update the projection accordingly.
- Projection: A projection (also called "read model") is the output of a projector, and represents a read-only view of the system's state at a specific point in time. Projections can be used for querying and reporting purposes, and are typically optimised for specific use cases.
- Reactor: A reactor (also called "process manager") is an object that listens to specific events and triggers other actions within the system as a result. The main difference with projectors is that reactors respond to an event only once, even if the event is replayed later on (to reconstruct the system's state at a specific point in time, for instance).
If you're new to all this, these definitions probably sound a bit abstract. That's OK – each of these patterns will be illustrated with concrete examples at some point, either in this article or in future posts.
Layered Architecture
The last concept I want to introduce is that of Layered Architecture.
A layered architecture is a software design pattern that involves breaking a system down into logical layers, each of which is responsible for a specific set of functions within the system.
Each layer communicates with lower-level layers only and provides an interface that can be used by higher-level layers.
You may encounter different names for these layers and their number sometimes varies, but they're usually along these lines:
Here's a brief definition for each of them:
- Presentation: The presentation layer is responsible for handling user interaction and input, and for presenting information back to the user.
- Application: The application layer orchestrates the interactions between the presentation, domain, and infrastructure layers. It does not contain any business logic.
- Domain: The domain layer represents the core business concepts and rules of the system, and is responsible for ensuring the consistency and correctness of the system's behaviour.
- Infrastructure: The infrastructure layer provides the underlying technical infrastructure and services that support the application, such as databases, messaging systems, and network communication.
We can already see that the Layered Architecture, Event Sourcing and Domain-Driven Design patterns overlap to some extent.
Let's combine them further and map out the various moving parts of the application.
Mix and match
This is where we leave the theory aside to enter a more concrete phase of the design. The goal is to come up with classes that will represent model concepts and fulfil user stories, the code thus becoming an expression of the model.
This is also where we determine which layer each of these classes belongs to, intending to keep each layer highly focussed. This is a way to enforce a clear separation of concerns and increase the maintainability of the system, which becomes much easier to reason about.
Let's cut to the chase and unveil the resulting implementation map (not an official term, but it has a nice ring to it):
Like other charts in this series, what you're looking at is already the outcome of multiple iterations. Initial assumptions were adjusted over time, based on implementation feedback that also led to changes to the model and domain.
Let's break the map down layer by layer, top to bottom.
Presentation layer
As the presentation layer is responsible for collecting user input and displaying output in the console, it is home to the Artisan commands (Artisan is Laravel's console component, which is also part of Laravel Zero, the framework used for this application):
There are three commands – Process, Review and Export.
Process will take the transaction spreadsheet as input and pass it on to lower layers for processing.
Once done, the Review command will retrieve the results (the tax year summaries) and display them in the console.
Having a separate command for this means users won't have to process the whole file every time, allowing them to display the results from a previous run instead. The Process command will call Review once it's done though, to spare the user an extra step.
The Export command will be responsible for outputting the results in different formats (e.g. CSV). The Review command will ask the user if they want to save the results to a file, and call the Export command if that is the case.
Application layer
One level down is the application layer. The definition of the application layer is a little vague but we can see it as the glue between the other layers. Here, the application layer mostly consists of services and concrete repositories:
On the service side, we've got TransactionReader, a class whose responsibility is to parse the spreadsheet and return raw transactions to the Process command.
In turn, the Process command passes the raw transactions to TransactionProcessor, another service from the application layer. Its job is to turn each raw transaction into a domain object, which it then passes on to the domain layer for further processing:
In short, as far as transaction processing goes, the application layer serves as a translation layer between the presentation and domain layers.
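Here is a rough sketch of that translation step in plain PHP (8.1+), outside of any framework. The raw column names, the shape of Transaction and the process() and dispatch() method names are all assumptions. TransactionDispatcher is reduced to an interface here so that a recording stand-in can replace the real domain service.

```php
<?php

// Domain layer (hypothetical, minimal): the object the domain understands.
final class Transaction
{
    public function __construct(
        public readonly DateTimeImmutable $date,
        public readonly string $operation,
    ) {
    }
}

// Domain layer: the entry point receiving transactions, reduced to an
// interface for the purpose of this sketch.
interface TransactionDispatcher
{
    public function dispatch(Transaction $transaction): void;
}

// Application layer: translates raw spreadsheet rows into domain objects.
final class TransactionProcessor
{
    public function __construct(private TransactionDispatcher $dispatcher)
    {
    }

    /** @param array{date: string, operation: string} $rawTransaction */
    public function process(array $rawTransaction): void
    {
        $transaction = new Transaction(
            new DateTimeImmutable($rawTransaction['date']),
            $rawTransaction['operation'],
        );

        $this->dispatcher->dispatch($transaction);
    }
}

// Stand-in for the real domain service, recording what it receives.
final class RecordingDispatcher implements TransactionDispatcher
{
    /** @var list<Transaction> */
    public array $received = [];

    public function dispatch(Transaction $transaction): void
    {
        $this->received[] = $transaction;
    }
}

$dispatcher = new RecordingDispatcher();
$processor = new TransactionProcessor($dispatcher);

// Rows as TransactionReader might return them (column names are assumptions).
$rows = [
    ['date' => '2022-05-01', 'operation' => 'receive'],
    ['date' => '2022-06-12', 'operation' => 'transfer'],
];

foreach ($rows as $row) {
    $processor->process($row);
}
```

The presentation layer only ever deals with raw arrays, and the domain only ever sees Transaction objects.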
The third and last service of the application layer is TaxYearSummaryExporter, a class whose job is to export the tax year summaries to various file formats. It receives the summaries from the Export command, which has previously obtained them from the TaxYearSummaryRepository repository, which is also part of the application layer (as a quick reminder, a repository is responsible for fetching and recording entities, although this one only does the former):
Now, what's interesting here is that this repository is a concrete repository – the concrete implementation of an abstract repository, which in our case is a PHP interface living in the domain layer.
We'll get to the detail of the domain layer in the next section, but for now, the reason the concrete repositories live in the application layer while the interfaces they implement live in the domain layer is that the former are considered an implementation detail. Within Laravel Zero, concrete repositories will use the Eloquent ORM to interact with the database, but the domain should not be aware of such technicalities, which are infrastructure concerns.
Note that once again, the application layer acts as a translation layer, although this time between the domain and infrastructure layers.
Also note that while TaxYearSummaryRepository is the only repository represented on the implementation map, other repositories will be necessary for other entities. This is why the application layer features a generic "Concrete repositories" component, and why the domain layer holds its "Repository interfaces" counterpart.
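The interface/implementation split can be sketched as follows (PHP 8.1+). To keep the example in a single runnable file, namespaces are left out and the concrete repository keeps its data in memory instead of going through Eloquent. The all() method, the properties of TaxYearSummary and the InMemoryTaxYearSummaryRepository class are assumptions.

```php
<?php

// Domain layer: the entity and the contract, free of storage details.
final class TaxYearSummary
{
    public function __construct(
        public readonly string $taxYear,
        public readonly int $capitalGain,
    ) {
    }
}

interface TaxYearSummaryRepository
{
    /** @return list<TaxYearSummary> */
    public function all(): array;
}

// Application layer: one possible concrete implementation. The real one
// would query the database; an array keeps this sketch self-contained.
final class InMemoryTaxYearSummaryRepository implements TaxYearSummaryRepository
{
    /** @param list<TaxYearSummary> $summaries */
    public function __construct(private array $summaries)
    {
    }

    public function all(): array
    {
        return $this->summaries;
    }
}

// Consumers only depend on the interface, never on the implementation.
function totalCapitalGain(TaxYearSummaryRepository $repository): int
{
    return array_sum(array_map(
        fn (TaxYearSummary $summary): int => $summary->capitalGain,
        $repository->all(),
    ));
}

$repository = new InMemoryTaxYearSummaryRepository([
    new TaxYearSummary('2021-2022', 1200),
    new TaxYearSummary('2022-2023', 300),
]);

$total = totalCapitalGain($repository);
```

Swapping the in-memory implementation for a database-backed one would not require any change to totalCapitalGain(), which is the point of the decoupling.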
Domain layer
The domain layer is where the bulk of the application – the "core business concepts and rules of the system" – lives. We've already come across some of its components, in relation to higher levels:
One of these components is the Transaction value object, which the application layer's TransactionProcessor service creates from raw transactions before passing it on to TransactionDispatcher, a domain service that I will touch on in a bit:
While Transaction is the only value object featured on the implementation map, there will be many more. Value objects are a core concept of DDD that are more powerful than they seem at first glance – far from being mere data containers, they encapsulate domain concepts whose correctness they guarantee through a set of validation rules.
Take Transaction as an example – this value object will make sure that the raw transaction data it receives is correctly formatted and valid, and return an error otherwise. In other words, any part of the domain receiving a Transaction value object is assured of the validity of its properties.
Value objects are also immutable, meaning they can be passed around safely without worrying that their state may change.
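As an illustration, here is what a heavily simplified Transaction could look like in PHP 8.1+. The properties, the list of operations and the validation rules are assumptions made for the example, not the final implementation. In this sketch, "returning an error" is modelled by throwing an exception from the constructor, so an invalid instance can never exist.

```php
<?php

// Hypothetical, simplified Transaction value object.
final class Transaction
{
    private const OPERATIONS = ['receive', 'send', 'swap', 'transfer'];

    public function __construct(
        public readonly DateTimeImmutable $date,
        public readonly string $operation,
        public readonly string $asset,
        public readonly string $quantity,
    ) {
        if (! in_array($operation, self::OPERATIONS, true)) {
            throw new InvalidArgumentException("Unknown operation: {$operation}");
        }

        if (trim($asset) === '') {
            throw new InvalidArgumentException('The asset cannot be empty.');
        }

        if (! is_numeric($quantity) || (float) $quantity <= 0) {
            throw new InvalidArgumentException("Invalid quantity: {$quantity}");
        }
    }
}

$transaction = new Transaction(new DateTimeImmutable('2022-05-01'), 'receive', 'BTC', '0.5');

// Invalid data is rejected at construction time.
$rejected = false;

try {
    new Transaction(new DateTimeImmutable('2022-05-01'), 'receive', 'BTC', '-1');
} catch (InvalidArgumentException $exception) {
    $rejected = true;
}

// Readonly properties cannot be reassigned once the object exists.
$mutated = true;

try {
    $transaction->quantity = '100';
} catch (Error $error) {
    $mutated = false;
}
```

Any code receiving a Transaction can therefore skip these checks, as they were necessarily performed when the object was created.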
I won't turn this section into a course about value objects, but if you'd like to dig deeper, this article by Matthias Noback does a good job of explaining what they are and how they differ from Data Transfer Objects (DTOs), another pattern they're often confused with.
Another component of the domain layer we've touched upon earlier is the TaxYearSummaryRepository interface, whose concrete implementation lives in the application layer. As stated before, repositories are responsible for retrieving and storing entities – in this case, TaxYearSummaryRepository's job is to retrieve TaxYearSummary entities from the database:
These entities are also projections, and this is where we enter the realm of Event Sourcing.
Projections are created by projectors and are basically views – data representations of specific events happening in the system.
A tax year summary is a report of any capital gain, income or non-attributable allowable cost incurred during a specific tax year. To come up with them, TaxYearSummaryProjector listens to any events that would cause these figures to change and creates or updates the relevant TaxYearSummary projections when they occur.
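A projector of this kind might look like the sketch below (PHP 8.1+). The event classes, their properties and the handle() method are assumptions, and the projections are kept in a plain array rather than persisted.

```php
<?php

// Hypothetical events (amounts in pence).
final class CapitalGainRecorded
{
    public function __construct(
        public readonly string $taxYear,
        public readonly int $amount,
    ) {
    }
}

final class IncomeRecorded
{
    public function __construct(
        public readonly string $taxYear,
        public readonly int $amount,
    ) {
    }
}

// Projector: folds events into one summary per tax year.
final class TaxYearSummaryProjector
{
    /** @var array<string, array{capitalGain: int, income: int}> */
    private array $summaries = [];

    public function handle(object $event): void
    {
        if (! $event instanceof CapitalGainRecorded && ! $event instanceof IncomeRecorded) {
            return; // not an event this projection cares about
        }

        // Create the summary on first sight of the tax year, update it otherwise.
        $summary = $this->summaries[$event->taxYear] ?? ['capitalGain' => 0, 'income' => 0];

        if ($event instanceof CapitalGainRecorded) {
            $summary['capitalGain'] += $event->amount;
        } else {
            $summary['income'] += $event->amount;
        }

        $this->summaries[$event->taxYear] = $summary;
    }

    /** @return array<string, array{capitalGain: int, income: int}> */
    public function summaries(): array
    {
        return $this->summaries;
    }
}

$projector = new TaxYearSummaryProjector();

$events = [
    new CapitalGainRecorded('2021-2022', 500),
    new IncomeRecorded('2021-2022', 200),
    new CapitalGainRecorded('2022-2023', 100),
    new CapitalGainRecorded('2021-2022', 250),
];

foreach ($events as $event) {
    $projector->handle($event);
}

$summaries = $projector->summaries();
```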
But where do these events come from? From the TaxYear aggregate, which encapsulates everything that happens during any given tax year. Or, to put it differently, the TaxYear aggregate keeps track of and records all the events belonging to the tax year it represents:
Another component that is implied in the above is the TaxYear aggregate root, which is the entry point for anything happening to the aggregate.
The only way to record a new state for the aggregate – to apply an event – is to ask the aggregate root to do so. The aggregate root is the aggregate's guardian, and it won't update the aggregate's state before making sure no invariant is being violated.
An example of invariant violation would be trying to record a capital gain for a date that isn't within the tax year represented by the aggregate. The aggregate root will check the event's date first, and reject the event if it doesn't belong to the tax year.
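The following sketch shows that guard in a hand-rolled aggregate root (PHP 8.1+), without any Event Sourcing library. The method names, the event class and the way the tax year boundaries are passed in are assumptions.

```php
<?php

// Hypothetical event: the name and properties are assumptions.
final class CapitalGainRecorded
{
    public function __construct(
        public readonly DateTimeImmutable $date,
        public readonly int $amount,
    ) {
    }
}

final class TaxYear
{
    /** @var list<object> events recorded so far */
    private array $recordedEvents = [];

    private int $capitalGain = 0;

    public function __construct(
        private DateTimeImmutable $from,
        private DateTimeImmutable $to,
    ) {
    }

    // Entry point: check the invariant first, then record the event.
    public function recordCapitalGain(DateTimeImmutable $date, int $amount): void
    {
        if ($date < $this->from || $date > $this->to) {
            throw new DomainException('The date is outside of this tax year.');
        }

        $this->recordThat(new CapitalGainRecorded($date, $amount));
    }

    private function recordThat(object $event): void
    {
        $this->apply($event);
        $this->recordedEvents[] = $event;
    }

    // The state only ever changes by applying an event.
    private function apply(object $event): void
    {
        if ($event instanceof CapitalGainRecorded) {
            $this->capitalGain += $event->amount;
        }
    }

    public function capitalGain(): int
    {
        return $this->capitalGain;
    }

    /** @return list<object> */
    public function recordedEvents(): array
    {
        return $this->recordedEvents;
    }
}

// The boundaries are passed explicitly; 6 April to 5 April is an assumption.
$taxYear = new TaxYear(
    new DateTimeImmutable('2022-04-06'),
    new DateTimeImmutable('2023-04-05'),
);

$taxYear->recordCapitalGain(new DateTimeImmutable('2022-09-01'), 500);

$rejected = false;

try {
    $taxYear->recordCapitalGain(new DateTimeImmutable('2023-06-01'), 300);
} catch (DomainException $exception) {
    $rejected = true;
}
```

The rejected capital gain leaves no trace: no event is recorded and the aggregate's state stays as it was.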
But who asks the TaxYear aggregate root to update the aggregate's state? Other domain components.
Let's come back to the TransactionDispatcher service we came across earlier. It's the domain service that receives the Transaction value objects from the application layer's TransactionProcessor service, after the latter created them from raw transaction data:
The TransactionDispatcher service is an implementation of the strategy design pattern.
Upon receiving a transaction, it selects the handlers that will deal with it, based on the transaction's type and properties:
If it detects that the transaction is a transfer (i.e. between two wallets belonging to the user), it will dispatch it to the TransferHandler service. If the transaction has a fee, the handler will tell the TaxYear aggregate root to record a new non-attributable allowable cost for the tax year (I invite you to check the domain article if you want to know why).
Likewise, if TransactionDispatcher detects that the transaction incurs some income, it will pass it on to the IncomeHandler service, which will then tell the TaxYear aggregate root to record the corresponding income for the relevant tax year:
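One way to picture the dispatching logic is the sketch below (PHP 8.1+), where each handler is an interchangeable strategy behind a common interface. The TransactionHandler interface, its supports() method and the trimmed-down Transaction are assumptions. Calls to the TaxYear aggregate root are replaced by entries in a public array so that the example runs on its own.

```php
<?php

// Hypothetical, trimmed-down Transaction: the real value object will differ.
final class Transaction
{
    public function __construct(
        public readonly string $type,
        public readonly int $fee = 0,
    ) {
    }
}

// Common contract for all handlers (the name is an assumption).
interface TransactionHandler
{
    public function supports(Transaction $transaction): bool;

    public function handle(Transaction $transaction): void;
}

final class TransferHandler implements TransactionHandler
{
    /** @var list<string> stands in for calls to the TaxYear aggregate root */
    public array $recorded = [];

    public function supports(Transaction $transaction): bool
    {
        return $transaction->type === 'transfer';
    }

    public function handle(Transaction $transaction): void
    {
        // Only transfers with a fee have an impact on the tax year.
        if ($transaction->fee > 0) {
            $this->recorded[] = "non-attributable allowable cost: {$transaction->fee}";
        }
    }
}

final class IncomeHandler implements TransactionHandler
{
    /** @var list<string> */
    public array $recorded = [];

    public function supports(Transaction $transaction): bool
    {
        return $transaction->type === 'income';
    }

    public function handle(Transaction $transaction): void
    {
        $this->recorded[] = 'income';
    }
}

final class TransactionDispatcher
{
    /** @param list<TransactionHandler> $handlers */
    public function __construct(private array $handlers)
    {
    }

    public function dispatch(Transaction $transaction): void
    {
        foreach ($this->handlers as $handler) {
            if ($handler->supports($transaction)) {
                $handler->handle($transaction);
            }
        }
    }
}

$transfers = new TransferHandler();
$income = new IncomeHandler();
$dispatcher = new TransactionDispatcher([$transfers, $income]);

$dispatcher->dispatch(new Transaction('transfer', fee: 15));
$dispatcher->dispatch(new Transaction('transfer')); // no fee, nothing to record
$dispatcher->dispatch(new Transaction('income'));
```

With this shape, supporting a new kind of transaction means adding a handler to the list, without touching the dispatcher itself.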
While these two handlers are fairly simple, things get a little more complicated with the last two – NonFungibleAssetHandler and SharePoolingAssetHandler. Let's start with the former.
If TransactionDispatcher detects that a transaction involves an NFT (you know – monkey JPEGs), it will pass that transaction to the NonFungibleAssetHandler service.
Now, NFTs are a bit special, because several things can happen to them while a user holds them. After an NFT is acquired, its cost basis can change before it is sold again (disposed of), which is typically the case when the NFT was minted from several other NFTs (again, please refer to the domain article for details).
All these scenarios (acquisition, cost basis update, disposal) represent events that change the state of the NFT, and that we need to keep track of. You guessed it – we've got another aggregate on our hands:
Like TaxYear, NonFungibleAsset is implicitly both an aggregate and an aggregate root, the latter responsible for validating events before recording them within the former.
Some of these events are also taxable events that the TaxYear aggregate must be aware of so that, in turn, the TaxYearSummaryProjector class can update the right TaxYearSummary projections.
How does that happen? Through a reactor:
Like projectors, reactors listen to events, but instead of updating a projection, they trigger various actions. They do so only once, whereas a projector will respond to an event as many times as it is replayed.
Here, NonFungibleAssetReactor will listen to any NFT-related events that are taxable events (typically, disposals) and, when it catches one, will tell the relevant TaxYear aggregate root to update its aggregate's state.
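The difference between the two kinds of listeners shows best during a replay. In the sketch below (PHP 8.1+), the EventBus class, the Consumer interface and the event are all hypothetical. The bus delivers live events to every consumer but replays its history to projectors only, which is one simple way of making sure a reactor acts only once.

```php
<?php

// Hypothetical event (capital gain in pence).
final class NonFungibleAssetDisposedOf
{
    public function __construct(
        public readonly string $assetId,
        public readonly int $capitalGain,
    ) {
    }
}

interface Consumer
{
    public function handle(object $event): void;
}

// Projector: only rebuilds a view, so handling an event again is harmless.
final class DisposalCountProjector implements Consumer
{
    public int $disposals = 0;

    public function handle(object $event): void
    {
        if ($event instanceof NonFungibleAssetDisposedOf) {
            $this->disposals++;
        }
    }

    public function reset(): void
    {
        $this->disposals = 0;
    }
}

// Reactor: triggers an action, here standing in for a call to the
// TaxYear aggregate root.
final class NonFungibleAssetReactor implements Consumer
{
    /** @var list<string> */
    public array $actions = [];

    public function handle(object $event): void
    {
        if ($event instanceof NonFungibleAssetDisposedOf) {
            $this->actions[] = "record capital gain of {$event->capitalGain}";
        }
    }
}

final class EventBus
{
    /** @var list<object> */
    private array $history = [];

    /**
     * @param list<Consumer> $projectors
     * @param list<Consumer> $reactors
     */
    public function __construct(
        private array $projectors,
        private array $reactors,
    ) {
    }

    // Live events reach both projectors and reactors.
    public function publish(object $event): void
    {
        $this->history[] = $event;

        foreach ([...$this->projectors, ...$this->reactors] as $consumer) {
            $consumer->handle($event);
        }
    }

    // Replayed events reach projectors only.
    public function replay(): void
    {
        foreach ($this->history as $event) {
            foreach ($this->projectors as $projector) {
                $projector->handle($event);
            }
        }
    }
}

$projector = new DisposalCountProjector();
$reactor = new NonFungibleAssetReactor();
$bus = new EventBus([$projector], [$reactor]);

$bus->publish(new NonFungibleAssetDisposedOf('nft-1', 400));

// Rebuild the projection from scratch.
$projector->reset();
$bus->replay();
```

After the replay, the projection is back to where it was, and the reactor has still acted a single time.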
That's it for NonFungibleAssetHandler, which leaves us with the SharePoolingAssetHandler class. This service deals with the other kind of cryptoassets – the ones that fall under the rules of share pooling. And on the surface, it works similarly to the NonFungibleAssetHandler class:
Depending on the transaction type, different things happen to the share-pooling asset held by the user (she can acquire more of it, dispose of some, exchange some of it for another asset, etc.). All these events are recorded within the SharePoolingAsset aggregate through its root, which updates the aggregate's state so long as the invariants are observed.
There is a SharePoolingAssetReactor class listening to any taxable events coming from SharePoolingAsset aggregates, which tells the relevant TaxYear aggregate root to update its aggregate's state when it catches one.
While this high-level explanation is enough for this article, under the hood the SharePoolingAsset aggregate is far more complex than the NonFungibleAsset aggregate. It involves many value objects, and guards convoluted invariants related to share-pooling rules.
Infrastructure layer
Lastly, the infrastructure layer is essentially the plumbing of the software, of which the database is a common example:
It also comprises the configuration files, external dependencies and, in the context of Laravel, the service providers (used to bootstrap various parts of the application).
Application structure
We now have a pretty good idea of the layers and classes that will make up the application.
We've introduced concepts and patterns borrowed from DDD and Event Sourcing and used terms coming from the ubiquitous language to name our components.
We've even started integrating Laravel concepts, but there's more work to be done on that front. This is the focus of this section.
Laravel does not enforce any particular architectural style and doesn't get in the way of most DDD patterns (e.g. you can easily introduce value objects or services). It even provides built-in support for some of them (e.g. models, which are entities, although the Active Record pattern underlying its ORM is not without criticism). Laravel also has great support for events and queues, which are essential components of Event Sourcing.
Things get a little confusing when it comes to the Layered Architecture pattern, however.
I have read many articles on the topic but there's one that truly stood out and got me on the right track – Conciliating Laravel and DDD, by Loris Leiva. The main takeaway from this post is that fighting the framework is counterproductive.
When it comes to Laravel and Laravel Zero, not fighting the framework means ending up with a mix of components from the presentation, application and infrastructure layers in the app folder:
dime/
├── app/
│ ├── Aggregates/
│ ├── Commands/
│ ├── Providers/
│ └── Services/
├── bootstrap/
├── config/
├── database/
├── domain/
│ ├── src/
│ └── tests/
├── tests/
└── vendor/
In the above, the app folder's Aggregates sub-folder contains aggregate-specific service providers (infrastructure layer), as well as concrete repositories (application layer).
The Commands folder is part of Laravel's default structure and contains the Artisan commands (presentation layer).
Providers holds the global service providers (infrastructure layer), and Services the services used by the presentation layer (e.g. TransactionReader, which is part of the application layer).
One level up, bootstrap, config and database are also Laravel-native folders, belonging to the infrastructure layer. The tests folder contains the test suite covering the application layer, and vendor is home to external dependencies (infrastructure layer).
That leaves us with the domain folder, which is the only one that isn't part of Laravel's default structure at this level. The domain folder is dedicated to the domain layer, and comes with its own test suite (in the tests subfolder), while the rest of the source code lives in src.
I've seen other approaches further separating the different layers within a Laravel application (successfully so), but still think that the more we bend a framework, the more issues we're likely to come across in the future.
As long as the domain layer is properly isolated, my opinion is that it's OK to compromise a little on the other ones, for the sake of framework compliance. You may of course disagree.
Still, that extra domain folder needs to be referenced from the PSR-4 autoload mapping sections of composer.json, so it can get its own namespace:
...
    "autoload": {
        "psr-4": {
            "App\\": "app/",
            "Domain\\": "domain/src/",
            "Database\\Factories\\": "database/factories/",
            "Database\\Seeders\\": "database/seeders/"
        }
    },
    "autoload-dev": {
        "psr-4": {
            "Tests\\": "tests/",
            "Domain\\Tests\\": "domain/tests/"
        }
    },
...
With the above, the regular App namespace that Laravel developers are used to is present as usual, but a new Domain namespace is also introduced and mapped onto the domain folder.
Closing thoughts
Now that we've established the overall structure of the application, identified the main components making up its various layers, as well as the way they interact with each other, we're in a solid position to begin the implementation.
But where shall we start, exactly?
Setting up the framework is the obvious first step (and will be the object of the next post), but once that's done we still need to decide how to tackle the domain.
A characteristic of aggregates is that they constitute their own little isolated domains, which conveniently breaks down an application into smaller, independent components.
Of the three aggregates identified in this article, the NonFungibleAsset one is arguably the simplest, so it makes sense to make it our starting point. This will be the focus of another part of this series.
Once all the aggregates are in place, we'll be able to implement the logic connecting them, thus completing the domain layer. That will leave us with the presentation layer, which we can address last.
This post is likely the last purely within the framework of Domain-Driven Design. The software design phase covered above completes the feedback loop opened with the description of the domain and continued with the expression of the model.
These three phases will now keep feeding into each other, producing an ever clearer picture of both the domain issues and their solutions, informed by the software's implementation.
Subscribe to email alerts below so you don't miss the next posts, or follow me on Twitter where I will share them as soon as they are available.