Merge branch 'latest' into application-services
peholmst authored Sep 25, 2024
2 parents ade5a05 + c66210d commit c98fca8
Showing 8 changed files with 1,701 additions and 0 deletions.
1 change: 1 addition & 0 deletions .github/styles/Vaadin/Abbr.yml
@@ -8,6 +8,7 @@ first: '\b([A-Z]{3,5})\b'
second: '(?:\b[A-Z][a-z]+ )+\(([A-Z]{3,5})\)'
# ... with the exception of these:
exceptions:
- ACME
- AJAX
- AKS
- API
2 changes: 2 additions & 0 deletions .github/styles/config/vocabularies/Docs/accept.txt
@@ -83,6 +83,7 @@ JBoss
JDesktop
JGoodies
JHipster
jOOQ
JRebel
jsoup
jQuery
@@ -169,6 +170,7 @@ Spring Initializr
[sS]tylable
Telepresence
Temurin
Testcontainers
[tT]hemable
Thymeleaf
[tT]odos?
119 changes: 119 additions & 0 deletions articles/building-apps/application-layer/persistence/flyway.adoc
@@ -0,0 +1,119 @@
---
title: Flyway
description: How to manage your relational database schema with Flyway.
order: 30
---

= Flyway

Whenever you store data in a relational database, you have to manage the database schema in some way. When the application is first installed, you have to create the entire database schema. When new features are deployed, you have to update the database schema. You may need to add new tables to the database. You may need to add new columns to existing tables. You may need to move data from one table to another one, and delete the old one.

Some object-relational mapping tools, like Hibernate, can generate the initial schema for you. They may also be able to perform trivial updates to the schema, like creating new tables. In more complex cases, however, they are at a loss. Therefore, https://www.red-gate.com/products/flyway/community/[Flyway] is the recommended tool for managing database schemas in Vaadin applications.

[NOTE]
On this page, you'll learn enough about Flyway to get started using it in your Vaadin applications. Because Flyway has more features than presented here, you should also read the https://documentation.red-gate.com/flyway[Flyway Documentation].

== Migrations

Flyway is based on the concept of _migrations_. A migration is a script that performs some changes on your database. Every migration is versioned. As you implement new features, you add new migrations to the project. Flyway keeps track of which migrations have been applied to the database in a separate table. This table includes a checksum of every migration script.

When Flyway runs, either at application startup, or as a part of your deployment pipeline, it compares the contents of this table with the migrations in your project. It then executes all the migrations that were missing, from the oldest to the newest version. If the database is new, Flyway creates its metadata table and executes all the migrations.

Versioned migrations should not change after they have been applied. When Flyway runs, it recalculates the checksums of all the migrations in your project, and compares them with the checksums in the table. If there are any mismatches, it throws an exception and aborts.

In addition to versioned migrations, Flyway also supports repeatable migrations. These migrations can change after they have been applied, and are automatically re-applied after every change. Repeatable migrations are always applied after the versioned migrations.
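
The bookkeeping for versioned migrations described in this section can be modelled in a few lines of Java. The following is a conceptual sketch only, not Flyway's actual implementation. The `Migration` record, the `MigrationPlanner` class, and the use of plain integers as versions are hypothetical names and simplifications made for illustration. Repeatable migrations are left out.

[source,java]
----
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical model of a versioned migration: a version and a checksum of the script.
record Migration(int version, String checksum) {}

final class MigrationPlanner {

    // Returns the migrations that still have to be applied, oldest first.
    // 'applied' maps the version of every applied migration to its stored checksum.
    static List<Migration> pending(List<Migration> project, Map<Integer, String> applied) {
        var pending = new ArrayList<Migration>();
        for (var migration : project) {
            var storedChecksum = applied.get(migration.version());
            if (storedChecksum == null) {
                pending.add(migration); // Not recorded in the history table yet
            } else if (!storedChecksum.equals(migration.checksum())) {
                // The script was changed after it had been applied: abort
                throw new IllegalStateException(
                        "Checksum mismatch for version " + migration.version());
            }
        }
        pending.sort(Comparator.comparingInt(Migration::version));
        return pending;
    }

    public static void main(String[] args) {
        var project = List.of(
                new Migration(3, "c3"), new Migration(1, "a1"), new Migration(2, "b2"));
        // Only version 1 has been applied, so versions 2 and 3 are pending
        System.out.println(pending(project, Map.of(1, "a1")));
    }
}
----

With an empty `applied` map, which corresponds to a new database, every migration in the project is returned.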

== Writing Migrations

You can write migrations in multiple languages, including Java, but the most common one is ordinary SQL. The migration scripts should follow a specific naming pattern. Versioned migrations start with an uppercase `V`, followed by a version number, two underscores `\__`, a description, and the suffix `.sql`. For example, a migration could be named `V1__initial_schema_setup.sql`.

Repeatable migrations start with an uppercase `R`, followed by two underscores `\__`, a description, and the suffix `.sql`. For example, a repeatable migration could be named `R__people_view.sql`. Repeatable migrations are sorted by their descriptions before they are applied. Take this into account if you need to apply one repeatable migration before another.
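
If you want to check your file names against these rules, for instance in a unit test, you could express them as regular expressions. The following sketch is an assumption-laden simplification: the `MigrationNames` class is hypothetical, it only covers SQL scripts with the default prefixes, and Flyway itself accepts more variations than these two patterns do.

[source,java]
----
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flyway.
final class MigrationNames {

    enum Kind { VERSIONED, REPEATABLE }

    // V, a version number, two underscores, a description, and the .sql suffix
    private static final Pattern VERSIONED =
            Pattern.compile("V\\d+(\\.\\d+)*__\\w+\\.sql");
    // R, two underscores, a description, and the .sql suffix
    private static final Pattern REPEATABLE =
            Pattern.compile("R__\\w+\\.sql");

    static Optional<Kind> classify(String fileName) {
        if (VERSIONED.matcher(fileName).matches()) {
            return Optional.of(Kind.VERSIONED);
        }
        if (REPEATABLE.matcher(fileName).matches()) {
            return Optional.of(Kind.REPEATABLE);
        }
        return Optional.empty(); // Does not follow the naming pattern
    }
}
----

A name with a single underscore after the version, such as `V1_initial_schema_setup.sql`, matches neither pattern in this sketch.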

You should store your SQL scripts in the `src/main/resources/db/migration` directory of your project. If you are using a <<{articles}/building-apps/project-structure/multi-module#,multi-module project>>, you should store the migrations in the module that handles persistence.

For information about writing migrations in other languages than SQL, see the https://documentation.red-gate.com/flyway[Flyway Documentation].

== Migrating on Application Startup

Spring Boot has built-in support for Flyway. If the `org.flywaydb:flyway-core` module is on the classpath, Flyway is automatically executed on application startup.

Flyway has built-in support for clustered environments. If you launch multiple instances of the same application, pointing to the same database, Flyway makes sure that every migration is applied only once, in the correct order.

Unless you are using an in-memory database like H2, you have to add a database-specific Flyway module to the classpath, in addition to the database's own JDBC driver. For example, use `org.flywaydb:flyway-database-postgresql` with PostgreSQL and `org.flywaydb:flyway-mysql` with MySQL.

[NOTE]
See the https://documentation.red-gate.com/flyway/flyway-cli-and-api/supported-databases[Flyway Documentation] for a complete list of supported databases.

Spring Boot declares the modules in its parent POM, so you don't have to look up their versions. To use them, add them to your project's POM file, like this:

[source,xml]
----
<dependencies>
    ...
    <dependency>
        <groupId>org.flywaydb</groupId>
        <artifactId>flyway-core</artifactId>
    </dependency>
    <dependency>
        <groupId>org.postgresql</groupId>
        <artifactId>postgresql</artifactId> <!--1-->
        <scope>runtime</scope>
    </dependency>
    <dependency>
        <groupId>org.flywaydb</groupId>
        <artifactId>flyway-database-postgresql</artifactId> <!--2-->
        <scope>runtime</scope>
    </dependency>
</dependencies>
----
<1> This is the JDBC-driver for PostgreSQL.
<2> This is the Flyway module for PostgreSQL.

By default, Flyway uses your application's primary data source to apply the migrations. This means that the database user that you use to connect to the database must have enough privileges to execute Data Definition Language (DDL) statements.

From a security point of view, it is better to have one database user for DDL, and another for Data Manipulation Language (DML) statements. The DDL user is used by Flyway to migrate the database schema. The DML user is used by the application to query and modify data without touching the schema itself.

To make Flyway use its own data source, set the `spring.flyway.[url,user,password]` configuration properties. If you leave out `spring.flyway.url`, Flyway uses the same URL as the application's primary data source.

For more information, see the https://docs.spring.io/spring-boot/how-to/data-initialization.html#howto.data-initialization.migration-tool.flyway[Spring Boot Documentation].

== Migrating with Maven

Sometimes, you may want to run the Flyway migrations as a separate build step. For example, you may not want to make the DDL user credentials available to the application itself for security reasons. Flyway has a Maven plugin that allows you to run the migration scripts as a part of your build chain.

To run Flyway with Maven, you should still keep the migration scripts in the same directory as you did when running Flyway at application startup. However, you should not add any Flyway dependencies to your project. Instead, you should add the Flyway plugin, like this:

[source,xml]
----
<build>
    <plugins>
        <plugin>
            <groupId>org.flywaydb</groupId>
            <artifactId>flyway-maven-plugin</artifactId>
            <dependencies>
                <dependency>
                    <groupId>org.flywaydb</groupId>
                    <artifactId>flyway-database-postgresql</artifactId>
                    <version>${flyway.version}</version>
                </dependency>
                <dependency>
                    <groupId>org.postgresql</groupId>
                    <artifactId>postgresql</artifactId>
                    <version>${postgresql.version}</version>
                </dependency>
            </dependencies>
        </plugin>
    </plugins>
</build>
----

Note that when you add dependencies to a Maven plugin, you have to specify their versions, even if they have been declared in a parent POM. Spring Boot declares the versions of all its dependencies as properties, so you don't have to look them up yourself.

Now, whenever you want to run Flyway, execute the following command:

[source,terminal]
----
$ mvn -Dflyway.user=YOUR_DDL_USER -Dflyway.password=YOUR_DDL_USER_PASSWORD -Dflyway.url=YOUR_DB_URL flyway:migrate
----

For more information about what you can do with the Flyway Maven plugin and how to configure it, see the https://documentation.red-gate.com/flyway/flyway-cli-and-api/usage/maven-goal[Flyway Documentation].
11 changes: 11 additions & 0 deletions articles/building-apps/application-layer/persistence/index.adoc
@@ -0,0 +1,11 @@
---
title: Persistence
description: How to handle persistence in Vaadin applications.
order: 10
---

= Persistence

// TODO Write an introduction here once I know what to write

section_outline::[]
@@ -0,0 +1,206 @@
---
title: Repositories
description: How to use repositories to store and fetch data.
order: 10
---

= Repositories

The _repository_ was originally introduced as one of the building blocks of tactical Domain-Driven Design, but has since become common in all kinds of business applications, mainly thanks to https://spring.io/projects/spring-data[Spring Data]. A repository is a persistent container of entities that attempts to abstract away the underlying data storage mechanism. At a minimum, it provides methods for basic CRUD operations: Creating, Retrieving, Updating, and Deleting entities.

== Collection Oriented

Collection oriented repositories try to mimic an in-memory collection, such as `Map` or `List`. Once an entity has been added to the repository, any changes made to it are automatically persisted until it has been deleted from the repository. In other words, there is no need for a `save` or `update` method.

A collection oriented, generic repository interface could look like this:

[source,java]
----
public interface Repository<ID, E> {
    Optional<E> get(ID id); // <1>
    void put(E entity); // <2>
    void remove(ID id); // <3>
}
----
<1> You can retrieve entities by their IDs.
<2> You can store entities in the repository.
<3> You can remove entities from the repository.

Creating and storing new entities could look like this:

[source,java]
----
var customer = new Customer();
customer.setName("Acme Incorporated");
repository.put(customer);
----

Retrieving and updating an entity could look like this:

[source,java]
----
repository.get(CustomerId.of("XRxY2r9P")).ifPresent(customer -> {
    customer.setEmail("[email protected]");
    customer.setPhoneNumber("123-456-789");
});
----

Deleting an entity could look like this:

[source,java]
----
repository.remove(CustomerId.of("XRxY2r9P"));
----

Collection oriented repositories can be quite difficult to implement. The repository implementation would have to know when an entity has been changed, so that it can write it to the underlying storage. Handling transactions and errors would also be non-trivial. This is a telling example of the underlying storage mechanism leaking into the repository abstraction.

== Persistence Oriented

Persistence oriented repositories do not try to hide the fact that the data has to be written to, and read from, some kind of external storage. They have separate methods for inserting, updating, and deleting the entity. If the repository is able to deduce whether any given entity has been persisted or not, the `insert` and `update` methods can be combined into a single `save` method. No changes to an entity are ever written to the storage without an explicit call to `save`.

A persistence oriented, generic repository interface could look like this:

[source,java]
----
public interface Repository<ID, E> {
    Optional<E> findById(ID id); // <1>
    E save(E entity); // <2>
    void delete(ID id); // <3>
}
----
<1> You can retrieve entities by their IDs.
<2> You can save entities in the repository.
<3> You can delete entities from the repository.

Note how the method names resemble database operations, instead of in-memory collection operations.

Creating and storing new entities could look like this:

[source,java]
----
var customer = new Customer();
customer.setName("Acme Incorporated");
customer = repository.save(customer);
----

The `save` method returns an entity, which can be the same instance as the one that was passed to the method, or a new one. This makes it easier to implement the repository, as some persistence frameworks, such as <<jpa#,JPA>>, work in this way. This is another example of the underlying technology leaking into the repository abstraction.

Retrieving and updating an entity could look like this:

[source,java]
----
repository.findById(CustomerId.of("XRxY2r9P")).ifPresent(customer -> {
    customer.setEmail("[email protected]");
    customer.setPhoneNumber("123-456-789");
    repository.save(customer);
});
----

Deleting an entity could look like this:

[source,java]
----
repository.delete(CustomerId.of("XRxY2r9P"));
----

Persistence oriented repositories are easier to implement than collection oriented repositories, because their API aligns with the read and write operations of the underlying storage. Regardless of whether you are using a relational database, a flat file, or some external web service, you can write persistence oriented repositories for them all.
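
To illustrate the explicit `save` semantics, here is a minimal in-memory sketch of a persistence oriented repository. Everything except the shape of the `Repository` interface is an assumption made for this example: the `InMemoryRepository` class, the `idOf` and `copy` functions, and the reduced `Customer` class are hypothetical. The map stands in for the external storage, and entities are copied on the way in and out so that the stored state changes only when `save` is called.

[source,java]
----
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Same shape as the persistence oriented interface shown earlier on this page.
interface Repository<ID, E> {
    Optional<E> findById(ID id);
    E save(E entity);
    void delete(ID id);
}

// Hypothetical in-memory implementation; the map stands in for the external storage.
final class InMemoryRepository<ID, E> implements Repository<ID, E> {
    private final Map<ID, E> storage = new HashMap<>();
    private final Function<E, ID> idOf;  // Reads the ID of an entity
    private final UnaryOperator<E> copy; // Creates an independent copy of an entity

    InMemoryRepository(Function<E, ID> idOf, UnaryOperator<E> copy) {
        this.idOf = idOf;
        this.copy = copy;
    }

    @Override
    public Optional<E> findById(ID id) {
        // Hand out a copy, so that callers cannot change the stored state directly
        return Optional.ofNullable(storage.get(id)).map(copy);
    }

    @Override
    public E save(E entity) {
        // Store a copy, so that later changes require another call to save
        storage.put(idOf.apply(entity), copy.apply(entity));
        return entity;
    }

    @Override
    public void delete(ID id) {
        storage.remove(id);
    }
}

// Hypothetical entity, reduced to two fields for the demonstration.
final class Customer {
    final String id;
    String email;

    Customer(String id, String email) {
        this.id = id;
        this.email = email;
    }

    Customer copy() {
        return new Customer(id, email);
    }
}
----

You would create the repository with `new InMemoryRepository<String, Customer>(c -> c.id, Customer::copy)`. A change made to an entity returned by `findById` only becomes visible to later lookups after the entity has been passed to `save` again.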

*Unless you have a good reason for choosing collection oriented repositories, you should use persistence oriented repositories in your Vaadin applications.*

== Query Methods

Although retrieving an entity by its ID is an important operation, it is not enough in most business applications. You need to be able to retrieve more than one entity at the same time, based on different criteria. If the dataset is big, you need to be able to split it into smaller pages and load them one at a time.

You can add query methods to your repositories. For example, here is a repository with two query methods:

[source,java]
----
public interface CustomerRepository extends Repository<CustomerId, Customer> {
    List<Customer> findByName(String searchTerm, int maxResult);
    Page<Customer> findAll(PageRequest pageRequest);
}
----

The first method finds all customers whose names match the given search term. It is good practice to always limit the size of your query results, which is why the method also has a `maxResult` parameter. This protects your application from running out of memory in case the query result turns out to be much larger than anticipated. If too many customers are returned, the user has to tweak the search term and try again.

The second method finds all the customers in the data storage, but splits the result into pages. The `PageRequest` object contains information about where to start retrieving data, how many customers to retrieve, how to sort them, and so on.
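
The `Page` and `PageRequest` types are not defined on this page. As a rough sketch of what they could contain, here are minimal, hypothetical versions together with a helper that cuts a page out of an in-memory list. These are not the Spring Data classes of the same names, sorting is left out, and the `Paging` helper is an assumption made for illustration.

[source,java]
----
import java.util.List;

// Hypothetical minimal paging request: which page to fetch and how big it is.
record PageRequest(int pageNumber, int pageSize) {
    PageRequest {
        if (pageNumber < 0 || pageSize < 1) {
            throw new IllegalArgumentException("Invalid page request");
        }
    }

    static PageRequest ofSize(int pageSize) {
        return new PageRequest(0, pageSize); // Start from the first page
    }

    PageRequest next() {
        return new PageRequest(pageNumber + 1, pageSize);
    }

    long offset() {
        return (long) pageNumber * pageSize;
    }
}

// Hypothetical page of results, with a flag telling whether more pages follow.
record Page<T>(List<T> content, PageRequest request, boolean hasNext) {}

final class Paging {
    // Cuts one page out of an in-memory list. A database-backed repository
    // would typically let the database do the limiting instead.
    static <T> Page<T> page(List<T> all, PageRequest request) {
        int from = (int) Math.min(request.offset(), all.size());
        int to = (int) Math.min(request.offset() + request.pageSize(), all.size());
        return new Page<>(List.copyOf(all.subList(from, to)), request, to < all.size());
    }
}
----

A caller would load the first page with `PageRequest.ofSize(...)` and keep calling `next()` on the request for as long as `hasNext` is `true`.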

As long as you only have a handful of query methods, keeping them in the repository is fine. However, once the number of query methods starts to grow, you may run into problems. As the query methods become more specific, they also become more difficult to reuse. Over time, your repository may fill up with query methods that are _almost_ identical. When a new, similar use case shows up, it is easier to add a new query method than to figure out which of the old ones to reuse.

One way of solving this problem is to introduce _query specifications_. A query specification is an object that explains which entities should be included in the query result. In the earlier example, you can replace all the query methods with a single one:

[source,java]
----
public interface CustomerRepository extends Repository<CustomerId, Customer> {
    Page<Customer> findBySpecification(CustomerSpecification specification,
            PageRequest pageRequest);
}
----

You would then use the query method like this:

[source,java]
----
var result = customerRepository.findBySpecification(
    CustomerSpecification.nameEquals("ACME")
        .and(CustomerSpecification.countryEquals(Country.US)
            .or(CustomerSpecification.countryEquals(Country.FI))
        ),
    PageRequest.ofSize(10)
);
...
----

This query method would return the first 10 customers whose names equal "ACME" and who are located in either the U.S. or Finland.

The challenge with this approach is that it is difficult, but not impossible, to build specification objects that are not coupled to the technology used to implement the repository. However, most business applications do not change their databases, nor do they have to support multiple repository implementations. Since the repositories are already a leaky abstraction, you can make the specifications implementation specific to make things easier.
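
To show the idea without tying it to any persistence technology, here is a sketch of a specification that is evaluated in memory. It is an illustration under stated assumptions, not how you would implement specifications against a database: the `isSatisfiedBy` method, the reduced `Customer` record, the `Country` values, and the `CustomerSearch` helper are all hypothetical.

[source,java]
----
import java.util.List;

enum Country { US, FI, SE }

// Hypothetical entity, reduced to the two fields used by the specifications.
record Customer(String name, Country country) {}

// Hypothetical specification that can be combined with 'and' and 'or'.
interface CustomerSpecification {
    boolean isSatisfiedBy(Customer customer);

    default CustomerSpecification and(CustomerSpecification other) {
        return customer -> isSatisfiedBy(customer) && other.isSatisfiedBy(customer);
    }

    default CustomerSpecification or(CustomerSpecification other) {
        return customer -> isSatisfiedBy(customer) || other.isSatisfiedBy(customer);
    }

    static CustomerSpecification nameEquals(String name) {
        return customer -> customer.name().equals(name);
    }

    static CustomerSpecification countryEquals(Country country) {
        return customer -> customer.country() == country;
    }
}

final class CustomerSearch {
    // Filters an in-memory list; the result size is always limited.
    static List<Customer> find(List<Customer> all, CustomerSpecification specification,
            int maxResult) {
        return all.stream()
                .filter(specification::isSatisfiedBy)
                .limit(maxResult)
                .toList();
    }
}
----

A specification that has to run inside a database cannot simply inspect objects in memory like this. It has to be translated into a query, which is where the coupling to the repository technology comes from.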

You can find examples of how to implement specification queries on the <<jpa#,JPA>> and <<jooq#,jOOQ>> documentation pages.

== Query Objects

Query specifications are useful when you are interested in fetching whole entities. However, you often need to write queries that only include a small part of the entity. For example, if you are building a customer list view that only shows the customers' names and email addresses, there is no point in fetching the complete `Customer` entity. With a query method for the list view added, the repository looks like this:

[source,java]
----
public interface CustomerRepository extends Repository<CustomerId, Customer> {
    Page<Customer> findBySpecification(CustomerSpecification specification,
            PageRequest pageRequest);
    // tag::snippet[]
    Page<CustomerListItem> findListItemsBySpecification(
            CustomerSpecification specification,
            PageRequest pageRequest);
    record CustomerListItem(CustomerId id, String name, EmailAddress email) {}
    // end::snippet[]
}
----

Again, if you only have a handful of these queries, you can add them to the repository interface. However, if you have many different views, and every view needs its own query, the repository interface again risks becoming unstructured and difficult to maintain.

To address this issue, you should move all query methods that don't return entities to their own _query objects_. After moving the query method from the example above to its own query object, you end up with something like this:

[source,java]
----
public interface CustomerListQuery {
    Page<CustomerListItem> findBySpecification(
            CustomerSpecification specification,
            PageRequest pageRequest);
    public record CustomerListItem(CustomerId id, String name, EmailAddress email) {}
}
----

Query objects read from the same data source as the repositories. You can create as many query objects as you need without cluttering your repositories.

The query objects do not have to be tied to a particular entity. Summary views, for example, often need complex queries that join data from different types of entities together. Putting queries like that in repositories can be difficult. Either you can't find a single repository that feels like a good candidate, or you have multiple candidates to choose from. Creating a separate query object solves this problem.
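
As an illustration of a query object that spans more than one entity type, here is a sketch of a summary query that counts the orders of every customer. All the names in it are hypothetical, the entities are reduced to a minimum, and the data is kept in memory. A real implementation would run a joining query against the same data source as the repositories.

[source,java]
----
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical entities, reduced to the fields needed by the query.
record Customer(String id, String name) {}
record Order(String id, String customerId) {}

// Hypothetical query object: returns summary rows, not entities.
interface CustomerOrderSummaryQuery {
    List<Row> findAll(int maxResult);

    record Row(String customerName, long orderCount) {}
}

// In-memory stand-in for an implementation that would normally query a database.
final class InMemoryCustomerOrderSummaryQuery implements CustomerOrderSummaryQuery {
    private final List<Customer> customers;
    private final List<Order> orders;

    InMemoryCustomerOrderSummaryQuery(List<Customer> customers, List<Order> orders) {
        this.customers = customers;
        this.orders = orders;
    }

    @Override
    public List<Row> findAll(int maxResult) {
        // Count the orders per customer ID, then combine the counts with the names
        Map<String, Long> countByCustomer = orders.stream()
                .collect(Collectors.groupingBy(Order::customerId, Collectors.counting()));
        return customers.stream()
                .limit(maxResult)
                .map(customer -> new Row(customer.name(),
                        countByCustomer.getOrDefault(customer.id(), 0L)))
                .toList();
    }
}
----

Neither a customer repository nor an order repository is an obvious home for this query, which is why it gets an interface of its own.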

[NOTE]
If you know the Command Query Responsibility Segregation (CQRS) architectural pattern, the idea of query objects may sound familiar. However, there is a big difference: Whereas CQRS uses different data models for writing and reading, query objects and repositories operate on the same data model, using the same data source.

// TODO Add link to using CQRS in Vaadin app, when that page has been written sometime in the future.

== Building

section_outline::[]