Background jobs #3759

Draft
wants to merge 6 commits into
base: latest
@@ -0,0 +1,9 @@
---
title: Concurrent Jobs
description: How to handle concurrent job executions.
order: 30
---

= Concurrent Jobs

// TODO Write about running the same job concurrently inside the same VM and on different VMs (but do this after you have written about server push in the presentation layer)
264 changes: 264 additions & 0 deletions articles/building-apps/application-layer/background-jobs/index.adoc
@@ -0,0 +1,264 @@
---
title: Background Jobs
description: How to handle background jobs in Vaadin applications.
order: 11
---

= Background Jobs

Many business applications need to perform tasks in background threads. These could be long-running tasks triggered by the user, or scheduled jobs that run automatically at a specific time of day, or at specific intervals.

Working with more than one thread increases the risk of bugs. Furthermore, there are many different ways of implementing background jobs. To reduce the risk, you should learn one way, and then apply it consistently in all your Vaadin applications.

== Threads

Whenever you work with background threads in a Vaadin application, you should never create new `Thread` objects directly. First, new threads are expensive to start. Second, the number of concurrent threads in a Java application is limited. An exact number is impossible to give, but typically it is measured in thousands.

Instead, you should use thread pools, or virtual threads.

A thread pool consists of a queue, and a pool of running threads. The threads pick tasks from the queue and execute them. When the thread pool receives a new job, it adds it to the queue.
The queue has an upper size limit. If the queue is full, the thread pool rejects the job, and throws an exception.
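
The rejection behavior can be illustrated with the JDK's own `ThreadPoolExecutor`. The following is a minimal sketch, not Spring's executor: the class name `BoundedPoolDemo`, the pool size of one, and the queue capacity of one are assumptions chosen to make the rejection easy to trigger.

[source,java]
----
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedPoolDemo {

    /**
     * Submits the given number of blocking tasks to a pool with one thread
     * and a queue of capacity one, and returns how many tasks were rejected.
     */
    static int countRejections(int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    pool.execute(() -> awaitQuietly(release));
                } catch (RejectedExecutionException e) {
                    rejected++; // Thread busy and queue full
                }
            }
        } finally {
            release.countDown(); // Let the blocked tasks finish
            pool.shutdown();
        }
        return rejected;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("Rejected tasks: " + countRejections(3));
    }
}
----

With one thread and one queue slot, the first task occupies the thread, the second waits in the queue, and the third is rejected with a `RejectedExecutionException`.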

Virtual threads were added in Java 21. Whereas ordinary threads are managed by the operating system, virtual threads are managed by the Java virtual machine. They are cheaper to start and run, which means you can have a much higher number of concurrent virtual threads than ordinary threads.

See the https://docs.oracle.com/en/java/javase/21/core/virtual-threads.html[Java Documentation] for more information about virtual threads.
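
As a rough sketch of how this looks in plain Java, the following example runs every task on its own virtual thread. It requires Java 21 or later. The class name `VirtualThreadDemo` and the method name are made up for illustration.

[source,java]
----
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class VirtualThreadDemo {

    /** Runs each task on its own virtual thread and counts how many tasks saw a virtual thread. */
    static int runOnVirtualThreads(int tasks) {
        List<Future<Boolean>> results = new ArrayList<>();
        // close() at the end of the try block waits for all submitted tasks to finish
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                Callable<Boolean> task = () -> Thread.currentThread().isVirtual();
                results.add(executor.submit(task));
            }
        }
        int virtualCount = 0;
        for (Future<Boolean> result : results) {
            try {
                if (result.get()) {
                    virtualCount++;
                }
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        }
        return virtualCount;
    }

    public static void main(String[] args) {
        System.out.println("Tasks that ran on virtual threads: " + runOnVirtualThreads(1000));
    }
}
----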

== Task Execution

The background jobs themselves should not need to manage their own thread pools, or virtual threads. Instead, they should use _executors_. An executor is an object that takes a `Runnable`, and executes it at some point in the future. Spring provides a `TaskExecutor` that you should use in your background jobs.

By default, Spring Boot sets up a `ThreadPoolTaskExecutor` in your application context. You can tweak the parameters of this executor through the `spring.task.executor.*` configuration properties.
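
For example, a configuration along these lines could go into `application.properties`. The values are placeholders chosen for illustration, not recommendations:

[source,properties]
----
# Illustrative values only; tune them for your own workload
spring.task.executor.pool.core-size=4
spring.task.executor.pool.max-size=16
spring.task.executor.pool.queue-capacity=100
spring.task.executor.thread-name-prefix=background-
----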

If you want to use virtual threads, you can enable them by setting the `spring.threads.virtual.enabled` configuration property to `true`. In this case, Spring Boot sets up a `SimpleAsyncTaskExecutor`, and creates a new virtual thread for every task.

You can interact with the `TaskExecutor` either directly, or declaratively through annotations.

When interacting with it directly, you inject an instance of `TaskExecutor` into your code, and submit work to it. Here is an example of a class that uses the `TaskExecutor`:

[source,java]
----
import org.springframework.core.task.TaskExecutor;
import org.springframework.stereotype.Component;

@Component
public class MyWorker {

    private final TaskExecutor taskExecutor;

    MyWorker(TaskExecutor taskExecutor) {
        this.taskExecutor = taskExecutor;
    }

    public void performTask() {
        taskExecutor.execute(() -> {
            System.out.println("Hello, I'm running inside thread " + Thread.currentThread());
        });
    }
}
----

[IMPORTANT]
When you inject the `TaskExecutor`, you have to name the parameter `taskExecutor`. The application context may contain more than one bean that implements the `TaskExecutor` interface. If the parameter name does not match the name of the bean, Spring does not know which instance to inject.

If you want to use annotations, you have to enable them before you can use them. Do this by adding the `@EnableAsync` annotation to your main application class, or any other `@Configuration` class:

[source,java]
----
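import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.scheduling.annotation.EnableAsync;

@SpringBootApplication
@EnableAsync
public class Application {

    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}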
----

You can now use the `@Async` annotation to tell Spring to execute your code in a background thread:

[source,java]
----
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Component;

@Component
public class MyWorker {

    @Async
    public void performTask() {
        System.out.println("Hello, I'm running inside thread " + Thread.currentThread());
    }
}
----

See the https://docs.spring.io/spring-framework/reference/integration/scheduling.html[Spring Documentation] for more information about task execution.

=== Caveats

Using annotations makes the code more concise. However, they come with some caveats you need to be aware of.

First, if you forget to add `@EnableAsync` to your application, and you call an `@Async` method, it executes in the calling thread, not in a background thread.

Second, you cannot call an `@Async` method from within the bean itself. This is because Spring by default uses proxies to process `@Async` annotations, and local method calls bypass the proxy. In the following example, `performTask()` is executed in a background thread, and `performAnotherTask()` in the calling thread:

[source,java]
----
@Component
public class MyWorker {

    @Async
    public void performTask() {
        System.out.println("Hello, I'm running inside thread " + Thread.currentThread());
    }

    public void performAnotherTask() {
        performTask(); // This call runs in the calling thread
    }
}
----

If you interact with `TaskExecutor` directly, you avoid this problem:

[source,java]
----
@Component
public class MyWorker {

    private final TaskExecutor taskExecutor;

    MyWorker(TaskExecutor taskExecutor) {
        this.taskExecutor = taskExecutor;
    }

    public void performTask() {
        taskExecutor.execute(() -> {
            System.out.println("Hello, I'm running inside thread " + Thread.currentThread());
        });
    }

    public void performAnotherTask() {
        performTask(); // This call runs in a background thread
    }
}
----

In this case, both `performTask()` and `performAnotherTask()` hand the work over to a background thread.
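
The reason local calls bypass the proxy can be shown without Spring. The following sketch uses a JDK dynamic proxy as a simplified stand-in for Spring's proxy. The names `SelfInvocationDemo`, `Worker`, and `recordingProxy` are hypothetical, and Spring's actual proxying mechanism is more involved than this.

[source,java]
----
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class SelfInvocationDemo {

    interface Worker {
        void performTask();

        void performAnotherTask();
    }

    static class MyWorker implements Worker {
        @Override
        public void performTask() {
            // Stand-in for the real work
        }

        @Override
        public void performAnotherTask() {
            performTask(); // Local call on "this": the proxy never sees it
        }
    }

    /** Wraps the target in a proxy that records the name of every call passing through it. */
    static Worker recordingProxy(Worker target, List<String> intercepted) {
        return (Worker) Proxy.newProxyInstance(
                Worker.class.getClassLoader(),
                new Class<?>[] {Worker.class},
                (proxy, method, args) -> {
                    intercepted.add(method.getName());
                    return method.invoke(target, args);
                });
    }

    /** Returns the calls the proxy intercepted when performAnotherTask() is called through it. */
    static List<String> interceptedCalls() {
        List<String> intercepted = new ArrayList<>();
        Worker worker = recordingProxy(new MyWorker(), intercepted);
        worker.performAnotherTask();
        return intercepted;
    }

    public static void main(String[] args) {
        System.out.println(interceptedCalls());
    }
}
----

Only `performAnotherTask` is recorded. The nested call to `performTask()` goes straight to the target object, so any behavior the proxy would have added, such as switching threads, is skipped.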

== Task Scheduling

Spring also has built-in support for scheduling tasks through a `TaskScheduler`. You can interact with it either directly, or through annotations. In both cases, you have to enable it by adding the `@EnableScheduling` annotation to your main application class, or any other `@Configuration` class:

[source,java]
----
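import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.scheduling.annotation.EnableScheduling;

@SpringBootApplication
@EnableScheduling
public class Application {

    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}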
----

When interacting with the `TaskScheduler` directly, you inject it into your code, and schedule work with it. Here is an example class that uses the `TaskScheduler`:

[source,java]
----
import java.time.Duration;

import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.ApplicationListener;
import org.springframework.scheduling.TaskScheduler;
import org.springframework.stereotype.Component;

@Component
class MyScheduler implements ApplicationListener<ApplicationReadyEvent> {

    private final TaskScheduler taskScheduler;

    MyScheduler(TaskScheduler taskScheduler) {
        this.taskScheduler = taskScheduler;
    }

    @Override
    public void onApplicationEvent(ApplicationReadyEvent event) {
        taskScheduler.scheduleAtFixedRate(this::performTask, Duration.ofMinutes(5));
    }

    private void performTask() {
        System.out.println("Hello, I'm running inside thread " + Thread.currentThread());
    }
}
----

This example calls `performTask()` every 5 minutes, starting when the application is ready.

You can achieve the same using the `@Scheduled` annotation, like this:

[source,java]
----
import java.util.concurrent.TimeUnit;

import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
class MyScheduler {

    @Scheduled(fixedRate = 5, timeUnit = TimeUnit.MINUTES)
    public void performTask() {
        System.out.println("Hello, I'm running inside thread " + Thread.currentThread());
    }
}
----

See the https://docs.spring.io/spring-framework/reference/integration/scheduling.html[Spring Documentation] for more information about task scheduling.

=== Caveats

Spring uses a separate thread pool for task scheduling. The tasks themselves are also executed in this thread pool. If you have a small number of short tasks, this is not a problem. However, if you have many tasks, or long-running tasks, you may run into problems. For instance, your scheduled jobs may stop running because the thread pool has become exhausted.

To avoid problems, you should use the scheduling thread pool to schedule jobs, and then hand them over to the task execution thread pool for execution. You can combine the `@Async` and `@Scheduled` annotations, provided that both `@EnableAsync` and `@EnableScheduling` are present, like this:

[source,java]
----
@Component
class MyScheduler {

    @Scheduled(fixedRate = 5, timeUnit = TimeUnit.MINUTES)
    @Async
    public void performTask() {
        System.out.println("Hello, I'm running inside thread " + Thread.currentThread());
    }
}
----

You can also interact with the `TaskScheduler` and `TaskExecutor` directly, like this:

[source,java]
----
@Component
class MyScheduler implements ApplicationListener<ApplicationReadyEvent> {

    private final TaskScheduler taskScheduler;
    private final TaskExecutor taskExecutor;

    MyScheduler(TaskScheduler taskScheduler, TaskExecutor taskExecutor) {
        this.taskScheduler = taskScheduler;
        this.taskExecutor = taskExecutor;
    }

    @Override
    public void onApplicationEvent(ApplicationReadyEvent event) {
        taskScheduler.scheduleAtFixedRate(this::performTask, Duration.ofMinutes(5));
    }

    private void performTask() {
        taskExecutor.execute(() -> {
            System.out.println("Hello, I'm running inside thread " + Thread.currentThread());
        });
    }
}
----
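
The effect of the hand-over can be sketched with plain JDK executors. This is not Spring code: a single-threaded `ScheduledExecutorService` stands in for the scheduling thread pool, a fixed thread pool stands in for the task execution thread pool, and all names in the example are made up.

[source,java]
----
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class SchedulerHandOffDemo {

    /**
     * Schedules a long job and a short job on a single-threaded scheduler.
     * The long job is handed over to a worker pool, so the scheduler thread stays free.
     * Returns true if the short job completed before the long job was released.
     */
    static boolean shortJobRunsWhileLongJobIsBlocked() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        ExecutorService workers = Executors.newFixedThreadPool(2);
        CountDownLatch longJobMayFinish = new CountDownLatch(1);
        CountDownLatch shortJobDone = new CountDownLatch(1);
        try {
            // The scheduler thread only submits the long job; a worker thread runs it
            scheduler.schedule(
                    () -> workers.execute(() -> await(longJobMayFinish, 10)),
                    0, TimeUnit.MILLISECONDS);
            // The short job runs on the scheduler thread itself
            scheduler.schedule(() -> shortJobDone.countDown(), 10, TimeUnit.MILLISECONDS);
            return await(shortJobDone, 5);
        } finally {
            longJobMayFinish.countDown(); // Release the long job
            scheduler.shutdown();
            workers.shutdown();
        }
    }

    private static boolean await(CountDownLatch latch, long seconds) {
        try {
            return latch.await(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Short job ran on time: " + shortJobRunsWhileLongJobIsBlocked());
    }
}
----

If the long job ran directly on the scheduler thread instead, the short job would have to wait until the long job finished, which is the exhaustion problem described above.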

== Building

// TODO Come up with a better heading, and maybe a short intro to this section.

section_outline::[]
91 changes: 91 additions & 0 deletions articles/building-apps/application-layer/background-jobs/jobs.adoc
@@ -0,0 +1,91 @@
---
title: Implementing Jobs
description: How to implement background jobs.
order: 10
---

= Implementing Jobs

When you implement a background job, you should decouple its implementation from how it is triggered, and where it is executed. This makes it possible to trigger the job in multiple ways.

For instance, you may want to run the job every time the application starts up. In this case, you may want to run it in the main thread, blocking the initialization of the rest of the application until the job is finished. You may also want to run the job in a background thread every day at midnight, or whenever a certain application event is published.

image::images/job-and-triggers.png[A job with three triggers]

In code, a job is a Spring bean, annotated with the `@Component` or `@Service` annotation. It contains one or more methods that, when called, execute the job in the calling thread, like this:

[source,java]
----
import org.springframework.stereotype.Component;

@Component
public class MyBackgroundJob {

    public void performBackgroundJob() {
        ...
    }
}
----

If the job is <<triggers#,triggered>> from within the same package, the class should be package-private. Otherwise, it has to be public.
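
As a rough, Spring-free illustration of this decoupling, the sketch below runs the same job from two different triggers: once in the calling thread and once through an executor. The class `JobTriggerDemo` and the two trigger methods are hypothetical; the job itself knows nothing about threads.

[source,java]
----
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class JobTriggerDemo {

    /** The job: it runs in whatever thread calls it. */
    static class MyBackgroundJob {
        private final AtomicInteger runs = new AtomicInteger();

        void performBackgroundJob() {
            runs.incrementAndGet(); // Stand-in for the real work
        }

        int runs() {
            return runs.get();
        }
    }

    /** Trigger 1: runs the job in the calling thread, for instance during startup. */
    static void triggerInCallingThread(MyBackgroundJob job) {
        job.performBackgroundJob();
    }

    /** Trigger 2: hands the job over to an executor. */
    static void triggerInBackground(MyBackgroundJob job, ExecutorService executor) {
        executor.execute(job::performBackgroundJob);
    }

    /** Fires both triggers once and returns how many times the job ran. */
    static int runBothTriggers() {
        MyBackgroundJob job = new MyBackgroundJob();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            triggerInCallingThread(job);
            triggerInBackground(job, executor);
        } finally {
            executor.shutdown();
        }
        try {
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return job.runs();
    }

    public static void main(String[] args) {
        System.out.println("Job runs: " + runBothTriggers());
    }
}
----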

== Transactions

If the job works on the database, it should manage its own transactions. Because a job is a Spring bean, you can use either declarative, or programmatic transaction management. Here is the earlier example, with declarative transactions:

[source,java]
----
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Propagation;
import org.springframework.transaction.annotation.Transactional;

@Component
public class MyBackgroundJob {

    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void performBackgroundJob() {
        ...
    }
}
----

This guarantees that the job runs inside a new transaction, regardless of how it is triggered.

== Security

Unlike <<../application-services#,application services>>, background jobs should _not_ use method security. The reason is that Spring Security uses the `SecurityContext` to access information about the current user. This context is typically thread local, which means it is not available in a background thread. Therefore, whenever the job is executed by a background thread, Spring would deny access.

If the background job needs information about the current user, this information should be passed to it by the <<triggers#,trigger>>, as an immutable method parameter.
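
The following plain-Java sketch illustrates the idea. It does not use Spring Security: the `CURRENT_USER` thread local is a hypothetical stand-in for a thread-local security context, and the `JobUser` record and the method names are made up for this example.

[source,java]
----
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class UserPassingDemo {

    /** Hypothetical stand-in for a thread-local security context. */
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    /** Immutable snapshot of the user information the job needs. */
    record JobUser(String username) {}

    /** The job receives the user as a parameter instead of reading the thread-local context. */
    static String performBackgroundJob(JobUser user) {
        return "context=" + CURRENT_USER.get() + ", parameter=" + user.username();
    }

    /** The trigger reads the context in the calling thread and passes a snapshot to the job. */
    static String trigger() {
        CURRENT_USER.set("alice");
        JobUser user = new JobUser(CURRENT_USER.get());
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = () -> performBackgroundJob(user);
            return executor.submit(task).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            CURRENT_USER.remove();
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(trigger());
    }
}
----

Inside the background thread, the thread-local value is `null`, whereas the value passed as a parameter is still available.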

== Batch Jobs

If you are writing a batch job that processes multiple inputs, you should consider implementing two versions of it: one that processes all applicable inputs, and another that processes a given set of inputs. For example, a batch job that generates invoices for shipped orders could look like this:

[source,java]
----
@Component
public class InvoiceCreationJob {

    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void createInvoicesForOrders(Collection<OrderId> orders) {
        ...
    }

    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void createInvoicesForAllApplicableOrders() {
        ...
    }
}
----

In this example, the first method creates invoices for the orders whose IDs have been passed as parameters. The second method creates invoices for all orders that have been shipped and not yet invoiced.

Implementing batch jobs like this does not require much effort if done from the start, but allows for flexibility that may be useful. Continuing on the invoice generation example, you may discover a bug in production. This bug has caused some orders to have bad data in the database. As a result, the batch job has not been able to generate invoices for them. Fixing the bug is easy, but your users do not want to wait for the next batch run to occur. Instead, as a part of the fix, you can add a button to the user interface that allows a user to trigger invoice generation for an individual order.
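
One possible way to structure the two methods is to let the "all applicable" version select its inputs and then delegate to the version that takes explicit inputs. The sketch below shows this in plain Java, without Spring or transactions. The in-memory collections are hypothetical stand-ins for the database, and `OrderId` is modeled as a simple record because its real definition is not shown here.

[source,java]
----
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class InvoiceCreationJobSketch {

    record OrderId(long value) {}

    // Hypothetical in-memory stand-ins for the order and invoice tables
    private final Set<OrderId> shippedOrders = new LinkedHashSet<>();
    private final List<OrderId> invoicedOrders = new ArrayList<>();

    void markShipped(OrderId order) {
        shippedOrders.add(order);
    }

    List<OrderId> invoicedOrders() {
        return List.copyOf(invoicedOrders);
    }

    /** Creates invoices for exactly the given orders. */
    void createInvoicesForOrders(Collection<OrderId> orders) {
        invoicedOrders.addAll(orders);
    }

    /** Selects all shipped orders without an invoice and delegates to the method above. */
    void createInvoicesForAllApplicableOrders() {
        List<OrderId> applicable = shippedOrders.stream()
                .filter(order -> !invoicedOrders.contains(order))
                .toList();
        createInvoicesForOrders(applicable);
    }
}
----

A button in the user interface could then call `createInvoicesForOrders` with a single order, while the scheduled batch run keeps calling `createInvoicesForAllApplicableOrders`.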

== Idempotent Jobs

Whenever you build a background job that updates or generates data, you should consider making the job _idempotent_. An idempotent job leaves the database in the same state regardless of how many times it has been executed on the same input.

For example, a job that generates invoices for shipped orders should always check that no invoice already exists before it generates a new one. Otherwise, some customers may end up getting multiple invoices because of an error somewhere.
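
A minimal sketch of such a check, using a hypothetical in-memory map in place of the invoice table, could look like this. The class and method names are made up for illustration, and a real implementation would also have to consider concurrent runs, for instance by relying on a unique constraint in the database.

[source,java]
----
import java.util.LinkedHashMap;
import java.util.Map;

class IdempotentInvoiceJobSketch {

    // Hypothetical in-memory stand-in for the invoice table: order ID -> invoice number
    private final Map<Long, String> invoicesByOrderId = new LinkedHashMap<>();
    private int nextInvoiceNumber = 1;

    /** Creates an invoice for the order unless one already exists. Returns true if a new invoice was created. */
    boolean createInvoiceIfMissing(long orderId) {
        if (invoicesByOrderId.containsKey(orderId)) {
            return false; // Already invoiced: running the job again changes nothing
        }
        invoicesByOrderId.put(orderId, "INV-" + nextInvoiceNumber++);
        return true;
    }

    int invoiceCount() {
        return invoicesByOrderId.size();
    }
}
----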

How to make a job idempotent depends on the job itself. It is therefore outside the scope of this documentation page.