Web sites and other distributed, multi-user systems present unique challenges for concurrent access to shared state. In this post we'll take a look at a simple strategy (with one big gotcha) for achieving distributed resource synchronization in the Spring JPA environment.
Case Study: Volunteer Signup
Let us imagine we have built a web portal that enables organizers to create jobs for which users sign up on a first-come, first-serve basis. For simplicity, we will assume that a user can have at most one job. So an initial simple job signup on the database level is merely:
UPDATE user SET job_id = ? WHERE id = ?;
And our Spring/JPA implementation might look like:
@Transactional
public void assignJob(User user, Job job) {
user.setJob(job);
}
Since this wouldn't be much of a blog post if that was the extent of the requirement, let's make things a little more complicated and add an important specification: Each job has a maximum number of slots which can be filled:
@Transactional
public boolean assignJob(User user, Job job) {
int count = jobRepository.countAssignedUsers(job);
if (count >= job.getMaxCount()) {
return false;
}
user.setJob(job);
return true;
}
It is easy to see the pitfall with such an approach, however. Some jobs might of course be more desirable than others which can result in a concurrency issue: two threads arriving at this method simultaneously will get the same value for count
and we could end up with an over-booked gig. If we were developing an extremely simple, single-server web service we could simply make this method synchronized
and be done with it; this solution would not help in a multi-machine environment where two users might not hit the same VM.
Database Level Locking
Thus, we need to move the synchronization out from the VM and onto a shared resource: the data server. In MySQL this is accomplished by modifying a SELECT
query:
SELECT * FROM job WHERE id = ? FOR UPDATE;
When this statement is executed, other transactions will block on the SELECT
statement until the current transaction completes (either committed or rolled back). The syntax varies by RDBMS, but JPA has an abstraction for this functionality called LockModeType
. The various Lock Modes available are out of the scope of this blog post (a good overview is available here), but for our purposes LockModeType.PESSIMISTIC_WRITE
translates to acquiring a row level lock per the SQL statement above.
We can acquire a lock in several ways: Spring offers a @Lock
annotation to decorate JPA repository methods; for more fine-grained control it is often desired to access the JPA EntityManager
directly. In a Spring Service
class, this is as simple as injecting the EntityManager via the JPA PersistenceContext
annotation (not the usual Inject
or Autowired
):
@PersistenceContext
private EntityManager entityManager;
@Transactional
public boolean assignJob(User user, Job job) {
entityManager.refresh(job, LockModeType.PESSIMISTIC_WRITE);
// remainder of method omitted
}
This has the effect of adding a global synchronized
keyword to the method: only a single thread across a distributed system will get the lock at one time. Our volunteer coordinators can breathe a big sigh of relief!
Caveat: Repeatable Reads
There is one important caveat when using this pattern for a check-then-act operation (checking if a condition is met before continuing) in a situation such as this: the lock acquisition must be the very first operation performed in the transaction. This is premise is so important, here is that statement a second time called out as a block quote:
Lock acquisition must be the very first operation performed in a transaction
Code such as the following will still exhibit concurrency failures:
@Transactional
public boolean assignJob(Long userId, Job job) {
User user = userRepository.findOne(userId); // 1
entityManager.refresh(job, LockModeType.PESSIMISTIC_WRITE); // 2
// remainder of method omitted
}
The reason is a feature found in all enterprise-grade RDBMS: Transaction isolation. While a full treatment of RDBMS transaction isolation levels is also out of the scope of this post, the key concept is that at levels of REPEATABLE_READ
(the MySQL default) or higher, the first query in a transaction serves to capture a snapshot of the state of the entire database, which lasts for the duration of that transaction.
This means that the SELECT query to look up a user at #1
above will capture all of the database state (including the number of users assigned to every job) at that point in time. Two threads arriving at #1
simultaneously will report the same number of users assigned to a job, even though that count is not queried until after the lock is acquired at #2
. REPEATABLE_READ
isolation means that, for the duration of the transaction, commits in other transaction are not visible.
It is possible to work around this behavior by explicitly setting a lower transaction level (such as READ_COMMITTED
), but this is undesirable: even if the reduced transactional safety is acceptable, modifying the transaction level on an ad hoc basis is error prone and an ongoing maintenance challenge. Acquiring the lock at the start of the transaction ensures the code functions as expected.
In a future post, I will cover how (and why) this technique can be used to simulate table-level locking. Any questions or thoughts, please let me know in the comments!