ReentrantLock vs Synchronized - Performance comparison

I recently ran into a performance problem under high concurrency. The root cause turned out to be that LinkedBlockingQueue could not keep up, and in the end we relieved the contention on each queue by creating multiple queues. It was the first time in my life that a data structure shipped with the JDK could not meet my needs, so I was determined to find out why.

The load test ran on a 40-core machine, with Tomcat at its default of 200 threads. The client sent 500 concurrent requests at roughly 10,000 QPS and required the 99.9th-percentile response time to stay around 50 ms. The code contains a task that writes to the database asynchronously, and in actual testing more than 60% of the latency was spent writing to the queue (in fact, the task is submitted to a thread pool). So I started digging into the LinkedBlockingQueue implementation.
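For context, the multi-queue workaround mentioned above could look roughly like the sketch below: work is spread over several executors so that no single internal queue sees all the contention. The shard count, the single-thread executors, and the class name ShardedDbWriter are assumptions for illustration, not the actual production code.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

// Illustration only: spread asynchronous DB-write tasks over several executors,
// so each executor's internal LinkedBlockingQueue sees only a fraction of the load.
public class ShardedDbWriter {
    private static final int SHARDS = 4;                         // assumed shard count
    private final ExecutorService[] executors = new ExecutorService[SHARDS];
    private final AtomicLong submitted = new AtomicLong();

    public ShardedDbWriter() {
        for (int i = 0; i < SHARDS; i++) {
            // one worker per shard; each has its own work queue
            executors[i] = Executors.newSingleThreadExecutor();
        }
    }

    public void submit(Runnable dbWriteTask) {
        // round-robin across shards; hashing on a key would also work
        int shard = (int) (submitted.getAndIncrement() % SHARDS);
        executors[shard].execute(dbWriteTask);
    }
}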

LinkedBlockingQueue is essentially equivalent to a LinkedList plus ordinary ReentrantLock locking around each operation. ReentrantLock (like the other locks in java.util.concurrent) relies on CAS internally to achieve atomicity. Under high concurrency, however, threads keep retrying the CAS, so in theory its performance could be worse than a native monitor lock.
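To make "retrying the CAS" concrete, here is a minimal spin-lock-style sketch built on AtomicInteger. It is not the real ReentrantLock/AQS code (which parks losing threads instead of spinning forever); it only shows the compare-and-set retry that contended threads pay for.

import java.util.concurrent.atomic.AtomicInteger;

// Minimal illustration of CAS-based acquisition: state 0 = free, 1 = held.
// Under heavy contention many threads fail compareAndSet and loop again;
// that repeated retry is the cost discussed above.
public class CasSpinLock {
    private final AtomicInteger state = new AtomicInteger(0);

    public void lock() {
        // keep retrying until the CAS from 0 to 1 succeeds
        while (!state.compareAndSet(0, 1)) {
            // busy-wait; a real lock would park the thread after a while
        }
    }

    public void unlock() {
        state.set(0);
    }
}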


Tests and results

It is actually quite hard to compare CAS with native locks. Java has no purely "native" lock: synchronized has gone through many JDK optimizations and, under low contention, it also uses CAS. I did compare synchronized against Unsafe.compareAndSwapInt and found that CAS wins hands down. So, as the next best thing, I compared the performance of ReentrantLock and synchronized.

When a thread fails to acquire a contended ReentrantLock, it is put on a wait queue and no longer takes part in the subsequent contention, so ReentrantLock is not representative of raw CAS performance under high concurrency. However, we rarely use CAS directly anyway, so the test results are still useful.

The test uses the JMH framework, which claims millisecond-level measurement accuracy. The machine has 40 cores, so at least 40 threads can genuinely compete at the same time (with too few CPU cores, no matter how many threads you start, the amount of true simultaneous contention may not be large). Everything was tested on JDK 1.8.
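The benchmark snippets below reference fields such as lock, lockCounter, rawCounter, lockQueue, rawQueue and CLEAR_COUNT without showing their declarations. A plausible harness is sketched here; the class name, scope and numeric values are my assumptions, not the author's original setup.

import java.util.LinkedList;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

// Assumed JMH state class shared by the grouped benchmark methods shown below.
@State(Scope.Group)
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
public class LockBenchmark {

    static final int CLEAR_COUNT = 1000;        // assumed threshold for clearing the lists

    final ReentrantLock lock = new ReentrantLock();

    long lockCounter = 0;                       // counter guarded by the ReentrantLock
    long rawCounter  = 0;                       // counter guarded by synchronized(this)

    final LinkedList<String> lockQueue = new LinkedList<>();   // list guarded by the ReentrantLock
    final LinkedList<String> rawQueue  = new LinkedList<>();   // list guarded by synchronized(this)
}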

Increment operation

First, synchronized and ReentrantLock are compared on a simple synchronized increment operation. The test code is as follows:

@Benchmark
@Group("lock")
@GroupThreads(4)
public void lockedOp() {
    lock.lock();
    try {
        lockCounter++;
    } finally {
        lock.unlock();
    }
}
 

@Benchmark
@Group("synchronized")
@GroupThreads(4)
public void synchronizedOp() {
    synchronized (this) {
        rawCounter++;
    }
}
 

The results are as follows:



List operation

The increment operation takes too little CPU time, so to lengthen each operation appropriately, the benchmark inserts an element into a LinkedList instead. The code is shown below:

@Benchmark
@Group("lock")
@GroupThreads(2)
public void lockedOp() {
    lock.lock();
    try {
        lockQueue.add("event");
        if (lockQueue.size() >= CLEAR_COUNT) {
            lockQueue.clear();
        }
    } finally {
        lock.unlock();
    }
}



@Benchmark
@Group("synchronized")
@GroupThreads(2)
public void synchronizedOp() {
    synchronized (this) {
        rawQueue.add("event");
        if (rawQueue.size() >= CLEAR_COUNT) {
            rawQueue.clear();
        }
    }
}
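For completeness, grouped benchmarks like the ones above are usually launched through the JMH Runner; a minimal sketch follows, assuming the benchmark methods live in the LockBenchmark harness sketched earlier.

import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class BenchmarkMain {
    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(LockBenchmark.class.getSimpleName())  // run the assumed harness class
                .warmupIterations(5)
                .measurementIterations(5)
                .forks(1)
                .build();
        new Runner(opt).run();
    }
}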
 
The results are as follows:

Result analysis

You can see that ReentrantLock still outperforms synchronized.
Throughput is lowest with 2 threads and improves with 3. My guess is that with exactly two threads competing, a context switch has to happen on every handoff, whereas when several threads compete (unfairly), some thread can often grab the lock directly from the thread that is releasing it.
As the number of threads grows further, throughput drops only slightly. First, since at most one thread executes the synchronized code at a time, adding threads cannot raise throughput much. Second, once most threads are parked waiting, they are unlikely to be woken soon and therefore rarely join the subsequent contention.
(In the LinkedList test) As the lock-holding time increases, the throughput gap between ReentrantLock and synchronized shrinks, which should show that the cost of CAS retries is growing.
This test gives me more confidence in ReentrantLock, but for everyday development synchronized is still generally recommended. After all, the JDK developers keep optimizing it (I have seen an article saying that Lock and synchronized are basically on par in JDK 9).
