2006-11-12

Mandatory Rules for Safe Multithreading in Java in 2006

Much of this post is inspired by the first few chapters of the book Java Concurrency in Practice (JCiP). It is also based on my experience.

1) State Variables Must be Synchronized: When more than one thread accesses a state variable (var), and at least one of them might write to it, then they all must synchronize their access to it.

1.1. A state var is a variable that belongs to a single instance of a class (an instance var) or to all instances of a class (a class var). A state var is also called a field. It is not a variable defined in a method; such local vars are thread-safe because they cannot be shared with other threads (unless they are copied to some other location that can be shared). In Java, an instance var is defined as non-static (the static qualifier is absent) and a class var uses the static qualifier. It is a common and critical misconception that instance variables are thread-safe and that only static variables need protection (e.g., a guard or lock).
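
Here is a minimal sketch of rule 1 applied to an instance var. The class name HitCounter and its methods are hypothetical (they are not from the JCiP book). The field is non-static, yet a single HitCounter object can be reached by several threads, so every read and every write goes through the same lock.

```java
// Hypothetical example: an instance var (non-static field) shared by threads.
class HitCounter {
    // Instance var: each HitCounter has its own copy, but one HitCounter
    // object can still be shared by several threads.
    private long hits = 0;

    // The write and the read both synchronize on the same lock (this).
    public synchronized void hit() {
        hits++; // read-modify-write: not atomic without the lock
    }

    public synchronized long getHits() {
        return hits;
    }

    public static void main(String[] args) throws InterruptedException {
        final HitCounter counter = new HitCounter();
        Runnable work = new Runnable() {
            public void run() {
                for (int i = 0; i < 10000; i++) {
                    counter.hit();
                }
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(counter.getHits()); // prints 20000
    }
}
```

If the synchronized qualifier is removed from hit(), the program may still print 20000 most of the time, which is exactly why this kind of defect can go unnoticed.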

1.2. A stateless class is thread-safe if it uses only thread-safe classes. A stateless class is one that has no fields and that does not reference any fields from other classes. It may have local variables, i.e., variables defined in a method.
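
As an illustration only, here is what a stateless class can look like. TemperatureConverter is a hypothetical name; the point is that it has no fields, so all of its data lives in parameters and local vars on the calling thread's stack.

```java
// Hypothetical stateless class: no fields, only parameters and local vars.
final class TemperatureConverter {
    public double toFahrenheit(double celsius) {
        // Local var: confined to the calling thread, never shared.
        double scaled = celsius * 9.0 / 5.0;
        return scaled + 32.0;
    }
}
```

One instance of such a class may be shared by any number of threads without any lock.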

2) Missing Synchronization Is a Defect: A program that is missing needed synchronization may appear to work well for years but may fail at any moment; it must be considered defective and must be fixed urgently.

2.1. Programs with missing synchronization will exhibit erroneous behavior and produce incorrect data much more often as multicore chips replace single-processor machines in desktop computers and in servers.

2.2. Applications that have been exhibiting correct behavior on SMP machines should show less of an increase in the frequency of multithreading defects than applications that have been running on single-processor machines. This is only a prediction about the increase in frequency of such defects; rule #2 still applies to applications that have been running on SMP servers, i.e., if they are missing synchronization then they are defective.

3) How to Synchronize: There are 4 basic ways to fix synchronization defects for a given state var:

3.1. Don't share the state var across threads, if possible,

or

3.2. Make the state var final, if possible,

or

3.3. Synchronize accesses to the state var; if the previous two options are not possible then this one is mandatory. The ways to synchronize access to a state var:

3.3.1. In some cases you may use a java.util.concurrent.atomic (j.u.c.a.) class instead of a regular (non-atomic) primitive. This is sufficient when the class has only one state var, or when its state vars are not interdependent. It is not sufficient when the class has more than one state var and their updates must be coordinated; the atomic classes cannot provide this coordination, so in such cases you must use an atomic set of operations instead of atomic vars, e.g., a set of operations bounded by a lock (rule 3.3.2). In some cases, an atomic class from j.u.c.a. cannot appropriately replace its corresponding primitive. The j.u.c.a. classes are primarily designed for developing non-blocking algorithms, a particular type of concurrent logic.
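
A minimal sketch of the simple case, where one independent state var is replaced by an atomic class. The class RequestStats and its method names are hypothetical; AtomicLong and its incrementAndGet() and get() methods are from java.util.concurrent.atomic.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical class with a single, independent state var.
class RequestStats {
    // final reference to a thread-safe object; no lock is needed
    // because no other state var must change together with it.
    private final AtomicLong requests = new AtomicLong(0);

    public long recordRequest() {
        return requests.incrementAndGet(); // atomic read-modify-write
    }

    public long getRequests() {
        return requests.get();
    }
}
```

If a second field had to stay consistent with the counter (e.g., a running total used to compute an average), this approach would no longer be enough and a lock would be needed.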

3.3.2. Use an atomic set of operations (e.g., a set of operations bounded by a lock) to update interdependent state variables.

3.3.3. If coordinated access is being done on a var, then synchronize with the same lock for all uses of that var and all the invariants in which it and its related vars may participate, including reading the var(s). This situation is called *guarded by a lock* and the @GuardedBy("aLock") annotation is to be used for each state var that needs to be coordinated.
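
The following sketch shows two interdependent state vars guarded by one lock, as described in 3.3.2 and 3.3.3. The class NumberRange and its invariant (lower <= upper) are my assumptions for illustration. The @GuardedBy annotation is written as a comment here so that the example compiles without the JCiP annotations jar.

```java
// Hypothetical class whose two state vars take part in one invariant:
// lower <= upper must always hold.
class NumberRange {
    private final Object lock = new Object();

    // @GuardedBy("lock")
    private int lower = 0;
    // @GuardedBy("lock")
    private int upper = 0;

    public void setLower(int value) {
        synchronized (lock) {
            // Check and update happen as one atomic set of operations.
            if (value > upper) {
                throw new IllegalArgumentException("lower > upper");
            }
            lower = value;
        }
    }

    public void setUpper(int value) {
        synchronized (lock) {
            if (value < lower) {
                throw new IllegalArgumentException("upper < lower");
            }
            upper = value;
        }
    }

    // Reads use the same lock so they never see a half-updated range.
    public boolean contains(int value) {
        synchronized (lock) {
            return value >= lower && value <= upper;
        }
    }
}
```

Two AtomicInteger fields would not work here: each set would be atomic on its own, but the check against the other bound could be stale by the time the write happens.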

3.3.3.1. Use synchronized blocks of minimal size while minimizing the number of such blocks in each method. Possible strategy: use method synchronization (sync) when this does not affect performance significantly; otherwise use block synchronization within the method. In determining the minimal block size, try not to include work on local vars in the block, because local vars (stack-based) are not shared across threads and do not require synchronization. Favor sync blocks that do not include calls to methods of other objects, particularly those that are lengthy or that involve operations that may block (e.g., I/O, network). Favor a small number of sync blocks per method because each sync block has a CPU overhead cost. Nevertheless, thread safety must never be compromised for performance: the CPU cycles that correct synchronization requires must be spent.
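
A small sketch of this strategy, using a hypothetical class named WordLog. The work on local vars is done before the lock is taken, and the method has a single sync block that covers only the access to the shared list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical class illustrating a minimal synchronized block.
class WordLog {
    private final Object lock = new Object();

    // Guarded by lock.
    private final List<String> words = new ArrayList<String>();

    public int add(String raw) {
        // Local vars only: no lock needed for this part.
        String cleaned = raw.trim().toLowerCase(Locale.ENGLISH);

        // One sync block, covering only the shared state.
        synchronized (lock) {
            words.add(cleaned);
            return words.size();
        }
    }

    public String get(int index) {
        synchronized (lock) {
            return words.get(index);
        }
    }
}
```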

or

3.4. Make the Entire Class Immutable: this is my favorite solution when possible. An immutable class is one whose state cannot be seen to change by callers (aka. clients). This requires that:
- all public fields are final,
- all public final reference fields refer to other immutable objects, and
- constructors and methods do not publish references to any hidden state which is potentially changeable (mutable); in particular, the *this* reference must not escape from any of the constructors of the class.
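
As a sketch of these requirements, here is a hypothetical immutable class named PersonName. Its public fields are final, they refer to String objects (which are immutable), and the constructor does not let *this* escape.

```java
// Hypothetical immutable class.
final class PersonName {
    // Public final fields referring to immutable objects (String).
    public final String first;
    public final String last;

    PersonName(String first, String last) {
        // Only field assignments: the this reference is not published.
        this.first = first;
        this.last = last;
    }

    // A "change" returns a new object; the original is never modified.
    public PersonName withLast(String newLast) {
        return new PersonName(first, newLast);
    }
}
```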

3.4.1. Immutable objects may still have hidden mutable state for purposes of performance optimization; some state vars may be lazily computed, as long as they are computed from immutable state and that clients cannot tell the difference (they are hidden from the clients).
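
One way to sketch such hidden state is the lazily computed hash code, in the style used by java.lang.String. The class Fraction is hypothetical. The hidden field is an int computed only from final fields, so if two threads race they compute and store the same value, and clients cannot tell the difference.

```java
// Hypothetical immutable class with hidden, lazily computed state.
final class Fraction {
    public final int numerator;
    public final int denominator;

    // Hidden mutable state: 0 means "not computed yet".
    private int hash;

    Fraction(int numerator, int denominator) {
        this.numerator = numerator;
        this.denominator = denominator;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Fraction)) {
            return false;
        }
        Fraction f = (Fraction) other;
        return numerator == f.numerator && denominator == f.denominator;
    }

    @Override
    public int hashCode() {
        int h = hash; // read the hidden field once into a local var
        if (h == 0) {
            // Computed from immutable state only, so every thread
            // that gets here computes the same value.
            h = 31 * numerator + denominator;
            hash = h;
        }
        return h;
    }
}
```

This idiom relies on the field being an int (whose writes are atomic) and on the value being the same no matter which thread computes it. When in doubt, compute the value in the constructor and store it in a final field instead.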

3.4.2. Immutable objects are thread-safe; they may be passed between threads or published without synchronization.

3.4.3. Making a static field immutable: use *final*, do not deserialize the class, and the field must contain a primitive or an instance of a class which is itself thread-safe. This will ensure atomicity and visibility for a static field.

3.4.4. Making a NON-STATIC field immutable (an instance field): use *final* and make sure that all constructors cannot let *this* escape. Immutable classes must have thread-safe constructors. The final fields may be made public, and accessor methods are not needed, which also improves performance.

3.4.5. Serialization: It may be possible to create an immutable class that can be serialized and deserialized, but it may be so complex that it may not be worth the development costs, and in general, if a class needs to be serialized and deserialized, it may be less expensive to not try to make it immutable and to make it a plain thread-safe class.

3.4.6. For reasons similar to those in 3.4.5 (Serialization), it is usually expedient not to make beans, or classes that are loaded programmatically, immutable if the bean or class has fields whose values must be set through constructor parameters, which is often the case.

4) How to use VOLATILE:

4.1. Try not to use volatile - use volatile only when all these conditions are met:

- writing to the shared var without using its current value or when only a single thread updates the var;

- and the var is not involved in an invariant relation;

- and the var does not need compound (read-modify-write) operations such as an increment: volatile makes single reads and writes atomic, including those of a long or double (which are not guaranteed to be atomic when the field is not volatile), but it cannot make a compound operation atomic and it does not protect invariants; a shared var that needs such operations must be guarded by a lock or, for a long not involved in an invariant, replaced by an AtomicLong; for a double not involved in an invariant, a BigDecimal may be considered (because it is immutable thus thread-safe, although a BigDecimal may be slower than using a lock);

- and you are absolutely certain that locking is not required while the var is being accessed.

4.2. Examples of typical use of volatile: Ensuring visibility (without the need for atomicity), e.g., boolean event notification, such as single change boolean (initialization or shutdown state indicator), or a status flag used to exit a loop. The above conditions in 4.1 must always be met.
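
A minimal sketch of the status-flag case, with a hypothetical class named PollingWorker. The flag is written without using its current value, it is not part of any invariant, and no locking is needed while it is accessed, so the conditions in 4.1 are met.

```java
// Hypothetical worker that exits its loop when a volatile flag is set.
class PollingWorker implements Runnable {
    // volatile guarantees that the write made by another thread
    // becomes visible to the loop below.
    private volatile boolean stopRequested = false;

    public void requestStop() {
        stopRequested = true; // simple write, does not depend on the old value
    }

    public void run() {
        while (!stopRequested) {
            // A real worker would do one unit of work here.
            Thread.yield();
        }
    }
}
```

Without the volatile qualifier, the loop is allowed to never see the change and to run forever.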

4.3. When in doubt, use locking instead of volatile.

5) Java 5 or Above: Do not use pre-Java 5 versions. The above rules apply only to Java 5 or later, and earlier versions of Java must not be used for multithreaded apps. This is because the memory model was fixed in Java 5 (JSR 133) to ensure thread-safety; earlier versions of Java are inherently not thread-safe (unless a vendor rendered its implementation of Java thread-safe before Java 5), so developers using them have no guarantee that following the above mandatory rules will ensure thread-safety.

Non-Mandatory Rules:

a) When a state var is protected by synchronization (i.e., a lock), do not use an atomic type from java.util.concurrent.atomic to store the value; use a regular type because this helps performance and code simplicity (better speed and lower maintenance costs).

b) In your code, use the annotations defined by Brian Goetz and Tim Peierls (used in the JCiP book): ThreadSafe, Immutable, GuardedBy.

c) Always use the -server option for the java command, even in development (of concurrent applications), when this option is available.

d) Do not assume that any class is thread-safe, including classes from the Java SE API, unless you analyze the source code, or the class is documented to be thread-safe and you trust the developer(s). I trust the Java SE developers but unfortunately the documentation of many of their classes does not specify whether the class is thread-safe or not.

e) Question: what's the difference between int and AtomicInteger? Is int atomic? jcip p. 36 states that lower than 64-bit vars are atomic. If int is atomic, then why use AtomicInteger?
Answer: Assigning an int is atomic but operations on an int that depend on its previous value are not atomic, so AtomicInteger is designed for atomic operations with the previous value of the field. Also, an AtomicInteger is used in applications such as atomically incremented counters, and usually cannot be used as a replacement for an Integer. Atomic classes in java.util.concurrent.atomic are designed primarily for implementing non-blocking data structures and related infrastructure classes. The compareAndSet method that they use is not a general replacement for locking because it applies only when critical updates for an object are confined to a single variable and therefore it is not applicable to invariants.
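
To make the difference concrete, here is a sketch of an update that depends on the previous value, done with compareAndSet in a retry loop. The class MaxTracker is hypothetical; AtomicInteger, get() and compareAndSet() are from java.util.concurrent.atomic.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class that keeps the largest value offered so far.
class MaxTracker {
    private final AtomicInteger max = new AtomicInteger(Integer.MIN_VALUE);

    public void offer(int candidate) {
        while (true) {
            int current = max.get();
            if (candidate <= current) {
                return; // nothing to update
            }
            // Succeeds only if max still holds the value we read.
            if (max.compareAndSet(current, candidate)) {
                return;
            }
            // Another thread changed max in between: read again and retry.
        }
    }

    public int get() {
        return max.get();
    }
}
```

With a plain int field, the "read, compare, write" sequence could be interleaved with another thread's update and a larger value could be overwritten by a smaller one.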

f) Q: How come no AtomicDouble in JSR 166?
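A: Because its use would be very uncommon. Generally, a double field is guarded by a lock and not by an atomic class. You can read the value of an AtomicInteger or an AtomicLong as a float or double by using its floatValue() and doubleValue() methods, but this only converts the integer value. To hold an actual double atomically, the java.util.concurrent.atomic package documentation suggests storing its bits in an AtomicLong using the Double.doubleToLongBits and Double.longBitsToDouble conversions.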
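
A sketch of how a double can be held atomically by storing its bit pattern in an AtomicLong. The class AtomicDoubleHolder and its method names are hypothetical (this is not a JSR 166 class); Double.doubleToLongBits, Double.longBitsToDouble and AtomicLong.compareAndSet are standard.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical holder: a double stored as its 64-bit pattern.
class AtomicDoubleHolder {
    private final AtomicLong bits;

    AtomicDoubleHolder(double initial) {
        bits = new AtomicLong(Double.doubleToLongBits(initial));
    }

    public double get() {
        return Double.longBitsToDouble(bits.get());
    }

    public void set(double value) {
        bits.set(Double.doubleToLongBits(value));
    }

    public double addAndGet(double delta) {
        while (true) {
            long currentBits = bits.get();
            double next = Double.longBitsToDouble(currentBits) + delta;
            // Retry if another thread changed the value in between.
            if (bits.compareAndSet(currentBits, Double.doubleToLongBits(next))) {
                return next;
            }
        }
    }
}
```

As with the other atomic classes, this only helps when the double is not involved in an invariant with other state vars; otherwise use a lock.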

g) Values representing amounts of currency should always use BigDecimal (for precision reasons, not for concurrency) and a BigDecimal is immutable thus thread-safe.


TODO rendering thread-safe applications composed of thread-safe classes (in other posts).


TODO locking protocols, sync. policies (in other posts).

TODO safety of classes generated by xjc tool in JAXB (in other posts).

Copyright (c) 2006 Serge Masse.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation. A copy of the license is in http://www.gnu.org/licenses/fdl.txt

The Java annotations (e.g., @GuardedBy, @Immutable, @ThreadSafe) in this post are copyrighted by Brian Goetz and Tim Peierls under these terms:
Copyright (c) 2005 Brian Goetz and Tim Peierls
Released under the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.5)
Official home: http://www.jcip.net
Any republication or derived work distributed in source code form must include this copyright and license notice.


1 Comments:

Blogger nituld said...

excellent stuff and tips on concurrency, great job

Mon Sep 24, 06:58:00 AM EDT  
