2006-11-29

JSR 121 App Isolation API is not the solution

The JSR 121 Application Isolation API may be a good solution for some problems, but not for the problem of making concurrent programming easier.

http://jcp.org/en/jsr/detail?id=121

A recent post about JSR 121.

2006-11-25

Gradual steps or paradigm shift towards easier concurrent apps?

Herb Sutter's distinguishing argument is for a gradual evolution of programming languages and techniques (IDEs, APIs) towards easier concurrent program development.

My take on this is that we can only do gradual evolution right now because we don't really know how to write easy concurrent programs, i.e., we don't have an adequate programming language yet. But gradual evolution is not sufficient, because concurrent programming remains too difficult, and we will need a completely different language as soon as possible.

Gradual steps may take us there eventually but what we will have once we get there will be a paradigm shift, a completely different type of programming language.

I propose that the new language that we try does concurrency by default and that in general a programmer will have to write specific instructions in order to bypass the built-in concurrency features and get a non-concurrent process or structure.

Herb Sutter advocates new languages for concurrency

I agree with Herb on that one and on other issues. Here are two seminal documents from Herb.

http://www.gotw.ca/publications/concurrency-ddj.htm - Updated version of The free lunch is over article.

http://irbseminars.intel-research.net/HerbSutter.pdf - Slides of Herb's presentation at Intel Research, Berkeley, California, September 25, 2006.

TODO here summary of Herb's views.

Herb is a C++ guru currently working on general concurrency issues, including a project at Microsoft. His distinguishing argument is for a gradual evolution of programming languages and techniques (IDEs, APIs) towards easier concurrent program development.

2006-11-16

CSP for Java On IBM Developerworks

This is a three-part series, and a very important one.

CSP = Communicating Sequential Processes

The main questions for us about CSP are:
  1. What are the situations where CSP may be advantageous?
  2. If any, then when do we use it instead of using java.util.concurrent directly?

We will try to answer these questions in this blog, as well as other questions.

IBM Thread and Monitor Dump Analyzer for Java

Thread Threat - Vladimir Roubtsov, JavaWorld.com, Feb. 2003

2006-11-12

Characteristics of Thread-Safe Programs

1. It is usually easier to design thread safety at an early design stage than to modify existing code.

2. A program consisting of thread-safe classes may not be thread-safe and a thread-safe program may contain some classes that are not thread-safe. (jcip p.15)

3. We do not yet have a formal definition of thread safety in general and we do not have formal techniques for stating a complete definition of thread safety for a given class. This poor state of affairs in the world of so-called computer science is mainly due to the lack of a technique for formal and complete definitions of program correctness. If any such techniques exist, they are not in common use, at least not in commercial software. Techniques that I studied for this purpose, in the 1980s for example, were often more complex than the code that they were describing.

4. In theory, a correct class means that the class conforms to its specification (what it is supposed to do). Frequently, class specifications are vague (i.e., informal and incomplete) and single-threaded correctness is often assumed when no incorrect behavior is observed. In many cases, the source code of the class is the specification. In the best cases, the correctness of a class is defined by associated testing classes and test cases (data sets for testing) that define the input and matching output of each of the methods of the class. The validity of correctness-by-testing depends on the correctness of the testing classes and data sets. This is like a house of cards.

5. The definition of thread-safe class from the JCiP book: A class is thread-safe when it behaves correctly for a single thread and it continues to behave correctly when used by multiple threads, and with no additional synchronization or other coordination on the part of the calling code.

6. My current definition of a thread-safe class: A class is deemed thread-safe when it complies with the Mandatory Basic Rules in this blog.
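
Point 2 above (thread-safe classes do not automatically make a thread-safe program) can be illustrated with a small sketch. The class name UniqueNames and its methods are hypothetical, chosen only for this illustration. Each call on a synchronized list is atomic on its own, but a check followed by an update is not, unless the caller holds the list's lock across both calls.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, named for this sketch only.
class UniqueNames {
    // Every individual method of this wrapper synchronizes on the wrapper itself.
    private final List<String> names =
            Collections.synchronizedList(new ArrayList<String>());

    // NOT thread-safe: contains() and add() are each atomic, but the pair is not.
    // Two threads can both see "absent" and both add the same name.
    boolean addIfAbsentBroken(String name) {
        if (!names.contains(name)) {
            names.add(name);
            return true;
        }
        return false;
    }

    // Thread-safe: the check and the update form one atomic unit, because the
    // caller holds the same lock that the synchronized list uses internally.
    boolean addIfAbsent(String name) {
        synchronized (names) {
            if (!names.contains(name)) {
                names.add(name);
                return true;
            }
            return false;
        }
    }

    int size() {
        return names.size();
    }
}
```

The broken variant will usually pass single-threaded tests, which is one reason this kind of defect goes unnoticed.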


Mandatory Rules for Safe Multithreading in Java in 2006

Much of this post is inspired by the first few chapters of the jcip book. It is also based on my experience.

1) State Variables Must be Synchronized: When more than one thread accesses a state variable (var), and one of them might write to it, then they all must synchronize their access to it.

1.1. A state var is a variable for a single instance or all instances of a class, aka. instance var and class var respectively. A state var is also called a field. It is not a variable defined in a method; these local vars defined in a method are thread-safe because they cannot be shared with other threads (unless they are copied to some other location that can be shared). In Java, an instance var is defined as non-static (the static qualifier is absent) and a class var uses the static qualifier. It is a common and critical misconception to consider that instance variables are thread-safe and that only static variables need protection (e.g., a guard or lock).
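
As a minimal sketch of rule 1 applied to an instance var (not a static one), the hypothetical HitCounter class below guards its field with the object's own lock for both writing and reading. The class and method names are assumptions made for this illustration.

```java
// Hypothetical class for illustration.
class HitCounter {
    // Instance (non-static) field: it is shared as soon as two threads
    // share the same HitCounter object.
    private long hits = 0;

    // The write and the read both go through the same lock (this).
    synchronized void record() {
        hits++; // read-modify-write, not atomic without the lock
    }

    synchronized long total() {
        return hits;
    }

    // Starts `threads` threads that each call record() `perThread` times,
    // waits for them, and returns the final count.
    static long runDemo(int threads, final int perThread) throws InterruptedException {
        final HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < perThread; j++) {
                        counter.record();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        return counter.total();
    }
}
```

If the synchronized keyword is removed from record(), the count returned by runDemo may come out lower than threads * perThread, and the loss may only show up occasionally.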

1.2. A stateless class is thread-safe if it uses only thread-safe classes. A stateless class is one that has no fields and that does not reference any fields from other classes. It may have local variables, i.e., variables defined in a method.
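
A stateless class in the sense of 1.2 might look like the following sketch. TemperatureConverter is a hypothetical name; the point is only that it has no fields and works on parameters and local vars.

```java
// Hypothetical stateless utility: no fields, only parameters and local variables.
class TemperatureConverter {
    double toFahrenheit(double celsius) {
        // Local var: it lives on the calling thread's stack and is never shared.
        double scaled = celsius * 9.0 / 5.0;
        return scaled + 32.0;
    }
}
```

One instance of such a class can be shared by any number of threads without synchronization.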

2) Missing Synchronization Is a Defect: A program that is missing needed synchronization may appear to work well for years but may fail at any moment and must be considered defective and must be fixed urgently.

2.1. Programs with missing synchronization will exhibit erroneous behavior and produce incorrect data much more often as multicore chips arrive in desktop computers and servers and replace single-chip machines.

2.2. Applications that have been exhibiting correct behavior on SMP machines should show less of an increase in the frequency of multithreaded defects than applications running on single chip machines. This is only about the predicted increase in frequency of multithreading defects and rule #2 still applies to applications that have been running on SMP servers, i.e., if they are missing synchronization then they are defective.

3) How to Synchronize: There are 4 basic ways to fix synchronization defects for a given state var:

3.1. Don't share the state var across threads, if possible,

or

3.2. Make the state var final, if possible,

or

3.3. Synchronize accesses to the state var; if the previous two options are not possible then this one is mandatory. The ways to synchronize access to a state var:

3.3.1. In some cases you may use a java.util.concurrent.atomic (j.u.c.a.) class instead of a regular (non-atomic) primitive. This is sufficient when the class has only one state var, or when its state vars are not interdependent. It is not sufficient when the class has more than one state var and their updates must be coordinated: the atomic classes cannot provide this coordination, so in such cases you must use an atomic set of operations instead of atomic vars, e.g., a set of operations bounded by a lock (rule 3.3.2). In some cases, an atomic class from j.u.c.a. cannot appropriately replace its corresponding primitive. The j.u.c.a. classes are primarily designed for developing non-blocking algorithms, a particular type of concurrent logic.

3.3.2. Use an atomic set of operations (e.g., a set of operations bounded by a lock) to update interdependent state variables.

3.3.3. If coordinated access is being done on a var, then synchronize with the same lock for all uses of that var and all the invariants in which it and its related vars may participate, including reading the var(s). This situation is called *guarded by a lock* and the @GuardedBy("aLock") annotation is to be used for each state var that needs to be coordinated.
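
Rules 3.3.2 and 3.3.3 can be sketched as follows. IntRange is a hypothetical class with the invariant lower <= upper. The GuardedBy annotation declared at the top is only a local stand-in so that the sketch compiles without the JCiP annotations jar; real code would use the JCiP annotation instead.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Local stand-in for the JCiP @GuardedBy annotation (assumption: declared here
// only to keep the sketch self-contained).
@Target({ElementType.FIELD, ElementType.METHOD})
@Retention(RetentionPolicy.SOURCE)
@interface GuardedBy {
    String value();
}

// Hypothetical class with the invariant lower <= upper.
class IntRange {
    private final Object lock = new Object();

    @GuardedBy("lock") private int lower = 0;
    @GuardedBy("lock") private int upper = 0;

    void setLower(int value) {
        synchronized (lock) {
            // Check and update happen under the same lock, as one atomic unit.
            if (value > upper) {
                throw new IllegalArgumentException("lower would exceed upper");
            }
            lower = value;
        }
    }

    void setUpper(int value) {
        synchronized (lock) {
            if (value < lower) {
                throw new IllegalArgumentException("upper would fall below lower");
            }
            upper = value;
        }
    }

    // Reads take the same lock, so they never observe a half-updated range.
    boolean contains(int value) {
        synchronized (lock) {
            return value >= lower && value <= upper;
        }
    }
}
```

Two separate AtomicInteger fields would not be enough here, because the check in setLower reads one var and writes the other.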

3.3.3.1. Use block synchronization of minimal sizes while minimizing the number of such blocks in each method. Possible strategy: use method synchronization (sync) when this does not affect performance significantly; otherwise use block synchronization within the method. In determining the minimal block size, try not to include local vars in the block, because local vars (stack-based) are not shared across threads and do not require synchronization. Favor sync blocks that do not include calls to methods of other objects, particularly those that are lengthy or that involve operations that may block (e.g., I/O, network). Favor a small number of sync blocks per method because each sync block has a CPU overhead cost. Nevertheless, thread safety must never be compromised to save CPU cycles; when synchronization is needed, the cycles must be spent.
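
A minimal sketch of the block-size advice in 3.3.3.1, using a hypothetical EventLog class: work on parameters and local vars stays outside the sync block, and only the access to the shared field is inside it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration.
class EventLog {
    // Shared state, guarded by "this".
    private final List<String> entries = new ArrayList<String>();

    void add(String source, String message) {
        // Local work: built from parameters only, so no lock is needed.
        String line = "[" + source + "] " + message.trim();

        // Only the access to the shared field is inside the sync block.
        synchronized (this) {
            entries.add(line);
        }
    }

    List<String> snapshot() {
        List<String> copy;
        synchronized (this) {
            copy = new ArrayList<String>(entries); // copy under the lock
        }
        // The caller gets a private copy; the lock is already released.
        return copy;
    }
}
```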

or

3.4. Make the Entire Class Immutable: this is my favorite solution when possible. An immutable class is one whose state cannot be seen to change by callers (aka. clients). This requires that:
- all public fields are final,
- all public final reference fields refer to other immutable objects, and
- constructors and methods do not publish references to any hidden state which is potentially changeable (mutable); in particular, the *this* reference must not escape from any of the constructors of the class.

3.4.1. Immutable objects may still have hidden mutable state for purposes of performance optimization; some state vars may be lazily computed, as long as they are computed from immutable state and that clients cannot tell the difference (they are hidden from the clients).
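
The requirements in 3.4 and the hidden lazy state of 3.4.1 can be sketched with a hypothetical GridPoint class. The lazily cached hash code follows the same general idea that java.lang.String uses for its own hash: the cached value is computed only from final fields, so a race between threads merely recomputes the same value.

```java
// Hypothetical immutable value class.
final class GridPoint {
    public final int x;
    public final int y;

    // Hidden cache, derived only from the final fields above.
    // 0 means "not computed yet".
    private int hash;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
        // "this" is not passed to any other object here, so it does not escape.
    }

    // "Changing" a point returns a new object; the original is untouched.
    GridPoint translate(int dx, int dy) {
        return new GridPoint(x + dx, y + dy);
    }

    public int hashCode() {
        int h = hash; // read the field once into a local var
        if (h == 0) {
            h = 31 * x + y;
            hash = h; // benign race: every thread would store the same value
        }
        return h;
    }

    public boolean equals(Object other) {
        if (!(other instanceof GridPoint)) {
            return false;
        }
        GridPoint p = (GridPoint) other;
        return x == p.x && y == p.y;
    }
}
```

Clients cannot tell whether the hash was already cached, which is the condition that 3.4.1 asks for.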

3.4.2. Immutable objects are thread-safe; they may be passed between threads or published without synchronization.

3.4.3. Making a static field immutable: use *final*, do not deserialize the class, and the field must contain a primitive or an instance of a class which is itself thread-safe. This will ensure atomicity and visibility for a static field.

3.4.4. Making a NON-STATIC field immutable (an instance field): use *final* and make sure that all constructors cannot let *this* escape. Immutable classes must have thread-safe constructors. The final fields may be made public, and accessor methods are not needed, which also improves performance.

3.4.5. Serialization: It may be possible to create an immutable class that can be serialized and deserialized, but it may be so complex that it may not be worth the development costs, and in general, if a class needs to be serialized and deserialized, it may be less expensive to not try to make it immutable and to make it a plain thread-safe class.

3.4.6. For reasons similar to those given in 3.4.5 (Serialization), it is usually expedient not to make beans or programmatically loaded classes immutable if the bean or class has fields whose values must be set through constructor parameters, which is often the case.

4) How to use VOLATILE:

4.1. Try not to use volatile - use volatile only when all these conditions are met:

- writing to the shared var without using its current value or when only a single thread updates the var;

- and the var is not involved in an invariant relation;

- and the var is not of type long or double: plain (non-volatile) reads and writes of these 64-bit primitives are not guaranteed to be atomic, and although declaring them volatile makes each single read or write atomic, compound operations such as ++ remain non-atomic. Such shared vars are therefore better guarded by a lock or, for a long not involved in an invariant, held in an AtomicLong instead of using a volatile qualifier; for a double not involved in an invariant, a BigDecimal may be considered (it is immutable and thus thread-safe, although a BigDecimal may be slower than using a lock). Volatile cannot guarantee the atomicity of compound operations and does not protect invariants;

- and you are absolutely certain that locking is not required while the var is being accessed.

4.2. Examples of typical use of volatile: Ensuring visibility (without the need for atomicity), e.g., boolean event notification, such as single change boolean (initialization or shutdown state indicator), or a status flag used to exit a loop. The above conditions in 4.1 must always be met.
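
The status-flag case in 4.2 might look like the sketch below. PollingWorker and its methods are hypothetical names. The flag is written by one thread, its new value never depends on its old value, and it takes part in no invariant, so the conditions in 4.1 hold.

```java
// Hypothetical worker for illustration.
class PollingWorker implements Runnable {
    // Written by the controlling thread, read by the worker thread.
    private volatile boolean stopRequested = false;

    // Touched only by the worker thread while it runs.
    private long iterations = 0;

    public void run() {
        while (!stopRequested) { // volatile read: sees the latest write
            iterations++;
            Thread.yield();
        }
    }

    void requestStop() {
        stopRequested = true; // plain assignment, no read-modify-write
    }

    long iterations() {
        return iterations;
    }

    // Starts a worker, lets it run briefly, asks it to stop and waits for it.
    static long runDemo() throws InterruptedException {
        PollingWorker worker = new PollingWorker();
        Thread thread = new Thread(worker);
        thread.start();
        Thread.sleep(50);
        worker.requestStop();
        thread.join(); // after join(), the worker's writes are visible here
        return worker.iterations();
    }
}
```

Without the volatile qualifier, the worker thread would not be guaranteed to ever see the updated flag, and the loop could run forever.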

4.3. When in doubt, use locking instead of volatile.

5) Java 5 or Above: Do not use pre-Java 5 versions. The above rules only apply to Java 5 or later, and earlier versions of Java must not be used for multithreaded apps. This is because the memory model was fixed in Java 5 to ensure thread-safety, and the earlier versions of Java are inherently not thread-safe (unless a vendor rendered its implementation of Java thread-safe before Java 5). When using earlier versions of Java, developers have no guarantee that following the above mandatory rules will ensure thread-safety.

Non-Mandatory Rules:

a) When a state var is protected by synchronization (i.e., a lock), do not use an atomic type from java.util.concurrent.atomic to store the value; use a regular type because this helps performance and code simplicity (better speed and lower maintenance costs).

b) In your code, use the annotations defined by Brian Goetz and Tim Peierls (used in the JCiP book): ThreadSafe, Immutable, GuardedBy.

c) Always use the -server option for the java command, even in development (of concurrent applications), when this option is available.

d) Do not assume that any class is thread-safe, including classes from the Java SE API, unless you analyze the source code, or the class is documented to be thread-safe and you trust the developer(s). I trust the Java SE developers but unfortunately the documentation of many of their classes does not specify whether the class is thread-safe or not.

e) Question: what's the difference between int and AtomicInteger? Is int atomic? jcip p. 36 states that lower than 64-bit vars are atomic. If int is atomic, then why use AtomicInteger?
Answer: Assigning an int is atomic but operations on an int that depend on its previous value are not atomic, so AtomicInteger is designed for atomic operations with the previous value of the field. Also, an AtomicInteger is used in applications such as atomically incremented counters, and usually cannot be used as a replacement for an Integer. Atomic classes in java.util.concurrent.atomic are designed primarily for implementing non-blocking data structures and related infrastructure classes. The compareAndSet method that they use is not a general replacement for locking because it applies only when critical updates for an object are confined to a single variable and therefore it is not applicable to invariants.
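
A short sketch of the difference described in this answer, with a hypothetical Sequence class: incrementAndGet and a compareAndSet loop both perform an update that depends on the previous value as one atomic step, which a plain int field cannot do without a lock.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class for illustration.
class Sequence {
    private final AtomicInteger value = new AtomicInteger(0);

    // Atomic read-modify-write: no two callers can receive the same id.
    int nextId() {
        return value.incrementAndGet();
    }

    // The same idea written as an explicit compare-and-set loop:
    // retry until no other thread changed the value in between.
    int doubleCurrent() {
        while (true) {
            int current = value.get();
            int updated = current * 2;
            if (value.compareAndSet(current, updated)) {
                return updated;
            }
        }
    }

    int current() {
        return value.get();
    }
}
```

This works because the whole update is confined to a single variable; it does not extend to invariants that span several variables.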

f) Q: How come no AtomicDouble in JSR 166?
A: Because its use would be very uncommon. Generally, a double field is guarded by a lock and not by an atomic class. Note that the floatValue() and doubleValue() methods of AtomicInteger and AtomicLong only return the stored integer value converted to a float or a double; they do not make these classes hold floating-point values.
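
When an atomic double is really needed, the java.util.concurrent.atomic package documentation notes that a double can be held in an AtomicLong through the Double.doubleToLongBits and Double.longBitsToDouble conversions. The following is a minimal sketch of that idea; the class name AtomicDoubleSketch and its methods are hypothetical and not part of JSR 166.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical class; not part of java.util.concurrent.
class AtomicDoubleSketch {
    // The double is stored as its 64-bit pattern inside an AtomicLong.
    private final AtomicLong bits;

    AtomicDoubleSketch(double initial) {
        bits = new AtomicLong(Double.doubleToLongBits(initial));
    }

    double get() {
        return Double.longBitsToDouble(bits.get());
    }

    void set(double value) {
        bits.set(Double.doubleToLongBits(value));
    }

    // Compare-and-set loop on the bit pattern: retry if another thread
    // changed the value between the read and the update.
    double addAndGet(double delta) {
        while (true) {
            long current = bits.get();
            double updated = Double.longBitsToDouble(current) + delta;
            if (bits.compareAndSet(current, Double.doubleToLongBits(updated))) {
                return updated;
            }
        }
    }
}
```

As with the other atomic classes, this only helps when the double is not involved in an invariant with other state vars.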

g) Values representing amounts of currency should always use BigDecimal (for precision reasons, not for concurrency) and a BigDecimal is immutable thus thread-safe.
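
A small sketch of both points, precision and immutability, using a hypothetical Money helper class:

```java
import java.math.BigDecimal;

// Hypothetical helper for illustration.
class Money {
    // BigDecimal arithmetic returns a new object; the operands never change,
    // so the same price object can be shared between threads safely.
    static BigDecimal lineTotal(BigDecimal unitPrice, int quantity) {
        return unitPrice.multiply(BigDecimal.valueOf(quantity));
    }

    // Decimal amounts built from strings add up exactly.
    static BigDecimal add(String a, String b) {
        return new BigDecimal(a).add(new BigDecimal(b));
    }
}
```

For comparison, with double arithmetic 0.1 + 0.2 is not equal to 0.3, whereas Money.add("0.10", "0.20") equals new BigDecimal("0.30").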


TODO rendering thread-safe applications composed of thread-safe classes (in other posts).


TODO locking protocols, sync. policies (in other posts).

TODO safety of classes generated by xjc tool in JAXB (in other posts).

Copyright (c) 2006 Serge Masse.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation. A copy of the license is in http://www.gnu.org/licenses/fdl.txt

The Java annotations (e.g., @GuardedBy, @Immutable, @ThreadSafe) in this post are copyrighted by Brian Goetz and Tim Peierls under these terms:
Copyright (c) 2005 Brian Goetz and Tim Peierls
Released under the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.5)
Official home: http://www.jcip.net
Any republication or derived work distributed in source code form must include this copyright and license notice.


JSR-166 and the JCIP book

One of the best toolkits today for concurrent programming is the JSR-166 API in Java SE 5, together with the related mailing list, concurrency-interest, and the book Java Concurrency In Practice (JCIP).

Nevertheless, the use of these tools shows that it is still very difficult to build concurrent applications and that we need even better tools, such as a new programming language and/or development tools.

Until then, JSR-166, with the great support from Doug Lea and his team, is the best that we have.

first post on multithreading

only the critical programming issues, with an emphasis on Java.