
Synchronization and the Java Memory Model (Part 1): Preamble

Author: Doug Lea  Translator: 蕭歡  Proofreaders: 丁一, 方騰飛

Consider the following tiny class, defined without any synchronization:

final class SetCheck {
  private int  a = 0;
  private long b = 0;

  void set() {
    a =  1;
    b = -1;
  }

  boolean check() {
    return ((b ==  0) ||
            (b == -1 && a == 1));
  }
}

In a purely sequential language, the method check could never return false. This holds even though compilers, run-time systems, and hardware might process this code in a way that you might not intuitively expect. For example, any of the following might apply to the execution of method set:

  • The compiler may rearrange the order of the statements, so b may be assigned before a. If the method is inlined, the compiler may further rearrange the orders with respect to yet other statements.
  • The processor may rearrange the execution order of machine instructions corresponding to the statements, or even execute them at the same time.
  • The memory system (as governed by cache control units) may rearrange the order in which writes are committed to the memory cells corresponding to the variables. These writes may overlap with other computations and memory actions.
  • The compiler, processor, and/or memory system may interleave the machine-level effects of the two statements. For example, on a 32-bit machine the high-order word of b may be written first, followed by the write to a, followed by the write to the low-order word of b.
  • The compiler, processor, and/or memory system may cause the memory cells representing the variables not to be updated until sometime after (if ever) a subsequent check is called, but instead to maintain the corresponding values (for example in CPU registers) in such a way that the code still has the intended effect.


In a sequential language, none of this can matter so long as program execution obeys as-if-serial semantics. Sequential programs cannot depend on the internal processing details of statements within simple code blocks, so statements are free to be manipulated in all these ways. This provides essential flexibility for compilers and machines. Exploitation of such opportunities (via pipelined superscalar CPUs, multilevel caches, load/store balancing, interprocedural register allocation, and so on) is responsible for much of the massive improvement in execution speed seen in computing over the past decades. The as-if-serial property of these manipulations shields sequential programmers from needing to know whether or how they take place. Programmers who never create their own threads are almost never affected by these issues.

Things are different in concurrent programming. Here, it is entirely possible for check to be called in one thread while set is being executed in another, in which case check might be "spying" on the optimized execution of set. And if any of the above manipulations occur, it is possible for check to return false. For example, check could read a value for the long b that is neither 0 nor -1, but instead a half-written in-between value. Also, out-of-order execution of the statements in set may cause check to read b as -1 but then read a as still 0.
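The failure mode just described can be probed with a small stress harness. This is our own sketch, not part of the original text; the class name SetCheckRaceDemo and the iteration count are arbitrary. Note that a run which never observes a false result proves nothing, since whether these anomalies surface depends on the JVM, the JIT, and the hardware:

```java
// Reproduction of the SetCheck class from the text, plus a hypothetical
// stress harness that races set() against check() on fresh instances.
final class SetCheck {
  private int  a = 0;
  private long b = 0;

  void set() {
    a =  1;
    b = -1;
  }

  boolean check() {
    return ((b ==  0) ||
            (b == -1 && a == 1));
  }
}

class SetCheckRaceDemo {
  public static void main(String[] args) throws InterruptedException {
    int falseResults = 0;
    for (int i = 0; i < 5_000; i++) {
      SetCheck sc = new SetCheck();
      boolean[] result = new boolean[1];
      Thread writer = new Thread(sc::set);
      Thread reader = new Thread(() -> result[0] = sc.check());
      // Start both threads so set() and check() may overlap arbitrarily.
      reader.start();
      writer.start();
      writer.join();
      reader.join();
      if (!result[0]) falseResults++;      // only a torn or reordered write can cause this
    }
    System.out.println("check() returned false " + falseResults + " times out of 5000 racy runs");
  }
}
```

Single-threaded, check always returns true: a fresh instance satisfies b == 0, and after set completes both b == -1 and a == 1 hold. Only the concurrent interleavings can expose a false result.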

In other words, not only may concurrent executions be interleaved, they may also be reordered and otherwise manipulated into an optimized form that bears little resemblance to the source code. As compiler and run-time technology matures and multiprocessors become more prevalent, such phenomena become more common. They can lead to surprising results for programmers with backgrounds in sequential programming (in other words, just about all programmers) who have never been exposed to the underlying execution properties of allegedly sequential code. This can be the source of subtle concurrent programming errors.

In almost all cases, there is an obvious, simple way to avoid contemplating all the complexities that arise in concurrent programs due to optimized execution mechanics: use synchronization. For example, if both methods in class SetCheck are declared as synchronized, then you can be sure that no internal processing details can affect the intended outcome of this code.
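The synchronized variant can be sketched as follows (the name SyncSetCheck is ours; the original simply says to declare both methods synchronized). Because both methods lock the same monitor, the writes performed in set are atomic and visible to any caller of check:

```java
// Sketch of the synchronized variant described above. Both methods lock
// the same monitor (this), so no caller of check() can observe a
// half-written b or the statements of set() out of their source order.
final class SyncSetCheck {
  private int  a = 0;
  private long b = 0;

  synchronized void set() {
    a =  1;
    b = -1;
  }

  synchronized boolean check() {
    return ((b ==  0) ||
            (b == -1 && a == 1));
  }
}

class SyncSetCheckDemo {
  public static void main(String[] args) {
    SyncSetCheck s = new SyncSetCheck();
    s.set();
    System.out.println("check() after set(): " + s.check());  // prints true
  }
}
```

With this version, check can only observe the state either before set ran (b == 0) or after it completed (b == -1 && a == 1), never an intermediate one.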

But sometimes you cannot or do not want to use synchronization. Or perhaps you must reason about someone else's code that does not use it. In these cases you must rely on the minimal guarantees about resulting semantics spelled out by the Java Memory Model. This model allows all the kinds of manipulations listed above, but bounds their potential effects on execution semantics, and additionally points to some techniques programmers can use to control some aspects of these semantics (most of which are discussed in §2.4).

The Java Memory Model is part of The Java™ Language Specification, described primarily in JLS chapter 17. Here, we discuss only the basic motivation, properties, and programming consequences of the model. The treatment here reflects a few clarifications and updates that are missing from the first edition of the JLS.

The assumptions underlying the model can be viewed as an idealization of a standard SMP machine of the sort described in §1.2.4.

For purposes of the model, every thread can be thought of as running on a different CPU from any other thread. Even on multiprocessors, this is infrequent in practice, but the fact that this CPU-per-thread mapping is among the legal ways to implement threads accounts for some of the model's initially surprising properties. For example, because CPUs hold registers that cannot be directly accessed by other CPUs, the model must allow for cases in which one thread does not know about values being manipulated by another thread. However, the impact of the model is by no means restricted to multiprocessors. The actions of compilers and processors can lead to identical concerns even on single-CPU systems.

The model does not specifically address whether the kinds of execution tactics discussed above are performed by compilers, CPUs, cache controllers, or any other mechanism. It does not even discuss them in terms of the classes, objects, and methods familiar to programmers. Instead, the model defines an abstract relation between threads and main memory. Every thread is defined to have a working memory (an abstraction of caches and registers) in which to store values. The model guarantees a few properties surrounding the interactions of instruction sequences corresponding to methods and memory cells corresponding to fields. Most rules are phrased in terms of when values must be transferred between main memory and per-thread working memory. The rules address three intertwined issues:

  1. Atomicity: which instructions must have indivisible effects. For purposes of the model, these rules need to be stated only for simple reads and writes of memory cells representing fields (instance and static variables, also including array elements, but not including local variables inside methods).
  2. Visibility: under what conditions the effects of one thread are visible to another. The effects of interest here are writes to fields, as seen via reads of those fields.
  3. Ordering: under what conditions the effects of operations can appear out of order to any given thread. The main ordering issues surround reads and writes associated with sequences of assignment statements.
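The visibility rule in particular can be illustrated with a minimal sketch (our own example; the class VisibleFlag is hypothetical, not from the original text). Guarding both the write and the read with the same lock guarantees that a value stored by one thread is reloaded from main memory by another, rather than served from a stale copy in its working memory:

```java
// Hypothetical illustration of the visibility rule: both accessors use
// the same lock, so a value written via setFlag() must become visible
// to any subsequent getFlag() call in another thread.
class VisibleFlag {
  private boolean flag = false;            // guarded by this

  synchronized void setFlag(boolean v) {   // write is flushed to main memory
    flag = v;                              // when the monitor is released
  }

  synchronized boolean getFlag() {         // read is refreshed from main memory
    return flag;                           // when the monitor is acquired
  }
}

class VisibleFlagDemo {
  public static void main(String[] args) throws InterruptedException {
    VisibleFlag f = new VisibleFlag();
    Thread writer = new Thread(() -> f.setFlag(true));
    writer.start();
    writer.join();   // join() also orders the write before the read
    System.out.println("flag seen by main thread: " + f.getFlag()); // prints true
  }
}
```

Without the synchronized keywords (or another mechanism such as volatile), the model would permit a reader thread to keep returning the stale value false indefinitely.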

When synchronization is used consistently, each of these properties has a simple characterization: all changes made in one synchronized method or block are atomic and visible with respect to other synchronized methods and blocks employing the same lock, and processing of synchronized methods or blocks within any given thread is in program-specified order. Even though processing of statements within blocks may be out of order, this cannot matter to other threads employing synchronization.

When synchronization is not used or is used inconsistently, answers become more complex. The guarantees made by the memory model are weaker than most programmers intuitively expect, and also weaker than those typically provided by any given JVM implementation. This imposes additional obligations on programmers attempting to ensure the object consistency relations that lie at the heart of exclusion practices: objects must maintain invariants as seen by all threads that rely on them, not just by the thread performing any given state modification.
