This paper discusses memory consistency models and their influence on software in the context of parallel machines. In the first part we review previous work on memory consistency models. The second part discusses the issues that arise due to weakening memory consistency. We are especially interested in the influence that weakened consistency models have on language, compiler, and runtime system design. We conclude that tighter interaction between those parts and the memory system might improve performance considerably.
Department of Computer Science
The University of Arizona
Tucson, AZ 85721
This is an updated version of [Mos93]
Shared memory can be implemented at the hardware
or software level. In the latter case it is usually called
Distributed Shared Memory (DSM). At both levels work
has been done to reap the benefits of weaker models. We
conjecture that in the near future most parallel machines
will be based on consistency models significantly weaker
than SC [LLG+ 92, Sit92, BZ91, CBZ91, KCZ92].
The rest of this paper is organized as follows. In section 2 we discuss issues characteristic of memory consistency models. In the following section we present several consistency models and their implications for the programming model. We then take a look at implementation options in section 4. Finally, section 5 discusses the influence of weakened memory consistency models on software. In particular, we discuss the interactions between a weakened memory system and the software using it.
Traditionally, memory consistency models were of interest only to computer architects designing parallel machines. The goal was to present a model as close as possible to the model exhibited by sequential machines.
The model of choice was sequential consistency (SC).
Sequential consistency guarantees that the result of any
execution of n processors is the same as if the operations of all processors were executed in some sequential order, and the operations of each individual processor
appear in this sequence in the order specified by the
program. However, this model severely restricts the set
of possible optimizations. For example, in an architecture with a high-latency memory, it would be beneficial to pipeline write accesses and to use write buffers. None of these optimizations is possible with the strict
SC model. Simulations have shown that weaker models
allowing such optimizations could improve performance
on the order of 10 to 40 percent over a strictly sequential
model [GGH91, ZB92]. However, weakening the memory consistency model goes hand in hand with a change in the programming model. In general, the programming model becomes more restricted (and complicated) as the consistency model becomes weaker. That is, an
architecture can employ a weaker memory model only
if the software using it is prepared to deal with the new
programming model. Consequently, memory consistency models are now of concern to operating system and language designers too.
We can also turn the coin around. A compiler normally considers memory accesses to be expensive and therefore tries to replace them with accesses to registers. In terms of a memory consistency model, this means that certain accesses are suddenly no longer observable.
In effect, compilers implicitly generate weak memory
consistency. This is possible because a compiler knows
exactly (or estimates conservatively) the points where
memory has to be consistent. For example, compilers
typically write back register values before a function
call, thus ensuring consistency. It is only natural to attempt to make this implicit weakening explicit so that the memory system can take advantage of it too. In fact,
it is anticipated that software could gain from a weak
model to a much higher degree than hardware [GGH91]
by enabling optimizations such as code scheduling...