

Milo M.K. Martin, Daniel J. Sorin, Harold W. Cain, Mark D. Hill, and Mikko H. Lipasti

Computer Sciences Department and Department of Electrical and Computer Engineering, University of Wisconsin–Madison

(C) 2001 Daniel Sorin

#### **Big Picture**

- Naïve value prediction can break concurrent systems
- Microprocessors incorporate concurrency
  - Multithreading (SMT)
  - Multiprocessing (SMP, CMP)
  - Coherent I/O
- Correctness defined by memory consistency model
  - Comparing predicted value to actual value not always OK
  - Different issues for different models
- Violations can occur in practice
- Solutions exist for detecting violations



















#### Sequential Consistency

- Simplest memory consistency model
- Must exist total order of all operations
- Total order must respect program order at each processor
- Our example execution has a cycle
  - No total order exists
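
The total-order requirement can be checked mechanically: treat every memory operation as a node, add an edge for each ordering constraint, and test whether the graph is acyclic. The sketch below is only an illustration. The helper `find_total_order`, the operation labels, and the two-processor "data then flag" execution are assumptions made for this example and are not the execution from the original slides. It needs Python 3.9 or later for `graphlib`.

```python
from graphlib import CycleError, TopologicalSorter


def find_total_order(edges):
    """Return one total order consistent with `edges`, or None if they form a cycle."""
    sorter = TopologicalSorter()
    for earlier, later in edges:
        sorter.add(later, earlier)  # `later` must come after `earlier`
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


# Hypothetical execution: P1 writes data and then a flag; P2 reads the flag and then the data.
PROGRAM_ORDER = [
    ("P1: store data", "P1: store flag"),
    ("P2: load flag", "P2: load data"),
]
# What P2 observed when its data load ran too early.
OBSERVED = [
    ("P1: store flag", "P2: load flag"),  # P2 saw the new flag
    ("P2: load data", "P1: store data"),  # P2 saw the old data, so its load precedes the store
]

print(find_total_order(PROGRAM_ORDER))             # one legal total order
print(find_total_order(PROGRAM_ORDER + OBSERVED))  # None
```

Program order alone can always be linearized. Adding the observed values closes a cycle, so no total order exists and the execution is not sequentially consistent.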











#### How to Fix SC Implementations

- Address-based detection of violations
  - Student watches board B between prediction and verification
  - Like existing techniques for out-of-order SC processors
  - Track stores from other threads
  - If address matches speculative load, possible violation
- Value-based detection of violations
  - Student checks grade again at verification
  - Also an existing idea
  - Replay all speculative instructions at commit
  - Can be done with dynamic verification (e.g., DIVA)
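
The two detection styles can be contrasted in a toy model. This is a minimal sketch under stated assumptions, not the hardware mechanism itself: `SpeculationWindow`, `run`, and the single address `"B"` (named after board B in the bullets above) are hypothetical, and one shared dictionary stands in for coherent memory.

```python
from dataclasses import dataclass, field


@dataclass
class SpeculationWindow:
    """Loads performed speculatively by one thread, plus the remote stores it has seen."""
    memory: dict
    loads: dict = field(default_factory=dict)   # address -> value seen speculatively
    snooped: set = field(default_factory=set)   # addresses written by other threads

    def speculative_load(self, addr):
        value = self.memory[addr]
        self.loads[addr] = value
        return value

    def remote_store(self, addr, value):
        self.memory[addr] = value
        self.snooped.add(addr)

    def address_based_violation(self):
        # Possible violation if any remote store hit a speculatively loaded address.
        return any(addr in self.snooped for addr in self.loads)

    def value_based_violation(self):
        # Replay each speculative load at commit and compare with the value used earlier.
        return any(self.memory[addr] != seen for addr, seen in self.loads.items())


def run(new_grade):
    window = SpeculationWindow(memory={"B": 0})
    window.speculative_load("B")          # early load, between prediction and verification
    window.remote_store("B", new_grade)   # another thread writes B in the meantime
    return window.address_based_violation(), window.value_based_violation()


changed = run(new_grade=90)  # (True, True)
silent = run(new_grade=0)    # (True, False)
print(changed, silent)
```

In this model the address-based check is conservative: it flags a possible violation even when the remote store rewrites the same value, while the value-based replay reports a violation only when the value actually differs.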


#### Outline

- The Issues
- The Problem
- Value Prediction and Sequential Consistency
- Value Prediction and Relaxed Consistency Models
  - Relaxed consistency models
  - Value prediction and processor consistency (PC)
  - Value prediction and weakly ordered models
- Conclusions


#### Relaxed Consistency Models

- Relax some orderings between reads and writes
- Allows HW/SW optimizations
- Software must add memory barriers to get ordering
- Intuition: should make value prediction easier
  - Our intuition is wrong ...


#### Processor Consistency

- Just like SC, but relaxes order from write to read
- Optimization: allows for FIFO store queue
- Examples of PC models:
  - SPARC Total Store Order
  - IA-32
- Bad news
  - Same VP issues as for SC
  - Intuition: VP breaks read-to-read dependence order
  - Relaxing write-to-read order doesn't change issues
- Good news
  - Same solutions as for SC
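
To make the relaxed write-to-read order concrete, the sketch below enumerates every interleaving of a small two-thread program, once with stores written straight to memory (SC-like) and once with a per-thread FIFO store queue (a simplified TSO-style model). The function `outcomes`, the tuple encoding of instructions, and the store-buffering test program are all assumptions chosen for illustration. The model ignores caches, coherence traffic, and value prediction.

```python
def outcomes(program, use_store_queue):
    """Return every reachable set of final register values for `program`."""
    n = len(program)
    results = set()

    def replace(tup, i, item):
        return tup[:i] + (item,) + tup[i + 1:]

    def explore(pcs, queues, memory, regs):
        finished = True
        for t in range(n):
            # Choice 1: drain thread t's oldest queued store to memory (FIFO order).
            if queues[t]:
                finished = False
                addr, value = queues[t][0]
                explore(pcs, replace(queues, t, queues[t][1:]), {**memory, addr: value}, regs)
            # Choice 2: execute thread t's next instruction.
            if pcs[t] < len(program[t]):
                finished = False
                kind, addr, arg = program[t][pcs[t]]
                next_pcs = replace(pcs, t, pcs[t] + 1)
                if kind == "store" and use_store_queue:
                    explore(next_pcs, replace(queues, t, queues[t] + ((addr, arg),)), memory, regs)
                elif kind == "store":
                    explore(next_pcs, queues, {**memory, addr: arg}, regs)
                else:  # load: forward from the youngest matching queued store, else read memory
                    value = memory[addr]
                    for queued_addr, queued_value in queues[t]:
                        if queued_addr == addr:
                            value = queued_value
                    explore(next_pcs, queues, memory, {**regs, arg: value})
        if finished:
            results.add(tuple(sorted(regs.items())))

    addresses = {op[1] for thread in program for op in thread}
    explore((0,) * n, ((),) * n, dict.fromkeys(addresses, 0), {})
    return results


# Hypothetical store-buffering test: each thread writes one location, then reads the other.
STORE_BUFFERING = (
    (("store", "x", 1), ("load", "y", "r0")),
    (("store", "y", 1), ("load", "x", "r1")),
)
both_zero = (("r0", 0), ("r1", 0))
sc = outcomes(STORE_BUFFERING, use_store_queue=False)
pc = outcomes(STORE_BUFFERING, use_store_queue=True)
print(both_zero in sc, both_zero in pc)  # False True
```

The store queue adds exactly one outcome here: both loads return 0 because each load passes its own thread's earlier store. Read-to-read order is untouched by this relaxation, which fits the slide's point that the value prediction issues are the same as for SC.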





#### Violating Consistency Model

- Simple value prediction can break SPARC RMO, PowerPC, and IA-64
- How? By relaxing dependence order between reads
- Same issues as for SC and PC
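
The loss of dependence order can be illustrated with a single hand-picked interleaving. Everything in this sketch is assumed for illustration: the function `run_reader`, the pointer-publication scenario, and the memory layout are hypothetical, and a dictionary lookup stands in for a dependent load. The writer initializes a node and then publishes a pointer to it; the reader loads the pointer and then loads through it.

```python
def run_reader(predict_pointer):
    """Return (pointer read, data read, whether naive verification passed)."""
    # "new_node" holds 0 until the writer initializes it.
    memory = {"ptr": "old_node", "old_node": 1, "new_node": 0}

    def writer():
        memory["new_node"] = 42      # initialize the data
        # A relaxed model would need a memory barrier here on the writer side.
        memory["ptr"] = "new_node"   # publish the pointer

    if predict_pointer:
        predicted = "new_node"       # assumed (correct) prediction of the pointer value
        r2 = memory[predicted]       # dependent load runs early, before the writer
        writer()
        r1 = memory["ptr"]           # the real pointer load executes later
        verified = (r1 == predicted) # naive check: predicted value equals actual value
    else:
        writer()
        r1 = memory["ptr"]
        r2 = memory[r1]              # dependent load waits for the pointer
        verified = True
    return r1, r2, verified


print(run_reader(predict_pointer=False))  # ('new_node', 42, True)
print(run_reader(predict_pointer=True))   # ('new_node', 0, True)
```

With prediction, the reader ends up with the new pointer and the stale data even though the comparison of predicted and actual pointer values succeeds. Without prediction the dependent load cannot start until the pointer is known, so the stale read cannot happen in this interleaving.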


#### Solutions to Problem

1. Don't enforce dependence order (add memory barriers)
   - Changes architecture
   - Breaks backward compatibility
   - Not practical
2. Enforce SC or PC
   - Potential performance loss
3. More efficient solutions possible









#### Conclusions

- Naïve value prediction can violate consistency
- Subtle issues for each class of memory model
- Solutions for SC & PC require detection mechanism
  - Use existing mechanisms for enhancing SC performance
- Solutions for more relaxed memory models
  - Enforce stronger model