#### ESE5320: System-on-a-Chip Architecture

Day 25: November 30, 2022 Real-Time Scheduling

Penn ESE5320 Fall 2022 -- DeHon



#### Today

#### Real Time

- Part 1: Synchronous Reactive Model
- Part 2: Interrupts and IO
  - Polling alternative
  - Timer?
- Part 3: Resource Scheduling Graphs

## Message

- Scheduling is key to real time
  - Analysis
  - Guarantees

#### Synchronous Circuit Model

- A simple synchronous circuit is a good "model" for real-time task
  - Run at fixed clock rate
  - Take input every cycle
  - Produce output every cycle
  - Complete computation between input and output
  - Designed to run at fixed-frequency
    - Critical path meets frequency requirement


#### Synchronous Reactive Model

- Discipline for Real-Time tasks
- Embodies "synchronous circuit model"

#### Synchronous Reactive

- There is a rate for interaction with external world (like the clock)
- Computation scheduled around these clock ticks (or time-slices)
  - Continuously running threads
  - Each thread performs action per tick
- Inputs and outputs processed at this rate
- Computation can "react" to events
- Reactions finite and processed before next tick

#### Thread Form

while (1) { tick(); }

- tick() -- yields after doing its work
  - Until next master cycle
  - May be state machine
    - May change state and have different behavior based on state
  - May trigger actions to respond to events (inputs)
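
A minimal sketch of a `tick()` written as a state machine, assuming a hypothetical two-state thread (`IDLE`, `ACTIVE`) and a single boolean input event per tick. The names `thread_t`, `state_t`, and `run_thread` are illustrative and do not come from the slides. The driver runs a fixed number of ticks instead of `while (1)` so that it terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thread state: two states and a count of actions taken. */
typedef enum { IDLE, ACTIVE } state_t;

typedef struct {
    state_t state;
    int actions;
} thread_t;

/* One tick: do a bounded amount of work, then return (yield)
 * until the next master cycle. Behavior depends on the state. */
void tick(thread_t *t, int input_event) {
    assert(t != NULL);
    switch (t->state) {
    case IDLE:
        if (input_event) t->state = ACTIVE;   /* react to the event */
        break;
    case ACTIVE:
        t->actions++;                         /* per-tick action */
        if (!input_event) t->state = IDLE;
        break;
    }
}

/* Stand-in for while (1) { tick(); }: one call per master cycle. */
int run_thread(thread_t *t, const int *events, int n) {
    for (int i = 0; i < n; i++) tick(t, events[i]);
    return t->actions;
}
```

Each call does a bounded amount of work, which is what makes a worst-case time per tick well defined.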


#### Tick Rate

- Driven by application demands of external control
  - Control loop 100 Hz
    - Robot, airplane, car, manufacturing plant
  - Video at 33 fps
  - Game with 20ms response
  - Router with 1ms packet latency
    - 12µs


#### Tick Rate

- Multiple rates
  - May need master tick as least-common multiple of set of interaction rates
    - ...and lower freq. events scheduled less frequently
  - E.g. 100Hz control loop and 33Hz video
    - Master at 10ms
    - Schedule video over 3 10ms time-slots
      - May force decomposition into tasks that fit into the smaller time window, since must schedule events at highest frequency
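
The 100 Hz / 33 Hz example can be sketched as a 10 ms master tick that runs the control step every tick and one third of a video frame per tick. This is only an illustration: `control_step`, `video_slice`, `counters_t`, and `run_master` are hypothetical stand-ins, and the even three-way split of the video work is an assumption.

```c
#include <assert.h>
#include <stddef.h>

enum { VIDEO_SLICES = 3 };  /* one video frame spread over three 10 ms slots */

typedef struct {
    unsigned control_updates;  /* 100 Hz work completed */
    unsigned video_frames;     /* frames finished, one every third tick */
} counters_t;

/* Hypothetical stand-ins for the real per-tick work. */
static void control_step(counters_t *c) { c->control_updates++; }

static void video_slice(counters_t *c, unsigned part) {
    /* A frame is complete once its last slice has run. */
    if (part == (unsigned)(VIDEO_SLICES - 1)) c->video_frames++;
}

/* One 10 ms master tick: the high-rate task runs every tick,
 * the low-rate task runs one slice per tick. */
void master_tick(counters_t *c, unsigned tick) {
    assert(c != NULL);
    control_step(c);
    video_slice(c, tick % (unsigned)VIDEO_SLICES);
}

counters_t run_master(unsigned ticks) {
    counters_t c = { 0, 0 };
    for (unsigned t = 0; t < ticks; t++) master_tick(&c, t);
    return c;
}
```

Over 300 ticks (3 s at 10 ms per tick) this yields 300 control updates and 100 video frames, i.e. roughly 33 frames per second.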

#### Synchronous Reactive

- Ideal model
  - Per tick reaction (task processing) instantaneous
- Separate function from compute time
- Separate function from technology
  - Feature size, processor mapped to
- Like synchronous circuit
  - If logic correct, works when run clock slow enough
  - Works functionally when change technology
  - Then focus on reducing critical path (making timing work)

#### **Timing and Function**

- Why want to separate function from technology and timing?
- Move to slower processor(s):
  - What would happen if just moved?
  - What needs to happen?
- Move to faster processor(s):
  - What would happen if just moved?
  - What want to happen?


#### Synchronous Reactive Timing

- Once functional,
  - need to guarantee all tasks (in all states)
    - Can complete in tick time-slot
    - On particular target architecture
- Identify WCET (worst-case execution time)
  - Like critical path in FSM circuit
  - Time of task on processor target

#### Preclass 1

Worst-case object processing time?

```
void tick() {
  for (int i = 0; i < MAX_OBJECTS; i++) {
    obj[i].inputs();              // see below
    obj[i].updatePositionState(); // 1,000 cycles
    obj[i].collide();             // 9,000 cycles
    obj[i].render();              // 1,000 cycles
  }
  updateScreen();                 // takes 10 ms
}

// for object class
void inputs() {
  int move = getMoveInput();      // 10
  int fire = getFireInput();      // 10
  switch (move) {
  case LEFT:    moveLeft();       break; // 10
  case RIGHT:   moveRight();      break; // 10
  case FORWARD: thrustIncrease(); break; // 5,000
  case BACK:    thrustDecrease(); break; // 4,000
  default:      break;
  }
  if (fire) processFire();        // 10,000
}
```

#### Preclass 1

Time available to process objects?

```
void tick() {
  for (int i = 0; i < MAX_OBJECTS; i++) {
    obj[i].inputs();              // see below
    obj[i].updatePositionState(); // 1,000 cycles
    obj[i].collide();             // 9,000 cycles
    obj[i].render();              // 1,000 cycles
  }
  updateScreen();                 // takes 10 ms
}
```

#### Preclass 1

- Maximum number of objects on a single 1 GHz processor?
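
A rough sketch of the arithmetic behind these Preclass 1 questions, using the cycle counts annotated in the code. The tick period is not stated on the slides, so it is a parameter here; the 30 ms value in the usage note is purely an assumption for illustration. Loop overhead is ignored, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Cycle costs as annotated on the slide. */
enum {
    GET_MOVE = 10,
    GET_FIRE = 10,
    WORST_MOVE = 5000,     /* thrustIncrease() is the costliest switch arm */
    PROCESS_FIRE = 10000,  /* worst case assumes fire is pressed */
    UPDATE_POS = 1000,
    COLLIDE = 9000,
    RENDER = 1000
};

/* Worst-case cycles to process one object in a tick. */
uint64_t worst_case_object_cycles(void) {
    uint64_t inputs = GET_MOVE + GET_FIRE + WORST_MOVE + PROCESS_FIRE;
    return inputs + UPDATE_POS + COLLIDE + RENDER;
}

/* Objects that fit in one tick after updateScreen() takes screen_ms.
 * tick_ms is an assumed parameter, not a value from the slides. */
uint64_t max_objects(uint64_t clock_hz, uint64_t tick_ms, uint64_t screen_ms) {
    assert(tick_ms > screen_ms);
    uint64_t cycles_available = clock_hz / 1000 * (tick_ms - screen_ms);
    return cycles_available / worst_case_object_cycles();
}
```

With these counts the worst case is 26,020 cycles per object. Under the assumed 30 ms tick, 20 ms remain after `updateScreen()`, so `max_objects(1000000000, 30, 10)` gives 768 objects on a 1 GHz processor.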

#### Synchronous Reactive Timing

- Once functional,
  - need to guarantee all tasks (in all states) can complete in tick time-slot
  - On particular target architecture
- Identify WCET
  - Like critical path in FSM circuit
  - Time of task on processor target
- Schedule onto platform
  - Threads onto processor(s)

#### Synchronous Reactive Model

- Discipline for Real-time tasks
- Embodies the "synchronous circuit model"
  - Master clock rate
  - Computation decomposed per clock
  - Functionality assuming instantaneous compute
  - On platform, guarantee runs fast enough to complete critical path at "clock" rate

#### Interrupts and IO

Part 2


#### Interrupt

- External event that redirects processor flow of control
- Typically forces a thread switch
- Common for I/O, Timers
  - Indicate a need for attention

#### Interrupts

- Why would we use interrupts for I/O?

#### Interrupts: Good

- Allow processor to run some other work
- Infrequent, irregular tasks serviced with low response latency
  - Low latency
  - Ok when low throughput inputs
    - So infrequent interrupts...

#### Interrupts: Bad

- Time predictability
  - Real-time for computing tasks interrupted
- Processor usage
  - Costs time to switch contexts
- Concurrency management
  - Must deal with tasks executing nonatomically
    - Interleave of interrupted service tasks
    - Perhaps interleave of any task

#### Interrupted Task

- Running something that removes from list
- Interrupt involves adding to list

[Figure: linked list; a points to a chain of (value, next) nodes]

- Add to list
  - atmp=a
  - new->next=atmp
  - a=new
- Remove from list
  - removed=a->value
  - rtmp=a->next
  - a=rtmp

#### What can happen?

- Sequence
  - removed=a->value
  - rtmp=a->next
  - <interrupt>
  - atmp=a
  - new->next=atmp
  - a=new
  - <return>
  - a=rtmp
- What goes wrong?
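
One interleaving consistent with the add and remove steps above can be replayed deterministically in plain C: the removal is interrupted after it reads `a->next`, the handler's add runs to completion, and the removal then resumes with its stale `rtmp`. This is a sketch of that one ordering only, with the "interrupt" simulated by writing the statements in sequence. `node_t`, `fresh` (standing in for `new`), and `replay_interleaving` are illustrative names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    int value;
    struct node *next;
} node_t;

/* Returns 1 if target can be reached from head a, else 0. */
static int reachable(const node_t *a, const node_t *target) {
    for (; a != NULL; a = a->next)
        if (a == target) return 1;
    return 0;
}

/* Replays the interleaving by hand. Returns 1 if the node added by
 * the "interrupt" is still on the list afterwards, 0 if it was lost. */
int replay_interleaving(int *removed_out) {
    node_t second = { 2, NULL };
    node_t first  = { 1, &second };
    node_t fresh  = { 99, NULL };   /* the node the handler adds ("new") */
    node_t *a = &first;             /* shared list head */

    assert(removed_out != NULL);

    /* Interrupted task: first part of remove-from-list. */
    int removed = a->value;
    node_t *rtmp = a->next;

    /* <interrupt>: handler adds to list. */
    node_t *atmp = a;
    fresh.next = atmp;
    a = &fresh;

    /* <return>: removal resumes with its stale rtmp. */
    a = rtmp;

    *removed_out = removed;
    return reachable(a, &fresh);
}
```

In this ordering the final `a=rtmp` overwrites the handler's `a=new`, so the added node is no longer reachable. Making the three removal steps atomic with respect to the handler, for example by masking the interrupt around them, would avoid this particular loss.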



#### Interrupts: Bad

- Time predictability
  - Real-time for computing tasks interrupted
- Processor usage
  - Costs time to switch contexts
- Concurrency management
  - Must deal with tasks executing nonatomically
    - Interleave of interrupted service tasks
    - Perhaps interleave of any task

#### IO Thread

while (1) { process\_input(); }

- Like tick() -- yields after doing its work
  - Wait for next master cycle

#### Polling Discipline

- Alternate to I/O interrupts
- Every I/O task is a thread
- Budget time and rate it needs to run
  - E.g. 10,000 cycles every 5ms
  - Likely tied to
    - Buffer sizes
    - Response latency
- Schedule I/O threads as real-time tasks
  - Some can be DMA channels

#### Preclass 2

- Input at 100KB/s
- 30ms time-slot window
- Size of buffer?
- 100 cycles/byte, 1 GHz processor: runtime of service routine?
  - Fraction of processor capacity?
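
The Preclass 2 quantities follow from simple rate arithmetic. The sketch below assumes 1 KB = 1000 bytes and a 1 GHz clock, and it sizes the buffer to hold exactly the data arriving in one window with no extra margin. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes arriving during one scheduling window. */
uint64_t buffer_bytes(uint64_t bytes_per_s, uint64_t window_ms) {
    return bytes_per_s * window_ms / 1000;
}

/* Cycles for the polling thread to drain the buffer. */
uint64_t service_cycles(uint64_t bytes, uint64_t cycles_per_byte) {
    return bytes * cycles_per_byte;
}

/* Fraction of the processor the service routine uses per window. */
double processor_fraction(uint64_t cycles, uint64_t clock_hz,
                          uint64_t window_ms) {
    assert(clock_hz > 0 && window_ms > 0);
    double window_cycles = (double)clock_hz * (double)window_ms / 1000.0;
    return (double)cycles / window_cycles;
}
```

Under these assumptions the buffer needs 3,000 bytes, the service routine takes 300,000 cycles (0.3 ms at 1 GHz), and it uses about 1% of the processor in each 30 ms window.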

Scheduling I/O Tasks

#### **Timer Interrupts**

- Why do we have timer interrupts in conventional operating systems?
  - E.g. in Linux?

## Timer Interrupts?

- Bounded-time tasks
  - E.g. reactive tasks in real-time
  - Task has guarantee to release processor within time window
  - Do not need timer interrupts to regain control from task
  - (Maybe use deadline operations [Day24] for timer)

#### **Real-Time Tasks**

- Interrupts less attractive
  - More disruptive
- Scheduled polling better predictability
- Fits with Synchronous Reactive Model

#### Timer Interrupts

- Best effort tasks (i.e. non-real-time tasks)
  - Have no guarantee to finish in bounded time
  - Timer interrupts necessary
    - to allow other threads to run
    - fairness
    - to switch to real-time service tasks
- Need timer interrupts if need to share processor with best-effort and real-time threads
  - Alternate: Easier to segregate real-time and best-effort threads onto different processors

### **Greedy Strategy**

- Schedule real-time tasks
  - Scheduled based on worst-case, so may not use all time allocated
- Run best-effort tasks at end of timeslice after real-time tasks complete
  - Timer-interrupt to recover processor in time for start of next scheduling time slot
- (adds complexity)

#### Resource Scheduling Graphs

Part 3


## Scheduling

- Useful to think about scheduling a processor by task usage
- Useful to budget and co-schedule required resources
  - Bus
  - Memory port
  - DMA channel

#### Simple Task Model

- Task requires
  - Data to be transferred (in and out)
  - Local storage state
  - Computational cycles
  - (Result data to be transferred)
- Uses resources
  - Bus/channel to transfer data
  - Space in memory on accelerator
  - Cycles on accelerator

#### Resource Schedule Graph

- Extend as necessary to capture potentially limiting resources and usage
  - Regions in memories
  - Memory ports
  - I/O channels

#### Approach

- Ideal/initial look at processing requirements
  - Resource bound on processing
- Look for bottlenecks / limits with Resource Bounds independently
  - Add buses, memories, etc.
- Plan/schedule with Resource Schedule Graph

#### Preclass 3a

- Resource Bound
  - Data movement over bus?
  - Compute on 2 processors?
  - Compute on 2 processors when processor must wait while local memory is written?

| Task | Data (bytes) | Compute cycles | Data+Compute Work |
|------|--------------|----------------|-------------------|
| A    | 1600         | 1600           |                   |
| B    | 200          | 600            |                   |
| C    | 800          | 3200           |                   |
| D    | 200          | 600            |                   |
| E    | 400          | 400            |                   |

#### Resource Bound: Wait for Transfer

- Total processor cycles when processor must idle during transfer
  - $\text{Cycles}_{\text{proc}} = \sum_i (\text{Comp}[i] + \text{Bytes}[i])$
- $RB_{\text{proc}} = \text{Cycles}_{\text{proc}} / 2$
- $RB_{\text{bus}} = \sum_i \text{Bytes}[i]$
- $RB = \max(RB_{\text{bus}}, RB_{\text{proc}})$
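
As a sketch, the resource bounds can be evaluated for the Preclass 3a task table. It assumes the bus moves one byte per cycle, which is what adding `Comp[i]` and `Bytes[i]` in the same units implies, and two processors. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *name;
    unsigned bytes;   /* data to transfer */
    unsigned comp;    /* compute cycles */
} task_t;

/* Task table from Preclass 3a. */
static const task_t TASKS[] = {
    { "A", 1600, 1600 },
    { "B",  200,  600 },
    { "C",  800, 3200 },
    { "D",  200,  600 },
    { "E",  400,  400 },
};
#define NUM_TASKS (sizeof TASKS / sizeof TASKS[0])

/* Bus bound: total bytes, assuming one byte per cycle. */
unsigned rb_bus(const task_t *t, size_t n) {
    unsigned total = 0;
    for (size_t i = 0; i < n; i++) total += t[i].bytes;
    return total;
}

/* Processor bound: total work divided over procs, rounded up.
 * If wait_for_transfer is nonzero, transfer time counts as
 * processor time too (the processor idles during the transfer). */
unsigned rb_proc(const task_t *t, size_t n, unsigned procs,
                 int wait_for_transfer) {
    assert(procs > 0);
    unsigned cycles = 0;
    for (size_t i = 0; i < n; i++) {
        cycles += t[i].comp;
        if (wait_for_transfer) cycles += t[i].bytes;
    }
    return (cycles + procs - 1) / procs;
}

/* Overall bound: the larger of the bus and processor bounds. */
unsigned resource_bound(const task_t *t, size_t n, unsigned procs,
                        int wait_for_transfer) {
    unsigned bus = rb_bus(t, n);
    unsigned proc = rb_proc(t, n, procs, wait_for_transfer);
    return bus > proc ? bus : proc;
}
```

For this table the bus bound is 3,200 cycles, the two-processor compute bound is 3,200 cycles, and the bound when processors wait on transfers is 4,800 cycles, so the overall bound in that case is 4,800.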

#### Preclass 3b Schedule

- Processor waits for data load
- 200 cycle intervals

#### **Double Buffering**

- Common trick to overlap compute and communication
- Reserve two buffers for input (output)
- Alternate buffer use for input
- Producer fills one buffer while consumer works from the other
- Swap between tasks
- Trade off memory for concurrency
- Sub-buffers in Vitis clEnqueueMigrateMemObjects
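
A minimal, sequential sketch of the buffer alternation, assuming hypothetical `produce` and `consume` routines and a small fixed block size. In a real system the fill of one buffer (for example by DMA) would run concurrently with the compute on the other; here the two are simply written one after the other to show which buffer each side touches at each step.

```c
#include <assert.h>
#include <stddef.h>

enum { BUF_LEN = 4 };

/* Hypothetical producer: fills a buffer with the data of one block. */
static void produce(int *buf, int block) {
    for (int i = 0; i < BUF_LEN; i++) buf[i] = block * BUF_LEN + i;
}

/* Hypothetical consumer: sums the contents of a buffer. */
static long consume(const int *buf) {
    long sum = 0;
    for (int i = 0; i < BUF_LEN; i++) sum += buf[i];
    return sum;
}

/* Processes num_blocks blocks using two alternating buffers. */
long run_double_buffered(int num_blocks) {
    int buf[2][BUF_LEN];
    long total = 0;

    assert(num_blocks > 0);
    produce(buf[0], 0);                /* prologue: fill the first buffer */
    for (int step = 1; step <= num_blocks; step++) {
        int fill = step % 2;           /* buffer the producer writes now */
        int drain = 1 - fill;          /* buffer filled in the previous step */
        if (step < num_blocks)
            produce(buf[fill], step);  /* would overlap with the consume below */
        total += consume(buf[drain]);
    }
    return total;
}
```

The producer and consumer never touch the same buffer within a step, which is what allows the transfer and the compute to overlap at the cost of a second buffer.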



#### Double Buffer Schedule

- How impact schedule?
  - Hint: think about how impact preclass 3b schedule?
  - When can move data into buffer?
  - What new freedom have?
  - Impact on use of processor?

[Figure: producer and consumer alternating buffers on even and odd cycles, e.g. consumer reads from buffer 0]



#### Resource Schedule Graphs

- Useful to plan/visualize resource sharing and bottlenecks in SoC
- Supports scheduling
- Necessary for real-time scheduling

#### Big Ideas:

- Scheduling is key to real time
  - Analysis, Guarantees
- Synchronous Reactive
  - Scheduling worst-case tasks "reactions" into master time-slice matching rate
  - Separate function from timing
- Schedule I/O with polling threads
  - Avoid interrupts
- Schedule dependent resources
  - Buses, memory ports, memory regions...

#### Admin

- Feedback
- Wolf Lecture Wednesday at 3pm
  - Tsu-Jae King Liu
  - Sustaining the Semiconductor Revolution
- Reading for Monday online
- P4 due Friday