System-on-a-Chip Architecture

Course: ESE5320

Units: 1.0 CU
Terms: Fall 2024
When: MW 10:15am--11:45am (First Lecture W 8/28/2024)
Where: TOWNE 303
Instructor: DeHon
TAs: Manvi Agarwal, Runlong Hu

Office Hours:	Tuesday	8:30pm-9:30pm	Ketterer Lab
	Wednesday	2pm-3pm	Ketterer Lab
	Thursday	8:30pm-10:30pm	Ketterer Lab

Prerequisite:

Undergraduate	CIS2400, ESE3500, (CIS4710 helpful)
Graduate	working knowledge of C

URL: <http://www.seas.upenn.edu/~ese5320/>


Catalog Level Description:

Motivation, design, programming, optimization, and use of modern System-on-a-Chip (SoC) architectures. Hands-on coverage of the breadth of computer engineering within the context of SoC platforms from gates to application software, including on-chip memories and communication networks, I/O interfacing, RTL design of accelerators, processors, concurrency, firmware and OS/infrastructure software. Formulating parallel decompositions, hardware and software solutions, hardware/software tradeoffs, and hardware/software codesign. Attention to real-time requirements.

Course Objectives

By the end of the course, you will be able to:

design, optimize, and program a modern System-on-a-Chip.
(i) analyze a computational task, (ii) characterize its computational requirements, (iii) identify performance bottlenecks, (iv) identify, explore, and evaluate a rich design space of solutions, and (v) select and implement a design that meets engineering requirements.
decompose the task into parallel components that cooperate to solve the problem.
characterize and develop real-time solutions.
implement both hardware and software solutions, formulate hardware/software tradeoffs, and perform hardware/software codesign.
understand the system on a chip from gates to application software, including on-chip memories and communication networks, I/O interfacing, RTL design of accelerators, processors, firmware and OS/infrastructure software.
understand and estimate key design metrics and requirements including area, latency, throughput, energy, power, predictability, and reliability.

Topics

Architectural building blocks and heterogeneous architecture, Hardware-Software Codesign, Embedded Software, Interfacing, Computational requirements and system analysis, Concurrency, Real Time, Design-space formulation and exploration, Costs and metrics (energy, area, runtime, reliability, predictability), Quantitative design and analysis.

Rough Syllabus Plan

Overview, scope, methodology
Metrics and bottlenecks
Memory
Computational models
Data parallel microarchitectures (SIMD, Vector, GPU)
Thread-level Parallelism and virtualization
Spatial computations, basic mapping from high-level
Fine-grained parallelism microarchitectures (FSMD, VLIW)
High-level synthesis (C-to-gates, resource selection and provisioning)
Verification
Real-time, reactive

The detailed Fall 2024 schedule is on the 2024 Syllabus.

Project

This course includes a substantial project that runs throughout the term. Students work in groups of 3. The platform will be an SoC-FPGA (e.g., Xilinx Zynq or Intel/Altera Arria), allowing the provisioning of soft-core processors, accelerators, and memory in addition to the use of the embedded SoC logic. The project starts with a significant task (like video acquisition, processing, compression, networking). We begin by running the task on a single processor and identifying its resource requirements. Next, we deal with I/O for the task. We then migrate the task to multiple processors to accelerate it. After that, we develop custom accelerators for the task and integrate them with the networked processors. The final half of the course is an open-ended optimization project using the techniques and design options introduced in the course.

Grading

Grading is based on:

Design Project [40%]
Weekly Assignments [20%]
Final [20%]
Midterm [10%]
Engagement [10%]
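
As a rough illustration of how the weights above combine, here is a minimal sketch in C. The function name `course_score` and the 0-100 scale for each component are assumptions made for this example; the page does not say how component scores are normalized or how the total maps to a letter grade.

```c
#include <assert.h>

/* Component weights in percent, copied from the grading list above. */
enum {
    W_PROJECT    = 40,
    W_WEEKLY     = 20,
    W_FINAL      = 20,
    W_MIDTERM    = 10,
    W_ENGAGEMENT = 10
};

/* Hypothetical helper: each argument is a component score on an assumed
 * 0-100 scale. Returns the weighted total on the same 0-100 scale. */
double course_score(double project, double weekly, double final_exam,
                    double midterm, double engagement)
{
    /* Sanity check: the weights must account for the whole grade. */
    assert(W_PROJECT + W_WEEKLY + W_FINAL + W_MIDTERM + W_ENGAGEMENT == 100);

    return (W_PROJECT * project + W_WEEKLY * weekly + W_FINAL * final_exam +
            W_MIDTERM * midterm + W_ENGAGEMENT * engagement) / 100.0;
}
```

For example, `course_score(80, 90, 70, 60, 100)` works out to (3200 + 1800 + 1400 + 600 + 1000) / 100 = 80.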

Policies

Diagnostic Assessment and Wait Lists

We give a diagnostic quiz at the beginning of the course; it is distributed to everyone who is registered or on the wait list. The diagnostic will help determine if you are ready for the course. This course assumes you can read and write C code. If you find the diagnostic assessment difficult or time consuming, that's an indication you should find an alternate course for this term and strengthen your C background to take ESE5320 in the future. In the past few years, anyone who could do reasonably well on the diagnostic quiz has been able to get into the course (but we cannot promise how it will work out in any given year). The diagnostic hasn't differed much from year to year, so you can look at previous years to get a good idea of what you need to be able to do (2023 diagnostic).

Writeups

Writeups must be done in electronic form and submitted through Canvas (see Homework Turnin below). Use CAD or drawing tools where appropriate. Handwritten assignments and hand-drawn figures are not acceptable.

The specific homework assignments will specify what portion of the writeup can be performed jointly and what part should be individual.

Portions of the project milestones and final will be per group. Look for specific instructions associated with the project.

ChatGPT

Work-in-Progress: You may use ChatGPT for cleaning up the prose in your homework assignments and reports.

You must acknowledge the use in your writeup.
You are responsible for all content you turn in, including any parts produced by ChatGPT. Scrutinize anything it gives you for errors. Any errors or misrepresentations that it introduces will be your responsibility and you may lose points for it.
We suspect that ChatGPT will not be useful in developing HLS code for hardware.

Homework Turnin

All assignments will be turned in electronically through the Penn Canvas website. Log in to Canvas with your PennKey and password, then select ESE 5320 from the Courses and Groups dropdown menu. Select Assignments from the links on the left and select the assignment you wish to submit. Submission should be as a single file (preferably .pdf). In some cases, there will be separate assignment submission slots for specific components of the assignment.

Late Assignments

Assignments must be turned in by the published due date to receive credit.

We will grant each student 3 free late days over the entire term, usable on individual turn-in homework assignments or assignment components. That means you could, for example, turn in three assignments one day late each or one assignment 3 days late and still receive full credit. The quantum for free late days is a day, so you cannot turn in every assignment 6 hours late and receive full credit. There are no free late days for group turn-ins.
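
The day quantum can be sketched in C as follows. This assumes lateness is measured in whole hours and that any fraction of a day consumes a full late day, which is how we read the 6-hour example above. The names `late_days_charged` and `remaining_after`, and the use of -1 to flag a submission the balance cannot cover, are hypothetical choices for illustration.

```c
#include <assert.h>

#define FREE_LATE_DAYS 3 /* per student, per term, individual turn-ins only */

/* Late days consumed by one submission: any partial day rounds up to a
 * whole day (assumed reading of the "quantum is a day" rule). */
int late_days_charged(int hours_late)
{
    assert(hours_late >= 0);
    return (hours_late + 23) / 24; /* integer ceiling of hours_late / 24 */
}

/* Returns the free late days left after this submission, or -1 if the
 * submission needs more late days than the balance holds. */
int remaining_after(int balance, int hours_late)
{
    int charged = late_days_charged(hours_late);
    return (charged <= balance) ? balance - charged : -1;
}
```

Under these assumptions, a submission 6 hours late is charged one full day, so starting from `FREE_LATE_DAYS` only three such submissions are covered.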

Collaboration

Students are allowed and encouraged to help each other with the tools and platforms (Vivado, Vivado HLS, Windows, Linux) used for the course, but are disallowed from developing collaborative design solutions (C-code, pragmas, design and analysis) outside of identified project groups. Each team must develop its own design solution; collaborating across teams is a violation of the collaboration policy. Within a project group, the assignment will specify what part should be done as a group and what part should be done individually.

Tools---We know the tools are complex and the documentation often dense or inadequate, and we won't be surprised if they are buggy. It will likely be necessary to collaborate as a class on figuring out how to best use the tools for the term. We encourage students to help each other and share what they learned. We will award bonus points for student-developed instructions and tutorials on how to solve common tasks that arise for the tools.
Design Solutions---Each team (or individual where specified) should develop their own solutions to the design problem and their own implementations. You are taking this class to develop these skills, and we believe you need to work out the solutions on your own to master the skills. You cannot share code, diagrams, specific pragma settings, plots, analysis, metrics, or other results. You cannot share problem decompositions.
HLS Pragmas---HLS Pragmas sit at the border between where collaboration is allowed and not allowed. You are allowed to help make each other aware of the existence of pragmas and the syntax for pragmas. You are not allowed to tell each other what pragma values and settings best solve the problem---you should be reasoning through what the settings mean and how they impact the code mapping, and you should be performing your own experiments in your project teams. You are allowed to say where a pragma goes syntactically (e.g., relative to function header, relative to loop header), but are not allowed to suggest which function or loop would benefit from a specific pragma.
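
To make the syntax side of that border concrete, here is a minimal sketch showing only where pragmas sit relative to a function header and a loop header. The kernel `vector_add` and the size `N` are hypothetical and not taken from any course assignment. The two pragmas are arbitrary picks to show position, not recommendations. An ordinary C compiler ignores the `#pragma HLS` lines (possibly with a warning), so the code still runs in software.

```c
#define N 8 /* hypothetical array size for illustration */

/* Hypothetical kernel: elementwise sum of two fixed-size arrays. */
void vector_add(const int a[N], const int b[N], int out[N])
{
    /* A function-level pragma goes inside the body, after the header. */
#pragma HLS INLINE off

    for (int i = 0; i < N; i++) {
        /* A loop-level pragma goes inside the loop body, after the
         * loop header. */
#pragma HLS PIPELINE II=1
        out[i] = a[i] + b[i];
    }
}
```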

In general, you are expected to abide by Penn's Code of Academic Integrity. If there is any uncertainty, please ask.

Absentees

Use the absence reporting form in Path@Penn to report absences.

Preclass Worksheets

Preclass worksheets will be available for a period of time before the lecture and for at least 24 hours after the lecture. Beyond 24 hours after the lecture, we do not promise that preclass worksheets will still be available. You are responsible for keeping up with the course as it happens, collecting the worksheets, and keeping them to use for review.

Daily Quizzes / Engagement Points

Each lecture has an associated daily quiz. The daily quiz must be completed before the next lecture for you to earn the points. The intent of the quiz is to make sure you are keeping up with the lectures. This course moves fast and there are new ideas to digest every lecture. After attending lecture, it should be easy to complete the daily quiz.

Credit Adjustment

Call any problems with grading to our attention immediately, and no later than the next class meeting after the work is returned or the grade is posted on Canvas. To request a review of the credit assigned on a lab assignment, send an email to the TA stating the nature of the problem and the remedy you desire. We will not consider any requests for grade adjustments that are submitted later than the one-week grace period after the grades are posted on Canvas. You are responsible for checking your posted grades in a timely manner.

Documentation

Vitis Software Platform
Downloadable eBook (not Zynq MPSoC specific)
Zynq UltraScale+ MPSoC Overview, Technical Reference Manual
Ultra96 Board Documentation
Vivado HLS User's Guide
Parallel Programming for FPGAs -- using C and Vivado HLS to program FPGAs

Previous Offerings

Comparison to ESE534

This course inherited less than 25% of the material from the last offering of ESE534. This course does not go deep into how to design a spatial substrate (compute, interconnect), nor does it go deep into the processor--FPGA continuum and instruction design. If offered again (no current plans), ESE534 would likely evolve to take this course as a prerequisite. Possibly ESE534 and ESE535 will merge into a single advanced, follow-on course. Note that ESE534 did not have the kind of hands-on project that is a key component of this course.

Comparison to CIS5710 (formerly CIS501, CIS571)

This course is complementary to CIS5710. This course is more focused on custom, application-oriented design with real-time concerns, while CIS5710 focuses on ISA compatibility and best-effort designs. This course assumes you are willing to recompile and, typically, rewrite your application code; as a result, it does not touch upon the ISA abstraction and compatibility and will have almost nothing on dynamic ILP and pipelining of a general-purpose processor. This course will spend one day on the high-level benefits of memory hierarchy, but will not dive deep into automatic, hardware-managed cache design and cache hierarchies, which are a major component of CIS5710. This course will mostly look at non-shared memory models and architectures with, at most, a small nod to the existence and challenges of shared memory, whereas CIS5710 is mostly focused on shared-memory models and architectures.

Last modified: Fri Sep 6 09:30:38 EDT 2024