Developing FPGA-accelerated cloud applications with SDAccel: Practice
This course, the fourth in a series, is for anyone who wants hands-on practice developing FPGA-accelerated applications with SDAccel.
Introduction to FPGA systems
This series of courses introduces students to reconfigurable FPGA-based systems by discussing their overall architecture and companion design flows. The goal is to present the methodological approaches for designing such systems, along with real industrial tools, examples and common practices.
Course description
This course presents several scenarios in which workloads require more performance than even the fastest CPUs can deliver. This demand is pushing cloud and data center architectures toward accelerated computing.
In this course we show you how to benefit from using Xilinx SDAccel to program Amazon EC2 F1 instances.
We are going to do this through a working example of an algorithm used in computational biology.
Computational biology algorithms must process huge amounts of data and are themselves complex, so they demand ever more computational power. In this scenario, hardware accelerators have proven effective at speeding up the computation while also reducing power consumption.
Among the algorithms used in computational biology, Smith-Waterman is a dynamic programming algorithm guaranteed to find the optimal local alignment between two strings, which may be nucleotide or protein sequences. In the following classes we present an analysis and subsequent FPGA-based hardware acceleration of the Smith-Waterman algorithm used to perform pairwise alignment of DNA sequences.
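As a rough illustration of the dynamic programming idea behind Smith-Waterman, here is a minimal software sketch in Python that computes only the best local alignment score. The scoring values (match +2, mismatch -1, linear gap -1) and the function name are assumptions chosen for this example; they are not taken from the course, and the FPGA implementation developed in the classes is structured very differently.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between strings a and b.

    Scoring parameters are illustrative assumptions, not course values.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # h[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            # Score for aligning a[i-1] with b[j-1] (match or mismatch).
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores for introducing a gap in one of the two sequences.
            up = h[i - 1][j] + gap
            left = h[i][j - 1] + gap
            # Clamping at 0 is what makes the alignment local.
            h[i][j] = max(0, diag, up, left)
            best = max(best, h[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Each cell depends only on its upper, left and upper-left neighbours, so all cells on the same anti-diagonal can be computed independently of one another.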
Within this context, this course focuses on distributed, heterogeneous cloud infrastructures, providing details, through working examples, on how to use Xilinx SDAccel to bring your solutions to life on Amazon EC2 F1 instances.
Total workload of the course: 40 hours
This MOOC is provided by Politecnico di Milano.
Intended Learning Outcomes
By actively participating in this MOOC, you will be able to:
- Describe the main concepts of cloud computing architectures and the characteristics of AWS F1 instances.
- Summarize the main technologies, vendors, and the hardware/software stacks behind FPGA-based cloud computing.
- Explain the AFI creation flow and identify applications suitable for FPGA-based acceleration.
- Analyze the Smith-Waterman algorithm and evaluate its performance using the Roofline model.
- Evaluate the benefits of accelerating an algorithm and create an optimized FPGA-based version of Smith-Waterman.
- Create and deploy an accelerated version of Smith-Waterman on AWS F1 instances.
ESCO: computer technology, ESCO: electronics, ESCO: use ICT systems, ESCO: computational biology, ESCO: oversee development of software, ESCO: develop translation memory software, ESCO: system design, ESCO: cloud technologies
Prerequisites
This course follows the previous ones: “FPGA computing systems: A Bird’s Eye View on Reconfigurable Computing”, “FPGA computing systems: Partial Dynamic Reconfiguration” and “Developing FPGA-accelerated cloud applications with SDAccel: Theory”. Beyond these, no specific background knowledge is required. Anyone with moderate computer experience should be able to master the materials in this course.
Activities
In addition to consulting the content, in the form of videos and other web-based resources, you will have the opportunity to discuss course topics and share ideas with your peers in the Forum of this MOOC. The forum is freely accessible, and participation is not guided; you can use it to compare notes with other participants or to discuss course contents with them.
Section outline
Week 1 covers distributed systems, data center and cloud architectures, which face exponential growth in computing requirements that CPU-based solutions cannot keep pace with.
Week 2 offers a first taste of how to get the most out of combining F1 instances with SDAccel, providing practical instructions on how to develop accelerated applications on Amazon F1 using the Xilinx SDAccel development environment. Then we present what is necessary to create FPGA kernels, assemble the FPGA program and compile the Amazon FPGA Image, or AFI. Finally, we describe the steps and tasks involved in developing a host application accelerated on the F1 FPGA.
Week 3 introduces the Smith-Waterman algorithm, which we have chosen to demonstrate how to create a hardware implementation of an FPGA-based system using the Xilinx SDAccel design framework. We dig into the details of the algorithm, from its data structures to its computation flow. Then we introduce the Roofline model and use it to analyze the theoretical peak performance and the operational intensity of the Smith-Waterman algorithm.
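For readers unfamiliar with the Roofline model, its core bound can be sketched in a few lines of Python: attainable performance is the minimum of the platform's peak compute rate and the product of memory bandwidth and operational intensity. The function name and the numbers in the usage example are hypothetical placeholders, not measurements of any F1 instance or of the course's Smith-Waterman kernel.

```python
def attainable_performance(peak_ops_per_s, bandwidth_bytes_per_s, operational_intensity):
    """Roofline bound: min(compute roof, memory roof).

    operational_intensity is expressed in operations per byte moved to or from memory.
    """
    memory_roof = bandwidth_bytes_per_s * operational_intensity
    return min(peak_ops_per_s, memory_roof)


if __name__ == "__main__":
    # Hypothetical platform: 100 Gops/s peak, 10 GB/s memory bandwidth.
    peak, bandwidth = 100e9, 10e9
    for oi in (0.5, 2.0, 50.0):
        print(oi, attainable_performance(peak, bandwidth, oi))
```

With these placeholder values, kernels below 10 operations per byte are bound by memory bandwidth, and kernels above that threshold are bound by peak compute.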
Week 4 digs deeper into the Smith-Waterman algorithm. We implement a first version of the algorithm on a local server with the Xilinx SDAccel design framework. Then we introduce optimizations to improve performance: in particular, we add more parallelism to the implementation and introduce systolic arrays. Moreover, we explore how to perform data compression, and then leverage multiple memory ports to improve memory access speed. Finally, we port our implementation of the Smith-Waterman algorithm to the AWS F1 instances.
Week 5 concludes the course by posing interesting questions about possible future research directions, which may also point students to other Coursera courses on FPGAs. We are working at the edge of research in the area of reconfigurable computing. FPGA technologies are no longer used only as standalone solutions or platforms; they are now included in cloud infrastructures, where they are used both to accelerate infrastructure and backend computations and are exposed as a service that anyone can use. Within this context we face new research opportunities and technology improvements, and from this perspective the timing could not be better.
Assessment
Your final grade for the course will be based on the results of your answers to the assessed quizzes. You have an unlimited number of attempts at each quiz, but you must wait 15 minutes before you can try again. You will have successfully completed the course if you score 60% (or higher) in each one of the assessed quizzes. The maximum score possible for each quiz is given at the beginning of the quiz. You can view your score in the quiz on your last attempt or on the 'Grades' page.
Certificate
You can achieve a certificate in the form of an Open Badge for this course, if you reach at least 60% of the total score in each one of the assessed quizzes and fill in the final survey.
Once you have completed the required tasks, you will be able to access ‘Get the Open Badge’ and start issuing the badge. Instructions on how to access the badge will be sent to your e-mail address.
The Badge does not confer any academic credit, grade or degree.
Information about fees and access to materials
You can access the course completely online and absolutely free of charge.
Course faculty

Lorenzo Di Tucci
Teacher
Lorenzo is a Ph.D. Student at Politecnico di Milano and co-founder of Huxelerate.
He received his Bachelor’s and Master’s degrees in Computer Engineering from Politecnico di Milano in 2013 and 2016 respectively. In 2016 he received a Master of Science in Computer Science from the University of Illinois at Chicago.
Lorenzo has been a Research Assistant at the University of Illinois at Chicago, Visiting Researcher at Lawrence Berkeley National Laboratory in Berkeley (CA) and Rocca Fellow at Massachusetts Institute of Technology in Cambridge (MA). His research interests revolve around FPGA design, High-Performance Computing, and Hardware Architectures.

Marco Domenico Santambrogio
Teacher
He is an Associate professor at Politecnico di Milano and a Research Affiliate with the CSAIL at MIT. He received his laurea (M.Sc. equivalent) degree in Computer Engineering from the Politecnico di Milano (2004), his second M. Sc. degree in Computer Science from the University of Illinois at Chicago (UIC) in 2005 and his PhD degree in Computer Engineering from the Politecnico di Milano (2008). Dr. Santambrogio was a postdoc fellow at CSAIL, MIT, and he has also held visiting positions at the Department of Electrical Engineering and Computer Science of the Northwestern University (2006 and 2007) and Heinz Nixdorf Institut (2006).
Marco D. Santambrogio is a senior member of both the IEEE and ACM, and a member of the IEEE Computer Society (CS) and the IEEE Circuits and Systems Society (CAS). He is or has been a member of various program committees of electronic design automation conferences, including DAC, DATE, CODES+ISSS, FPL, RAW, EUC, and the IFIP VLSI Conference.
He has been with the Micro Architectures Laboratory at the Politecnico di Milano, where he founded the Dynamic Reconfigurability in Embedded System Design (DRESD) project in 2004. In 2011, he founded the Novel, Emerging Computing System Technologies Laboratory (NECSTLab), merging the two previously existing labs, MicroLab and VPLab, and he has been in charge of the laboratory ever since.
Contact details
If you have any enquiries about the course or need technical assistance, please contact pok@polimi.it. For further information, see the FAQ page.