
This course, the fourth in a series, is for anyone who wants hands-on practice in developing FPGA-accelerated applications with SDAccel.

Introduction to FPGA systems

This series of courses introduces students to reconfigurable FPGA-based systems by discussing their overall architecture and companion design flows. The goal is to present the methodological approaches used to design such systems, together with real industrial tools, examples and common practices.

See the full series

Course description

This course presents several scenarios in which workloads require more performance than even the fastest CPUs can deliver. This demand is pushing cloud and data center architectures toward accelerated computing.

In this course we show you how to benefit from using Xilinx SDAccel to program Amazon EC2 F1 instances.

We are going to do this through a working example of an algorithm used in computational biology.

Computational biology algorithms are complex and must process huge amounts of data, which raises the amount of computational power they require. In this scenario, hardware accelerators have proven efficient at speeding up the computation while also reducing power consumption.

Among the algorithms used in computational biology, the Smith-Waterman algorithm is a dynamic programming algorithm guaranteed to find the optimal local alignment between two strings, which may be nucleotide or protein sequences. In the following classes we present an analysis of the Smith-Waterman algorithm and its subsequent FPGA-based hardware acceleration, applied to the pairwise alignment of DNA sequences.

Within this context, this course focuses on distributed, heterogeneous cloud infrastructures and shows, through working examples, how to use Xilinx SDAccel to bring your solutions to life on Amazon EC2 F1 instances.
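To make the idea of optimal local alignment concrete, here is a minimal software sketch of the Smith-Waterman recurrence with a linear gap penalty. It is not the course's FPGA implementation: the function name `smith_waterman` and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are assumptions chosen only for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Scoring values are illustrative assumptions, not taken from the course.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap in b
            left = H[i][j - 1] + gap    # gap in a
            # Clamping at 0 is what makes the alignment local.
            H[i][j] = max(0, diag, up, left)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + step:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("TTACGTTT", "GGACGGG"))
```

Each cell depends only on its upper, left and upper-left neighbours, so all cells on the same anti-diagonal can be computed independently of one another. That property is what parallel hardware implementations typically exploit.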

Intended Learning Outcomes


This course follows the previous ones: “FPGA computing systems: A Bird’s Eye View on Reconfigurable Computing”, “FPGA computing systems: Partial Dynamic Reconfiguration” and “Developing FPGA-accelerated cloud applications with SDAccel: Theory”. Within this context, no specific background knowledge is required. Anyone with moderate computer experience should be able to master the materials in this course.


The forum of this MOOC is freely accessible and participation is not guided; you can use it to compare yourself with other participants, or to discuss course contents with them.

Topic outline

  • Week 0 - Introduction to the course

  • Week 1 - Reconfigurable cloud infrastructure

    Week 1 covers distributed systems and data center and cloud architectures, which face exponential growth in computing requirements that CPU-based solutions cannot keep pace with.

  • Week 2 - On how to accelerate the cloud with SDAccel

    Week 2 gives a first taste of how to get the most out of combining F1 instances with SDAccel, providing a few practical instructions on how to develop accelerated applications on Amazon F1 using the Xilinx SDAccel development environment. Then we present what is needed to create FPGA kernels, assemble the FPGA program and compile the Amazon FPGA Image, or AFI. Finally, we describe the steps and tasks involved in developing a host application accelerated on the F1 FPGA.

  • Week 3 - Summing things up: the Smith-Waterman algorithm

    Week 3 introduces the Smith-Waterman algorithm, which we have chosen to demonstrate how to create a hardware implementation of an FPGA-based system using the Xilinx SDAccel design framework. We dig into the details of the algorithm, from its data structures to its computation flow. Then we introduce the Roofline model and use it to analyze the theoretical peak performance and the operational intensity of the Smith-Waterman algorithm.

  • Week 4 - The Smith-Waterman example in detail

    Week 4 digs deeper into the Smith-Waterman algorithm. We implement a first version of the algorithm on a local server with the Xilinx SDAccel design framework. Then we introduce some optimizations to improve performance: in particular, we add more parallelism to the implementation and introduce systolic arrays. Moreover, we explore how to perform data compression and then leverage multiple memory ports to improve memory access speed. Finally, we port our implementation of the Smith-Waterman algorithm to AWS F1 instances.

  • Week 5 - Course conclusions

    Week 5 concludes the course by posing interesting questions about possible future research directions, which may also point students to other Coursera courses on FPGAs. We are working at the edge of research in reconfigurable computing. FPGA technologies are no longer used only as standalone solutions or platforms; they are now part of cloud infrastructures, where they both accelerate infrastructure and backend computations and are exposed as a service that anyone can use. Within this context we are seeing new research opportunities and technology improvements take shape, and from this perspective the timing could not be better.

  • Additional Resources
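
As a rough illustration of the Roofline model mentioned in Week 3: the attainable performance of a kernel is bounded by the smaller of the platform's peak compute rate and the product of its memory bandwidth and the kernel's operational intensity (operations per byte moved). The sketch below assumes hypothetical function names and placeholder platform numbers; they are not measurements from the course or from any F1 instance.

```python
def attainable_performance(peak_gops, bandwidth_gbs, operational_intensity):
    """Roofline bound in Gops/s.

    peak_gops: peak compute rate of the platform (Gops/s)
    bandwidth_gbs: peak memory bandwidth (GB/s)
    operational_intensity: operations performed per byte moved (ops/byte)
    """
    return min(peak_gops, bandwidth_gbs * operational_intensity)


def ridge_point(peak_gops, bandwidth_gbs):
    """Operational intensity at which a kernel stops being memory bound."""
    return peak_gops / bandwidth_gbs


if __name__ == "__main__":
    # Placeholder platform numbers, chosen only to show both regimes.
    peak, bandwidth = 100.0, 10.0
    for oi in (0.5, 2.0, 10.0, 50.0):
        bound = attainable_performance(peak, bandwidth, oi)
        regime = "memory bound" if oi < ridge_point(peak, bandwidth) else "compute bound"
        print(f"OI={oi:5.1f} ops/byte -> {bound:6.1f} Gops/s ({regime})")
```

A kernel whose operational intensity lies to the left of the ridge point is limited by memory traffic. This is why optimizations that reduce the bytes moved per operation, such as compressing the input data, can raise the attainable performance.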


Your final grade for the course will be based on the results of your answers to the graded quizzes. You have unlimited attempts at each quiz, but you must wait 5 minutes before you can try again. You will have successfully completed the course if you achieve 60% (or more) of the total course score. The maximum score possible for each quiz is given at the top of the quiz. You can see your score in the quiz on your last attempt or on the 'Grades' page.

Certificate of accomplishment

You must be registered in POK through a Politecnico di Milano personal account to obtain the Certificate of Accomplishment. It will be released to anyone who successfully completes the course by achieving at least 60% of the total score in the graded quizzes and filling in the final survey.

You will be able to download the Certificate of Accomplishment directly from Politecnico di Milano web services.

The Certificate of Accomplishment does not confer any academic credit, grade or degree.

Information about fees and access to materials

You can access the course completely online and absolutely free of charge.

Course faculty

Lorenzo Di Tucci


Lorenzo is a Ph.D. Student at Politecnico di Milano and co-founder of Huxelerate.

He received his Bachelor’s and Master’s degrees in Computer Engineering from Politecnico di Milano in 2013 and 2016 respectively. In 2016 he received a Master of Science in Computer Science from the University of Illinois at Chicago.

Lorenzo has been a Research Assistant at the University of Illinois at Chicago, a Visiting Researcher at Lawrence Berkeley National Laboratory in Berkeley (CA) and a Rocca Fellow at Massachusetts Institute of Technology in Cambridge (MA). His research interests revolve around FPGA design, High-Performance Computing, and Hardware Architectures.

Marco Domenico Santambrogio


He is an Associate Professor at Politecnico di Milano and a Research Affiliate with the CSAIL at MIT. He received his laurea (M.Sc. equivalent) degree in Computer Engineering from the Politecnico di Milano (2004), his second M.Sc. degree in Computer Science from the University of Illinois at Chicago (UIC) in 2005 and his PhD degree in Computer Engineering from the Politecnico di Milano (2008). Dr. Santambrogio was a postdoctoral fellow at CSAIL, MIT, and he has also held visiting positions at the Department of Electrical Engineering and Computer Science of Northwestern University (2006 and 2007) and at the Heinz Nixdorf Institut (2006).

Marco D. Santambrogio is a senior member of both the IEEE and ACM, and a member of the IEEE Computer Society (CS) and the IEEE Circuits and Systems Society (CAS). He is or has been a member of several program committees of electronic design automation conferences, among which: DAC, DATE, CODES+ISSS, FPL, RAW, EUC, IFIP VLSI Conference.

He has been with the Micro Architectures Laboratory at the Politecnico di Milano, where he founded the Dynamic Reconfigurability in Embedded System Design (DRESD) project in 2004. In 2011, he founded the Novel, Emerging Computing System Technologies Laboratory (NECSTLab), merging the two previously existing labs, MicroLab and VPLab, and he has been in charge of the laboratory since then.

Contact details

For enquiries about the course, technical assistance, or further information, see the FAQ page.