

High Performance Computing using GPUs

A summer course given in 2008, focusing on high performance computing on streaming architectures.

We are announcing a summer course for PhD students and researchers interested in high-performance and parallel computing. Specifically, the course will introduce, and give hands-on experience with, the massively parallel computing hardware found in a modern graphics card (the GPU).

However, we will not be programming games. There is real practical benefit today, and even more potential, in running scientific computing applications on GPUs. The most striking benefit is the raw floating-point computing power: up to 400 GFlop/s from a card that can be bought for around 4000 SEK and put in an ordinary workstation. Programming a GPU can today be done in C, but the underlying architecture is very different from the Intel x86 that most of us are familiar with.

First meeting: June 3rd at 10:00. Erik Lindahl (SU Bioinformatics) will give an introductory lecture and share some of his expertise. There will then be about one meeting per week in June where we will look at the details of present hardware (which will be made available to participants) and examples of how computing applications have been developed for it. In order to receive course credits, one is expected to work on a project.

Course administrators are: Dag Lindbo, Henrik Holst and Tomas Oppelstrup.

Course examiner: Johan Hoffman.


  • The address to the mailing list is
    All participants and sponsors are on this list. Use the list to ask questions about anything related to the course.

  • Some information on the first meeting, including homework, is available here: GPU08-session1

  • Some information on the second meeting, including examples using CUBLAS, CUFFT and CUDPP, can be found here: GPU08-session2

  • There are both Linux and Windows login instructions on the file area on You have the password in your mail.

  • The Fortran example codes are found in


  • Introduction to streaming, massively concurrent architecture (stream processing)
  • Formulating suitable data parallel algorithms
  • Use of the CUDA SDK.
  • After the course the students will be able to implement computational kernels on the CUDA architecture.
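To give a flavour of what "formulating a data parallel algorithm" can mean, here is a minimal sketch of a sum rewritten as a pairwise (tree) reduction. It is written in plain Python for readability and is not taken from the course material; the function name `tree_reduce_sum` is made up for this illustration. The point is that all additions within one pass are independent of each other, so on a GPU each of them could be assigned to its own thread.

```python
def tree_reduce_sum(values):
    """Sum a sequence with a pairwise (tree) reduction.

    Illustrative sketch only: the inner loop runs sequentially here,
    but its iterations do not depend on each other, which is the
    property a data parallel formulation needs.
    """
    data = list(values)
    if not data:
        return 0.0
    n = len(data)
    stride = 1
    while stride < n:
        # One "parallel pass": element i absorbs its partner at i + stride.
        # Each i touches a disjoint pair, so the order does not matter.
        for i in range(0, n - stride, 2 * stride):
            data[i] += data[i + stride]
        stride *= 2
    return data[0]


if __name__ == "__main__":
    print(tree_reduce_sum([1, 2, 3, 4, 5]))
```

A sequential loop needs n - 1 dependent additions, while the tree form needs only about log2(n) passes when every pass is executed in parallel.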


There will be four lectures during June.

  1. Tuesday June 3, at 10.15am, in the PDC seminar room (Teknikringen 14, 3rd floor).
    Outline: Introduction to GPU computing (Dr. Erik Lindahl)
    There is a page devoted to the first meeting: GPU08-session1
    It contains an introductory computer lab and suggested homework.
  2. Tuesday June 10, at 10.15am, in the PDC seminar room (Teknikringen 14, 3rd floor).
    Outline: Introduction to the CUDA SDK (Dag), CUBLAS (Tomas) and Fortran interoperability (Henrik). We will present different application examples and show how to solve common problems in massive multi-threading. We will also discuss available libraries and show how to use some of them.
    Link to page about second meeting: GPU08-session2
  3.  ?Tuesday June 17?, at 10.15am, in the PDC seminar room (Teknikringen 14, 3rd floor).
    Outline: Data structures and advanced examples. Prof. Jesper Oppelstrup will give a retrospective on massively concurrent architectures and vector machines from the past. After this presentation, we will discuss problems and questions from the participants, to give feedback on the project work.
    Link to page about third meeting: GPU08-session3
  4.  ?Tuesday June 23?, at 10.15am, in the PDC seminar room (Teknikringen 14, 3rd floor).
    Outline: Presentation of course participants' projects.

Example student problems

  • Monte Carlo for option pricing.
  • Iterative methods for linear systems of equations, either standalone or as a kernel of a bigger code.
  • Explicit finite difference methods - Choose your favourite PDE or ODE and solve it.
  • Image processing. Implement some filtering methods of your choice.
  • ...Your own ideas!
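As a rough idea of the size of such a project, the first problem in the list could start from something like the sketch below. It assumes a European call option on an asset following geometric Brownian motion (the Black-Scholes model); the list itself does not specify the option type or model, so that choice, the function names and the parameter values are all assumptions made for illustration. It is plain Python rather than CUDA. Every simulated path is independent, which is what makes the problem a natural fit for a GPU.

```python
import math
import random


def mc_european_call(s0, strike, rate, sigma, maturity, n_paths, seed=0):
    """Monte Carlo estimate of a European call price (illustrative sketch)."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    drift = (rate - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)
    payoff_sum = 0.0
    for _ in range(n_paths):
        # Each path is independent: on a GPU, one thread per path.
        z = rng.gauss(0.0, 1.0)
        s_t = s0 * math.exp(drift + vol * z)
        payoff_sum += max(s_t - strike, 0.0)
    # Discount the average payoff back to today.
    return math.exp(-rate * maturity) * payoff_sum / n_paths


def black_scholes_call(s0, strike, rate, sigma, maturity):
    """Closed-form Black-Scholes call price, used only as a sanity check."""
    def norm_cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(s0 / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    return s0 * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


if __name__ == "__main__":
    estimate = mc_european_call(100.0, 100.0, 0.05, 0.2, 1.0, n_paths=200_000)
    exact = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
    print(f"Monte Carlo: {estimate:.3f}, closed form: {exact:.3f}")
```

Comparing against the closed-form price is a convenient correctness check before porting the path loop to a GPU kernel, where the final sum over payoffs becomes a parallel reduction.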

Finished projects

Here is a list of presentations and reports from those who have completed their projects and agreed to have them put on this website.



We thank the University of Houston and Texas Learning & Computation Center (TLC2), KTH Computational Science and Engineering Centre (KCSE) and Prof. Johan Hoffman for supplying the necessary hardware.
