Yi Qiu

Ph.D. student in the Department of Physics at Pennsylvania State University.

This is a note about Kokkos, a C++ programming model for performance portability.


  • Implemented as a template library on top of CUDA, HIP, OpenMP, ...

  • Aims to be descriptive, not prescriptive.
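
To make the two points above concrete, here is a minimal standard-library sketch of the idea, not Kokkos itself. The names `ForwardBackend`, `ReverseBackend`, `for_each_index`, and `axpy` are hypothetical and exist only for this illustration. The user code describes independent per-index work once, and a template overload selected by a tag type decides how the index range is actually traversed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical backend tags for this sketch (these are NOT Kokkos types).
struct ForwardBackend {};
struct ReverseBackend {};

// Each overload plays the role of one "backend": it alone decides
// in which order the indices 0..n-1 are visited.
template <class Body>
void for_each_index(ForwardBackend, std::size_t n, Body body) {
  for (std::size_t i = 0; i < n; ++i) body(i);
}

template <class Body>
void for_each_index(ReverseBackend, std::size_t n, Body body) {
  for (std::size_t i = n; i > 0; --i) body(i - 1);
}

// User code: states what happens at each index (z = a*x + y) and
// nothing about ordering. Assumes y has at least x.size() elements.
template <class Backend>
std::vector<double> axpy(Backend backend, double a,
                         const std::vector<double>& x,
                         const std::vector<double>& y) {
  std::vector<double> z(x.size());
  for_each_index(backend, x.size(),
                 [&](std::size_t i) { z[i] = a * x[i] + y[i]; });
  return z;
}
```

Because every iteration is independent, `axpy(ForwardBackend{}, ...)` and `axpy(ReverseBackend{}, ...)` return the same vector. In Kokkos the analogous choice is an execution space such as `Kokkos::Serial`, `Kokkos::OpenMP`, or `Kokkos::Cuda`, and the real backends run the iterations in parallel rather than in a different serial order.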

Why do we need Kokkos?

A full-time software engineer writes about 10 lines of production code per hour, or roughly 20k LOC per year. A typical HPC production app has 300k-600k lines. At that rate a full rewrite would take 15 to 30 person-years, so even just switching programming models costs multiple person-years per app!


Kokkos tools:

  1. KernelLogger: prints kernel logs at runtime.
  2. SimpleKernelTimer: prints timing information after the run.