Writing code and understanding someone else's code takes a major effort, especially at the beginning.

The structure of my code follows this outline:
NewCode directory:

Makefile                  // to compile the code

Add directory: mat-addkernels.c/h    // matrix addition, matrix comparison

Error: doubly_compensated_sum.c   // error analysis/estimation

Examples: Example.3.c, example.4.c, example.6.c

MAT-ADD-Generator: addgen.c   // matrix addition kernel generator

architecture.h            // architecture-specific macros

mat-operands.h         // specifies how matrices are stored and accessed, row/column major …

Mul: mat-mulkernels.c/h    // multiplication kernels

Scaling: scaling.c/h                 // for processors that allow frequency/voltage scaling

Sort: quicksort.h/c // used for the error analysis

Please read INSTALL.txt for an introduction to installing the code.

You will need to install the BLAS library of your choice and modify the Makefile accordingly.
Goto: GotoBLAS directory Linux_P4SSE2.

ATLAS: pre-built library for P4. You may use any library you wish, either pre-built or not.

There is not much more to it. The example* files, as the name says, show how to call the matrix multiplication routines.

Unfortunately, this package is not self-installing, and it takes some work to understand its structure, installation and use. Nothing major.
The high-performance matrix multiplication (MM) routines must be installed separately; then my code can be built and used. Some tuning of the Matrix Addition (MA) is advised, but I have found that the optimized version available fits most of the architectures I have used.

Every architecture has its own compiler and libraries. My code uses Matrix Multiplication (MM) routines, which can come from ATLAS, GotoBLAS or your preferred vendor library. So far I have experimented heavily with ATLAS and GotoBLAS; in the past I used SGI BLAS and, more recently, MKL BLAS.

architecture.h is the file with the macros that specify which library is used. For example, the macro mm_leaf_computation identifies the leaf computation (the point where Strassen/Winograd yield control to the optimized library routines).
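
To make the idea concrete, here is a minimal sketch of how a macro like mm_leaf_computation could route the leaf computation to a chosen routine. The function naive_gemm, the argument list of the macro and the square row-major layout are assumptions made for this illustration only; the macro in the package may take different arguments, and a real build would map it to the GEMM of the installed BLAS library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a library GEMM: C = A * B for n x n
   row-major matrices. A real build would call the BLAS routine here. */
static void naive_gemm(const double *a, const double *b, double *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Assumed selection mechanism: one macro names the leaf routine, so the
   recursive algorithms never mention a specific library by name. */
#define mm_leaf_computation(a, b, c, n) naive_gemm((a), (b), (c), (n))
```

With this arrangement, switching library means changing one macro definition and recompiling; the recursive code stays untouched.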

mat-operands.h: the matrices are defined here, together with the basic routines for their manipulation and division. For example, how to get the sub-matrix A0 from the matrix A …
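
As a rough illustration of what getting a sub-matrix means, the sketch below describes a matrix as a view (pointer, sizes, leading dimension) and extracts a quadrant without copying data. The type matrix_view, the functions mat_at and mat_quadrant, the column-major layout and the quadrant numbering are all hypothetical names and choices for this example, not the definitions used in the package.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical matrix descriptor: a view onto column-major storage. */
typedef struct {
    double *data;  /* pointer to element (0,0) of this view */
    size_t rows;   /* number of rows in the view */
    size_t cols;   /* number of columns in the view */
    size_t ld;     /* leading dimension: stride between columns */
} matrix_view;

/* Address of element (i,j) of a column-major view. */
static double *mat_at(matrix_view m, size_t i, size_t j)
{
    assert(i < m.rows && j < m.cols);
    return &m.data[j * m.ld + i];
}

/* Quadrant q of m, numbered 0 = top-left, 1 = top-right,
   2 = bottom-left, 3 = bottom-right. For odd sizes the larger part
   goes to the top and left quadrants. No data is copied. */
static matrix_view mat_quadrant(matrix_view m, int q)
{
    assert(q >= 0 && q < 4);
    size_t top = (m.rows + 1) / 2, left = (m.cols + 1) / 2;
    size_t r0 = (q & 2) ? top : 0;   /* first row of the quadrant */
    size_t c0 = (q & 1) ? left : 0;  /* first column of the quadrant */
    matrix_view s;
    s.data = m.data + c0 * m.ld + r0;
    s.rows = (q & 2) ? m.rows - top : top;
    s.cols = (q & 1) ? m.cols - left : left;
    s.ld = m.ld;  /* the quadrant keeps the stride of its parent */
    return s;
}
```

Under these assumptions, A0 would be mat_quadrant(A, 0). Because the quadrant shares the storage of A, writing through the view changes A itself.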

mat-addkernels.c/h: matrix addition for matrices in row-major and column-major order.
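
A minimal sketch of a matrix addition that handles both layouts follows. The names layout_t and mat_add and the argument order are assumptions for this example; the kernels in the package are tuned (and generated by addgen.c), so this plain loop only shows the indexing.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ROW_MAJOR, COL_MAJOR } layout_t;

/* C = A + B for rows x cols matrices that share one layout.
   lda, ldb, ldc are the leading dimensions of A, B and C. */
static void mat_add(layout_t layout, size_t rows, size_t cols,
                    const double *a, size_t lda,
                    const double *b, size_t ldb,
                    double *c, size_t ldc)
{
    /* outer = number of contiguous lines, inner = length of each line:
       rows of length cols in row major, columns of length rows in
       column major. */
    size_t outer = (layout == ROW_MAJOR) ? rows : cols;
    size_t inner = (layout == ROW_MAJOR) ? cols : rows;
    assert(lda >= inner && ldb >= inner && ldc >= inner);
    for (size_t o = 0; o < outer; o++)
        for (size_t i = 0; i < inner; i++)
            c[o * ldc + i] = a[o * lda + i] + b[o * ldb + i];
}
```

The inner loop always walks contiguous memory, so the same code serves both layouts; only the roles of rows and columns are swapped.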

mat-mulkernels.c/h: Strassen, oblivious + Strassen, dynamic Strassen, Winograd, and oblivious + Winograd are all defined here. The LEAF constant (defined in mat-mulkernels.h) is the recursion point, where the recursion hands over to the leaf computation.
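
To show the role of the recursion point, here is a sketch of the classic Strassen recursion (seven half-size products) with a LEAF cutoff. It is not the code in mat-mulkernels.c: the names leaf_mm, add_block and strassen_mm, the square row-major layout, the fallback for odd sizes and the tiny value LEAF = 2 are assumptions chosen so the example stays short. In practice the recursion point is tuned per architecture, and leaf_mm would be the library GEMM.

```c
#include <assert.h>
#include <stdlib.h>

#define LEAF 2 /* demo value only; the recursion stops at this size */

/* Leaf computation: C = A * B for n x n row-major blocks with leading
   dimensions lda, ldb, ldc. Stand-in for an optimized library GEMM. */
static void leaf_mm(size_t n, const double *a, size_t lda,
                    const double *b, size_t ldb, double *c, size_t ldc)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * lda + k] * b[k * ldb + j];
            c[i * ldc + j] = sum;
        }
}

/* t = x + sign * y on h x h blocks; t is stored contiguously (ld = h). */
static void add_block(size_t h, const double *x, size_t ldx,
                      const double *y, size_t ldy, double sign, double *t)
{
    for (size_t i = 0; i < h; i++)
        for (size_t j = 0; j < h; j++)
            t[i * h + j] = x[i * ldx + j] + sign * y[i * ldy + j];
}

/* C = A * B by Strassen's recursion; C must not overlap A or B. */
static void strassen_mm(size_t n, const double *a, size_t lda,
                        const double *b, size_t ldb, double *c, size_t ldc)
{
    assert(lda >= n && ldb >= n && ldc >= n);
    if (n <= LEAF || n % 2 != 0) { /* recursion point reached, or odd size */
        leaf_mm(n, a, lda, b, ldb, c, ldc);
        return;
    }
    size_t h = n / 2, hh = h * h;
    /* Workspace: two operand sums (s, t) and seven products m[0..6]. */
    double *w = malloc(9 * hh * sizeof *w);
    if (w == NULL) { /* out of memory: fall back to the leaf routine */
        leaf_mm(n, a, lda, b, ldb, c, ldc);
        return;
    }
    double *s = w, *t = w + hh, *m[7];
    for (int k = 0; k < 7; k++)
        m[k] = w + (size_t)(2 + k) * hh;
    const double *a11 = a, *a12 = a + h, *a21 = a + h * lda, *a22 = a21 + h;
    const double *b11 = b, *b12 = b + h, *b21 = b + h * ldb, *b22 = b21 + h;

    add_block(h, a11, lda, a22, lda, 1.0, s);
    add_block(h, b11, ldb, b22, ldb, 1.0, t);
    strassen_mm(h, s, h, t, h, m[0], h);     /* M1 = (A11+A22)(B11+B22) */
    add_block(h, a21, lda, a22, lda, 1.0, s);
    strassen_mm(h, s, h, b11, ldb, m[1], h); /* M2 = (A21+A22) B11 */
    add_block(h, b12, ldb, b22, ldb, -1.0, t);
    strassen_mm(h, a11, lda, t, h, m[2], h); /* M3 = A11 (B12-B22) */
    add_block(h, b21, ldb, b11, ldb, -1.0, t);
    strassen_mm(h, a22, lda, t, h, m[3], h); /* M4 = A22 (B21-B11) */
    add_block(h, a11, lda, a12, lda, 1.0, s);
    strassen_mm(h, s, h, b22, ldb, m[4], h); /* M5 = (A11+A12) B22 */
    add_block(h, a21, lda, a11, lda, -1.0, s);
    add_block(h, b11, ldb, b12, ldb, 1.0, t);
    strassen_mm(h, s, h, t, h, m[5], h);     /* M6 = (A21-A11)(B11+B12) */
    add_block(h, a12, lda, a22, lda, -1.0, s);
    add_block(h, b21, ldb, b22, ldb, 1.0, t);
    strassen_mm(h, s, h, t, h, m[6], h);     /* M7 = (A12-A22)(B21+B22) */

    for (size_t i = 0; i < h; i++)
        for (size_t j = 0; j < h; j++) {
            size_t k = i * h + j;
            c[i * ldc + j] = m[0][k] + m[3][k] - m[4][k] + m[6][k];           /* C11 */
            c[i * ldc + j + h] = m[2][k] + m[4][k];                           /* C12 */
            c[(i + h) * ldc + j] = m[1][k] + m[3][k];                         /* C21 */
            c[(i + h) * ldc + j + h] = m[0][k] - m[1][k] + m[2][k] + m[5][k]; /* C22 */
        }
    free(w);
}
```

A call such as strassen_mm(n, A, n, B, n, C, n) multiplies two contiguous n x n matrices. Raising LEAF moves more of the work into the leaf routine, which is the trade-off the recursion point controls.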

  • Free = no responsibility
  • We distribute the code under the GNU Lesser General Public License because our algorithms build on top of a BLAS GEMM implementation, and the user may want to use proprietary code