Eigen  3.4.90 (git rev 9589cc4e7fd8e4538bedef80dd36c7738977a8be)
 
Eigen::internal::unrolls Namespace Reference

Classes

class  gemm
 
class  transB
 
class  trsm
 

Detailed Description

This namespace contains classes that generate the compile-time unrolls used throughout the trsm/gemm kernels. The unrolls are classified as for-loops (1-D), nested for-loops (2-D), or triple-nested for-loops (3-D), and are generated using template recursion.

For example, a 2-D for-loop is unrolled recursively by first flattening it to a 1-D loop:

for(startI = 0; startI < endI; startI++)            for(startC = 0; startC < endI*endJ; startC++)
  for(startJ = 0; startJ < endJ; startJ++)  ---->     startI = (startC)/(endJ)
    func(startI,startJ)                               startJ = (startC)%(endJ)
                                                      func(...)
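The equivalence of the two loop forms can be checked at run time with a small sketch. The function names `nested_order` and `flattened_order` are hypothetical and are not part of Eigen; they only record the order in which (startI, startJ) pairs are visited.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper: visit (i, j) pairs with the ordinary nested loop.
std::vector<std::pair<int, int>> nested_order(int endI, int endJ) {
  std::vector<std::pair<int, int>> out;
  for (int i = 0; i < endI; ++i)
    for (int j = 0; j < endJ; ++j) out.emplace_back(i, j);
  return out;
}

// Hypothetical helper: visit the same pairs through one flattened counter,
// recovering i by division and j by remainder, as in the diagram above.
std::vector<std::pair<int, int>> flattened_order(int endI, int endJ) {
  std::vector<std::pair<int, int>> out;
  for (int c = 0; c < endI * endJ; ++c) out.emplace_back(c / endJ, c % endJ);
  return out;
}
```

Both functions produce the same sequence of pairs, which is what allows a single counter to drive the recursive unroll.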

The 1-D loop can be unrolled recursively by using enable_if and defining an auxiliary function with a template parameter used as a counter.

template <endI, endJ, counter>
std::enable_if_t<(counter <= 0)>        <---- tail case.
aux_func {}

template <endI, endJ, counter>
std::enable_if_t<(counter > 0)>         <---- actual for-loop
aux_func {
  startC = endI*endJ - counter
  startI = (startC)/(endJ)
  startJ = (startC)%(endJ)
  func(startI, startJ)
  aux_func<endI, endJ, counter-1>()
}

Note: Additional wrapper functions are provided for aux_func which hide the counter template parameter, since counter usually depends on endI, endJ, etc.
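The pseudocode above can be turned into a compilable sketch along the following lines. All names here (`aux_loop2d`, `loop2d`, `Recorder`) are hypothetical and chosen for illustration; this is not the code used in the Eigen kernels, only a minimal instance of the same enable_if-plus-counter pattern, assuming C++14 or later.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Tail case: the counter has reached 0, so the recursion stops.
template <int endI, int endJ, int counter, typename Func>
std::enable_if_t<(counter <= 0)> aux_loop2d(Func&) {}

// Recursive case: handle one flattened index, then recurse with counter - 1.
template <int endI, int endJ, int counter, typename Func>
std::enable_if_t<(counter > 0)> aux_loop2d(Func& func) {
  constexpr int startC = endI * endJ - counter;  // flattened index
  constexpr int startI = startC / endJ;          // outer-loop index
  constexpr int startJ = startC % endJ;          // inner-loop index
  func(startI, startJ);
  aux_loop2d<endI, endJ, counter - 1>(func);
}

// Wrapper that hides the counter, which is derived from endI and endJ.
template <int endI, int endJ, typename Func>
void loop2d(Func& func) {
  aux_loop2d<endI, endJ, endI * endJ>(func);
}

// Hypothetical functor that records the visited (i, j) pairs.
struct Recorder {
  std::vector<std::pair<int, int>> seen;
  void operator()(int i, int j) { seen.emplace_back(i, j); }
};
```

Calling `loop2d<2, 3>(recorder)` on a `Recorder` instance instantiates six nested calls, one per (startI, startJ) pair, with every index known at compile time.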

Conventions:

1) endX: specifies the terminal value for the for-loop (e.g., for(startX = 0; startX < endX; startX++)).

2) rem, remM, remK: template parameters used to decide whether to use masked operations for handling remaining tails (when sizes are not multiples of PacketSize or EIGEN_AVX_MAX_NUM_ROW).
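The role of a rem-style parameter can be illustrated with a scalar sketch. Everything here is an assumption made for illustration: `kPacketSize`, `copy_packet`, and `copy_all` are hypothetical names, and the scalar loop merely stands in for a masked vector load/store; the actual kernels may be organized differently.

```cpp
#include <cstddef>

// Hypothetical stand-in for a SIMD packet width; not an Eigen constant.
constexpr std::size_t kPacketSize = 8;

// rem == false: a full packet is available, so copy all kPacketSize elements.
// rem == true: only `remaining` elements are valid, so touch just those.
// The scalar loop stands in for a masked load/store.
template <bool rem>
void copy_packet(const float* src, float* dst, std::size_t remaining = 0) {
  const std::size_t n = rem ? remaining : kPacketSize;
  for (std::size_t k = 0; k < n; ++k) dst[k] = src[k];
}

// Copy `size` floats: full packets first, then one masked tail if needed.
void copy_all(const float* src, float* dst, std::size_t size) {
  std::size_t i = 0;
  for (; i + kPacketSize <= size; i += kPacketSize)
    copy_packet<false>(src + i, dst + i);
  if (i < size) copy_packet<true>(src + i, dst + i, size - i);
}
```

Making the choice a template parameter means the full-packet path carries no run-time branch for the tail, while the tail path never reads or writes past the valid elements.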