double* raw_data = static_cast<double*>(malloc(...));  // MatrixXd stores double, so the buffer must be double*
Map<MatrixXd> M(raw_data, rows, cols);
// use M as a MatrixXd
M = M.inverse();
from Eigen
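MatrixXd M;
double* raw_data = M.data();      // MatrixXd stores double, not float
int stride = M.outerStride();     // distance between consecutive columns (column-major default)
raw_data[i+j*stride]              // element (i, j)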
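The indexing rule above can be checked without Eigen. The following is a minimal sketch in plain C++ that emulates a dense column-major buffer; `element_at` and `make_column_major` are hypothetical helpers introduced only for illustration, not Eigen functions, and the stride is assumed to equal the number of rows (no padding between columns).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Eigen): reads element (i, j) from a
// column-major buffer whose consecutive columns are `stride` elements apart.
inline double element_at(const double* data, std::size_t i, std::size_t j,
                         std::size_t stride) {
    return data[i + j * stride];
}

// Builds a rows x cols column-major buffer in which entry (i, j) holds
// 10*i + j, so the value expected at any position is easy to verify.
inline std::vector<double> make_column_major(std::size_t rows, std::size_t cols) {
    assert(rows > 0 && cols > 0);
    std::vector<double> buf(rows * cols);
    for (std::size_t j = 0; j < cols; ++j) {
        for (std::size_t i = 0; i < rows; ++i) {
            // For a dense matrix without padding, stride == rows.
            buf[i + j * rows] = 10.0 * static_cast<double>(i) + static_cast<double>(j);
        }
    }
    return buf;
}
```

With a 3 x 2 buffer, `element_at(buf.data(), 2, 1, 3)` reads the entry in row 2, column 1, which sits at offset 2 + 1*3 = 5 in memory.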
Some background knowledge
template programming
4 levels of parallelism
cluster of PCs -- MPI
multi/many-cores -- OpenMP
SIMD -- intrinsics for vector instructions
pipelining -- needs non-dependent instructions
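To illustrate the pipelining level, here is a minimal sketch, assuming a simple dot product as the workload (the notes do not name one). `dot_chained` and `dot_unrolled` are hypothetical names. The first version has a single accumulator, so each addition must wait for the previous one; the second keeps four independent accumulators, which gives the processor non-dependent instructions to overlap. Whether this yields a measurable speed-up depends on the compiler and the CPU.

```cpp
#include <cassert>
#include <cstddef>

// Single accumulator: every addition depends on the result of the
// previous one, so the additions form one long dependency chain.
inline float dot_chained(const float* a, const float* b, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr));
    float sum = 0.0f;
    for (std::size_t k = 0; k < n; ++k) {
        sum += a[k] * b[k];
    }
    return sum;
}

// Four independent accumulators: the four chains do not depend on each
// other, so their multiply/add operations can overlap in the pipeline.
inline float dot_unrolled(const float* a, const float* b, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr));
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    std::size_t k = 0;
    for (; k + 4 <= n; k += 4) {
        s0 += a[k]     * b[k];
        s1 += a[k + 1] * b[k + 1];
        s2 += a[k + 2] * b[k + 2];
        s3 += a[k + 3] * b[k + 3];
    }
    // Handle the remaining 0 to 3 elements.
    for (; k < n; ++k) {
        s0 += a[k] * b[k];
    }
    return (s0 + s1) + (s2 + s3);
}
```

Note that the two versions add the products in a different order, so for general floating-point inputs their results may differ in the last bits.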
Peak Performance
Example: Intel Core2 Quad CPU Q9400 @ 2.66GHz (x86_64)
* pipelining → 1 mul + 1 add / cycle (ideal case)
* SSE → x 4 single precision ops at once
* frequency → x 2.66G
* peak performance: 21,790 Mflops (for 1 core)
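The estimate is the product of the three factors listed above. The sketch below spells out that arithmetic; `peak_mflops` is a hypothetical helper, and the clock is taken as the nominal 2.66 GHz from the CPU name. With those inputs the product is 21,280 Mflops, slightly below the 21,790 Mflops quoted above, so the quoted figure presumably rests on a somewhat different clock value (an assumption, since the notes do not say).

```cpp
#include <cassert>

// Theoretical peak in Mflops:
//   (flops per cycle per lane) * (SIMD lanes) * (clock in MHz)
constexpr double peak_mflops(double flops_per_cycle, double simd_lanes,
                             double clock_mhz) {
    return flops_per_cycle * simd_lanes * clock_mhz;
}

// 1 mul + 1 add per cycle (ideal pipelining), 4 single-precision lanes (SSE),
// nominal clock of 2660 MHz. The clock value is an assumption taken from the
// CPU name, not a measured frequency.
constexpr double core2_peak = peak_mflops(2.0, 4.0, 2660.0);
```

The same function shows how each level contributes: dropping SSE (`simd_lanes = 1`) divides the peak by four.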
my_program: path/to/eigen/Eigen/src/Core/DenseStorage.h:44:
Eigen::internal::matrix_array<T, Size, MatrixOptions, Align>::internal::matrix_array()
[with T = double, int Size = 2, int MatrixOptions = 2, bool Align = true]:
Assertion `(reinterpret_cast<size_t>(array) & (sizemask)) == 0 && "this assertion is explained here: http://eigen.tuxfamily.org/dox-devel/group__TopicUnalignedArrayAssert.html READ THIS WEB PAGE !!! ****"' failed.

There are 4 known causes for this issue. Please read on to understand them and learn how to fix them.
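The condition in the assertion tests whether the low bits of the array address are zero. The sketch below reproduces that test in plain C++ without Eigen; `is_aligned` and `AlignedPair` are hypothetical names, and the 16-byte alignment (mask 15) is an assumption matching SSE-sized vectors, since the message does not show the value of `sizemask`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mirrors the asserted condition: an address is aligned when
// (address & mask) == 0. The default mask of 15 assumes 16-byte alignment.
inline bool is_aligned(const void* p, std::size_t alignment = 16) {
    // The bit-mask test is only valid for power-of-two alignments.
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical stand-in for a fixed-size vectorizable type holding two
// doubles; alignas(16) asks the compiler to place it on a 16-byte boundary.
struct alignas(16) AlignedPair {
    double v[2];
};
```

An `AlignedPair` on the stack passes the check, while the address 8 bytes into it does not; an object that ends up at such an address is the situation the assertion reports.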