#include <unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h>
Process all the data with a single CPU thread, using vectorized instructions.
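As a rough illustration of what single-threaded, vectorized evaluation can look like, the sketch below processes an element-wise sum in fixed-width chunks and then finishes the leftover elements one at a time. It is plain C++ and does not use or reproduce Eigen's code: the function name `add_single_thread` and the packet width `kPacketSize` are hypothetical, and whether the inner loop actually compiles to SIMD instructions depends on the compiler and flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical packet width: 4 floats, as in a 128-bit SIMD register.
constexpr std::size_t kPacketSize = 4;

// Illustrative only (not Eigen's implementation): computes
// out[i] = a[i] + b[i] entirely on the calling thread.
std::vector<float> add_single_thread(const std::vector<float>& a,
                                     const std::vector<float>& b) {
  assert(a.size() == b.size());
  const std::size_t size = a.size();
  std::vector<float> out(size);

  // Largest multiple of kPacketSize that does not exceed size.
  const std::size_t vectorized_size = (size / kPacketSize) * kPacketSize;

  std::size_t i = 0;
  // Packet loop: the fixed-width inner loop is a shape that compilers
  // can typically map to vector instructions.
  for (; i < vectorized_size; i += kPacketSize) {
    for (std::size_t j = 0; j < kPacketSize; ++j) {
      out[i + j] = a[i + j] + b[i + j];
    }
  }

  // Scalar tail: handle the elements that do not fill a whole packet.
  for (; i < size; ++i) {
    out[i] = a[i] + b[i];
  }
  return out;
}
```

With seven elements and a packet width of four, the first four sums go through the packet loop and the last three through the scalar tail. No work is handed to other threads at any point.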