Oleg Zabluda's blog
Tuesday, January 08, 2019
 
Strassen’s Algorithm Reloaded (2016) Jianyu Huang, et al, Intel Corporation
"""
Abstract—We dispel with “street wisdom” regarding the practical implementation of Strassen’s algorithm for matrix-matrix multiplication (DGEMM). Conventional wisdom: it is only practical for very large matrices. Our implementation is practical for small matrices. Conventional wisdom: the matrices being multiplied should be relatively square. Our implementation is practical for rank-k updates, where k is relatively small (a shape of importance for libraries like LAPACK). Conventional wisdom: it inherently requires substantial workspace. Our implementation requires no workspace beyond buffers already incorporated into conventional high-performance DGEMM implementations. Conventional wisdom: a Strassen DGEMM interface must pass in workspace. Our implementation requires no such workspace and can be plug-compatible with the standard DGEMM interface. Conventional wisdom: it is hard to demonstrate speedup on multi-core architectures. Our implementation demonstrates speedup over conventional DGEMM even on an Intel Xeon Phi coprocessor utilizing 240 threads. We show how a distributed memory matrix-matrix multiplication also benefits from these advances.
"""
http://jianyuhuang.com/papers/sc16.pdf
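
For reference, here is a minimal sketch of one level of textbook Strassen, which replaces the 8 block multiplications of a 2x2 block partition with 7. This is not the paper's implementation: the helper names (`mat_add`, `mat_sub`, `mat_mul`, `split`, `strassen_one_level`) are made up for illustration, the code assumes all matrix dimensions are even, and it allocates explicit temporaries for every block sum and product, which is exactly the extra workspace the abstract says their version avoids.

```python
def mat_add(X, Y):
    """Elementwise sum of two equally sized matrices (lists of rows)."""
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]


def mat_sub(X, Y):
    """Elementwise difference of two equally sized matrices."""
    return [[a - b for a, b in zip(r, s)] for r, s in zip(X, Y)]


def mat_mul(X, Y):
    """Conventional triple-loop product, used for the 7 sub-multiplications."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]


def split(X):
    """Split X into four quadrants; assumes even row and column counts."""
    hr, hc = len(X) // 2, len(X[0]) // 2
    top, bottom = X[:hr], X[hr:]
    return ([r[:hc] for r in top], [r[hc:] for r in top],
            [r[:hc] for r in bottom], [r[hc:] for r in bottom])


def strassen_one_level(A, B):
    """C = A * B using one level of Strassen (7 block products instead of 8)."""
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)

    M1 = mat_mul(mat_add(A11, A22), mat_add(B11, B22))
    M2 = mat_mul(mat_add(A21, A22), B11)
    M3 = mat_mul(A11, mat_sub(B12, B22))
    M4 = mat_mul(A22, mat_sub(B21, B11))
    M5 = mat_mul(mat_add(A11, A12), B22)
    M6 = mat_mul(mat_sub(A21, A11), mat_add(B11, B12))
    M7 = mat_mul(mat_sub(A12, A22), mat_add(B21, B22))

    C11 = mat_add(mat_sub(mat_add(M1, M4), M5), M7)
    C12 = mat_add(M3, M5)
    C21 = mat_add(M2, M4)
    C22 = mat_add(mat_add(mat_sub(M1, M2), M3), M6)

    # Stitch the four quadrants back into one matrix.
    return ([l + r for l, r in zip(C11, C12)] +
            [l + r for l, r in zip(C21, C22)])
```

The sketch also accepts rectangular inputs with even dimensions, for example a 4x2 matrix times a 2x6 matrix, which is a toy version of the rank-k update shape mentioned in the abstract.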
