Oleg Zabluda's blog
Wednesday, March 01, 2017
 
DeepVoice: Real-time Neural Text-to-Speech (2017) Sercan O. Arik, Mike Chrzanowski, Adam Coates, Gregory Diamos [...]
"""
On CPU, a single Haswell or Broadwell core has a peak single-precision throughput of approximately 77 × 10^9 FLOPs and an L2-to-L1 cache bandwidth of approximately 140 GB/s. The model must be loaded from cache once per timestep, which requires a bandwidth of 100 GB/s. Even if the model were to fit in L2 cache, the implementation would need to utilize 70% of the maximum bandwidth and 70% of the peak FLOPs in order to do inference in real time on a single core. Splitting the calculations across multiple cores reduces the difficulty of the problem, but it nonetheless remains challenging: inference must operate at a significant fraction of maximum memory bandwidth and peak FLOPs while keeping threads synchronized.

A GPU has higher memory bandwidth and peak FLOPs than a CPU but provides a more specialized and hence restrictive computational model. A naive implementation that launches a single kernel for every layer or timestep is untenable, but an implementation based on the persistent RNN technique (Diamos et al., 2016) may be able to take advantage of the throughput offered by GPUs.

We implement high-speed optimized inference kernels for both CPU and GPU and demonstrate that WaveNet inference at faster-than-real-time speeds is achievable. Table 2 lists the CPU and GPU inference speeds for different models. In both cases, the benchmarks include only the autoregressive, high-frequency audio generation and do not include the generation of linguistic conditioning features (which can be done in parallel for the entire utterance). Our CPU kernels run at real-time or faster-than-real-time for a subset of models, while the GPU models do not yet match this performance.
"""
https://arxiv.org/abs/1702.07825
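
The arithmetic in the quoted passage can be sanity-checked with a short back-of-the-envelope script. The 16 kHz sample rate and the ~6.25 MB model size are assumptions chosen to reproduce the paper's 100 GB/s figure; they are not stated in the excerpt.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumptions not stated in the excerpt: 16 kHz output audio and a
# model of roughly 6.25 MB of single-precision (4-byte) weights that
# must be re-read from cache at every timestep.

sample_rate = 16_000         # audio timesteps per second (assumed)
model_bytes = 6.25e6         # ~6.25 MB of fp32 weights (assumed)

# Bandwidth: the whole model streams from L2 to L1 once per timestep.
required_bw = model_bytes * sample_rate          # 1e11 B/s = 100 GB/s
l2_l1_bw = 140e9                                 # 140 GB/s, from the quote

# Compute: ~2 FLOPs per weight per timestep (multiply + accumulate).
params = model_bytes / 4
required_flops = 2 * params * sample_rate        # 5e10 FLOPs/s
peak_flops = 77e9                                # per-core peak, from the quote

print(f"bandwidth needed: {required_bw / 1e9:.0f} GB/s "
      f"({required_bw / l2_l1_bw:.0%} of L2-to-L1 bandwidth)")
print(f"compute needed:   {required_flops / 1e9:.0f} GFLOPs/s "
      f"({required_flops / peak_flops:.0%} of per-core peak)")
```

The bandwidth fraction comes out to ~71% and the compute fraction to ~65%, consistent with the "70% of the maximum bandwidth and 70% of the peak FLOPs" figure in the quote.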
