Deep learning-based design is particularly challenging for developers of real-time vision and speech systems. Traditional simulation methods, which rely on modest artificial test-stimulus sets, are increasingly inadequate for three reasons:
1. The computational complexity of deep-learning inference is high, so large, complex arithmetic engines must be deployed in silicon. Slow turnaround time for simulation of these structures inhibits the innovation rate.
2. Deep-learning algorithm "correctness" is often judged by statistical measures, not bit-exactness, so large data streams must be characterized. Near real-time emulation gives increased confidence in the system's real-world behavior.
3. These smart real-time systems involve sophisticated, usually asynchronous interactions among raw input data streams, deep-learning inference engines, high-level system controls, and fault-tolerance and error-recovery monitors. Modern emulation systems can faithfully reproduce actual system behavior even in obscure corner cases where such events coincide.
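Point 2 above is worth making concrete: a fixed-point inference engine will rarely match a floating-point reference model bit for bit, so verification compares the two statistically over a large data stream instead. The sketch below (an illustration, not any particular tool's methodology; the function names and the top-1 agreement threshold are assumptions for the example) contrasts a bit-exact check with a statistical one, using quantization noise as a stand-in for the emulated engine's arithmetic differences.

```python
import numpy as np

def bit_exact(ref, dut):
    """Traditional pass/fail check: every output bit must match."""
    return np.array_equal(ref, dut)

def statistically_close(ref_logits, dut_logits, min_top1_agreement=0.95):
    """Statistical correctness: the fraction of samples whose
    predicted class (argmax of the logits) agrees must exceed
    a threshold, even though individual values differ."""
    ref_top1 = np.argmax(ref_logits, axis=1)
    dut_top1 = np.argmax(dut_logits, axis=1)
    agreement = float(np.mean(ref_top1 == dut_top1))
    return agreement >= min_top1_agreement

# Reference model: float logits for 1000 samples, 10 classes.
rng = np.random.default_rng(0)
ref = rng.normal(size=(1000, 10))

# "Emulated" engine: same computation quantized to a fixed-point
# grid, so values differ slightly from the reference.
dut = np.round(ref * 128.0) / 128.0
```

Here `bit_exact(ref, dut)` fails because of quantization, while `statistically_close(ref, dut)` passes: the classification decisions agree on nearly all samples. Characterizing such agreement over realistic, large input streams is exactly the workload that near real-time emulation makes practical.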