white paper

Veloce prototyping solutions accelerate verification of HPC AI-enabled SoCs

The Veloce prototyping system solutions offer the most efficient tools and flow for IP, subsystem, SoC design and software verification teams to accelerate their SoC verification.
This white paper walks through how to meet quality requirements and accelerate time-to-market for your company’s latest flagship high-performance computing (HPC), artificial intelligence (AI)-enabled system-on-chip (SoC) design. The first part of the journey explores design use cases that illustrate the impact HPC AI-enabled systems and resources have on our world. The second part identifies the basic architecture of an HPC AI-enabled SoC design, what matters in it, and how to select and articulate the verification objectives and the verification approach that works best. Finally, the paper concludes with how to select the best FPGA prototyping solution for the task to improve hardware and software verification productivity.

Milestones 1 and 2: IP block and subsystem in-circuit emulation (ICE) verification

IP blocks are typically small designs, most below 40M gates, and IP block software driver verification can start as soon as the IP RTL becomes stable.

IP examples:

  • Ethernet interface

  • DDR5 memory interface

  • Deep-learning accelerator (DLA)

IP blocks are organized and assembled into a subsystem design implementing a macro-level functionality, which can typically fit in four or fewer FPGAs, although larger blocks are possible. Again, subsystem software driver verification can start as soon as the subsystem RTL becomes stable.

Subsystem examples:

  • Wired subsystem: PCIe + Ethernet

  • Memory subsystem: DDR5 + HBM memories

For small designs with an ICE (protocol or peripheral interface) verification requirement, Veloce proFPGA offers a desktop, modular, and scalable multi-FPGA ASIC prototyping solution for IP and subsystem verification and software development.

An IP block can fit on a single-FPGA Veloce proFPGA uno system. The IP block can run at high speed, about 100 MHz or more depending on how FPGA-friendly the IP block design is, achieving the “at-speed” performance necessary for compliance testing and accurate interoperability testing (figure 3).

A subsystem generally fits on a multi-FPGA Veloce proFPGA duo or quad system. Multi-FPGA designs break logic data paths across several FPGAs, reducing the maximum achievable speed. Careful design partitioning is required to minimize the operating frequency drop (figure 4).

Veloce prototyping software performs automatic design partitioning and automatically inserts multi-gigabit pin-multiplexing IP to achieve the best performance without requiring any RTL design changes from the user. Each FPGA module can connect to its own ICE accessories (PCIe, Ethernet, DDR, and HBM) for real-world I/O connections.
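To make the trade-off between partitioning and operating frequency concrete, the following is a minimal sketch of a first-order pin-multiplexing model. It is not the algorithm used by the Veloce prototyping software. The function names, the link rate, the signal and pin counts, and the overhead term are all hypothetical values chosen for illustration.

```python
import math


def mux_ratio(cut_signals: int, physical_pins: int) -> int:
    """Number of design signals that must share each physical pin.

    cut_signals: signals crossing between two FPGAs after partitioning.
    physical_pins: inter-FPGA pins available for those signals.
    """
    if cut_signals < 0 or physical_pins <= 0:
        raise ValueError("cut_signals must be >= 0 and physical_pins > 0")
    return max(1, math.ceil(cut_signals / physical_pins))


def max_design_clock_hz(link_rate_bps: float, ratio: int, overhead_bits: int = 0) -> float:
    """First-order upper bound on the design clock (assumed model).

    Assumes every multiplexed bit, plus a fixed number of framing or
    latency bits, must cross the link within one design clock cycle.
    """
    if link_rate_bps <= 0 or ratio < 1 or overhead_bits < 0:
        raise ValueError("invalid link rate, ratio, or overhead")
    return link_rate_bps / (ratio + overhead_bits)


# Hypothetical example: 4000 cut signals over 500 pins at 1 Gb/s per pin.
ratio = mux_ratio(4000, 500)
bound = max_design_clock_hz(1_000_000_000, ratio, overhead_bits=2)
print(f"mux ratio {ratio}:1, design clock bound {bound / 1e6:.0f} MHz")
```

Under this assumed model, a partition that cuts fewer signals lowers the multiplexing ratio and raises the achievable clock, which is why careful partitioning limits the frequency drop.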

Milestones 3 through 6: SoC verification with in-circuit emulation verification

Now that the IP blocks and subsystems are validated, it’s time for the SoC design team to assemble all the subsystems and validate the final SoC, and for the software team to develop system-level applications. The scale of such designs is massive and may reach multiple billions of gates. Any issues need to be analyzed by multiple teams, which, in most cases, are at different sites spread around the world.

To accelerate SoC verification, the Veloce Primo solution offers an enterprise prototyping system. Veloce Primo can scale up to 320 FPGAs (12 billion gates), can be remotely accessed by multiple users concurrently, and offers virtual interfaces and virtual lab test equipment for protocols such as PCIe, Ethernet, and DDR. Access to virtual interfaces removes the need for physical interaction with the prototyping platform and lab test equipment. The design can run at about 10 MHz, depending on how the FPGA partitioning is done.
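The capacity figures quoted above imply an average of roughly 37.5 million gates per FPGA (12 billion gates divided by 320 FPGAs). The sketch below uses that derived average to estimate how many FPGAs a design might need. The average is an estimate derived from the quoted figures rather than a vendor specification, and the `utilization` parameter is a hypothetical derating factor, since real designs rarely fill an FPGA completely.

```python
import math

# Figures quoted in the text: up to 320 FPGAs, about 12 billion gates.
PRIMO_MAX_FPGAS = 320
PRIMO_MAX_GATES = 12_000_000_000

# Derived average capacity per FPGA (an estimate, not a vendor specification).
GATES_PER_FPGA = PRIMO_MAX_GATES / PRIMO_MAX_FPGAS  # 37.5 million gates


def estimate_fpga_count(design_gates: int, utilization: float = 1.0) -> int:
    """Estimate the number of FPGAs needed to hold a design.

    utilization is a hypothetical fraction (0 < utilization <= 1) of each
    FPGA's capacity assumed to be usable after partitioning.
    """
    if design_gates <= 0:
        raise ValueError("design_gates must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return math.ceil(design_gates / (GATES_PER_FPGA * utilization))


# Hypothetical design sizes for an IP block, a subsystem, and a full SoC.
for name, gates in [("IP block", 30_000_000),
                    ("subsystem", 150_000_000),
                    ("SoC", 12_000_000_000)]:
    print(f"{name}: about {estimate_fpga_count(gates)} FPGA(s)")
```

Passing a utilization below 1.0 gives a more conservative estimate, for example when routing congestion or inter-FPGA pin limits are expected to restrict how full each device can be.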

Figure 5 illustrates an efficient enterprise FPGA prototyping system. It includes the Veloce Primo hardware, the VPS software for compilation and runtime execution control, Ethernet and PCIe VirtuaLAB (virtual protocol generator/analyzer), visualization apps for waveform viewing, and an enterprise server application for multi-user access management and Veloce Primo hardware diagnostics.

At any time, FPGAs can be dynamically allocated among users so that each user can schedule design and software verification workloads for an IP block, subsystem, or SoC design without compromising the productivity of other users.
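The idea of sharing one pool of FPGAs among concurrent users can be sketched as a simple allocator. This is an illustrative model only: the `FpgaPool` class, its methods, and the team names are hypothetical and do not represent the Veloce enterprise server application or its interface.

```python
class FpgaPool:
    """Toy model of a shared FPGA pool with per-user allocations."""

    def __init__(self, total: int) -> None:
        if total <= 0:
            raise ValueError("total must be positive")
        self.total = total
        self.allocations: dict[str, int] = {}

    @property
    def free(self) -> int:
        """FPGAs not currently allocated to any user."""
        return self.total - sum(self.allocations.values())

    def allocate(self, user: str, count: int) -> bool:
        """Reserve count FPGAs for user; return False if too few are free."""
        if count <= 0:
            raise ValueError("count must be positive")
        if user in self.allocations:
            raise ValueError(f"{user} already holds an allocation")
        if count > self.free:
            return False
        self.allocations[user] = count
        return True

    def release(self, user: str) -> int:
        """Return the user's FPGAs to the pool; report how many were freed."""
        return self.allocations.pop(user, 0)


# Hypothetical usage: three teams sharing a 320-FPGA system.
pool = FpgaPool(320)
pool.allocate("ip_team", 1)          # single-FPGA IP block workload
pool.allocate("subsystem_team", 4)   # four-FPGA subsystem workload
granted = pool.allocate("soc_team", 320)  # refused: only 315 FPGAs are free
print(granted, pool.free)
```

In this sketch a request that cannot be satisfied is refused rather than queued, so existing allocations are never disturbed. A production scheduler would more likely queue or prioritize such requests.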