A developer of microprocessors featuring a proprietary architecture says its Prodigy Universal Processors can outperform chips from Intel and Nvidia on HPC and AI workloads. Furthermore, the company says they can run code designed for other architectures using a dynamic binary translator without any performance degradation.
Emulating x86, Arm, and PowerPC on general-purpose CPU hardware is something that chipmakers have done for years, but the performance penalty compared to execution on native hardware has typically been prohibitively high.
In fact, in the consumer space only Apple managed to successfully emulate PowerPC on Intel’s x86 processors, in the second half of the 2000s. To a large degree, though, Apple’s Rosetta dynamic binary translator was so successful because Intel’s processors at the time were considerably more advanced than those based on the PowerPC architecture.
One CPU for every workload
But Tachyum claims that its software emulation technology is so efficient that its Prodigy Universal Processor can run Arm and RISC-V code better than modern processors based on those architectures. Furthermore, it can run x86 code well enough to handle legacy applications, says Tachyum.
Tachyum’s Prodigy processors are homogeneous CPUs with up to 128 cores, based on a proprietary architecture, that can run different types of workloads (AI, HPC, datacenter, etc.) seamlessly, reducing the complexity of both software development and hardware architecture.
According to the developer, when Prodigy runs native code, it can outperform the fastest Intel Xeon processors while consuming one-tenth of the power, and it can also beat Nvidia’s A100 GPUs in HPC, AI training, and inference tasks. Tachyum says that 125 HPC Prodigy racks can deliver 32 tensor EXAFLOPS of performance, though it does not disclose the number of Prodigy processors required or their expected power consumption.
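As a quick sanity check, the rack-level claim can be converted to a per-rack figure. The sketch below is only arithmetic on the two numbers Tachyum quotes (125 racks, 32 tensor EXAFLOPS); the function name is illustrative, and the result is an average implied by the claim, not a measured or published specification.

```python
def per_rack_petaflops(total_exaflops: float, racks: int) -> float:
    """Average throughput per rack in petaFLOPS (1 exaFLOPS = 1,000 petaFLOPS)."""
    if racks <= 0:
        raise ValueError("racks must be positive")
    return total_exaflops * 1000 / racks


# Figures as claimed by Tachyum: 32 tensor EXAFLOPS across 125 racks.
implied = per_rack_petaflops(32, 125)
print(f"{implied:.0f} tensor PFLOPS per rack")
```

Without the processor count per rack, this average cannot be broken down further into per-chip performance.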
One of the interesting implications of universal processors like Tachyum’s Prodigy, which can be used for different workloads, is that hyperscalers like Amazon Web Services or Google could use them more broadly than systems based on traditional processors and hardware accelerators such as GPUs. As a result, they could earn more money and reduce their maintenance costs.
Tachyum’s Prodigy Universal Processors are not quite here yet. The company uses field-programmable gate arrays (FPGAs) to emulate the chips and currently does not have a fully functional FPGA prototype.
At present, Tachyum lists a lineup of CPU models on its website, though it does not seem that any of the chips can be purchased. Tachyum expects to tape out Prodigy (i.e., send the final design to the fab for photomask production) later this year and then begin volume production of the CPUs sometime in 2021.