Building on the Armv9 architecture, Arm today introduces the components of the next generation of systems on a chip as they will be used in upcoming smartphones, tablets and PCs. The Cortex-X3, Cortex-A715 and Cortex-A510 Refresh deliver more performance, reduce power consumption and allow new configurations.
TCS22 delivers 28 percent more gaming performance
Cortex-X3, Cortex-A715 and Cortex-A510 Refresh are Arm’s new CPU cores for 2022 and are part of the “Compute Performance” section of Total Compute Solutions 2022 (TCS22), which also includes the sections “Developer Access” and “Security”. This is what Arm calls this year’s holistic solution built from its own IP. Corresponding products, i.e. processors, are not expected to come onto the market until 2023. In a typical 1+3+4 configuration, the TCS22 promises an average 28 percent increase in gaming performance over the TCS21 with Cortex-X2, Cortex-A710 and Cortex-A510.
However, this figure reflects not only the new Cortex cores, but in particular also the new Immortalis-G715 GPU and optimizations to the “DynamIQ Shared Unit”. In addition to the Immortalis-G715 with hardware ray tracing, Arm also introduced the Mali-G715 and Mali-G615 as new graphics units today. ComputerBase covers all three new GPUs based on the Valhall architecture in a separate article.
Evolution of last year’s Armv9 premiere
The Armv9 premiere took place about a year ago with the TCS21 and the associated presentations of Cortex-X2, Cortex-A710 and Cortex-A510 as well as DSU-110 and Mali-G710. The TCS22 builds on this and thus remains ISA-compatible. In addition to the average 28 percent increase in gaming performance, the TCS22 is said to reduce DRAM traffic by up to 23 percent and energy consumption by up to 16 percent in this scenario compared to a typical configuration for high-end smartphones. With the new IP, support for components such as the Cortex-M85 and software optimizations, the TCS22 is also said to make significant gains in machine learning. Arm also presented percentage changes for individual components of the TCS22 compared to the previous year, which will be discussed later in the article.
DSU is ready for fast Arm notebooks
New to the range of possible configurations is support for up to twelve cores within the cluster of the “DynamIQ Shared Unit-110” (DSU-110) that Arm introduced last year. This change allows new high-end configurations, which are necessary for future Arm PCs. As part of the Client Tech Day, Arm mentioned as an example a setup with eight Cortex-X3 and four Cortex-A715 cores, but without any Cortex-A510 Refresh. Likewise, an SoC built exclusively from twelve power-efficient small cores could be realized with it. With the TCS22, Arm wants to appeal to gamers in particular, and has therefore designed the new cores for sustained high performance in this area, so that throttling does not spoil the gaming experience.
The microarchitecture of the DSU-110 has not changed significantly; Arm speaks of a necessary “tuning” to prepare the design for the additional cores. There were also updates for areas that depend on the number of cores, such as the memory-mapped registers per core, as well as test methods for the physical implementation with a new floor plan.
In theory, partners with the updated DSU-110 can now also install up to twelve cores from last year’s IP. However, if the latest ISA features such as Asymmetric MTE and EPAN (explained in the following paragraph) are to be used, the latest IP must be used.
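The configurations mentioned in this section can be checked against the twelve-core limit with a few lines of Python. This is only an illustrative sketch: the function name `validate_cluster` and the constant are hypothetical, and the only rule modeled is the per-cluster core limit stated in the article, not any further constraint Arm may impose on valid core mixes.

```python
from collections import Counter

# Per-cluster core limit of the updated DSU-110, as stated in the article.
MAX_CORES_PER_CLUSTER = 12


def validate_cluster(cores):
    """Return the count per core type if the cluster fits, else raise ValueError.

    Hypothetical helper: it only checks the total core count.
    """
    counts = Counter(cores)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("a cluster needs at least one core")
    if total > MAX_CORES_PER_CLUSTER:
        raise ValueError(
            f"{total} cores exceed the limit of {MAX_CORES_PER_CLUSTER}"
        )
    return dict(counts)


# Typical 1+3+4 smartphone configuration.
smartphone = ["Cortex-X3"] + ["Cortex-A715"] * 3 + ["Cortex-A510 Refresh"] * 4
# Notebook example from Client Tech Day: eight Cortex-X3, four Cortex-A715.
notebook = ["Cortex-X3"] * 8 + ["Cortex-A715"] * 4
# Twelve small cores only.
small_only = ["Cortex-A510 Refresh"] * 12

for name, cluster in [("smartphone", smartphone),
                      ("notebook", notebook),
                      ("small only", small_only)]:
    print(name, validate_cluster(cluster))
```

A thirteenth core in any of these lists would make `validate_cluster` raise a `ValueError`.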
Arm expands security features
In terms of security, the new compute cluster brings support for Asymmetric MTE as an extension of the Memory Tagging Extension (MTE) introduced last year with Armv9. Memory areas and the associated pointers are given the same tag, and the CPU checks them for a match on access. If there is a mismatch, the CPU interrupts processing. With Asymmetric MTE, the CPU can raise this fault synchronously on a load instruction, while tag mismatches on a store instruction are handled asynchronously.
With EPAN (“Enhanced PAN”), Arm also follows up on the existing PAN (“Privileged Access Never”), which is intended to prevent privileged code, for example the kernel, from accessing less privileged user-mode memory. The security function is meant to stop a user-mode attack from being carried out via a tricked kernel. However, a flaw in the Arm specification meant that access to user-mode memory pages marked as “execute-only” was not prevented. “Enhanced PAN” is intended to correct exactly this.
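The tag check can be illustrated with a small software model. This is a toy simulation and not how the hardware is implemented: the class `TaggedMemory` and all of its methods are hypothetical. The 16-byte tag granule and the 4-bit tag in pointer bits 59:56 are details of MTE that the article itself does not mention. In asymmetric mode, a load with the wrong tag faults immediately, while a store with the wrong tag is carried out and only recorded for later reporting.

```python
GRANULE = 16      # bytes covered by one tag (MTE granule size)
TAG_SHIFT = 56    # the 4-bit tag sits in pointer bits 59:56
TAG_MASK = 0xF


class TagCheckFault(Exception):
    """Raised when a load uses a pointer whose tag does not match memory."""


class TaggedMemory:
    """Toy model of asymmetric MTE: loads checked synchronously, stores asynchronously."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.tags = [0] * (size // GRANULE)
        self.pending_store_faults = []  # mismatches noted for later reporting

    @staticmethod
    def split(pointer):
        """Split a tagged pointer into (tag, address)."""
        tag = (pointer >> TAG_SHIFT) & TAG_MASK
        address = pointer & ((1 << TAG_SHIFT) - 1)
        return tag, address

    def tag_region(self, address, length, tag):
        """Tag every granule of a region and return a pointer with the same tag."""
        first = address // GRANULE
        last = (address + length - 1) // GRANULE
        for granule in range(first, last + 1):
            self.tags[granule] = tag
        return (tag << TAG_SHIFT) | address

    def load(self, pointer):
        tag, address = self.split(pointer)
        if self.tags[address // GRANULE] != tag:
            # Synchronous check: the load does not complete.
            raise TagCheckFault(f"load tag mismatch at {address:#x}")
        return self.data[address]

    def store(self, pointer, value):
        tag, address = self.split(pointer)
        if self.tags[address // GRANULE] != tag:
            # Asynchronous check: record the mismatch, the store still happens.
            self.pending_store_faults.append(address)
        self.data[address] = value


mem = TaggedMemory(64)
good = mem.tag_region(0, 16, tag=0x3)   # pointer and memory share tag 3
bad = (0x5 << TAG_SHIFT) | 0            # same address, wrong tag

mem.store(good, 7)
print(mem.load(good))                   # matching tag, load succeeds

mem.store(bad, 9)                       # mismatch is only recorded
print(mem.pending_store_faults)

try:
    mem.load(bad)                       # mismatch faults immediately
except TagCheckFault as fault:
    print("fault:", fault)
```

The model shows the trade-off behind the asymmetric mode: precise faults where loads are concerned, cheaper deferred reporting for stores.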
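The difference between PAN and EPAN comes down to which pages count as user pages. The following sketch models that decision in a strongly simplified form: the `Page` type and both functions are hypothetical, and real page-table permission encoding is considerably more involved. The assumption here is that classic PAN only looks at whether user mode may read or write a page, whereas EPAN additionally treats pages that user mode may execute as user pages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """Simplified user-mode permissions of one memory page (hypothetical model)."""
    user_read: bool
    user_write: bool
    user_exec: bool


def pan_blocks_kernel_access(page):
    """Classic PAN: block privileged data access if user mode can read or write."""
    return page.user_read or page.user_write


def epan_blocks_kernel_access(page):
    """EPAN: additionally treat user-executable pages as user pages."""
    return pan_blocks_kernel_access(page) or page.user_exec


user_data = Page(user_read=True, user_write=True, user_exec=False)
execute_only = Page(user_read=False, user_write=False, user_exec=True)
kernel_only = Page(user_read=False, user_write=False, user_exec=False)

for name, page in [("user data", user_data),
                   ("execute-only", execute_only),
                   ("kernel only", kernel_only)]:
    print(f"{name}: PAN blocks={pan_blocks_kernel_access(page)}, "
          f"EPAN blocks={epan_blocks_kernel_access(page)}")
```

Only the execute-only page is treated differently by the two functions, which is the gap described above.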
Immortalis GPU masters hardware ray tracing
Arm introduces new graphics units for the flagship and premium segment with the TCS22. The new flagship is the Immortalis-G715, which can be integrated with up to 16 GPU cores. The two major innovations of the Immortalis-G715 are hardware ray tracing via a new RTU (“Ray Tracing Unit”) in the shader core and support for “Variable Rate Shading” (VRS). In addition to the Immortalis-G715, the Mali-G715 is also new and offers the same changes, such as VRS and optimizations to the execution engine, with the exception of the RTU. The same applies to the Mali-G615, which can be configured with fewer shader cores and optionally with less L2 cache.
CoreLink CI-700 and NI-700 remain almost the same
Arm has made cache optimizations for the CoreLink CI-700 and NI-700 interconnects, which are used to connect Arm’s own system IP and IP developed by third-party manufacturers, for example to ensure better scalability when using (asymmetric) MTE. As before, between one and eight DSU clusters and up to eight memory controllers can be connected to the CoreLink CI-700; in smartphones, one of each is common. A CoreLink CI-700 is configured with so-called XP nodes, which can form a mesh of up to 4 × 3 points, each of which can have a maximum of eight system-level cache (SLC) slices of up to 4 MB. Arm also uses the SLC to store the MTE tags.
Optimizations for modern manufacturing processes
Arm has optimized the physical IP so that it works better with state-of-the-art manufacturing processes such as the 5 nm and 4 nm nodes offered by Samsung and TSMC. The TCS22 now also supports the connection of the Cortex-M85, which is suitable for embedded solutions such as smart speakers, but with its DSP and machine learning functions can also act as an always-on processor, for example to process voice commands on a smartphone. If the smaller Cortex-M55 is used, the Arm Custom Instructions, which were limited to the Cortex-M33 when they were introduced, can now also be run on it.
New tools for developers
In the area of software and tools for developers, Arm offers TCS22 support for the latest Android, which is expected to be version 13 in the fall. Arm will also soon make a new Fixed Virtual Platform (FVP) available, with which a complete Arm system based on the new components can be simulated at approximately the speed of real hardware. Arm provides the FVP for the Windows and Linux operating systems. The tools for developers also include the Hardware Success Kit and the Software Success Kit, which Arm intends to make available shortly in a new generation adapted for the TCS22.