Kaveri Architecture
The Kaveri APUs sit on a 245 mm² die and contain 2.41 billion transistors. They are built on the same 28nm process we are used to seeing. The biggest difference you are likely to notice in the Kaveri architecture is the much larger GPU, which now takes up around 47% of the die. The GPU is paired with two dual-core x86 CPU modules to make up the up to 12 compute cores available on Kaveri APUs. With the emphasis on the GPU cores, Kaveri can reach a theoretical peak of up to 856 GFLOPS. The Kaveri architecture fully supports the HSA features we discussed earlier, as well as AMD TrueAudio Technology and PCI Express Gen 3.
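The 856 GFLOPS figure can be roughly reproduced from the flagship A10-7850K numbers quoted in this article (4 CPU cores at 3.7GHz, 512 shaders at 720 MHz). The sketch below is an illustration, not AMD's published method: the per-cycle throughput values (8 FLOPs per CPU core, 2 FLOPs per shader for a fused multiply-add) are assumptions, and the function name `peak_gflops` is hypothetical.

```python
def peak_gflops(cpu_cores, cpu_ghz, gpu_shaders, gpu_ghz,
                cpu_flops_per_cycle=8, gpu_flops_per_cycle=2):
    """Return (cpu, gpu, total) theoretical peak GFLOPS.

    cpu_flops_per_cycle and gpu_flops_per_cycle are assumed values,
    not figures taken from the article.
    """
    cpu = cpu_cores * cpu_ghz * cpu_flops_per_cycle
    gpu = gpu_shaders * gpu_ghz * gpu_flops_per_cycle
    return cpu, gpu, cpu + gpu


# A10-7850K: 4 CPU cores at 3.7 GHz, 512 shaders at 720 MHz (0.720 GHz)
cpu, gpu, total = peak_gflops(cpu_cores=4, cpu_ghz=3.7,
                              gpu_shaders=512, gpu_ghz=0.720)
print(f"CPU {cpu:.1f} + GPU {gpu:.1f} = {total:.1f} GFLOPS")
# prints: CPU 118.4 + GPU 737.3 = 855.7 GFLOPS
```

Under these assumptions the GPU accounts for well over 80% of the theoretical peak, which fits the emphasis on the GPU cores described above.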
The Kaveri APUs keep a lot of the architecture we found in previous APUs, such as the shared L2 cache per module (one module = two cores) and support for the latest instruction set extensions (FMA4, AVX, AES, XOP). One thing you’ll probably notice about Kaveri is the lower clock speeds compared to Richland. The AMD A10-7850K, the flagship Kaveri APU, runs at 3.7GHz with a max turbo clock of 4GHz. The A10-7700K runs at 3.4GHz with turbo up to 3.8GHz, and the A8-7600 runs at 3.3GHz with turbo up to 3.8GHz. AMD admitted that the lower clocks at high TDPs were a compromise made for better GPU performance.
Even though they are clocked slower, the new Steamroller cores deliver up to a 20% increase in instructions per cycle (IPC) over the Piledriver cores, according to AMD. That 20% figure is a best case rather than the norm, with most IPC gains for the Steamroller cores landing around 10%. The gains come from a 30% reduction in i-cache misses, a 20% reduction in mispredicted branches, an increase in schedulers (from 40 to 48), two integer schedulers, a 25% increase in max-width dispatches per thread, and improvements in store handling.
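To see how an IPC gain trades off against a lower clock, a first-order estimate treats per-thread throughput as IPC multiplied by clock speed. The sketch below is a simplification under stated assumptions: it ignores turbo behavior, memory, and workload effects, the helper `relative_throughput` is hypothetical, and the 4.1GHz Richland base clock (A10-6800K) is an assumption that does not come from this article.

```python
def relative_throughput(ipc_gain, new_ghz, old_ghz):
    """First-order per-thread speedup: (1 + IPC gain) * clock ratio."""
    return (1.0 + ipc_gain) * (new_ghz / old_ghz)


KAVERI_GHZ = 3.7    # A10-7850K base clock, from the article
RICHLAND_GHZ = 4.1  # assumption: A10-6800K base clock, not from the article

for gain in (0.10, 0.20):
    ratio = relative_throughput(gain, KAVERI_GHZ, RICHLAND_GHZ)
    print(f"IPC +{gain:.0%}: {ratio:.3f}x the older part's per-thread throughput")
```

With these inputs, the typical 10% IPC gain roughly cancels out the clock deficit, while the best-case 20% gain comes out around 8% ahead.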
The up to 8 GCN-based GPU cores on the Kaveri APUs support the latest graphics technologies found in Hawaii, including TrueAudio, Eyefinity, UVD 4, and VCE 2. The GPU cores have up to 512 shaders, support for system and device flat addressing, and the new MQSAD instruction with 32-bit accumulation and saturation. The Kaveri GPU also increases performance by allowing the local data store to buffer data rather than fetching it from off the GPU, reducing off-chip bandwidth usage. Kaveri’s up to 8 asynchronous compute engines can work independently or simultaneously for faster context switching. Kaveri further adds a second bus through the IOMMU (input/output memory management unit) for a total of one coherent and one non-coherent bus. The Kaveri GPU enhances support for H.265 4K accelerated playback, accelerated video and image editing, and realism in gaming with physics and AI co-processing.
The Kaveri APUs use Socket FM2+, rather than the Socket FM2 used by the Trinity and Richland APUs. This means that you will need a new motherboard if you upgrade to a Kaveri APU. The new motherboard chipsets, however, are fully backwards compatible, so a Trinity or Richland APU will work in an A88X, A78, or A68 motherboard.
To conclude the Kaveri Architecture summary, here is a list of the new Kaveri APUs.
| | A10-7850K | A10-7700K | A8-7600 |
| --- | --- | --- | --- |
| Compute Cores | 12 (4 CPU + 8 GPU) | 10 (4 CPU + 6 GPU) | 10 (4 CPU + 6 GPU) |
| Max Turbo / CPU Frequency | 4 / 3.7 GHz | 3.8 / 3.4 GHz | 3.8 / 3.3 GHz |
| L2 Cache | 4 MB | 4 MB | 4 MB |
| GPU Frequency | 720 MHz | 720 MHz | 720 MHz |
| HSA Features | Yes | Yes | Yes |
| AMD TrueAudio Technology | Yes | Yes | Yes |
| Mantle Support | Yes | Yes | Yes |
| AMD Configurable TDP | Yes | Yes | Yes – Optimized |