[Proofreading] Chapter 3. part4
dendibakh committed Jan 22, 2024
1 parent 3611c53 commit 11e7459
Showing 1 changed file with 6 additions and 6 deletions.
12 changes: 6 additions & 6 deletions chapters/3-CPU-Microarchitecture/3-5 Exploiting TLP.md
@@ -36,16 +36,16 @@ There is also a security concern with certain simultaneous multithreading implem

### Hybrid Architectures

Computer architects also developed a hybrid CPU design, where two types of cores (or more) are put in the same processor. Typically, more powerful cores are coupled with relatively slower cores to address different goals. In such a system, big cores are used for latency-sensitive task and small cores are used for better battery-saving. But also, both types of cores can be utilized at the same time to improve multithreaded performance. All cores have access to the same memory, so workloads can migrate from big to small cores and back on the fly. The intention is to create a multicore processor that can adapt better to dynamic computing needs and use less power. For example, video games have parts of single-core burst performance as well as parts where they can scale to many cores.
Computer architects also developed a hybrid CPU design in which two (or more) types of core are put in the same processor. Typically, more powerful cores are coupled with relatively slower cores to address different goals. In such a system, big cores are used for latency-sensitive tasks and small cores provide reduced power consumption. But also, both types of cores can be utilized at the same time to improve multithreaded performance. All cores have access to the same memory, so workloads can migrate from big to small cores and back on the fly. The intention is to create a multicore processor that can adapt better to dynamic computing needs and use less power. For example, video games have parts of single-core burst performance as well as parts where they can scale to many cores.

The first mainstream hybrid architecture was ARM's big.LITTLE, which was introduced in October 2011. Other vendors followed this approach. In 2020, Apple introduced its M1 chip, which has four high-performance "Firestorm" and four energy-efficient "Icestorm" cores. Intel introduced its Alder Lake hybrid architecture in 2021 with eight P- and eight E-cores in the top configuration.

Hybrid architectures combine the best sides of both core types, but they come with their own set of challenges. First of all, they require cores to be fully ISA-compatible, i.e., they should be able to execute the same set of instructions. Otherwise, the scheduling becomes restricted. For example, if a big core features some fancy instructions that are not available on small cores, then you can only assign big cores to run workloads that use such instructions. That's why vendors usually use the "greatest common denominator" approach when choosing the ISA for a hybrid processor.
Hybrid architectures combine the best sides of both core types, but they come with their own set of challenges. First of all, they require cores to be fully ISA-compatible, i.e., they should be able to execute the same set of instructions. Otherwise, scheduling becomes restricted. For example, if a big core features some fancy instructions that are not available on small cores, then you can only assign big cores to run workloads that use such instructions. That's why vendors usually use the "greatest common denominator" approach when choosing the ISA for a hybrid processor.
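
One way to picture the "greatest common denominator" approach is as a set intersection over the instruction-set extensions of each core type. The sketch below is only an illustration: the feature names and both feature sets are hypothetical and do not describe any real processor.

```python
# Hypothetical feature sets, invented for illustration only.
BIG_CORE_FEATURES = {"base", "simd128", "simd256", "simd512"}
SMALL_CORE_FEATURES = {"base", "simd128", "simd256"}


def common_isa(*core_feature_sets):
    """Return the features that every core type supports."""
    return set.intersection(*map(set, core_feature_sets))


def can_run_anywhere(required, *core_feature_sets):
    """True if a workload needing `required` can be scheduled on any core."""
    return set(required) <= common_isa(*core_feature_sets)
```

Under these assumed sets, a workload that needs `simd512` could only be placed on big cores, which is the scheduling restriction the paragraph describes.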

Even with ISA-compatible cores, scheduling becomes challenging. Different types of workloads call for specific scheduling scheme, e.g., bursty execution vs. steady execution, low IPC vs. high IPC, low improtance vs. high importance, etc. It becomes non-trivial very quickly. Here are a few considerations for optimal scheduling:
Even with ISA-compatible cores, scheduling becomes challenging. Different types of workloads call for a specific scheduling scheme, e.g., bursty execution vs. steady execution, low IPC vs. high IPC, low importance vs. high importance, etc. It becomes non-trivial very quickly. Here are a few considerations for optimal scheduling:

* Leverage small cores to conserve power. Do not wake up big cores for the background work.
* Leverage small cores to conserve power. Do not wake up big cores for background work.
* Recognize candidates (low importance, low IPC) for offloading to smaller cores. Similarly, promote high importance, high IPC tasks to big cores.
* When assigning a new task, use an idle big core first. In case SMT, use big cores with both logical threads idle. After that, use idle small cores. After that, use sibling logical threads of big cores.
* When assigning a new task, use an idle big core first. In case of SMT, use big cores with both logical threads idle. After that, use idle small cores. After that, use sibling logical threads of big cores.
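
The placement order in the last bullet can be sketched as a small selection function. This is a simplified model, not how any real OS scheduler is implemented: the `LogicalCpu` type and `pick_cpu` function are hypothetical names, and the sketch assumes that only big cores have SMT.

```python
from dataclasses import dataclass


@dataclass
class LogicalCpu:
    cpu_id: int
    core_id: int        # physical core this logical CPU belongs to
    is_big: bool
    busy: bool = False


def pick_cpu(cpus):
    """Pick a logical CPU for a new task, or None if all are busy."""
    busy_cores = {c.core_id for c in cpus if c.busy}
    idle = [c for c in cpus if not c.busy]

    # 1. A big core whose logical threads are all idle.
    for c in idle:
        if c.is_big and c.core_id not in busy_cores:
            return c
    # 2. An idle small core.
    for c in idle:
        if not c.is_big:
            return c
    # 3. The idle sibling thread of a big core that is already in use.
    for c in idle:
        if c.is_big:
            return c
    return None
```

For two big cores with two logical threads each (CPUs 0-1 and 2-3) and two small cores (CPUs 4 and 5), repeatedly placing tasks with this function fills CPUs in the order 0, 2, 4, 5, 1, 3.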

From a programmer's perspective, no code changes are needed to make use of hybrid systems. This approach became very popular in client-facing devices, especially in smartphones. We will take a look at Intel's Alder Lake design later in this chapter.
From a programmer's perspective, no code changes are needed to make use of hybrid systems. This approach became very popular in client-facing devices, especially in smartphones.
