The Linux Foundation

 
CGLGapProposalCGOS.6.1

Title

Discovery of Platform CPU Architecture

Description

Allow applications to discover the topology and other details of the platform CPU architecture, such as the number and sizes of the caches, in order to facilitate and optimize SMP programming.

The CGOS needs to allow an application to discover the platform's CPU topology and related details, such as the number and sizes of the caches, so that the application can optimize its use of multiple CPUs, the memory hierarchy, and the interconnect fabric. The CGOS needs to provide this architectural information in a format that is uniform across platforms.

Many forms of SMP are available today in CG environments, ranging from SMT to NUMA and combinations in between. The CPU configuration has a profound effect on performance and stability. The scheduler in most operating systems (Linux 2.6, for example) dynamically builds a view of the system based on the CPU topology, including caches and threads. This view must be exported to application programmers. Some of this information is already available under the /sys/devices/system/cpu directory.
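As an illustration of what is already exported under /sys/devices/system/cpu, the following minimal sketch reads the package, core, sibling-thread, and cache details of one logical CPU. The function names parse_cpu_list and read_cpu_topology are hypothetical helpers introduced here, not part of any existing API, and the sketch assumes a Linux kernel that provides the topology/ and cache/indexN/ entries; some platforms omit part of them.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as "0-3,8" into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def _read(path):
    """Return the stripped text content of one sysfs attribute file."""
    return Path(path).read_text().strip()


def read_cpu_topology(cpu, root="/sys/devices/system/cpu"):
    """Return package, core, and cache details for one logical CPU.

    Hypothetical helper: assumes the sysfs layout
    <root>/cpuN/topology/* and <root>/cpuN/cache/indexM/*.
    """
    base = Path(root) / f"cpu{cpu}"
    caches = []
    # Each cache/indexM directory describes one cache visible to this CPU.
    for index in sorted(base.glob("cache/index[0-9]*")):
        caches.append({
            "level": int(_read(index / "level")),
            "type": _read(index / "type"),
            "size": _read(index / "size"),
            "shared_cpus": parse_cpu_list(_read(index / "shared_cpu_list")),
        })
    return {
        "package": int(_read(base / "topology" / "physical_package_id")),
        "core": int(_read(base / "topology" / "core_id")),
        "thread_siblings": parse_cpu_list(
            _read(base / "topology" / "thread_siblings_list")),
        "caches": caches,
    }
```

On a Linux system, read_cpu_topology(0) would describe logical CPU 0. The root parameter exists only so that the function can be pointed at a copy of the tree for testing. The sketch also shows the gap this proposal targets: the layout is Linux-specific and its contents vary between architectures, so it is not the uniform format the requirement asks for.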

Understanding which caches and threads are shared and which are private is key to writing high-performance software. This topology information must be available on demand (through a system call or from /proc) so that the application can be partitioned appropriately early in the design stage. This approach can be taken a step further: an application can determine the topology dynamically at run time and optimize its operation for that specific topology.
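To make the /proc route concrete, here is a minimal sketch that parses the text of /proc/cpuinfo and counts logical CPUs, physical packages, and cores. The function names parse_cpuinfo and summarize are hypothetical. The sketch assumes the "physical id" and "core id" fields that x86 kernels typically report; other architectures may omit them, in which case every CPU is counted under a single package and core.

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo text into one dict of fields per logical CPU."""
    cpus = []
    # Entries are separated by blank lines; each line is "key : value".
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        # Skip trailing blocks that do not describe a processor.
        if "processor" in fields:
            cpus.append(fields)
    return cpus


def summarize(cpus):
    """Count logical CPUs, physical packages, and distinct cores.

    Assumes the "physical id" and "core id" keys; a missing key is
    treated as None, so such CPUs collapse into one package and core.
    """
    packages = {c.get("physical id") for c in cpus}
    cores = {(c.get("physical id"), c.get("core id")) for c in cpus}
    return {
        "logical": len(cpus),
        "packages": len(packages),
        "cores": len(cores),
    }


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        print(summarize(parse_cpuinfo(f.read())))
```

Because the field names differ between architectures, an application written this way needs per-platform handling, which is the kind of non-uniformity this requirement aims to remove.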

Priority

Low

Use Cases

Applications are best placed to determine their own resource requirements and how to optimize their behaviour in resource-constrained situations. Given information about the number of cores and CPUs, the size and location of the caches, the memory topology, and so on, an application can reconfigure itself for optimal performance.

Scenarios

An application may choose to spawn more or fewer threads, each with a different CPU affinity, based on the total number of CPUs or on the number of cores that share an L1 or L2 cache within the system.
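The scenario above can be sketched as follows: group the CPUs into sets that share a cache, then start one worker thread per set and pin it to that set. The names cache_domains and run_pinned are hypothetical, the input mapping is assumed to have been obtained elsewhere (for example from the sysfs shared_cpu_list files), and os.sched_setaffinity is available only on Linux and some other Unix platforms.

```python
import os
import threading


def cache_domains(shared_cpus_by_cpu):
    """Group CPUs into sets that share the same cache.

    shared_cpus_by_cpu maps each CPU number to the list of CPUs that
    share its cache; CPUs reporting the same list form one domain.
    """
    domains = {frozenset(shared) for shared in shared_cpus_by_cpu.values()}
    return sorted(domains, key=min)


def run_pinned(domains, work):
    """Start one thread per cache domain, pinned to that domain's CPUs."""
    def worker(cpus):
        # Assumption: on Linux, pid 0 applies the mask to the calling thread.
        os.sched_setaffinity(0, cpus)
        work(cpus)

    threads = [threading.Thread(target=worker, args=(d,)) for d in domains]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    # Example mapping for a hypothetical 4-CPU system where CPUs 0-1
    # and CPUs 2-3 each share an L2 cache.
    example = {0: [0, 1], 1: [0, 1], 2: [2, 3], 3: [2, 3]}
    for domain in cache_domains(example):
        print(sorted(domain))
```

The number of domains gives the thread count and each domain gives the affinity mask, so the same code spawns two workers on the example system and more on a larger one. run_pinned raises OSError if a domain names CPUs that the process is not allowed to use.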

References

SCOPE Alliance Carrier Grade Operating Systems, Gap Analysis v2.0, CGOS-6.1


Internal Use Only

State: [OPEN]
Section: [Serviceability]
Date opened: [2008.09.09]
Submitter: [SCOPE Alliance]
Owner: [SCOPE Alliance]
Date integrated: []
Integrated into: []
Resolution Comments: []
Proof of Concept: []

