LSPR FAQ: z/OS V2R4 and z/VM
What are the major changes to the z/OS V2R4 LSPR?
The LSPR ratios reflect the range of performance between IBM Z servers as measured using a wide variety of application benchmarks. The latest release of LSPR continues with the methodology introduced with the z/OS V1R11 LSPR. Prior to that version, workloads had been categorized by their application type or software characteristics (for example, CICS®, OLTP-T, LoIO-mix). The introduction of CPU MF (SMF 113) data, starting with the z10 processor, made it possible to gain insight into the underlying hardware characteristics that influence performance. The LSPR defines three workload categories, LOW, AVERAGE, and HIGH, based on a metric called “Relative Nest Intensity (RNI)”, which reflects a workload’s use of a processor’s memory hierarchy. For details on RNI and the workload categories, please refer to the LSPR documentation or go to https://www.ibm.com/servers/resourcelink/lib03060.nsf/pages/lsprindex.
What is the multi-image table in the LSPR?
Typically, IBM Z processors are configured with multiple images of z/OS. Thus, the LSPR continues to include a table of performance ratios based on average multi-image z/OS configurations for each processor model as determined from the profiling data. The multi-image table is used as the basis for setting MIPS and MSUs for IBM Z processors.
What multi-image configurations are used to produce the LSPR multi-image table?
A wide variety of multi-image configurations exist. The main variables in a configuration typically are: 1) number of images, 2) size of each image (number of logical engines), 3) relative weight of each image, 4) overall ratio of logical engines to physical engines, 5) the number of books, and 6) the number of ICFs/IFLs. The configurations used for the LSPR multi-image table are based on the average values for these variables as observed across a processor family. It was found that the average number of images ranged from five at low-end models to nine at the high end. Most systems were configured with two major images (those defined with >20% relative weight). On low- to mid-range models, at least one of the major images tended to be configured with a number of logical engines close to the number of physical engines. On high-end boxes, the major images were generally configured with a number of logical engines well below the count of physical engines reflecting the more common use of these processors for consolidation. The overall ratio of logical to physical engines (often referred to as “the level of processor over-commitment” in a virtualized environment) averaged as high as 3:1 on the smallest models, hovered around 2:1 across the majority of models, and dropped to 1.3:1 on the largest models. The majority of models were configured with one book more than necessary to hold the enabled processing engines, and an average of 3 ICFs/IFLs were installed.
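As a rough illustration of two of the variables described above, the following Python sketch computes the overall logical-to-physical engine ratio and identifies the "major" images for one LPAR configuration. The image names, engine counts, and weights are hypothetical values chosen for the example, and treating "relative weight" as each image's share of the total defined weight is an assumption; neither comes from the LSPR profiling data.

```python
from dataclasses import dataclass


@dataclass
class Image:
    name: str             # hypothetical image name
    logical_engines: int  # logical engines defined to the image
    weight: float         # LPAR weight as defined (arbitrary units)


def logical_to_physical_ratio(images, physical_engines):
    """Overall ratio of logical engines to physical engines."""
    return sum(i.logical_engines for i in images) / physical_engines


def major_images(images, threshold=0.20):
    """Names of images whose share of the total weight exceeds the threshold.

    Assumption: relative weight = image weight / sum of all image weights.
    """
    total = sum(i.weight for i in images)
    return [i.name for i in images if i.weight / total > threshold]


# Hypothetical five-image configuration on a machine with 10 physical engines.
config = [
    Image("PROD1", 8, 400),
    Image("PROD2", 6, 300),
    Image("TEST1", 3, 150),
    Image("TEST2", 2, 100),
    Image("SAND1", 1, 50),
]

print(logical_to_physical_ratio(config, 10))  # 2.0, i.e. a 2:1 ratio
print(major_images(config))                   # ['PROD1', 'PROD2']
```

This hypothetical configuration happens to land near the averages quoted above: two major images and a ratio of about 2:1.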
Can I use the LSPR multi-image table for capacity sizing?
For high-level sizing, the multi-image table may be used. However, the most accurate sizing requires using the zPCR tool’s LPAR Configuration Capacity Planning function, which can be customized to exactly match a specific multi-image configuration rather than the average configuration reflected in the multi-image LSPR table.
What model is used as the "base" or "reference" processor in the z/OS V2R4 LSPR table?
The 2094-701 processor model is used as the base in the z/OS V2R4 table. Thus, the ITRR for the 2094-701 appears as 1.00.
Note that in zPCR the reference processor may be set at the user’s discretion.
What "capacity scaling factors" are commonly used?
The LSPR provides capacity ratios among various processor families. It has become common practice to assign a capacity scaling value to processors as a high-level approximation of their capacities. The commonly used scaling factors can change based on the version of LSPR. For z/OS V2R4 studies, the capacity scaling factor commonly associated with the reference processor set to a 2094-701 is 593, which is unchanged from that used originally with z/OS V1R11. This value reflects a 2094-701 configured with a single image of z/OS; no complex LPAR configuration (i.e., multiple z/OS images) effects are included. For the z/OS V2R4 multi-image table, the commonly used scaling factor is 0.944 x 593 = 559.792. Note that the 0.944 factor reflects the fact that the multi-image table has processors configured based on the average client LPAR configuration; on a 2094-701, the cost to run this complex configuration is approximately 5.6%. The commonly used capacity scaling values associated with each model of a processor may be approximated by multiplying the AVERAGE column of ITRRs in the LSPR z/OS V2R4 multi-image table by 559.792. The PCI (Processor Capacity Index) column in the z/OS V2R4 multi-image table shows the result of this calculation. Note that the PCI column was actually calculated using zPCR, thus the full precision of each ITRR is reflected in the values. Minor differences in the resulting PCI calculation may be observed when using the rounded values from the LSPR table.
Of course, using a table of values based on a capacity scaling factor only allows for a gross approximation of the relative capacities among the processor models. A more accurate analysis may be conducted by using zPCR to perform a detailed LPAR configuration assessment to develop the capacity ratio between a "before" and "after" configuration.
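The scaling arithmetic described in this answer can be sketched in a few lines of Python. The constants 593 and 0.944 are the values quoted above; the function names and the sample ITRR of 2.50 are hypothetical and are not taken from the LSPR table. As noted above, results computed from rounded table values may differ slightly from the published PCI column.

```python
# Values quoted in the answer above.
SINGLE_IMAGE_FACTOR = 593       # 2094-701 running a single z/OS image
MULTI_IMAGE_ADJUSTMENT = 0.944  # reflects the ~5.6% cost of the average LPAR configuration


def multi_image_scaling_factor():
    """Scaling factor commonly used with the multi-image table."""
    return MULTI_IMAGE_ADJUSTMENT * SINGLE_IMAGE_FACTOR


def approximate_pci(average_itrr):
    """Approximate a capacity scaling value from an AVERAGE-column ITRR."""
    return average_itrr * multi_image_scaling_factor()


print(round(multi_image_scaling_factor(), 3))  # 559.792

# Hypothetical ITRR, used only to illustrate the multiplication.
sample_itrr = 2.50
print(round(approximate_pci(sample_itrr), 2))  # 1399.48
```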
How much variability in performance should I expect when moving a workload to an IBM z16 processor?
As with the introduction of any new server, workloads with differing characteristics will see variation in performance when moved to an IBM z16. The performance ratings for a server are determined by the performance of a reference workload that represents what we understand to be the major components of our customers' production environments. While we feel the ratings provide good "middle-of-the-road" values, we also recognize some customers' workloads will differ somewhat from the reference workload we used. The IBM z16 has improvements in its microprocessor design and in its memory hierarchy. However, workloads with different characteristics will see varying performance values from these changes. It is expected that the range of variation in performance of workloads will be similar to that seen in recent processor generations.
Once my workload is up and running on an IBM z16, how much variability in performance will I see?
Minute-to-minute, hour-to-hour and day-to-day performance variability generally grows with the size (capacity) of the server and the complexity of the LPAR configuration. With its improved microprocessor and memory hierarchy design and support for larger numbers of engines, the IBM z16 provides an increase in capacity over the largest previous server in each family. Continued enhancements to z/OS HiperDispatch have been made to help reduce the potential for increased performance variability. In the spirit of autonomic computing, PR/SM™ and the z/OS dispatcher cooperate to automatically place and dispatch logical partitions to help optimize the performance of the hardware and minimize the interference of one partition with another. However, while the average performance of workloads is expected to remain reasonably consistent, some variation in performance might be seen when it is viewed at small increments of time or by individual jobs or transactions, simply due to the expected larger and more complex LPAR configurations that can be supported by the IBM z16.
How do I get performance information for my TPF products running on an IBM z16?
TPF provides "Workload Specifics ITRRs" separately from the LSPR tables. For more information please contact your TPF Support Representative or send a request to firstname.lastname@example.org
What is z/OS HiperDispatch and how does it impact performance?
z/OS HiperDispatch is the z/OS exploitation of PR/SM’s Vertical CPU Management (VCM) capabilities and is available only on IBM Z processors, starting with the IBM System z10®. Rather than dispatching tasks randomly across all logical processors in a partition, z/OS will tie tasks to small queues of logical processors and dispatch work to a “high priority” subset of the logical processors. PR/SM provides processor topology information and updates to z/OS and ties the high priority logical processors to physical processors. HiperDispatch can lead to improved efficiency in both the hardware and software in two ways: 1) work may be dispatched across fewer logical processors, therefore reducing the “multi-processor (MP) effects” and lowering the interference among multiple partitions; 2) specific z/OS tasks may be dispatched to a small subset of logical processors which PR/SM will tie to the same physical processors, thus improving hardware cache re-use and locality of reference, for example by reducing the rate of cross-book communication. Note that the value of HiperDispatch is higher on the IBM zEnterprise 196 (z196) and later processors due to their sensitivity to the chip-level shared cache topology.
A white paper is available concerning z/OS HiperDispatch at: https://www.ibm.com/support/pages/zos-planning-considerations-hiperdispatch-mode.
What is z/VM HiperDispatch and how does it impact performance?
z/VM HiperDispatch is the z/VM exploitation of PR/SM's Vertical CPU Management (VCM) capabilities. z/VM HiperDispatch improves CPU efficiency by causing the z/VM Control Program to run virtual servers in a manner that recognizes and exploits IBM Z machine topology to increase the effectiveness of physical machine memory cache. This includes: a) requesting PR/SM to handle the partition's logical processors in a manner that exploits physical machine topology, b) dispatching virtual servers in a manner that tends to reduce their movement within the partition's topology, and c) dispatching multiprocessor virtual servers in a manner that tends to keep the server's virtual CPUs close to one another within the partition's topology. z/VM HiperDispatch can also improve performance by automatically tuning the LPAR's use of its logical CPUs to try to use only those logical CPUs to which it appears PR/SM will be able to deliver a full physical processor's worth of computing power. This includes: a) sensing and forecasting key indicators of workload intensity and b) autonomically configuring the z/VM system not to use underpowered logical CPUs.
An article is available concerning z/VM HiperDispatch at: http://www.vm.ibm.com/perf/tips/zvmhd.html.
What is the performance improvement a z/VM customer might experience on the IBM z16?
The performance ratios a z/VM customer workload might experience when migrating to IBM z16 from older processors will vary. For the z/VM LSPR curves, a single workload having characteristics similar to the AVERAGE relative nest intensity workload was used. However, customer workloads have been shown to cover the full range from LOW to HIGH RNI workloads. Thus, it is suggested that you consider the full range of LSPR workloads.
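One way to consider the full range of LSPR workloads is to compute the capacity ratio between two processors separately for the LOW, AVERAGE, and HIGH categories and look at the spread. The Python sketch below does this; the function name and all ITRR values are hypothetical placeholders, not figures from the LSPR tables.

```python
CATEGORIES = ("LOW", "AVERAGE", "HIGH")


def capacity_ratio_range(old_itrrs, new_itrrs):
    """Return (lowest ratio, highest ratio, per-category ratios).

    Each argument maps an RNI category name to that processor's ITRR.
    """
    ratios = {cat: new_itrrs[cat] / old_itrrs[cat] for cat in CATEGORIES}
    return min(ratios.values()), max(ratios.values()), ratios


# Hypothetical ITRRs for an older and a newer processor model.
old_model = {"LOW": 2.0, "AVERAGE": 2.0, "HIGH": 2.0}
new_model = {"LOW": 2.4, "AVERAGE": 2.3, "HIGH": 2.2}

low, high, per_category = capacity_ratio_range(old_model, new_model)
print(f"expected ratio between {low:.2f} and {high:.2f}")
for cat in CATEGORIES:
    print(f"{cat}: {per_category[cat]:.2f}")
```

Planning against the whole range, rather than the AVERAGE value alone, allows for workloads whose RNI falls toward either end.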
Where can I read more about the performance of z/VM?
The z/VM Performance Resources Page, located at http://www.vm.ibm.com/perf/, contains information on z/VM performance.
What is the performance improvement a z/VSE customer might experience on the IBM z16?
The performance ratios that a z/VSE customer workload might experience when migrating to an IBM z16 are represented by the range of ratios for a comparable z/OS migration. For example, the published LSPR ratios between the z15 702 and the IBM z16 702 show a difference of approximately 9% to 12%. z/VSE workloads should expect this same range of performance for this migration. Consult the LSPR for other examples of moves to IBM z16.