Document Number: 310712-009US
This document describes the Vector Math Library (VML), which is designed to compute elementary functions on vector arguments. VML is an integral part of the Intel® Math Kernel Library, and the term VML is used here for simplicity when discussing this group of functions.
VML includes a set of highly optimized implementations of certain computationally expensive core mathematical functions (power, trigonometric, exponential, hyperbolic, etc.) that operate on vectors. VML may significantly improve performance for such applications as nonlinear software, computations of integrals, and many others.
Each vector function from VML (for each data format) can work in three modes: High Accuracy (HA), Low Accuracy (LA), and Enhanced Performance (EP). For many functions, using the LA version improves performance at the cost of a slight reduction in accuracy (1-2 least significant bits). In contrast to the LA flavor, the EP flavor further enhances performance at the cost of a significant reduction in accuracy: in both single and double precision, only about half of the bits of the floating-point mantissa are correct. Moreover, subtle argument paths for certain functions (for example, large arguments in trigonometric functions) may be calculated with even less accuracy.
Although the default accuracy is HA, LA is more than sufficient in most cases. For certain applications
that do not demand high accuracy (for example, media applications, some Monte Carlo simulations, etc.) you
may find the EP accuracy flavor sufficient. The accuracy flavor is controlled with the vmlSetMode function.
Please refer to the MKL Reference Manual for further details.
Accuracy behavior is processor specific, so results might differ slightly across processor families and even within a processor family, for example, between some processor models of the family, or between the 64-bit and 32-bit libraries. Results might also differ slightly from release to release. Nevertheless, these differences are within the specified error bounds.
Error and special value behaviors are identical for HA and LA functions and do not depend on the processor on which the software runs. Correct error and special value behavior is not guaranteed for the EP flavor.
This document refers to a more detailed description of the performance and accuracy properties of VML functions, which you can find at the product web page. Several issues are considered (performance, accuracy, special values processing) at two levels of detail (brief information for all functions in one table and more detailed information for every function on a separate page).
Performance issues: Performance numbers in the respective tables are shown for so-called "working" intervals of arguments. Performance may be different on other intervals. For example, it is quite expensive to compute trigonometric functions on "huge" arguments, because performance is sacrificed there to obtain the needed accuracy. The page for each function lists the working interval over which performance is measured. The same page contains graphs that show how performance depends on vector length. There are two extreme cases: so-called "short" and "long" vectors (a logarithmic scale is used to show both cases). For short vectors there are loop organization and initialization overheads. The cost of these overheads is amortized as the vector length increases, and for vectors longer than a few dozen elements the performance remains quite flat until the vector length exceeds the L2 cache size.
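The amortization of call overhead over vector length can be sketched with a very simple cost model: a fixed per-call overhead plus a constant per-element cost. The function name cost_per_element and both parameters are hypothetical, the units are arbitrary (for example, clock cycles), and the model deliberately ignores the cache effects that appear for long vectors.

```c
/* Hypothetical cost model: fixed per-call overhead plus a constant
 * cost for each element. Returns the average cost per element for a
 * vector of length n, which must be greater than zero. */
double cost_per_element(double call_overhead, double per_element, unsigned long n)
{
    return call_overhead / (double)n + per_element;
}
```

With an illustrative overhead of 100 units and a per-element cost of 10 units, the average cost per element is 110 for n = 1, 20 for n = 10, and 11 for n = 100, which matches the flattening described above once vectors are longer than a few dozen elements. These numbers are made up for the example and are not measurements.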
Data prefetching with the Intel® Pentium® III processor (explicit data prefetch in software) and the Pentium 4 processor (implicit data prefetch in hardware) greatly reduces the out-of-cache problem.
See the comprehensive table for performance data on all VML functions.
Accuracy issues: The design requirement for the HA functions is an error of less than 1.0 ulp, with all special values processed correctly. The measured error in the LA version does not exceed 4 ulp. For more details, see the accuracy table on the web, which lists ulp errors for all functions.
Special Values processing issues: Special values are processed in accordance with the C9X (C99) standard. For full lists of special values, see the corresponding tables for real and complex functions.
For more details on individual functions, see the list of VML functions at the product web page.
To ensure a correct display of this document, use the following recommended browser versions: Internet Explorer* 5.5 or higher (on Windows*), Netscape* 4.79, or Mozilla* 1.4 or higher (on Linux*).
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL(R) PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED,
BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN
INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS
ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES
RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER
INTELLECTUAL PROPERTY RIGHT.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel's Web Site.
Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. See http://www.intel.com/products/processor_number for details.
BunnyPeople, Celeron, Celeron Inside, Centrino, Centrino Atom, Centrino Inside, Centrino logo, Core Inside,
FlashFile, i960, InstantIP, Intel, Intel logo, Intel386, Intel486, IntelDX2, IntelDX4, IntelSX2, Intel Atom,
Intel Core, Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel NetBurst,
Intel NetMerge, Intel NetStructure, Intel SingleDriver, Intel SpeedStep, Intel StrataFlash, Intel Viiv,
Intel vPro, Intel XScale, Itanium, Itanium Inside, MCS, MMX, Oplus, OverDrive, PDCharm, Pentium, Pentium Inside,
skoool, Sound Mark, The Journey Inside, Viiv Inside, vPro Inside, VTune, Xeon, and Xeon Inside are trademarks of
Intel Corporation in the U.S. and other countries.
UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION
IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR.
*Other names and brands may be claimed as the property of others.
Copyright © 2000-2008, Intel Corporation. All rights reserved.