After MMX, SSE (2, 3, 4, etc.) and AVX, here comes AVX2. This extension of the x86 instruction set will arrive with Haswell, the processor generation that follows Ivy Bridge. Its new features include a "homecoming" for SIMD (single instruction, multiple data): the instructions can now work on integers across the full 256 bits. It is a return to basics, because the first SIMD instruction set on x86 was MMX, which allowed exactly this type of manipulation, while the sets that followed focused primarily on floating-point data.
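To make "integers on 256 bits" concrete, here is a minimal portable sketch of what a packed integer add does, assuming eight 32-bit lanes. The names `Vec256i32` and `packed_add` are hypothetical and only model the behavior in scalar code; a real program would rely on compiler intrinsics or auto-vectorization instead.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of a 256-bit register holding eight 32-bit integer lanes.
// Hypothetical type name, used for illustration only.
using Vec256i32 = std::array<std::uint32_t, 8>;  // 8 lanes x 32 bits = 256 bits

// Models a packed add: every lane is added independently of the others.
Vec256i32 packed_add(const Vec256i32& a, const Vec256i32& b) {
    Vec256i32 r{};
    for (std::size_t lane = 0; lane < r.size(); ++lane) {
        // Unsigned arithmetic wraps modulo 2^32; no carry crosses into the next lane.
        r[lane] = a[lane] + b[lane];
    }
    return r;
}
```

The point of the hardware version is that the whole loop collapses into one instruction; the same 256 bits could equally be viewed as sixteen 16-bit or thirty-two 8-bit lanes.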
There are also bit-manipulation instructions, which are useful in cryptography, especially for algorithms other than AES, which already has its own dedicated instructions. The most interesting addition is FMA (Fused Multiply-Add), which performs a multiplication and an addition in a single instruction, of the form A = A x B + C.
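As a rough illustration of what "fused" means, the sketch below compares a separate multiply and add with the standard library's `std::fma`, which computes a x b + c with a single rounding. The function names `mul_then_add` and `fused_mul_add` are hypothetical; whether `std::fma` maps to a hardware FMA instruction depends on the compiler and target, and it may fall back to a software routine.

```cpp
#include <cmath>

// Unfused: the product is rounded to double, then the sum is rounded again.
double mul_then_add(double a, double b, double c) {
    // volatile keeps the compiler from contracting the two steps into one FMA.
    volatile double product = a * b;
    return product + c;
}

// Fused: std::fma evaluates a * b + c exactly and rounds only once at the end.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}
```

With a = 1 + 2^-27, b = 1 - 2^-27 and c = -1, the exact product is 1 - 2^-54. The unfused version rounds that product to 1.0 and returns 0, while the fused version keeps the low bits and returns -2^-54. Besides saving an instruction, FMA can therefore also be more accurate.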
AMD is expected to be the first to integrate this kind of instruction, using a different approach from Intel's, but Intel's implementation will surely be the most widely used, as is usually the case. That leaves the classic problem with instruction sets like this one: support in the real world. Indeed, while demonstrations show that the gains can be huge, "real" software is rarely optimized for the latest instruction sets.
The reason is simple: the installed base of a new instruction set is generally small, so developers use the instructions that are available on most processors. This explains why the majority of software is currently optimized, at best, for SSE2, which is present in the majority of processors on the market (including Atom).
Of course, nothing prevents developers from supporting other instruction sets by detecting the CPU type, but this usually requires extra work, since several versions of each algorithm to be optimized must be written.
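The detect-then-choose approach can be sketched as follows. This is only an outline under stated assumptions: it relies on the GCC/Clang built-ins `__builtin_cpu_init` and `__builtin_cpu_supports` on x86 and falls back to the baseline everywhere else. The kernel names are hypothetical, and the "AVX2" variant is a placeholder that reuses the portable loop rather than real vector code.

```cpp
#include <cstddef>
#include <cstdint>

// Baseline version: runs on any processor.
std::int64_t sum_baseline(const std::int32_t* data, std::size_t n) {
    std::int64_t total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Placeholder for a hand-optimized variant. A real build would implement this
// with AVX2 intrinsics in a translation unit compiled with AVX2 enabled.
std::int64_t sum_avx2_placeholder(const std::int32_t* data, std::size_t n) {
    return sum_baseline(data, n);
}

// Assumption: GCC or Clang on x86. Other toolchains report "no AVX2" here.
bool cpu_has_avx2() {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;
#endif
}

using SumFn = std::int64_t (*)(const std::int32_t*, std::size_t);

// Picks one implementation at run time; callers only ever see the pointer.
SumFn select_sum() {
    return cpu_has_avx2() ? sum_avx2_placeholder : sum_baseline;
}
```

Calling `select_sum()` once at startup and storing the result keeps the detection cost out of the hot path. The extra work mentioned above is visible even in this toy: every optimized kernel needs a second implementation, plus tests showing that both return the same result.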