Monday, February 21, 2011

NEON, Tegra, Flash and WebM Explained

After our visit to MWC, we were able to chat with a few manufacturers about the SIMD instruction set "NEON" and its integration into current ARM chips. Some explanation. NEON (aka MPE, Media Processing Engine) is a SIMD instruction set (Single Instruction, Multiple Data) designed for ARM chips. Put simply, a SIMD instruction can be applied to several pieces of data at once, which significantly speeds up certain kinds of processing, such as video decoding.
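To make the "one instruction, several pieces of data" idea concrete, here is a minimal sketch in C. It assumes a GCC or Clang toolchain that defines `__ARM_NEON` when NEON is enabled; the function name `add_f32` is ours, for illustration only. On a NEON build each loop iteration adds four floats with a single instruction, and on any other target the same function falls back to a plain loop.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Adds two float arrays element by element: dst[i] = a[i] + b[i].
   (Illustrative helper, not taken from any real codec.) */
void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* NEON path: one vaddq_f32 performs four additions at once. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);      /* load 4 floats from a */
        float32x4_t vb = vld1q_f32(b + i);      /* load 4 floats from b */
        vst1q_f32(dst + i, vaddq_f32(va, vb));  /* store the 4 sums */
    }
#endif
    /* Scalar path: leftover elements, or everything when NEON is absent. */
    for (; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

Both paths produce the same sums, so the caller does not need to know which one was compiled in.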

One example of a SIMD instruction set is SSE, available in various versions in x86 processors for more than ten years. NEON is usually built into ARM chips of the ARMv7 type: the Cortex A8, the Cortex A9 and Qualcomm's Scorpion all use NEON. All? No. Tegra 2, NVIDIA's SoC, does not have NEON. In practice, NEON is widely used.
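Since not every ARMv7 chip has NEON, software that wants to use it generally has to check before taking the fast path. The sketch below shows one way this could look on 32-bit ARM Linux, using `getauxval(AT_HWCAP)` and the `HWCAP_NEON` bit. The names `cpu_has_neon`, `code_path` and `choose_code_path` are hypothetical, and on any other platform this sketch simply answers "no" because it does not know how to ask.

```c
#include <stdbool.h>

#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_NEON */
#endif

/* Reports whether the running CPU advertises NEON.
   Assumption: only 32-bit ARM Linux is handled in this sketch. */
bool cpu_has_neon(void)
{
#if defined(__linux__) && defined(__arm__)
    return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
    return false;  /* unknown platform: stay on the safe, generic path */
#endif
}

/* Hypothetical labels for the two builds of a routine. */
typedef enum { PATH_GENERIC, PATH_NEON } code_path;

/* Picks the NEON build only when the CPU supports it. */
code_path choose_code_path(bool has_neon)
{
    return has_neon ? PATH_NEON : PATH_GENERIC;
}
```

A program would typically call `choose_code_path(cpu_has_neon())` once at startup and keep the result, so a chip without NEON runs the generic version.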

The reasons are manifold, but one is interesting: SIMD instructions are used to work around the weaknesses of the Cortex A8's FPU. Indeed, while the Cortex A9 has a unit that performs reasonably well, the one in the widely used Cortex A8 is, by contrast, very slow. The main reason is that it is not pipelined, which severely limits the number of instructions executed per cycle.

In fact, NEON is used to decode video or for 2D work (SIMD is very effective for filtering an image), but also to replace the classical FPU. ARM explained to us, for example, that the decoding routines for WebM, Google's video format, contain over 10,000 lines of code using NEON. The Android implementation of Flash 10.1 also requires NEON to work.
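As an illustration of why SIMD suits image filtering, here is a small sketch of a 2-tap horizontal blur on one row of 8-bit pixels. It is our own toy example, not code from WebM or Flash, and the name `blur_row_u8` is made up. With NEON enabled, the `vrhaddq_u8` intrinsic averages 16 pairs of pixels per instruction; without it, the scalar loop computes the same rounded average one pixel at a time.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* 2-tap horizontal blur: dst[i] = (src[i] + src[i + 1] + 1) / 2
   for i = 0 .. n - 2. dst must have room for n - 1 pixels. */
void blur_row_u8(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i = 0;
    if (n < 2)
        return;  /* nothing to average */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* 16 output pixels per iteration; reads src[i] .. src[i + 16]. */
    for (; i + 17 <= n; i += 16) {
        uint8x16_t left  = vld1q_u8(src + i);
        uint8x16_t right = vld1q_u8(src + i + 1);
        vst1q_u8(dst + i, vrhaddq_u8(left, right));  /* rounded average */
    }
#endif
    /* Scalar path: leftover pixels, or the whole row without NEON. */
    for (; i + 1 < n; ++i)
        dst[i] = (uint8_t)((src[i] + src[i + 1] + 1) / 2);
}
```

Note that this filter is pure integer arithmetic: NEON is useful here even when no floating point is involved.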

NVIDIA's dilemma with Tegra 2 was the following: integrate NEON, which increases the size of the core by about 30%, or do without it and optimize some software for Tegra 2 and the Cortex A9's FPU, at the expense of compatibility. The second option was chosen, and NVIDIA worked with Adobe to run some parts of the decoding on the GPU in Flash for Android, but the company has apparently changed its mind since.

Indeed, "Tegra 3", which still bears the code name Kal-El, will be NEON-compatible; this is one of the small differences between the current chip and its successor. The size of the core is expected to increase, but performance should be better in some cases. Amusingly, we met with NVIDIA before the presentation of Kal-El, and the manager clearly indicated that the ratio of die space to benefit made NEON obsolete...

Still, NVIDIA's original argument stands: the company pushes developers to use the GPU's capabilities for video decoding and, for floating-point calculations, the Cortex A9's FPU is more efficient than NEON. But, as is often the case in computing, backward compatibility wins, even if it involves some sacrifices.

1 comment:

  1. Not everyone who uses NEON needs floating point. NEON is very good at integer calculations, and this is what my company uses it for. Tegra 2 has an extra A9, but we had to write a special version without NEON.

    I look forward to NEON returning on NVIDIA chips.
