Exploiting Register Structure
•The Intel 32-bit architecture (IA-32) has 8 additional 64-bit MMX registers (MM0-MM7) and 8 128-bit XMM registers (XMM0-XMM7).
•Each XMM register can hold 4 single-precision or 2 double-precision floating-point numbers.
•A single instruction like
•   addps xmm2, xmm1
•adds the 4 single-precision numbers in xmm1 to those in xmm2, element by element, and stores the results in xmm2.
•In principle this can give a speedup of up to 4 for single precision and 2 for double precision.
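
The packed addition described above can be sketched in C using the compiler's SSE intrinsics rather than hand-written assembly. This is a minimal illustration, not a tuned implementation: the helper names add4_floats and add2_doubles are hypothetical, and the code assumes a compiler that defines __SSE2__ when SSE2 is enabled (as GCC and Clang do); otherwise it falls back to a plain scalar loop.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the SSE (float) ones */
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for 4 single-precision values. */
void add4_floats(const float *a, const float *b, float *out)
{
    assert(a != NULL && b != NULL && out != NULL);
#if defined(__SSE2__)
    __m128 va = _mm_loadu_ps(a);             /* load 4 floats into one XMM register */
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));  /* one packed add handles all 4 lanes */
#else
    for (int i = 0; i < 4; ++i)              /* scalar fallback: 4 separate adds */
        out[i] = a[i] + b[i];
#endif
}

/* Hypothetical helper: out[i] = a[i] + b[i] for 2 double-precision values. */
void add2_doubles(const double *a, const double *b, double *out)
{
    assert(a != NULL && b != NULL && out != NULL);
#if defined(__SSE2__)
    __m128d va = _mm_loadu_pd(a);            /* load 2 doubles into one XMM register */
    __m128d vb = _mm_loadu_pd(b);
    _mm_storeu_pd(out, _mm_add_pd(va, vb));  /* one packed add handles both lanes */
#else
    for (int i = 0; i < 2; ++i)              /* scalar fallback: 2 separate adds */
        out[i] = a[i] + b[i];
#endif
}
```

Calling add4_floats on {1, 2, 3, 4} and {10, 20, 30, 40} fills out with {11, 22, 33, 44}. The unaligned load and store variants are used so the sketch works for any array address; aligned variants exist but require 16-byte aligned data.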