Opcode/Instruction | Op/En | 64/32 bit Mode Support | CPUID Feature Flag | Description |
---|---|---|---|---|
F3 0F 6F /r MOVDQU xmm1, xmm2/m128 | A | V/V | SSE2 | Move unaligned packed integer values from xmm2/m128 to xmm1. |
F3 0F 7F /r MOVDQU xmm2/m128, xmm1 | B | V/V | SSE2 | Move unaligned packed integer values from xmm1 to xmm2/m128. |
VEX.128.F3.0F.WIG 6F /r VMOVDQU xmm1, xmm2/m128 | A | V/V | AVX | Move unaligned packed integer values from xmm2/m128 to xmm1. |
VEX.128.F3.0F.WIG 7F /r VMOVDQU xmm2/m128, xmm1 | B | V/V | AVX | Move unaligned packed integer values from xmm1 to xmm2/m128. |
VEX.256.F3.0F.WIG 6F /r VMOVDQU ymm1, ymm2/m256 | A | V/V | AVX | Move unaligned packed integer values from ymm2/m256 to ymm1. |
VEX.256.F3.0F.WIG 7F /r VMOVDQU ymm2/m256, ymm1 | B | V/V | AVX | Move unaligned packed integer values from ymm1 to ymm2/m256. |
EVEX.128.F2.0F.W0 6F /r VMOVDQU8 xmm1 {k1}{z}, xmm2/m128 | C | V/V | AVX512VL AVX512BW | Move unaligned packed byte integer values from xmm2/m128 to xmm1 using writemask k1. |
EVEX.256.F2.0F.W0 6F /r VMOVDQU8 ymm1 {k1}{z}, ymm2/m256 | C | V/V | AVX512VL AVX512BW | Move unaligned packed byte integer values from ymm2/m256 to ymm1 using writemask k1. |
EVEX.512.F2.0F.W0 6F /r VMOVDQU8 zmm1 {k1}{z}, zmm2/m512 | C | V/V | AVX512BW | Move unaligned packed byte integer values from zmm2/m512 to zmm1 using writemask k1. |
EVEX.128.F2.0F.W0 7F /r VMOVDQU8 xmm2/m128 {k1}{z}, xmm1 | D | V/V | AVX512VL AVX512BW | Move unaligned packed byte integer values from xmm1 to xmm2/m128 using writemask k1. |
EVEX.256.F2.0F.W0 7F /r VMOVDQU8 ymm2/m256 {k1}{z}, ymm1 | D | V/V | AVX512VL AVX512BW | Move unaligned packed byte integer values from ymm1 to ymm2/m256 using writemask k1. |
EVEX.512.F2.0F.W0 7F /r VMOVDQU8 zmm2/m512 {k1}{z}, zmm1 | D | V/V | AVX512BW | Move unaligned packed byte integer values from zmm1 to zmm2/m512 using writemask k1. |
EVEX.128.F2.0F.W1 6F /r VMOVDQU16 xmm1 {k1}{z}, xmm2/m128 | C | V/V | AVX512VL AVX512BW | Move unaligned packed word integer values from xmm2/m128 to xmm1 using writemask k1. |
EVEX.256.F2.0F.W1 6F /r VMOVDQU16 ymm1 {k1}{z}, ymm2/m256 | C | V/V | AVX512VL AVX512BW | Move unaligned packed word integer values from ymm2/m256 to ymm1 using writemask k1. |
EVEX.512.F2.0F.W1 6F /r VMOVDQU16 zmm1 {k1}{z}, zmm2/m512 | C | V/V | AVX512BW | Move unaligned packed word integer values from zmm2/m512 to zmm1 using writemask k1. |
EVEX.128.F2.0F.W1 7F /r VMOVDQU16 xmm2/m128 {k1}{z}, xmm1 | D | V/V | AVX512VL AVX512BW | Move unaligned packed word integer values from xmm1 to xmm2/m128 using writemask k1. |
EVEX.256.F2.0F.W1 7F /r VMOVDQU16 ymm2/m256 {k1}{z}, ymm1 | D | V/V | AVX512VL AVX512BW | Move unaligned packed word integer values from ymm1 to ymm2/m256 using writemask k1. |
EVEX.512.F2.0F.W1 7F /r VMOVDQU16 zmm2/m512 {k1}{z}, zmm1 | D | V/V | AVX512BW | Move unaligned packed word integer values from zmm1 to zmm2/m512 using writemask k1. |
EVEX.128.F3.0F.W0 6F /r VMOVDQU32 xmm1 {k1}{z}, xmm2/m128 | C | V/V | AVX512VL AVX512F | Move unaligned packed doubleword integer values from xmm2/m128 to xmm1 using writemask k1. |
EVEX.256.F3.0F.W0 6F /r VMOVDQU32 ymm1 {k1}{z}, ymm2/m256 | C | V/V | AVX512VL AVX512F | Move unaligned packed doubleword integer values from ymm2/m256 to ymm1 using writemask k1. |
EVEX.512.F3.0F.W0 6F /r VMOVDQU32 zmm1 {k1}{z}, zmm2/m512 | C | V/V | AVX512F | Move unaligned packed doubleword integer values from zmm2/m512 to zmm1 using writemask k1. |
EVEX.128.F3.0F.W0 7F /r VMOVDQU32 xmm2/m128 {k1}{z}, xmm1 | D | V/V | AVX512VL AVX512F | Move unaligned packed doubleword integer values from xmm1 to xmm2/m128 using writemask k1. |
EVEX.256.F3.0F.W0 7F /r VMOVDQU32 ymm2/m256 {k1}{z}, ymm1 | D | V/V | AVX512VL AVX512F | Move unaligned packed doubleword integer values from ymm1 to ymm2/m256 using writemask k1. |
EVEX.512.F3.0F.W0 7F /r VMOVDQU32 zmm2/m512 {k1}{z}, zmm1 | D | V/V | AVX512F | Move unaligned packed doubleword integer values from zmm1 to zmm2/m512 using writemask k1. |
EVEX.128.F3.0F.W1 6F /r VMOVDQU64 xmm1 {k1}{z}, xmm2/m128 | C | V/V | AVX512VL AVX512F | Move unaligned packed quadword integer values from xmm2/m128 to xmm1 using writemask k1. |
EVEX.256.F3.0F.W1 6F /r VMOVDQU64 ymm1 {k1}{z}, ymm2/m256 | C | V/V | AVX512VL AVX512F | Move unaligned packed quadword integer values from ymm2/m256 to ymm1 using writemask k1. |
EVEX.512.F3.0F.W1 6F /r VMOVDQU64 zmm1 {k1}{z}, zmm2/m512 | C | V/V | AVX512F | Move unaligned packed quadword integer values from zmm2/m512 to zmm1 using writemask k1. |
EVEX.128.F3.0F.W1 7F /r VMOVDQU64 xmm2/m128 {k1}{z}, xmm1 | D | V/V | AVX512VL AVX512F | Move unaligned packed quadword integer values from xmm1 to xmm2/m128 using writemask k1. |
EVEX.256.F3.0F.W1 7F /r VMOVDQU64 ymm2/m256 {k1}{z}, ymm1 | D | V/V | AVX512VL AVX512F | Move unaligned packed quadword integer values from ymm1 to ymm2/m256 using writemask k1. |
EVEX.512.F3.0F.W1 7F /r VMOVDQU64 zmm2/m512 {k1}{z}, zmm1 | D | V/V | AVX512F | Move unaligned packed quadword integer values from zmm1 to zmm2/m512 using writemask k1. |
Op/En | Tuple Type | Operand 1 | Operand 2 | Operand 3 | Operand 4 |
A | NA | ModRM:reg (w) | ModRM:r/m (r) | NA | NA |
B | NA | ModRM:r/m (w) | ModRM:reg (r) | NA | NA |
C | Full Mem | ModRM:reg (w) | ModRM:r/m (r) | NA | NA |
D | Full Mem | ModRM:r/m (w) | ModRM:reg (r) | NA | NA |
Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b; otherwise the instruction will #UD.
EVEX encoded versions:
Moves 128, 256 or 512 bits of packed byte/word/doubleword/quadword integer values from the source operand (the second operand) to the destination operand (first operand). This instruction can be used to load a vector register from a memory location, to store the contents of a vector register into a memory location, or to move data between two vector registers.
The destination operand is updated at 8-bit (VMOVDQU8), 16-bit (VMOVDQU16), 32-bit (VMOVDQU32), or 64-bit (VMOVDQU64) granularity according to the writemask.
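As an illustration of the element-granularity writemask behavior described above, here is a minimal C sketch using the AVX512BW intrinsics listed later in this page (compile with AVX-512BW support; the mask value and variable names are illustrative only):

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    unsigned char src[64];
    for (int i = 0; i < 64; i++) src[i] = (unsigned char)(i + 1);

    /* Writemask selecting only the low 16 byte elements. */
    __mmask64 k = 0xFFFF;

    /* VMOVDQU8 zmm1 {k1}{z}, m512: zeroing-masking load; elements not
       selected by k are forced to 0 in the destination register. */
    __m512i z = _mm512_maskz_loadu_epi8(k, src);

    unsigned char out[64];
    _mm512_storeu_si512(out, z);          /* plain unaligned 512-bit store */

    printf("%d %d\n", out[15], out[16]);  /* prints "16 0" */
    return 0;
}
```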
VEX.256 encoded version:
Moves 256 bits of packed integer values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load a YMM register from a 256-bit memory location, to store the contents of a YMM register into a 256-bit memory location, or to move data between two YMM registers.
Bits (MAXVL-1:256) of the destination register are zeroed.
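A minimal sketch of the VEX.256 forms in terms of the unaligned load/store intrinsics listed below (the helper name copy32_unaligned is illustrative, not part of any API):

```c
#include <immintrin.h>

/* Copy 32 bytes between possibly unaligned buffers; VMOVDQU accepts any
   alignment, so no #GP is raised even for odd addresses. */
static inline void copy32_unaligned(void *dst, const void *src) {
    __m256i v = _mm256_loadu_si256((const __m256i *)src); /* VMOVDQU ymm, m256 */
    _mm256_storeu_si256((__m256i *)dst, v);                /* VMOVDQU m256, ymm */
}
```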
128-bit versions:
Moves 128 bits of packed integer values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM register from a 128-bit memory location, to store the contents of an XMM register into a 128-bit memory location, or to move data between two XMM registers.
128-bit Legacy SSE version: Bits (MAXVL-1:128) of the corresponding destination register remain unchanged.
When the source or destination operand is a memory operand, the operand may be unaligned to any alignment without causing a general-protection exception (#GP) to be generated.
VEX.128 encoded version: Bits (MAXVL-1:128) of the destination register are zeroed.
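For the 128-bit forms, a short sketch assuming SSE2 and the intrinsics listed below (the helper name load16_at is illustrative only):

```c
#include <emmintrin.h>  /* SSE2 */
#include <stddef.h>

/* Load 16 bytes starting at an arbitrary (possibly odd) offset.
   MOVDQU does not require 16-byte alignment, so no #GP is raised. */
static inline __m128i load16_at(const unsigned char *buf, size_t off) {
    return _mm_loadu_si128((const __m128i *)(buf + off));  /* MOVDQU xmm, m128 */
}
```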
VMOVDQU8 (EVEX encoded versions, register-copy form)
(KL, VL) = (16, 128), (32, 256), (64, 512)
FOR j←0 TO KL-1
    i←j * 8
    IF k1[j] OR *no writemask*
        THEN DEST[i+7:i]←SRC[i+7:i]
        ELSE IF *merging-masking*
            THEN *DEST[i+7:i] remains unchanged*
            ELSE DEST[i+7:i]←0 ; zeroing-masking
        FI
    FI;
ENDFOR
DEST[MAXVL-1:VL] ← 0
VMOVDQU8 (EVEX encoded versions, store-form)
(KL, VL) = (16, 128), (32, 256), (64, 512)
FOR j←0 TO KL-1
    i←j * 8
    IF k1[j] OR *no writemask*
        THEN DEST[i+7:i]←SRC[i+7:i]
        ELSE *DEST[i+7:i] remains unchanged* ; merging-masking
    FI;
ENDFOR;
VMOVDQU8 (EVEX encoded versions, load-form)
(KL, VL) = (16, 128), (32, 256), (64, 512)
FOR j←0 TO KL-1
    i←j * 8
    IF k1[j] OR *no writemask*
        THEN DEST[i+7:i]←SRC[i+7:i]
        ELSE IF *merging-masking*
            THEN *DEST[i+7:i] remains unchanged*
            ELSE DEST[i+7:i]←0 ; zeroing-masking
        FI
    FI;
ENDFOR
DEST[MAXVL-1:VL] ← 0
VMOVDQU16 (EVEX encoded versions, register-copy form)
(KL, VL) = (8, 128), (16, 256), (32, 512)
FOR j←0 TO KL-1
    i←j * 16
    IF k1[j] OR *no writemask*
        THEN DEST[i+15:i]←SRC[i+15:i]
        ELSE IF *merging-masking*
            THEN *DEST[i+15:i] remains unchanged*
            ELSE DEST[i+15:i]←0 ; zeroing-masking
        FI
    FI;
ENDFOR
DEST[MAXVL-1:VL] ← 0
VMOVDQU16 (EVEX encoded versions, store-form)
(KL, VL) = (8, 128), (16, 256), (32, 512)
FOR j←0 TO KL-1
    i←j * 16
    IF k1[j] OR *no writemask*
        THEN DEST[i+15:i]←SRC[i+15:i]
        ELSE *DEST[i+15:i] remains unchanged* ; merging-masking
    FI;
ENDFOR;
VMOVDQU16 (EVEX encoded versions, load-form)
(KL, VL) = (8, 128), (16, 256), (32, 512)
FOR j←0 TO KL-1
    i←j * 16
    IF k1[j] OR *no writemask*
        THEN DEST[i+15:i]←SRC[i+15:i]
        ELSE IF *merging-masking*
            THEN *DEST[i+15:i] remains unchanged*
            ELSE DEST[i+15:i]←0 ; zeroing-masking
        FI
    FI;
ENDFOR
DEST[MAXVL-1:VL] ← 0
VMOVDQU32 (EVEX encoded versions, register-copy form)
(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j←0 TO KL-1
    i←j * 32
    IF k1[j] OR *no writemask*
        THEN DEST[i+31:i]←SRC[i+31:i]
        ELSE IF *merging-masking*
            THEN *DEST[i+31:i] remains unchanged*
            ELSE DEST[i+31:i]←0 ; zeroing-masking
        FI
    FI;
ENDFOR
DEST[MAXVL-1:VL] ← 0
VMOVDQU32 (EVEX encoded versions, store-form)
(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j←0 TO KL-1
    i←j * 32
    IF k1[j] OR *no writemask*
        THEN DEST[i+31:i]←SRC[i+31:i]
        ELSE *DEST[i+31:i] remains unchanged* ; merging-masking
    FI;
ENDFOR;
VMOVDQU32 (EVEX encoded versions, load-form)
(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j←0 TO KL-1
    i←j * 32
    IF k1[j] OR *no writemask*
        THEN DEST[i+31:i]←SRC[i+31:i]
        ELSE IF *merging-masking*
            THEN *DEST[i+31:i] remains unchanged*
            ELSE DEST[i+31:i]←0 ; zeroing-masking
        FI
    FI;
ENDFOR
DEST[MAXVL-1:VL] ← 0
VMOVDQU64 (EVEX encoded versions, register-copy form)
(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j←0 TO KL-1
    i←j * 64
    IF k1[j] OR *no writemask*
        THEN DEST[i+63:i]←SRC[i+63:i]
        ELSE IF *merging-masking*
            THEN *DEST[i+63:i] remains unchanged*
            ELSE DEST[i+63:i]←0 ; zeroing-masking
        FI
    FI;
ENDFOR
DEST[MAXVL-1:VL] ← 0
VMOVDQU64 (EVEX encoded versions, store-form)
(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j←0 TO KL-1
    i←j * 64
    IF k1[j] OR *no writemask*
        THEN DEST[i+63:i]←SRC[i+63:i]
        ELSE *DEST[i+63:i] remains unchanged* ; merging-masking
    FI;
ENDFOR;
VMOVDQU64 (EVEX encoded versions, load-form)
(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j←0 TO KL-1
    i←j * 64
    IF k1[j] OR *no writemask*
        THEN DEST[i+63:i]←SRC[i+63:i]
        ELSE IF *merging-masking*
            THEN *DEST[i+63:i] remains unchanged*
            ELSE DEST[i+63:i]←0 ; zeroing-masking
        FI
    FI;
ENDFOR
DEST[MAXVL-1:VL] ← 0
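For reference, the masked byte-granularity load above can be modeled in plain C roughly as follows (a sketch of the semantics, not the processor's implementation; the function name and the zeroing flag are illustrative):

```c
#include <stdint.h>

/* Scalar model of VMOVDQU8 zmm1 {k1}{z}, zmm2/m512 for VL = 512 (KL = 64).
   'zeroing' selects zeroing-masking; otherwise merging-masking applies. */
static void vmovdqu8_load_model(uint8_t dest[64], const uint8_t src[64],
                                uint64_t k1, int zeroing) {
    for (int j = 0; j < 64; j++) {
        if (k1 & (1ULL << j))
            dest[j] = src[j];      /* element selected by writemask k1 */
        else if (zeroing)
            dest[j] = 0;           /* zeroing-masking */
        /* else: merging-masking, dest[j] remains unchanged */
    }
    /* DEST[MAXVL-1:VL] <- 0 is a no-op here because VL = MAXVL = 512. */
}
```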
VMOVDQU (VEX.256 encoded version, load- and register-copy form)
DEST[255:0] ← SRC[255:0]
DEST[MAXVL-1:256] ← 0

VMOVDQU (VEX.256 encoded version, store-form)
DEST[255:0] ← SRC[255:0]

VMOVDQU (VEX.128 encoded version)
DEST[127:0] ← SRC[127:0]
DEST[MAXVL-1:128] ← 0

MOVDQU (128-bit load- and register-copy form, Legacy SSE version)
DEST[127:0] ← SRC[127:0]
DEST[MAXVL-1:128] (Unmodified)

MOVDQU (128-bit store-form version)
DEST[127:0] ← SRC[127:0]
VMOVDQU16 __m512i _mm512_mask_loadu_epi16(__m512i s, __mmask32 k, void * sa);
VMOVDQU16 __m512i _mm512_maskz_loadu_epi16( __mmask32 k, void * sa);
VMOVDQU16 void _mm512_mask_storeu_epi16(void * d, __mmask32 k, __m512i a);
VMOVDQU16 __m256i _mm256_mask_loadu_epi16(__m256i s, __mmask16 k, void * sa);
VMOVDQU16 __m256i _mm256_maskz_loadu_epi16( __mmask16 k, void * sa);
VMOVDQU16 void _mm256_mask_storeu_epi16(void * d, __mmask16 k, __m256i a);
VMOVDQU16 __m128i _mm_mask_loadu_epi16(__m128i s, __mmask8 k, void * sa);
VMOVDQU16 __m128i _mm_maskz_loadu_epi16( __mmask8 k, void * sa);
VMOVDQU16 void _mm_mask_storeu_epi16(void * d, __mmask8 k, __m128i a);
VMOVDQU32 __m512i _mm512_loadu_epi32( void * sa);
VMOVDQU32 __m512i _mm512_mask_loadu_epi32(__m512i s, __mmask16 k, void * sa);
VMOVDQU32 __m512i _mm512_maskz_loadu_epi32( __mmask16 k, void * sa);
VMOVDQU32 void _mm512_storeu_epi32(void * d, __m512i a);
VMOVDQU32 void _mm512_mask_storeu_epi32(void * d, __mmask16 k, __m512i a);
VMOVDQU32 __m256i _mm256_mask_loadu_epi32(__m256i s, __mmask8 k, void * sa);
VMOVDQU32 __m256i _mm256_maskz_loadu_epi32( __mmask8 k, void * sa);
VMOVDQU32 void _mm256_storeu_epi32(void * d, __m256i a);
VMOVDQU32 void _mm256_mask_storeu_epi32(void * d, __mmask8 k, __m256i a);
VMOVDQU32 __m128i _mm_mask_loadu_epi32(__m128i s, __mmask8 k, void * sa);
VMOVDQU32 __m128i _mm_maskz_loadu_epi32( __mmask8 k, void * sa);
VMOVDQU32 void _mm_storeu_epi32(void * d, __m128i a);
VMOVDQU32 void _mm_mask_storeu_epi32(void * d, __mmask8 k, __m128i a);
VMOVDQU64 __m512i _mm512_loadu_epi64( void * sa);
VMOVDQU64 __m512i _mm512_mask_loadu_epi64(__m512i s, __mmask8 k, void * sa);
VMOVDQU64 __m512i _mm512_maskz_loadu_epi64( __mmask8 k, void * sa);
VMOVDQU64 void _mm512_storeu_epi64(void * d, __m512i a);
VMOVDQU64 void _mm512_mask_storeu_epi64(void * d, __mmask8 k, __m512i a);
VMOVDQU64 __m256i _mm256_mask_loadu_epi64(__m256i s, __mmask8 k, void * sa);
VMOVDQU64 __m256i _mm256_maskz_loadu_epi64( __mmask8 k, void * sa);
VMOVDQU64 void _mm256_storeu_epi64(void * d, __m256i a);
VMOVDQU64 void _mm256_mask_storeu_epi64(void * d, __mmask8 k, __m256i a);
VMOVDQU64 __m128i _mm_mask_loadu_epi64(__m128i s, __mmask8 k, void * sa);
VMOVDQU64 __m128i _mm_maskz_loadu_epi64( __mmask8 k, void * sa);
VMOVDQU64 void _mm_storeu_epi64(void * d, __m128i a);
VMOVDQU64 void _mm_mask_storeu_epi64(void * d, __mmask8 k, __m128i a);
VMOVDQU8 __m512i _mm512_mask_loadu_epi8(__m512i s, __mmask64 k, void * sa);
VMOVDQU8 __m512i _mm512_maskz_loadu_epi8( __mmask64 k, void * sa);
VMOVDQU8 void _mm512_mask_storeu_epi8(void * d, __mmask64 k, __m512i a);
VMOVDQU8 __m256i _mm256_mask_loadu_epi8(__m256i s, __mmask32 k, void * sa);
VMOVDQU8 __m256i _mm256_maskz_loadu_epi8( __mmask32 k, void * sa);
VMOVDQU8 void _mm256_mask_storeu_epi8(void * d, __mmask32 k, __m256i a);
VMOVDQU8 __m128i _mm_mask_loadu_epi8(__m128i s, __mmask16 k, void * sa);
VMOVDQU8 __m128i _mm_maskz_loadu_epi8( __mmask16 k, void * sa);
VMOVDQU8 void _mm_mask_storeu_epi8(void * d, __mmask16 k, __m128i a);
MOVDQU __m256i _mm256_loadu_si256 (__m256i * p);
MOVDQU void _mm256_storeu_si256(__m256i *p, __m256i a);
MOVDQU __m128i _mm_loadu_si128 (__m128i * p);
MOVDQU void _mm_storeu_si128(__m128i *p, __m128i a);
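As a usage note, the masked forms above are commonly used to handle buffer tails in one masked load/store pair; a sketch assuming AVX-512BW (the helper name copy_tail is illustrative only):

```c
#include <immintrin.h>
#include <stddef.h>

/* Copy the trailing n (< 64) bytes of a buffer with one masked load/store
   pair (VMOVDQU8 with writemask); masked-out elements are neither read
   nor written. */
static void copy_tail(unsigned char *dst, const unsigned char *src, size_t n) {
    __mmask64 k = (n >= 64) ? ~0ULL : ((1ULL << n) - 1);
    __m512i v = _mm512_maskz_loadu_epi8(k, src);  /* masked-off bytes become 0 */
    _mm512_mask_storeu_epi8(dst, k, v);           /* only selected bytes stored */
}
```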
SIMD Floating-Point Exceptions: None
Other Exceptions:
Non-EVEX-encoded instruction, see Exceptions Type 4.
EVEX-encoded instruction, see Exceptions Type E4.nb.
Additionally:
#UD | If EVEX.vvvv != 1111B or VEX.vvvv != 1111B. |