Add support for the missing SIMD levels from the 53rd Intel ISA Ext. Guide: AMX-FP16, AMX-COMPLEX, AVX-VNNI, AVX-VNNI-INT16 #88

Open: wants to merge 12 commits into base: master

@InstLatx64 commented Jul 14, 2024

  • AMX-FP16 support, according to Intel® Architecture Instruction Set Extensions and Future Features 319433-053, p. 158

  • AMX-COMPLEX support, according to Intel® Architecture Instruction Set Extensions and Future Features 319433-053, p. 155

  • AVX-VNNI support, according to Intel SDM 325462-084US, p. 2413-2419

  • AVX-VNNI-INT16 support, according to Intel® Architecture Instruction Set Extensions and Future Features 319433-053, p. 123-124

  • New test files:

    • avx-ne-convert.asm, avx-ne-convert-64.asm
    • avx-vnni-int8.asm, avx-vnni-int8-64.asm
    • avx-vnni.asm, avx-vnni-64.asm
    • avx-ifma.asm, avx-ifma-64.asm
    • avx512bf16.asm, avx512bf16-64.asm
    • avx512vp2intersect.asm, avx512vp2intersect-64.asm
    • avx-vnni-int16.asm, avx-vnni-int16-64.asm
  • fix: AMX sample

    • Similar to GATHER instructions, AMX-INT8 and AMX-BF16 instructions cannot have the same operand more than once (Intel SDM 325462-084US, p. 585)
  • fix: AVX-NE-CONVERT instructions

    • VBCSTNEBF16PS -> VBCSTNEBF162PS
    • the VCVTNEPS2BF16 destination is always xmm
    • an explicit size operator is required only for VCVTNEPS2BF16
    • no AVX512 EVEX version except for VCVTNEPS2BF16, so LATEVEX is needed only for that instruction
  • fix: AVX-VNNI-INT8 instructions

    • no AVX512 EVEX version, so no LATEVEX
    • unsized operands allowed
  • fix: AVX-IFMA unsized operands

    • unsized operands allowed
  • fix: AVX512_BF16 support

    • finished according to Intel SDM 325462-084US, p. 1987
  • fix: AVX512_VP2INTERSECT support

    • finished according to Intel SDM 325462-084US, p. 2368
  • fix: AVX512_FP16 FMA mnemonics

    • according to Intel SDM 325462-084US, p. 2155, 2172, 2204, 2219
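
The AMX sample fix above rests on the rule that the tile operands of AMX-INT8 and AMX-BF16 instructions must all differ. A minimal Python sketch of that check follows. The function name `tiles_are_distinct` and the representation of operands as register-name strings are assumptions made for illustration; they are not part of this PR or of the assembler's sources.

```python
def tiles_are_distinct(operands):
    """Return True when no tile register appears more than once.

    `operands` is a list of register names such as ["tmm0", "tmm1", "tmm2"].
    Hypothetical helper: it only models the "no repeated operand" rule
    described in the PR text, nothing else about AMX encoding.
    """
    # Normalize case and whitespace so "TMM0" and " tmm0" compare equal.
    normalized = [op.strip().lower() for op in operands]
    # A set drops duplicates, so equal lengths mean every operand is unique.
    return len(set(normalized)) == len(normalized)
```

With this sketch, a sample line using `tmm0, tmm1, tmm2` passes the check, while one that reuses `tmm0` for two operands does not.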

Checked with XED version: [v2024.04.01]
part of the Intel Software Development Emulator 9.38.0
https://www.intel.com/content/www/us/en/developer/articles/tool/software-development-emulator.html
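
The LATEVEX decisions described in this PR follow one rule: the flag is set when a VEX-encoded instruction also has an AVX512 EVEX form. The sketch below restates those decisions as a small Python lookup. The table, the set, and the function name `needs_latevex` are hypothetical and exist only to summarize the PR text; they do not reflect how the assembler stores instruction flags.

```python
# Whether each extension has an AVX512 EVEX counterpart, as stated in this PR.
# Hypothetical summary table, not taken from the assembler's sources.
HAS_EVEX_FORM = {
    "AVX-VNNI": True,         # VEX-encoded version of AVX512_VNNI
    "AVX-IFMA": True,
    "AVX-VNNI-INT8": False,
    "AVX-VNNI-INT16": False,
}

# AVX-NE-CONVERT has no EVEX forms except for this one instruction.
NE_CONVERT_WITH_EVEX = {"VCVTNEPS2BF16"}


def needs_latevex(extension, mnemonic=None):
    """Return True when the PR text says the LATEVEX flag is required."""
    if extension == "AVX-NE-CONVERT":
        # The decision is per instruction for this extension.
        return mnemonic is not None and mnemonic.upper() in NE_CONVERT_WITH_EVEX
    return HAS_EVEX_FORM[extension]
```

Under these assumptions, `needs_latevex("AVX-NE-CONVERT", "VCVTNEPS2BF16")` holds, while the other AVX-NE-CONVERT mnemonics and the two AVX-VNNI-INT extensions do not need the flag.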

-- Similar to GATHER instructions, AMX-INT8 and AMX-BF16 instructions cannot have the same operand more than once
Intel SDM 325462-084US, p. 585
Checked with XED version: [v2024.04.01]
-- According to Intel® Architecture Instruction Set Extensions and Future Features 319433-053, p. 158
Checked with XED version: [v2024.04.01]
-- According to Intel® Architecture Instruction Set Extensions and Future Features 319433-053, p. 155
Checked with XED version: [v2024.04.01]
- According to Intel® Architecture Instruction Set Extensions and Future Features 319433-053, p. 112-118:
-- VBCSTNEBF16PS -> VBCSTNEBF162PS
-- VCVTNEPS2BF16 destination is always xmm
-- no AVX512 EVEX version, so no LATEVEX
-- avx-ne-convert.asm, avx-ne-convert-64.asm test files
Checked with XED version: [v2024.04.01]
- According to Intel® Architecture Instruction Set Extensions and Future Features 319433-053, p. 120:
-- no AVX512 EVEX version, so no LATEVEX
-- avx-vnni-int8.asm, avx-vnni-int8-64.asm test files
-- unsized operands allowed
Checked with XED version: [v2024.04.01]
- According to Intel SDM 325462-084US, p. 2413-2419
-- VEX-encoded versions of the AVX512_VNNI xmm and ymm instructions
-- an AVX512 EVEX version exists, so the LATEVEX flag is required
-- avx-vnni.asm, avx-vnni-64.asm test files
-- unsized operands allowed
Checked with XED version: [v2024.04.01]
- According to Intel® Architecture Instruction Set Extensions and Future Features 319433-053, p. 126-127:
-- an AVX512 EVEX version exists, so the LATEVEX flag is required
-- avx-ifma.asm, avx-ifma-64.asm test files
-- unsized operands allowed
Checked with XED version: [v2024.04.01]
6,
AVX-VNNI-INT16 support
- According to Intel® Architecture Instruction Set Extensions and Future Features 319433-053, p. 123-124:
-- no AVX512 EVEX version, so LATEVEX not required
-- avx-vnni-int16.asm, avx-vnni-int16-64.asm test files
-- unsized operands allowed
Checked with XED version: [v2024.04.01]
-- According to Intel SDM 325462-084US, p. 1989, VCVTNEPS2BF16 also has an EVEX form, so LATEVEX is required
-- an explicit size operator is required only for VCVTNEPS2BF16
-- LATEVEX for VCVTNEPS2BF16 in avx-ne-convert.asm, avx-ne-convert-64.asm test files
Checked with XED version: [v2024.04.01]
-- finished according to Intel SDM 325462-084US
-- avx512bf16.asm, avx512bf16-64.asm test files
Checked with XED version: [v2024.04.01]
-- finished according to Intel SDM 325462-084US
-- avx512vp2intersect.asm, avx512vp2intersect-64.asm test files
Checked with XED version: [v2024.04.01]
-- VF[N]M[ADD|SUB][132|213|231][S|P]H
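
The mnemonic pattern in the last commit note is compact, so it may help to see it expanded. The Python sketch below enumerates the names it covers, assuming the three operand orders are 132, 213, and 231. The function name `expand_fp16_fma_mnemonics` is hypothetical and not part of this PR.

```python
from itertools import product


def expand_fp16_fma_mnemonics():
    """Expand VF[N]M[ADD|SUB][132|213|231][S|P]H into concrete mnemonics."""
    negate = ("", "N")               # optional N for the negated forms
    operation = ("ADD", "SUB")
    order = ("132", "213", "231")    # assumed operand orders
    kind = ("S", "P")                # scalar or packed
    return [
        f"VF{n}M{op}{o}{k}H"
        for n, op, o, k in product(negate, operation, order, kind)
    ]
```

The expansion yields 24 names (2 x 2 x 3 x 2), including `VFMADD132PH` and `VFNMSUB231SH`.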