Enable vortex style fp8 as an option in evo2 #12464

Open · wants to merge 2 commits into main

Changes from 1 commit
3 changes: 3 additions & 0 deletions nemo/collections/llm/gpt/model/hyena.py
@@ -216,6 +216,9 @@ class HyenaConfig(TransformerConfig, io.IOMixin):
    use_te: bool = True
    to_upper: str = "normalized_weighted"  # choose between "weighted" and "normalized_weighted"
    use_short_conv_bias: bool = False
    # Use this if you want to turn FP8 on for the linear layer in the mixer only. When using
    # this, do not set fp8 in the mixed-precision plugin.
    vortex_style_fp8: bool = False

    def __post_init__(self):
        """
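For orientation, a minimal usage sketch (not from the PR; the field values below are illustrative, and the usual TransformerConfig required fields are assumed):

    from nemo.collections.llm.gpt.model.hyena import HyenaConfig

    config = HyenaConfig(
        num_layers=4,            # illustrative values, not defaults
        hidden_size=512,
        num_attention_heads=8,
        vortex_style_fp8=True,   # FP8 for the mixer's linear projection only
    )
    # Per the comment above: when vortex_style_fp8 is set, leave fp8 unset
    # in the mixed-precision plugin so the two paths do not both cast.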
19 changes: 17 additions & 2 deletions nemo/collections/llm/gpt/model/megatron/hyena/hyena_mixer.py
@@ -43,8 +43,20 @@
logger = logging.getLogger(__name__)

try:
    import transformer_engine.pytorch as te
    from transformer_engine.common.recipe import DelayedScaling, Format
except ImportError:

    def DelayedScaling(*args, **kwargs):
        raise ImportError("transformer_engine not installed. Using default recipe.")

    def Format(*args, **kwargs):
        raise ImportError("transformer_engine not installed. Using default recipe.")

    class te:
        def __getattribute__(self, name: str) -> None:
Check notice · Code scanning / CodeQL

Non-standard exception raised in special method (Note): Function always raises builtin-class ImportError; raise AttributeError instead.
Copilot Autofix (AI) · about 10 hours ago

To fix the problem, modify the __getattribute__ method to raise an AttributeError instead of an ImportError, so that it conforms to Python's standard protocol for attribute access. The rest of the functionality remains the same, and the warning about the missing transformer_engine module is still logged.

Suggested changeset 1: nemo/collections/llm/gpt/model/megatron/hyena/hyena_mixer.py
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/nemo/collections/llm/gpt/model/megatron/hyena/hyena_mixer.py b/nemo/collections/llm/gpt/model/megatron/hyena/hyena_mixer.py
--- a/nemo/collections/llm/gpt/model/megatron/hyena/hyena_mixer.py
+++ b/nemo/collections/llm/gpt/model/megatron/hyena/hyena_mixer.py
@@ -62,3 +62,3 @@
             """Not imported: te. An error will be raised if this is called like a module."""
-            raise ImportError("transformer_engine not installed. Using default recipe.")
+            raise AttributeError(f"'_te' object has no attribute '{name}'")
 
EOF
            """Not imported: te. An error will be raised if this is called like a module."""
            raise ImportError("transformer_engine not installed. Using default recipe.")

    logger.warning("WARNING: transformer_engine not installed. Using default recipe.")
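To see why the suggested change matters, here is a short sketch (not part of the PR) of how Python's attribute protocol treats the two exceptions: hasattr() and three-argument getattr() swallow only AttributeError, so the patched stub degrades gracefully where the original would crash.

    stub = te()  # the fallback class above; instantiation bypasses __getattribute__

    # With the patched stub (raises AttributeError):
    hasattr(stub, "fp8_autocast")         # False: hasattr catches AttributeError
    getattr(stub, "fp8_autocast", None)   # None: the default is returned

    # With the original stub (raises ImportError), both calls would
    # propagate the ImportError instead of failing softly.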


@@ -241,8 +253,11 @@
             _proj_use_cp = True
         else:
             _proj_use_cp = False
 
-        features, _ = self.dense_projection(x)
+        if self.transformer_config.vortex_style_fp8:
+            with te.fp8_autocast(enabled=True, fp8_recipe=set_format_recipe()):
+                features, _ = self.dense_projection(x)
+        else:
+            features, _ = self.dense_projection(x)
         features = rearrange(features, "l b d -> b l d").contiguous()
         features_L_last = features.permute(0, 2, 1)
         features_D_last = self.hyena_proj_conv(features_L_last, _use_cp=_proj_use_cp).permute(0, 2, 1)
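The new branch calls set_format_recipe(), which is not shown in this diff. A plausible sketch of such a helper against Transformer Engine's public recipe API (the specific recipe parameters are assumptions, not taken from the PR):

    from transformer_engine.common.recipe import DelayedScaling, Format

    def set_format_recipe():
        """Hypothetical helper: a delayed-scaling FP8 recipe.
        Format.HYBRID uses E4M3 for forward tensors and E5M2 for gradients."""
        return DelayedScaling(
            fp8_format=Format.HYBRID,  # assumed; a common TE choice
            amax_history_len=16,       # illustrative history window
            amax_compute_algo="max",
        )

Inside te.fp8_autocast(enabled=True, fp8_recipe=...), Transformer Engine modules run their GEMMs in FP8 while everything outside the context keeps its configured precision, which is why the config comment warns against also enabling fp8 in the mixed-precision plugin.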