
Fetch from nvidia Megatron-LM #5

Open

wants to merge 4,332 commits into base: load-iter
Conversation

RaymondLi0
No description provided.

ko3n1g and others added 30 commits February 3, 2025 02:44
test: Update nightly values

See merge request ADLR/megatron-lm!2618
Reuse global metadata for first saves

See merge request ADLR/megatron-lm!2517
Added CP support for partial DistOpt

See merge request ADLR/megatron-lm!2497
Bring in-job restart up-to-date with latest NVRx implementation

See merge request ADLR/megatron-lm!2560
Uneven Virtual Pipeline Parallelism

See merge request ADLR/megatron-lm!1961
Add aux loss free routing.

Closes #356

See merge request ADLR/megatron-lm!2026
Broadcast sharded objects during fully parallel load

See merge request ADLR/megatron-lm!2417
Support CP + EP with DP last rank ordering

See merge request ADLR/megatron-lm!2586
ci: update nightly values

See merge request ADLR/megatron-lm!2635
Ensure CPU tensors are cloned

See merge request ADLR/megatron-lm!2604
ci: Release results

See merge request ADLR/megatron-lm!2636
Cudagraphable RNG and cudagraph memory fixes

See merge request ADLR/megatron-lm!2503
Support MCore MambaModel quantization through TensorRT Model Optimizer

See merge request ADLR/megatron-lm!2527
Disable the FP8 transpose cache when using torch FSDP2

See merge request ADLR/megatron-lm!2508
Port multimodal inference to MCore API

See merge request ADLR/megatron-lm!2445
sanandaraj5597 and others added 30 commits March 8, 2025 04:37
Added option for parallel cross entropy

See merge request ADLR/megatron-lm!2707
Fix multi-rank inference

See merge request ADLR/megatron-lm!2776
Fix RNG tracker for inference

See merge request ADLR/megatron-lm!2781
NeMo SFT QAT fix: Make `all_gathered_input` contiguous in `prepare_input_tensors_for_wgrad_compute`

Co-authored-by: Jennifer Chen <[email protected]>

See merge request ADLR/megatron-lm!2822
Only materialize logits for the last token during inference

See merge request ADLR/megatron-lm!2624
barebones radio g support

See merge request ADLR/megatron-lm!2622
build: Better caching

See merge request ADLR/megatron-lm!2818
chore: Benchmark for PyTorch 24.10 (Mcore 0.11.0)

See merge request ADLR/megatron-lm!2788
build: Bisect dependencies

See merge request ADLR/megatron-lm!2779
Configurable FSDP modules

See merge request ADLR/megatron-lm!2765
Workload Inspector on-demand profiling feature

See merge request ADLR/megatron-lm!2714
Change Mamba textgen server to use MCore inference

See merge request ADLR/megatron-lm!2621
ci: Publish analytics

See merge request ADLR/megatron-lm!2839
ci: Small improvements to release tests

See merge request ADLR/megatron-lm!2835
ci: Upload statistics only for MRs

See merge request ADLR/megatron-lm!2842