
[tt-train] Add RMSNorm module #16991

Draft: jaykru-tt wants to merge 20 commits into main from Jkruer/rmsnorm
Conversation

jaykru-tt (Contributor) commented on Jan 22, 2025

Problem description

We need RMSNorm to train Llama 3 and some other exciting open-source models.

What's changed

  • Added sqrt and matmul ops with backward passes to support RMS
  • Added an RMS op, defined as a composite of existing ops plus the new ops mentioned above (see the reference sketch below this list)
  • Added an RMS module
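For reference, RMSNorm normalizes each feature vector by its root mean square: y_i = g_i * x_i / sqrt(mean(x^2) + eps). A minimal host-side sketch of the math (plain C++; this is not the tt-train op, and the 1e-5F default for eps is a placeholder):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Reference RMSNorm over a single feature vector:
// y_i = gamma_i * x_i / sqrt(mean(x^2) + eps)
std::vector<float> rmsnorm_reference(
    const std::vector<float>& x, const std::vector<float>& gamma, float eps = 1e-5F) {
    float mean_sq = 0.0F;
    for (float v : x) {
        mean_sq += v * v;
    }
    mean_sq /= static_cast<float>(x.size());
    const float inv_rms = 1.0F / std::sqrt(mean_sq + eps);
    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = gamma[i] * x[i] * inv_rms;
    }
    return y;
}
```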

Checklist

  • Post commit CI passes
  • Blackhole Post commit (if applicable)
  • Model regression CI testing passes (if applicable)
  • Device performance regression CI testing passes (if applicable)
  • (For models and ops writers) Full new model tests pass
  • New/Existing tests provide coverage for changes

@jaykru-tt changed the title from Jkruer/rmsnorm to [tt-train] Add RMSNorm module on Jan 22, 2025

class RMSNormLayer : public autograd::ModuleBase {
private:
    float m_epsilon;
Contributor:
Please don't forget default initialization.
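A minimal sketch of the suggested fix; the 1e-5F value here is a placeholder, not taken from this PR:

```cpp
// In-class default member initializer, so a default-constructed layer
// never reads an uninitialized epsilon. Placeholder value.
float m_epsilon = 1e-5F;
```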


public:
    void initialize_tensors(uint32_t features);
    explicit RMSNormLayer(uint32_t features, std::optional<float> epsilon = std::nullopt);
Contributor:
I don't think we always need optional here. Overall, I'm not a fan of actively using std::optional when it's not really needed.
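A sketch of the alternative being suggested, assuming a plain defaulted parameter is acceptable here (the default value is a placeholder):

```cpp
// Plain defaulted parameter instead of std::optional<float>.
explicit RMSNormLayer(uint32_t features, float epsilon = 1e-5F);
```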

/* program_config */ std::nullopt,
/* activation */ std::nullopt,
/* compute_kernel_config */ core::ComputeKernelConfig::matmul(),
/* core_grid */ std::nullopt, // NOTE: I believe matmul will use the
Contributor:
Better to reuse our default grid parameters for now. Also, is this a copy-paste of the same function in linear? If so, you can probably move it somewhere shared to avoid the duplication.
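If the wrapper really is duplicated, the refactor could be a single shared declaration; the header path and exact signature below are assumptions for illustration, not the actual tt-train layout:

```cpp
// e.g. ops/common/matmul_helpers.hpp (hypothetical location)
// Shared wrapper used by both the linear and rmsnorm ops, so the
// program/compute-kernel config lives in exactly one place.
ttnn::Tensor ttnn_matmul(
    const ttnn::Tensor& a, const ttnn::Tensor& b, bool transpose_a, bool transpose_b);
```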

auto grad_a = ttnn_matmul(
    out->get_grad(),
    b->get_value(),
    /* transpose_a */ false,
Contributor:
Are you sure you don't need to change these params depending on transpose_a and transpose_b?
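For concreteness, here is the standard case analysis for C = op_a(A) * op_b(B), where each op is identity or transpose. The ttnn_matmul calls mirror the snippet above, but treat the exact flag handling as a sketch rather than the PR's code:

```cpp
// dC is the upstream gradient out->get_grad(). Sketch only.
if (!transpose_a && !transpose_b) {        // C = A * B
    grad_a = ttnn_matmul(grad_c, b, /* transpose_a */ false, /* transpose_b */ true);   // dA = dC * B^T
    grad_b = ttnn_matmul(a, grad_c, /* transpose_a */ true,  /* transpose_b */ false);  // dB = A^T * dC
} else if (transpose_a && !transpose_b) {  // C = A^T * B
    grad_a = ttnn_matmul(b, grad_c, /* transpose_a */ false, /* transpose_b */ true);   // dA = B * dC^T
    grad_b = ttnn_matmul(a, grad_c, /* transpose_a */ false, /* transpose_b */ false);  // dB = A * dC
} else if (!transpose_a && transpose_b) {  // C = A * B^T
    grad_a = ttnn_matmul(grad_c, b, /* transpose_a */ false, /* transpose_b */ false);  // dA = dC * B
    grad_b = ttnn_matmul(grad_c, a, /* transpose_a */ true,  /* transpose_b */ false);  // dB = dC^T * A
} else {                                   // C = A^T * B^T
    grad_a = ttnn_matmul(b, grad_c, /* transpose_a */ true,  /* transpose_b */ true);   // dA = B^T * dC^T
    grad_b = ttnn_matmul(grad_c, a, /* transpose_a */ true,  /* transpose_b */ true);   // dB = dC^T * A^T
}
```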

/* transpose_a */ false,
/* transpose_b */ true);
auto eps_tensor =
autograd::create_tensor(core::from_xtensor(xt::xarray<float>{epsilon}, &autograd::ctx().get_device()));
Contributor:
Better to implement an op that takes a tensor and a scalar than to create even a small tensor every step.
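Until such a tensor-scalar op exists, one interim option is to cache the epsilon tensor as module state instead of rebuilding it each forward step. The member name below is hypothetical; the calls are copied from the snippet above:

```cpp
// Hypothetical cached member, built once (e.g. in initialize_tensors)
// and reused on every forward pass instead of a per-step allocation.
m_eps_tensor = autograd::create_tensor(
    core::from_xtensor(xt::xarray<float>{m_epsilon}, &autograd::ctx().get_device()));
```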

/* output_tile */ std::nullopt);
}

autograd::TensorPtr matmul(
Contributor:
Why do we need matmul here? :)

auto a_shape = a.get_logical_shape();
auto b_shape = b.get_logical_shape();

auto suffix_len = std::min(a_shape.size(), b_shape.size());
Contributor:
Isn't this an unsigned type? If so, -suffix_len looks suspicious. Either way, I would advise casting it to signed here (even if size() returns a signed value for now, that should change in the future).
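A minimal standalone demo of the suggested fix: negating an unsigned size_t wraps around to a huge value, so cast before negating. The shapes here are made up for illustration:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

int main() {
    std::vector<int> a_shape{1, 1, 32, 64};
    std::vector<int> b_shape{32, 64};
    // size() returns an unsigned type; cast to signed before using -suffix_len.
    auto suffix_len = static_cast<std::int64_t>(std::min(a_shape.size(), b_shape.size()));
    for (std::int64_t i = -suffix_len; i < 0; ++i) {
        // compare trailing dims, indexing from the back: a_shape[a_shape.size() + i]
    }
    return 0;
}
```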
