Added TruncatedSVD module #302

norm4nn · 2024-09-28T22:07:52Z

I've added the TruncatedSVD transformer. The only option currently implemented for the algorithm parameter is randomized; arpack is not implemented yet and can be added later. I believe this module will be a valuable addition, since the PaCMAP algorithm referenced in issue #238 uses this method.

For reference, see the scikit-learn TruncatedSVD implementation.

lib/scholar/decomposition/pca.ex

Co-authored-by: José Valim <[email protected]>

msluszniak

Looks really good, just a few small comments.

msluszniak · 2024-09-30T21:22:26Z

lib/scholar/decomposition/truncated_svd.ex

+
+  defnp fit_transform_n(x, opts) do
+    module = fit_n(x, opts)
+    Nx.dot(x, Nx.transpose(module.components))


Use Nx.dot/4 here. It contracts over the axes you provide, so no explicit transpose is needed.
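As a minimal sketch of this suggestion (the tensors below are made-up values, not taken from the PR), contracting axis 1 of both operands should give the same result as transposing the second operand first:

```elixir
Mix.install([{:nx, "~> 0.9"}])

# Illustrative data: 2 samples with 3 features, and 2 components over 3 features.
x = Nx.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
components = Nx.tensor([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

# Original form: explicit transpose, then a plain dot product.
with_transpose = Nx.dot(x, Nx.transpose(components))

# Nx.dot/4 form: contract axis 1 of x with axis 1 of components.
with_axes = Nx.dot(x, [1], components, [1])

IO.inspect(Nx.to_list(with_transpose), label: "transpose")
IO.inspect(Nx.to_list(with_axes), label: "dot/4")
```

Both results have shape {2, 2}, one row per sample and one column per component.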

msluszniak · 2024-09-30T21:24:22Z

lib/scholar/decomposition/truncated_svd.ex

+    {q, _a, _a_t, _i, _n_iter} =
+      while {q, a, a_t, i = Nx.tensor(1), n_iter}, Nx.less(i, n_iter) do
+        {q, _} = Nx.LinAlg.qr(Nx.dot(a, q))
+        {q, _} = Nx.LinAlg.qr(Nx.dot(a_t, q))
+        {q, a, a_t, i + 1, n_iter}
+      end


Suggested change

-    {q, _a, _a_t, _i, _n_iter} =
-      while {q, a, a_t, i = Nx.tensor(1), n_iter}, Nx.less(i, n_iter) do
-        {q, _} = Nx.LinAlg.qr(Nx.dot(a, q))
-        {q, _} = Nx.LinAlg.qr(Nx.dot(a_t, q))
-        {q, a, a_t, i + 1, n_iter}
-      end
+    {q, _} =
+      while {q, {a, a_t, i = Nx.tensor(1), n_iter}}, Nx.less(i, n_iter) do
+        {q, _} = Nx.LinAlg.qr(Nx.dot(a, q))
+        {q, _} = Nx.LinAlg.qr(Nx.dot(a_t, q))
+        {q, {a, a_t, i + 1, n_iter}}
+      end
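To show the nested-tuple loop state in a runnable context, here is a rough sketch. The module name WhileSketch, the function refine/3, and the input values are hypothetical, and the loop counter starts at 0 rather than Nx.tensor(1) purely for illustration:

```elixir
Mix.install([{:nx, "~> 0.9"}])

defmodule WhileSketch do
  import Nx.Defn

  # Hypothetical helper: alternately orthonormalizes q against a and its transpose.
  defn refine(q, a, num_iters) do
    a_t = Nx.transpose(a)

    # Only q is of interest after the loop, so the remaining state is grouped
    # into one nested tuple and discarded with a single underscore.
    {q, _} =
      while {q, {a, a_t, i = 0, num_iters}}, i < num_iters do
        {q, _} = Nx.LinAlg.qr(Nx.dot(a, q))
        {q, _} = Nx.LinAlg.qr(Nx.dot(a_t, q))
        {q, {a, a_t, i + 1, num_iters}}
      end

    q
  end
end

# Illustrative input: a is {4, 3}, q is {3, 2}.
a = Nx.tensor([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]])
q0 = Nx.tensor([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])

q = WhileSketch.refine(q0, a, 2)
IO.inspect(Nx.shape(q), label: "shape of q")
```

The shape of q is the same on entry and exit of each iteration, which is what while requires of its state.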

msluszniak · 2024-09-30T21:26:02Z

lib/scholar/decomposition/truncated_svd.ex

+    {u, sigma, vt} = randomized_svd(x, opts)
+    {_u, vt} = Scholar.Decomposition.PCA.flip_svd(u, vt)
+
+    x_transformed = Nx.dot(x, Nx.transpose(vt))


Same as in fit_transform_n, use Nx.dot/4 here.

krstopro

Looks good so far. I added a few comments and will probably have another look soon.

krstopro · 2024-10-02T10:34:55Z

lib/scholar/decomposition/pca.ex

@@ -471,7 +471,8 @@ defmodule Scholar.Decomposition.PCA do
    end
  end

-  defnp flip_svd(u, v) do
+  @doc false
+  defn flip_svd(u, v) do


I would pull this out into a separate module in the same folder, e.g. Scholar.Decomposition.Utils or Scholar.Decomposition.Shared. See how it's done in Scholar.Neighbors.Utils.
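A shared helper module along these lines might look like the sketch below. The module name DecompositionUtilsSketch is hypothetical, and the body of flip_svd is not shown in this thread; the version here assumes one common sign convention, in which the largest-magnitude entry of each column of u is made positive:

```elixir
Mix.install([{:nx, "~> 0.9"}])

defmodule DecompositionUtilsSketch do
  import Nx.Defn

  @doc false
  defn flip_svd(u, v) do
    # Row index of the largest absolute value in each column of u.
    max_abs_idx = u |> Nx.abs() |> Nx.argmax(axis: 0, keep_axis: true)

    # Sign of that entry, one per column of u (and per row of v).
    signs =
      u
      |> Nx.take_along_axis(max_abs_idx, axis: 0)
      |> Nx.sign()
      |> Nx.squeeze(axes: [0])

    {u * signs, v * Nx.new_axis(signs, -1)}
  end
end

# Illustrative values only.
u = Nx.tensor([[1.0, -3.0], [-2.0, 1.0]])
v = Nx.tensor([[1.0, 2.0], [3.0, 4.0]])

{u_flipped, v_flipped} = DecompositionUtilsSketch.flip_svd(u, v)
IO.inspect(Nx.to_list(u_flipped), label: "u")
IO.inspect(Nx.to_list(v_flipped), label: "v")
```

Because each column of u and the matching row of v are scaled by the same sign, their product is unchanged.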

krstopro · 2024-10-02T10:37:46Z

lib/scholar/decomposition/truncated_svd.ex

+  ]
+
+  tsvd_schema = [
+    n_components: [


I would rename this to num_components to be consistent with the rest of Scholar.

krstopro · 2024-10-02T10:38:00Z

lib/scholar/decomposition/truncated_svd.ex

+      type: :pos_integer,
+      doc: "Desired dimensionality of output data."
+    ],
+    n_iter: [


Same here, num_iters.

krstopro · 2024-10-02T10:38:10Z

lib/scholar/decomposition/truncated_svd.ex

+      type: :pos_integer,
+      doc: "Number of iterations for randomized SVD solver."
+    ],
+    n_oversamples: [


And same here, num_oversamples.
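With the three renames applied, the schema might look roughly like this sketch. The default values are assumptions added for illustration and are not taken from the PR; the doc strings are the ones shown in the diff hunks above:

```elixir
Mix.install([{:nimble_options, "~> 1.0"}])

# Defaults below are illustrative assumptions.
tsvd_schema = [
  num_components: [
    type: :pos_integer,
    default: 2,
    doc: "Desired dimensionality of output data."
  ],
  num_iters: [
    type: :pos_integer,
    default: 5,
    doc: "Number of iterations for randomized SVD solver."
  ],
  num_oversamples: [
    type: :pos_integer,
    default: 10,
    doc: "Number of oversamples for randomized SVD solver."
  ]
]

# The new names validate and pick up defaults for anything not given.
opts = NimbleOptions.validate!([num_components: 3], tsvd_schema)
IO.inspect(opts, label: "validated")

# The old name is rejected as an unknown option.
old_name_result = NimbleOptions.validate([n_components: 3], tsvd_schema)
IO.inspect(old_name_result, label: "old name")
```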

krstopro

One more suggestion.

You should also check that num_components is less than or equal to num_samples.
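A check of this kind could be sketched as follows. The module and function names are hypothetical, and the helper takes the input's shape tuple (for example the result of Nx.shape/1) so that the snippet has no dependencies:

```elixir
defmodule ShapeCheckSketch do
  # Hypothetical helper; `shape` is the {num_samples, num_features} tuple of the input.
  def check_num_components!({num_samples, _num_features}, num_components) do
    if num_components > num_samples do
      raise ArgumentError,
            "expected :num_components to be less than or equal to " <>
              "num_samples = #{num_samples}, got: #{num_components}"
    end

    :ok
  end
end

# 10 samples, 4 features, 3 components: passes.
IO.inspect(ShapeCheckSketch.check_num_components!({10, 4}, 3))
```

Since shapes are known before the numerical code runs, a check like this can raise a plain ArgumentError up front.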

krstopro · 2024-10-03T19:47:24Z

lib/scholar/decomposition/truncated_svd.ex

+      type: :pos_integer,
+      doc: "Number of oversamples for randomized SVD solver."
+    ],
+    seed: [


I think you should pass key here, which is of type {:custom, Scholar.Options, :key}.
See how it's done in e.g. Scholar.Clustering.KMeans.
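As a rough illustration of the key-based approach (a sketch, not the code from this PR), the snippet below falls back to a time-derived key when the caller gives none, and shows that a fixed key makes sampling reproducible. The option validation through Scholar.Options that the comment refers to is left out, so only Nx is needed:

```elixir
Mix.install([{:nx, "~> 0.9"}])

# No :key supplied by the caller, so derive one from the system time.
opts = []
fallback_key = Keyword.get_lazy(opts, :key, fn -> Nx.Random.key(System.system_time()) end)
IO.inspect(Nx.shape(fallback_key), label: "key shape")

# With an explicit key, the same key always yields the same sample.
key = Nx.Random.key(42)
{sample_a, _next_key} = Nx.Random.normal(key, 0.0, 1.0, shape: {3, 2})
{sample_b, _next_key} = Nx.Random.normal(key, 0.0, 1.0, shape: {3, 2})

IO.inspect(Nx.to_list(sample_a) == Nx.to_list(sample_b), label: "reproducible")
```

Passing a key instead of an integer seed lets callers split and reuse keys across several estimators.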

polvalente · 2024-10-03T21:44:22Z

lib/scholar/decomposition/truncated_svd.ex

+    q_t = Nx.transpose(q)
+    b = Nx.dot(q_t, m)


Since q_t is being used only for the Nx.dot call below, you can skip calculating it by using Nx.dot/4: b = Nx.dot(q, [-2], m, [-2])
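A small sketch of this equivalence, using made-up matrices rather than values from the PR: contracting axis -2 of both operands computes the product of the transpose of q with m without materializing q_t.

```elixir
Mix.install([{:nx, "~> 0.9"}])

# Illustrative values: q is {3, 2} and m is {3, 2}.
q = Nx.tensor([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
m = Nx.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Original form: explicit transpose followed by a plain dot product.
b_transpose = Nx.dot(Nx.transpose(q), m)

# Suggested form: contract the row axis of q with the row axis of m.
b_axes = Nx.dot(q, [-2], m, [-2])

IO.inspect(Nx.to_list(b_transpose), label: "transpose")
IO.inspect(Nx.to_list(b_axes), label: "dot/4")
```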

josevalim · 2024-10-04T15:03:01Z

💚 💙 💜 💛 ❤️

norm4nn added 5 commits September 28, 2024 21:37

Added TruncatedSVD module

c395798

Merge branch 'main' of https://github.com/elixir-nx/scholar into tsvd

d6101d5

Removed transpose option

371e27f

removed duplicated svd_flip fun

2648420

added TruncatedSVD

d580c53

josevalim reviewed Sep 29, 2024

lib/scholar/decomposition/pca.ex (outdated)

josevalim requested review from krstopro and msluszniak September 29, 2024 21:09

Update lib/scholar/decomposition/pca.ex

aa30ade

Co-authored-by: José Valim <[email protected]>

msluszniak reviewed Sep 30, 2024

norm4nn added 2 commits October 1, 2024 23:53

Addressed review comments

6b57e1a

Merge branch 'tsvd' of https://github.com/norm4nn/scholar into tsvd

90856f5

krstopro reviewed Oct 2, 2024

norm4nn added 2 commits October 2, 2024 20:30

refactored option names, added utils

44b479f

run mix format

936d3f7

josevalim approved these changes Oct 3, 2024

msluszniak approved these changes Oct 3, 2024

krstopro reviewed Oct 3, 2024

polvalente reviewed Oct 3, 2024

Fixed code according to comments

8e06b8d

josevalim merged commit 2a601cc into elixir-nx:main Oct 4, 2024
2 checks passed
