Weights Metrics #340
base: main
Conversation
… to plot (simple) interactive graph
…yers. heatmaps for other metrics.
run_metrics.py (outdated)
This file should probably be in mergekit/scripts.
Also would be good to use click to turn the hardcoded values into arguments.
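A minimal sketch of what that could look like; the option names and defaults here are illustrative, not taken from the script:

```python
# Hypothetical sketch only: exposes the script's hard-coded values as CLI
# arguments via click. Option names and defaults are assumptions.
import click


@click.command("run-metrics")
@click.option("--config", "config_path", required=True, help="Path to the metric config YAML.")
@click.option("--output-dir", default="./metrics-out", show_default=True, help="Where to write plots/results.")
@click.option("--quiet/--no-quiet", default=False)
def main(config_path: str, output_dir: str, quiet: bool):
    """Run weight metrics with the given configuration."""
    ...  # call into the metric planning/execution code here


if __name__ == "__main__":
    main()
```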
from typing import List, Dict, Optional, Any, Tuple
from mergekit.graph import Task
import networkx as nx
import plotly.graph_objects as go
We should capture these new dependencies in pyproject.toml. Probably under a feature, so headless installs don't need to bring them in.
mergekit/plan.py (outdated)
)
finalize = FinalizeModel(
Totally fine to not do the finalize task when we're doing metrics, but this is needed for merges - I think as is this makes merges not write out correctly.
    **_kwargs,
) -> Task:

if 'self_attn' in output_weight.name:
Down the line we probably want this split to be done based on new fields in ArchitectureInfo but this is good for now!
res = {}

scale_diff = torch.abs(norm_0 - norm_1) / ((norm_0 + norm_1) / 2)
Should we be doing something here to guard against dividing by zero?
yep - norms are non-negative, so adding a small epsilon will be fine
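A minimal sketch of that guard; the tensors and epsilon value are stand-ins for illustration:

```python
import torch

t0, t1 = torch.randn(16), torch.randn(16)  # stand-ins for the two models' weights
norm_0, norm_1 = t0.norm(), t1.norm()

eps = 1e-8  # norms are >= 0, so this keeps the denominator strictly positive
scale_diff = torch.abs(norm_0 - norm_1) / ((norm_0 + norm_1) / 2 + eps)
```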
mergekit/architecture.py (outdated)
@@ -53,6 +57,9 @@ class WeightInfo(BaseModel, frozen=True):
aliases: Optional[Tuple[str, ...]] = None
force_dtype: Optional[str] = None

GQA_groups: Optional[int] = None  # None if not GQA, 1 if MQA, >1 if GQA
Should be gqa_groups
    num_heads=32  # hard-coded for now
)
self.block_count += 1
return AttnTask(weights=weights, weight_infos=infos, weight_info=weight_info)
Does this end up creating N AttnTasks for each block? I don't think it's actually a problem as the tasks will be deduplicated downstream - should be fine
Should only be one AttnTask for each block - the if statement on line 351 is only satisfied once all the tensors (K,Q,V,O) have been collected. Then self.attn_weight_dict is reset to {} and the (one) AttnTask is created. I might also add individual tensor metrics for comparing just the Qs, Vs etc, which would be simpler.
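A self-contained sketch of that accumulate-then-emit pattern; the class and names here are illustrative, not the PR's actual code:

```python
# Illustrative only: buffer attention projections per block and emit a single
# "task" worth of weights once q/k/v/o have all been collected, then reset.
from typing import Dict, Optional

ATTN_PARTS = ("q_proj", "k_proj", "v_proj", "o_proj")


class AttnCollector:
    def __init__(self) -> None:
        self.attn_weight_dict: Dict[str, object] = {}

    def add(self, name: str, tensor: object) -> Optional[Dict[str, object]]:
        self.attn_weight_dict[name] = tensor
        # Emit only once all four projections for the block are present.
        if all(any(part in k for k in self.attn_weight_dict) for part in ATTN_PARTS):
            weights, self.attn_weight_dict = self.attn_weight_dict, {}
            return weights  # exactly one emission per block
        return None
```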
self._method = merge_methods.get(config.merge_method)
if getattr(config, "merge_method", None):
    self._method = merge_methods.get(config.merge_method)
elif getattr(config, "metric_method", None):
Would be good to add a validator to MergeConfig that checks that exactly one of these fields is set.
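A hedged sketch of such a validator, assuming a Pydantic v2 model with optional merge_method / metric_method fields; the class shape here is simplified, not mergekit's actual config:

```python
# Simplified illustration of an "exactly one of" check.
from typing import Optional

from pydantic import BaseModel, model_validator


class MergeConfig(BaseModel):
    merge_method: Optional[str] = None
    metric_method: Optional[str] = None

    @model_validator(mode="after")
    def _exactly_one_method(self):
        if bool(self.merge_method) == bool(self.metric_method):
            raise ValueError("Exactly one of merge_method or metric_method must be set")
        return self
```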
mergekit/measure.py (outdated)
)

res = []
for _task, value in exec.run(quiet=options.quiet):
Looking this over, I kinda think we might not need a separate file here - maybe it should just early out in merge.py if there's a metric_method set instead of merge_method?
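Roughly what that early out could look like; the function names here are assumptions for illustration:

```python
# Hypothetical sketch: branch in merge.py instead of keeping a separate
# measure.py entry point. run_metrics is a stand-in for the metric path.
def run_metrics(config, options):
    """Placeholder for the metric execution path."""
    return []


def run_merge(config, out_path, options):
    # Early out: if the config specifies a metric_method, run metrics and
    # return before any of the merge/write-out machinery runs.
    if getattr(config, "metric_method", None):
        return run_metrics(config, options)
    # ...otherwise continue with the usual merge path (plan, execute, write).
```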
mergekit/graph.py (outdated)
@@ -37,6 +37,7 @@ class Task(ABC, BaseModel, Generic[ValueT], frozen=True):
Abstract base class representing a task in a computational graph.

This class should be extended to define specific tasks. Each task can have arguments (dependencies) and a defined execution strategy.
Note that PyDantic BaseModel requires that all attributes are defined in the class initialisation, and cannot be changed after.
Super nitpick here: I think the official capitalization is Pydantic, not PyDantic.
…rge OR Metri, not both.
… to separate case
…eralised substitute function in architecture
Implemented:
Not Implemented: