Initial stab at string based config parser #1774

drisspg · 2025-02-25T04:54:47Z

Stacked PRs:

->Initial stab at string based config parser #1774

Configuration String Parser for TorchAO Quantization

This document explains how this prototype string-based configuration system could work for TorchAO quantization, including the parsing process and how to extend it for future needs.

Configuration String Schema

The configuration strings follow this general pattern:
<base_type>_<param1>_<param2>_...
Where:

<base_type> selects the fundamental quantization config class (e.g., int8wo for Int8WeightOnlyConfig, float8dqfloat8 for Float8DynamicActivationFloat8WeightConfig)
Each <param> sets a specific option such as bit width, group size, or data type

Base Types

The following base types are supported:

| Base Type | Configuration Class |
| --- | --- |
| `int4wo` | `Int4WeightOnlyConfig` |
| `int8wo` | `Int8WeightOnlyConfig` |
| `int8dqint4` | `Int8DynamicActivationInt4WeightConfig` |
| `int8dqint8` | `Int8DynamicActivationInt8WeightConfig` |
| `int4dqint4` | `Int4DynamicActivationInt4WeightConfig` |
| `float8wo` | `Float8WeightOnlyConfig` |
| `float8dqfloat8` | `Float8DynamicActivationFloat8WeightConfig` |
| `uintxwo` | `UIntXWeightOnlyConfig` |

Parameter Tokens

WIP: We still need to decide which params to support, which ones to leave out, and what naming scheme to use.

Parameters are specified as tokens after the base type, separated by underscores. Each parameter has its own format:

| Token Pattern | Example | Description | Parameter Name |
| --- | --- | --- | --- |
| `<N>bit` | `4bit` | Specifies bit width | `bits` |
| `g<N>` | `g32` | Specifies group size | `group_size` |
| `sym` or `asym` | `sym` | Symmetry type | `mapping_type` |
| `int4`, `int8`, `uint4`, `uint8`, `e4m3`, `e5m2` | `int8` | Data type | `dtype` or `weight_dtype` |
| `per_row` | `per_row` | Per-row quantization | `per_row` |
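
As a rough illustration of the table above, the token patterns could be written as regular expressions that must match a whole token. The `TOKEN_PATTERNS` dictionary and the `classify_token` helper are hypothetical names for this sketch and are not taken from the PR's `ConfigParser`.

```python
import re
from typing import Optional, Tuple

# Hypothetical regexes mirroring the token table; the PR's actual patterns may differ.
TOKEN_PATTERNS = {
    "bits": re.compile(r"(\d+)bit"),
    "group_size": re.compile(r"g(\d+)"),
    "mapping_type": re.compile(r"(sym|asym)"),
    "dtype": re.compile(r"(int4|int8|uint4|uint8|e4m3|e5m2)"),
}


def classify_token(token: str) -> Optional[Tuple[str, str]]:
    """Return (parameter name, captured text) for the first pattern matching the whole token."""
    for name, pattern in TOKEN_PATTERNS.items():
        match = pattern.fullmatch(token)
        if match:
            return name, match.group(1)
    return None
```

The sketch leaves out `per_row`: that token itself contains an underscore, so a parser that splits the string on `_` would presumably need to treat it specially.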

The Parsing Process

The parsing process follows these steps:

1. The input string is split by underscores (_) into tokens
2. The first token determines the base quantization configuration type
3. Each subsequent token is matched against regex patterns to extract parameter values
4. The configuration object is instantiated with the extracted parameters
5. Error handling provides informative messages about invalid or unrecognized parameters

Example Parsing

Example configuration string: int8dqint4_g32_sym

1. Split into tokens: ["int8dqint4", "g32", "sym"]
2. First token int8dqint4 maps to Int8DynamicActivationInt4WeightConfig
3. Token g32 matches the pattern g(\d+) and sets group_size=32
4. Token sym matches the pattern (sym|asym) and sets mapping_type=MappingType.SYMMETRIC
5. Instantiate Int8DynamicActivationInt4WeightConfig(group_size=32, mapping_type=MappingType.SYMMETRIC)
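
The walk-through above can be sketched end to end as follows. This is a toy version under stated assumptions, not the PR's implementation: `MappingType` and `Int8DynamicActivationInt4WeightConfig` are local stand-ins so the snippet runs without torchao, and `TYPE_MAPPING`, `PARAM_PATTERNS`, and `parse_config` are hypothetical names.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict, Tuple, Type


class MappingType(Enum):
    # Stand-in for torchao's MappingType so the sketch is self-contained.
    SYMMETRIC = "symmetric"
    ASYMMETRIC = "asymmetric"


@dataclass
class Int8DynamicActivationInt4WeightConfig:
    # Stand-in config; fields and defaults are illustrative only.
    group_size: int = 32
    mapping_type: MappingType = MappingType.SYMMETRIC


TYPE_MAPPING: Dict[str, Type] = {"int8dqint4": Int8DynamicActivationInt4WeightConfig}

Processor = Callable[[re.Match], Tuple[str, Any]]
PARAM_PATTERNS: Dict[re.Pattern, Processor] = {
    re.compile(r"g(\d+)"): lambda m: ("group_size", int(m.group(1))),
    re.compile(r"(sym|asym)"): lambda m: (
        "mapping_type",
        MappingType.SYMMETRIC if m.group(1) == "sym" else MappingType.ASYMMETRIC,
    ),
}


def parse_config(config_str: str):
    # Steps 1 and 2: split on underscores and look up the base type.
    base, *tokens = config_str.split("_")
    if base not in TYPE_MAPPING:
        raise ValueError(f"Unknown base type {base!r}; expected one of {sorted(TYPE_MAPPING)}")
    # Step 3: match every remaining token against the known patterns.
    params: Dict[str, Any] = {}
    for token in tokens:
        for pattern, processor in PARAM_PATTERNS.items():
            match = pattern.fullmatch(token)
            if match:
                name, value = processor(match)
                params[name] = value
                break
        else:
            # Step 5: no pattern matched this token.
            raise ValueError(f"Unrecognized parameter token {token!r} in {config_str!r}")
    # Step 4: instantiate the config with the extracted parameters.
    return TYPE_MAPPING[base](**params)
```

Because tokens are matched independently, the sketch accepts them in any order, so `int8dqint4_g32_sym` and `int8dqint4_sym_g32` produce equal configs.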

Extending the Parser

You can extend the parser in several ways:

1. Add New Base Configuration Types

To add a new base configuration type:

# 1. Import your new config class
from torchao.quantization.quant_api import YourNewConfig

# 2. Add it to the type_mapping dictionary in ConfigParser
ConfigParser.type_mapping["yournewtype"] = YourNewConfig

2. Add New Parameter Types

To add a new parameter type:

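import re
from typing import Any, Tuple, Type

from torchao.core.config import AOBaseConfig

# 1. Define a new parameter processor function
def process_your_param(match: re.Match, quant_config: Type[AOBaseConfig]) -> Tuple[str, Any]:
    # process_value is a placeholder for your own conversion logic
    return "your_param_name", process_value(match)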

# 2. Add the regex pattern and processor to the param_patterns dictionary
ConfigParser.param_patterns[re.compile(r"your_pattern")] = process_your_param

3. Create Special Processing Logic

If you need special handling based on the config type:

def process_special_param(match: re.Match, quant_config: Type[AOBaseConfig]) -> Tuple[str, Any]:
    if issubclass(quant_config, YourSpecialConfig):
        # Special handling for this config type
        return "special_param", special_value(match.group(1))
    else:
        # Regular handling
        return "regular_param", regular_value(match.group(1))
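
A concrete case for this kind of dispatch is the data type token, which the parameter table maps to either `dtype` or `weight_dtype`. The sketch below shows one way that choice could depend on the config class. Both config classes are hypothetical stand-ins, and the string values stand in for real dtype objects.

```python
import re
from dataclasses import dataclass
from typing import Any, Tuple, Type


@dataclass
class WeightOnlyStandIn:
    # Hypothetical config whose data type field is called `dtype`.
    dtype: str = "int8"


@dataclass
class DynamicActivationStandIn:
    # Hypothetical config whose data type field is called `weight_dtype`.
    weight_dtype: str = "e4m3"


DTYPE_PATTERN = re.compile(r"(int4|int8|uint4|uint8|e4m3|e5m2)")


def process_dtype(match: re.Match, quant_config: Type[Any]) -> Tuple[str, Any]:
    """Pick the keyword name based on the config class being built."""
    if issubclass(quant_config, DynamicActivationStandIn):
        return "weight_dtype", match.group(1)
    return "dtype", match.group(1)
```

The same token then lands on the right field for each class, for example `DynamicActivationStandIn(**dict([process_dtype(m, DynamicActivationStandIn)]))` for a match `m`.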

Best Practices for Extensions

When extending the parser:

Well-defined patterns: Create regex patterns that are specific and won't conflict with existing patterns
Descriptive errors: Provide helpful error messages when parameters are invalid
Type safety: Ensure your processor returns the correct types expected by the config classes
Documentation: Update documentation to include your new base types or parameters
Testing: Add tests for your new patterns and configurations

Complete Example

To add support for a new "fast" parameter that enables faster computation:

# 1. Define processor function
def process_fast_mode(match: re.Match, quant_config: Type[AOBaseConfig]) -> Tuple[str, Any]:
    return "fast_mode", True

# 2. Add to param_patterns
ConfigParser.param_patterns[re.compile(r"fast")] = process_fast_mode

# Now you can use strings like "int8wo_g32_fast"

Open Questions

1. Parameter Selection and Naming

Which parameters should be exposed in the string interface?
Should we prioritize brevity (g32) or clarity (group32)?
How to handle parameters only applicable to specific base types?

2. Parameter Validation

How to handle conflicting parameters? For instance, it is possible to have two parameter patterns that want similar string representations. I kind of sidestepped this by passing the matched quant config to each processor.
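
One possible way to surface such conflicts early, sketched here as an assumption rather than something the PR does, is to check how many patterns match a token and fail loudly when more than one does. The `granularity` pattern is made up purely to overlap with `g<N>`.

```python
import re
from typing import Dict, List

PATTERNS: Dict[str, re.Pattern] = {
    "group_size": re.compile(r"g(\d+)"),
    "bits": re.compile(r"(\d+)bit"),
    # Deliberately overlapping, hypothetical pattern to show the failure mode.
    "granularity": re.compile(r"g\w+"),
}


def matching_params(token: str) -> List[str]:
    """List every parameter whose pattern matches the whole token."""
    return [name for name, pattern in PATTERNS.items() if pattern.fullmatch(token)]


def resolve(token: str) -> str:
    """Return the single matching parameter name, or raise if none or several match."""
    names = matching_params(token)
    if not names:
        raise ValueError(f"Unrecognized token {token!r}")
    if len(names) > 1:
        raise ValueError(f"Token {token!r} is ambiguous: matches {names}")
    return names[0]
```

Running a check like this over a set of sample tokens in a unit test would catch an overlapping pattern when it is registered, rather than when a user hits it.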

3. Default Values

How to communicate defaults to users?
We currently rely on the fact that most (all?) configs have defaults and simply use those.
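
If the configs are dataclasses (an assumption in this sketch), the defaults could be reported to users by introspection instead of being documented by hand. `Int8WeightOnlyStandIn` and `describe_defaults` are hypothetical names, and the fields shown are illustrative.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class Int8WeightOnlyStandIn:
    # Hypothetical stand-in; field names and defaults are illustrative.
    group_size: Optional[int] = None
    mapping_type: str = "symmetric"


def describe_defaults(config_cls) -> Dict[str, Any]:
    """Map each dataclass field that has a plain default to that default value."""
    return {f.name: f.default for f in fields(config_cls) if f.default is not MISSING}
```

A parser could include this mapping in its error messages or in a help string for each base type.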

4. Extensibility

I think third-party users might want to, and should be able to, extend or build on this. Does that make sense?
What's our deprecation strategy for parameters/base types?

5. Backward Compatibility

How to evolve the string format over time?

Parsing Flow Diagram

flowchart TD
    Start([Input String]) --> Split[Split by underscores]
    Split --> FirstToken[Extract First Token]
    FirstToken --> ConfigType[Map to Config Class]
    Split --> RemainingTokens[Process Remaining Tokens]
    
    RemainingTokens --> ParamLoop{For each token}
    ParamLoop --> MatchPatterns[Match against regex patterns]
    MatchPatterns --> ProcessMatch[Process match using\nappropriate processor]
    ProcessMatch --> ParamDict[Add to parameters dictionary]
    ParamDict --> ParamLoop
    
    ConfigType --> Instantiate[Instantiate Config Class]
    ParamLoop -- All tokens processed --> Instantiate
    Instantiate --> ValidateFields[Validate Fields]
    ValidateFields --> End([Return Config Object])
    
    ValidateFields -- Invalid params --> ErrorHandling[Error Handling]
    MatchPatterns -- No match --> TokenError[Unrecognized Parameter Error]

stack-info: PR: #1774, branch: drisspg/stack/39

pytorch-bot · 2025-02-25T04:54:50Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1774

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 7 New Failures

As of commit ed64709 with merge base 38e36de:

NEW FAILURES - The following jobs have failed:

Run Regression Tests / test (CPU 2.3, linux.4xlarge, torch==2.3.0 --index-url https://download.pytorch.org/whl/cpu, cpu) / linux-job (gh)
test/quantization/test_config_parser.py::TestConfigParser::test_uintxwo_config
Run Regression Tests / test (CPU 2.4, linux.4xlarge, torch==2.4.0 --index-url https://download.pytorch.org/whl/cpu, cpu) / linux-job (gh)
test/quantization/test_config_parser.py::TestConfigParser::test_uintxwo_config
Run Regression Tests / test (CPU 2.5.1, linux.4xlarge, torch==2.5.1 --index-url https://download.pytorch.org/whl/cpu, cpu) / linux-job (gh)
test/quantization/test_config_parser.py::TestConfigParser::test_uintxwo_config
Run Regression Tests / test (CUDA 2.3, linux.g5.12xlarge.nvidia.gpu, torch==2.3.0, cuda, 12.1) / linux-job (gh)
test/quantization/test_config_parser.py::TestConfigParser::test_uintxwo_config
Run Regression Tests / test (CUDA 2.4, linux.g5.12xlarge.nvidia.gpu, torch==2.4.0, cuda, 12.1) / linux-job (gh)
test/quantization/test_config_parser.py::TestConfigParser::test_uintxwo_config
Run Regression Tests / test (CUDA 2.5.1, linux.g5.12xlarge.nvidia.gpu, torch==2.5.1 --index-url https://download.pytorch... / linux-job (gh)
test/quantization/test_config_parser.py::TestConfigParser::test_uintxwo_config
Run TorchAO Experimental Tests / test (macos-14) (gh)
[ FAILED ] test_linear_8bit_act_xbit_weight.Kleidi_dotprod_1x8x32_m101xn34xk128xg64

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vkuzo · 2025-02-25T14:34:42Z

Didn't read the code yet, but a couple of high level questions:

can we handle nested params? FooConfig(BarConfig(baz=3))?
can we handle tensors as arguments StaticQuantConfig(scale=torch.tensor(1.0, device="cuda"))
can we put this logic on AOBaseConfig so all the configs get it for free (I did no research on this, but if this is possible in a clean way it would be nice)

andrewor14 · 2025-02-25T17:05:35Z

My high-level thoughts are:

The point of a string API is to have quick shorthands for common default configurations, e.g. int4wo. If users want to configure different parameters they should just use the config objects instead.
Encoding parameters in the string like int8dqint4_g32_sym gets complicated quickly. E.g. does "gs32" or "sym" refer to weights or activations? What if we want asymmetric activations + symmetric weights, what does "sym" mean in this case? Also then you don't need the complex parser. IMO the complexity is just not worth it
What strings do other frameworks use? E.g. I've seen W4A8 used in a lot of places. Should we follow that format or do our own thing here? We could also have multiple strings map to the same config

vkuzo · 2025-02-25T17:13:45Z

> The point of a string API is to have quick shorthands for common default configurations, e.g. int4wo. If users want to configure different parameters they should just use the config objects instead.

I'd change that to "one benefit, from several, of a string API is quick shorthand...". One other benefit is serialization/deserialization, i.e. being able to go between strings and Python objects.
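
A minimal sketch of what such a round trip between strings and Python objects could look like, assuming dataclass configs and the `g<N>` token from the PR description. `Int4WeightOnlyStandIn`, `to_string`, and `from_string` are hypothetical names, not part of torchao.

```python
import re
from dataclasses import dataclass


@dataclass
class Int4WeightOnlyStandIn:
    # Hypothetical stand-in config with a single illustrative field.
    group_size: int = 128


BASE_NAME = "int4wo"


def to_string(config: Int4WeightOnlyStandIn) -> str:
    """Serialize the config to its string form, always spelling out the group size."""
    return f"{BASE_NAME}_g{config.group_size}"


def from_string(text: str) -> Int4WeightOnlyStandIn:
    """Parse the string form; a bare base name falls back to the defaults."""
    match = re.fullmatch(rf"{BASE_NAME}(?:_g(\d+))?", text)
    if match is None:
        raise ValueError(f"Cannot parse {text!r} as an {BASE_NAME} config")
    if match.group(1) is None:
        return Int4WeightOnlyStandIn()
    return Int4WeightOnlyStandIn(group_size=int(match.group(1)))
```

Writing every field out in `to_string` keeps the serialized form stable even if a default changes later, at the cost of longer strings.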

drisspg · 2025-02-25T17:36:13Z

@vkuzo

> can we handle nested params? FooConfig(BarConfig(baz=3))?

Not as implemented, unless you define your param matcher so that it clearly disambiguates the two.

> can we handle tensors as arguments StaticQuantConfig(scale=torch.tensor(1.0, device="cuda"))

You could write a param matcher to do this, but it feels kind of weird, or at least I don't really know a good notation for it besides a few special matrices.

> can we put this logic on AOBaseConfig so all the configs get it for free (I did no research on this, but if this is possible in a clean way it would be nice)

I went back and forth on this. We could enforce that AOBaseConfig subclasses define their "string_form".

@andrewor14

> The point of a string API is to have quick shorthands for common default configurations, e.g. int4wo. If users want to configure different parameters they should just use the config objects instead.

I very much agree with this, which is why I only added a few params. It's very easy to add zero params and just enforce the string map, so that you only get default values.

> Encoding parameters in the string like int8dqint4_g32_sym gets complicated quickly. E.g. does "gs32" or "sym" refer to weights or activations? What if we want asymmetric activations + symmetric weights, what does "sym" mean in this case?

Agreed. I played with having "wSym", "wAsym", "aSym", and "aAsym". +1 on complexity.

> What strings do other frameworks use? E.g. I've seen W4A8 used in a lot of places. Should we follow that format or do our own thing here? We could also have multiple strings map to the same config

Yeah, I listed this as a follow-up. Not totally sure what you mean by multiple strings mapping to the same config. As written, that is possible since we don't enforce a param order. But if we just had base_config + dtypes, then I think we should not allow multiple strings.

Initial stab at string based config parser

ed64709

stack-info: PR: #1774, branch: drisspg/stack/39

drisspg force-pushed the drisspg/stack/39 branch from c9f0b11 to ed64709 on February 25, 2025 04:54

facebook-github-bot added the CLA Signed label Feb 25, 2025

drisspg added the topic: new feature and enhancement labels Feb 25, 2025

drisspg marked this pull request as draft February 25, 2025 05:17
