Initial stab at string based config parser #1774

Draft: wants to merge 1 commit into base: main

Conversation

drisspg
Contributor

@drisspg drisspg commented Feb 25, 2025

Stacked PRs:


Configuration String Parser for TorchAO Quantization

This document explains how this prototype string-based configuration system could work for TorchAO quantization, including the parsing process and how to extend it for future needs.

Configuration String Schema

The configuration strings follow this general pattern:
<base_type>_<param1>_<param2>_...
Where:

  • <base_type> defines the fundamental quantization config (e.g., Int8WeightOnlyConfig, Float8DynamicActivationFloat8WeightConfig)
  • Each <param> adds specific configurations like bit width, group size, or data type

Base Types

The following base types are supported:

| Base Type | Configuration Class |
| --- | --- |
| int4wo | Int4WeightOnlyConfig |
| int8wo | Int8WeightOnlyConfig |
| int8dqint4 | Int8DynamicActivationInt4WeightConfig |
| int8dqint8 | Int8DynamicActivationInt8WeightConfig |
| int4dqint4 | Int4DynamicActivationInt4WeightConfig |
| float8wo | Float8WeightOnlyConfig |
| float8dqfloat8 | Float8DynamicActivationFloat8WeightConfig |
| uintxwo | UIntXWeightOnlyConfig |

Parameter Tokens

WIP: We still need to finalize which params we should support, which ones we shouldn't, and the naming scheme.

Parameters are specified as tokens after the base type, separated by underscores. Each parameter has its own format:

| Token Pattern | Example | Description | Parameter Name |
| --- | --- | --- | --- |
| <N>bit | 4bit | Specifies bit width | bits |
| g<N> | g32 | Specifies group size | group_size |
| sym or asym | sym | Symmetry type | mapping_type |
| int4, int8, uint4, uint8, e4m3, e5m2 | int8 | Data type | dtype or weight_dtype |
| per_row | per_row | Per-row quantization | per_row |
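
As a rough illustration of the token table, the patterns could be written as regular expressions that must match a whole token. The names `TOKEN_PATTERNS` and `classify_token` are hypothetical and not taken from the PR. `per_row` is left out of this sketch because a plain split on `_` would break it into two tokens, so it would need separate handling.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern table mirroring the token table (per_row omitted).
TOKEN_PATTERNS = {
    "bits": re.compile(r"(\d+)bit"),
    "group_size": re.compile(r"g(\d+)"),
    "mapping_type": re.compile(r"(sym|asym)"),
    "dtype": re.compile(r"(int4|int8|uint4|uint8|e4m3|e5m2)"),
}


def classify_token(token: str) -> Optional[Tuple[str, str]]:
    """Return (parameter name, captured text), or None if nothing matches."""
    for name, pattern in TOKEN_PATTERNS.items():
        # fullmatch rejects partial hits such as "g32x".
        match = pattern.fullmatch(token)
        if match:
            return name, match.group(1)
    return None
```

The captured value is returned as text here; converting it to an int, an enum, or a torch dtype would be the job of the per-parameter processor.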

The Parsing Process

The parsing process follows these steps:

  1. The input string is split by underscores (_) into tokens
  2. The first token determines the base quantization configuration type
  3. Each subsequent token is matched against regex patterns to extract parameter values
  4. The configuration object is instantiated with the extracted parameters
  5. Error handling provides informative messages about invalid or unrecognized parameters
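
The five steps above can be sketched end to end. This is a minimal sketch under stated assumptions, not the PR's implementation: `DemoConfig` is a stand-in dataclass rather than a real TorchAO config, and `TYPE_MAPPING`, `PARAM_PATTERNS`, and `parse_config` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass
class DemoConfig:
    """Stand-in for a real config class; every field has a default."""
    group_size: int = 128
    mapping_type: str = "asym"


# Step 2: first token -> config class (hypothetical mapping).
TYPE_MAPPING: Dict[str, type] = {"demo": DemoConfig}

# Step 3: regex -> processor returning (field name, value).
PARAM_PATTERNS: Dict[re.Pattern, Callable[[re.Match], Tuple[str, Any]]] = {
    re.compile(r"g(\d+)"): lambda m: ("group_size", int(m.group(1))),
    re.compile(r"(sym|asym)"): lambda m: ("mapping_type", m.group(1)),
}


def parse_config(config_str: str) -> Any:
    base, *tokens = config_str.split("_")  # Step 1: split into tokens
    if base not in TYPE_MAPPING:
        raise ValueError(
            f"Unknown base type {base!r}; expected one of {sorted(TYPE_MAPPING)}"
        )
    params: Dict[str, Any] = {}
    for token in tokens:
        for pattern, processor in PARAM_PATTERNS.items():
            match = pattern.fullmatch(token)
            if match:
                name, value = processor(match)
                params[name] = value
                break
        else:
            # Step 5: no pattern matched this token.
            raise ValueError(f"Unrecognized parameter {token!r} in {config_str!r}")
    return TYPE_MAPPING[base](**params)  # Step 4: instantiate the config
```

With this sketch, `parse_config("demo_g32_sym")` builds `DemoConfig(group_size=32, mapping_type="sym")`, and any field not named in the string keeps its default.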

Example Parsing

Example configuration string: int8dqint4_g32_sym

  1. Split into tokens: ["int8dqint4", "g32", "sym"]
  2. First token int8dqint4 maps to Int8DynamicActivationInt4WeightConfig
  3. Token g32 matches the pattern g(\d+) and sets group_size=32
  4. Token sym matches the pattern (sym|asym) and sets mapping_type=MappingType.SYMMETRIC
  5. Instantiate Int8DynamicActivationInt4WeightConfig(group_size=32, mapping_type=MappingType.SYMMETRIC)

Extending the Parser

You can extend the parser in several ways:

1. Add New Base Configuration Types

To add a new base configuration type:

# 1. Import your new config class
from torchao.quantization.quant_api import YourNewConfig

# 2. Add it to the type_mapping dictionary in ConfigParser
ConfigParser.type_mapping["yournewtype"] = YourNewConfig

2. Add New Parameter Types

To add a new parameter type:

# 1. Define a new parameter processor function
def process_your_param(match: re.Match, quant_config: Type[AOBaseConfig]) -> Tuple[str, Any]:
    return "your_param_name", process_value(match)

# 2. Add the regex pattern and processor to the param_patterns dictionary
ConfigParser.param_patterns[re.compile(r"your_pattern")] = process_your_param

3. Create Special Processing Logic

If you need special handling based on the config type:

def process_special_param(match: re.Match, quant_config: Type[AOBaseConfig]) -> Tuple[str, Any]:
    if issubclass(quant_config, YourSpecialConfig):
        # Special handling for this config type
        return "special_param", special_value(match.group(1))
    else:
        # Regular handling
        return "regular_param", regular_value(match.group(1))

Best Practices for Extensions

When extending the parser:

  1. Well-defined patterns: Create regex patterns that are specific and won't conflict with existing patterns
  2. Descriptive errors: Provide helpful error messages when parameters are invalid
  3. Type safety: Ensure your processor returns the correct types expected by the config classes
  4. Documentation: Update documentation to include your new base types or parameters
  5. Testing: Add tests for your new patterns and configurations
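
One way to act on points 1 and 5 is a small check that no sample token is claimed by more than one pattern. The helper below is a hypothetical sketch (`find_conflicts` and the pattern names are assumptions). It uses `fullmatch`, so it would need adjusting if the real parser matches tokens differently.

```python
import re
from typing import Dict, Iterable, List


def find_conflicts(
    patterns: Dict[str, re.Pattern], samples: Iterable[str]
) -> Dict[str, List[str]]:
    """Return the sample tokens that more than one pattern fully matches."""
    conflicts: Dict[str, List[str]] = {}
    for token in samples:
        hits = [name for name, pat in patterns.items() if pat.fullmatch(token)]
        if len(hits) > 1:
            conflicts[token] = hits
    return conflicts


patterns = {
    "group_size": re.compile(r"g(\d+)"),
    "dtype": re.compile(r"(int4|int8|uint4|uint8|e4m3|e5m2)"),
    # A deliberately loose pattern that overlaps with the dtype tokens.
    "loose_int": re.compile(r"int\d+"),
}
conflicts = find_conflicts(patterns, ["g32", "int8", "e4m3"])
```

Here only `int8` is reported, because both `dtype` and `loose_int` accept it. A test suite could assert that the result is empty for the registered patterns.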

Complete Example

To add support for a new "fast" parameter that enables faster computation:

# 1. Define processor function
def process_fast_mode(match: re.Match, quant_config: Type[AOBaseConfig]) -> Tuple[str, Any]:
    return "fast_mode", True

# 2. Add to param_patterns
ConfigParser.param_patterns[re.compile(r"fast")] = process_fast_mode

# Now you can use strings like "int8wo_g32_fast"

Open Questions

1. Parameter Selection and Naming

  • Which parameters should be exposed in the string interface?
  • Should we prioritize brevity (g32) or clarity (group32)?
  • How to handle parameters only applicable to specific base types?

2. Parameter Validation

  • How to handle conflicting parameters? For instance, two parameter patterns may want similar string representations. I sidestepped this by passing the matched quant config into each processor.
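
The token table lists the data type token as mapping to `dtype` or `weight_dtype`, which is one place where passing the config class helps. A minimal sketch, assuming the configs are dataclasses: `ConfigA`, `ConfigB`, and `process_dtype` are hypothetical stand-ins, and the data type is kept as a plain string for simplicity.

```python
import re
from dataclasses import dataclass, fields
from typing import Any, Tuple, Type


@dataclass
class ConfigA:
    dtype: str = "int8"


@dataclass
class ConfigB:
    weight_dtype: str = "e4m3"


DTYPE_PATTERN = re.compile(r"(int4|int8|uint4|uint8|e4m3|e5m2)")


def process_dtype(match: re.Match, quant_config: Type[Any]) -> Tuple[str, Any]:
    """Pick the field name that the target config class actually declares."""
    names = {f.name for f in fields(quant_config)}
    for candidate in ("weight_dtype", "dtype"):
        if candidate in names:
            return candidate, match.group(1)
    raise ValueError(f"{quant_config.__name__} does not accept a data type token")


match = DTYPE_PATTERN.fullmatch("e5m2")
```

The same token then resolves to `dtype` for `ConfigA` and to `weight_dtype` for `ConfigB`, and a config with neither field gets an explicit error instead of a silent mismatch.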

3. Default Values

  • How to communicate defaults to users?
  • We are currently relying on the fact that most (all?) configs have defaults and just utilizing those
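
If the configs are dataclasses, one option for communicating defaults is to read them off the class and show them next to the string form. A sketch under that assumption; `DemoConfig` and `describe_defaults` are hypothetical names.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class DemoConfig:
    """Stand-in config; not a real TorchAO class."""
    group_size: int = 128
    mapping_type: str = "asym"
    layout: Optional[str] = None


def describe_defaults(config_cls: type) -> Dict[str, Any]:
    """Collect the default value of every field that declares one."""
    defaults: Dict[str, Any] = {}
    for f in fields(config_cls):
        if f.default is not MISSING:
            defaults[f.name] = f.default
        elif f.default_factory is not MISSING:
            defaults[f.name] = f.default_factory()
    return defaults
```

An error message or a help listing for a base type could print this dictionary so users can see what a bare string such as the base type alone expands to.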

4. Extensibility

  • I think third-party users might want to, and should be able to, extend or build on this. Does that make sense?
  • What's our deprecation strategy for parameters/base types?

5. Backward Compatibility

  • How to evolve the string format over time?

Parsing Flow Diagram

flowchart TD
    Start([Input String]) --> Split[Split by underscores]
    Split --> FirstToken[Extract First Token]
    FirstToken --> ConfigType[Map to Config Class]
    Split --> RemainingTokens[Process Remaining Tokens]
    
    RemainingTokens --> ParamLoop{For each token}
    ParamLoop --> MatchPatterns[Match against regex patterns]
    MatchPatterns --> ProcessMatch[Process match using\nappropriate processor]
    ProcessMatch --> ParamDict[Add to parameters dictionary]
    ParamDict --> ParamLoop
    
    ConfigType --> Instantiate[Instantiate Config Class]
    ParamLoop -- All tokens processed --> Instantiate
    Instantiate --> ValidateFields[Validate Fields]
    ValidateFields --> End([Return Config Object])
    
    ValidateFields -- Invalid params --> ErrorHandling[Error Handling]
    MatchPatterns -- No match --> TokenError[Unrecognized Parameter Error]

stack-info: PR: #1774, branch: drisspg/stack/39

pytorch-bot bot commented Feb 25, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1774

Note: Links to docs will display an error until the docs builds have been completed.

❌ 7 New Failures

As of commit ed64709 with merge base 38e36de (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label Feb 25, 2025
@drisspg added the "topic: new feature" and "enhancement" labels Feb 25, 2025
@drisspg marked this pull request as draft February 25, 2025 05:17
@vkuzo
Contributor

vkuzo commented Feb 25, 2025

Didn't read the code yet, but a couple of high level questions:

  1. can we handle nested params? FooConfig(BarConfig(baz=3))?
  2. can we handle tensors as arguments StaticQuantConfig(scale=torch.tensor(1.0, device="cuda"))
  3. can we put this logic on AOBaseConfig so all the configs get it for free (I did no research on this, but if this is possible in a clean way it would be nice)

@andrewor14
Contributor

My high-level thoughts are:

  • The point of a string API is to have quick shorthands for common default configurations, e.g. int4wo. If users want to configure different parameters they should just use the config objects instead.
  • Encoding parameters in the string like int8dqint4_g32_sym gets complicated quickly. E.g. does "gs32" or "sym" refer to weights or activations? What if we want asymmetric activations + symmetric weights, what does "sym" mean in this case? Also then you don't need the complex parser. IMO the complexity is just not worth it
  • What strings do other frameworks use? E.g. I've seen W4A8 used in a lot of places. Should we follow that format or do our own thing here? We could also have multiple strings map to the same config

@vkuzo
Contributor

vkuzo commented Feb 25, 2025

> The point of a string API is to have quick shorthands for common default configurations, e.g. int4wo. If users want to configure different parameters they should just use the config objects instead.

I'd change that to "one benefit, from several, of a string API is quick shorthand...". One other benefit is serialization/deserialization, i.e. being able to go between strings and Python objects.

@drisspg
Contributor Author

drisspg commented Feb 25, 2025

@vkuzo

> can we handle nested params? FooConfig(BarConfig(baz=3))?

Not as implemented, unless you define your param matcher so that it clearly disambiguates the two.

> can we handle tensors as arguments StaticQuantConfig(scale=torch.tensor(1.0, device="cuda"))

You could write a param matcher to do this, but it feels kinda weird, or at least I don't really know a good notation for doing this besides a few special matrices.

> can we put this logic on AOBaseConfig so all the configs get it for free (I did no research on this, but if this is possible in a clean way it would be nice)

I went back and forth on this. We could enforce that AOBaseConfig subclasses define their "string_form".

@andrewor14

> The point of a string API is to have quick shorthands for common default configurations, e.g. int4wo. If users want to configure different parameters they should just use the config objects instead.

I very much agree w/ this, which is why I only added a few params. It's very easy to add 0 params and just enforce the string map, so you only get default values.

> Encoding parameters in the string like int8dqint4_g32_sym gets complicated quickly. E.g. does "gs32" or "sym" refer to weights or activations? What if we want asymmetric activations + symmetric weights, what does "sym" mean in this case?

Agree, I played w/ having "wSym", "wAsym", "aSym", "aAsym". +1 on complexity.

> What strings do other frameworks use? E.g. I've seen W4A8 used in a lot of places. Should we follow that format or do our own thing here? We could also have multiple strings map to the same config

Yeah, I listed this as a follow-up. Not totally sure what you mean by multiple strings mapping to the same config. As stated here, that is possible since we don't enforce a param order. But if we just had base_config + dtypes, then I think we should not allow multiple strings.
