Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Affine Units Functionality #159

Open
wants to merge 28 commits into
base: main
Choose a base branch
from
Open

Conversation

Deduction42
Copy link
Contributor

This adds a new type of Dimension (AffineDimension) that can be used to describe units with offsets (like Celsius and Fahrenheit). Like most other packages, operations on AffineDimensions are forbidden except subtraction. This is done by throwing an error on convert if the offset is zero; operations are allowed if the offset is zero (by using promote/convert to SI units).

This comment was marked as resolved.

@MilesCranmer
Copy link
Member

Thanks! Can you walk me through the implementation a bit? How does it work, what are the fields used for, etc?

@MilesCranmer
Copy link
Member

And FYI I would like to get things to 100% unit-test coverage if possible (particularly important as I'm unfamiliar with the code). My expectation is that there will be many unexpected errors that will pop out when going through this process.

@MilesCranmer
Copy link
Member

I think the Aqua and JET tests will likely take a bit of work to get passing but are also important

@Deduction42
Copy link
Contributor Author

Sure! As for how it works. Almost all of the changes are in "affine_dimensions.jl", which somewhat mirrors the patterns found in "symbolic_dimensions.jl" (or as much as reasonably possible). The basic structure is the AffineDimension <: AbstractAffineDimension <: AbstractDimension which has the following fields:

@kwdef struct AffineDimensions{R} <: AbstractAffineDimensions{R}
    scale::Float64 = 1.0
    offset::Float64 = 0.0
    basedim::Dimensions{R}
    symbol::Symbol = :nothing
end

The scale and offset parameters are used to convert this unit to its SI equivalent. So the SI quantitiy is basically

si_quantity = affine_quantity*scale + offset

This is essentially what uexpand(affine_quantity) does when it converts things to SI dimensions. The symbol field is optional and is for displaying units. If no symbol is present, it will merely display (scalse+offset basedim). For example:

kelvin  = AffineDimensions(scale=1.0, offset=0.0, basedim=u"K")
rankine = AffineDimensions(scale=5/9, offset=0.0, basedim=kelvin)
fahrenheit = AffineDimensions(scale=1.0, offset=459.67, basedim=rankine)
>> (0.5555555555555556+255.37222222222223 K)

But if we used

fahrenheit = AffineDimensions(scale=1.0, offset=459.67, basedim=rankine, symbol=:°F)
>>°F

@Deduction42
Copy link
Contributor Author

Generally speaking, it's not encouraged to do mathematical operations in affine units because of potential ambiguities present in the offset. Promotion rules will attempt to convert affine quantities into basic dimensional (SI) quantities. However, the function "convert" will fail if the offset is not zero (I allow typical multiplication/division operations for affine units with zero offsets in order to make affine unit macros work with compound units like m/s).

The affine unit marcro is "ua" and I registered the Celsius and Fahrenheit units. So you could write

 uconvert(ua"°C", 0ua"°F")
>> -17.77777777777777 °C

Becuase AffineDimensions are more general than Dimensions and SymbolicDimensions, registering a unit will also make it available to the "ua" macro, and I registered all items in UNIT_SYMBOLS and UNIT_VALUES off the bat, so you could write:

velocity = ua"mm/s"
>>1.0 (0.001 m s⁻¹)

This generality is so that you could make a single Type-Stable container (like a dictionary or vector) that represents both Affine and Non-Affine units and easily convert them all to SI or Symbolic units. Note however, compound non-affine units break, becuase offsets cannot be multiplied:

ua"°C/K"
ERROR: AssertionError: AffineDimensions °C has a non-zero offset, implicit conversion is not allowed due to ambiguity. Use uexpand(x) to explicitly convert
Stacktrace:

@Deduction42
Copy link
Contributor Author

Deduction42 commented Feb 1, 2025

Finally, there is a special macro for registering AffineDimensions only, because the regular registration macro doesn't handle offsets. Registration will apply to anything that can be evaluated as a dimension or quantity. For example, you could do:

@register_unit psi 6.89476us"kPa" #Regular registration method for PSI
@register_affine_unit psig AffineDimensions(offset=u"Constants.atm", basedim=u"psi") #Gauge-pressure subtracts atmospheric

uconvert(ua"psig", u"Constants.atm")
>> 0.0 psig

Generally, when building AffineDimensions, you want to be explicit about your "offset" units. If you use a numeric value, it will assume the same units as "basedim" which could be a quantity.

AffineDimensions(offset=1, basedim=0.5u"m") #Offset units are assumed to be half-meters "0.5m"
>>(0.5+0.5 m)

This may seem odd until we account for the fact that we could register half-meters so

@register_unit half_meter 0.5u"m"
AffineDimensions(offset=1, basedim=u"half_meter")
>>(0.5+0.5 m)

In such a case, it would be "least astonishing" that in the absence of units, the offset is in the same units as the base dimensions because no other units are mentioned. Building AffineDimesnions from other AffineDimensions will use the basedim's offset as a starting point, so you could do something like this:

kelvin  = AffineDimensions(scale=1.0, offset=0.0, basedim=u"K")
rankine = AffineDimensions(scale=5/9, offset=0.0, basedim=kelvin)
fahrenheit = AffineDimensions(scale=1.0, offset=459.67, basedim=rankine)
celsius = AffineDimensions(scale=9/5, offset=32, basedim=fahrenheit)

uconvert(Quantity(1.0, fahrenheit), Quantity(-40.0, celsius))
>>-39.99999999999996 °F

@MilesCranmer
Copy link
Member

MilesCranmer commented Feb 1, 2025

Quick question - why do affine dimensions need to support algebraic operations on non-offset affine dimensions? If it's non-offset, why not just force the user to immediately convert to SI units if they want to use such operations?

@Deduction42
Copy link
Contributor Author

Anyway, I'll start writing the tests for this on Monday, I just wanted to make sure you were onboard with adding this before I put the effort into it. I was pleasantly surprised that this took me less than a week to do; your package has some fairly elegant ways to plug in this feature (although these plug-in points are a bit hard to find). To reiterate, almost all of the changes are in the new "affine_dimensions.jl" file I added. I also had to make some changes to register_units.jl to add automatic registering of any units to the AffineUnits registry, and make a separate macro for registering AffineDimensions only.

I do have some questions:

  1. In the "test coverage" tools, is there any way to identify what lines are missing from the coverage? Or is this a game of educated whack-a-mole?
  2. I noticed in "register_units.jl" that you use a SpinLock(), is there any reason you're using this rather than the more common (and more recommended) ReentrantLock()?
  3. Last time I contributed, I was able to run all the tests by just running "run_tests.jl", but this time, there's a bunch of test-related dependencies. Is there a standard way of loading this environment before doing the tests, or do I have to just install all the test dependencies myself?

@Deduction42
Copy link
Contributor Author

Deduction42 commented Feb 1, 2025

Quick question - why do affine dimensions need to support algebraic operations on non-offset affine dimensions? If it's non-offset, why not just force the user to immediately convert to SI units if they want to use such operations?

The only reason I do this is so that I can parse units like "ua"m/s"" as an Affine Unit. However, I could just as well do an automatic promotion to SI units and then evaluate the compound unit as Affine in the end if that's something you would prefer.

@MilesCranmer
Copy link
Member

This is the command I usually run at the root of this repository:

julia --startup-file=no --depwarn=yes --threads=auto --code-coverage=user --project=. -e 'using Pkg; Pkg.test(coverage=true)'
julia --startup-file=no --depwarn=yes --threads=auto coverage.jl

Then I use coverage gutters in VSCode https://marketplace.visualstudio.com/items?itemName=ryanluker.vscode-coverage-gutters which highlights the missing lines inside the IDE.

@MilesCranmer
Copy link
Member

MilesCranmer commented Feb 1, 2025

  • I noticed in "register_units.jl" that you use a SpinLock(), is there any reason you're using this rather than the more common (and more recommended) ReentrantLock()?

ReentrantLock lets the same thread acquire a lock multiple times and I didn't see any use of that in this context. Also I expect the SpinLock to basically never get hit, it's just for the case where someone actually tries to register two units simultaneously, which I would expect to be exceedingly rare.

@Deduction42
Copy link
Contributor Author

Excellent, thanks!

@Deduction42
Copy link
Contributor Author

Deduction42 commented Feb 4, 2025

Good news! I managed to get tests passed with 100% code coverage. I will start working on the documentation.

Quick question: I noticed that uexpand(q) doesn't work if q has regular SI dimensions (not symbolic). I didn't want to extend this function:

uexpand(q::UnionAbstractQuantity{<:Any, <:Dimensions}) = q

but I did need a function that could do this for arbitrary quantities (in order to reduce repetition and over-specialization) so I created siunits(q) as the "catch all" for this.

siunits(q::UnionAbstractQuantity{<:Any, <:Dimensions}) = q
siunits(q::UnionAbstractQuantity{<:Any, <:AbstractSymbolicDimensions}) = uexpand(q)
function siunits(q::Q) where {T, R, D<:AbstractAffineDimensions{R}, Q<:UnionAbstractQuantity{T,D}}
    return force_convert(with_type_parameters(Q, T, Dimensions{R}), q)
end
siunits(q::QuantityArray) = siunits.(q)

I extended uexpand(q) to work with AffineDimensions, but I'm wondering if we should

  1. just extend uexpand(q) for regular Dimensions and remove siunits(q)
  2. expose siunits(q) as a catch all or
  3. just leave siunits(q) unexposed and uexpand(q) unextended.

.vscode/settings.json Outdated Show resolved Hide resolved
lcov.info Outdated Show resolved Hide resolved
test/runtests.jl Outdated Show resolved Hide resolved
test/affine_tests.jl Outdated Show resolved Hide resolved
@Deduction42
Copy link
Contributor Author

Okay, I think I managed to address your corrections and added some documentation on these features, both in the docs and in the Readme.md file. Let me know what you think.

@MilesCranmer
Copy link
Member

Can there be aliases for other common ways to write temperature, like degC?

Also what other affine units are out there that are common? I think it would be useful to have a couple others, just to ensure the structure itself captures all cases other than just temperature.

@Deduction42
Copy link
Contributor Author

Deduction42 commented Feb 4, 2025

Can there be aliases for other common ways to write temperature, like degC?

Yes, that's totally possible. I was simply following the Unitful.jl convention for the label. Did you want me to add "degC" and "degF" to the default registry?

Also what other affine units are out there that are common? I think it would be useful to have a couple others, just to ensure the structure itself captures all cases other than just temperature.

Honestly, it's almost always temperature because its the only dimension where we didn't have an obvious notion of "zero" when we started measuring it. If we did, this whole mess could have been averted. Case and point, temperature units are the only default affine units in Unitful.jl

In some cases we use "gauge pressure" which is pressure relative to atmosphere; this is sometimes denoted using a "g" at the end as "kPag" or "psig" but the notations for those aren't exactly standard. I've seen a lot of variation in how these units are represented (it's almost as bad as the hideous "standard cubic feet" that I often run into in the oil and gas sector). Because of this variation, I might prefer leave it up to the user to register it (the documented examples use "psig" for this reason). I just added the psi example to the tests (it's a derived affine unit and in the docs, so it's definitely a good idea to test). Just let me know if you want "degC/degF" in the default registry or if you want to just stick with the Unitful.jl convention.

@Deduction42
Copy link
Contributor Author

Deduction42 commented Feb 5, 2025

I didn't get a response so I went ahead and registered "degC" and "degF" as default aliases. Let me know if there's any other blockers for merging.

@Deduction42
Copy link
Contributor Author

Hello @MilesCranmer, just checking in to see if there are any other blockers for merging. I just added a small update to the pull request to ignore symbols when checking for affine dimension equality.

Some of my projects that use this package are starting to get some traction in my organization, and I'd really like to be able to simplify the workflow with this PR before bringing more people onboard with these projects. DynamicQuantities has the simplest and most expressive string parsing and unit registry tools of all the packages I've seen; I'm currently handling affine units with the Unitful.jl extension, and the kludges I had to use to parse affine unit strings in CSV config files are an unsightly, type-unstable abomination.

@MilesCranmer
Copy link
Member

The src/affine_dimensions.jl is still a bit large in my view, is there any way to simplify the implementation a bit? Or make use of existing code?

@Deduction42
Copy link
Contributor Author

Deduction42 commented Feb 10, 2025

The biggest simplification would be to allow uexpand to work on Quantities with dimension (where it just returns the value). A lot of the conversion code funnels SymbolicDimensions |> Dimensions |> AffineDimensions so having a universal simple function to convert any dimension into Dimensions would be nice. I currently defined a function siunits that's basically just this, a uexpand that works on regular dimensions.

I really tried to simplify this to the maximum extent possible. My main goals of this PR were:

  1. Support creating AffineDimensions from regular Dimensions
  2. Support registering/parsing of AffineUnits
  3. Support unit conversion from AffineUnits
  4. Make sure that when people try to do operations on Quantity{T,<:AffineDimensions}, they get useful errors telling them not to

Moreover, I wanted to do this through maximizing code reuse, minimizing new code, and following the package's existing conventions where possible. Unfortunately, most of the code's complexity is a direct consequence of the design decisions of the rest of the package.

Lines 15-63 deal with the fact that the macro u"kg/s" gives a quantity, not a dimension. Thus, if I'm building an AffineDimension, I have to support using quantities as a dimensional input.

Lines 90-106 deal with the fact that uexpand doesn't work on Quantities with Dimensions

Lines 132-158 are basically pulled from symbolic_dimensions.jl and deal with the fact that there are numerous Quantity types. If you can think of a way to simplify this, I'd be open to hear it.

Lines 160-175 is basically telling the promotion rules to just convert any AffineDimensions to Dimensions. If they have a non-zero offset, it will throw an error explicitly telling people to convert to non-affine units. It's a way of catching cases where people are trying to perform mathematical manipulation of affine quantities.

Lines 180-216 are the raison d'etre of this pull request, allowing you to convert to and from AffineDimensions.

Lines 218-265 deal with the fact that map_to_dimensions doesn't account for scale, outside actual quantities. Supporting these operations is necessary for parsing affine units in a type-stable way.

Lines 267-282 basically throw errors when adding two numbers with the same affine dimensions. This is necessary to prevent silently returning potentially wrong results.

Lines 293-436 is a mirror image of the SymbolicDimensions unit registry and units parsing. Unfortunately, a separate registry is required because AffineDimensions are more general than SymbolicDimensions. This means a new macro is required for registering units, but this macro doesn't affect the other registries.

@MilesCranmer
Copy link
Member

Thanks for talking me through this!

For many of these that are replicas of the original code, I would definitely be open to refactoring both the replica and the new copy to be instances of a more general code pattern, so long as it's still fairly clean. Like, for example:

Lines 293-436 is a mirror image of the SymbolicDimensions unit registry and units parsing. Unfortunately, a separate registry is required because AffineDimensions are more general than SymbolicDimensions. This means a new macro is required for registering units, but this macro doesn't affect the other registries.

Here, maybe there's a way to refactor both so they can be an instance of a single registry somehow? i.e., I would like to avoid needing to make "shotgun surgery" in the future (want to guard my overall maintenance load). Especially since this feature by itself is something I would very rarely if ever use, I want to make sure that I can still dogfood the relevant code. If the code is shared with functionality I do use, it's a lot easier for me to hit it from my normal use in other contexts.

@Deduction42
Copy link
Contributor Author

I think the biggest opportunity to reduce "shotgun surgery" by refactoring is in lines 398-423.

map_to_scope and lookup_units are the biggest culprits. They make use of different module-level constants. If I was able pass a "registry" argument like this

map_to_scope(ex::Expr, registry::Dict{Symbol, <:Quantity})

I could import this function independent of any module. I'm also wondering why you're not looking things up in a dictionary directly. Dictionaries are slow when looking things up in small arrays, but the registry for Symbolic is likely big enough to make dictionaries more efficient. Moreover, you're doing the lookup twice. For example in symbolic_dimensions.jl Lines 496-505, you have

    function map_to_scope(sym::Symbol)
        if sym in UNIT_SYMBOLS
            # return at end
        elseif sym in CONSTANT_SYMBOLS
            throw(ArgumentError("Symbol $sym found in `Constants` but not `Units`. Please use `us\"Constants.$sym\"` instead."))
        else
            throw(ArgumentError("Symbol $sym not found in `Units` or `Constants`."))
        end
        return lookup_unit(sym)
    end

The first branch iterates through all of UNIT_SYMBOLS and then lookup_unit is then called again. I'm not sure why all the fuss is done to find the element in UNIT_SYMBOLS rather than just looking things up in the dictionary like this:

    function map_to_scope(sym::Symbol, unitregistry::Dict{Symbol, <:Quantity}, constregistry::Dict{Symbol, <:Quantity})
        q = get(unitregistry, sym) do
            if haskey(constregistry, q)
                throw(ArgumentError("Symbol $sym found in `Constants` but not `Units`. Please use `us\"Constants.$sym\"` instead."))
            else
                throw(ArgumentError("Symbol $sym not found in `Units` or `Constants`."))
            end
        end
        return q
    end

Now AffineDimensions only has a unit registry, no constants, so it would look more like

    function map_to_scope(sym::Symbol, unitregistry::Dict{Symbol, <:Quantity})
        q = get(unitregistry, sym) do
                throw(ArgumentError("Symbol $sym not found in `AffineUnits` registry."))
        end
        return q
    end

Either way, this would eliminate the need for having two constants "UNIT_SYMBOLS" and "UNIT_VALUES" as well as the need for multiple lookups.

@Deduction42
Copy link
Contributor Author

The only thing is that the above refactoring would require quite a bit of effort. Shotgun surgery could go both ways as well. Because this parsing utility is independent of SymbolicDimensions (which is arguably more complex), then updating that codebase is less likely to break something here. In fact, you probably could refactor your part of the code that way and then I could follow suit later if you like.

I really don't mind maintaining this. I already contributed to this repo half a year ago to improve performance of Unitful.jl integration. I'm also going to be putting this functionality through the ringer on some really big projects over the next couple of months, so if anything goes wrong, I'll likely be the first to hear about it and fix it.

@MilesCranmer
Copy link
Member

Thanks for the reassurance, that is good to hear and makes me much more positive about the increased code from this! Ok, I will try to take a look later this week. This week in particular is a bit insane for me; are you ok with waiting until this weekend?

@Deduction42
Copy link
Contributor Author

No problem, this weekend works for me! Seriously though, thank you so much for your contributions. This package and your juliacall Python package patterns really helped me a lot this year!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants