Comments on Matrix Autodiff Type #32
A user could do this by printing the transformed MIR from stanc3.
Your understanding is correct, it would propagate as `matrix<var>`.
Yeah I'd be fine with having that exposed.
I think that would be friendly enough for the more advanced users who would be digging into these optimizations, in which case it's more of a documentation concern. A good opportunity for a nice section in the manual, users' guide, or even Discourse about effective use of Stan's linear algebra library.
I totally understand why this would be hard for the compiler logic (although a great example of why Stan3 sets the stage for all kinds of potential sophisticated optimizations going forwards). My only concern is that there are likely to be enough exceptions where advanced users would know safe places to cast that the functionality would be beneficial.
The more warnings the better!
I do not think we should add a to_varmat() function to the language, primarily because it's using the arbitrary name given to a class in Stan math rather than something in the user's language.
I agree —I was imagining `to_varmat()` as just a placeholder for something more interpretable to users, or at least advanced users.
I would much prefer to add a notion of const variable to Stan. So that'd be something like:
const matrix[M, N] Sigma = my_funky_gm_covariance(...);
Then it would be illegal to subsequently reassign elements `Sigma[m, n]`. This is more restrictive than we need---there's no problem with reassigning all of `Sigma`, it's just the elements that are problematic. But I think that'll be OK. It'd also support meaningful error messages for users trying to reassign.
I'd even be OK with const_matrix as a unified type if we don't want to do this for other types, but we probably want to allow anything to be declared const if we're going to allow const.
From the compiler perspective, any matrix that is never assigned to again can be assigned a const type. This is just like const in C++. I don't see much room for useful user choice here. I guess for a matrix that doesn't ever get used for arithmetic or linear algebra, it would be wasteful to eagerly reassign to varmat.
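For concreteness, here is a minimal sketch of how such a `const` qualifier might read in a Stan program. This is hypothetical syntax (no `const` exists in Stan today), and the arguments passed to `my_funky_gm_covariance` are placeholders assumed to be declared elsewhere:

```
transformed parameters {
  // Hypothetical 'const' qualifier on a declare-and-define statement.
  const matrix[M, N] Sigma = my_funky_gm_covariance(x, alpha, rho);

  // Would be rejected under a C++-style const, even though whole-matrix
  // reassignment is not actually a problem for varmat:
  // Sigma = rep_matrix(0.0, M, N);

  // The case that actually matters: element-level reassignment, which is
  // what forces the compiler back to matrix<var>:
  // Sigma[1, 1] = 0.0;
}
```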
The design doc discusses reasons against adding a new type or type annotation to the language and I think that they’re reasonable.
Really this is something more like a compiler hint or a specifier in the C++ parlance that lets the compiler know that the user thinks that it's supposed to be safe to be more aggressive in using varmat. Maybe something like
```
matrix[N, N] Sigma = my_funky_gm_covariance(…) encase;
```
that would parse only for inline matrix declarations.
I'd be fine with adding a new type to Stan for these, but tbh the biggest thing is what the gosh dang we should name these! It's really just moving from an array of structs to a struct of arrays. `const` is okay and the sparse matrix prototype in the Stan compiler has the doo-dads to figure out when a const matrix is being assigned to illegally. Though I kind of worry with a `const` keyword that people will start writing their programs in weird ways so that they can trick the compiler to give them the new matrix type, which is why I'd rather keep it 99% a thing the compiler just does and then have some weird esoteric function/keyword for advanced users.
I strongly agree. Even mathematically the difference between a matrix as a representation of a linear transformation and a matrix as a collection of numbers is subtle, and there’s no name that unambiguously denotes the difference. `const` isn’t right because the matrix itself isn’t immutable, it just can’t be changed element by element. `encase`, `encapsulate`, and the like capture the fact that varmat encapsulates all of the data together, but those words also have other connotations that can confuse users.
In some sense the most precise thing to do would be to not allow element wise access of matrix types _at all_, and instead require users to cast to and from arrays when necessary; then matrix types are the efficient linear algebra types and arrays are the collections of numbers. This would, however, require nearly everyone to rewrite all of their Stan programs.
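For concreteness, that cast-based style is already expressible with functions Stan has today (`to_array_2d` and `to_matrix`); a rough sketch, assuming `x`, `alpha`, `rho`, and `N` are declared elsewhere and the kernel and variable names are only illustrative:

```
transformed parameters {
  // Matrix type reserved for efficient linear algebra.
  matrix[N, N] K = gp_exp_quad_cov(x, alpha, rho);

  // Drop down to an array, i.e. a plain collection of numbers, for
  // element-wise edits such as adding jitter to the diagonal ...
  array[N, N] real K_elems = to_array_2d(K);
  for (n in 1:N) {
    K_elems[n, n] += 1e-8;
  }

  // ... then cast back to a matrix for linear algebra.
  matrix[N, N] L = cholesky_decompose(to_matrix(K_elems));
}
```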
If only because of the naming problems I am very okay with this approach. In particular if this functionality can be kept as an “advanced” feature then I would hope that we wouldn’t have to be too precious about the naming since it would require users to know that underlying C++ context already. Maybe even something somewhat ambiguous like “to_fast_matrix()” that avoids explicit references to the C++ but would be easy to look up in the docs.
Let me flip the table, what if we gave it a bad name on purpose so it looks smelly? I kind of like exposing `to_var_matrix()`, then we have in the docs something like

```
soa_matrix to_var_matrix(aos_matrix)
soa_matrix to_var_matrix(soa_matrix)
```

WARNING: This function is intended only for advanced users.

Takes in the standard Stan matrix type as an array of structs (aos) and forces the compiler to convert it to the struct of arrays (soa) format (aka from an `Eigen::Matrix<var, R, C>` to a `var_value<Eigen::Matrix<double, R, C>>`). This function can cause the Stan compiler to produce code that will not compile. This function has O(N^2) cost.

I like that because it sounds kind of scary / technical, which should hopefully dissuade anyone that doesn't know what they are doing from using it. Though the only bad thing about not having a keyword is that this can't be used to make parameters default to `var<Matrix>`.
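As a rough call-site sketch only, assuming something like this hypothetical `to_var_matrix()` existed and reusing the placeholder covariance function from earlier in the thread:

```
transformed parameters {
  // Built element by element, so the compiler would keep it as matrix<var>.
  matrix[N, N] K = my_funky_gm_covariance(x, alpha, rho);
  // Hypothetical cast: everything downstream can then stay in varmat form.
  matrix[N, N] L = cholesky_decompose(to_var_matrix(K));
}
```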
If we don’t introduce yet another container type for users to choose between (varmat, matvar, and array) then I think this is the best way to allow advanced users to get around the limited compiler logic in the few exceptional patterns where it might make a substantial performance difference. Maybe even putting the function in its own reference guide section like “Compiler Helper Functions” so that users looking through the matrix functions don’t find it by accident.
Note the cost would be O(R * C).
This is in developer language, not user language. We could change the name, but I think having something that looks more like a compiler hint as @betanalpha is suggesting without adding a new type is the way to go.
I don't remember making that suggestion. I prefer "immutable" as the term for the new matrix structures. That's because we can allow holistic assignment, just not element-level assignment. I imagine something like this.
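A minimal sketch of the kind of declaration being imagined, with `immutable` as a hypothetical qualifier (not current Stan syntax) and the kernel and arguments chosen only for illustration:

```
transformed parameters {
  // Hypothetical 'immutable' qualifier.
  immutable matrix[N, N] Sigma = gp_exp_quad_cov(x, alpha, rho);

  // Allowed under this idea: holistic reassignment of the whole matrix.
  // Sigma = gp_exp_quad_cov(x, alpha2, rho2);

  // Disallowed, with a meaningful error message: element-level assignment.
  // Sigma[1, 1] = Sigma[1, 1] + 1e-8;
}
```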
This is a constraint on how a variable is used. It does not introduce a new type of variable. What's the state of functions accepting matrices? Have they all been updated to allow varmat arguments?
Just to clarify, we do allow element-level assignment to varmat as well. It's just a lot slower (2-4 times was the estimate from a benchmark @bbbales2 made). It's nontrivial for the compiler to "guess" when varmat would actually still be faster for a given expression even if it contains some element-level accesses, which is why we resorted to a conservative approach of falling back to matvar in these cases, because that guarantees the model will at worst have the same performance as it had in the last release. For example,

```
matrix[N, N] a;
matrix[N, N] b;
matrix[N, N] c = a * b;
a[10, 10] = a[10, 10] + 2.0;
```

would definitely be faster with varmat, but because `a` has element-level access, the compiler will make it a matvar. We have not figured out any simple and clear set of rules that would guarantee no performance regression and would also be easy to test. I am personally scared of any sort of heuristic rules in the compiler, but I am sure there are some optimization strategies that someone with more knowledge of compiler design can figure out in the future.
Not all of them, but a lot of them have. The compiler implementation (on a branch, nothing is on master yet) uses a list of varmat-supported functions to figure out when varmat can be used.
Another idea that does come to mind is that we would have a […]. This would work and be faster under the assumption that the user writes the model using vectorized functions wherever possible. My hunch is that once all or almost all functions support varmat, […]
Thanks for clarifying. But why allow that? It seems like it'd be a huge hassle on the math lib side and be really inefficient in the case of either a lot of assignments or big matrices, depending on how it's implemented. Is there a link to the benchmark somewhere? Where do I find the implementation of var that allows assignment? I somehow missed most of this discussion as it was going down.
Access is fine for getting values, just not for assigning.
How do we build up a covariance matrix for GPs if the only type we have is `var<matrix>`? I would think we'd need something like comprehensions.
The way to make that precise would be to claim that the most efficient way to write any model is using vectorized functions wherever possible. […]
By "we allow it", I mean it can be done in Stan Math. The current conservatively-envisioned compiler implementation would not allow it in Stan. For benchmarks see #29 (comment) and two posts below that. For row and column-wise see #30 (comment) (those are faster for varmat). For where to find all the parts of that implementation I am going to defer to @t4c1 @SteveBronder and @bbbales2. I would just point you to the wrong place :) |
Here's the benchmark of reading or writing scalars: link. There's another one that reads + writes below it. As an example of why accessing subsets is handy, here are benchmarks of similar operations on rows/columns: link. This is the implementation (could be other versions of this elsewhere, but this is one place where this is done). The routine is: on assignment to a varmat, add a vari to the stack that zeroes the adjoints and undoes the assignment in the reverse pass. I think this changes the behavior of doing two reverse passes without a zero, but I don't know of a reason to do that.
Slowly
The latest on that conversation is the question here and the response here. Edit: The short answer on comprehensions is that varmat ended up being useful without them, and I'm not actually sure how to do them in a way that would be much more efficient than the way we build GPs in Stan code now -- there'd still be tons of varis everywhere.
Relevant to this discussion, we're postponing the varmat release until 2.27 (check here).
A few comments for consideration in the matrix autodiff design document.
As much as the `var<matrix>` type enables better performance, it also greatly facilitates the implementation of matrix-valued reverse mode updates, as demonstrated in the matrix product example. The resulting code is not only easier to develop but also easier to read -- all in addition to likely being much faster. Perhaps worthy of mentioning in the summary.
Given the amount of compiler logic proposed it would be great to have a compiler option that returns an annotated Stan program indicating which variables were assigned `matrix<var>` and which were assigned `var<matrix>`. This would help users track down undesired casts down to `matrix<var>` and optimize their code without having to guess at the underlying compiler logic.

It would also be great to have some way to force a conversion from a `matrix<var>` to a `var<matrix>`, even if only with a function meant for advanced users. I'm thinking in particular of Gaussian process applications where custom Gram matrices are built element by element in a user-defined function, but afterwards are used only as monolithic matrices. My understanding of the proposed logic is that this would propagate a `matrix<var>` throughout the Stan program and miss out on potential performance unless there is some way to force a user-defined function to return a `var<matrix>` or to cast the return, for example.
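To make that scenario concrete, here is a rough sketch of the kind of user-defined Gram matrix function being described; the function name, kernel, and arguments are invented for illustration, and the comment before the return marks where the requested (currently nonexistent) forced conversion would go:

```
functions {
  matrix my_gram_matrix(array[] real x, real alpha, real rho, real jitter) {
    int N = size(x);
    matrix[N, N] K;
    // Element-by-element construction: under the proposed compiler logic this
    // keeps K as matrix<var>, and that choice propagates to everything downstream.
    for (i in 1:N) {
      for (j in 1:N) {
        K[i, j] = square(alpha) * exp(-0.5 * square((x[i] - x[j]) / rho));
      }
      K[i, i] += jitter;
    }
    // A forced cast to var<matrix> would go here, since the matrix is only
    // ever used monolithically after this point.
    return K;
  }
}
```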