
Pragmatic solution to parameter changes in standalone mode #1429

Merged – 40 commits merged into master from standalone_parameter_change on Aug 1, 2023

Conversation

@mstimberg (Member)

We have been discussing for a long time how we can make things like parameter explorations more convenient with standalone mode (see discussions about "encapsulation mode" #251). I recently thought about this a bit more, and came up with an intermediate solution that I'm really happy about. It has the advantage that 1) it already works with this PR and 2) it requires minimal changes to code and devices like brian2cuda. The basic idea is:

  1. Use command line arguments to the main binary to set variable values. A value can either be a scalar, or the name of a file containing the values (which makes it possible to pass whole arrays, something that would otherwise not be convenient via the command line).
  2. Make the results directory configurable via a command line argument.

The difference between 1) and what we discussed earlier (#251) is that I enabled this for arbitrary variables of the model – this removes the need to discuss syntax for declaring which variables should be changeable by a user (a Parameter class, etc.). The addition of 2) makes it immediately possible to use this for parallel simulations. The syntax in device.run (see examples below) still needs to be improved – transforming the initialization into command line arguments is of course not something the user should have to do themselves. Oh, and don't look too closely at the C++ code for now :)
The main thing that is missing functionality-wise for now is the ability to set arbitrary constants defined outside of equations. Your towards_encapsulation branch already had a change where instead of "freezing" the constants, it would put them as variables into a file. Applying this change here would make implementing this functionality almost trivial, but I'd like to leave this out of the very first version of this feature, since it requires bigger changes in devices like brian2cuda – you can work around this limitation by including a variable like tau : second (shared, constant) in your model and setting this instead of an external constant.

Examples

Here are a few things that are possible with these changes:

Run a simulation several times to get new random number instantiations, without recompiling the code (this was already possible before, just mentioning it here for completeness):

# "simple case" example which runs a model several times to
# explore the effect of randomness (initialization here) 
set_device('cpp_standalone')
tau = 10*ms
eqs = 'dv/dt = (1-v)/tau : 1'
G = NeuronGroup(10, eqs, method='exact')
G.v = 'rand()'
mon = StateMonitor(G, 'v', record=True)

# Normal run
run(100*ms)
results = [mon.v[:]]

for _ in range(8):
    # Do 8 more runs
    device.run()
    results.append(mon.v[:])

To run this in parallel, you previously had to change the project directory for each process, which meant that each process would compile everything again. With this PR, you can compile once and then run in different results directories (#1240):

set_device('cpp_standalone', build_on_run=False)
# ... same model as before

run(100*ms)
device.build(run=False)  # compile once

def do_run(run_number):
    device.run(results_directory=f'results_{run_number}')
    return mon.t[:], mon.v[:]

import multiprocessing
import numpy as np

p = multiprocessing.Pool()
results = p.map(do_run, np.arange(9))

And now the main change: setting an initial value via the command line. This will set the tau value (now included in the model as discussed above):

eqs = '''dv/dt = (1-v)/tau : 1
tau : second (constant, shared)
'''
# ...
# Normal run
run(100*ms)

results = [mon.v[:]]

for tau in np.linspace(5, 50, 8)*ms:
    # Do 8 more runs
    device.run(run_args=[f"neurongroup.tau={float(tau)}"])
    results.append(mon.v[:])

And by going via a file, you can set the initial values of v to whatever you like:

import os
import numpy as np

for index in range(8):
    # Save a file with the values
    fname = f"v_values_{index}.dat"
    (np.random.randn(10)*0.05 + index*0.1).tofile(os.path.join(device.project_dir, fname))
    device.run(run_args=[f"neurongroup.v={fname}"])
    results.append(mon.v[:])
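As an aside: `tofile` writes the values as raw float64 binary with no header or shape information, which is presumably the format the generated standalone code reads back. A quick round-trip, independent of Brian, illustrates the file format:

```python
import os
import tempfile

import numpy as np

# tofile() writes raw binary (float64 by default, no header), so the
# reader must know dtype and length in advance.
values = np.random.randn(10) * 0.05 + 0.1
with tempfile.TemporaryDirectory() as tmpdir:
    fname = os.path.join(tmpdir, "v_values_0.dat")
    values.tofile(fname)
    # Reading back with the matching dtype recovers the array exactly
    loaded = np.fromfile(fname, dtype=np.float64)

assert np.array_equal(values, loaded)
```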

The latter examples could also be run in parallel by combining this with the results_directory argument.

Again, not claiming that we should keep the run_args syntax (or make the user store the values to a file), but the great thing about this is that it already works and that it will work with minimal changes for brian2cuda, brian2genn, etc., since they leave the loading/saving values from/to disk to Brian. It should also be noted that this actually already provides a C++ API to change parameters for a run—admittedly an ugly one: you could link main.cpp and call something like main(2, {"mycall", "neurongroup.tau=0.05"}); from your C++ code...
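For illustration, the name=value strings are trivial to dispatch on. Here is a hypothetical Python sketch of the kind of parsing the generated main code could do – the helper name and the scalar-vs-filename heuristic are inventions for this sketch, not Brian's actual implementation:

```python
# Hypothetical sketch: split a run_args entry into its target variable
# and either a scalar value or the name of a file holding array values.
def parse_run_arg(arg):
    target, _, value = arg.partition("=")
    try:
        return target, float(value)  # scalar, e.g. "neurongroup.tau=0.05"
    except ValueError:
        return target, value         # file name, e.g. "neurongroup.v=v_values_0.dat"

scalar_arg = parse_run_arg("neurongroup.tau=0.05")
file_arg = parse_run_arg("neurongroup.v=v_values_0.dat")
```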

I feel like this feature (with a syntax that replaces setting run_args manually) would make a lot of people happy, and is actually feasible with minimal effort. What do you think, @thesamovar ?

@thesamovar (Member)

If this works, it is already a lot better than what I did in the towards_encapsulation branch, which is hopelessly out of date and presumably no longer rescuable.

We probably want to think about the syntax a bit, but that's fine. There's also the question of when these command line arguments get applied, which might introduce some issues?

There's still no solution I think to the problem of making number of neurons a variable that you can change, right? I think that needs a more fundamental change.

But I think this is worth doing. We want to have this functionality and if we wait for the perfect solution we'll never have it.

@thesamovar (Member)

I'm away for August so if we want to discuss syntax, can we do that tomorrow/Friday?


@mstimberg (Member Author)

There's also the question of when these command line arguments get run which might introduce some issues?

Currently, they set the variables just before the run starts, which I think is the best place, since it means they override anything that was set before.

There's still no solution I think to the problem of making number of neurons a variable that you can change, right? I think that needs a more fundamental change.

You are right, this would be a more fundamental change and is not supported with the solution I present here. We hardcode constants like this as unchangeable all over the code, so I don't see any quick solution.

I'm away for August so if we want to discuss syntax, can we do that tomorrow/Friday?

Happy to discuss here (or over video or something) any time this afternoon or tomorrow. But I will be away for August as well, so it won't change much 😊

@thesamovar (Member)

We could have a bit of syntax that allows you to choose where to set the values. That allows more flexibility, because the code can then compute derived values from them, e.g.

set_values_here()
G.tau_inh = 10*G.tau_exc
...
run(tau_exc=5*ms)

Let's discuss more in September if you're also away in August.

@mstimberg (Member Author)

We could have a bit of syntax that allows you to choose where to set the values. Allows more flexibility because then the code can do some derived values from that

Yes, worth discussing! I wonder whether it is worth the complication, given that in most cases you could do the same by simply setting dependencies explicitly, e.g. in your example

run(tau_exc=5*ms, tau_inh=50*ms)

A lightweight solution could also be to include this information in the run call and reuse the existing code slots (https://brian2.readthedocs.io/en/stable/user/computation.html#custom-code-injection). The default would be before_network_run, but you could also use after_start to set things before the usual initialization. Maybe in that case it would be good to have additional code slots, actually, since most of the existing ones are not very useful here (e.g. before_start would set values before they are initialized with zeros, and everything from after_network_run on wouldn't affect the simulation).

Let's discuss more in September if you're also away in August.

Sounds good, let's find a time over email.

mstimberg added a commit: "RTD runs Python 3.7, recent sympy versions don't support it"
@mstimberg (Member Author)

@thesamovar I think this is more or less ready to go – not quite mergeable yet, since the C++ code is still a bit convoluted, but let's make sure that we agree on syntax before I finalize things.

This now supports all the use cases I mentioned above: repeatedly running a simulation to get different random values, setting variables to scalars, and setting variables with arrays. Together with the new option to choose the results directory, this can also be used with several simulations running in parallel.

The current syntax does not use direct keyword arguments to device.run (e.g. device.run(tau_exc=5*ms)), but instead reuses the existing run_args. I don't think we can go with simple keyword arguments, since we have to specify the group as well. The syntax is not quite as ugly as above anymore, though: instead of writing device.run(run_args=['neurongroup.tau_exc=0.005']) (which would actually still work), you'd write device.run(run_args={G.tau_exc: 5*ms}).
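Presumably the dict form reduces to the same strings internally. A hypothetical sketch with plain floats – Brian's actual code additionally converts quantities like 5*ms into base units via its unit system, and the helper name here is invented:

```python
# Hypothetical helper: map {('neurongroup', 'tau_exc'): 0.005} to the
# string form ['neurongroup.tau_exc=0.005'] passed on the command line.
def dict_to_run_args(values):
    return [f"{group}.{var}={value}" for (group, var), value in values.items()]

args = dict_to_run_args({("neurongroup", "tau_exc"): 0.005})
```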

As I argued earlier, I like such a general mechanism that allows changing parameters without needing to change the existing source code (e.g. for brian2wasm). But I certainly also see your point about Brian's "be explicit" principle, i.e. making it clear in the model description which parameters are supposed to be changed later. Maybe in the long run there'd be room for both approaches? My feeling is that it would be good to release some mechanism soon, and then see what people use it for...

I triggered a documentation build for this branch for easier reading of the new docs.
The new syntax is explained here: https://brian2.readthedocs.io/en/standalone_parameter_change/user/computation.html#multiple-full-simulation-runs
Some notes on the implementation are here: https://brian2.readthedocs.io/en/standalone_parameter_change/developer/standalone.html#command-line-arguments

@thesamovar (Member)

It'll be a little while before I have time to do this unfortunately. But it's going on my priority to-do list.

@mstimberg (Member Author)

No worries. I've also used the opportunity to ask for feedback on the discussion forum.

@mstimberg mstimberg marked this pull request as ready for review July 24, 2023 16:21
@mstimberg (Member Author)

It's been a while (almost a year!). This is now ready to merge from my side; I'm quite happy with it and have already started using it myself (in particular for brian2wasm, but that's another topic). Not sure whether anyone will have time to look at the code in the near future, but at least some general feedback on the user-interface side would be great! Following the discussion on the discourse forum, I added two more features: you can now also set synaptic variables (which can effectively replace store/restore in some train/test scenarios), and the values of a TimedArray. I think it is reasonably well documented, and I also added/adapted a few examples to use this feature (potentially, tutorial 3 could also introduce it?).
@mstimberg (Member Author)

I am going to ignore our usual procedure here and merge this now as it is, without waiting for review (which would probably take a while to come in, with summer holidays and all that). This is a major addition, but I am quite happy with it, and I'd say it is fairly well tested and documented. One important reason for merging it now is that I'd like to get users to use/test this feature, which is easier if they can install the package from TestPyPI. It also simply becomes hard to keep such long-running PRs updated, which adds significant work. Of course, the fact that I am merging it now doesn't mean that you shouldn't have a closer look at it – please do! Let's consider this me making the case for post-publication review (or EAFP) 😉

@mstimberg mstimberg merged commit 616bfac into master Aug 1, 2023
50 checks passed
@mstimberg mstimberg deleted the standalone_parameter_change branch August 1, 2023 13:55
mstimberg added a commit to brian-team/brian2cuda that referenced this pull request Mar 22, 2024