Use TulipaIO instead of directly accessing the data
We need to generalize the data access and move the logic to TulipaIO. A large part depends on DuckDB-specific things (e.g., connection). Maybe we'll need to create a TulipaIO structure?
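As a rough sketch of what such a structure might look like (the names TulipaConnection and get_table are placeholders, not an existing TulipaIO API), assuming the only DuckDB-specific state we need to carry around is the connection:

```julia
using DuckDB, DBInterface

# Placeholder struct wrapping the DuckDB connection so that all data access
# goes through one object instead of the graph.
struct TulipaConnection
    con::DuckDB.DB
end

TulipaConnection() = TulipaConnection(DBInterface.connect(DuckDB.DB))

# Hypothetical helper: run a query through the wrapped connection.
get_table(tc::TulipaConnection, table_name::AbstractString) =
    DBInterface.execute(tc.con, "SELECT * FROM $table_name")
```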
Remove the dependency on data stored in the graph
In preparation to remove this data from the graph, we can initially just use the data from some other place.
Remove the data from the graph
Don't save data in the graph. After the issue "Remove the dependency on graph data", this will be simple
Create all data and tables necessary for the model before the create_model function #885
We can control and improve both the creation of the data/tables and the create_model function better if these concerns are separated.
This includes the "expression" data, e.g., data that informs the incoming and outgoing flow.
This includes clustering stuff (representative periods, timeframe, years, etc.)
Partitions should also be created before the model, though we are missing the "scenario specification" data as a separate entity
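A sketch of the intended order of operations; every function name except create_model is a placeholder for whichever function ends up owning that step, and the call signature of create_model is also illustrative:

```julia
using DuckDB, DBInterface

connection = DBInterface.connect(DuckDB.DB)

# Hypothetical preparation steps: every table the model needs exists before
# create_model is called, and create_model only reads them.
load_input_tables!(connection)        # raw scenario data (placeholder name)
create_cluster_tables!(connection)    # representative periods, timeframe, years (placeholder name)
create_partition_tables!(connection)  # asset/flow partitions (placeholder name)
create_expression_tables!(connection) # incoming/outgoing flow information (placeholder name)

model = create_model(connection)      # no table creation inside create_model
```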
Define variable tables #884
The way that we control the indexing - to prevent sparsity issues - is to precompute the indexes of the variables. The way that we are doing it right now is to compute tables for these indexes. Assuming that keeps being the case, we have to define what is necessary for these tables and define their names.
These tables will probably also save the final result, so they are used outside of the model as well.
If we use these as DuckDB tables: we currently have ranges like 1:3 in a column, but we would have to change that to two columns (?), e.g., 1 and 3.
Maybe it makes sense to keep these tables (only) as DF, so we keep storing ranges and possibly JuMP variables in them
Create a TulipaVariable struct that stores the DuckDB table with the indexes (i.e., each row corresponds to an index) and also stores a Vector of JuMP variables (see the sketch after these items)
Move the construction from construct_dataframes to compute_variables_indices
Change the many places that access the variables through dataframes to instead unpack x = model[:x], or to use variables. We should try both strategies and see what makes more sense
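A minimal sketch of the struct from the items above (field and table names are assumptions), with the time block stored as two integer columns instead of a 1:3 range, as discussed:

```julia
using JuMP

# Sketch: `indices_table` names a DuckDB table with one row per variable index,
# e.g. columns (from_asset, to_asset, year, rep_period, time_block_start,
# time_block_end) rather than a single time_block = 1:3 range column.
# `container[i]` is the JuMP variable for row i of that table.
mutable struct TulipaVariable
    indices_table::String
    container::Vector{VariableRef}
end

# Hypothetical usage after the indexes are computed:
# flow = TulipaVariable("var_flow", VariableRef[])
# flow.container = [@variable(model, lower_bound = 0.0) for _ in 1:num_rows]
```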
Define (balance) constraints tables #927
The way that the (balance) constraints work is by asset. The table defines the necessary information for each of these assets. In the current implementation, one of the columns of this table is the JuMP expression with the incoming flow (and one for the output, and possibly more for extra stuff).
Constructing the incoming and outgoing flows is a huge issue. It was slow and we had to figure out a nice way to do it.
Furthermore, if we want to use an external table format, we can't save the expressions in the table. So the format will need to be changed.
Whatever we do, we must be sure that it doesn't compromise performance.
ConstraintPartition struct
Table: asset, clustering (rp, tf, year), time blocks
Vectors for incoming, outgoing
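A sketch of that struct with hypothetical field names; since JuMP expressions cannot live inside a DuckDB table, the incoming and outgoing expressions sit in vectors aligned row-by-row with the table:

```julia
using JuMP

# Sketch: one row of the constraint table per (asset, rep_period/timeframe/year,
# time block) combination; incoming[i] and outgoing[i] are the flow expressions
# belonging to row i.
mutable struct ConstraintPartition
    table::String              # DuckDB table: asset, rp, tf, year, time_block_start/end
    incoming::Vector{AffExpr}
    outgoing::Vector{AffExpr}
end
```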
Define partition tables #895
Do we need something separate for partitions in general, or are the structs above sufficient?
Reimplement all "add_expression..." functions
The add_expression functions receive an asset-based constraint (e.g., balance) and the flow variable, and compute all incoming and outgoing flows for each row of the constraint.
The current implementation stores the flow and the resulting expressions in the tables. This will not be possible with a DuckDB table, so we either store the resulting expressions in a vector alongside the constraints, or in a separate structure, or don't precompute the resulting expressions.
Given that we want to precompute as much as possible separately from the model creation, computing the indexes required for the incoming and outgoing expressions might be desired, at first.
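A sketch of one way this could look if the incoming/outgoing indexes are precomputed as a mapping table; the mapping shape (constraint row, variable row, coefficient) is an assumption, not the current implementation:

```julia
using JuMP

# Build the incoming-flow expression of each constraint row from a precomputed
# mapping, storing the result in a vector alongside the constraint table.
function add_expression_terms!(
    incoming::Vector{AffExpr},      # one (initially zero) expression per constraint row
    flow_vars::Vector{VariableRef}, # aligned with the flow variable table
    mapping,                        # iterable of (constraint_row, variable_row, coefficient)
)
    for (cst_row, var_row, coef) in mapping
        add_to_expression!(incoming[cst_row], coef, flow_vars[var_row])
    end
    return incoming
end
```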
Renaming/Style
Find better names for functions that add, construct, create, compute, etc.
Verify if we can use only one df (without using a dict) for assets_profiles and assets_timeframe_profiles (better after having a pipeline with timeframe); have a pipeline example before changing that.
Comment: if we do change all the partitioning to dataframes, including the constraints partition computation, then construct_dataframes is not necessary.
Change constraints_partitions from a Dict to a df
When using tables, the filters will be on the tables, not on the graph
Double-check the order of the indices in all the elements, constraints, variables, and expressions. They should follow the same order to be more efficient (at least, that is how it works in GAMS and AIMMS)
Separate the model data from the scenario data. The scenario data for multi-year is currently hard-coded in create_model!; this needs to be generalized and relocated (see Create model discount parameters #803).
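As an illustration only (the field names and values below are made up; see #803 for the actual parameters), the hard-coded multi-year data could move into a scenario structure or table that is passed to create_model!:

```julia
# Hypothetical container for scenario-level data currently hard-coded in create_model!.
Base.@kwdef struct ScenarioData
    discount_rate::Float64 = 0.05            # placeholder value
    discount_year::Int = 2030                # placeholder value
    milestone_years::Vector{Int} = [2030]    # placeholder value
end

# create_model!(energy_problem; scenario = ScenarioData())   # hypothetical signature
```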
Discussed in #688
Originally posted by datejada July 1, 2024
Overview of the changes
Define variable tables #892
This involves:
Move the construction from construct_dataframes to compute_variables_indices
Change the places that access the variables through dataframes to instead unpack x = model[:x], or use variables; we should try both strategies and see what makes more sense
Make sure that we differentiate them from the input tables in DuckDB/TulipaIO
Should we unpack from the model (x = model[:x]) or use variables[:x], expressions[:x]? Work on Create flow variables using the indices in the variable structure #898 should help decide what we'll do.
See https://jump.dev/JuMP.jl/stable/tutorials/getting_started/design_patterns_for_larger_models/#Generalize-constraints-and-objectives
This will make the code cleaner in many places because it simplifies the function arguments, and the required variables are explicitly unpacked.
Some of the code will still depend on the indexes of the variables, and these might still need to be passed.
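For comparison, the two access strategies side by side, in the spirit of the linked JuMP tutorial (the constraint is a toy, and variables[:x] assumes a container like the TulipaVariable sketch above):

```julia
using JuMP

# Strategy 1: unpack the variable from the model by its registered name.
function add_toy_constraint_unpack!(model)
    x = model[:x]
    @constraint(model, sum(x) <= 1)
end

# Strategy 2: pass the variable containers explicitly and index into them.
function add_toy_constraint_containers!(model, variables)
    x = variables[:x].container
    @constraint(model, sum(x) <= 1)
end
```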
Checklist of how we want things to look
Pipeline draft:
3.1 Create variables
3.2 Create constraint partitions
Input data names
create_input_dataframes
create_internal_structures
Example: representative_periods to use DuckDB #713
compute_assets_partitions and compute_assets_partitions! to use DuckDB more efficiently
compute_constraints_partitions
compute_rp_partitions
solve_model and solve_model!
energy_problem.dataframes[:highest_in_out] #637
create_model
construct_dataframes is not necessary.
add_expression_terms_intra_rp_contraints will need a refactor, and it will determine changes we need before and after
profile_aggregation functions are in the table in the documentation
for ... for ... instead.
AND THEN:
Maybe also: