Reward Function, DRL Target, and Primary Frequency Response #3
-
Hello Tim,
That is true.
Yes. As you explained, the reward is greater when freq is closer to 1.
This is probably your research question, but in my understanding, primary frequency response is provided by droop-based controls, which act on a shorter time scale than secondary control. I guess primary frequency control concerns the droop in turbine governors or the Pref in renewables. Maybe you can discuss this with your advisor. Thank you for the kind words. I'm glad that you find them useful.
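As a rough illustration of how droop sets the post-event steady state, here is a minimal sketch using the textbook steady-state relation for a load step. The function name `post_event_frequency` and its parameters are hypothetical and not part of ANDES or andes_gym; it assumes all droops, the load step, and the load damping are in per unit on a common base.

```python
def post_event_frequency(delta_p_load, droops, damping=0.0, f_nominal=1.0):
    """Estimate the per-unit frequency after primary (droop) response settles.

    delta_p_load: load increase in pu (positive means more load).
    droops: per-unit droop settings R of the responding governors.
    damping: load damping D in pu power per pu frequency (assumed value).
    """
    # Composite frequency response characteristic: sum of 1/R plus load damping.
    stiffness = sum(1.0 / r for r in droops) + damping
    # Steady-state deviation is the load step divided by the composite stiffness.
    return f_nominal - delta_p_load / stiffness


# Example: a 0.05 pu load step shared by two governors with 5% droop each.
f_ss = post_event_frequency(0.05, [0.05, 0.05])
```

A value estimated this way could serve as a first guess for the post-event steady-state target, though the value observed in a time-domain simulation would be the more reliable reference.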
-
About the number of steps, Dr. Yichen Zhang @whoiszyc may have an answer. Thanks!
-
Hello,
I have been looking through the andes_freq.py file and trying to understand how the target of the training is determined. In my experience, RL is generally guided by the reward function, which often depends on the error (the difference between the desired value and the value resulting from the action taken).
The example provided focuses on secondary frequency control and seeks to use DRL to drive the frequency back to 60 Hz. Am I correct in my understanding that this target of 60 Hz is set by the following lines:
    if not sim_crashed and done:
        reward -= np.sum(np.abs(3000 * (freq - 1)))
    else:
        reward -= np.sum(np.abs(100 * (freq - 1)))
where the reward is reduced by the absolute value of a constant multiplied by the difference between the simulated frequency and 1 pu (60 Hz). Is this correct, or is there somewhere else where the desired result is defined?
I am curious because I am looking to use andes_gym to apply the same algorithm, DDPG, in a different part of the system to improve primary frequency response. I would rather use a short-term action to drive the frequency to the post-event steady state and let other traditional methods return the frequency from the post-event steady state back to 60 Hz. If this section is where the target is set, I would simply change the "1" to the per-unit value of the post-event steady-state frequency.
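To make the proposed change concrete, here is a minimal sketch of the same penalty with the target pulled out as a parameter. The helper `frequency_penalty` and its `target` argument are hypothetical and do not exist in andes_gym; the weights 3000 and 100 are copied from the lines quoted above. It returns only the reward increment, whereas the quoted code subtracts the penalty from a running `reward`.

```python
import numpy as np


def frequency_penalty(freq, target=1.0, done=False, sim_crashed=False):
    """Return the (non-positive) reward increment for per-unit frequencies.

    target is the per-unit frequency the agent should reach: 1.0 reproduces
    the quoted behavior, and a post-event steady-state value would replace it.
    """
    freq = np.asarray(freq, dtype=float)
    # Heavier weight on the final step of an episode that did not crash,
    # mirroring the branch structure of the quoted lines.
    weight = 3000.0 if (done and not sim_crashed) else 100.0
    return -float(np.sum(np.abs(weight * (freq - target))))


# Example: the same measurement scored against two different targets.
penalty_nominal = frequency_penalty([0.9975], target=1.0)
penalty_post_event = frequency_penalty([0.9975], target=0.9975)
```

With the target set to the post-event steady-state value, a frequency that settles there incurs no penalty, which is the behavior described in the paragraph above.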
Furthermore, as I am looking to focus on primary frequency response as opposed to the secondary frequency response, are there any aspects of the environment I should be changing?
Thank you very much. Andes and Andes_Gym have been extremely helpful tools!