If the trajectory stays in the terminal state (for a limited number of times) #6

ArezooAalipanah · 2024-02-05T01:26:30Z

Hi, thank you so much for this amazing repo.
I have been trying to build my own environment, but I ran into some issues.
What if we have something like this: going from s0 to s1 to s2 and then staying in s3 forever?
(I changed the value iteration so that my trajectories are all 50 steps long), so my svf is something like (1, 1, 1, 47, 0, ..., 0).
However, I am facing some difficulties.
My zs and za start getting very big and then become nan, which makes my omega nan as well.
I was wondering if you have any idea what the problem is and how I can fix it.
I am reading Dr. Ziebart's thesis but still have no clue how to tackle this problem (since z_terminal is 1, I am thinking maybe that causes the problem).
If you have any ideas, I would be very grateful if you shared your thoughts.
Thanks again

ArezooAalipanah · 2024-02-05T01:29:25Z

Here is a bit more info:
my trajectory (I made it of length 40 this time)

the first iteration, with an initialization of 1
the next array is my parameters after the first iteration
however, after the second iteration they all end up being nan

qzed · 2024-04-16T19:11:33Z

Hi, I'm sorry for the long silence. I will likely need some time to look into this, and as I was a bit busy with work-related things lately I never got around to it. Things should be less stressful now, so I will try to look into it this weekend.

ArezooAalipanah · 2024-04-23T13:28:17Z

Thank you so much. I made some modifications, like normalizing the rewards and weights each time to keep them from going to infinity, but I still have to keep the number of iterations limited since it never converges. I am looking forward to your insight as well.
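
For reference, one common way to keep a backward pass like the one described in this thread from overflowing is to run the recursion in log space, so that `zs` and `za` are stored as logarithms and combined with a stable log-sum-exp. The block below is only a minimal sketch under stated assumptions, not the repository's implementation: the function names, the `(n_states, n_states, n_actions)` layout of `p_transition`, the re-adding of the terminal term on every sweep, and the toy four-state chain are all hypothetical choices made for illustration.

```python
import numpy as np


def log_sum_exp(x, axis):
    """Stable log(sum(exp(x))) along `axis`; tolerates entries equal to -inf."""
    m = np.max(x, axis=axis, keepdims=True)
    m = np.where(np.isfinite(m), m, 0.0)  # avoids (-inf) - (-inf) = nan
    with np.errstate(divide="ignore"):    # log(0) = -inf is intended here
        return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))


def local_action_probabilities_log(p_transition, terminal, reward, n_iter):
    """Hypothetical log-space backward pass for a local action policy.

    Assumes p_transition[s, s_next, a] holds transition probabilities and
    reward[s] is a per-state reward.
    """
    n_states, _, n_actions = p_transition.shape
    with np.errstate(divide="ignore"):
        log_p = np.log(p_transition)      # impossible transitions become -inf

    log_zs = np.full(n_states, -np.inf)   # log of the state partition values
    log_zs[terminal] = 0.0                # z_terminal = 1

    for _ in range(n_iter):
        # log za[s, a] = reward[s] + log sum_{s'} p[s, s', a] * zs[s']
        log_za = reward[:, None] + log_sum_exp(log_p + log_zs[None, :, None], axis=1)
        # log zs[s] = log(sum_a za[s, a] + 1 if s is terminal else 0)
        log_zs = log_sum_exp(log_za, axis=1)
        log_zs[terminal] = np.logaddexp(log_zs[terminal], 0.0)

    # The policy is a ratio, so it stays finite even when zs itself is huge.
    return np.exp(log_za - log_sum_exp(log_za, axis=1)[:, None])


# Toy chain s0 -> s1 -> s2 -> s3, where s3 is absorbing (assumed setup).
# Action 0 moves one state forward, action 1 stays in place.
p_transition = np.zeros((4, 4, 2))
for s in range(4):
    p_transition[s, min(s + 1, 3), 0] = 1.0
    p_transition[s, s, 1] = 1.0

reward = np.full(4, 20.0)  # deliberately large rewards
policy = local_action_probabilities_log(p_transition, terminal=3, reward=reward, n_iter=50)
```

In this sketch the stored quantities grow roughly linearly with the number of sweeps rather than exponentially, which is why the final ratio can still be evaluated. It does not by itself address whether the outer optimization converges when the trajectory sits in an absorbing state.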
