
The state and skill encoders learned with contrastive learning are never used? #6

Open
xf-zhao opened this issue Jun 25, 2022 · 4 comments

Comments

@xf-zhao

xf-zhao commented Jun 25, 2022

Hi, thank you very much for sharing the code for the paper. Integrating contrastive learning into skill discovery is very attractive.

However, I found that in this implementation, the state encoder and skill encoder in the cic module ($g_{\psi_1}$ and $g_{\psi_2}$ in the paper) are never used to encode the observation and skill before they are fed into the policy networks. In cic/agent/cic.py line 222, the cic parameters are updated once, but the encoders are never called to encode obs and skill afterwards.
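
To make the point concrete, here is a rough sketch of the pattern I am describing (names and dimensions are hypothetical, not the repo's exact code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CIC(nn.Module):
    """Hypothetical sketch of the cic module: g_psi1 (state) and g_psi2 (skill)."""
    def __init__(self, obs_dim, skill_dim, hidden_dim=256):
        super().__init__()
        # g_psi1: encodes the transition (obs, next_obs)
        self.state_net = nn.Sequential(
            nn.Linear(2 * obs_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, skill_dim))
        # g_psi2: encodes the skill z
        self.skill_net = nn.Sequential(
            nn.Linear(skill_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, skill_dim))

    def forward(self, obs, next_obs, skill):
        query = self.state_net(torch.cat([obs, next_obs], dim=-1))
        key = self.skill_net(skill)
        return query, key

def info_nce(query, key, temperature=0.5):
    # simple CPC-style loss: positives on the diagonal, rest of the batch as negatives
    query, key = F.normalize(query, dim=-1), F.normalize(key, dim=-1)
    logits = query @ key.T / temperature
    labels = torch.arange(query.shape[0])
    return F.cross_entropy(logits, labels)

# Toy update step illustrating my point:
obs, next_obs = torch.randn(8, 24), torch.randn(8, 24)
skill = torch.rand(8, 64)
cic = CIC(obs_dim=24, skill_dim=64)
opt = torch.optim.Adam(cic.parameters(), lr=1e-4)

query, key = cic(obs, next_obs, skill)   # encoders are used here, for the CPC loss ...
loss = info_nce(query, key)
opt.zero_grad(); loss.backward(); opt.step()

policy_input = torch.cat([obs, skill], dim=-1)  # ... but the policy sees the raw obs and z,
                                                # not g_psi1(obs) or g_psi2(z)
```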

Another question: how can the agent guarantee that the policy is indeed conditioned on z, given that the intrinsic reward has nothing to do with z? In other words, $\tau$ can be arbitrarily diverse, which is good for exploration, but there seems to be no mechanism that ensures the agent learns the influence of z.
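
For reference, the decomposition I have in mind from the paper (my own reading, so please correct me if this is off):

$$
I(\tau; z) \;=\; H(\tau) \;-\; H(\tau \mid z),
$$

where, as far as I can tell, $H(\tau)$ is estimated with a particle estimator and used as the intrinsic reward, while $H(\tau \mid z)$ is handled by the contrastive loss. So the reward itself contains no $z$, which is exactly what worries me.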

I really like your work, but these issues confuse me a lot. Please correct me if I am wrong or missing something. Thank you again for your kindness in sharing.

@pickxiguapi

Hi, I have the same confusion. May I ask whether your question has been resolved? I think the parameters updated by contrastive learning are not being used.

@xf-zhao
Author

xf-zhao commented Jul 13, 2022

> Hi, I have the same confusion. May I ask whether your question has been resolved? I think the parameters updated by contrastive learning are not being used.

@pickxiguapi Hi, sorry, it has not been solved. I think this is a mistake the author has not noticed, since the work still seems to be in progress / not fully finished.

@seolhokim

Why should g1 and g2 be used again after being updated? I think there is no reason to call them from anywhere else before finetuning.

@kc-ustc

kc-ustc commented Jun 20, 2024

I would like to ask a simple question: during pre-training, I found that neg in compute_cpc_loss is approximately 1200, while pos is around 6. Is this normal?
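
For context, this is roughly how I understand pos and neg to be computed in an exponential-similarity CPC loss (a hypothetical sketch, not the repo's exact code), which would explain why neg, being a sum over the whole batch, ends up much larger than pos:

```python
import torch
import torch.nn.functional as F

def cpc_loss_sketch(query, key, temperature=0.5, eps=1e-6):
    """Hypothetical CPC loss with explicit pos/neg terms (not the repo's exact code)."""
    query = F.normalize(query, dim=1)
    key = F.normalize(key, dim=1)
    # pos: exponentiated similarity of each matched (query, key) pair
    pos = torch.exp(torch.sum(query * key, dim=-1) / temperature)
    # neg: sum of exponentiated similarities against every key in the batch
    sim = torch.exp(query @ key.T / temperature)   # (batch, batch)
    neg = sim.sum(dim=-1)                          # grows with batch size
    loss = -torch.log(pos / (neg + eps))
    return loss.mean(), pos.mean().item(), neg.mean().item()

# toy check with random features: pos stays close to exp(0/T) = 1 and neg is roughly the
# batch size; once training aligns query and key, pos moves toward exp(1/T) (~7 for T=0.5),
# so magnitudes like pos ~ 6 and neg ~ 1200 for a batch of ~1024 look plausible to me
query, key = torch.randn(1024, 64), torch.randn(1024, 64)
loss, pos_mean, neg_mean = cpc_loss_sketch(query, key)
print(pos_mean, neg_mean)
```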
