@joanrod
Hi, Juan! Thanks for sharing this project!
I have a question about Section 4.2 of your paper.
I'm not sure exactly which feature maps are used in the computation of the OCR perceptual loss.
Section 4.2 states the following:
"... through the OCR model, and extract L feature maps from intermediate layers. Specifically, we store the activation map after each upsampling layer..."
However, in the code, it seems that the feature maps are extracted from the VGG16-BN backbone of the network rather than from the upsampling layers (which sit in the U-Net part of the network):
https://github.com/joanrod/ocr-vqgan/blob/68e36b568b59df275940296c164b1cf40585512b/taming/modules/losses/craft.py#L89
https://github.com/joanrod/ocr-vqgan/blob/68e36b568b59df275940296c164b1cf40585512b/taming/modules/losses/lpips.py#L28-L29
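For reference, the behavior the paper's wording describes (storing the activation map after each upsampling layer) could be sketched with PyTorch forward hooks roughly as below. This is an illustrative sketch only, not the repo's actual code; `TinyUNetDecoder` and `collect_upsample_activations` are hypothetical names:

```python
# Hypothetical sketch: capture activations after each upsampling stage of a
# CRAFT-style OCR model via forward hooks, per the paper's Section 4.2 wording.
import torch
import torch.nn as nn


class TinyUNetDecoder(nn.Module):
    """Stand-in for the U-Net half of the OCR model: two upsampling stages."""

    def __init__(self):
        super().__init__()
        self.up1 = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.conv1 = nn.Conv2d(8, 8, 3, padding=1)
        self.up2 = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.conv2 = nn.Conv2d(8, 4, 3, padding=1)

    def forward(self, x):
        x = self.conv1(self.up1(x))
        return self.conv2(self.up2(x))


def collect_upsample_activations(model, x):
    """Store the activation map after each upsampling layer (paper wording)."""
    feats, handles = [], []
    for m in model.modules():
        if isinstance(m, nn.Upsample):
            handles.append(
                m.register_forward_hook(lambda mod, inp, out: feats.append(out))
            )
    with torch.no_grad():
        model(x)
    for h in handles:
        h.remove()
    return feats


model = TinyUNetDecoder().eval()
feats = collect_upsample_activations(model, torch.randn(1, 8, 16, 16))
# Two upsampling stages -> two stored feature maps (L = 2 here).
```

In contrast, the linked `lpips.py` lines slice feature maps out of the VGG16-BN backbone, which is what prompted the question.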