Reproducing Results #2

Open
TalalWasim opened this issue Mar 14, 2021 · 6 comments

Comments

@TalalWasim

I am unable to reproduce the results that you show in the images using the dataset downloaded from the ICDAR website. Could you please provide the pretrained model or the actual dataset you used to train the model?

@denabazazian
Owner

Hi Talal, thanks for your interest in this repository. Unfortunately, I no longer have the trained model for this code, but it should be easy to reproduce if you follow all the implementation details in Section 4 of the paper. You should get the trained model after 100 epochs of training, and then you can reproduce the results. Have you installed all the required packages at the same versions as listed in the README?

@TalalWasim
Author

Hello,

Yes, I have followed the instructions and I am able to train the model. But the results are very noisy and nowhere near what you show in the paper. I trained the model for 100 epochs, as you said.

I think it might be due to the data. Could you share the dataset that you trained the model on, or let me know where I can download it? The Text Segmentation dataset available on the ICDAR website has a total of 462 images, while the train and val files in your repository's "data" folder have 800 entries for training and 200 for validation.

So can you help me with the dataset, or share the dataset in case you still have a copy of it?

@denabazazian
Owner

We used ICDAR-2013 scene text segmentation [https://rrc.cvc.uab.es/], KAIST scene text segmentation [http://www.iapr-tc11.org/mediawiki/index.php?title=Scene_Text_Segmentation_in_the_KAIST_Dataset], and MRRC English scene text segmentation [https://www.cse.iitb.ac.in/~sharat/icvgip.org/icvgip2014/mile/publications/softCopy/DocumentAnalysis/Deepak_mocr2013.pdf]. The dataset link for MRRC apparently no longer works, and unfortunately I don't have a copy on my system any more. We only used the English images for training: 230 images from ICDAR-2013, 310 images from KAIST and 60 images from MRRC. So even without the MRRC data you should probably still get decent results.
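
When combining several datasets like this, it can help to check that every entry in the train and val list files actually resolves to an image on disk before training. The following is only a minimal sketch, not code from the repository: it assumes each list file holds one entry per line and that the first whitespace-separated token of an entry is an image path relative to some image root. The function names and that file layout are hypothetical.

```python
from pathlib import Path


def read_split(split_file):
    """Return the non-empty lines of a split list file, stripped of whitespace."""
    text = Path(split_file).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def missing_entries(entries, image_root):
    """Return entries whose first token is not an existing file under image_root.

    Assumption: the first whitespace-separated token of each entry is an
    image path relative to image_root (hypothetical layout).
    """
    root = Path(image_root)
    return [e for e in entries if not (root / e.split()[0]).is_file()]


def report(split_file, image_root):
    """Print and return (number of entries, number of missing images)."""
    entries = read_split(split_file)
    missing = missing_entries(entries, image_root)
    print(f"{split_file}: {len(entries)} entries, {len(missing)} missing")
    return len(entries), len(missing)
```

Calling `report` once on the training list and once on the validation list, with the folder where the downloaded images were unpacked, shows how many listed images are absent, for example those from a dataset that could not be downloaded.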

@TalalWasim
Author

Hello,

Thank you so much. I thought the model was trained only on the ICDAR data and did not consider the KAIST dataset. Apologies for that. I have the model training right now and will post an update here with the results I get.

Once again, thank you for your quick response and support.

@tanmayj000

Hi @TalalWasim, were you able to train the model successfully and get the desired results? I am not able to reproduce the results either.

@denabazazian
Owner

Hi @tanmayj000, have you downloaded the following datasets for training?
1- ICDAR-2013 scene text segmentation- task 2.2 (Link).
2- KAIST scene text segmentation (Link).
3- MRRC English scene text segmentation (Link).


3 participants