Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to rewrite resource url in cases when ckan.datapusher.callback_url_base is used in CKAN #206

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mbocevski
Copy link
Contributor

This complements and fixes an issue with datapusher when ckan.datapusher.callback_url_base is set to True in CKAN. In scenarios when running on the cloud, in secure environments or using docker-compose it's useful for datapusher to use an internal CKAN url like for instance http://ckan:5000.

This PR adds options to enable rewriting of URL's of uploaded resources in CKAN so that datapusher can reach them. The two options are:

  • REWRITE_RESOURCES - boolean whether or not to enable rewriting of resources URL's, default: False
  • REWRITE_URL - URL to use for rewriting, defaults to http://ckan:5000/

…ing used when CKAN in firewall or other setups like docker-compose
@mbocevski
Copy link
Contributor Author

@amercader if you're OK with this feature, let's wait on merging until #207 is merged so that I can update the config options in the readme.

@amercader
Copy link
Member

@mbocevski
I think this needs to be fixed in CKAN core. Here we should take ckan.datapusher.callback_url_base into account instead of using site_url. metadata['ckan_url'] is what DataPusher uses to construct the CKAN URLs. When ckan.datapusher.callback_url_base was introduced it only handled the result_url which is the URL that CKAN Service provider pings to report progress.

It makes sense to configure this only at the CKAN level and not at the DataPusher level as this can be specific to each CKAN instance.

@mbocevski
Copy link
Contributor Author

@amercader Yeah that makes sense, I will try it out and submit a PR in core. However, it's not immediately clear whether that's just for constructing the URL for the resource or for the actual resource file. Cause note that we only want to rewrite resources that are uploaded to CKAN filestore and not rewrite resources that are links.

@mbocevski
Copy link
Contributor Author

mbocevski commented Oct 7, 2020

@amercader Yeah, just as I suspected, so site_url in core gets set to the callback URL which is correct. The logic is that once a job is created, datapusher then calls resource_show to get the URL of the resource, download it and insert it in the database.

They way things are, I see at least two options:

  1. Take the approach that I'm suggesting in this PR
  2. Change datapusher extension in Core to send the URL of the resource with the request, and implement the logic if uploaded to filestore use the callback_url instead of site_url, remove resource_show from datapusher. I'm not sure why this is not part of the current design, so there might be a good reason for that.

WDYT?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants