Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

I26 use urls #27

Merged
merged 12 commits into from
Sep 1, 2020
Merged

I26 use urls #27

merged 12 commits into from
Sep 1, 2020

Conversation

eyvorchuk
Copy link
Contributor

This PR resolves issue 26, which is to allow the config file for wps_parameters.py to have FILE_PATHs pointing to data files on THREDDS. For each URL, the data was copied to a temporary file in the working directory of the process, and the URL in the config file was replaced with the corresponding local file path. This functionality was tested by mocking the URLs prior to running the process to avoid requesting data that may not exist on THREDDS.

@eyvorchuk
Copy link
Contributor Author

eyvorchuk commented Aug 24, 2020

@jameshiebert I should mention that some of the RVIC modules in the venv need to be modified before the tests will work. The necessary changes are in the testing section of the README.

@@ -73,7 +73,7 @@ SEARCH_FOR_CHANNEL: False
#-- Path to Pour Points File (char) --#
# A comma separated file of outlets to route to [lons, lats] - one coordinate pair per line (order not important)
# May optionally include a column [names] - which will (if not aggregating) be included in param file
FILE_NAME: https://docker-dev03.pcic.uvic.ca/twitcher/ows/proxy/thredds/dodsC/datasets/RVIC/sample_pour.txt
FILE_NAME: https://docker-dev03.pcic.uvic.ca/twitcher/ows/proxy/thredds/fileServer/datasets/RVIC/sample_pour.txt

Copy link
Contributor

@sum1lim sum1lim Aug 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the ideal use of pour input is to get user inputs (lon and lat) and create a pour.txt file in the tmp directory. It is a variable that needs to be dynamically defined by users. @jameshiebert @corviday , do you think this is the right approach? I guess we don't have to worry about that in this PR. I will write an issue if it is needed.

Copy link
Contributor

@sum1lim sum1lim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pytest is not passing for me. The error message is attached to one of the comments.

While it is good to download files from https, there is a method you can use opendap as I did in a thunderbird PR. xarray library enables to copy netcdf contents retrieved from oepdap URL. You can create a copy of the files with the contents in the tmp directory. I think it is worth to consider this method since I implemented thunderbird/wps_update_metadata this way.

@@ -73,7 +73,7 @@ SEARCH_FOR_CHANNEL: False
#-- Path to Pour Points File (char) --#
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to use this file in the jupyter lab demo as well

@pytest.mark.parametrize(
("config"), [resource_filename(__name__, "configs/parameter_https.cfg")],
)
def test_parameters_https(config, make_mock_urls):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is failing with the following error message:

Traceback (most recent call last):
  File "/home/sangwonl/Desktop/osprey/venv/lib/python3.6/site-packages/pywps/app/Process.py", line 249, in _run_process
    self.handler(wps_request, wps_response)  # the user must update the wps_response.
  File "/home/sangwonl/Desktop/osprey/osprey/processes/wps_parameters.py", line 119, in _handler
    parameters(config, np)
  File "/home/sangwonl/Desktop/osprey/venv/lib/python3.6/site-packages/rvic/parameters.py", line 51, in parameters
    outlets, config_dict, directories = gen_uh_init(config_file)
  File "/home/sangwonl/Desktop/osprey/venv/lib/python3.6/site-packages/rvic/parameters.py", line 161, in gen_uh_init
    config_dict = copy_inputs(config_file, directories['inputs'])
  File "/home/sangwonl/Desktop/osprey/venv/lib/python3.6/site-packages/rvic/core/utilities.py", line 243, in copy_inputs
    copyfile(section['FILE_NAME'], new_file_name)
  File "/usr/lib/python3.6/shutil.py", line 120, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pywps_process_36j0za4t/sample_pour2b3izq8y.txt'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does each FILE_PATH in your parameter_https.cfg file look like? I just realized that running the notebook as is permanently changes it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the demo doesn't use that .cfg file, so it won't be modified. I'll look into it more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you try replacing self.workdir in this line with a different directory, and after the for loop in replace_urls, print filedata? I just want to see if the contents of the config file are being modified correctly.

Copy link
Contributor

@sum1lim sum1lim Aug 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does each FILE_PATH in your parameter_https.cfg file look like? I just realized that running the notebook as is permanently changes it.

I think you are right here. I did git stash and ran pytest again, and it passed.

Actually, the demo doesn't use that .cfg file, so it won't be modified. I'll look into it more.

Last time I ran the demo first by changing the input to the parameter_https.cfg, so that caused the issue you said.

@jameshiebert
Copy link

jameshiebert commented Aug 24, 2020

@jameshiebert I should mention that some of the RVIC modules in the venv need to be modified before the tests will work. The necessary changes are in the testing section of the README.

Could you fix this in RVIC and open a PR against RVIC as requested by Joe Hamman for your second testing note? Testing should be automated; it's not feasible/scalable to have to make manual changes to an installed virtual environment. We need to fix the root of this problem. And if its as simple as you note, it should be a no brainer.

@sum1lim
Copy link
Contributor

sum1lim commented Aug 24, 2020

@jameshiebert I already made a PR in RVIC repo and emailed Joe Hamman, but still waiting for the reply.

@eyvorchuk
Copy link
Contributor Author

Could you fix this in RVIC and open a PR against RVIC as requested by Joe Hamman for your second testing note? Testing should be automated; it's not feasible/scalable to have to make manual changes to an installed virtual environment. We need to fix the root of this problem. And if its as simple as you note, it should be a no brainer.

Unfortunately, we haven't had luck with getting those issues fixed in RVIC. The first issue has been fixed in RVIC 1.1.1, but for some reason, that version isn't installable as a python package (the only one available is v1.1.0.post1). @sum1lim opened a PR for the other issue a month ago and I believe emailed one of the devs about it, but neither of those have gotten responses.

Copy link
Contributor

@nikola-rados nikola-rados left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I think this is a nice solution to the issue, good work! That being said I am getting failures on my pytest runs locally. I have followed the instructions in the README but still get issues. Does it work locally for you? Are there any changes I need to make that aren't in the README?

osprey/cli.py Outdated
@@ -152,7 +152,7 @@ def stop():
@click.option(
"--parallelprocesses",
metavar="INT",
default="2",
default="20",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert this change

osprey/utils.py Outdated
Comment on lines 47 to 49
read_config = open(config, "r")
filedata = read_config.readlines()
read_config.close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use a context manager here.

Comment on lines +51 to +63
for i in range(len(filedata)):
if "https" in filedata[i]:
url = filedata[i].split(" ")[-1] # https url is last word in line
url = url.rstrip() # remove \n character at end
r = requests.get(url)
filename = url.split("/")[-1]
prefix, suffix = filename.split(".")
suffix = "." + suffix
local_file = NamedTemporaryFile(
suffix=suffix, prefix=prefix, dir=outdir, delete=False
)
local_file.write(r.content)
filedata[i] = filedata[i].replace(url, local_file.name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this is fine, we generally don't want to use indices if we can help it. It is more understandable to iterate through the iterable object : for data in filedata:....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used the index because I wasn't sure if the assignment in line 63 would modify the filedata list if I iterated through each element.

Comment on lines +59 to +61
local_file = NamedTemporaryFile(
suffix=suffix, prefix=prefix, dir=outdir, delete=False
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great use of tempfile here 👍

osprey/utils.py Outdated
Comment on lines 65 to 68
write_config = open(config, "w")
for line in filedata:
write_config.write(f"{line}")
write_config.close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, we can use a context manager.

@@ -2,6 +2,6 @@ pywps>=4.2
jinja2
click
psutil
rvic==1.1.0post1
rvic==1.1.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -1,6 +1,7 @@
pytest
flake8
pytest-flake8
requests-mock
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know this package existed. I wonder if we can apply to our notebooks to make truly offline tests.

@@ -73,7 +73,7 @@ SEARCH_FOR_CHANNEL: False
#-- Path to Pour Points File (char) --#
# A comma separated file of outlets to route to [lons, lats] - one coordinate pair per line (order not important)
# May optionally include a column [names] - which will (if not aggregating) be included in param file
FILE_NAME: https://docker-dev03.pcic.uvic.ca/twitcher/ows/proxy/thredds/dodsC/datasets/RVIC/sample_pour.txt
FILE_NAME: https://docker-dev03.pcic.uvic.ca/twitcher/ows/proxy/thredds/fileServer/datasets/RVIC/sample_pour.txt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this change from dodsC to fileServer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so that https instead of OpenDAP is used to download the file from THREDDS. James told me that it makes downloading entire files easier since I won't have to specify attributes or time ranges.

Comment on lines +1 to +7
import pytest
from osprey.testing import make_mock_urls


@pytest.fixture
def conftest_make_mock_urls(config, requests_mock):
return make_mock_urls(config, requests_mock)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice fixture!

Comment on lines +35 to +37
read_config = open(config, "r")
temp_config.writelines(read_config.read())
temp_config.read()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we aren't just directly reading into the temp_config?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean here.

@eyvorchuk
Copy link
Contributor Author

That being said I am getting failures on my pytest runs locally.

What do the FILE_NAMEs in your .cfg files look like?

@nikola-rados
Copy link
Contributor

FILE_NAME: https://docker-dev03.pcic.uvic.ca/twitcher/ows/proxy/thredds/fileServer/datasets/RVIC/sample_flow_parameters.nc

@eyvorchuk
Copy link
Contributor Author

eyvorchuk commented Aug 31, 2020

And in sample_parameter_config.cfg? Or is it just the https one that's failing?

@nikola-rados
Copy link
Contributor

FILE_NAME: tests/data/samples/sample_pour.txt

And here are the failures I got:

==================================================================================================== FAILURES =====================================================================================================
____________________________________________________________ test_wps_convolution[/home/nrados/github/pcic/osprey/tests/configs/convolve_opendap.cfg] _____________________________________________________________
Traceback (most recent call last):
  File "/home/nrados/github/pcic/osprey/tests/test_wps_convolution.py", line 38, in test_wps_convolution
    run_wps_process(Convolution(), params)
  File "/home/nrados/github/pcic/osprey/venv/lib/python3.6/site-packages/wps_tools/testing.py", line 65, in run_wps_process
    assert_response_success(resp)
  File "/home/nrados/github/pcic/osprey/venv/lib/python3.6/site-packages/pywps/tests.py", line 145, in assert_response_success
    assert len(success) == 1
AssertionError
---------------------------------------------------------------------------------------------- Captured stderr call -----------------------------------------------------------------------------------------------
curl error details: 
__________________________________________________________________________________________ test_wps_convolution[config1] __________________________________________________________________________________________
Traceback (most recent call last):
  File "/home/nrados/github/pcic/osprey/tests/test_wps_convolution.py", line 38, in test_wps_convolution
    run_wps_process(Convolution(), params)
  File "/home/nrados/github/pcic/osprey/venv/lib/python3.6/site-packages/wps_tools/testing.py", line 65, in run_wps_process
    assert_response_success(resp)
  File "/home/nrados/github/pcic/osprey/venv/lib/python3.6/site-packages/pywps/tests.py", line 145, in assert_response_success
    assert len(success) == 1
AssertionError
---------------------------------------------------------------------------------------------- Captured stderr call -----------------------------------------------------------------------------------------------
curl error details: 

@eyvorchuk
Copy link
Contributor Author

Oh, convolution is @sum1lim's process. I didn't modify anything involved with it in the PR, so the issues might be happening elsewhere.

@sum1lim
Copy link
Contributor

sum1lim commented Aug 31, 2020

@nikola-rados if only convolution process is not working, the possibility is that valid_ranges are not changed in share.py. Could you double-check it?

@sum1lim
Copy link
Contributor

sum1lim commented Aug 31, 2020

OK, I know the problem. @eyvorchuk, could you change this line to following?

"DATL_LIQ_FLDS": ["RUNOFF", "BASEFLOW"],

I actually encountered this problem after RVIC 1.1.1 is used.

@eyvorchuk
Copy link
Contributor Author

Works now.

@nikola-rados nikola-rados merged commit 87225b5 into master Sep 1, 2020
@nikola-rados nikola-rados deleted the i26-use-urls branch September 1, 2020 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow wps_parameters to use https URLs
4 participants