Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modularizing authentication #433

Closed
rwegener2 opened this issue Jul 31, 2023 · 1 comment · Fixed by #435
Closed

Modularizing authentication #433

rwegener2 opened this issue Jul 31, 2023 · 1 comment · Fixed by #435

Comments

@rwegener2
Copy link
Contributor

rwegener2 commented Jul 31, 2023

The Problem

The authentication method that exists right now is inherently tied to the Query class because the .earthdata_login() method is a method on that class. That has led to passing ._session values from Query objects to Variables objects as needed.

As the cloud read effort develops, additional modules, such as Read, will also need to authenticate. Because of this I would like to propose that we make authentication in the repository more modular, so it can be more easily extended to different classes.

Possible Solutions

Both of the ideas below create an EarthdataAuth class. This class would have the existing earthdata_login() functionality within it.

1 - Inheritance

We could create an class called EarthdataAuth that holds the ._auth, ._session, and ._s3login_credentials attributes that currently get created in Query. Query, Variables, and other classes could then be subclasses of the EarthdataAuth superclass.

Pros/Cons: After some reading it seems that isn't exactly the type of relationship we want to convey. The set of complexity added by a new level of inheritance feels disproportionate to the fairly simple need to keep track of 2-3 attributes and one function. The pro of this solution, however, is that I believe we could it up such that nothing about the user experience changes.

2 - Creating EarthdataAuth as a class that other objects have as an attribute

Query and other related objects could have an .auth attribute that is an instance of EarthdataAuth. EarthdataAuth can get created either by calling the earthdata_login() function, or automatically when a user needs authentication. The user (or the internal code) would access attributes like ipx.Query.auth._session.

Pros/Cons: I think this solution could simplify our internal code and at least for me it is more intuitive than inheritance (that may just be me, though). The con of this solution is that it would change how the user authenticates, either by removing that step all together or by requiring they switch to ipx.Query.auth.earthdata_login().

Questions

  • Any opinions about which of the two options to pursue? I am leaning toward the second one, but could be otherwise convinced. @JessicaS11
  • Is there a reason that we prefer to ask users to run the earthdata_login() function themself? Are we interested in setting up the workflow such that the user is prompted for login credentials only when the execute a function that requires authentication? I think that could happen fairly easily on our end, but it would change the user experience.

Code

  • this commit on the move_auth branch shows a wip example of approximately what the inheritance method could look like (method no. 1 above). I got that far and had some issues. I think the ordering of classes was causing different ._session objects to overwrite each other. That's when I starting thinking about if method no. 2 above would be simpler.
@JessicaS11
Copy link
Member

JessicaS11 commented Aug 2, 2023

Thanks for this great writeup @rwegener2! Of the two solutions you propose, I would definitely lean towards the second one. We actually not too recently removed our own Earthdata module, instead building out a similar set of auth methods directly in earthaccess (and also gaining the improved cloud and multi-DAAC auth already built in to that library). In that line of thinking, I wonder if - rather than create an internal EarthdataAuth class - we could simply leverage the earthaccess auth object directly, more or less as is done in Query.earthdata_login(). Then the challenge turns back into your comment below.

Is there a reason that we prefer to ask users to run the earthdata_login() function themself?

The short answer is this is a legacy from the early days of programmatic data access. In many cases, particularly for tutorials, the primary mode of authentication was interactive (and there was no icepyx or earthaccess wrapping the manual sending of auth info with a request and response sequence, so users needed to add their credentials to the dict themselves). Thus, even once we added automatic auth options we wanted it to be clear to the user this was happening and didn't want to break existing workflows, so we did not make earthdata_login() private. However, if the user calls Query.download() and has not logged in, this will be called automatically, so they are not explicitly required to run it themselves.

Are we interested in setting up the workflow such that the user is prompted for login credentials only when the execute a function that requires authentication? I think that could happen fairly easily on our end, but it would change the user experience.

I think this is the way to go, and I think we could implement it without changing the user experience. For classes that might need auth, we can include it as part of the init as you showed in the Variables module in your linked commit. Rather than directly passing in a ._session (as is done now), the auth object can be passed and we can use auth.get_session(). This will also help us be better positioned to adopt more earthaccess functionality in the future (e.g. using it to interact with CMR, which we currently still do "manually" by sending requests).

@rwegener2 rwegener2 mentioned this issue Aug 3, 2023
4 tasks
@JessicaS11 JessicaS11 linked a pull request Oct 18, 2023 that will close this issue
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants