Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fallback to ConfigReader.get, **kwargs to connect_to_database, and pymssql support to call_stored_procedure #214

Closed
wants to merge 2 commits into from

Conversation

markus-leben
Copy link

Mostly exists to better support pymssql querying. Fixes this issue from pymssql in database library: pymssql/pymssql#860

…Manager

Some database connection functions don't require all of dbapiModuleName, dbName, dbUsername...

These functions also fail in some libraries if given a blank string on some libraries. 

By 'some libraries' here I mostly mean pymssql, but these changes should be invisible to any tests running with a fully db.cfg. 

Similarly, I passed **kwargs in from connect_to_database on connections to misc databases not in the existing if/elif cases. This was mostly to allow for passing in tds_version on pymssql (see this issue: pymssql/pymssql#860), but it would be useful for any misc database to be able to set up their connection using custom parameters. 

It may also be a good idea to pass kwargs in to each db_connection in that whole big if/else statement, but I wanted this to be minimally impactful to connection modules I don't know well enough to not break.
Similar support to existing call_stored_procedure functions, but pymssql annoyingly requires data types on its outputs, so the syntax has to be 'CURSOR_[TYPE]' instead of just CURSOR.
@amochin
Copy link
Collaborator

amochin commented May 13, 2024

Hello @markus-leben, thanks a lot for your contribution!
I am not quite sure about some of the changes in your pull request.
Could you please provide more information?

  1. The "add fallback" commit 0fe7f4a
    1. could you please explain the issue you're trying to solve?
    2. If it's mainly about this additional connection parameter tds_version which is needed to solve this issue with pymssql?
    3. If yes, would a keyword Connect using custom params work for you?
    4. I am also not sure about this "UNSET fallback" implementation and hardcoding the port number...
  2. The "add support for call stored procedure" commit 7c283ba
    1. Same here - I'd like to better understand the need or the issue you're trying to solve
    2. Could you please provide some tests or usage examples in Robot Framework?

Thank you!

@markus-leben
Copy link
Author

markus-leben commented May 20, 2024

  1. Add fallback
    1. Currently Connect to Database is pretty prescriptive on what has to be in dbConfigFile, even on modules that those db property values don't exist for or are optional for, and it doesn't allow for additional specifications either in your cfg file or in the function itself. At least for me it wasn't really well documented that the values in the example db.cfg were required values, and it didn't really make sense for them all to be required values when for some db connections they aren't needed.
      1. Would it generally be a clearer error if you forgot to put a password in your dbConfigFile and that threw either an error in your module or you got back an error from your db connection? Docs right now don't really specify that things like dbPort are required, so it's kinda hard to tell at first blush that the reason this is happening is because Connect to Database requires those parameters in either the config or the function arguments.
    2. Yeah that was the core issue for me in general. Pyodbc has no good stored procedure support, and pymssql requires that parameter to function correctly, I wanted to do testing with stored procedures so getting pymssql to a point where it could connect was a goal for me.
    3. Tested this and Connect To Database Using Custom Params would work.
      1. I guess this brings up a design question of what Connect To Database and Connect To Database Using Custom Params. Because right now Connect to Database is "Connect using exactly these parameters and no other parameters, but you can distribute them as you like between arguments and a config file" where Connect To Database Using Custom is currently "Connect using whatever you like but it has to be arguments." What's the goal of having Connect To Database be so prescriptive? Is the idea there to make something that takes a very specific set of inputs because it covers a majority of database connections?
    4. The gist of using that fallback variable is "Allow for null or default values for any of these". Basically it means that the config file can have any of the current config property values, but if they're not specified then it will continue with the fallback value if there is one. If there isn't a fallback value, it'll throw an error, so if there was something you wanted to be required then you could have that one be required even if others defaulted to something. The _UNSET variable is just a cheaty workaround to permit "fallback=None" as an argument.
      1. I agree on not hardcoding the default port, but there might be a better fix: What if the default port was based on the dbapiModuleName? Something like this:
         portFallback = 0000
         if dbapiModuleName == 'psycopg2':
             portfallback = 5432
         ... # Other default db ports go here
         dbPort = int(dbPort if dbPort is not None else config.get("dbPort", fallback=portFallback))
      
  2. Add support for call stored procedure
    1. This one is for adding return variable support to pymssql for Robot Framework Database Library. Basically, pymssql's syntax for outputs is slightly different because it requires a variable type for the outputs. Because of that, in order to pass in output variables you need dedicated syntax. Instead of it being CURSOR, for an output, it has to be CURSOR_[TYPE]. This is just better pymssql support in the simplest way I could see to implement it.
    2. Here's a simple code snippet to show the use: https://github.com/markus-leben/RobotDBSnippet/blob/main/example.robot

@amochin
Copy link
Collaborator

amochin commented Aug 16, 2024

Hello Markus, sorry for the late reply.

I've been working on the second part of your PR - regarding handling the OUT parameters when calling a procedure in MSSQL.

I basically took your solution, but I had to change something.
That's why I put in a separate PR #219 - could you please have a look?

I also extended the documentation of the keyword 'Call Stored Procedure' - now it should explain in a better way the special cases with different databases.

@amochin
Copy link
Collaborator

amochin commented Aug 16, 2024

Regarding the first part of your PR (config files) - it's still in the ToDo list, but I'll check it out soon.

@markus-leben
Copy link
Author

Wrote a brief review and tested query.py locally. Looks like it works well, and it's compatible with the connection_manager changes I made same as my code was. I'm gonna leave this open for the config file stuff, but feel free to close when you get those to a state you're happy with.

Also no worries on the delay, from the looks of things you were prepping to present at a con and that sounds like a LOT of hard work.

@amochin
Copy link
Collaborator

amochin commented Aug 26, 2024

@markus-leben I finally took a closer look at your "add fallback" changes and more generally at the way, how the library creates a DB connection with/without a config.file.
Here are some points I'd need brainstorming help with:

  1. The documentation of connection parameters should be definitely improved - which of them are required etc.
  2. I agree, that it should be possible to provide additional parameters when calling the Connect to database keyword (the **kwargs you implemented)
  3. Reading any additional parameters from the config file should be also implemented
  4. The basic connection parameters module, host, port, db_name, username, password should stay mandatory - because knowing their names when calling the keyword makes it easier for library users and because these params often require special handling depending on the Python DB module.
    1. Do I get you right, that you have a case when some of above mentioned basic params are not required by a Python DB module? My understanding was that they are always necessary.
  5. I'm rather against the idea of hardcoding the fallback values depending on the DB. It would push the library even deeper in the rabbit hole of maintenance and hell of custom handling. I also believe that it would make handling the connection parameters less transparent for library users and therefore harder to debug, if there are some issues. I think, this is the case when less "magic" is better.
    • Update - just realised, there are already fallback values for port numbers in the code - for different DB's, but not for MSSQL. I'm still against this approach, but the current implementation should stay because of compatibility. So adding a fallback value for port in MSSQL would be consistent with the current version.
  6. Some of the connection parameters (charset, driver etc.) can't be currently set in the config file. This should be fixed.
  7. Currently listed additional, DB specific parameters like charset might be removed from a list of explicitly specified arguments. Instead, they can be provided in **kwargs (see point 2)

What do you think?

@markus-leben
Copy link
Author

Do I get you right, that you have a case when some of above mentioned basic params are not required by a Python DB module? My understanding was that they are always necessary.

There is, yeah. Using a trusted connection for MSSQL connects using your windows auth you're logged in on without another username or password. Like with this connection string for an mssql server:

Suite Setup Connect To Database tds_version=7.0

And this db.cfg:

[default]
dbapiModuleName=pymssql
dbHost=MY_SERVER

It's actually slightly fiddlier than that. In this case basic params must explicitly be unset. You can't have a dbUsername and dbPassword or it fails with this relatively opaque error:

Parent suite setup failed:
OperationalError: (18456, b"Login failed for user 'username'. DB-Lib error message 20018, severity 14:\nGeneral SQL Server error: Check messages from the SQL Server\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (MY_SERVER)\n")

So at least in this specific context defaulting to None is probably appropriate behavior, as you end up with this connection:

pymssql.connect(database=None, user=None, password=None, host='MY_SERVER', port=1433, tds_version='7.0')

I also believe that it would make handling the connection parameters less transparent for library users and therefore harder to debug, if there are some issues. I think, this is the case when less "magic" is better.

I think there's a middle ground here, like you were mentioning with port numbers. Assuming defaults appropriate to the defaults of different DB types seems helpful, and hopefully not very likely to change over time. It would probably be good to be as explicit as possible in the logger.info statements in ConnectionManager.connect_to_database about what the connection parameters are, both for passed in **kwargs and for any assumed defaults.

I figure if you assume some defaults, but allow them to be overwritten and are explicit in docs, debug console logging, and error messaging, then the burden of having perfect defaults is lessened. If you say "Connect to Database uses these defaults" and then the debug console logs "Connecting using : this_db.connect(params)", and then an error throws "Connection failed, connection was this_db.connect(params), error message: foo" and someone ends up not understanding or frustrated with default values then they're probably in the wrong. This is a bit subjective though.

Here's a doodle for what the error handling might look like:

connect_dict = {
'database':dbName,
'user':dbUsername,
'password':dbPassword,
'host':dbHost,
'port':dbPort,
**kwargs}
try:
db_connection = db_api_2.connect(
**connect_dict
)
except Exception as ex:
msg= (f"Connection failed, connection was {dbapiModuleName}.connect("
f"{', '.join([f'{key}={value}' for key,value in connect_dict.items()])})")
msg = '{}, error message: {}'.format(msg, ex.args[0]) if ex.args else str(msg)
ex.args = (msg,) + ex.args[1:]
raise ex

Sorry for the somewhat unhinged fstring and dict comprehension for spreading arguments into the message.

@amochin
Copy link
Collaborator

amochin commented Sep 25, 2024

@markus-leben I've just created a PR #220 with custom connection parameters. Could you please have a look?

@amochin
Copy link
Collaborator

amochin commented Oct 1, 2024

Implemented in #220

@amochin amochin closed this Oct 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants