Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Over 2000 retries/second on connection loss #470

Closed
Granjow opened this issue May 9, 2017 · 29 comments · Fixed by #718
Closed

Over 2000 retries/second on connection loss #470

Granjow opened this issue May 9, 2017 · 29 comments · Fixed by #718

Comments

@Granjow
Copy link

Granjow commented May 9, 2017

When I start my application, then stop the DB server, and then try to execute a query on the server, mssql 4.0.1 tries to reconnect 2700 times per second on my laptop:

image

There should probably be a small delay after which mssql reconnects.

@Granjow
Copy link
Author

Granjow commented May 9, 2017

Okay, this happens in generic-pool in Pool.js where createResource() calls _dispense() which immediately calls createResource() again etc. Partially related bug: coopernurse/node-pool#175

@blown302
Copy link

+1

@blown302
Copy link

blown302 commented Sep 19, 2017

I'm having the same issue during testing. Are there any workarounds for this?

@blown302
Copy link

I backed off to the older version of mssql until this is resolved.

@Halt001
Copy link

Halt001 commented Oct 24, 2017

+1

@kunKun-tx
Copy link

i am building a scheduled tasks runner that remote db offline is expected. This occurred to me during the test. It's good that the module would keep trying but blocking this process is not suitable in my case. Also, i think it would be better to follow 'let it break' in this case and let user handler the logic of retry.

I am reverting back to tedious.js for now, which just has a pr for remote db goes to offline. It's pretty tedious to use that low-level library so i hope this can be improved soon.

@clshortfuse
Copy link

clshortfuse commented Oct 25, 2018

This is still happening for me. I'm trying to test reconnecting the pool (and retrying any pending queries).

When I call connection.close() the callback/promise never goes through.

    function onConnectionError(err) {
      connection.removeAllListeners('error');
      connection.addListener('error', (e) => {
        console.log('failed to close connection');
        // every tick or so, a new ConnectionError appears
        console.error(e);
       });
      connection.close((e) => {
         // never reached if a connection disconnects
         performReconnect();
      });
    }

I can achieve this by running a query at an interval, waiting for it to timeout, having the error occur (caught by connection.on('error')) and then calling the onConnectionError function. If a connection has been successfully made before, then onConnectionError will get in that infinite loop on the disconnect. If no connection has ever been made, onConnectionError handles the issue just fine.

Some more info, if I don't add a listener for error before performing the close, an UnhandledPromiseRejection occurs. (and that took me a while to discover that's happening from within mssql, not my code).

My method of testing is disabling the Ethernet/WiFi connection after a connection is made. It triggers a ENOTFOUND error, which returns immediately with no delay (since there's no network connection). It's possible that there's no timing buffer before retrying a connection. Regardless, any ConnectionPool that has .close() called should never try to reconnect.

@willmorgan
Copy link
Collaborator

@clshortfuse Thanks for your comprehensive update. Can you please confirm the node-mssql version you have installed?

@willmorgan willmorgan reopened this Oct 25, 2018
@clshortfuse
Copy link

clshortfuse commented Oct 26, 2018

@willmorgan

I'm using mssql: 4.2.2. I'm not entirely sure about what tedious version that was running. It was either tedious: 2.7.1 or the latest tedious: 3.0.1.

I can't tell exactly which tedious version because when I was writing my reconnection algorithm, I was bouncing back and forth between using mssql and a pure tedious implemention.

Just to add, I believe it's related to some code (I forget where exactly) requesting a reconnect with .setImmediate(), which bumps up execution priority above whatever disconnects or clears the connection pool. I don't exactly remember where I saw it, but I don't think it was in the tedious library.

@willmorgan
Copy link
Collaborator

Had a look for usage of setImmediate in this library, and couldn't find anything that immediately jumped out at me. Strange. Looks like we don't ever timeout any ending of the Tedious connection, and there's no handling of any error events to reject the returned promise:

node-mssql/lib/tedious.js

Lines 268 to 279 in 691456d

_poolDestroy (tedious) {
return new base.Promise((resolve, reject) => {
debug('connection(%d): destroying', IDS.get(tedious))
tedious.once('end', () => {
debug('connection(%d): destroyed', IDS.get(tedious))
resolve()
})
tedious.close()
})
}

Out of interest, what do you see when you perform this testing exercise with the DEBUG=* EV set when you run node wish-i-had-an-ethernet-port.js? You should see a stream of the debug messages which could help track things down.

@Indrego
Copy link

Indrego commented Dec 19, 2018

The issue is still present in mssql 4.3.0. Once the pool has been created and the initial connection succeeds, any new connection issues result in an infinite loop. It's an issue with Pool._createResource when factory.create (ie _poolCreate()) rejects with an error.

@dhensby
Copy link
Collaborator

dhensby commented Jan 3, 2019

Perhaps it's time to revive #624?

@clshortfuse
Copy link

I decided to share a rewrite of the tedious connection pool I made. The original was a bit more complicated, (auto-deadlock handling, parameter mapping, and auto-connection retries) and it's being used in production. I trimmed it down for make it more modular.

https://gist.github.com/clshortfuse/33447c2b731531ecf5a7a224f649e818

All you have to do is

const pool = new TediousPool(sqlConfig);
pool.start().then(() => {
  // Connected
  pool.getConnection(connection => new Promise((resolve, reject) => {
    const rowData = [];
    const request = new Request(query, (err) => {
      if (err) {
        reject(err);
        return;
      }
      resolve(rowData);
    });
    request.on('row', (columns) => {
      const object = {};
      columns.forEach((c) => {
        object[c.metadata.colName] = c.value;
      });
      rowData.push(object);
    });
    connection.execSql(request);
  })).catch((err) => {
    // Handle errors;
  }).then((results) => {
    // Handle results;
  });
});

getConnection(fn) expects a promise, but you can also decide to perform your query synchronously as well.

@dhensby
Copy link
Collaborator

dhensby commented Feb 5, 2019

Perhaps the recent patch to generic-pool (released in 4.3.1 and 5.0.0-beta.1) have solved this problem?

@clshortfuse care to test?

PS: It may be better to resolve the bug in the underlying library rather than re-write your own pool for tedious specifically? Or add a custom evictor to the current library?

@clshortfuse
Copy link

@dhensby It's still broken in 4.3.1. I just took my sample code I posted before and expanded it a bit to check. After you let it connect, and report the current time, and then disconnect the network adapter, it'll start outputting errors nonstop:

const mssql = require('mssql');

const sqlConfig = {
  // REDACTED
}

let pool = null;

/** @return {void} */
function reconnect() {
  mssql.connect(config).then((connection) => {
    /**
     * @param {Error} err
     * @return {void}
     */
    function onConnectionError(err) {
      pool = null;
      connection.removeAllListeners('error');
      connection.addListener('error', (e) => {
        console.log('failed to close connection');
        // Every tick or so, a new ConnectionError appears
        // Gets stuck in infinite loop.
        console.error(e);
      });
      connection.close((e) => {
        // Never reached if a connection disconnects
        console.log('NEVER REACHED!');
        // Retry in 10 seconds
        setTimeout(reconnect, 10000);
      });
    }
    connection.on('error', onConnectionError);
    pool = connection;
  }).catch((err) => {
    console.error('initial connection error', err);
  });
}

/** @return {void} */
function getTime() {
  if (!pool) {
    return;
  }
  pool.request().query('SELECT GETUTCDATE()').then((result) => {
    console.log(result);
    console.log('DISCONNECT NETWORK ADAPTER NOW!');
  }).catch((err) => {
    console.error(err);
  });
}

setInterval(getTime, 5000);
reconnect();

I was just rewriting my connection pooler and remembered about this issue and figured I'd share my pool code for others. It's pretty clean without using any complicated timers or sequencing which avoids race conditions. It basically maintains a queue of JS Promises. The only thing I would think needs a bit of tweaking is the idle check interval. There's a more efficient way than checking every X seconds for idle connections.

@dhensby
Copy link
Collaborator

dhensby commented Feb 13, 2019

@clshortfuse thanks for this - I'll look into it

@dhensby
Copy link
Collaborator

dhensby commented Feb 14, 2019

@clshortfuse with the code your provided the behaviour is as follows:

  1. Connection is successful and query result is logged
  2. DB server goes away
  3. Errors are logged (1000 errors get logged, in chunks of ~125)
  4. NEVER REACHED is logged, connection is closed and initial connection error is logged
  5. The connection is not recreated because the global connection already exists (this looks like a bug)

@dhensby
Copy link
Collaborator

dhensby commented Feb 14, 2019

5. The connection is not recreated because the global connection already exists (this looks like a bug)

well, to elaborate on this, this is actually because you are calling connection.close() instead of mssql.close() and so the global connection reference is not updated.

However, I think we can fix this by checking whether the global connection is connected or not...


see #804

@clshortfuse
Copy link

@dhensby I suggest you run the code. It has nothing to do with connection.close. You can remove the the connection.close line and it'll still spit out errors.

I was only using the .close() function to check the event is ever emitted. It doesn't cause the underlying issue.

@clshortfuse
Copy link

You can use this

/** @return {void} */
function reconnect() {
  mssql.connect(config).then((connection) => {
    connection.on('error', (err) => {
      pool = null;
      connection.removeAllListeners('error');
      connection.addListener('error', (e) => {
        console.log('failed to close connection');
        // Every tick or so, a new ConnectionError appears
        // Gets stuck in infinite loop.
        console.error(e);
      });
    });
    pool = connection;
  }).catch((err) => {
    console.error('initial connection error', err);
  });
}

And it'll still cause the issue. If you don't use connection.addListener('error'), then an UnhandledPromiseRejectionWarning is emitted, as I said in the October comment above.

(node:37105) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 3)
(node:37105) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

And as the warning states, in the future, NodeJS will exit the app when it encounters an unhandled promise rejections. This was the primary reason why I had to stop using the built-in connection pooling,

@dhensby
Copy link
Collaborator

dhensby commented Feb 14, 2019

I suggest you run the code.

I... did?? That's why I am able to provide my experience of what happens.

I can remove connection.close, sure - but the behaviour is the same, just it doesn't do step 5 in my list.

@clshortfuse
Copy link

clshortfuse commented Feb 14, 2019

@dhensby Sorry, I understood

this is actually because you are calling connection.close()

When the issue happens regardless of me calling it. I'm not sure what "this" was in reference to. I assumed it's the infinite error loop.

@clshortfuse
Copy link

Actually, re-reading your comment, you said 'NEVER REACHED' is logged. This does not happen for me. The environments could be different. I used to run on this on a Linux container, but I'm able to reproduce it now on Mac OS X v14.10.

Even if I do:

      connection.close((e) => {
        // Never reached if a connection disconnects
        console.log('NEVER REACHED!');
        process.exit();
        // Retry in 10 seconds
        setTimeout(reconnect, 10000);
      });

The process will never exit.

@dhensby
Copy link
Collaborator

dhensby commented Feb 14, 2019

Ah, no, that's just for the final point which, I agree, is inconsequential to your bug report

However I am not getting an infinite loop. It stops at about 1000 errors (connection attempts) can you share the non-secret parts of your DB config?

@clshortfuse
Copy link

clshortfuse commented Feb 14, 2019

The config is pretty limited:

const config = {
  server: '******.us-east-1.rds.amazonaws.com',
  port: ******,
  user: '*****',
  password: '********',
  database: '******',
};

You say you get about 1000 errors. I've been letting it spit out errors for the last 2 minutes. It's 120,000 errors with no sign of stopping, which is about 2000/sec as the original topic states. This is the error it spits out:

{ ConnectionError: Failed to connect to ******.us-east-1.rds.amazonaws.com:***** - getaddrinfo ENOTFOUND *****.us-east-1.rds.amazonaws.com
    at Connection.tedious.once.err (/Volumes/Mac SSD/projects/*****/node_modules/mssql/lib/tedious.js:239:17)
    at Object.onceWrapper (events.js:315:30)
    at emitOne (events.js:116:13)
    at Connection.emit (events.js:211:7)
    at Connection.socketError (/Volumes/Mac SSD/projects/*****/node_modules/mssql/node_modules/tedious/lib/connection.js:1024:14)
    at /Volumes/Mac SSD/projects/*****/node_modules/mssql/node_modules/tedious/lib/connection.js:868:25
    at GetAddrInfoReqWrap.callback (/Volumes/Mac SSD/projects/*****/node_modules/mssql/node_modules/tedious/lib/connector.js:69:18)
    at GetAddrInfoReqWrap.onlookupall [as oncomplete] (dns.js:79:17)
  code: 'ESOCKET',
  originalError: 
   { ConnectionError: Failed to connect to *****.us-east-1.rds.amazonaws.com:***** - getaddrinfo ENOTFOUND *****.us-east-1.rds.amazonaws.com
    at ConnectionError (/Volumes/Mac SSD/projects/*****/node_modules/mssql/node_modules/tedious/lib/errors.js:12:12)
    at Connection.socketError (/Volumes/Mac SSD/projects/*****/node_modules/mssql/node_modules/tedious/lib/connection.js:1024:30)
    at /Volumes/Mac SSD/projects/*****/node_modules/mssql/node_modules/tedious/lib/connection.js:868:25
    at GetAddrInfoReqWrap.callback (/Volumes/Mac SSD/projects/*****/node_modules/mssql/node_modules/tedious/lib/connector.js:69:18)
    at GetAddrInfoReqWrap.onlookupall [as oncomplete] (dns.js:79:17)
     message: 'Failed to connect to *****.us-east-1.rds.amazonaws.com:***** - getaddrinfo ENOTFOUND *****.us-east-1.rds.amazonaws.com',
     code: 'ESOCKET' },
  name: 'ConnectionError' }

That repeats infinitely.

@dhensby
Copy link
Collaborator

dhensby commented Feb 14, 2019

@clshortfuse to start, I'm testing on v5.0.0-beta.1 - what version are you currently using?

I've modified your supplied code slightly just to keep the verbosity down:

const mssql = require('mssql');

const sqlConfig = {
    password: '********',
    database: '********',
    // connectionTimeout: undefined,
    // requestTimeout: 30000,
    stream: false,
    options: { encrypt: true },
    port: 1433,
    user: '********',
    server: 'localhost',
    pool: {
        acquireTimeoutMillis: 15000,
    },
};
let errorCount = 0;

let pool = null;

/** @return {void} */
function reconnect() {
    mssql.connect(sqlConfig).then((connection) => {
        /**
         * @param {Error} err
         * @return {void}
         */
        function onConnectionError(err) {
            pool = null;
            connection.removeAllListeners('error');
            connection.addListener('error', (e) => {
                errorCount++;
                console.log('failed to close connection');
                // Every tick or so, a new ConnectionError appears
                // Gets stuck in infinite loop.
                console.error('connection error', errorCount);
            });
            connection.close((e) => {
                // Never reached if a connection disconnects
                console.log('NEVER REACHED!');
                // Retry in 10 seconds
                setTimeout(reconnect, 10000);
            });
        }
        connection.on('error', onConnectionError);
        pool = connection;
    }).catch((err) => {
        errorCount++;
        console.error('initial connection error', err);
        clearInterval(interval);
    });
}

/** @return {void} */
function getTime() {
    if (!pool) {
        console.log('no pool');
        return;
    }
    pool.request().query('SELECT GETUTCDATE()').then((result) => {
        console.log(result);
        console.log('DISCONNECT NETWORK ADAPTER NOW!');
    }).catch((err) => {
        errorCount++;
        console.error('error', errorCount);
        clearInterval(interval);
    });
}

const interval = setInterval(getTime, 5000);
reconnect();

From that I get this output:

$ node test.js 
{ recordsets: [ [ [Object] ] ],
  recordset: [ { '': 2019-02-14T23:25:46.930Z } ],
  output: {},
  rowsAffected: [ 1 ] }
DISCONNECT NETWORK ADAPTER NOW!
failed to close connection
connection error 1
failed to close connection
connection error 2
...
no pool
failed to close connection
connection error 375
failed to close connection
connection error 376
...
no pool
failed to close connection
connection error 750
failed to close connection
connection error 751
...
connection error 999
error 1000
NEVER REACHED!
failed to close connection
connection error 1001
initial connection error { ConnectionError: Failed to connect to localhost:1433 - Could not connect (sequence)
    at Connection.tedious.once.err (/.../node_modules/mssql/lib/tedious.js:244:17)
    at Object.onceWrapper (events.js:277:13)
    at Connection.emit (events.js:189:13)
    at Connection.socketError (/.../node_modules/tedious/lib/connection.js:1095:12)
    at Connector.execute (/.../node_modules/tedious/lib/connection.js:961:21)
    at SequentialConnectionStrategy.connect (/.../node_modules/tedious/lib/connector.js:121:7)
    at Socket.onError (/.../node_modules/tedious/lib/connector.js:136:12)
    at Socket.emit (events.js:189:13)
    at emitErrorNT (internal/streams/destroy.js:82:8)
    at emitErrorAndCloseNT (internal/streams/destroy.js:50:3)
    code: 'ESOCKET',
        originalError:
    { ConnectionError: Failed to connect to localhost:1433 - Could not connect (sequence)
        at ConnectionError (/.../node_modules/tedious/lib/errors.js:13:12)
        at Connection.socketError (/.../node_modules/tedious/lib/connection.js:1095:28)
        at Connector.execute (/.../node_modules/tedious/lib/connection.js:961:21)
        at SequentialConnectionStrategy.connect (/.../node_modules/tedious/lib/connector.js:121:7)
        at Socket.onError (/.../node_modules/tedious/lib/connector.js:136:12)
        at Socket.emit (events.js:189:13)
        at emitErrorNT (internal/streams/destroy.js:82:8)
        at emitErrorAndCloseNT (internal/streams/destroy.js:50:3)
        at process._tickCallback (internal/process/next_tick.js:63:19)
        message:
            'Failed to connect to localhost:1433 - Could not connect (sequence)',
                code: 'ESOCKET' },
    name: 'ConnectionError' }

and the process exits with exit code 0

@dhensby
Copy link
Collaborator

dhensby commented Feb 15, 2019

@clshortfuse I think I'm getting a better understanding of this now. I'm going to close this as I've opened a new issue so that it's a bit less noisy (#805). I'd really appreciate your comment on it.

@dhensby
Copy link
Collaborator

dhensby commented Feb 18, 2019

@clshortfuse you're right, this is a complete excrement exhibit. I've made the changes I can to #805 to make this a bit more reasonable, but there's still an issue where the pool will still try to create connections even after close the pool (see coopernurse/node-pool#256)

I've also opened #808, but this changes APIs so can't come out in v4 or v5... thought I don't suspect v6 is too far off given tedious are releasing new versions like it's going out of fashion

@dhensby
Copy link
Collaborator

dhensby commented Mar 8, 2019

This is now fixed in v5 by using a backoff (#806) and v6 by dropping generic-pool (#808)

@dhensby dhensby closed this as completed Mar 8, 2019
@tediousjs tediousjs locked as resolved and limited conversation to collaborators Mar 8, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants