Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RiakError with no further details sometimes raised when calling store() [JIRA: CLIENTS-553] #410

Closed
dellsystem opened this issue Aug 25, 2015 · 12 comments

Comments

@dellsystem
Copy link

Very occasionally, I'll see a RiakError (with no further details) raised during the execution of my code, typically when calling the store() method on a riak object (with no arguments).

The traceback (starting from the line immediately after the store() call) looks like this:

  File "/home/ubuntu/env/local/lib/python2.7/site-packages/riak/riak_object.py", line 286, in store
    timeout=timeout)
  File "/home/ubuntu/env/local/lib/python2.7/site-packages/riak/client/transport.py", line 196, in wrapper
    return self._with_retries(pool, thunk)
  File "/home/ubuntu/env/local/lib/python2.7/site-packages/riak/client/transport.py", line 138, in _with_retries
    return fn(transport)
  File "/home/ubuntu/env/local/lib/python2.7/site-packages/riak/client/transport.py", line 194, in thunk
    return fn(self, transport, *args, **kwargs)
  File "/home/ubuntu/env/local/lib/python2.7/site-packages/riak/client/operations.py", line 554, in put
    timeout=timeout)
  File "/home/ubuntu/env/local/lib/python2.7/site-packages/riak/transports/pbc/transport.py", line 220, in put
    MSG_CODE_PUT_RESP)
  File "/home/ubuntu/env/local/lib/python2.7/site-packages/riak/transports/pbc/connection.py", line 56, in _request
    return self._recv_msg(expect)
  File "/home/ubuntu/env/local/lib/python2.7/site-packages/riak/transports/pbc/connection.py", line 178, in _recv_msg
    raise RiakError(bytes_to_str(err.errmsg))
RiakError

It happens very rarely, and I haven't been able to reproduce it on demand yet to uncover the actual problem. In the meantime, I would appreciate any suggestions on the best way to handle this. Should I catch the RiakError in the application level, wrapping the .store() call in a try/catch block? Or is there a better way?

Thanks in advance.

@Basho-JIRA Basho-JIRA changed the title RiakError with no further details sometimes raised when calling store() RiakError with no further details sometimes raised when calling store() [JIRA: CLIENTS-553] Aug 25, 2015
@hazen
Copy link

hazen commented Aug 25, 2015

@dellsystem, a try/catch block would certainly be a good solution. You could read the key again and see if it was actually successfully written. It does seem strange, however, that your exception doesn't seem to have any error message. Do you see anything on the server side in the logs?

@dellsystem
Copy link
Author

Thanks for the tip about checking the logs @javajolt.

Here's what I see in console.log, at around the same time that the RiakError occurred (different machines with some clock drift, so I'm not sure if it was exactly the same time or not):

2015-08-24 17:05:23.139 [info] <0.419.0>@riak_kv_entropy_manager:perhaps_log_throttle_change:853 Changing AAE throttle from 10 -> 0 msec/key, based on maximum vnode mailbox size 1 from '[email protected]'

Would that explain it? There are no other logs that contain entries around that time.

@hazen
Copy link

hazen commented Aug 25, 2015

I don't think that would be problematic. It just information from the AAE subsystem.

@dellsystem
Copy link
Author

Could it have something to do with Solr? The bucket that I was updating has a Solr index, and I've been having some problems with Solr lately - it seems as if there's a Riak node containing old versions of certain search indexes.

I also see this in /var/log/solr.log of some (but not all) of the Riak nodes in my cluster:

2015-08-27 01:09:32,356 [WARN] <qtp248927068-366709>@CoreAdminHandler.java:495 Creating a core with existing name is not allowed
2015-08-27 01:09:32,356 [ERROR] <qtp248927068-366709>@SolrException.java:109 org.apache.solr.common.SolrException: Core with name 'someindexname' already exists.
    at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:496)
    at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:152)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:732)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:268)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:368)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:745)

2015-08-27 01:09:32,356 [ERROR] <qtp248927068-366709>@SolrException.java:120 null:org.apache.solr.common.SolrException: Core with name 'someindexname' already exists.
    at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:496)
    at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:152)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:732)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:268)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:368)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:745)

This message shows up on the logs approximately once a minute. Not sure what's causing it, or if it's even related to my original problem, so any thoughts on this would be appreciated

@hazen
Copy link

hazen commented Aug 27, 2015

Are you re-creating the Solr index? If so, we already have an issue pointing out this problem: #408

@dellsystem
Copy link
Author

Not that I know of. I do have scripts that occasionally try to access the indexes mentioned, but they never deliberately attempt to create the index, and they don't run at that frequency either. Is there anything else that could be causing those errors?

I've also recently been getting a RiakError with no further details when trying to run a Solr query using the Riak python client, but this time I see the following message in solr.log. Maybe it's related?

2015-08-27 14:59:51,918 [WARN] <qtp814878326-21>@ServletHandler.java:517 /internal_solr/someindexname/select
java.lang.IllegalStateException: Committed
    at org.eclipse.jetty.server.Response.resetBuffer(Response.java:1144)
    at org.eclipse.jetty.server.Response.sendError(Response.java:314)
    at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:824)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:448)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:368)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:745)

EDIT: Just had a thought, related to the issue you linked. I did do some index deleting/recreating a while back. Could the log messages be related to that? Is there a background process that periodically checks search indexes for validity or something, that keeps running even weeks after the indexes were last updated?

@hazen
Copy link

hazen commented Aug 27, 2015

@dellsystem Which version of Riak are you using? Could you explicitly remove this index (someindexname) especially if you are not using it?

@dellsystem
Copy link
Author

@javajolt, 2.1.1.

I'm going to do some index deleting/recreating to see if that fixes anything. I'll update this issue with what I find. Thanks for your help.

@dellsystem
Copy link
Author

Update on this, in case anyone else has the same problem: I've rebuilt all the problematic indexes, upgraded the specs of the machines in the cluster, and incorporated the retrying Python library in case it happens again. I haven't seen this specific error (when calling .store()) occur again, so I suspect rebuilding the indexes fixed it, at least for the time being.

@hazen
Copy link

hazen commented Sep 1, 2015

@dellsystem Can we please close this issue then and reopen if it comes back? I'm also suspicious that this might be related to Solr than to the Python client.

@dellsystem
Copy link
Author

It does feel like more of a Solr issue. Will do.

@Basho-JIRA
Copy link

Fixed, or closed via GitHub issues.

[posted via JIRA by Alexander Moore]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants