Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simpler client request interface? #356

Open
gavin-norman-sociomantic opened this issue Aug 9, 2018 · 14 comments
Open

Simpler client request interface? #356

gavin-norman-sociomantic opened this issue Aug 9, 2018 · 14 comments

Comments

@gavin-norman-sociomantic
Copy link

gavin-norman-sociomantic commented Aug 9, 2018

(This issue possibly doesn't apply to swarm specifically, but might have to be implemented independently in each *proto. Writing it here initially, just to put it in a central location.)

The current recommended client usage looks something like this (taking the DHT client's Get request as an example):

private void getNotifier ( DhtClient.Neo.Get.Notification info, Const!(DhtClient.Neo.Get.Args) args )
{
    with ( info.Active ) switch ( info.active )
    {
        // Success cases...
        case received:
            // Received the requested record.
            break;

        case no_record:
            // Key not found in DHT.
            break;

        // Transient error cases...
        case timed_out:
        case node_disconnected:
            // The request may be retried.
            break;

        // Fatal error cases...
        case failure:
        case no_node:
        case node_error:
        case wrong_node:
        case unsupported:
            // The request may be retried, but is unlikely to ever succeed.
            break;
    }
}

This is of the "precise knowledge gives great control" variety of API: it provides the user with full information about what happened, allowing them to decide what to do, but does not implement any default behaviour.

On the other hand, we have the simplified, Task-blocking API, which looks like this:

void[] get_buf;
auto result = this.dht.blocking.get("channel", 0x1234567812345678,
    get_buf);
if ( result.succeeded )
{
    if ( result.value.length )
        // Received the requested record.
    else
        // Key not found in DHT.
}
else
    // All error cases

This is the opposite extreme: it only provides the user with ultra minimal information about what happened, and still doesn't implement any default error handling behaviour.

It seems to me that:

  1. Real use cases will inevitably use the first, more complex API.
  2. Most (all?) use cases will end up implementing more or less the same error handling behaviour via the notifier.
  3. An API that provides built-in error handling would be more useful.

Something like this, for DHT Get:

  • Return the fetched record to the user in a buffer.
  • Automatically log and retry the request after a transient error.
  • Automatically log and abort the request after a fatal error.
@gavin-norman-sociomantic
Copy link
Author

gavin-norman-sociomantic commented Aug 9, 2018

@scott-gibson-sociomantic @don-clugston-sociomantic @daniel-zullo-sociomantic as the primary users of the neo clients, any thoughts on this?

@scott-gibson-sociomantic

in the case of fatal errors failure, no_node, node_error, wrong_node and unsupported, is there any thing that would an application would do differently as a response for these cases? Could these be lumped in to a generic fatal_error that provides more detail in the format notification?

As for retry behaviour, i'm unsure how to approach this as my applications have different requirements (some should always retry, some can retry a few times then decide not to continue). As long as these apps can have control over the behaviour then perhaps a simple request wrapper would suffice.

In either case i think we might need a greater sample size of usage until we can try to simplify.

@gavin-norman-sociomantic
Copy link
Author

in the case of fatal errors failure, no_node, node_error, wrong_node and unsupported, is there any thing that would an application would do differently as a response for these cases? Could these be lumped in to a generic fatal_error that provides more detail in the format notification?

Right, that could work. It'd require quite a bit of reworking of the way things are currently set up, but I think it'd present a nicer API to the end-user.

As for retry behaviour, i'm unsure how to approach this as my applications have different requirements (some should always retry, some can retry a few times then decide not to continue). As long as these apps can have control over the behaviour then perhaps a simple request wrapper would suffice.

Indeed, this is the tricky bit. I imagine that one-shot requests like this that could be retried would have some additional arguments to configure this behaviour. Something like this maybe:

DhtClient.Neo.Get.RetryConfig retry;
retry.max_retries = 3;
dht_client.get("channel", key, retry);

@gavin-norman-sociomantic
Copy link
Author

Right, that could work. It'd require quite a bit of reworking of the way things are currently set up, but I think it'd present a nicer API to the end-user.

Though... it'd mean a union inside a union, which might not be that fun to deal with.

@gavin-norman-sociomantic
Copy link
Author

In either case i think we might need a greater sample size of usage until we can try to simplify.

I agree with this point. We don't have a large enough usage sample yet to be able to make a good judgement. I just wanted to get some ideas going, in advance.

@nemanja-boric-sociomantic
Copy link
Contributor

How about having the taskblocking call return success on success (as is now) and a SmartUnion on failure, so that user can react on only those events they are interested in:

void[] get_buf;
auto result = this.dht.blocking.get("channel", 0x1234567812345678,
    get_buf);
if ( result.succeeded )
{
    if ( result.value.length )
        // Received the requested record.
    else
        // Key not found in DHT.
}
else {
    with (result.error) switch (result.error) {
        case node_disconnected:
            schedule_retry(); break;
     }
}

@gavin-norman-sociomantic
Copy link
Author

That would be an improvement to the current API, yes. It wouldn't address the point about most apps (probably) ending up implementing the same error handling logic, though, which is kind of my main problem with the current approach.

@gavin-norman-sociomantic
Copy link
Author

Thoughts on implementing this:

  • The retrying logic should be implementing inside each request handler, not at the level of the request assignment method.
  • Doing it this way will make error handling consistent across all requests (i.e. handle retrying or continuing after transient errors).
  • This approach will also be compatible with request timeouts, for requests that implement them. (Timeouts work at the level of the request, so you could have a situation where a request starts, gets a connection error, retries, and succeeds, all within the timeout.)

@don-clugston-sociomantic
Copy link
Contributor

Scott;

in the case of fatal errors failure, no_node, node_error, wrong_node and unsupported, is there any thing that would an application would do differently as a response for these cases? Could these be lumped in to a generic fatal_error that provides more detail in the format notification?

These cases are not quite the same. Some of those, such as unsupported indicated a deployment versioning problem. I would want to terminate the app in that case, it can only be fixed by installing new software somewhere.
But some of the others might be caused by hardware problems in one of the nodes or in a router. In that case the ISP or infrastructure team may fix it, so that it will recover automatically, without needing to restart the app. But even so, I think your idea of a fatal_error is a good one.

Gavin:

Real use cases will inevitably use the first, more complex API.

I don't think that's true. For apps like shore, where the data is not very valuable, the task blocking API is perfect.
In general I'm wary of something like a framework that tries to simplify the most common kinds of requests. It's very easy for that sort of thing to be not-quite-flexible-enough.

@daniel-zullo
Copy link

You might want to consider defining a generic default behaviour for retrying a request and error handling given the user the flexibility to implement their own ones for special/corner cases. From the user perspective this could help avoiding code duplication and also reduce the focus in the logic of handling a request and mainly deal with the business logic on application/client side instead.

@gavin-norman-sociomantic
Copy link
Author

You might want to consider defining a generic default behaviour for retrying a request and error handling given the user the flexibility to implement their own ones for special/corner cases. From the user perspective this could help avoiding code duplication and also reduce the focus in the logic of handling a request and mainly deal with the business logic on application/client side instead.

Yeah, I certainly wouldn't envisage removing the current "full info, full power" API.

@gavin-norman-sociomantic
Copy link
Author

I don't think that's true. For apps like shore, where the data is not very valuable, the task blocking API is perfect.

Interesting. I didn't expect anyone to use the blocking API without the notifiers in anything but simple tests. The danger of it is that error logging will be very unhelpful.

@gavin-norman-sociomantic
Copy link
Author

Don, maybe we're getting mixed up. There are three ways of assigning many requests:

  1. The callback-based, full notifier method.
  2. The blocking, full notifier method.
  3. The blocking, no notifier method.

3 isn't recommended for use in real apps, but 2 is.

@scott-gibson-sociomantic

Some of those, such as unsupported indicated a deployment versioning problem. I would want to terminate the app in that case, it can only be fixed by installing new software somewhere.

Theoretically the node could be reverted or updated that would resolved the unsupported request. An aggregation app continue running aggregating data in that case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants