Subject: Re: Need help to understand how timeouts work in C-ARES

Re: Need help to understand how timeouts work in C-ARES

From: Timo Teräs <timo.teras_at_iki.fi>
Date: Tue, 03 Jan 2012 08:41:06 +0200

On 01/03/2012 08:33 AM, Henrik Størner wrote:
> Hi Timo,
>
> On 03-01-2012 07:12, Timo Teräs wrote:
>> On 01/02/2012 11:01 PM, Henrik Størner wrote:
>>> I then thought that after 10 seconds, my callback would be invoked with
>>> a status of ARES_ETIMEOUT, but that does not happen.
>>
>> Your test program is flawed. It should call ares_process() *also* when
>> select() returns time out.
>
> thanks for looking at this. OK, that does make sense - however, although
> it changes the behaviour a bit, it doesn't make the callback trigger
> with a timeout event.
>
> I changed the select-processing to do this:
>
> gotresponse = 0;
> while (!gotresponse) {
> FD_ZERO(&fdread); FD_ZERO(&fdwrite);
> maxfd = ares_fds(channel, &fdread, &fdwrite);
> if (maxfd == 0) {
> fprintf(stderr, "No more fds to test\n");
> gotresponse = 1;
> continue;
> }
>
> selecttimeout.tv_sec = 1; selecttimeout.tv_usec = 0;
> ares_timeout(channel, &selecttimeout, &selecttimeout);
>
> n = select(maxfd, &fdread, &fdwrite, NULL, &selecttimeout);
> if (n == -1) {
> fprintf(stderr, "select failed\n");
> return 1;
> }
>
> if (n == 0) fprintf(stderr, "timeout during select\n", maxfd);
>
> ares_process(channel, &fdread, &fdwrite);
> }
>
> so ares_process is invoked also in case of a timeout by select(). But
> the timeout callback never fires, instead it stays stuck in this loop
> doing select, timing out, calling ares_process(), ... forever.
>
> Running it does appear to indicate that something happens as part of the
> timeout setting, since the timeout returned from ares_timeout() changes
> ever so slightly around the time that the timeout *should* happen:
> $ ./a.out
> select timeout=1.000000, maxfd=4
> timeout during select
> select timeout=1.000000, maxfd=4
> timeout during select
> select timeout=1.000000, maxfd=4
> timeout during select
> select timeout=1.000000, maxfd=4
> timeout during select
> select timeout=1.000000, maxfd=4
> timeout during select
> select timeout=1.000000, maxfd=4
> timeout during select
> select timeout=1.000000, maxfd=4
> timeout during select
> select timeout=1.000000, maxfd=4
> timeout during select
> select timeout=1.000000, maxfd=4
> timeout during select
> select timeout=0.991000, maxfd=4
> timeout during select
> select timeout=1.000000, maxfd=4
> timeout during select
>
> See how on the 10th round of the loop, the timeout is 0.991 seconds
> instead of a round 1.0 ? So something is happening with this 10 second
> timeout, but unfortunately it doesn't trigger the timeout callback.

I believe you are too hasty. The timeout is the time out until first
resend of query happens.

ARES_OPT_TIMEOUT is documented as:
        The number of seconds each name server is given to respond
        to a query on the first try. (After the first try, the
        timeout algorithm becomes more complicated, but scales
        linearly with the value of timeout.) The default is five
        seconds. This option is being deprecated by ARES_OPT_TIMEOUTMS
        starting in c-ares 1.5.2.

You might want to set ARES_OPT_TRIES:
        The number of tries the resolver will try contacting each
        name server before giving up. The default is four tries.

-Timo
Received on 2012-01-03