Subject: Need help to understand how timeouts work in C-ARES

From: Henrik Størner <henrik-cares_at_hswn.dk>
Date: Mon, 02 Jan 2012 22:01:22 +0100

Hi,

I am trying to use C-ARES for an application that will do DNS lookups in
parallel with other network I/O (it is a network test-tool, meant to
check the availability of various network services). And either I do not
understand how the timeout mechanism is supposed to work in C-ARES, or
there is a bug somewhere in the timeout handling.

To demonstrate this, I have written a small test application based
mostly on the "adig" utility that comes with C-ARES. This application
does the following:

- use ares_library_init() to initialize the c-ares library;
- create a channel with options including ARES_OPT_TIMEOUT with a
value of 10 (seconds);
- use ares_set_servers() to point the channel to an IP that does not
respond;
- use ares_query() to send a single query to the non-responding DNS
server.

It then loops doing ares_fds(), ares_timeout(), select() and
ares_process(). I pass a "maxtv" value of 1 second to ares_timeout(),
because I only want to wait 1 second between each select() call (to do
other I/O between the ARES checks); as I understand the man-pages, this
should be perfectly fine.
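
For illustration, the capping that passing "maxtv" is understood to give
can be sketched as below. This is an assumption based on the man-page
description, not code taken from c-ares; cap_timeout() is a hypothetical
helper that only shows the expected "shorter of the two" behaviour.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper, not part of c-ares: return whichever of the two
 * timeouts is shorter. A NULL pointer means "no limit" on that side. */
const struct timeval *cap_timeout(const struct timeval *next,
                                  const struct timeval *maxtv)
{
    if (next == NULL)
        return maxtv;
    if (maxtv == NULL)
        return next;

    /* Compare whole seconds first, then the microsecond part. */
    if (next->tv_sec != maxtv->tv_sec)
        return (next->tv_sec < maxtv->tv_sec) ? next : maxtv;
    return (next->tv_usec <= maxtv->tv_usec) ? next : maxtv;
}
```

With a library timeout of 10 seconds and a maxtv of 1 second, select()
would wait at most 1 second; a library timeout of 0.0 seconds wins over
any maxtv, which is why the loop stops sleeping at all once that value
is returned.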

The callback for the channel simply prints out a message that it was
invoked.

I expected that after 10 seconds my callback would be invoked with a
status of ARES_ETIMEOUT, but that does not happen. Instead,
ares_timeout() begins to return a timeout value of 0.0 seconds, while
ares_fds() still indicates that the channel is active. The result is
that the application starts busy-looping in the
ares_fds()/ares_timeout()/select()/ares_process() loop and stays there
forever.

I have tested this with version 1.7.5 of C-ARES.

If you want to check the details, the test application can be found at
http://henrik.hswn.dk/test-arestimeout.c - it isn't completely polished
for portability, but it does compile on Linux with "gcc -I./c-ares-1.7.5
test-arestimeout.c c-ares-1.7.5/.libs/libcares.a"

A workaround is to let my application keep track of how long the lookup
has been underway, and then invoke ares_destroy() on the channel after
10 seconds - in that case the callback *is* invoked with the
ARES_EDESTRUCTION status, as expected. That may be the solution for now,
but then I don't see what the point is in having the timeout option in
C-ARES at all...
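
A minimal sketch of the bookkeeping this workaround needs, assuming one
timer per lookup. The struct and function names are hypothetical and not
part of c-ares; the caller is assumed to call ares_destroy() on the
channel once the timer reports expiry.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical bookkeeping for one lookup; not a c-ares type. */
struct lookup_timer {
    time_t started;   /* when the query was sent */
    int    limit;     /* give up after this many seconds */
};

/* Record the start time and the limit when the query is sent. */
void lookup_timer_start(struct lookup_timer *t, time_t now, int limit_secs)
{
    t->started = now;
    t->limit = limit_secs;
}

/* True once the lookup has been underway for at least the limit.
 * The caller would then call ares_destroy() on the channel. */
bool lookup_timer_expired(const struct lookup_timer *t, time_t now)
{
    return difftime(now, t->started) >= (double)t->limit;
}
```

In the select() loop this would be checked once per iteration with
time(NULL) as "now". Wall-clock time can jump, so a monotonic clock may
be a better source if that matters for the application.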

I hope someone can shed some light on this.

Thanks,
Henrik
Received on 2012-01-02