Subject: Re: New release?

From: John Engelhart <john.engelhart_at_gmail.com>
Date: Tue, 27 Oct 2009 20:43:22 -0400

On Tue, Oct 27, 2009 at 6:51 PM, Daniel Stenberg <daniel_at_haxx.se> wrote:

> On Tue, 27 Oct 2009, John Engelhart wrote:
>
> On investigation I found that ares_timeout was returning a timeout of 0
>> seconds and 0 microseconds. The little digging I've done so far turned up
>> that ares_send.c sets the timeout to 0 seconds and 0 microseconds "by
>> default".
>>
>
> That's the creation of a new query so it just clears the variable, as it
> hasn't sent off the query yet.
>
> In the end of ares_send() it calls ares__send_query(), and within the
> ares__send_query() function the timeout is advanced (ares_process.c line
> 785 in my case). So when the query is then sent off, it sets the timeout
> properly.
>
> It would only remain 0+0 if the query for some reason wasn't sent, but then
> shouldn't it retry pretty fast then?

Geeze, I wish that query I was using was still "broken" and taking a long
time. Without it, it's hard to reproduce the behavior. :(

For whatever reason, I can tell you that the end result was that
occasionally, the loop that I previously posted would essentially "busy wait
poll" for seconds at a time, calling select() up to 500,000 times / sec.

Regardless, I really think that there should be some kind of "minimum" floor
returned instead of 0/0. Even with a minimum of 1000 microseconds (1
millisecond), that's 1000 times per second. Is there really a need to make
an attempt with such frequency? Even trying 100 times per second seems a
bit excessive, maybe more like 10 times per second.
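To make the suggestion concrete, here's one way an application could impose such a floor before handing the timeout to select(). The 100 ms value and the helper name are my own choices, not anything c-ares defines:

```c
#include <assert.h>
#include <sys/time.h>

/* A floor of 100 ms (my own number, not anything c-ares defines) caps the
 * loop at roughly 10 wakeups/sec even when ares_timeout() hands back 0+0. */
#define MIN_TIMEOUT_USEC 100000L

static void clamp_timeout(struct timeval *tv)
{
    /* Only a 0-second timeout can be "too soon"; leave longer ones alone. */
    if (tv->tv_sec == 0 && tv->tv_usec < MIN_TIMEOUT_USEC)
        tv->tv_usec = MIN_TIMEOUT_USEC;
}
```

The application would call this on the struct timeval it gets back from ares_timeout() before passing it to select().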

There's also something about the whole thing that's nagging at me. The work
loop I gave deals with "receiving packets". It doesn't really deal with
"sending packets", at least not explicitly. In fact, looking at the API,
there doesn't seem to be an explicit function for "sending packets" other
than the initial 'top-level' API call that starts the query. That is, there's
no explicit API for the sending work that isn't handled by that initial
send/query call, such as retransmitting packets that timed out, like you
suggested above.

Basically, if ares_timeout determines that a retry needs to happen at 0+0,
how exactly is that accomplished in the work loop? ares_timeout() only
determines that "some work needs to be done right now, so return 0+0." This
tells select() to return immediately. The fd_set result from select() is
then handed off to ares_process(). After a bit of digging, the answer seems
to be that ares_process() is the function that deals with timed-out packets
via a call to process_timeouts().
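For reference, here's the work-loop shape I mean, as I understand it from the docs (assuming a channel already set up with ares_init(); error handling omitted):

```c
#include <sys/select.h>
#include <ares.h>

/* The canonical c-ares work loop, as I read the docs: ares_fds() fills the
 * fd sets, ares_timeout() bounds the select() wait, and ares_process()
 * both reads answers and (internally, via process_timeouts()) retries
 * timed-out queries. */
static void wait_for_queries(ares_channel channel)
{
    fd_set read_fds, write_fds;
    struct timeval tv, *tvp;
    int nfds;

    for (;;) {
        FD_ZERO(&read_fds);
        FD_ZERO(&write_fds);
        nfds = ares_fds(channel, &read_fds, &write_fds);
        if (nfds == 0)
            break;                  /* no queries left pending */
        tvp = ares_timeout(channel, NULL, &tv);
        select(nfds, &read_fds, &write_fds, NULL, tvp);
        ares_process(channel, &read_fds, &write_fds);
    }
}
```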

Digging a bit more, maybe there's a bug in process_timeouts(), such that
ares_timeout() thinks there is something that needs to be dealt with "right
now" but it isn't being caught by process_timeouts(). Considering that
process_timeouts() tries to be "clever" about dealing with timeouts, I'm
guessing that's where the problem really is.

Actually, now that I'm looking at process_timeouts(), it looks like how
timeouts are managed in general could use a lot of work. As noted,
ares_timeout() does a linear scan of the "queue", yet process_timeouts()
uses an alternate means of determining what needs to be processed in the
queue, ostensibly in the name of "speed". I'm not sure I much see the point
in trying to be clever in process_timeouts() by avoiding a linear scan of
the queue when ares_timeout() does exactly that, and is called just as
frequently. It's also fairly easy to get this wrong, and that's probably
the cause of the behavior I observed.

While I agree a linear scan of the queue isn't the most efficient means, it
is the "easiest". Even with hundreds of packets in an unsorted linked list,
just how much time is really spent running through the queue, anyways? Did
profiling actually determine that this was a source of performance problems,
or was this a case of "premature optimization"? At first glance, I'd say
this is a case where the optimization hack greatly increases the complexity
of the code, is a lot harder to get right, isn't a performance problem in
practice, and is essentially nullified because ares_timeout() does a linear
scan anyway... and ares_timeout() is probably called once on each work loop
pass for most end users of the library, as that's what's documented and
given in the examples.

If it really is a genuine performance problem, there are better ways to
crack this nut than the solution used in process_timeouts(). The most
obvious is to use a better data structure: something like a red-black tree
or a skip list. This lets you add and remove items from the queue in good,
bounded time and access the queue in sorted order. That would allow
ares_timeout() to look at the first item in the queue in essentially O(1)
time, and process_timeouts() to stop traversing the queue as soon as it
reached a packet whose timeout was still "in the future".
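The early exit a sorted queue permits would look something like this. The 'query' struct here is a hypothetical stand-in of my own, not c-ares's real per-query structure:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-query record; the point is only the
 * early exit that a sorted queue makes possible. */
struct query {
    long timeout;           /* absolute expiry time, e.g. milliseconds */
    struct query *next;     /* queue kept sorted by ascending timeout  */
};

/* Walk the sorted queue, counting expired entries, and stop at the first
 * one still in the future; real code would retransmit each expired query
 * here instead of just counting. */
static int process_expired(const struct query *head, long now)
{
    int expired = 0;
    const struct query *q;
    for (q = head; q != NULL; q = q->next) {
        if (q->timeout > now)
            break;          /* sorted order: everything after is later */
        expired++;
    }
    return expired;
}
```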

Heck, even a crude, doubly-linked list is probably a better idea, and much
easier to get "correct" than the current hash optimization hack. Having a
pointer to the head and tail items in the queue would give you quick access
to both ends. When you want to add an item you'd start at the tail and walk
the linked list in reverse until you find the right point to insert the new
node. This takes advantage of the fact that in the common case, new items
are going to have a timeout that is "after" all the current items. Even in
the worst case, you'd have O(queue_size) operations to get to the head, and
this linear walk of the queue would be amortized over the savings that
ares_timeout() and process_timeouts() would gain, which would now be O(1)
all the time. Both ares_timeout() and process_timeouts() need to be "fast"
and are called frequently, whereas an insert happens just once per query
and, in the common case, is also an O(1) operation.
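A sketch of that tail-walk insert, with node and queue types of my own invention (not anything from c-ares):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node and queue types; names are mine, not c-ares's. */
struct tnode {
    long timeout;                /* absolute expiry time */
    struct tnode *prev, *next;
};

struct tqueue {
    struct tnode *head, *tail;   /* earliest and latest timeouts */
};

/* Insert keeping ascending timeout order, walking backward from the tail.
 * In the common case the new timeout is later than everything already
 * queued, so the loop body never runs and this is O(1); the worst case
 * is O(queue_size). */
static void tq_insert(struct tqueue *q, struct tnode *n)
{
    struct tnode *at = q->tail;
    while (at != NULL && at->timeout > n->timeout)
        at = at->prev;           /* step toward the head */
    n->prev = at;
    if (at != NULL) {
        n->next = at->next;
        at->next = n;
    } else {                     /* new earliest entry, or empty queue */
        n->next = q->head;
        q->head = n;
    }
    if (n->next != NULL)
        n->next->prev = n;
    else
        q->tail = n;             /* new latest entry */
}
```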
Received on 2009-10-28