Subject: Re: API tweak request

From: William Ahern <william_at_25thandclement.com>
Date: Mon, 2 Nov 2009 15:49:34 -0800

On Mon, Nov 02, 2009 at 05:58:23PM -0500, John Engelhart wrote:
<snip>
> > they are highly specific for the ares_expand_*() functionality while one
> > could argue the way you do for all the functions c-ares provides that return
> > allocated data. And I don't want two new functions for each and every one of
> > them.
> >
> > So where do we draw the line?
>
> Well, as I mentioned, it's a minor inconvenience for me. It's a
> 'different' way of dealing with returning results, and it's one of those
> choices that once you've made it, it's hard to undo it. The ICU Unicode
> library tends to use the API style I outlined, for example.

At a minimum why have two functions? Wouldn't something with the semantics of
snprintf()/strlcpy() work better?
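
To make the suggestion concrete, here is a minimal sketch of what a single
function with snprintf()/strlcpy()-style semantics could look like. The name
copy_name() and its signature are hypothetical and are not part of c-ares; the
point is only that one call can both report the required size and fill a
caller-owned buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, NOT part of c-ares: copy `src` into the caller's
 * buffer with snprintf()-style semantics. At most dstsize - 1 bytes are
 * written and the result is NUL-terminated whenever dstsize > 0.
 * Returns the length the full string needs (excluding the NUL), so a
 * return value >= dstsize tells the caller the output was truncated.
 */
size_t copy_name(char *dst, size_t dstsize, const char *src)
{
    size_t need = strlen(src);

    if (dstsize > 0) {
        size_t n = (need < dstsize - 1) ? need : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return need;
}
```

A caller that wants an exact-size buffer can call it once with a zero-length
buffer to learn the size, then again with real storage; a caller with a
fixed stack buffer just calls it once and checks the return value.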

<snip>
> In this case, it's not really my choice. The way any given object manages
> it's internals is beyond me. It's similar to not being able to 'force' any
> given function in C to "keep" the passed in pointer to a buffer instead of
> malloc()ing it's own private copy, such as when ares_search() does a
> strdup() on the query name argument.

So is the issue merely convenience for you or are you worried about the
allocation overhead? Considering that you're using Objective-C, the
performance aspect is hardly persuasive. More importantly, the original ares
API was deliberately and validly designed to pass back dynamic memory
objects. It's not everybody's cup of tea (I prefer snprintf()-type
semantics), but there's much to be said for consistency and simplicity in
sticking to the original design.

<snip>
> > It's not very hard to make sure that memory doesn't leak in such a design.
> > I find that a rather weak argument.
>
> I would generally agree with you if we were talking about C99. Even though
> Objective-C is a strict superset of C, there are some differences, and one
> of those is Objective-C has an exception handling infrastructure (which,
> depending on the architecture and ABI, uses plain C and a combination of
> setjmp/longjmp + runtime bits and pieces, or the C++ exception handling
> machinery). An exception could be thrown which will cause the stack to
> unwind until something catches the exception. A C equivalent might be
> something like:

There are several places in c-ares that would leak memory and/or do the
wrong thing if an exception were thrown from a callback (see, e.g., the
freeing of server->tcp_buffer in read_tcp_data()). The problem is that in
many cases callbacks are done from loops, and also some cleanups aren't done
until after the callback. If you can fix the latter then you might make it
safe for exceptions. But until the former is fixed c-ares isn't generally
re-entrant safe. Practically speaking, you shouldn't let exceptions
propagate from a callback, period. I tried to submit a patch to make c-ares
completely re-entrant (including destroying it from a callback), but it was
deemed too intrusive (and perhaps rightfully so; c-ares works for people,
they depend on it as-is, and there's a strong argument for being
conservative with the codebase).
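
As a rough illustration of "don't let it propagate", the callback can simply
record the failure and let the application act on it after the library call
has returned. Everything below is a hypothetical stand-in (struct
request_state, on_result(), dispatch()); none of it is c-ares code, and
dispatch() only mimics a library routine that cleans up after invoking the
callback.

```c
#include <stddef.h>

/* Hypothetical per-request state owned by the application. */
struct request_state {
    int failed;   /* set by the callback instead of unwinding the stack */
    int status;   /* status code that was handed to the callback */
};

/* Hypothetical callback: it only records what happened and returns. */
void on_result(void *arg, int status)
{
    struct request_state *st = arg;

    st->status = status;
    if (status != 0)
        st->failed = 1;   /* no longjmp(), no throw: just remember it */
}

/* Stand-in for a library routine that cleans up after the callback. */
int dispatch(void (*cb)(void *, int), void *arg, int status, int *cleaned)
{
    cb(arg, status);
    *cleaned = 1;         /* never reached if cb unwinds past this frame */
    return 0;
}
```

Once dispatch() has returned, the caller can inspect st.failed and raise
whatever exception it likes from its own frame, where nothing inside the
library is left half-finished.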

<snip>
> Which reminds me: Just what is the encoding for the strings that
> ares_expand_name() and ares_expand_string() return? I haven't dug that deep
> in to the DNS RFC's at this point to dig out the official answer. Anyone
> know off the top of their head? Or any pointers for dealing with
> "non-ASCIIish" stuff in DNS replies?

DNS uses a domain name compression scheme (suffix scheme?) which uses
pointers into domains elsewhere in the packet. Basically, every label can
either be a string, or a pointer to another label. If you encounter a
pointer, you just continue parsing from that label.

yahoo.com => yahoo.com
www.yahoo.com => www.[pointer to yahoo.com]
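
For the curious, following those pointers looks roughly like the sketch
below. It assumes the RFC 1035 wire format (a length byte before each label,
and a two-byte pointer whose top two bits are set, carrying a 14-bit offset
from the start of the message). expand_name_sketch() is a hypothetical name
and this is not how c-ares implements ares_expand_name(); it copies label
bytes as-is and does no escaping.

```c
#include <stddef.h>

/*
 * Hypothetical decoder, NOT the c-ares implementation. Expands the name
 * that starts at offset `pos` of the message `msg` (length `len`) into
 * `out` as dotted text. Label bytes are copied unmodified (no escaping).
 * Returns the text length, or -1 on malformed input or a too-small buffer.
 */
int expand_name_sketch(const unsigned char *msg, size_t len, size_t pos,
                       char *out, size_t outsize)
{
    size_t used = 0;
    int jumps = 0;

    for (;;) {
        if (pos >= len)
            return -1;
        unsigned char c = msg[pos];

        if ((c & 0xC0) == 0xC0) {              /* compression pointer */
            if (pos + 1 >= len || ++jumps > 64)
                return -1;                     /* truncated, or a loop */
            pos = ((size_t)(c & 0x3F) << 8) | msg[pos + 1];
            continue;                          /* keep parsing at target */
        }
        if (c & 0xC0)
            return -1;                         /* reserved label types */
        if (c == 0)
            break;                             /* root label ends the name */
        if (pos + 1 + (size_t)c > len)
            return -1;                         /* label runs off the end */

        size_t dot = (used > 0) ? 1 : 0;
        if (used + dot + (size_t)c + 1 > outsize)
            return -1;                         /* no room for label + NUL */
        if (dot)
            out[used++] = '.';
        for (size_t i = 0; i < c; i++)
            out[used++] = (char)msg[pos + 1 + i];
        pos += 1 + (size_t)c;
    }
    if (used >= outsize)
        return -1;
    out[used] = '\0';
    return (int)used;
}
```

The jump counter is there because a hostile packet can contain a pointer
that points back at itself; without a bound the loop would never end.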

It was perhaps a good idea in the 1980s and 1990s (but see Bernstein's opinion
at http://cr.yp.to/djbdns/notes.html), but today most domain information is
scattered across different systems and uses out-of-bailiwick references
which makes the compression impossible to perform. For example:

        www.microsoft.com IN CNAME xyz1.akamai.com

There's no common suffix domain. Of course, that is a poor example because,
secondly, you're not supposed to compress record data anyhow (otherwise
caching name servers could not cache unknown record types). The better
approach is the one Bernstein suggested: to have used a generic compression
algorithm over the whole packet.
Received on 2009-11-03