<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Edell, David J. wrote:
<blockquote
cite="mid:FEEF8FE8A931BE478D2538A677C48BED0334AC03@aplesjustice.dom1.jhuapl.edu"
type="cite">
<div><font face="Arial" size="2"><span class="827071019-10042008">Scott,</span></font></div>
<div><font face="Arial" size="2"><span class="827071019-10042008"></span></font> </div>
<div><font face="Arial" size="2"><span class="827071019-10042008">I'm
finishing up our AMS implementation and assembled a few more
miscellaneous comments/questions that I have come across. Funding on
the project is about at an end though (and I'm leaving on vacation for
CA tonight), therefore our first version of AMS can be considered
completed for now. The APL-AMS implementation is largely, with the
exception of invitations and related functions, compliant with
conformance class 4 as described. </span></font></div>
<div><font face="Arial" size="2"><span class="827071019-10042008"></span></font><font
face="Arial" size="2"><span class="827071019-10042008"></span></font> </div>
<div><font face="Arial" size="2"><span class="827071019-10042008">I
was taking another look at the configuration server interrogation
process. My current implementation utilizes a shared memory location
for announcing the registrar's location, however I'm looking into an
alternative MAMS-based method to avoid some of the synchronization and
related issues with the shared memory. </span></font></div>
<div><font face="Arial" size="2"><span class="827071019-10042008"></span></font> </div>
<div><font face="Arial" size="2"><span class="827071019-10042008">If
I modify the process to use the MAMS registrar_query, and await a
cell_spec/registrar_unknown response though, how would the CS know
where to direct the response as defined? It seems that the
registrar_query message would require a MAMS endpoint as its
supplemental data, indicating the origin of, and return
destination for, the given message.</span></font></div>
</blockquote>
(David, I hope you don't mind, but I think the points you raise here
are important enough to bring forward to the whole working group, so
I'm cc:ing sis-ams on my reply.)<br>
<br>
That's one of the big reasons TCP isn't a really good choice as the
primary transport protocol (the one that carries MAMS traffic):
it's a bootstrapping problem. If you use UDP or some other
non-connection-oriented protocol, the transport protocol itself will
give you
the origin of the registrar_query; that "echo" location (host/port, for
example) can be opaquely passed up to the registrar_query handler and
then passed back down when the cell_spec or registrar_unknown is
returned.<br>
<br>
You're right, though, that including the response MAMS endpoint in
every MAMS message that is of a query nature is a reasonable
alternative: it removes one constraint on what makes a good primary
transport protocol, at the cost of a little extra transmission. I
dunno. My inclination is not to change something that currently seems
to work okay unless there's a really compelling reason, and I'm not
sure we've got one yet.
Anybody have any thoughts on this?<br>
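<br>
To make the "echo" idea concrete, here is a rough sketch in Python. It
is only an illustration: the plain-text message format, the function
names, and the cell_spec contents are all made up for the example and
are not the MAMS encoding. The one real point is that recvfrom() hands
back the sender's host/port along with the datagram, so the handler can
reply without the query carrying its own return address.<br>
<pre>
```python
import socket


def serve_one_query(server_sock, known_cells):
    """Answer one registrar_query-style datagram (hypothetical text format)."""
    # recvfrom() returns the payload plus the sender's (host, port): the "echo" location.
    data, origin = server_sock.recvfrom(1024)
    cell = data.decode("ascii")
    if cell in known_cells:
        reply = "cell_spec " + known_cells[cell]
    else:
        reply = "registrar_unknown"
    # The origin is passed back down opaquely; the handler never parses it.
    server_sock.sendto(reply.encode("ascii"), origin)
    return origin


def demo():
    """Run one query/response exchange over loopback UDP."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        client.bind(("127.0.0.1", 0))
        server.settimeout(2.0)
        client.settimeout(2.0)
        client.sendto(b"cell-7", server.getsockname())
        origin = serve_one_query(server, {"cell-7": "udp:127.0.0.1:2357"})
        reply, _ = client.recvfrom(1024)
        return origin == client.getsockname(), reply.decode("ascii")
    finally:
        server.close()
        client.close()
```
</pre>
If the response endpoint were instead carried in the query's
supplemental data, serve_one_query would read the return address out of
the message body and would no longer depend on what the transport
reports as the origin.<br>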
<blockquote
cite="mid:FEEF8FE8A931BE478D2538A677C48BED0334AC03@aplesjustice.dom1.jhuapl.edu"
type="cite">
<div><font face="Arial" size="2"><span class="827071019-10042008">It's
specified that a Fault.Indication is generated if no response is
received from a node_registration request within N2 seconds. Shouldn't
there be a similar condition for a reconnect message? I've implemented
the same timeout for both states, with the node generating a
fault.indication and then automatically resending the registration or
reconnect message as appropriate.</span></font></div>
</blockquote>
I think I see your point. The node_registration Fault.indication gives
the application an opportunity to decide whether it wants to keep
trying or bail. Once the node is connected, it is likely engaged in
ongoing application message exchange with its peers; if the registrar
dies while this is going on, the node probably doesn't want to
interrupt its application message exchange activity just because
there's no registrar, so my inclination has been not to bother the node
with reconnection status. On the other hand, it might be helpful to
tell the node that any new subscriptions it posts aren't going to have
any effect because the registrar is dead.<br>
<br>
From that point of view, what would probably be more useful would be to
deliver a Fault.indication when the absence of the registrar is first
noticed, and then deliver some other sort of indication -- something
new -- when the reconnection succeeds, ignoring the timeouts (since
they don't change the registrar connection state). But that isn't
quite perfect either, since the registrar could actually have been dead
for 30 seconds before AMS detected the third missing heartbeat and
notified the application.<br>
<br>
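Roughly what I have in mind, sketched in Python. Everything named here
is an assumption for illustration: the class, the 10-second heartbeat
period, the three-miss threshold, and especially
"Reconnected.indication", which is just a placeholder for the
"something new" mentioned above and is not in the specification.<br>
<pre>
```python
class RegistrarMonitor:
    """Tracks registrar liveness as seen by one node (illustrative only)."""

    def __init__(self, heartbeat_period=10.0, max_missed=3):
        self.period = heartbeat_period      # assumed heartbeat interval, seconds
        self.max_missed = max_missed        # assumed number of misses before a fault
        self.last_heard = 0.0
        self.registrar_alive = True
        self.indications = []               # what the application would be told

    def on_registrar_heartbeat(self, now):
        self.last_heard = now

    def on_timer(self, now):
        missed = int((now - self.last_heard) // self.period)
        # Report only the first detection; repeated timer ticks stay silent.
        if self.registrar_alive and missed >= self.max_missed:
            self.registrar_alive = False
            self.indications.append(("Fault.indication", now))

    def on_reconnect_timeout(self, now):
        # Ignored: a timed-out reconnect does not change the connection state.
        pass

    def on_reconnected(self, now):
        self.last_heard = now
        if not self.registrar_alive:
            self.registrar_alive = True
            self.indications.append(("Reconnected.indication", now))
```
</pre>
Note that with these numbers the fault is reported about 30 seconds
after the last heartbeat was heard, which is exactly the lag described
above.<br>
<br>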
I like the idea of giving the application better information, but I
don't know how really substantial the benefit would be. Because we're
trying to get a Blue Book out relatively soon, so missions can have
something stable to work from, my inclination is to defer making
significant changes until after we've got some operational experience
and then incorporate the lessons learned into a second version.
<blockquote
cite="mid:FEEF8FE8A931BE478D2538A677C48BED0334AC03@aplesjustice.dom1.jhuapl.edu"
type="cite">
<div><font face="Arial" size="2"><span class="827071019-10042008">The
reference field of a 'you_are_dead' message is noted as an echo,
however technically there is no originating message when the message is
transmitted in response to a heartbeat timeout. I'm assuming the
target's node ID as the reference value in this case (although the field
is ignored either way by the node).</span></font></div>
</blockquote>
Actually the node shouldn't send you_are_dead other than in response to
a message (either reconnect or heartbeat); a heartbeat timeout just
causes an "imputed termination". This has various effects depending on
what terminated, but it doesn't produce you_are_dead in any of those
cases.<br>
<br>
You're right, though, that the reference value in a you_are_dead is
only meaningful when the message it's responding to is reconnect, in
which case it's what enables the (nominally blocking) reconnect query
to be unblocked and terminated. A you_are_dead that is sent in
response to a heartbeat isn't turning off anything that's blocking; its
reference value is the heartbeat source, copied from the heartbeat
message, but that isn't especially useful to the now-officially-dead
sender of the heartbeat.<br>
<br>
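In sketch form (Python, with dictionary messages and field names that
are invented for the example, not taken from the MAMS message
definitions), the three cases look something like this. The
"live_senders" set is also an assumption standing in for whatever state
the receiver keeps about who it still considers alive.<br>
<pre>
```python
def respond(message, live_senders):
    """Return the reply to send for one incoming message, or None for no reply."""
    if message["sender"] in live_senders:
        return None  # normal processing elided in this sketch
    if message["type"] == "reconnect":
        # Echo the reconnect's own reference so the blocked query can be released.
        return {"type": "you_are_dead", "reference": message["reference"]}
    if message["type"] == "heartbeat":
        # Nothing is blocked here; the reference is just the heartbeat source.
        return {"type": "you_are_dead", "reference": message["sender"]}
    return None


def on_heartbeat_timeout(sender, live_senders):
    """Imputed termination: forget the sender and send nothing."""
    live_senders.discard(sender)
    return None
```
</pre>
The timeout path never builds a you_are_dead at all, so there is no
reference value to invent for it.<br>
<br>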
Scott<br>
<br>
</body>
</html>