<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.2900.3268" name=GENERATOR></HEAD>
<BODY text=#000000 bgColor=#ffffff>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=199561720-05052008>Scott (&amp; Pat),</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=199561720-05052008></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=199561720-05052008>This is one case where our APL-AMS is a bit
non-standard, in that we are using POSIX message queues as our primary/only
transport service. Using this method, we have no data from the transport
service to indicate a message's origin. </SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=199561720-05052008></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=199561720-05052008>An alternative may be to designate the unused
supplementary data in these query messages as "reserved for optional transport
service extensions," indicating that the PTS may optionally include additional
routing/configuration information (i.e., a MAMS endpoint) if required. The
field would otherwise be ignored by other transport services, which would
accordingly set the supplementary data length to 0 when populating these messages.
This would free the protocol from additional limitations on the transport
service, without placing any unnecessary burden on the standard transport
services.</SPAN></FONT></DIV>
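<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2>A rough sketch of
how such an optional field could behave, under loud assumptions: the 3-byte
header, the type code, the queue name, and the helpers pack_query and
unpack_query are all made up for illustration and are not the MAMS wire format.
It only shows that a zero supplementary data length costs nothing for transports
that already know the origin.</FONT></DIV>
<PRE>
```python
import struct

# Hypothetical header for illustration only, NOT the MAMS PDU layout:
# 1-byte message type, 2-byte supplementary data length (network byte order).
HEADER = struct.Struct("!BH")
REGISTRAR_QUERY = 1  # placeholder type code, not taken from the spec


def pack_query(msg_type, reply_endpoint=b""):
    """Build a query; transports that already know the origin pass nothing."""
    return HEADER.pack(msg_type, len(reply_endpoint)) + reply_endpoint


def unpack_query(data):
    """Return (msg_type, reply_endpoint); endpoint is None when length is 0."""
    msg_type, supp_len = HEADER.unpack_from(data)
    supplement = data[HEADER.size:HEADER.size + supp_len]
    if len(supplement) != supp_len:
        raise ValueError("truncated supplementary data")
    return msg_type, (supplement or None)


# A message-queue transport names the queue it listens on for the reply
# (the queue name here is invented).
with_endpoint = pack_query(REGISTRAR_QUERY, b"/apl_ams_node_7")
# A UDP-style transport leaves the field empty (length 0).
without_endpoint = pack_query(REGISTRAR_QUERY)
```
</PRE>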
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=199561720-05052008></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=199561720-05052008></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=199561720-05052008>The registrar transmits the you_are_dead message in
response to a node's heartbeat timeout, which is what I was referring to.
Presumably the echo should be the terminated node's ID, but that is not
explicitly stated. This is only marginally useful, but more so than
populating the field with a garbage/0 value.</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=199561720-05052008></SPAN></FONT> </DIV>
<BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px">
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=199561720-05052008>From that point of view, what would probably be
more useful would be to deliver a Fault.indication when the absence of the
registrar is first noticed, and then deliver some other sort of indication --
something new -- when the reconnection succeeds, ignoring the timeouts (since
they don't change the registrar connection state). But that isn't quite
perfect either, since the registrar could actually have been dead for 30
seconds before AMS detected the third missing heartbeat and notified the
application.</SPAN></FONT></DIV></BLOCKQUOTE>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=199561720-05052008>That level of feedback could be a useful addition, but
I agree that it's not necessary for the current version. During the
reconnection/registration period, any subscription attempts should return an
error message, so the purpose of the added fault may be more to support external
error reporting and/or recovery in an application-specific manner if a fault
condition repeats. </SPAN></FONT></DIV>
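<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2>To make the
behavior under discussion concrete, here is a minimal sketch, not taken from
either implementation. The class RegistrarLink, its method names, and the event
strings are hypothetical; the limit of three missed heartbeats comes from the
quoted text. It reports the registrar's absence once, reports a successful
reconnection once, and rejects subscription attempts in between.</FONT></DIV>
<PRE>
```python
class RegistrarLink:
    """Tracks registrar liveness as seen by one node (illustrative only)."""

    MISSED_LIMIT = 3  # "third missing heartbeat" in the quoted text

    def __init__(self, notify):
        self.notify = notify  # application callback (hypothetical)
        self.missed = 0
        self.connected = True

    def heartbeat_received(self):
        self.missed = 0

    def heartbeat_missed(self):
        self.missed += 1
        # Report the fault once, when the absence is first noticed.
        if self.connected and self.missed >= self.MISSED_LIMIT:
            self.connected = False
            self.notify("registrar_absent")

    def reconnect_succeeded(self):
        self.missed = 0
        if not self.connected:
            self.connected = True
            self.notify("registrar_reconnected")

    def subscribe(self, subject):
        # New subscriptions cannot take effect without a registrar.
        if not self.connected:
            raise RuntimeError("registrar unavailable; subscription not posted")
        return subject


events = []
link = RegistrarLink(events.append)
for _ in range(4):
    link.heartbeat_missed()
```
</PRE>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2>Repeated timeouts
after the first report are ignored here because they do not change the
connection state.</FONT></DIV>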
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=199561720-05052008></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=199561720-05052008>For now, generating the Fault.indication on
registration timeout and, optionally, on reconnection timeout seems a reasonable
approach, though perhaps one best left as an implementation decision. The
parameter given to the Fault.indication call (defined as implementation-specific)
may specify the nature of the error condition. In my implementation this is
currently the enum value AMS_FAULT_TIMEOUT, which could be subdivided into
registration and reconnection timeouts. </SPAN></FONT></DIV>
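<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2>One way the
subdivision might look, sketched in Python rather than as the actual APL-AMS
code. Only AMS_FAULT_TIMEOUT is a name from my implementation; the two
finer-grained members and the helper timeout_fault are hypothetical.</FONT></DIV>
<PRE>
```python
from enum import Enum


class AmsFault(Enum):
    # Only AMS_FAULT_TIMEOUT is mentioned in the message above; the two
    # finer-grained members are hypothetical names for the proposed split.
    AMS_FAULT_TIMEOUT = 1
    AMS_FAULT_REGISTRATION_TIMEOUT = 2
    AMS_FAULT_RECONNECT_TIMEOUT = 3


def timeout_fault(reconnecting, subdivide=True):
    """Pick the fault code to pass to the application's fault handler."""
    if not subdivide:
        return AmsFault.AMS_FAULT_TIMEOUT
    if reconnecting:
        return AmsFault.AMS_FAULT_RECONNECT_TIMEOUT
    return AmsFault.AMS_FAULT_REGISTRATION_TIMEOUT
```
</PRE>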
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=199561720-05052008></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=199561720-05052008></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=199561720-05052008>- David</SPAN></FONT></DIV><BR>
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> Donahue, Pat [mailto:pat.donahue@nasa.gov]
<BR><B>Sent:</B> Tuesday, April 29, 2008 12:16 PM<BR><B>To:</B> Scott Burleigh;
Edell, David J.<BR><B>Subject:</B> RE: [Sis-ams] Re: AMS<BR></FONT><BR></DIV>
<DIV></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=308190916-29042008>David,</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=308190916-29042008></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=308190916-29042008>My UDP implementation does as Scott suggests.
When I get the REGISTRAR_QUERY, I can read from the Packet what the
remoteAddress and remotePort are. I just return the CELL_SPEC to that
remoteAddress/remotePort.</SPAN></FONT></DIV>
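<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2>A small loopback
sketch of that pattern, written in Python for brevity and not taken from the
implementation described above. The payloads are placeholder byte strings
rather than real MAMS messages, and answer_one_query is a hypothetical helper.
The point is only that the datagram's source address, as reported by the socket
layer, is enough to route the reply.</FONT></DIV>
<PRE>
```python
import socket

# Placeholder payloads; real MAMS messages are binary PDUs.
REGISTRAR_QUERY = b"registrar_query"
CELL_SPEC = b"cell_spec"


def answer_one_query(server):
    """Read one datagram and reply to whatever address it came from."""
    data, origin = server.recvfrom(2048)  # origin is (host, port) from UDP
    if data == REGISTRAR_QUERY:
        server.sendto(CELL_SPEC, origin)
    return origin


# Loopback demonstration with OS-assigned ports.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
server.settimeout(2.0)
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.bind(("127.0.0.1", 0))
client.settimeout(2.0)
client_addr = client.getsockname()

client.sendto(REGISTRAR_QUERY, server.getsockname())
origin = answer_one_query(server)
reply, _ = client.recvfrom(2048)
server.close()
client.close()
```
</PRE>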
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=308190916-29042008></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=308190916-29042008>Pat</SPAN></FONT></DIV>
<DIV> </DIV><!-- Converted from text/plain format -->
<P><FONT size=2>Patrick Donahue</FONT> </P>
<DIV><FONT face=Arial color=#0000ff size=2></FONT> </DIV><FONT
size=2></FONT><BR>
<BLOCKQUOTE
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> sis-ams-bounces@mailman.ccsds.org
[mailto:sis-ams-bounces@mailman.ccsds.org] <B>On Behalf Of </B>Scott
Burleigh<BR><B>Sent:</B> Tuesday, April 29, 2008 11:01 AM<BR><B>To:</B> Edell,
David J.<BR><B>Cc:</B> sis-ams@mailman.ccsds.org<BR><B>Subject:</B> [Sis-ams]
Re: AMS<BR></FONT><BR></DIV>
<DIV></DIV>Edell, David J. wrote:
<BLOCKQUOTE
cite=mid:FEEF8FE8A931BE478D2538A677C48BED0334AC03@aplesjustice.dom1.jhuapl.edu
type="cite">
<DIV><FONT face=Arial size=2><SPAN
class=827071019-10042008>Scott,</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=827071019-10042008></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=827071019-10042008>I'm finishing up
our AMS implementation and assembled a few more miscellaneous
comments/questions that I have come across. Funding on the
project is about at an end though (and I'm leaving on vacation for CA
tonight), therefore our first version of AMS can be considered
completed for now. The APL-AMS implementation is largely, with the
exception of invitations and related functions, compliant with conformance
class 4 as described. </SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=827071019-10042008></SPAN></FONT><FONT face=Arial size=2><SPAN
class=827071019-10042008></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=827071019-10042008>I was taking
another look at the configuration server interrogation process. My
current implementation utilizes a shared memory location for announcing the
registrar's location, however I'm looking into an alternative MAMS-based
method to avoid some of the synchronization and related issues with the
shared memory. </SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=827071019-10042008></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=827071019-10042008>If I modify the
process to use the MAMS registrar_query, and await a
cell_spec/registrar_unknown response though, how would the CS know where to
direct the response as defined? It seems that the registrar_query
message would require a MAMS endpoint as its supplemental data, indicating
the origin of, and return destination for, the given
message.</SPAN></FONT></DIV></BLOCKQUOTE>
<DIV>(David, I hope you don't mind, but I think the points you raise here are
important enough to bring forward to the whole working group, so I'm cc:ing
sis-ams on my reply.)<BR><BR>That's one of the big reasons TCP isn't a really
good choice as a primary transport protocol (which is what is used for MAMS
traffic): it's a bootstrapping problem. If you use UDP or some other
non-connection-oriented protocol, the transport protocol itself will give
you the origin of the registrar_query; that "echo" location (host/port, for
example) can be opaquely passed up to the registrar_query handler and then
passed back down when the cell_spec or registrar_unknown is
returned.<BR><BR>You're right, though, that including the response MAMS
endpoint in every MAMS message that is of a query nature is a reasonable
alternative: it removes one constraint on what makes a good primary transport
protocol, at the cost of a little extra transmission. I dunno. My
inclination is not to change something that currently seems to work okay
unless there's a really compelling reason, and I'm not sure we've got one yet.
Anybody have any thoughts on this?</DIV>
<BLOCKQUOTE
cite=mid:FEEF8FE8A931BE478D2538A677C48BED0334AC03@aplesjustice.dom1.jhuapl.edu
type="cite">
<DIV><FONT face=Arial size=2><SPAN class=827071019-10042008>It's specified
that a Fault.Indication is generated if no response is received from a
node_registration request within N2 seconds. Shouldn't there be a
similar condition for a reconnect message? I've implemented the same
timeout for both states, with the node generating a fault.indication and
then automatically resending the registration or reconnect message as
appropriate.</SPAN></FONT></DIV></BLOCKQUOTE>I think I see your point.
The node_registration Fault.indication gives the application an opportunity to
decide whether it wants to keep trying or bail. Once the node is
connected, it is likely engaged in ongoing application message exchange with
its peers; if the registrar dies while this is going on, the node probably
doesn't want to interrupt its application message exchange activity just
because there's no registrar, so my inclination has been not to bother the
node with reconnection status. On the other hand, it might be helpful to
tell the node that any new subscriptions it posts aren't going to have any
effect because the registrar is dead.<BR><BR>From that point of view, what
would probably be more useful would be to deliver a Fault.indication when the
absence of the registrar is first noticed, and then deliver some other sort of
indication -- something new -- when the reconnection succeeds, ignoring the
timeouts (since they don't change the registrar connection state). But
that isn't quite perfect either, since the registrar could actually have been
dead for 30 seconds before AMS detected the third missing heartbeat and
notified the application.<BR><BR>I like the idea of giving the application
better information, but I don't know how really substantial the benefit would
be. Because we're trying to get a Blue Book out relatively soon, so
missions can have something stable to work from, my inclination is to defer
making significant changes until after we've got some operational experience
and then incorporate the lessons learned into a second version.
<BLOCKQUOTE
cite=mid:FEEF8FE8A931BE478D2538A677C48BED0334AC03@aplesjustice.dom1.jhuapl.edu
type="cite">
<DIV><FONT face=Arial size=2><SPAN class=827071019-10042008>The reference
field of a 'you_are_dead' message is noted as an echo, however technically
there is no originating message when the message is transmitted in response
to a heartbeat timeout. I'm assuming the target node's ID as the
reference value in this case (although the field is ignored either way by
the node)</SPAN></FONT></DIV></BLOCKQUOTE>Actually the node shouldn't send
you_are_dead other than in response to a message (either reconnect or
heartbeat); a heartbeat timeout just causes an "imputed termination".
This has various effects depending on what terminated, but it doesn't produce
you_are_dead in any of those cases.<BR><BR>You're right, though, that the
reference value in a you_are_dead is only meaningful when the message it's
responding to is reconnect, in which case it's what enables the (nominally
blocking) reconnect query to be unblocked and terminated. A you_are_dead
that is sent in response to a heartbeat isn't turning off anything that's
blocking; its reference value is the heartbeat source, copied from the
heartbeat message, but that isn't especially useful to the now-officially-dead
sender of the heartbeat.<BR><BR>Scott<BR><BR></BLOCKQUOTE></BODY></HTML>