[Sis-ams] Predictability concerns with AMS, and other questions too

Scott Burleigh Scott.Burleigh at jpl.nasa.gov
Fri Jan 27 12:14:38 EST 2006


Marek Prochazka wrote:

>Hi Scott,
>
>thanks for your reply. Before further commenting on some of the points
>discussed, I have yet another question: How would you summarize delivery
>guarantees/fault model of AMS?
>
>I guess that
>
>1) If a node publishes a message, it doesn't have any guarantees on
>delivery (send) to destination nodes as it has no knowledge of which
>nodes are subscribed. The AMS has the knowledge.
>  
>
Actually the application can indeed have knowledge of which nodes are 
currently subscribed, because subscription indications are (optionally) 
passed up to the application by AMS whenever subscriptions are 
detected.  But the application isn't required to ask for these 
indications or to remember them; publication works the same way in any 
case.

>2) If a node announces or queries a message, or if it replies to a
>message, it doesn't have any guarantees on delivery either. Is that
>right? My feeling is that all fault notifications are related only to
>wrong authorizations for announce and publish, no "best fit" delivery
>point for send, wrong context for reply, etc.
>  
>
Right, most of the fault indications called out in the spec notify the 
application that its own use of AMS itself has failed for some reason, 
NOT that communication with other nodes has been unsuccessful.

>3) However, using heartbeats, each node could figure out which nodes in
>the same zone are down.
>  
>
Again, nodes don't need to exchange heartbeats among themselves to get 
this information: AMS knows about it anyway (because of the heartbeats 
exchanged between nodes and registrars), and the registration and 
unregistration (whether intentional or not) of nodes is (optionally) 
announced to the application by AMS whenever these events 
are detected.  And, again, the application isn't required to ask for 
these indications or to remember them, but is free to do so if it likes.
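
To illustrate the point about optional indications, here is a minimal 
sketch in Python. It is not the AMS API: the Registrar class, the 
listener callback, the event names and the max_silence_s threshold are 
all hypothetical, chosen only to show heartbeat tracking on the registrar 
side with notifications that the application may or may not ask for.

```python
from typing import Callable, Dict, Optional

# Hypothetical callback type: (event_name, node_name).
Listener = Callable[[str, str], None]


class Registrar:
    """Toy registrar that infers node liveness from heartbeats (assumed design)."""

    def __init__(self, max_silence_s: float, listener: Optional[Listener] = None):
        self._max_silence_s = max_silence_s  # assumed threshold, not from the spec
        self._listener = listener            # None means "application did not ask"
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, node: str, now: float) -> None:
        """Record a heartbeat; the first one from a node counts as registration."""
        if node not in self._last_seen:
            self._notify("registered", node)
        self._last_seen[node] = now

    def sweep(self, now: float) -> None:
        """Treat nodes that have been silent too long as unregistered."""
        for node, seen in list(self._last_seen.items()):
            if now - seen > self._max_silence_s:
                del self._last_seen[node]
                self._notify("unregistered", node)

    def _notify(self, event: str, node: str) -> None:
        # Indications are optional: nothing happens without a listener.
        if self._listener is not None:
            self._listener(event, node)
```

The registrar behaves the same way whether or not a listener is 
supplied; only the application's knowledge differs.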

>4) There are certain types of AMS faults reported to nodes.
>  
>
Yes.

>Something else related to this topic? Could you please correct me if I'm
>wrong/misunderstood the spec?
>  
>
I'm not sure I fully understand the question, but I'll try to come up with 
a useful answer anyway.

When a node publishes a message, the message is sent to N destination 
nodes where 0 <= N <= (cardinality of the message space).  Each message 
transmission utilizes a transport service that is implicitly chosen for 
that particular transmission based on the service mode specified by the 
subscriber.  The only possible guarantees on delivery and performance 
for each individual transmission are those provided by the underlying 
transport service.  If, for example, all subscribers specify service 
modes that map to TCP/IP transport service, then there's a pretty good 
guarantee of delivery to all current subscribers, but no guarantee of 
maximum latency on any of those deliveries.  If some subscribers specify 
service modes that map to UDP instead, then it's possible that the 
message won't even be delivered to all subscribers; yet if all 
subscribers specify service modes that map to TCONS, then I believe 
there's a guarantee of maximum latency in delivery to every subscriber 
[and maybe some sort of failure notification wherever timely delivery 
fails, I dunno].  But AMS itself isn't guaranteeing any of this.  All 
AMS can guarantee is to initiate transmission to all known subscribers 
at the time publication is requested, in accord with the transmission 
preferences expressed by the application.
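
The publication behavior described above can be sketched roughly as 
follows. This is an illustration only, not the AMS interface: the 
MessageSpace and Subscription names, the service mode labels and the 
transport callables are all assumptions. The point it tries to show is 
that the transport is chosen per subscriber from that subscriber's 
service mode, and that the publisher learns only how many transmissions 
were initiated, not how many were delivered.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical transport: a callable that initiates sending to a node.
# It returns nothing, so the caller learns nothing about delivery.
Transport = Callable[[str, bytes], None]


@dataclass(frozen=True)
class Subscription:
    node: str
    subject: str
    service_mode: str  # illustrative label, e.g. "reliable" or "best_effort"


class MessageSpace:
    """Toy message space that fans a publication out to current subscribers."""

    def __init__(self, transports: Dict[str, Transport]):
        self._transports = transports          # service mode -> transport
        self._subscriptions: List[Subscription] = []

    def subscribe(self, subscription: Subscription) -> None:
        self._subscriptions.append(subscription)

    def publish(self, subject: str, payload: bytes) -> int:
        """Initiate one transmission per known subscriber; return that count."""
        initiated = 0
        for sub in self._subscriptions:
            if sub.subject != subject:
                continue
            # The subscriber's service mode, not the publisher, picks the transport.
            self._transports[sub.service_mode](sub.node, payload)
            initiated += 1
        return initiated
```

A return value of 0 is not an error here: it simply means nobody was 
subscribed at the time publication was requested.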

Announcement of a message is similar to publication; the main difference 
is that the set of recipients is chosen by the announcing application 
node rather than inferred from subscriptions.

When a node issues a query, operation of the node is suspended until a 
response to the query arrives or a timeout interval expires.  If a 
response does arrive, of course, it gives the querying node a firm 
guarantee that the query reached its destination.  But again the 
message will be conveyed by a transport service implicitly selected by 
the destination node; if that service is, say, UDP, then it's entirely 
possible for the query to fail to reach its destination or for the reply 
to fail to reach the querying node - so in the absence of a response the 
querying node has no idea what really happened.  If that matters, then 
the destination node has got to be configured accordingly.
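
A rough sketch of the query behavior, again in Python and again not the 
AMS API: the query function, the send callable and the replies queue are 
hypothetical stand-ins for whatever transport the destination node has 
selected. It is meant to show why a timeout tells the querying node 
nothing about which half of the exchange failed.

```python
import queue
from typing import Callable, Optional


def query(send: Callable[[bytes], None],
          replies: "queue.Queue[bytes]",
          payload: bytes,
          timeout_s: float) -> Optional[bytes]:
    """Send a query and block until a reply arrives or the timeout expires.

    Returns the reply, or None on timeout. None does not distinguish a lost
    query from a lost reply; the caller cannot tell which one happened.
    """
    send(payload)
    try:
        return replies.get(timeout=timeout_s)
    except queue.Empty:
        return None
```

With an unreliable transport, the caller has to decide for itself what 
a None result means: retry, give up, or assume the worst.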

The key concept here is that AMS by design gives the application a great 
deal of control over - and, correspondingly, responsibility for - 
message exchange performance.  Application developers still have to give 
some thought to exactly what is important to them, because nothing comes 
for free - retransmission increases the chance of delivery but increases 
latency, real-time performance is only possible in networks of limited 
scope, bandwidth reservation can degrade network performance, and so on 
- and AMS declines to make all these design trade-offs in advance and 
lock all developers into those decisions.

Summing up: AMS is no substitute for engineering.  There's no magic here.

Scott




