[Sis-ams] Predictability concerns with AMS, and other questions too
Scott Burleigh
Scott.Burleigh at jpl.nasa.gov
Tue Jan 24 13:23:29 EST 2006
Marek Prochazka wrote:
>Hello Scott,
>
>I've finished reading the AMS white book, liked some of its features and
>have some questions or notes. First of all, I have to say I'm a newbie
>so maybe I missed some past discussions on the same topic.
>
Hi, Marek. Thanks for taking the time to read through the book and
develop some good questions. Some answers in-line below.
>Here are my notes:
>
>1) Predictability issues. There are a number of places where I'm
>concerned about predictability issues. Most of them are related to
>"immediate propagation" of certain information to "all" nodes,
>registrars or configuration servers. I understand that in most cases the
>selected protocol follows the general publish/subscribe communication
>model of AMS. But the AMS white book makes me feel that it all is so
>dynamic and so costly that it can hardly be used in hard real-time
>applications.
>
>Maybe the answer is as follows: An HRT application should avoid dynamic
>changes such as subscriptions, terminations, etc., and thus avoid
>time-costly and unpredictable operations happening at runtime. An HRT
>application should set up the message space(s) in an initialization phase
>and then only use regular messaging.
>
>I have two notes on this: First, if what I have written is how AMS is
>meant to be used by HRT applications, then perhaps there could be a
>section explaining which parts are better done in the
>initialization phase rather than during the "main computation" phase.
>Second, some fault handling might happen during the computation
>phase anyway, and hence the application schedulability analysis must
>take it into account.
>
>
This is an important topic. The AMS design isn't principally aimed at
hard real-time applications; the main intent is to reduce the cost of
developing and operating distributed systems over networks, including
the future interplanetary internet when and if we get it built, and you
normally don't expect hard real-time performance over Ethernet, for
example. That said, in JPL's Flight Systems Testbed we successfully
used Tramel, the lineal antecedent of AMS, to convey data among the
threads of a real-time attitude control system; the control laws were
able to function without much difficulty.
As you say, it's all a question of exactly how you use the system: once
the communication configuration of the real-time elements of the message
space has stabilized (other bits of configuration can continue to change
without noticeable effect) - and provided your real-time nodes are using
a real-time-suitable transport system (such as message queues)
underneath AMS - I believe you can get bounded maximum latency in AMS
message exchange among those nodes. This remains to be demonstrated, of
course, and a lot does depend on careful implementation, but my
experience with Tramel makes me hopeful.
I think the explanatory section you propose is a great idea, but I think
it belongs in an AMS Green Book (yet to be written) rather than in the
specification itself, as it is informative and advisory rather than
normative. And certainly nothing about the design of AMS obviates the
need for schedulability analysis in any case.
>Here are parts of the protocol which make it (in my opinion) highly
>unpredictable with respect to response time:
>- Registrar registration (Section 2.3.2, also 2.3.3, 3.1.5.4, 3.1.6.4,
>4.2.3, etc.): After each configuration change (subscription, invitation,
>termination, etc.), the registrar propagates it immediately to all nodes
>and all other zones. Also, given Node registration (4.2.5), it seems
>that all registrars have information on all nodes in all zones and the
>propagation is always performed immediately. Is it necessary? Isn't it
>inefficient and unpredictable if you consider limited bandwidth and a
>number of messages being sent at the same moment?
>
>
Registrars aren't required to retain information on the nodes in remote
zones; they receive it, and they are required to pass it on to all the
nodes in their own zone. Of course nothing prevents an implementation
of registrar functionality from retaining all this information, but no
required registrar functions depend on it (the registrar is not a
message broker). In the JPL implementation, registrars know nothing
about other zones' nodes.
Each node, on the other hand, is required to know about all other nodes
in the message space. This tends to increase nodes' memory
requirements, but it makes it possible for all AMS message traffic to be
exchanged directly between nodes rather than through a message broker;
this reduces bandwidth consumption (the number of messages is cut in
half) and increases robustness (there is no single point of failure).
The trade-off here is between increasing the number of configuration
messages (propagating configuration information to the nodes) versus
doubling the number of application messages (which is necessary if you
retain configuration information only at message brokers). On the
assumption that application message traffic will normally be vastly
heavier than configuration traffic, this seems like the right design
approach.
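To make that trade-off concrete, here's a back-of-envelope sketch in Python; all the numbers are illustrative assumptions, not figures from the spec:

```python
# Comparing total message hops under the two designs: with a central
# broker every application message is sent twice (publisher -> broker,
# broker -> subscriber); with direct exchange each is sent once, at the
# cost of propagating every configuration change to every node.

def broker_messages(app_msgs, config_changes):
    # Application messages make two hops; config info stays at the broker.
    return 2 * app_msgs

def direct_messages(app_msgs, config_changes, num_nodes):
    # Application messages make one hop; each configuration change
    # must reach every node in the message space.
    return app_msgs + config_changes * num_nodes

app_msgs = 1_000_000     # assumed application traffic
config_changes = 100     # assumed subscriptions/invitations/etc.
num_nodes = 50           # assumed message-space size

print(broker_messages(app_msgs, config_changes))             # 2000000
print(direct_messages(app_msgs, config_changes, num_nodes))  # 1005000
```

So long as application traffic dwarfs configuration traffic, the direct-exchange design comes out well ahead.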
>- Configuration service fail-over (2.3.6): The registrar cycles through
>all well-known network locations to find a new config server. In the
>meantime, as the registrar is not sending heartbeats, all nodes start
>cycling too. That might imply a huge number of messages being sent at
>the same moment and predictability of such fail-over being very poor.
>
>
I think this merits some real quantitative analysis; I don't believe the
actual traffic load would be particularly substantial, as these messages
are quite short and aren't issued frequently. But I certainly agree
that the predictability of fail-over will not be good. The point of the
fail-over design isn't preservation of real-time performance (which
would be unaffected by failure of a configuration server anyway, since
the real-time application messages are exchanged directly between nodes)
but the overall robustness and survivability of the distributed
application. When a configuration server fails, you can either try to
recover automatically (the current fail-over design) or do it manually;
in neither case is the moment of recovery very predictable.
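For what it's worth, here is the kind of quantitative analysis I mean, as a rough sketch; the node counts and probe interval are purely assumed, not from the spec:

```python
# Rough traffic estimate during configuration-server fail-over, when
# registrars and nodes are all cycling through the well-known list of
# configuration server locations.

def failover_probe_rate(num_nodes, num_registrars, probe_interval_s):
    """Probes per second while every node and registrar sends one
    short probe message per probe interval."""
    return (num_nodes + num_registrars) / probe_interval_s

# e.g. 50 nodes and 3 registrars, each probing one candidate
# location every 5 seconds:
print(failover_probe_rate(50, 3, 5.0))  # 10.6 probes/second
```

Given that each probe is a very short message, a load on this order seems quite manageable, though the exact figures obviously depend on the deployment.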
>2) Priority of a message: It is mentioned a number of times, but there is
>no clear statement on how the priority is used, what the dispatch and
>delivery mechanism is for messages with the same and different
>priorities, what "higher urgency" exactly means for the protocol, and how
>the AMS entities participate in this.
>
>
Good point, there should be some clarifying language somewhere.
Priority and flow label are both merely passed through to the underlying
transport layer adapter, to be used (or not) as makes sense for that
protocol; the AMS protocol itself doesn't use them at all. The JPL
implementation does use priority to order arriving messages in the queue
of messages awaiting delivery to the application, but this is strictly
an implementation choice; interoperability is not affected.
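As an illustration of that implementation choice, here's a minimal sketch of a priority-ordered delivery queue; the "lower number = more urgent" convention and all the names are my assumptions, not anything in the spec:

```python
import heapq
import itertools

class DeliveryQueue:
    """Orders arriving messages by priority before delivery to the
    application; ties are broken by arrival order, so delivery is
    FIFO within a priority level. Lower number = more urgent."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # arrival counter for tie-breaking

    def enqueue(self, priority, msg):
        heapq.heappush(self._heap, (priority, next(self._seq), msg))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

q = DeliveryQueue()
q.enqueue(2, "telemetry")
q.enqueue(0, "alarm")
q.enqueue(2, "housekeeping")
print(q.dequeue())  # alarm
print(q.dequeue())  # telemetry
print(q.dequeue())  # housekeeping
```

Again, nothing here affects interoperability; it's purely local behavior on the receiving side.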
>3) Hardcoded intervals. The document says that AMS is for communication
>both between modules of a ground system and a flight system, as well as
>between modules located on different spacecraft or between a ground
>system and a spacecraft system. I wonder whether some of the
>following hardcoded numbers are right for all those cases:
>- 20 seconds between heartbeats for registrar <-> node, 3 missing
>successive heartbeats imply a failure,
>- 10 seconds between heartbeats for registrar <-> config server, 3
>missing successive heartbeats imply a failure,
>- Configuration server location (4.2.2): config_msg_ack should be
>received within 5 seconds, otherwise Fault.indication is sent (is 5
>seconds enough for e.g. space-to-ground communication? - Maybe you want
>to have some suggestions for distribution of AMS entities, e.g. one
>config server per spacecraft and one for ground -> communication within
>5 seconds makes more sense?)
>- Registrar location (4.2.4): The same as previous
>
>
These intervals could be configuration options rather than fixed values,
but in my experience that introduces a lot of operational and
implementation complexity for little if any benefit. Certainly wide
variations in signal propagation delay could make the fixed values in
the spec less than useful, but I would argue that in this case you
should partition your system into multiple closed continua and use
remote AMS for message exchange across the long-delay links; that's
really what RAMS was designed for.
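Note that the fixed heartbeat rule amounts to a very small failure detector; a sketch with the interval and miss threshold as parameters, which is exactly what making them configurable would mean:

```python
# Missed-heartbeat failure rule with the spec's fixed values as
# defaults (20 s interval, 3 successive misses = failure). Parameter
# names are illustrative.

def is_failed(last_heartbeat_time, now, interval_s=20, max_missed=3):
    """True once max_missed successive heartbeat intervals have
    elapsed with no heartbeat received."""
    return (now - last_heartbeat_time) > interval_s * max_missed

print(is_failed(0, 59))  # False: still within the 60 s window
print(is_failed(0, 61))  # True: three intervals missed
```

Within a single low-delay continuum the defaults are fine; across long-delay links no choice of parameters really rescues this style of liveness check, which is the RAMS argument again.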
>4) Configuration service fail-over (2.3.6, 4.2.1): After a new config
>server starts, it sends an I_am_running message, and if it receives such
>a message it immediately terminates. This can't work, as a scenario with
>mutually terminating config servers is likely to happen. I think that a
>kind of timestamp ordering or perhaps another negotiation protocol has
>to be added to reason about "who was the first" and avoid unnecessary
>termination of config servers.
>
>
No, it should work fine, because each configuration server sends
I_am_running only to other configuration servers that rank lower than
itself in the well-known list of configuration server network locations
(see 4.2.1); no configuration server will ever receive I_am_running from
any other configuration server to which it sent such a message.
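A small sketch of why the ranking rule prevents mutual termination; the server locations are hypothetical, and index 0 is the highest-ranked entry in the well-known list:

```python
# I_am_running flows strictly from higher-ranked servers to lower-ranked
# ones, so no pair of servers can ever terminate each other.

def i_am_running_targets(my_index, well_known_list):
    """Indices of the servers that should receive I_am_running from the
    server at my_index: only those ranked lower (later in the list)."""
    return [i for i in range(len(well_known_list)) if i > my_index]

servers = ["cs-a", "cs-b", "cs-c"]  # hypothetical locations
print(i_am_running_targets(0, servers))  # [1, 2]
print(i_am_running_targets(2, servers))  # []

# There is no pair (i, j) where i targets j and j also targets i:
for i in range(len(servers)):
    for j in i_am_running_targets(i, servers):
        assert i not in i_am_running_targets(j, servers)
```

The well-known list acts as a total order, so "who was the first" never needs to be negotiated at runtime.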
>5) Subject catalog (2.3.9): Last two paragraphs: I'd remove the
>suggestion about potentially sparse large arrays and keeping the subject
>numbers small. There are a number of application-dependent solutions for
>this, such as a fixed number of subjects, a hash function, etc.
>
>
I suppose you're right, this is the sort of implementation hint that
really belongs in a Green Book. But I think it's correct nonetheless: I
would suggest that every one of the alternative solutions you allude to
is either less general or more time-consuming than simply using subject
number as an index into an array.
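For concreteness, here is what the array-indexed catalog looks like; the names and the bound on subject numbers are illustrative assumptions:

```python
# Array-indexed subject catalog: the subject number is used directly as
# an index, trading a possibly sparse array for constant-time lookup
# with no hashing or searching.

MAX_SUBJECT = 1024  # assumed upper bound on subject numbers

catalog = [None] * (MAX_SUBJECT + 1)

def add_subject(number, name):
    catalog[number] = name

def lookup(number):
    return catalog[number]  # O(1), no collisions to resolve

add_subject(7, "attitude_estimate")
print(lookup(7))  # attitude_estimate
```

A hash table or a fixed subject set works too, but each is either slower in the worst case or less general than the plain array.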
>6) Node registration (4.2.5): Very strict, no option to detect why
>registration was rejected and try to re-register. Neither
>Register.request nor register.indication mentions how the node is
>eventually informed about the reason for rejection (maybe this is the
>code in 5.1.5.16?).
>
>
The bullets at the top of page 41 say that the reason for rejection is
noted in the rejection MPDU; conveying this information to the
application is an implementation matter, which doesn't affect
interoperability.
>7) Node registration (4.2.5): Forwarding an I_am_starting message by a
>registrar: some kind of "marking" has to be done to avoid forwarding a
>message which was previously forwarded by another registrar (looping of
>messages). The same applies to other similar forwarding (e.g.
>2.3.10 - Remote AMS message exchange).
>
>
I don't understand how this would happen: I don't think there's any
clause in the spec that talks about a registrar forwarding a MAMS
message to another registrar, except when the source of that MAMS
message is a node in its own zone (i.e., NOT a registrar). So there
can't be any looping of messages through registrars.
>8) Heartbeats (4.2.7): After it receives "reconnect" from a node, a
>registrar should return you_are_dead if it has been operating for more
>than 60 seconds. This is not right. First, imagine that a registrar
>crashes and a new one is started within x seconds. A node connected
>to the original registrar will need up to 60 seconds to notice that the
>registrar is dead, so it can't contact the new registrar sooner
>than 60 seconds after the old one crashed (in the worst case). That
>means that effectively the node has in the worst case x seconds to i)
>locate the new registrar and ii) contact it.
>
I think we're okay here. Suppose the registrar crashes at time T and
the new replacement registrar starts at time T+x. By time T+60 every
node in the zone will have noticed the death of the original registrar
and will have started asking the configuration server where the new one
is. All of the nodes will be querying the configuration server and
trying to reconnect, every 20 seconds, so by T+x+20 every node in the
zone will have learned about the new location of the registrar and will
have sent a reconnect message to it. Since the registrar doesn't shut
off reconnects until T+x+60, there's no problem, no matter what the
value of x is.
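The timeline argument can be checked mechanically. An idealized sketch, with transmission delays ignored and times in seconds relative to the crash at T = 0; the reconnect bound is slightly more conservative than the T+x+20 above, to cover small x as well:

```python
# Registrar dies at t = 0, replacement starts at t = x.

def worst_case_reconnect(x):
    """Latest time at which the last node's reconnect arrives. Every
    node has noticed the registrar's death by t = 60 and is then
    querying the configuration server every 20 s; the replacement
    registrar exists from t = x; so the last reconnect lands within
    one 20 s retry of whichever of those two events is later."""
    return max(60, x + 20)

def reconnect_deadline(x):
    """The replacement registrar keeps honoring reconnects until
    60 s after it starts."""
    return x + 60

# The worst-case reconnect always beats the deadline, whatever x is:
for x in (0, 5, 50, 100, 1000):
    assert worst_case_reconnect(x) <= reconnect_deadline(x)
```

Since max(60, x+20) <= x+60 for every non-negative x, the 60-second cutoff never strands a node.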
>Second, the communication between node and registrar could be
>spacecraft-to-spacecraft or even ground-to-spacecraft (unless you
>specify or suggest otherwise, as I suggested above), and hence 60 seconds
>for a message round trip is completely unrealistic.
>
>
Again, for communication over long-delay links it's important to deploy
multiple continua and use Remote AMS; ordinary AMS and MAMS functioning
just doesn't make sense over that sort of distance, and there's no need
for it to.
>9) Minor typographic issues: in 4.2.8 you reference 4.2.10, 4.2.11,
>4.2.12 and 4.2.13 as "above".
>
>
Good catch, I'll fix that. Thanks, Marek.
Scott