[Sis-ams] Predictability concerns with AMS, and other questions too
Scott Burleigh
Scott.Burleigh at jpl.nasa.gov
Tue Jan 24 13:23:29 EST 2006
Marek Prochazka wrote:
>Hello Scott,
>
>I've finished reading the AMS white book, liked some of its features and
>have some questions or notes. First of all, I have to say I'm a newbie
>so maybe I missed some past discussions on the same topic.
>
Hi, Marek. Thanks for taking the time to read through the book and
develop some good questions. Some answers in-line below.
>Here are my notes:
>
>1) Predictability issues. There are a number of places where I'm
>concerned about predictability issues. Most of them are related to
>"immediate propagation" of certain information to "all" nodes,
>registrars or configuration servers. I understand that in most cases the
>selected protocol follows the general publish/subscribe communication
>model of AMS. But the AMS white book makes me feel that it all is so
>dynamic and so costly that it can hardly be used in hard real-time
>applications.
>
>Maybe the answer is as follows: An HRT application should avoid dynamic
>changes such as subscriptions, terminations, etc., and thus avoid
>time-costly and unpredictable operations happening at runtime. An HRT
>application should set up the message space(s) in an initialization phase
>and then only use regular messaging.
>
>I have two notes on this: First, if what I have written is how AMS is
>meant to be used by HRT applications, then perhaps there could be a
>section explaining which parts are better done in the
>initialization phase rather than during the "main computation" phase.
>Second, some fault handling might happen during the computation
>phase anyway, and hence the application schedulability analysis must
>take it into account.
>
>
This is an important topic. The AMS design isn't principally aimed at
hard real-time applications; the main intent is to reduce the cost of
developing and operating distributed systems over networks, including
the future interplanetary internet when and if we get it built, and you
normally don't expect hard real-time performance over Ethernet, for
example. That said, in JPL's Flight Systems Testbed we successfully
used Tramel, the lineal antecedent of AMS, to convey data among the
threads of a real-time attitude control system; the control laws were
able to function without much difficulty.
As you say, it's all a question of exactly how you use the system: once
the communication configuration of the real-time elements of the message
space has stabilized (other bits of configuration can continue to change
without noticeable effect) - and provided your real-time nodes are using
a real-time-suitable transport system (such as message queues)
underneath AMS - I believe you can get bounded maximum latency in AMS
message exchange among those nodes. This remains to be demonstrated, of
course, and a lot does depend on careful implementation, but my
experience with Tramel makes me hopeful.
I think the explanatory section you propose is a great idea, but I think
it belongs in an AMS Green Book (yet to be written) rather than in the
specification itself, as it is informative and advisory rather than
normative. And certainly nothing about the design of AMS obviates the
need for schedulability analysis in any case.
>Here are parts of the protocol which make it (in my opinion) highly
>unpredictable with respect to response time:
>- Registrar registration (Section 2.3.2, also 2.3.3, 3.1.5.4, 3.1.6.4,
>4.2.3, etc.): After each configuration change (subscription, invitation,
>termination, etc.), the registrar propagates it immediately to all nodes
>and all other zones. Also, given Node registration (4.2.5), it seems
>that all registrars have information on all nodes in all zones and the
>propagation is always performed immediately. Is it necessary? Isn't it
>inefficient and unpredictable if you consider limited bandwidth and a
>number of messages being sent at the same moment?
>
>
Registrars aren't required to retain information on the nodes in remote
zones; they receive it, and they are required to pass it on to all the
nodes in their own zone. Of course nothing prevents an implementation
of registrar functionality from retaining all this information, but no
required registrar functions depend on it (the registrar is not a
message broker). In the JPL implementation, registrars know nothing
about other zones' nodes.
Each node, on the other hand, is required to know about all other nodes
in the message space. This tends to increase nodes' memory
requirements, but it makes it possible for all AMS message traffic to be
exchanged directly between nodes rather than through a message broker;
this reduces bandwidth consumption (the number of messages is cut in
half) and increases robustness (there is no single point of failure).
The trade-off here is between increasing the number of configuration
messages (propagating configuration information to the nodes) versus
doubling the number of application messages (which is necessary if you
retain configuration information only at message brokers). On the
assumption that application message traffic will normally be vastly
heavier than configuration traffic, this seems like the right design
approach.
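To make that trade-off concrete, here's a back-of-envelope sketch in Python; all the numbers are illustrative assumptions, not figures from the spec:

```python
# Comparing total message hops under the two designs: with a central
# broker every application message is sent twice (publisher -> broker,
# broker -> subscriber); with direct exchange each is sent once, at the
# cost of propagating every configuration change to every node.

def broker_messages(app_msgs, config_changes):
    # Application messages make two hops; config info stays at the broker.
    return 2 * app_msgs

def direct_messages(app_msgs, config_changes, num_nodes):
    # Application messages make one hop; each configuration change
    # must reach every node in the message space.
    return app_msgs + config_changes * num_nodes

app_msgs = 1_000_000     # assumed application traffic
config_changes = 100     # assumed subscriptions/invitations/etc.
num_nodes = 50           # assumed message-space size

print(broker_messages(app_msgs, config_changes))             # 2000000
print(direct_messages(app_msgs, config_changes, num_nodes))  # 1005000
```

So long as application traffic dwarfs configuration traffic, the direct-exchange design comes out well ahead.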
>- Configuration service fail-over (2.3.6): The registrar cycles through
>all well-known network locations to find a new config server. In the
>meantime, as the registrar is not sending heartbeats, all nodes start
>cycling too. That might imply a huge number of messages being sent at
>the same moment and predictability of such fail-over being very poor.
>
>
I think this merits some real quantitative analysis; I don't believe the
actual traffic load would be particularly substantial, as these messages
are quite short and aren't issued frequently. But I certainly agree
that the predictability of fail-over will not be good. The point of the
fail-over design isn't preservation of real-time performance (which
would be unaffected by failure of a configuration server anyway, since
the real-time application messages are exchanged directly between nodes)
but the overall robustness and survivability of the distributed
application. When a configuration server fails, you can either try to
recover automatically (the current fail-over design) or do it manually;
in neither case is the moment of recovery very predictable.
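For what it's worth, here is the kind of quantitative analysis I mean, as a rough sketch; the node counts and probe interval are purely assumed, not from the spec:

```python
# Rough traffic estimate during configuration-server fail-over, when
# registrars and nodes are all cycling through the well-known list of
# configuration server locations.

def failover_probe_rate(num_nodes, num_registrars, probe_interval_s):
    """Probes per second while every node and registrar sends one
    short probe message per probe interval."""
    return (num_nodes + num_registrars) / probe_interval_s

# e.g. 50 nodes and 3 registrars, each probing one candidate
# location every 5 seconds:
print(failover_probe_rate(50, 3, 5.0))  # 10.6 probes/second
```

Given that each probe is a very short message, a load on this order seems quite manageable, though the exact figures obviously depend on the deployment.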
>2) Priority of a message: It is mentioned a number of times, but there is
>no clear statement on how the priority is used, what the dispatch and
>delivery mechanism is for messages with the same and different
>priorities, what "higher urgency" exactly means for the protocol, and how
>the AMS entities participate in this.
>
>
Good point, there should be some clarifying language somewhere.
Priority and flow label are both merely passed through to the underlying
transport layer adapter, to be used (or not) as makes sense for that
protocol; the AMS protocol itself doesn't use them at all. The JPL
implementation does use priority to order arriving messages in the queue
of messages awaiting delivery to the application, but this is strictly
an implementation choice; interoperability is not affected.
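As an illustration of that implementation choice, here's a minimal sketch of a priority-ordered delivery queue; the "lower number = more urgent" convention and all the names are my assumptions, not anything in the spec:

```python
import heapq
import itertools

class DeliveryQueue:
    """Orders arriving messages by priority before delivery to the
    application; ties are broken by arrival order, so delivery is
    FIFO within a priority level. Lower number = more urgent."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # arrival counter for tie-breaking

    def enqueue(self, priority, msg):
        heapq.heappush(self._heap, (priority, next(self._seq), msg))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

q = DeliveryQueue()
q.enqueue(2, "telemetry")
q.enqueue(0, "alarm")
q.enqueue(2, "housekeeping")
print(q.dequeue())  # alarm
print(q.dequeue())  # telemetry
print(q.dequeue())  # housekeeping
```

Again, nothing here affects interoperability; it's purely local behavior on the receiving side.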
>3) Hardcoded intervals. The document says that AMS is for communication
>both between modules of a ground system and a flight system, as well as
>between modules located on different spacecraft or between a ground
>system and a spacecraft system. I wonder whether some of the
>following hardcoded numbers are right for all those cases:
>- 20 seconds between heartbeats for registrar <-> node, 3 missing
>successive heartbeats imply a failure,
>- 10 seconds between heartbeats for registrar <-> config server, 3
>missing successive heartbeats imply a failure,
>- Configuration server location (4.2.2): config_msg_ack should be
>received within 5 seconds, otherwise Fault.indication is sent (is 5
>seconds enough for e.g. space-to-ground communication? - Maybe you want
>to have some suggestions for distribution of AMS entities, e.g. one
>config server per spacecraft and one for ground -> communication within
>5 seconds makes more sense?)
>- Registrar location (4.2.4): The same as previous
>
>
These intervals could be configuration options rather than fixed values,
but in my experience that introduces a lot of operational and
implementation complexity for little if any benefit. Certainly wide
variations in signal propagation delay could make the fixed values in
the spec less than useful, but I would argue that in this case you
should partition your system into multiple closed continua and use
remote AMS for message exchange across the long-delay links; that's
really what RAMS was designed for.
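Note that the fixed heartbeat rule amounts to a very small failure detector; a sketch with the interval and miss threshold as parameters, which is exactly what making them configurable would mean:

```python
# Missed-heartbeat failure rule with the spec's fixed values as
# defaults (20 s interval, 3 successive misses = failure). Parameter
# names are illustrative.

def is_failed(last_heartbeat_time, now, interval_s=20, max_missed=3):
    """True once max_missed successive heartbeat intervals have
    elapsed with no heartbeat received."""
    return (now - last_heartbeat_time) > interval_s * max_missed

print(is_failed(0, 59))  # False: still within the 60 s window
print(is_failed(0, 61))  # True: three intervals missed
```

Within a single low-delay continuum the defaults are fine; across long-delay links no choice of parameters really rescues this style of liveness check, which is the RAMS argument again.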
>4) Configuration service fail-over (2.3.6, 4.2.1): After a new config
>server starts, it sends an I_am_running message, and if it receives such
>a message it immediately terminates. This can't work, as a scenario with
>mutually terminating config servers is likely to happen. I think that a
>kind of timestamp ordering or perhaps another negotiation protocol has
>to be added to reason about "who was the first" and avoid unnecessary
>termination of config servers.
>
>
No, it should work fine, because each configuration server sends
I_am_running only to other configuration servers that rank lower than
itself in the well-known list of configuration server network locations
(see 4.2.1); no configuration server will ever receive I_am_running from
any other configuration server to which it sent such a message.
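A small sketch of why the ranking rule prevents mutual termination; the server locations are hypothetical, and index 0 is the highest-ranked entry in the well-known list:

```python
# I_am_running flows strictly from higher-ranked servers to lower-ranked
# ones, so no pair of servers can ever terminate each other.

def i_am_running_targets(my_index, well_known_list):
    """Indices of the servers that should receive I_am_running from the
    server at my_index: only those ranked lower (later in the list)."""
    return [i for i in range(len(well_known_list)) if i > my_index]

servers = ["cs-a", "cs-b", "cs-c"]  # hypothetical locations
print(i_am_running_targets(0, servers))  # [1, 2]
print(i_am_running_targets(2, servers))  # []

# There is no pair (i, j) where i targets j and j also targets i:
for i in range(len(servers)):
    for j in i_am_running_targets(i, servers):
        assert i not in i_am_running_targets(j, servers)
```

The well-known list acts as a total order, so "who was the first" never needs to be negotiated at runtime.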
>5) Subject catalog (2.3.9): Last two paragraphs: I'd remove the
>suggestion about potentially sparse large arrays and keeping the subject
>numbers small. There are a number of application-dependent solutions for
>this, such as a fixed number of subjects, a hash function, etc.
>
>
I suppose you're right, this is the sort of implementation hint that
really belongs in a Green Book. But I think it's correct nonetheless: I
would suggest that every one of the alternative solutions you allude to
is either less general or more time-consuming than simply using subject
number as an index into an array.
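For concreteness, here is what the array-indexed catalog looks like; the names and the bound on subject numbers are illustrative assumptions:

```python
# Array-indexed subject catalog: the subject number is used directly as
# an index, trading a possibly sparse array for constant-time lookup
# with no hashing or searching.

MAX_SUBJECT = 1024  # assumed upper bound on subject numbers

catalog = [None] * (MAX_SUBJECT + 1)

def add_subject(number, name):
    catalog[number] = name

def lookup(number):
    return catalog[number]  # O(1), no collisions to resolve

add_subject(7, "attitude_estimate")
print(lookup(7))  # attitude_estimate
```

A hash table or a fixed subject set works too, but each is either slower in the worst case or less general than the plain array.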
>6) Node registration (4.2.5): Very strict, no option to detect why
>registration was rejected and try to re-register. Neither
>Register.request nor register.indication mentions how the node is
>eventually informed about the reason for rejection (maybe this is the
>code in 5.1.5.16?).
>
>
The bullets at the top of page 41 say that the reason for rejection is
noted in the rejection MPDU; conveying this information to the
application is an implementation matter, which doesn't affect
interoperability.
>7) Node registration (4.2.5): Forwarding an I_am_starting message by a
>registrar: some kind of "marking" has to be done to avoid forwarding a
>message which was previously forwarded by another registrar (looping of
>messages). The same applies to other similar forwarding (e.g.
>2.3.10 - Remote AMS message exchange).
>
>
I don't understand how this would happen: I don't think there's any
clause in the spec that talks about a registrar forwarding a MAMS
message to another registrar, except when the source of that MAMS
message is a node in its own zone (i.e., NOT a registrar). So there
can't be any looping of messages through registrars.
>8) Heartbeats (4.2.7): After it receives "reconnect" from a node, a
>registrar should return you_are_dead if it has been operating for more
>than 60 seconds. This is not right. First, imagine that a registrar
>crashes and a new one is started within x seconds. A node connected
>to the original registrar will need up to 60 seconds to notice that the
>registrar is dead, so it can't contact the new registrar sooner
>than 60 seconds after the old one crashed (in the worst case). That
>means that effectively the node has in the worst case x seconds to i)
>locate the new registrar and ii) contact it.
>
I think we're okay here. Suppose the registrar crashes at time T and
the new replacement registrar starts at time T+x. By time T+60 every
node in the zone will have noticed the death of the original registrar
and will have started asking the configuration server where the new one
is. All of the nodes will be querying the configuration server and
trying to reconnect, every 20 seconds, so by T+x+20 every node in the
zone will have learned about the new location of the registrar and will
have sent a reconnect message to it. Since the registrar doesn't shut
off reconnects until T+x+60, there's no problem, no matter what the
value of x is.
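The timeline argument can be checked mechanically. An idealized sketch, with transmission delays ignored and times in seconds relative to the crash at T = 0; the reconnect bound is slightly more conservative than the T+x+20 above, to cover small x as well:

```python
# Registrar dies at t = 0, replacement starts at t = x.

def worst_case_reconnect(x):
    """Latest time at which the last node's reconnect arrives. Every
    node has noticed the registrar's death by t = 60 and is then
    querying the configuration server every 20 s; the replacement
    registrar exists from t = x; so the last reconnect lands within
    one 20 s retry of whichever of those two events is later."""
    return max(60, x + 20)

def reconnect_deadline(x):
    """The replacement registrar keeps honoring reconnects until
    60 s after it starts."""
    return x + 60

# The worst-case reconnect always beats the deadline, whatever x is:
for x in (0, 5, 50, 100, 1000):
    assert worst_case_reconnect(x) <= reconnect_deadline(x)
```

Since max(60, x+20) <= x+60 for every non-negative x, the 60-second cutoff never strands a node.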
>Second, the communication between node and registrar could be
>spacecraft-to-spacecraft or even ground-to-spacecraft (unless you
>specify or suggest otherwise, as I suggested above), and hence 60 seconds
>for a message round trip is completely unrealistic.
>
>
Again, for communication over long-delay links it's important to deploy
multiple continua and use Remote AMS; ordinary AMS and MAMS functioning
just doesn't make sense over that sort of distance, and there's no need
for it to.
>9) Minor typographic issues: in 4.2.8 you reference 4.2.10, 4.2.11,
>4.2.12 and 4.2.13 as "above".
>
>
Good catch, I'll fix that. Thanks, Marek.
Scott