[Sis-ams] Predictability concerns with AMS, and other questions too

Tue Jan 31 08:16:11 EST 2006

Hi Scott,

going through your answers (thanks for them!):

> >1) Predictability issues. There is a number of places where I'm 
> This is an important topic.  The AMS design isn't principally 
> aimed at hard real-time applications; the main intent is to 
> reduce the cost of developing and operating distributed 
> systems over networks, including the future interplanetary 

This is certainly an important message for us, as what we definitely
want is support for RT. I always thought that the requirements for the
MTS included support for RT.

> As you say, it's all a question of exactly how you use the 
> system: once the communication configuration of the real-time 
> elements of the message space has stabilized (other bits of 
> configuration can continue to change without noticeable 
> effect) - and provided your real-time nodes are using a 
> real-time-suitable transport system (such as message queues) 
> underneath AMS - I believe you can get bounded maximum 
> latency in AMS message exchange among those nodes.  This 
> remains to be demonstrated, of course, and a lot does depend 
> on careful implementation, but my experience with Tramel 
> makes me hopeful.

O.K.
My feeling is that some level of laziness could be better than immediate
(and perhaps simultaneous) propagation of configuration modifications,
cycling through nodes, etc.

> I think the explanatory section you propose is a great idea, 
> but I think it belongs in an AMS Green Book (yet to be 
> written) rather than in the specification itself, as it is 
> informative and advisory rather than normative.  And 
> certainly nothing about the design of AMS obviates the need 
> for schedulability analysis in any case.

O.K., sounds good.

> I'd say that real-time performance (a guaranteed upper bound 
> on message delivery latency) is an element of Quality of 
> Service, just as reliability (retransmission), preservation 
> of data transmission order, etc. are elements of Quality of 
> Service.  Underlying the AMS design is a commitment to the 
> layering principle and a deliberate and resolute refusal to 
> reinvent communications all over again, so a fundamental 
> design principle of AMS is to rely on the underlying 
> transport systems to provide QOS.  For example, AMS doesn't 
> do retransmission itself: it relies on (say) TCP, where many 
> thousands of hours of work have gone into a sound 
> retransmission design.  So one reason you're not seeing a lot 
> of discussion of real-time performance guarantees in the AMS 
> spec is that AMS, by design, is going to rely on the 
> real-time performance of underlying transport systems when 
> applications deem real-time performance necessary.
> If the underlying transport system 
> is, say, vxWorks messages queues - or TCONS - then the 
> latency between the moment of transmission and the moment of 
> arrival of each message will be quite predictable.  But it 
> won't really be AMS that has provided that predictability: 
> AMS has just conveyed the application's mandate to the 
> transport layer.

I have to emphasize that I'm more of an RT and middleware guy than a
networking one.
The comments above are true. My only arguments are:
1) In addition to the transport layer's worst-case overheads, you have to
take into account the worst-case overheads of AMS itself in case of a
node failure, registrar failure, config server failure, etc. If I take
into account, e.g., a potential failure of a config server, normal
messages will be delayed in proportion to the number of nodes in the same
zone (maybe this number is bounded by the number of alternative config
server addresses). The delay on a node X is caused a) by AMS on node X
cycling through registrar addresses (CPU, network interface, kernel
calls; other threads are perhaps not preempted if AMS runs at a lower
priority, but they may still be blocked by shared access to some
resources, depending on the transport protocol, drivers, etc.), b) by the
number of messages received by the driver, c) by network traffic.
2) It is certainly a good thing that AMS relies on, e.g., TCP
retransmissions, but I'm missing AMS failure codes mapped to the various
failures of the underlying transport protocol. Without them there is
absolutely no way to learn that certain nodes may have failed or are
having problems receiving messages. (Yes, there is a way: to build yet
another application-level protocol on top of AMS, with message delivery
acknowledgements or something like that. Not a very effective way.)

> Registrars aren't required to retain information on the nodes 
> in remote zones; they receive it and they are required to 
> pass it on to all the nodes in their own zone, and of course 
> nothing prevents the implementation of registrar 
> functionality from retaining all this stuff, but no required 
> registrar functions depend on it (the registrar is not a 
> message broker).  In the JPL implementation, registrars know 
> nothing about other zones' nodes.

I see, O.K.

> Each node, on the other hand, is required to know about all 
> other nodes in the message space.  This tends to increase 
> nodes' memory requirements, but it makes it possible for all 
> AMS message traffic to be exchanged directly between nodes 
> rather than through a message broker; this reduces bandwidth 
> consumption (the number of messages is cut in
> half) and increases robustness (there is no single point of failure).
> On the assumption that 
> application message traffic will normally be vastly heavier 
> than configuration traffic, this seems like the right design approach.

Sure.

> predictability of fail-over will not be good.  The point of 
> the fail-over design isn't preservation of real-time 
> performance (which would be unaffected by failure of a 
> configuration server anyway, since the real-time application 
> messages are exchanged directly between nodes) but the 

That's what I think is not true. The performance of normal communication
will be affected.

> >2) Priority of a message: It is mentioned number of times, 
> Good point, there should be some clarifying language somewhere.  
> Priority and flow label are both merely passed through to the 
> underlying transport layer adapter, to be used (or not) as 
> makes sense for that protocol; the AMS protocol itself 
> doesn't use them at all.  The JPL implementation does use 
> priority to order arriving messages in the queue of messages 
> awaiting delivery to the application, but this is strictly an 
> implementation choice; interoperability is not affected.

I think that the implementation of AMS should somehow deal with
priorities as well. E.g., what happens when a message is being multicast
to all subscribed nodes and a higher-priority message arrives? Which has
the higher priority: reconfiguration or "normal" messages? What if two
messages of different priority are received by a node at the same
moment? What if a node sends a query, is suspended while waiting for a
reply, and a higher-priority message arrives? BTW, what does it mean that
the node is suspended? That no message can be sent by other application
threads?
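
To make the "two messages at the same moment" question concrete, here is
a minimal sketch of a delivery queue of the kind described in the quoted
reply: arrived messages are ordered by priority, and messages of equal
priority keep their arrival order. The class name, the payloads, and the
convention that a smaller number means a more urgent message are all
assumptions made for illustration; none of them comes from the AMS spec
or from the JPL implementation.

```python
import heapq
import itertools


class DeliveryQueue:
    """Orders arrived messages by priority, then by arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker that preserves FIFO order

    def enqueue(self, priority, payload):
        # Assumption: a smaller number means a more urgent message.
        heapq.heappush(self._heap, (priority, next(self._arrival), payload))

    def dequeue(self):
        _priority, _seq, payload = heapq.heappop(self._heap)
        return payload

    def __len__(self):
        return len(self._heap)


queue = DeliveryQueue()
queue.enqueue(5, "telemetry A")
queue.enqueue(1, "alarm")
queue.enqueue(5, "telemetry B")

# Drain the queue in delivery order.
delivered = [queue.dequeue() for _ in range(len(queue))]
print(delivered)
```

A queue like this only orders messages that have already arrived; it says
nothing about preempting a multicast in progress or about a node that is
suspended waiting for a reply, which are the open questions above.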

> These intervals could be configuration options rather than 
> fixed values, but in my experience that introduces a lot of 
> operational and implementation complexity for little if any 
> benefit.  Certainly wide variations in signal propagation 
> delay could make the fixed values in the spec less than 
> useful, but I would argue that in this case you should 
> partition your system into multiple closed continua and use 
> remote AMS for message exchange across the long-delay links; 
> that's really what RAMS was designed for.

Perhaps you're right and a suggestion like this should go to the green
book.

> >4) Configuration service fail-over (2.3.6, 4.2.1): After a 
> No, it should work fine, because each configuration server 
> sends I_am_running only to other configuration servers that 
> rank lower than itself in the well-known list of 
> configuration server network locations (see 4.2.1); no 
> configuration server will ever receive I_am_running from any 
> other configuration server to which it sent such a message.

Right, sorry I missed the "lower-ranking" part.
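
The lower-ranking rule from the quoted reply can be sketched as follows.
The server names are hypothetical, and treating an earlier position in
the well-known list as a higher rank is an assumption; the point is only
that no two servers ever send I_am_running to each other.

```python
def i_am_running_targets(servers, sender):
    """Return the servers ranked lower than `sender` in the well-known list."""
    rank = servers.index(sender)
    # Assumption: an earlier position in the list means a higher rank.
    return servers[rank + 1:]


servers = ["cs-a", "cs-b", "cs-c"]  # hypothetical network locations

# Every (sender, receiver) pair produced by the rule.
sends = {(s, t) for s in servers for t in i_am_running_targets(servers, s)}

# Pairs in which the receiver would also send back to the sender.
mutual = {(s, t) for (s, t) in sends if (t, s) in sends}
print(sorted(sends), mutual)
```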

> >7) Node registration (4.2.5): Forwarding an I_am_starting 
> >
> I don't understand how this would happen: I don't think 
> there's any clause in the spec that talks about a registrar 
> forwarding a MAMS message to another registrar, except when 
> the source of that MAMS message is a node in its own zone 
> (i.e., NOT a registrar).  So there can't be any looping of 
> messages through registrars.

I don't know what I meant in 4.2.5 ;-)
In 2.3.10, I thought that forwarding a message to "every other RAMS
gateway which it's linked and whose message space contains at least one
node..." actually implies that all the "other" RAMS gateways also
forward the message once again to all the RAMS gateways, as they match
the condition in the quotes. But perhaps 4.4.10 addresses this.
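
The concern about 2.3.10 can be illustrated with a small simulation. It
compares two readings of the forwarding rule on a hypothetical full mesh
of three gateways: one in which a gateway relays whatever it receives
from another gateway, and one in which it forwards only messages that
originate in its own message space. The function, the gateway names, and
the hop cap are assumptions made for this sketch; it does not claim to
reproduce what 2.3.10 or 4.4.10 actually specify.

```python
from collections import deque


def count_transmissions(links, origin, reforward, max_hops=4):
    """Count gateway-to-gateway transmissions caused by one message.

    links: dict mapping each gateway to the gateways it is linked to.
    reforward: if True, a gateway relays messages it received from another
        gateway; if False, only the originating gateway forwards.
    max_hops: safety cap so the relaying rule terminates in this sketch.
    """
    transmissions = 0
    pending = deque((origin, peer, 1) for peer in links[origin])
    while pending:
        sender, receiver, hops = pending.popleft()
        transmissions += 1
        if reforward and hops < max_hops:
            for peer in links[receiver]:
                if peer != sender:  # never send straight back to the sender
                    pending.append((receiver, peer, hops + 1))
    return transmissions


mesh = {"g1": ["g2", "g3"], "g2": ["g1", "g3"], "g3": ["g1", "g2"]}
once = count_transmissions(mesh, "g1", reforward=False)
naive = count_transmissions(mesh, "g1", reforward=True)
print(once, naive)
```

Under the relaying reading the count grows with the hop cap, i.e. the
message would circulate indefinitely unless something else in the spec
stops it.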

> >8) Heartbeats (4.2.7): After it receives "reconnect" from a node, a 
> I think we're okay here.  Suppose the registrar crashes at 
> time T and the new replacement registrar starts at time T+x.  
> By time T+60 every node in the zone will have noticed the 
> death of the original registrar and will have started asking 
> the configuration server where the new one is.  All of the 
> nodes will be querying the configuration server and trying to 
> reconnect, every 20 seconds, so by T+x+20 every node in the 

I don't see how you get T+x+20 here. The nodes start to reconnect after
they notice that the registrar is gone, which is by T+60 as you say
above. So the nodes will be querying the configuration server and trying
to reconnect at T+60 or T+80 (the document doesn't say whether
reconnection happens immediately or at the next 20-second interval).

> zone will have learned about the new location of the 
> registrar and will have sent a reconnect message to it.
> Since the registrar doesn't shut off reconnects until T+x+60, 
> there's no problem, no matter what the value of x is.

The registrar doesn't shut off reconnects until T+x+60, but the nodes in
the zone notice the old registrar's crash by T+60, as you say above. So
they have only x seconds to reconnect, don't they? There must be some
misunderstanding here...
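
The timing question can be worked through numerically. The sketch below
measures time in seconds after the crash time T and uses the 60-second
and 20-second figures from the discussion above. The function name, the
assumption that the new registrar accepts reconnects from T+x through
T+x+60 inclusive, and the two choices for the first attempt (immediately
on noticing, or one 20-second cycle later) are assumptions for
illustration; the document leaves the last point open.

```python
def first_successful_reconnect(x, noticed_at, first_attempt_delay,
                               retry_interval=20, accept_window=60):
    """Return the time of the first reconnect attempt the replacement
    registrar accepts, or None if every attempt misses the window.

    x: seconds after T at which the replacement registrar starts.
    noticed_at: seconds after T at which the node notices the crash.
    first_attempt_delay: 0 if the node tries immediately, 20 if it waits
        for the next cycle.
    """
    window_open, window_close = x, x + accept_window
    attempt = noticed_at + first_attempt_delay
    while attempt <= window_close:
        if attempt >= window_open:
            return attempt
        attempt += retry_interval
    return None


# Replacement registrar starts quickly (x = 5); node notices at T+60.
immediate = first_successful_reconnect(x=5, noticed_at=60, first_attempt_delay=0)
deferred = first_successful_reconnect(x=5, noticed_at=60, first_attempt_delay=20)

# Replacement registrar starts late (x = 300); node has been retrying since T+80.
late_start = first_successful_reconnect(x=300, noticed_at=60, first_attempt_delay=20)
print(immediate, deferred, late_start)
```

With these assumptions a large x is harmless, because the node is already
retrying when the registrar comes up. The problematic case is a small x
combined with a node that notices late and defers its first attempt.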

> Again, for communication over long-delay links it's important 
> to deploy multiple continua and use Remote AMS; ordinary AMS 
> and MAMS functioning just doesn't make sense over that sort 
> of distance, and there's no need for it to.

O.K., maybe another paragraph in the future green book might help. The
thing is that your explanation above is not just a suggestion, it is a
must - given the fixed numbers in the protocol.

Best regards,
Marek