[Css-csts] production status and the complete mode of the Buffered Data Delivery procedure

Tue Mar 19 12:01:53 EST 2013

Dear John,

I guess some of the points you raise are a consequence of combining the complete online and offline modes into a single complete delivery mode. However, the SLE complete online mode can also be active in periods where there is no active SLS, and I cannot recall that we ever had problem reports related to the production status notifications.

I do not think the situation is as obscure for the service user as you describe it. First, the user will most probably know when an SLS is active and when not. Second, a service always can and should add further information to the parameters that can be monitored using the cyclic report procedure - the SLE return services, for instance, report the lock states. Such information could also be added to the notifications reporting PS changes if that is felt to be necessary. I think this might be a better approach than introducing a special "RB State". In any case I feel that this is probably better left to the service than trying to solve potential issues at the abstract layer of FWS.

I think a user will be interested in the production status only for two reasons:

a)      To understand why there is a gap in the data (for that reason the notification must be in the RB)

b)      To understand why he does not receive data "now" (this can be supported either by monitoring or querying the PS)

You are right that the user will probably never see notifications inserted into the RB outside an active SLS, but I do not see why that should be a problem. Such PS changes are of interest to the user only when he no longer receives data, not at a later time when he retrieves the data.

Concerning the actual definition of what the PS means, I think we cannot fix that for all providers, and especially not at this level of abstraction. From what I have seen in SLE Return Service implementations, the PS is either operational or halted, where the latter would be set only on operator request. As far as I understand, not receiving telemetry is not a reason to set the status to interrupted unless something in the TM chain is broken. In the offline mode (or complete mode without an active SLS) I guess the only thing that can happen is that the storage system does not work. If I can switch to a redundant storage I might have a transient interrupted state, but if not then halted is probably more appropriate.

As regards clause 4.5.3.2.6, if you leave it in the BDD, then would you not have to add the same statement to all other procedures that notify events? If the general specification of derivation is not clear enough, maybe it has to be made clearer - or maybe better clarified in the guidelines.

Unfortunately I cannot attend the teleconference on Thursday and therefore wanted to at least respond briefly in writing.

Kind Regards, Martin

From: John Pietras [mailto:john.pietras at gst.com]
Sent: 18 March 2013 16:51
To: Martin Götzelmann; CCSDS_CSTSWG (css-csts at mailman.ccsds.org)
Subject: RE: [Css-csts] production status and the complete mode of the Buffered Data Delivery procedure

Martin,
Thanks for your reply in the message below and in the message that followed it.

Your thought about including a separate Cyclic Report procedure to inform the user of the Production Status (PS) "now" (I started to write "in real-time" but that confuses the issue when discussing complete mode) brought to mind another question - what is the Production Status when the BDD-implementing service is in complete mode *and* there is no active Space Link Session? Nominally, it would be the status of the Recording Buffer (RB) [see Footnote, below] only, but that seems problematic - if the status of the RB *does* change for some reason, the notifications will be placed into that same RB in time-order, meaning that the user could see RB status notifications interspersed with space link session production-related PS notifications, with no way to distinguish between them, and the user would only see those notifications when the Start and Stop times of the START include them. (And would the RB ever be in 'production configured', and if so how does that notification get into the RB?)

It seems to me that in order to get around this problematic situation we should refine the definition of production status for the BDD procedure to be limited to the production processes associated with the production of the Service Production Data Units (to use the terminology of the BDD Concept section) generated by the Space Link Session-associated processes and explicitly exclude the BDD. If we want to cover the case of indicating problems with the RB itself, the procedure could define a separate set of "recorded buffer status" notifications that are inserted by the procedure itself to report the RB status as it exists "now". Such an RB status would be valid whether or not the BDD-implementing service is co-incident with a space link session. However, I don't have a strong feeling one way or the other about the separate RB status notifications.

Getting back to your thought about including a separate Cyclic Report procedure (and/or Information Query or, perhaps most appropriately, Notification procedure) to report on the PS "now", how would that be determined? Presumably if there is an SLS actively providing data to the RB then that would be the "now" PS. But what if there is no such active SLS associated with the RB - 'production halted' may be technically accurate but misleading. It might be 'production configured' for some Complexes, but for others it will just be unspecified (which is not a valid value for PS). And such extra procedures would only be beneficial for complete instances - real-time instances will already be reporting the PS changes as they occur via the BDD NOTIFY operation anyway. (Indeed, in the real-time case, having multiple procedures reporting the same PS could add to the congestion of a real-time service instance.)

I now turn to your comments about events and associated notifications added by derived procedures. I completely agree with your comment that they should be removed from the BDD state table. The appropriate approach (in my opinion) is that when a derived procedure adds a notifiable event and associated notification, the state table for that derived procedure should be extended to explicitly identify that event as an incoming event and spell out the appropriate actions in each state. This approach is not only more explicit for the implementer but also more rigorous for the definer of derived procedures - it forces him or her to ask what the effect of this particular event should be on the procedure. I'm not sure about removing clause 4.5.3.2.6 - I'm not sure that our general statement of derivation would lead implementers to believe that a derived procedure can add new events. But in any case the derived-procedure events in that state table and the references to clause 4.5.3.2.6 should be removed.

Finally, regarding your comment in the email that followed the one below, about not removing row 9 but just making it have no effect on the state of the procedure instance: I don't have a problem with that, but if we just remove the ".OR. xxx derived procedure event" phrases from rows 8 and 9, both will be left with the same incoming event, 'production status change'. These rows will have to be qualified by the mode of the BDD procedure instance, something like "'production status change' [when BDD procedure is in real-time mode]" and "'production status change' [when BDD procedure is in complete mode]". Presumably there would be a real-time/complete pair of events for each new event that a derived procedure defines.
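
As a rough illustration of qualifying the two rows by mode, the sketch below keys a dispatch table on the pair (incoming event, mode). It is not taken from the SFW state table: the names Mode, ROWS, notify_user, no_effect and handle are all hypothetical, and the two actions are assumptions based only on this thread (real-time instances report PS changes via NOTIFY as they occur; the complete-mode row is kept but has no effect on the procedure instance).

```python
from enum import Enum
from typing import Callable, Dict, Tuple


class Mode(Enum):
    REAL_TIME = "real-time"
    COMPLETE = "complete"


def notify_user(event: str) -> str:
    # Assumed row 8 action: report the change to the user as it occurs.
    return f"NOTIFY user: {event}"


def no_effect(event: str) -> str:
    # Assumed row 9 action: row is kept, but the instance does nothing.
    return "no action"


# Hypothetical state-table fragment: the same incoming event appears twice,
# distinguished only by the mode of the procedure instance.
ROWS: Dict[Tuple[str, Mode], Callable[[str], str]] = {
    ("production status change", Mode.REAL_TIME): notify_user,
    ("production status change", Mode.COMPLETE): no_effect,
}


def handle(event: str, mode: Mode) -> str:
    # Look up the row by (event, mode) and run its action.
    return ROWS[(event, mode)](event)
```

Under this scheme, an event added by a derived procedure would contribute its own pair of (event, mode) entries rather than being folded into the 'production status change' rows.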

I hope that we will have a few moments to discuss these points in Thursday's telecon.

Best regards,
John

[Footnote - Somehow in my original message I not only abbreviated "Recording Buffer" incorrectly to "RC", but I continued to use the wrong abbreviation throughout my message. I've re-abbreviated it for this message.]

From: Martin Götzelmann [mailto:martin.goetzelmann at telespazio-vega.de]
Sent: Sunday, March 17, 2013 6:23 AM
To: John Pietras; CCSDS_CSTSWG (css-csts at mailman.ccsds.org)
Subject: RE: [Css-csts] production status and the complete mode of the Buffered Data Delivery procedure

Dear John,

Here are some initial thoughts concerning your observations.

My understanding of the intended behaviour is exactly as you describe it.

I think we need to distinguish between 'interrupted' and 'halted'. Interrupted is supposed to be of transient nature. In the scenario you describe, what would happen is that the latest transfer buffer would be transmitted once the timer has expired and then no more data would be sent. When the provider recovers from the transient fault, data would be sent again, and if there was data loss due to the interruption, then the notification would be inserted in the stream where the fault occurred. If the provider was not able to insert anything into the RC at the time of failure, I would expect he does so at the time the fault is cleared.

The situation is different when the fault persists. In that case the PS should be set to halted and according to the current specification adopted from SLE the notification would be inserted into the RC. Again the transfer buffer would be sent when the timer expires but then nothing more would happen.

We do distinguish between events that are notified synchronously and events that are notified asynchronously, and the SFW says that this is defined by the procedure. As far as I can see the BDD says that all events are notified synchronously. One could consider notifying the 'production halted' event asynchronously, such that the user is informed at the time the event occurs. On the other hand, a Service can always add an instance of the cyclic report procedure by which the user can monitor the PS and other relevant parameters or query them when he does not receive data any more.

The BDD adds only two events (end of data, data discarded due to ...) and these must be clearly synchronous. The 'end of data' event is covered by row 7 of the state table and the 'data discarded' event cannot occur in complete mode. It is not clear to me why the BDD also deals with events that might be added by derived procedures. I do not think we do that in other procedures, and I feel it is problematic as we cannot really anticipate what the type of event and the suitable behaviour should be. I would therefore suggest removing that event and removing clause 4.5.3.2.6, as this is a statement that just repeats the general concept of derivation. If we do this then row 9 indeed becomes superfluous and should be removed.

Regards, Martin

From: css-csts-bounces at mailman.ccsds.org [mailto:css-csts-bounces at mailman.ccsds.org] On Behalf Of John Pietras
Sent: 15 March 2013 19:53
To: CCSDS_CSTSWG (css-csts at mailman.ccsds.org)
Subject: [Css-csts] production status and the complete mode of the Buffered Data Delivery procedure

CSTSWG colleagues ---
I mentioned in my previous email that while updating the TD-CSTS book I had come across some issues regarding production status and the complete mode of the Buffered Data Delivery procedure.

Let me start by stating my understanding of what the intended behavior is. Assuming that that understanding is correct, I'll then explain the issues that I've encountered.

My understanding of the intended behavior is that when the production status changes, the associated notifications are placed into the Recording Buffer. The effect is that when a complete-mode BDD instance pulls the notification from the Recording Buffer and sends a production change notification to the user, that notification doesn't necessarily have anything to do with the production status at the time that notification is being transferred.
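
To make that behavior concrete, here is a minimal sketch of a time-ordered buffer that holds data and production status notifications side by side. It is only an illustration of the understanding stated above, not the specified mechanism: Record, RecordingBuffer, insert and read are hypothetical names, and the times and payloads are made-up values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    time: float   # generation time of the item (arbitrary units)
    kind: str     # "data" or "notification"
    payload: str


class RecordingBuffer:
    """Hypothetical time-ordered store shared by data and PS notifications."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def insert(self, record: Record) -> None:
        # Keep records sorted by generation time, as production writes them.
        self._records.append(record)
        self._records.sort(key=lambda r: r.time)

    def read(self, start: float, stop: float) -> List[Record]:
        # A complete-mode instance only sees items inside its START window.
        return [r for r in self._records if start <= r.time <= stop]


rb = RecordingBuffer()
rb.insert(Record(10.0, "data", "unit-1"))
rb.insert(Record(20.0, "notification", "production interrupted"))
rb.insert(Record(30.0, "notification", "production operational"))
rb.insert(Record(40.0, "data", "unit-2"))

# Retrieved later: the 'interrupted' notification is delivered even though
# the production status at retrieval time may be something else entirely.
delivered = rb.read(start=5.0, stop=25.0)
```

In this sketch the user who asks for the window 5.0 to 25.0 receives 'production interrupted' but never the later 'production operational', which is the decoupling from the current status described above.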

Assuming that's the intent, here's my first issue. Since the Recording Buffer (RC) is part of production, what happens if for some reason the RC becomes disabled? As described, a 'production interrupted' notification should be popped into the RC, but even if the RC is capable of ingesting data it still can't pass the data to the BDD instance.

I looked again at the BDD specification in the December CSTS SFW for some possible help, but there too I found some issues in the BDD state table. There are two rows in the state table that deal with production status change, rows 8 and 9. The first incoming event in both rows is 'production status change', but it is pretty clear from the context that the intent is that row 8 applies when the procedure is in real-time mode, and row 9 applies to the complete mode.

Let's consider row 9 first. Row 9 deals with data going into the Recording Buffer from upstream production. This is the state table for the Buffered Data Delivery "Service" provider, but putting a notification into the RC is independent of the state of the CSTS that implements the BDD procedure, and indeed the CSTS could be in no state at all. So I don't think that row 9 even belongs in this state table. The only way that the CSTS (procedure) instance deals with production status notifications is when it pops those notifications out of the Recording Buffer in the same way that it deals with "data" (or Service Production Data Units), which is covered by the 'data read from recording buffer' event (row 4). [However, this still does not address or solve the issue of how to react to the interruption/halting of the Recording Buffer itself.]

Row 8 (real-time mode) covers the only case in which a CSTS instance "sees" a production status change when it actually occurs.

The final issue has to do with "derived procedure events". There are really two kinds of derived procedure events, although the BDD specification doesn't always keep the distinction clear. The first kind is the Service Production Event Notification, which is generated by service production and affects the data that is available to all service instances associated with that production process. The second kind of notification, which I'll call "procedure instance-generated", is related to the individual CSTS instance, like 'end of data'.

For the real-time mode, row 8 already covers both cases - there's no need to distinguish between Service Production Event Notifications and procedure instance-generated notifications because they are all occurring in real-time.

For the complete mode, the Service Production Event Notification case is covered by row 4, as described above. That is, the only way the procedure instance deals with the Service Production Event Notifications is when they come out of the Recording Buffer.

But the state table does *not* cover procedure-generated notifications when the service/procedure is in complete mode. The behavior with regard to a procedure-generated notification should be that it is put in the Transfer Buffer as soon as it is generated. This may cause a momentary delay in the transfer of data from the Recording Buffer, but that doesn't matter because it is the complete mode. The state table should cover this case. At a quick look, it appears that the action would be similar to that for row 4, substituting "notification" for "data". It may also require bringing out the distinction between Service Production Event Notifications and procedure instance-generated notifications in the normative behavior text.
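
The proposed handling can be sketched as follows. This is an assumption-laden illustration rather than the state table itself: on_data_read stands in for the row 4 action, on_procedure_notification for the suggested new row, and the buffers are plain Python containers with made-up contents.

```python
from collections import deque
from typing import Deque, List

# Hypothetical stand-ins for the Recording Buffer and the Transfer Buffer.
recording_buffer: Deque[str] = deque(["data-1", "data-2", "data-3"])
transfer_buffer: List[str] = []


def on_data_read() -> None:
    # Row 4 analogue: move the next recorded item into the Transfer Buffer.
    if recording_buffer:
        transfer_buffer.append(recording_buffer.popleft())


def on_procedure_notification(notification: str) -> None:
    # Suggested new row: same action as row 4 with "notification" for "data".
    # The item bypasses the Recording Buffer and is queued immediately.
    transfer_buffer.append(notification)


on_data_read()
on_procedure_notification("procedure-generated notification")
on_data_read()
```

The notification lands between two recorded items, so the recorded data behind it is delayed by one slot, which is the momentary delay mentioned above.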

I look forward to your comments.

Best regards,
John
