[Css-csts] Effects of varying network bandwidth on SLE service instance provision period

Wolfgang.Hell at esa.int Wolfgang.Hell at esa.int
Tue Feb 7 05:35:29 EST 2012


John,

Here is a brief summary of how the current ESA implementation works. As it is
the simpler case, let me start by addressing the offline case:

- Any mission for which we have taken the commitment to provide telemetry
also in offline delivery mode can activate an offline retrieval in parallel
with real-time support, i.e. during a scheduled pass at the given station,
because we include the Offline service instances in what we refer to as the
return spacecraft session on our SLE provider. In other words, a mission
having detected that they missed something in real time can perform the
retrieval during the next pass at that station. The risk is that the
retrieval might be aborted if the data flow does not complete by the
end of the pass. If that risk is obvious from the beginning, the mission can
talk to the link operator during the pass and request that the offline SIs
will not be removed at the end of the pass. In the vast majority of cases the
extended accessibility of the offline SIs will be granted. We then expect the
mission to let us know when they are done and at that point the SIs will be
removed.

- In case missing data are urgently needed or a pass with the station where
the missing data have been acquired is not scheduled for quite some time, the
offline retrieval can also be requested via voice loop outside a pass of that
mission. The link operator will then load the SIs manually and the mission
can perform the offline retrieval in parallel with real time support of
another mission. Again we expect the mission to tell us when they are done so
that the SIs can be removed again.

- It would also be possible to schedule offline return sessions the same
way as real-time passes are scheduled. However, at least for the missions we
are currently supporting, offline retrieval is the exception rather than the
rule, used to recover from anomalies encountered during the real-time
support. Therefore this possibility is currently not used in practice.

Let me address now the online complete delivery case.

- For most missions using the online complete delivery mode, this is done
only as a precaution. As long as things are working nominally, the rate at
which the mission control system reads the telemetry is at least as high as
the space link rate and therefore no backlog builds up on the provider side.

- Even if a small backlog builds up, in general the post-pass or tear-down
time of the station will be long enough to transfer this backlog before the
SIs get removed. If the transfer is still ongoing, the process that would
normally remove the SIs runs into an error condition and the operator is
notified. At that point the operator will in general check with the mission
and permit the continuation of the data flow while the station is being
configured for the next mission. If the operator cannot talk to the mission
(e.g. unmanned operations), he will abort the SLE services and the mission
will have to retrieve the data later in offline mode.

- We have a few cases where the space link rate is higher than what the
mission control system can digest and where as a consequence a backlog always
builds up. For those missions the pass timelines are made such that the
SIs are not removed at the end of the pass and therefore the mission can
continue with the data flow. Once again we expect the mission to let us know
when they are done so that the SIs can be removed.
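The backlog arithmetic behind this case can be sketched as follows (all rates and durations here are hypothetical, not actual Estrack figures):

```python
# Rough backlog estimate for online complete delivery when the space link
# rate exceeds the rate at which the mission control system reads data.
# All figures are illustrative only.

def post_pass_drain_time(link_rate_kbps, read_rate_kbps, pass_minutes):
    """Return minutes needed after the pass to drain the accumulated backlog."""
    if read_rate_kbps >= link_rate_kbps:
        return 0.0  # no backlog builds up during the pass
    backlog_kbit = (link_rate_kbps - read_rate_kbps) * pass_minutes * 60
    return backlog_kbit / (read_rate_kbps * 60)

# Example: 2 Mbit/s downlink, MOC reads at 1.5 Mbit/s, 30-minute pass:
# the backlog drains in 10 additional minutes at the MOC read rate.
print(post_pass_drain_time(2000, 1500, 30))  # 10.0
```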

The current SLE provider implementation supports up to four concurrent return
sessions, i.e. up to four spacecraft can be serviced concurrently. This limit
dictates that at some point the SIs have to be removed so that there is room
for the next mission to be supported. The prime driver for this limit is that
we want to monitor individual service instances. Allowing even more
concurrent return sessions would make SI monitoring unmanageable from a
human user perspective.

In terms of bandwidth management, the offline retrieval is not regarded as
critical. All such transfers fall into the so-called default class, which
gets the bandwidth not consumed by other higher-priority traffic. What we
observe for our network is that in general the mission control system is
limiting the throughput rather than the comms line. The online complete
traffic uses a dedicated higher priority throughput class where the
scheduling is done such that we support concurrency of missions only up to
the point where the sum of the concurrent data streams fits into the
guaranteed bandwidth allocated to this class.
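The scheduling constraint for the online complete class described above amounts to a simple admission check; a minimal sketch (the session cap matches the four-session limit mentioned earlier, but the class allocation and rates are hypothetical):

```python
# Admission check for the guaranteed-bandwidth class used by online
# complete traffic: a new return session is admitted only if the sum of
# all concurrent stream rates still fits the class allocation.
# The bandwidth figures are illustrative values, not Estrack parameters.

MAX_CONCURRENT_SESSIONS = 4
GUARANTEED_CLASS_KBPS = 4096  # hypothetical allocation for this class

def can_admit(current_rates_kbps, new_rate_kbps):
    if len(current_rates_kbps) >= MAX_CONCURRENT_SESSIONS:
        return False  # no room for another return session
    return sum(current_rates_kbps) + new_rate_kbps <= GUARANTEED_CLASS_KBPS

print(can_admit([1024, 1024], 2048))  # True: 4096 kbps fits exactly
print(can_admit([1024, 1024], 2049))  # False: exceeds the class allocation
```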


In conclusion, what we have today works and satisfies the missions' needs.
What is not really satisfactory is the dependency on voice loop interactions.
Improvements in that area are certainly conceivable. For instance, SIs could
be removed when no telemetry flow has been observed for a TBD time. Likewise,
one could build into the system a mechanism that automatically loads the
required offline SI when a BIND attempt to such an SI is made and the required
resources are available. But for now we do not have concrete plans in that
regard.

Please let me know if you have any further questions on this matter.

Best regards,

Wolfgang


                                                                                                                                              
From: "John Pietras" <john.pietras at gst.com>
To: <Wolfgang.Hell at esa.int>, "Stoloff, Michael J (318H)" <michael.j.stoloff at jpl.nasa.gov>, <css-csts at mailman.ccsds.org>, <smwg at mailman.ccsds.org>
Date: 06/02/2012 19:58
Subject: [Css-csts] Effects of varying network bandwidth on SLE service instance provision period
Sent by: css-csts-bounces at mailman.ccsds.org





CSTSWG and SMWG colleagues ---
Recently, the NASA Space Network Ground Segment Sustainment (SGSS) Project
has been considering the need to schedule terrestrial network bandwidth for
playback of telemetry data. In the legacy SN system, they both schedule the
playback process itself and reserve the network bandwidth. The
question was raised about the effects of knowing or not knowing the available
terrestrial bandwidth on providing SLE offline services. In a note to SGSS
Project personnel, I pointed out that there is no single definitive answer,
but that it depends on (among other things) how service instance provision
periods are “scheduled”. For illustrative purposes, I identified two ends of
a spectrum.

At one end, the network can configure offline service instances with
relatively narrow service instance provision periods; e.g., a specific
45-minute period could be scheduled. If the terrestrial bandwidth that is
available during that window is sufficient to transfer the volume of data
requested by the user in the START operation, then everything is fine, but if
the bandwidth is not sufficient, the service instance will peer-abort when
the end of the period arrives and all requested data have not yet been
delivered. The legacy Space Network schedules data playback in a mode that is
toward this end of the spectrum.
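Whether a narrow provision period suffices is a back-of-the-envelope calculation: the requested volume must fit through the terrestrial pipe before the window closes. A sketch, with purely illustrative numbers:

```python
# Will an offline retrieval complete within a fixed provision period?
# If not, the service instance aborts at the end of the window with
# requested data still undelivered. Values are illustrative only.

def completes_in_window(volume_mbit, bandwidth_mbps, window_minutes):
    transfer_minutes = volume_mbit / bandwidth_mbps / 60
    return transfer_minutes <= window_minutes

# 2700 Mbit over a 1 Mbit/s share takes exactly 45 minutes: it fits.
print(completes_in_window(2700, 1.0, 45))   # True
# Halve the available bandwidth and the same request would abort.
print(completes_in_window(2700, 0.5, 45))   # False
```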

At the other end of the spectrum, the service instance provision period is
essentially boundless. The service instance is enabled for binding whenever
the user wants to “pull” the data. As long as the requested data are still
available in the data store, they can be transferred. The service instance
can take as long as necessary to transfer (because the end of the service
instance provision period is unbounded): slower terrestrial links mean longer
transfer times but they don’t otherwise inhibit the transfer. In my
discussion with Wolfgang Hell on offline SLE several years ago, I came away
with the impression that this would be the preferred approach given
sufficient resources.

How do today's offline SLE implementations (e.g., Estrack and DSN)
currently fall on this spectrum? Are the access windows tightly defined,
unbounded, or somewhere in between (e.g., a service instance is scheduled for
a given mission with the same 4-hour provision period every day)?

Of course, the unbounded-provision-period end of the spectrum implies that
transfer service instances are accessible to all offline users all the time,
which has implications for resources. I am under the impression that many (if
not most or all) SLE implementations must be dedicated to a single user at a
time. It seems to me that this doesn’t necessarily have to be the case – it
should be possible to implement offline SLE so that the resources are pooled
so that N offline service instances could be enabled (ready to bind) without
having an SLE “processor” dedicated to each of them. Does your network’s
implementation dedicate SLE resources on a one-for-one basis, or is there
some degree of resource sharing?

While I’m asking about offline services, let me ask a related question about
complete online delivery mode. When a user (mission) schedules a pass with
complete online return SLE service, how is the end of the service instance
provision period determined? Is it (a) requested by the mission in the
service request (or, for rule-based scheduling, pre-specified in the
scheduling rules), or (b) calculated by the network on a pass-by-pass basis
and configured accordingly?

As you may know, in the Blue-1 version of Service Management, the mechanism
for setting complete (as well as timely) online service instance provision
periods is through start and stop time relative offsets specified in Transfer
Service Profiles. The relative offsets are with respect to the space link
carriers with which they are attached. This allows the service instance
provision periods to “float” in the scheduling process with the flexibilities
applied to the scheduling of the space link carriers. For example, profile
RAF7 could have a start-time offset of 0 seconds, and a stop-time offset of
+300 seconds. If profile RAF7 is applied to a space link carrier profile that
is used to schedule a return space link carrier from 1100 to 1115, the RAF
service instance associated with profile RAF7 will be scheduled from 1100 to
1120 (1115 plus 5 minutes, i.e. 300 seconds).
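The offset arithmetic in the RAF7 example can be written out directly (profile name, carrier times, and offsets are taken from the example above; the date is arbitrary):

```python
from datetime import datetime, timedelta

# Compute an online service instance provision period from the space link
# carrier times plus the start/stop offsets of a transfer service profile,
# as in the RAF7 example (0 s start offset, +300 s stop offset).

def provision_period(carrier_start, carrier_stop, start_offset_s, stop_offset_s):
    return (carrier_start + timedelta(seconds=start_offset_s),
            carrier_stop + timedelta(seconds=stop_offset_s))

start, stop = provision_period(
    datetime(2012, 2, 6, 11, 0),   # carrier scheduled from 1100...
    datetime(2012, 2, 6, 11, 15),  # ...to 1115
    start_offset_s=0,
    stop_offset_s=300,
)
print(start.strftime("%H%M"), stop.strftime("%H%M"))  # 1100 1120
```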

The Blue-1 approach may be too simplistic, and as we develop requirements for
the next generation of Service Management, I would like to collect
information on how it is actually done today, and more importantly, how it
might be better done in the future.

Thanks in advance for your help.

Best regards,
John


From: John Pietras
Sent: Thursday, February 02, 2012 4:18 PM
To: Stephen.Bernsee at gdc4s.com; 'Douglas.Barnhart at gdc4s.com'
Cc: 'Degumbia, Jonathan D. (GSFC-444.0)[OMITRON]'; Gawne, Bill
(GSFC-444.0)[HONEYWELL TECH. SOLUTIONS]
Subject: RE: Considerations for SLE and uncertain terrestrial bandwidth

Steve and Doug,
As you know, one of the special topics that’s being carried for the SM
Development WG is related to whether SM needs to know about terrestrial
bandwidth, especially with regard to playbacks. One facet that came up in the
discussion is how return SLE services are affected by bandwidth
considerations. Here are some thoughts on the effects of variable bandwidth
on SLE transfer services.

Return SLE services have three delivery modes: timely online, complete online,
and offline.

      1.       In the timely online mode, the SLE service instance attempts
      to send the data units (e.g., transfer frames) across the TCP
      connection as soon as it gets the data units (i.e., at the downlink
      rate). However, if the link under the TCP connection doesn’t have a
      matching data capacity, the SLE service instance will discard data that
      can’t be sent, based on a service-management-configured parameter. If
      the terrestrial comm link is running at slightly above the downlink
      rate (slightly above accounts for SLE overhead) and the link is clean,
      all frames will normally get transferred and only if the network
      experiences unanticipated (temporary) congestion will frames be
      discarded. So not all frames are guaranteed to get through, but those
      that do are guaranteed to be delivered within a defined latency (hence
      “timely”). As far as SM is concerned, it doesn’t matter if the SLE
      service is given more time than the contact – it couldn’t transfer any
      more data if it had the extra time. However, if the event is scheduled
      such that it competes for the terrestrial bandwidth with other events,
      then the MOC will experience data loss as the service instance
      routinely discards because of the too-low available bandwidth.

      2.       In the complete online mode, the SLE service instance also
      attempts to send the data units across the TCP connection as soon as it
      gets the data units, but if the link under the TCP connection doesn’t
      have a matching data capacity, the SLE service instance will buffer
      data until it can be sent, as long as the service instance is enabled.

      In the SLE transfer service specifications, the time during which the
      service instance is scheduled to be provided (and thus enabled) is the
      service instance provision period. The SLE transfer service
      specifications do not say how the service instance provision period is
      to be specified in the Service Package. In SCCS-SM, the transfer
      service profiles specify relative offsets from the space link carriers
      with which they are attached. This allows the service instance
      provision periods to “float” in the scheduling process with the
      flexibilities applied to the scheduling of the space link carriers. For
      example, profile RAF7 could have a start-time offset of 0 seconds, and
      a stop-time offset of +300 seconds. If profile RAF7 is applied to a
      space link carrier profile that is used to schedule a return space link
      carrier from 1100 to 1115, the RAF service instance associated with
      profile RAF7 will be scheduled from 1100 to 1120 (1115 plus 5 minutes,
      i.e. 300 seconds). Note that we made the start and stop offsets respecifiable
      so that, for example, for a particular Service Package Request the UM
      could change the RAF 7 stop-time offset to +600 seconds for that
      particular request.

      Note, too, that in SCCS-SM the online SLE transfer services are
      considered part of the Service Package, so the Service Package does not
      stop executing (i.e., “end”) until the completion of the last-ending
      space link carrier or SLE transfer service.

      So, how does this play when the terrestrial bandwidth available to the
      Service Package is affected by other Service Packages? First let’s
      consider the case where neither the MOC nor the SM PSE has any
      knowledge of the terrestrial data rate available to the RAF service
      instance. The stop-time offset in the transfer service profile that is
      applied (e.g., RAF7) may or may not be sufficient to transfer all of
      the data, depending on the competition for that terrestrial bandwidth.
      The profile *could* be configured with a stop-time offset that is
      calculated to transfer all of the data 95% of the time based on some
      network loading analysis that factors in probabilities for competition
      for terrestrial bandwidth, but in many Service Packages that could
      result in scheduling and allocating the SLE transfer service resources
      to that service instance much longer than is needed (e.g., there may be
      no competition for bandwidth during the execution of a particular
      Service Package).

      Now let’s assume the case where the SM PSE *does* know the terrestrial
      bandwidth that will be available. Off the top of my head, some options
      could be:
         a.       If the SM PSE can’t schedule the Service Package such that
         all transfer service instances would be able to transfer their data
         within the specified stop-time offsets, the SM PSE could reject the
         Service Package Request (“reject” could, of course, be preceded by
         conflict resolution activities);
         b.      If the SM PSE can’t schedule the Service Package such that
         all transfer service instances would be able to transfer their data
         within the specified stop-time offsets, the SM PSE could
         accept/schedule the Service Package and add some sort of annotation
         that the data on the transfer service instance will not be able to
         be delivered, and perhaps make a recommendation for the MOC to
         replace the request with one that has a sufficiently-long transfer
         service instance. This would allow the UM to hold the scarce
         resource (the space link carriers). It’s also a viable approach for
         when the UM is having the data simultaneously stored for offline
         retrieval and is willing to use an offline service to get whatever
         can’t be transferred in near-real-time.
         c.       The SM PSE could ignore the stop-time offset (or treat it
         as a soft constraint) and attempt to schedule the Service Package
         with the transfer service instances running as long as necessary to
         get the data transferred at the terrestrial bandwidth that will
         actually be available (assuming that the SLE transfer service
         resources will also be available for that time). If the MOC finds
         the resulting end-time(s) unacceptable it is always free to delete
         the service package or attempt to modify (replace) it.

         There are probably other variations, too. Of the three above, I
         personally lean toward (c); it seems to have the highest probability
          of getting an acceptable solution the first time and avoiding
          unnecessary new or replace requests.

      3.       In the offline mode, the data are already stored in the data
      store and the retrieval is unrelated to the Space Link Sessions (i.e.,
      SN Events) during which the data were originally received. In SCCS-SM,
      the offline transfer service instances are scheduled via a different
      kind of Service Package request, the Retrieval Service Package request.
      Unlike the transfer service profiles used for online services, the
      offline transfer service profiles don’t need to be offset with respect
      to other entities such as space link carriers: the service provision
      period of the transfer service instance is set directly by the
      retrieval Service Package that schedules that transfer service
      instance.

      So how are offline transfer service instances affected by varying
      terrestrial bandwidth? It depends on the duration of the Retrieval
      Service Package. If Retrieval Service Packages are scheduled for narrow
      time slices, then if a MOC attempts to retrieve more data (as specified
      by the start-time and stop-time parameters of the START invocation that
      is used to retrieve data from the data store) than can get through the
      pipe, the service instance will abort at the end of the service
      instance provision period without having transferred all of the data.

      While SCCS-SM supports scheduling of short Retrieval Service Packages,
      the preference of many CCSDS members appears to be to allow long-lived
      Retrieval Service Packages (sometimes as long as the life of the
      Mission). By practically eliminating the possibility of having the
      transfer service provision period end while the data are being
      transferred, the MOC requests the data and it simply gets delivered at
      the available data rate, which may even vary during the course of the
      data transfer.

      The downside of long-lived Retrieval Service Packages is the
      possibility (depending on how the SLE services are implemented) of
      having to dedicate SLE offline transfer service resources for long
      periods of time. I believe that it is possible to implement an offline
      SLE transfer service “server” that can be configured to support
      multiple missions simultaneously, but I don’t know if any existing SLE
      implementations are set up to actually do so.
      This is something that I will explore within CCSDS. Note that enabling
      more or fewer SLE offline transfer service instances can be done
      independently of how much terrestrial bandwidth is available – it’s
      only when the transfer service instances are bound and active that data
      flows and therefore impacts the terrestrial bandwidth supply.
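The core behavioral difference between the timely and complete modes described in items 1 and 2 is what the provider does when the terrestrial link cannot keep up. A minimal sketch (the fixed buffer limit standing in for the service-management-configured discard parameter is hypothetical, not from the SLE specifications):

```python
from collections import deque

# Timely online: frames that cannot be sent in time are discarded
# (bounded latency, possible loss). Complete online: frames are buffered
# until they can be sent, for as long as the service instance is enabled.

class ReturnBuffer:
    def __init__(self, mode, timely_limit=8):
        self.mode = mode            # "timely" or "complete"
        self.limit = timely_limit   # illustrative stand-in for the discard parameter
        self.frames = deque()
        self.discarded = 0

    def receive(self, frame):
        if self.mode == "timely" and len(self.frames) >= self.limit:
            self.discarded += 1     # link can't keep up: drop, stay timely
        else:
            self.frames.append(frame)  # complete mode: the backlog grows instead

    def send_one(self):
        return self.frames.popleft() if self.frames else None

timely = ReturnBuffer("timely")
complete = ReturnBuffer("complete")
for i in range(20):                 # burst of 20 frames, nothing drained yet
    timely.receive(i)
    complete.receive(i)
print(timely.discarded, len(timely.frames))      # 12 8
print(complete.discarded, len(complete.frames))  # 0 20
```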

John

_______________________________________________
Css-csts mailing list
Css-csts at mailman.ccsds.org
http://mailman.ccsds.org/cgi-bin/mailman/listinfo/css-csts

