[Moims-dai] RE: Re:Comments on preservation workflow

david.giaretta at stfc.ac.uk david.giaretta at stfc.ac.uk
Sun Nov 23 19:47:57 UTC 2014


Dear all

A few thoughts/questions. Apologies if these have been discussed previously but some of the comments suggest that there is still room for further clarification. If we can clarify things then we can avoid some iterations.

I assume that what we are aiming for initially is to produce a standardized terminology to identify some important stages in the “data lifecycle”. Once we have the main stages we will then aim to produce standards for each (or at least some) of these.
Is this right?

When we did OAIS we recognized that different communities will have different terms for very similar (or even identical) things – it is therefore impossible to keep every community “happy”. Sometimes it made sense to introduce completely new terms either because to choose an existing one would simply cause confusion, or because it was important to introduce a new, more fine grained, concept.
Am I right in suggesting that if LTDP wants to use some EO terms, that should be fine as long as we can map between terms? Similarly, we might not want to use EO terms because, for example, it would be confusing to others, it would make our work seem too EO-specific, or we think that we need more granularity in our terminology.

What should guide us about which stages should be separated out? Let us assume we can agree on these stages; it is clear that there will be planning, people, repeated (cyclical) activities, etc. in each stage. Do we have to name these explicitly now or can they wait? OAIS uses the concept of recursion extensively to avoid having to design things down to finer and finer details. We can look at the details when we come to the specific detailed standard about a stage.
Am I right in thinking that delaying details has a lot of advantages because then we do not get bogged down in those details and that then comes back to identifying what the really important stages are?

Looking at the various lifecycle models (50+ different ones, e.g. http://blogs.loc.gov/digitalpreservation/2012/02/life-cycle-models-for-digital-stewardship and http://www.ceos.org/images/DSIG/Data%20Lifecycle%20Models%20and%20Concepts%20v13.docx and https://dl.dropboxusercontent.com/u/6959356/ICP/Many%20models.pptx ):
Am I right in thinking that each is trying to provide a checklist of activities and to focus attention on the specific, different aspects that its author thinks are most important? My impression is that the ones with the most detail tend to be collective works with lots of authors.

So what is our overriding concern?
Am I right in thinking that it is about how to make sure that when data is created it can be preserved – and we would want to follow the concepts in OAIS and related standards?

How do we define what separates the stages, bearing in mind that in some instances (archive, discipline, scenario, etc.) a stage may be pretty well non-existent? That is, we do not need every stage to be a big event in every instance; in some instances there may also be additional, very domain-specific stages. It must be a matter of collective judgement/review which stages are sufficiently important. The stages that are not regarded as sufficiently general will no doubt appear as options in some of the more detailed standards. Presumably some stages may occur in parallel, but I guess we would like to avoid making things too complex at this initial level.
Am I right in suggesting that what we are concerned about is where responsibility lies, with a person (or the same person in a different role) or an organisation, so that a new stage begins when the person/role/organisation changes? This would allow us to define the various stages. We then need to make sure that the appropriate information is collected and handed over at each change: enough to create/maintain the appropriate AIPs.
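
To make the responsibility-based idea concrete, here is a minimal sketch in Python. It assumes a lifecycle can be written as an ordered list of activities, each tagged with whoever is responsible for it. The activity names, the responsible parties and the function name are hypothetical illustrations, not terms taken from OAIS, LTDP or any CCSDS document.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Activity:
    name: str
    responsible: str  # person, role or organisation holding responsibility


def split_into_stages(activities):
    """Group consecutive activities that share the same responsible party.

    A new stage starts whenever responsibility changes hands.
    """
    return [
        (owner, [activity.name for activity in group])
        for owner, group in groupby(activities, key=lambda a: a.responsible)
    ]


# Hypothetical timeline, for illustration only.
timeline = [
    Activity("acquire", "mission operations"),
    Activity("calibrate", "mission operations"),
    Activity("consolidate", "data producer"),
    Activity("ingest", "archive"),
    Activity("preserve", "archive"),
]

stages = split_into_stages(timeline)
handovers = len(stages) - 1  # points where information must be handed over

for owner, names in stages:
    print(f"{owner}: {', '.join(names)}")
```

Under this reading, each boundary between two stages is a handover point at which the information needed to create/maintain the AIPs has to be passed on. Parallel stages are deliberately left out of the sketch.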

If we can agree on these then we can probably move forward quite quickly.

..David


From: Boucon Daniele [mailto:Daniele.Boucon at cnes.fr]
Sent: 23 November 2014 10:45
To: moims-dai at mailman.ccsds.org
Subject: [Moims-dai] FYI: Re:Comments on preservation workflow

Hi all,

For your information, please find below the comments on the LTDP preservation workflow sent last Friday, and the answer received from Mirko Albani.

Esther, I’ll complete these comments with your answer this Monday morning.

Best regards,

Daniele

From: Mirko.Albani at esa.int [mailto:Mirko.Albani at esa.int]
Sent: Friday 21 November 2014 22:06
To: Boucon Daniele
Cc: ltdp_wg at esa.int
Subject: Re:Comments on preservation workflow


Dear Daniele,

many thanks for the very valuable comments, which will be duly analysed and addressed next week. As with the LTDP Guidelines, we will reference existing standards like OAIS where needed but will try to keep the terminology in the document as close as possible to the Earth Observation one, to maximise impact and use by EO data providers. The EO preservation workflow document will then be reviewed/approved at WGISS and provided as CEOS input for consideration by CCSDS.

Thanks again,

Mirko

Sent from IBM Notes Traveler

Boucon Daniele --- Comments on preservation workflow ---

From: "Boucon Daniele" <Daniele.Boucon at cnes.fr>
To: ltdp_wg at esa.int
Date: Fri, 21 Nov 2014 17:49
Subject: Comments on preservation workflow

________________________________


Dear all,

Please find below some high level comments:

1 Terminology

·        There are some inconsistencies in the viewgraph and in the document (see also comments in the attached document)
·        Need to compare the terminology and structure for preservation from CASPAR, LTDP and some of the other documents in the bibliography.
·        Concepts should be matched to, or better aligned with, OAIS terminology (AIP, AIU, AIC). Also, section 5 of OAIS addresses software, versions, and editions.
·        Definitions should be simplified, shortened and made to parallel the figures more. This should be done to make better progress on the rest of the document.
·        Does the labeling of ‘archive’ refer to an archive process in some general sense, rather than an actual Archive?


2 Relationships between concepts

·        Need to clearly identify the role of stewardship: global organization, coordination, decision-making, etc. This role is broader than the mission archive concerned by the Dataset/PDSC and the preservation workflow.
·        Consolidation is a mix of acquisition (e.g. gap filling), preservation and evolution (e.g. transformation through reprocessing) activities, not only preservation.
·        Preservation acts on Consolidated Data. It is therefore not clear that Preservation should be included in Curation. Should we then consider a “Curation workflow”?
·        Value adding: included in Stewardship or Curation? What’s the difference between “Data Record improvement” and “Value adding” in figure 1?

3 Steps and content

·        It should be clarified at what point the workflow begins, i.e. after a decision has been adopted at management level (the stewardship level). Note: this is the first step of PAIMAS (feasibility study before preliminary decision).
·        Items in the graphic don't align with the topics in the workflow chart, except for consolidation: more correspondence is required.
·        Some segments of the chart correspond to existing CCSDS standards (initialization = PAIMAS, implementation = PAIS, operations = OAIS?). That should be brought out somewhere, and the terminology and activities from those standards incorporated into the LTDP workflow.
·        Need for an ending point in the workflow (topic of retirement): after the preservation period (defined at the initialization phase), this “end point” could be a decision point (keep maintaining the data set for a time period to be defined, delete part/all of the data set, or transfer the data set).
·        How and where are the hardware and software needed to access the data handled?
·        The LTDP workflow covers initial acquisition, consolidation and preservation activities. While there is the opportunity to retrigger the preservation workflow in the operation phase, this aspect could be better developed. See for example the attached figure, a diagram dedicated to space missions in phase E and after, where stewardship/management appears in the box at the top, where the “cataloging activity” for all space missions, at the bottom, could be seen as value adding, and where different actions can trigger the process (initial, evolution, review).

Don’t hesitate to ask questions.

Best regards,

Daniele

PS: these comments are gathered and organized from CCSDS discussions and comments.




[CNESdiagram.png]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 39794 bytes
Desc: image001.png
URL: <http://mailman.ccsds.org/pipermail/moims-dai/attachments/20141123/24ed2825/attachment.png>

