[Moims-dai] Minutes of today's telecon (20th February)

Boucon Daniele Daniele.Boucon at cnes.fr
Fri Feb 20 17:59:03 UTC 2015

Hi all,

Please find below the minutes of today's telecon. Feel free to correct or comment, particularly on the ICP part (my notes are quite sparse towards the end).

Next telecons:
1. Friday 13th March or Friday 20th March (15h EU time, 9h US time-Washington) -> TO BE CONFIRMED by Daniele
2. CCSDS meeting from 23rd to 27th March, Caltech

Agenda for next telecon (2 different parts, 1 hour each):
1st hour: (15-16h EU time, 9-10h US time)
* METS test case
* Corot test case
* Core GB
2nd hour: (16-17h EU time, 10-11h US time)

If available, the proposal is to use Dave's new number for the audio:
Dave Williams
Telecon information
+1-844-467-6272      USA Toll Free
+1-720-259-6462      Others
Passcode:    841727

Best regards,


20th February meeting minutes

DB: Daniele Boucon
DG: David Giaretta
DS: Don Sawyer
EC: Esther Conway
JG: John Garrett
MM: Mike Martin (NASA)
RD: Robert Downs
SM: Stephane Mbaye
SR: Stephane Reecht
BC: Bertrand Caron (BnF)
PT: Jean-Philippe Tramoni (BnF)
WG: all

D=Decision, A=action (other = discussion)

1 METS test case

SR sent an email with the updated METS test case description attached to the DAI mailing list on 17th February. Unfortunately this email was not received (except by Daniele, who was in copy), because of the recurrent trouble with attachments sent to the mailing list.

Daniele is OK with the new version.
The discussion is postponed; comments will be sent next week.

A=>(JG): ask the CCSDS technical point of contact for a solution to the systematic trouble with attached files on the DAI mailing list (emails not received by members).

A=>(WG): send comments on the updated METS test case description -> next week.

2 Corot test case

See the email of 5th February from Daniele and today's email from Don.

Review of comments included in the document.

A=>(DS): update the Corot test case for the introduction, definitions and other personal comments.
A=>(MM): update the "benefits", and section
A=>(DB): analyze remaining comments and propose explanation text when useful.

3 Core PAIS GB

See email from Don today (clarification on nested groups with multiple occurrences).
A=>(WG): send comments on Don's proposal for the core GB (nested groups with multiple occurrences).

4 Information Curation Process (ICP)

See the framework sent by John, and the emails of today and yesterday from MM, JG, EC, DG.

Purpose and scope:
OK with MM's proposal, except for the point on test cases: they should be kept out of the document itself (including them would not conform to the CCSDS requirements), even though it is agreed that test cases will be used to test the standard.

Other main areas of interest: there is probably no need to develop them in detail, but it is worth mentioning the terms, at least to better attract the reader. It is decided to add one sentence containing the main terms of interest, without giving details (high-level point of view).
Comment from DS after the telecon: the term 'Data Management Plan' is perhaps overloaded. Another term?

Use of 'preservation', 'utilization', 'curation' and 'stewardship': the hierarchical point of view is OK, following the LTDP document. It is not certain that using all four terms is useful. This needs more discussion and definition.

'Process' versus 'framework':
BD: a process is a sequence of activities, while a framework is made up of elements that are part of a larger container and not necessarily in sequence. Framework seems to fit here.
EC: prefers framework.
DB: not completely convinced at the moment, as we have identified main steps.
More discussion is needed on this point.

Proposal stage: the "archive" is viewed as the Long Term Archive, and it is very important that it be involved at this stage.

Other comments from DG: the handover from one stage to the next, and which deliverables are required, should be discussed.

A=>(WG): send comments on the proposals for the ICP main steps sent by MM, EC, JG.
A=>(JG): update the framework with the purpose and scope, add one sentence on areas of interest, and add the stages sent by EC, JG, MM.
A=>(DB): ask LTDP for an editable figure 1 -Preservation Workflow- (Links between stewardship/Curation/Preservation).

Other points

DEDSL XML standard: work started, draft version with XML schema planned for mid March.

Caltech agenda: BD only available on Monday.

A=>(all): send availability (and time window) for the Caltech meeting -> next week.

Actions review
Updated actions will be sent separately by JG.

End of 20th February meeting minutes
Excerpts from earlier minutes, kept as a reminder

2.2 Core document

Section 4.1.1 reviewed (ok)
The tabular representation is welcome, but a question remains about its systematic use (is an XML version needed in an annex?)

CCSD0014: equivalent to TO Descriptor, and CCSD0015: equivalent to Collection Descriptor

==> Action Stephane from MM: explain CCSD0015 in more detail: how it is registered, with a reference to where more information on the subject can be found

Don: descriptorModelID has to be changed on any XSD change (specialization), and the Archive has to maintain the versions
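
One way to picture this versioning rule is a small registry kept on the Archive side. This is only a sketch: the class, its methods and the sample identifiers are hypothetical, and it assumes that a change to an XSD can be detected by comparing content hashes.

```python
import hashlib
from typing import Dict, List


class DescriptorModelRegistry:
    """Hypothetical Archive-side record of descriptorModelID -> XSD fingerprint."""

    def __init__(self) -> None:
        self._fingerprints: Dict[str, str] = {}

    def register(self, descriptor_model_id: str, xsd_text: str) -> None:
        """Record a schema version; reject a changed XSD under an existing ID."""
        fingerprint = hashlib.sha256(xsd_text.encode("utf-8")).hexdigest()
        known = self._fingerprints.get(descriptor_model_id)
        if known is not None and known != fingerprint:
            raise ValueError(
                "XSD changed: a new descriptorModelID is needed "
                f"instead of {descriptor_model_id!r}"
            )
        self._fingerprints[descriptor_model_id] = fingerprint

    def versions(self) -> List[str]:
        """Return the registered IDs in registration order."""
        return list(self._fingerprints)


# Illustrative identifiers and schema texts only.
registry = DescriptorModelRegistry()
registry.register("model-v1", "<xs:schema/>")
registry.register("model-v2", "<xs:schema><!-- specialized --></xs:schema>")
```

Re-registering identical XSD text under the same ID is accepted; a modified XSD under an existing ID raises an error, which mirrors the rule that any change requires a new descriptorModelID.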

Section 4.6.1 about PAIS XSD description should remain here for the time being

« open » enumeration technique is cumbersome (TBC)

Comment from previous telecons on method
* (Daniele): there are steps that conform to the PAIMAS process (first model, SIP constraints, then transfer and validation).
* Link between the data base on the Archive side, and the PAIS XML elements: example on how to match both (core document, test case?).

The following paragraph will be removed if agreed:
XML namespace for PAIS

It is agreed that not every position where a pais:any element is possible has to be documented in the GB. Only a few examples are necessary, or even just one.

A more concrete example should be provided than the abstract « foo »/« bar » currently proposed. A typical example would be a Collection holding a pais:any with the author of the descriptor, the Collection name/ID in the Producer's semantics, or anything else that is specific to the Producer or the Archive side but not provided in the PAIS definitions.

SM: recalled that « true » XML Schema restrictions guarantee that the rules of the original PAIS XML Schema still apply. Any instance that conforms to a restriction also conforms to the original schema.

SM: The use of restrictions does not oblige any system to use the derived PAIS schemas. Therefore, the restricted schemas do not have to be shared with the users of the produced SIPs. The project-specific schemas could even be discarded without losing control of the produced SIPs.
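
The restriction argument can be illustrated with a minimal sketch, assuming that only occurrence bounds (minOccurs/maxOccurs) are restricted. The function names and the sample bounds are hypothetical and are not taken from the PAIS schemas; the sketch only shows that a range narrowed inside the original range accepts nothing that the original would reject.

```python
from typing import Optional

# None stands for maxOccurs="unbounded".
Bound = Optional[int]


def is_true_restriction(base_min: int, base_max: Bound,
                        new_min: int, new_max: Bound) -> bool:
    """Return True if [new_min, new_max] lies inside [base_min, base_max]."""
    if new_min < base_min:
        return False
    if base_max is None:          # base is unbounded: any upper bound fits
        return True
    return new_max is not None and new_max <= base_max


def accepts(min_occurs: int, max_occurs: Bound, count: int) -> bool:
    """Return True if `count` occurrences satisfy the occurrence bounds."""
    return count >= min_occurs and (max_occurs is None or count <= max_occurs)


# Hypothetical example: the base schema allows 0..unbounded occurrences,
# the project-specific restriction narrows this to 1..5.
BASE = (0, None)
RESTRICTED = (1, 5)

restriction_ok = is_true_restriction(*BASE, *RESTRICTED)

# Every count accepted by the restriction is also accepted by the base.
still_valid = all(
    accepts(*BASE, n) for n in range(0, 20) if accepts(*RESTRICTED, n)
)
```

Because the check only depends on the base bounds, a consumer that knows the original schema alone can still validate instances produced under the restricted one.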

=> D - (DS) The rightmost column of the table in §4.6.4 shall be renamed « Restriction » instead of « Content »

MM: It is not clear whether this table should be kept in this form or discarded in the end, but the document should not spend too many pages on this topic.

WG: restrictions may be interesting for implementers and as such should be documented, but it is not clear if this should be proposed as a recommendation or a best practice.

SM: recalled that restricting values such as maxOccurs should be a recommended practice, since it can be very difficult to implement interoperable software exchanging elements of xs:nonNegativeInteger type.
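
A minimal sketch of this interoperability concern, assuming it stems from xs:nonNegativeInteger having no upper bound while many implementations store such values in fixed-width integers. The helper name and the 32-bit limit are illustrative assumptions, not part of the PAIS documents.

```python
INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer


def fits_int32(lexical: str) -> bool:
    """Return True if a non-negative integer literal fits in a signed 32-bit integer."""
    value = int(lexical)   # Python integers are arbitrary precision
    if value < 0:
        raise ValueError("not a non-negative integer: " + lexical)
    return value <= INT32_MAX


small = fits_int32("5")              # a small, restricted bound such as 5
large = fits_int32("99999999999")    # schema-valid, but overflows 32 bits
```

A restriction that caps such values keeps every exchanged number within a range that all participating implementations can represent.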

SM: proposed adding prepared templates of restricting XML Schemas in an annex, to help implementers quickly set up the restrictions they need.
WG: adding XML Schemas in an annex may not be so helpful, because cutting and pasting from a PDF may be very cumbersome.

=> D - (JG) these XSDs shall be placed alongside the originals.

SM: The problem is similar for other GB resources, such as the use case descriptors or the software prototypes.


Preservation process

DB: LOTAR representative is ok with "preservation", but not with "curation" (this word is not used in the community).

To be done: analyse and suggest, at the different stages of the project (during the data lifecycle), what should be done from an archiving point of view.

Example of main issue: keep documentation up to date (when changes in formats, processing, ... are made on the data).

Daniele explains her point of view: develop a "model" (Magenta Book) gathering all the basic components of the preservation activity (selection and appraisal, data and metadata preparation, access, maintenance, ...) that covers the whole data lifecycle (even before the data exist), in order to answer the following question: what should be done, from beginning to end, to be able to preserve data, and at what point? This should be done in a generic way, linking to the standards related to the basic components where they exist.

Suggestion to ask Barbara Sierman about existing work on this topic. Mail sent on 7/02 by Daniele.

Daniele underlines the need for CNES and the LTDP group. CCSDS expertise will be very important.

The PDS has its own internal process; for other agencies (particularly the LTDP member agencies) no such process exists, and one is required.

Question on how to make the link between a global process and the PAIMAS phases.

20131030 Don's email and following discussions:
Don: The process could be focussed on the Archive point of view and seen as an internal OAIS workflow issue, which would then use more OAIS concepts.
In practice, "provenance" (going back to the original information) is a big issue.
"Reprocessing, curation, stewardship" could be mapped to OAIS migrations, new versions, ...

Update procedures exist at NSSDC.

Daniele: appraisal should be the starting point. The workflow, on the Archive side, could begin early in a space project, even when the data do not yet exist; link with the Archive side of the PAIMAS.

Other subjects

OAIS Magenta Book (French version)

Complete validation is to be performed now (text and figures).
This version will also be validated by the French National Archives.
Due to the amount of work on the PAIS, and the priority given to the PAIS green book, this will be addressed later.

