[Moims-dai] 4th December minutes
Daniele.Boucon at cnes.fr
Mon Dec 8 17:04:08 UTC 2014
Please find below the minutes of the 4th December telecon (sorry for the delay, I was not available last Friday). Feel free to comment or correct.
Agenda for next telecon:
* Corot test case
* Core GB
* METS test case
* ICP high level structure
Next telecons:
* Thursday 11th December with BnF (dedicated to the METS practical example)
* Friday 9th January (15h EU time, 14h UK, 9h US)
If available, proposal to use Dave's new number:
+1-844-467-6272 USA Toll Free
Dave's old number ("in case"):
+1-877-954-3555 USA Toll Free
4th December meeting minutes
DB: Daniele Boucon
DG: David Giaretta
DS: Don Sawyer
EC: Esther Conway
JG: John Garrett
MM: Mike Martin
RD: Robert Downs
SM: Stephane Mbaye
SR: Stephane Reecht
D=Decision, A=action (other = discussion)
1 London meeting: minutes
Minutes validated by the group; they will be sent by John after the telecon.
2 Information Curation Process (ICP)
The LTDP preservation workflow is in the "CEOS" format and will be reviewed by CEOS/WGISS according to the following planning:
On 8th December, DB asked the LTDP group for the version updated according to the comments sent on 21st November.
Discussion: ICP is made up of activities or groupings of activities, divided into levels and sub-levels. Risk management is one of them. ICP should include a kind of roadmap, pointing out the stages and where responsibility is handed over, rather than describing each activity in detail.
Where OAIS applies should be identified.
ICP should apply from the simplest case, where individuals create the data, to complex cases.
A=>-(EC): send a summary model, based on the slides presented during the London meeting (due at the beginning of week 50)
A=>-(WG): send comments on the email sent by David on 23rd November (by the next telecon, 9th January, at the latest).
3 PAIS GB Core
New version sent on 3rd December for review and discussion. Updates presented by SM.
SM: specialization of the schemas. The "redefine" mechanism cannot be used in our case. The only way to specialize the schemas is therefore to copy them and make the changes needed for a particular context (as with XFDU). Conformity to the original schemas can then be checked on the produced XML.
This should be explained simply in the GB, with examples of specialization in an annex.
A=>-(SM): send the "specialization kit" to the group, to be included in the annex of the GB.
A=>-(SM): provide a plan for completing the GB: 1. completion of all topics to be dealt with, and 2. identification of sections that can be fully written by SM and/or the group.
4 PAIS Corot Practical example
New version sent on 3rd December for review and discussion. Updates presented by SM (harmonization of terms in the text and in the implementation).
Section 220.127.116.11 to be continued.
A=>-(SM): send the updated test case implementation to DB for testing with the CNES prototype, and update the repository.
A=>-(WG): comment on the Corot practical example.
5 PAIS METS Practical example
Discussion to prepare the 11th December telecon.
Analysis of the email sent by DS on 21st November.
Question from BnF: how to model the PREMIS metadata included in the METS Manifest?
DB: there are different parts of PREMIS metadata, describing the process used to pre-ingest the issue into SPAR (BnF archival system).
One solution is to create a "Data Object Type" associated with the main data in the MOT. Another solution could be to use an extension point in the SIP to include this part. In that case, there would be no relationship or associated ID in the MOT.
SM: The PAIS "extra elements" have been treated as administrative metadata. Another solution could have been to use a METS extension?
DS: what does the Producer want to send, what does the Archive require? In this practical case, the Producer = the Archive. If the PREMIS metadata is a requirement placed by the Archive on the Producer, then it could make sense to use a SIP Model extension.
=>This should be clarified next Thursday.
DS: it should be clarified how the mapping is made in general, at the abstract level (not only how the instantiation in the example has been done).
The mechanism of containers should appear in the EXCEL file.
A=>-(DS): complete the EXCEL file so that it is closer to the Abstract SIP Model view, including containers (Monday); to be sent by DB to BnF before Thursday.
Updated actions will be sent separately by JG.
End of 4th December meeting minutes
Extract of historical minutes, as a reminder (slightly updated on 8th December)
2. GREEN BOOK
2.2 Core document
Section 4.1.1 reviewed (ok)
Tabular representations are welcome, but a question remains about their systematic use (is an XML version needed in an annex?)
CCSD0014: equivalent to TO Descriptor, and CCSD0015: equivalent to Collection Descriptor
==> Action Stephane from MM: explain CCSD0015 in more detail: how it is registered, with a reference to where more information can be found on that subject
Don: descriptorModelID has to be changed on any XSD change (specialization); the Archive has to maintain the versions
Section 4.6.1 about PAIS XSD description should remain here for the time being
« open » enumeration technique is cumbersome (TBC)
Comment from previous telecons on method
* (Daniele): there are steps that conform to the PAIMAS process (first the model and SIP constraints, then transfer and validation).
* Link between the database on the Archive side and the PAIS XML elements: example of how to match both (core document, test case?).
The following paragraph will be removed if ok:
XML namespace for PAIS
It is agreed that not all positions where a pais:any element is possible have to be documented in the GB. Only a few examples are necessary, or even one.
More concrete examples should be provided than the abstract « foo »/« bar » currently proposed. A typical example would be a Collection holding a pais:any with the author of the descriptor, the Collection name/ID in the Producer semantics, or anything else that could be specific to the Producer or the Archive side but is not provided in the PAIS definitions.
SM: recalled that « true » XML Schema restrictions guarantee that the rules of the original PAIS XML Schema still apply: any instance that conforms to a restriction also conforms to the original schema.
SM: The use of restrictions does not oblige any system to use the derived PAIS schemas. The restricted schemas therefore do not have to be shared with users of the produced SIPs. The project-specific schemas could even be discarded without losing control of the produced SIPs.
=> D - (DS) The rightmost column of the table in §4.6.4 shall be renamed « Restriction » instead of « Content »
MM: It is not clear whether this table should be kept in that form or discarded in the end, but the aim should be not to spend too many pages on that topic.
WG: restrictions may be interesting for implementers and as such should be documented, but it is not clear whether this should be proposed as a recommendation or as a best practice.
SM: recalled that restricting items such as maxOccurs should be a recommended practice, since it can be very difficult to implement interoperable software exchanging elements of xs:nonNegativeInteger type.
SM: proposed to add prepared templates of restricting XML Schemas in an annex, to help implementers quickly set up the restrictions they need.
WG: adding XML Schemas in an annex may not be so helpful, because cutting and pasting from a PDF may be very cumbersome.
=> D - (JG) these XSD shall be placed alongside the originals.
SM: The problem is similar for other GB resources, such as the use case descriptors or the software prototypes.
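As an illustration of the occurrence restrictions discussed above, here is a minimal sketch, assuming the usual XML Schema rule that a restricted occurrence range must lie inside the original one (minOccurs not lower, maxOccurs not higher, "unbounded" counting as infinity). The function name and the use of None for "unbounded" are assumptions made for this sketch; none of it comes from the PAIS schemas or the GB.

```python
from typing import Optional

# Assumption for this sketch: None stands for maxOccurs="unbounded".
UNBOUNDED = None


def is_valid_occurrence_restriction(
    base_min: int,
    base_max: Optional[int],
    derived_min: int,
    derived_max: Optional[int],
) -> bool:
    """Return True if the derived (minOccurs, maxOccurs) range lies
    inside the base range, so that any instance accepted by the
    derived range is also accepted by the base range."""
    # The derived range must itself be consistent.
    if derived_max is not UNBOUNDED and derived_min > derived_max:
        return False
    # The lower bound may only be raised, never lowered.
    if derived_min < base_min:
        return False
    # An unbounded base accepts any derived upper bound.
    if base_max is UNBOUNDED:
        return True
    # A bounded base cannot be restricted to "unbounded".
    if derived_max is UNBOUNDED:
        return False
    # The upper bound may only be lowered, never raised.
    return derived_max <= base_max
```

For example, replacing an unbounded maxOccurs with a fixed limit of 10 passes this check, while lowering minOccurs from 1 to 0 does not.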
DB: LOTAR representative is ok with "preservation", but not with "curation" (this word is not used in the community).
To be done: analyse and suggest, at different stages of the project (during the data lifecycle), what should be done from an archiving point of view.
Example of main issue: keep documentation up to date (when changes in formats, processing, ... are made on the data).
Daniele explains her point of view: develop a "model" (magenta book) gathering all the basic components of the preservation activity (selection and appraisal, data and metadata preparation, access, maintenance ...) that should cover the whole data lifecycle (even when data don't exist yet), in order to be able to answer the following question: what should be done from beginning to end to be able to preserve data, and at what moment? This should be done in a generic way, linking to standards related to the basic components when they exist.
Suggestion to ask Barbara Sierman about existing work on this topic. Mail sent on 7/02 by Daniele.
Daniele underlines that CNES and the LTDP group need this. CCSDS expertise will be very important.
The PDS has its internal process; for other agencies (particularly the LTDP member agencies) this process doesn't exist and is required.
Question on how to make the link between a global process and the PAIMAS phases.
20131030 Don's email and following discussions:
Don: The process could be focussed on the Archive point of view, and seen as an internal OAIS workflow issue, thus using more OAIS concepts.
In practice, "provenance" is a big issue, going back to the original information.
"Reprocessing, curation, stewardship" could be mapped to OAIS migrations, new versions, ...
Update procedures exist at NSSDC.
Daniele: appraisal should be the starting point. Workflow, on the Archive side, could begin early in a space project, even when data don't yet exist; link with the Archive side of the PAIMAS.
=> action all: exchange on high level process (3 main steps preparation/preservation/maintenance) and links with PAIMAS phases for return to the LTDP group.
OAIS Magenta Book (French version)
Complete validation to be performed now (text and figures)
This version will also be validated by the French National Archives.
Due to the amount of work on the PAIS, and the priority given to the PAIS green book, this will be dealt with afterwards.
XML schema for DEDSL
It seems possible for CNES to write the document on the model of the other existing DEDSL standards. At CNES, the XML schema for DEDSL is already created and implemented.
Prototypes: CNES has already tested it on an operational tool. This could play the role of a prototype, and could be enough.
Nestor explained to John that a 2nd prototype is required.
A possibility could be to produce not a blue book, but a document that does not require 2 prototypes (orange book). To be discussed further.