RE [Moims-dai] Minutes of 20th March DAI telecon

bertrand.caron at bnf.fr bertrand.caron at bnf.fr
Mon Mar 23 13:32:40 UTC 2015


Hi Danièle,

It occurs to me (maybe a bit late) that the METS Editorial board could be 
interested in reviewing the METS test case. Could I pass the document on 
to them? Or do you prefer to wait until it is validated by the group?

All the best,

Bertrand Caron
Département Information bibliographique et numérique
Bibliothèque nationale de France
Quai François Mauriac
75706 Paris Cedex 13
01 53 79 42 23
bertrand.caron at bnf.fr








Message from: Boucon Daniele <Daniele.Boucon at cnes.fr>
                      21/03/2015 00:06

Sent by:
moims-dai-bounces at mailman.ccsds.org

Please reply to MOIMS-Data Archive Ingestion
<moims-dai at mailman.ccsds.org>

To
MOIMS DAI List <moims-dai at mailman.ccsds.org>
Cc
Donald & Carolann Sawyer <Sawyer at acm.org>, "execdir at codata.org"
<execdir at codata.org>, "Sarah Jones (HATII)"
<Sarah.Jones at glasgow.ac.uk>, "joy.davidson at glasgow.ac.uk"
<joy.davidson at glasgow.ac.uk>
Subject
[Moims-dai] Minutes of 20th March DAI telecon



Hi all,
 
Please find below the minutes of today's telecon. Feel free to correct or 
comment.
 
Next telecons: 
 
1. CCSDS meeting from 23rd to 27th March, Caltech
 
2. Friday 24th April (15h EU time, 9h US time-Washington)
To share screen:

https://webconference.cnes.fr/ccsds_24_april/
=>Agenda (2 different parts, 1 hour each): 
1st hour: (15-16h EU time, 9-10h US time)
* METS test case (if necessary)
* Core GB 
2nd hour: (16-17h EU time, 10-11h US time)
* ICP: generic description for stages, different scenarios
 
If available, proposal to use Dave's new number for the audio:
Dave Williams
Telecon information 
+1-844-467-6272      USA Toll Free
+1-720-259-6462      Others
Passcode:    841727

 
Best regards,
 
Daniele
 
__________________________________________________________________________
20th March meeting minutes
 
DB: Daniele Boucon 
DG: David Giaretta 
DS: Don Sawyer 
ES: Esther Conway 
JG: John Garrett 
MM: Mike Martin (NASA)
RD: Robert Downs 
SM: Stephane Mbaye
SR: Stephane Reecht
BC: Bertrand Caron (BnF)
PT: Jean-Philippe Tramoni (BnF)
WG: all
 
D=Decision, A=action (other = discussion)
 
1 METS test case 
 
1.1 Review of the BnF document:
Validation of updates by all.
Comment JGG1: Should we add a sentence or two to explain the motivation 
for using METS instead of XFDU? => Add a sentence explaining that it is 
more widely used in the library domain.
Comment JGG2: Are your SLAs executable or are they documents? Or 
processes? => Answer: yes, explain.
Comment BC3: We allow replacement (edition) or update (new version) of an 
existing SIP. In this case, the new SIP would include two TO groups 
instead of one.
Addition JGG: Will previous editions or previous versions remain in the 
repository? Are OCR and epub both allowed in the "ocr" group? The MoT 
may need to be adjusted to allow for some of these options.
=> Answer: every version of the packages is preserved; add clarification. 
Replace "completes" with "updated".
Format/Mime type: only one allowed per Data Object.
DS: if multiple formats need to be allowed, the PAIS 
descriptor extension point could be used (interesting to test).
BC: is there a possibility to indicate a date of validity?
DS: no, but it could also be defined through the PAIS descriptor 
extension point.
1.2 Review of the spreadsheet
Discussion on the terminology used at BnF to define "group". To avoid 
ambiguity, speak for example of "METS element" and "BnF specific group".
Add a sentence in the text explaining that XFDU and METS have a similar 
high-level structure.
 
A=>(BnF): update the document according to the 20th March telecon (1.1 and 
1.2).
  
2 Core PAIS GB
New updates sent by DS, but not enough time for discussion on this 
subject.
There is still a lot to do on sections 4 and 5. 
The objective for next week is to work on this part, analyze the "to be 
done" items, and identify the priorities.
 
4 Information Curation Process (ICP)
MM sent a proposal for a general paragraph to introduce the 4 stages -> to 
be discussed further.
DG sent a simplified version of the synthetic spreadsheet for stages.
 
Roadmap to be clarified (what the intention is).
 
Identification of a generic skeleton, and of specific elements. In each 
stage, which topics are very important for preservation.
 
MM: there is no standard for Data Management Plan.
 
DG: collaboration between RDA and CCSDS.
*RDA: requirements for Data Management Plan. Good visibility, NO 
publication process
*CCSDS: publication process, NO good visibility.
 
Confirmation of a need for detailed use cases with supporting text.
 
Decision: 3 spreadsheets: a generic one, a copy adapted for the library 
test case, and a copy adapted for EO data-LTDP (space mission).
 
Discussion on Data Management/Preservation/Curation and adding value. RDA 
proposes a diagram with 2 sides: support and tools.
RD: providing access and data format could be a short term issue.
 
DB: in the spreadsheet, the "consumer" does not appear as an actor. The 
consumer provides needs and requirements for data access, services, ... and 
checks the usability of the data. "Funder" and "Sponsor" also appear.
RD: a sponsor has a wider role than the funder. 
 
Objective for next week -> work on the spreadsheet.
 
Other points
 
CNES PAIS prototype: installation highly simplified (PostgreSQL replaced 
by an integrated Derby database). Tests done by DB, with review of test 
cases. New version will be sent today to JG and MM. 
 
Trouble noted with ISEE and Corot data. In all cases, data have been 
changed and/or cut for demonstration (they don't correspond to the original 
ones).
 
A=>(DB): add a sentence at the beginning of each test case explaining that 
data have been modified and cut (they don't correspond to the original ones).
 
Actions review
Please send to JG updates in actions. 
Updated actions will be sent separately by JG.
 
End of 20th March meeting minutes
___________________________________________________________________________________________________________
Part of historical minutes, kept as a reminder
__________________________________________________________________________________________________________
 2. GREEN BOOK
 
        2.2 Core document
 
Section 4.1.1 reviewed (ok)
Tabular representations are welcome, but a question remains about their 
systematic use (is an XML version needed in an annex?)
 
CCSD0014: equivalent to TO Descriptor, and CCSD0015: equivalent to 
Collection Descriptor
 
==> Action Stephane from MM: explain CCSD0015 in more detail: how it is 
registered, and reference where more information can be found on that 
subject
 
Don:  descriptorModelID has to be changed on any XSD change 
(specialization), the Archive has to maintain the versions
 
Section 4.6.1 about PAIS XSD description should remain here for the time 
being
 
 
« open » enumeration technique is cumbersome (TBC)
 
Comment from previous telecons on method 
* (Daniele): there are steps that conform to the PAIMAS process (first 
model, SIP constraints, then transfer and validation).
* Link between the data base on the Archive side, and the PAIS XML 
elements: example on how to match both (core document, test case?).
 
The following paragraph will be removed if OK:
XML namespace for PAIS 
    
 
It is agreed that not every position where a pais:any element is possible 
has to be documented in the GB. Only a few examples are necessary, or even 
one.
 
A more concrete example should be provided than the abstract « foo »/« bar » 
currently proposed. A typical example would be a Collection holding a 
pais:any with the author of the descriptor, the Collection name/ID in the 
Producer's semantics, or anything else that could be specific to the Producer 
or the Archive side but not provided in the PAIS definitions.
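 
As a rough illustration of the kind of producer-specific extension discussed 
above, the Python sketch below builds a small descriptor fragment carrying 
extra elements in a separate namespace. All namespaces and element names 
(the placeholder PAIS namespace, collectionDescriptor, descriptorAuthor, 
producerCollectionId) are hypothetical and are not taken from the PAIS 
schema; the real positions where pais:any is allowed are defined by the 
PAIS XSD.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces and element names, for illustration only.
# They are NOT taken from the PAIS schema.
PAIS_NS = "urn:example:pais-placeholder"
PRODUCER_NS = "urn:example:producer-extension"


def build_collection_with_extension(author: str, producer_collection_id: str) -> ET.Element:
    """Build a descriptor fragment with producer-specific extension elements."""
    collection = ET.Element(f"{{{PAIS_NS}}}collectionDescriptor")
    # Producer-specific content, placed where a wildcard slot is assumed to exist.
    ext_author = ET.SubElement(collection, f"{{{PRODUCER_NS}}}descriptorAuthor")
    ext_author.text = author
    ext_id = ET.SubElement(collection, f"{{{PRODUCER_NS}}}producerCollectionId")
    ext_id.text = producer_collection_id
    return collection


def extension_children(element: ET.Element, core_ns: str) -> list:
    """Return the direct children that do not belong to the core namespace."""
    prefix = f"{{{core_ns}}}"
    return [child for child in element if not child.tag.startswith(prefix)]


collection = build_collection_with_extension("Jane Doe", "COLL-0001")
print(ET.tostring(collection, encoding="unicode"))
```

Keeping the extension in its own namespace lets the Archive side tell 
producer-specific content apart from the core elements, as 
extension_children does here.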
 
SM: Reminded that « true » restrictions in XML Schema guarantee that the 
original PAIS XML Schema's rules are still applicable. Any instance 
conforming to a restriction also conforms to the original schema.
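 
The subset property stated above can be sketched for the simplest case, 
occurrence bounds (minOccurs/maxOccurs). The Python model below is only an 
illustration under assumptions: the Occurs class and the example bounds are 
hypothetical, and it does not parse or validate any real XSD. It shows that 
when a derived range lies inside the base range, every element count 
accepted by the derived rule is also accepted by the base rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Occurs:
    """Simplified model of an XSD occurrence constraint."""
    min_occurs: int
    max_occurs: Optional[int]  # None stands for "unbounded"

    def allows(self, count: int) -> bool:
        """True if an element appearing `count` times satisfies the constraint."""
        if count < self.min_occurs:
            return False
        return self.max_occurs is None or count <= self.max_occurs


def is_true_restriction(base: Occurs, derived: Occurs) -> bool:
    """True if the derived range lies entirely inside the base range."""
    if derived.min_occurs < base.min_occurs:
        return False
    if base.max_occurs is None:
        return True  # any upper bound fits under "unbounded"
    return derived.max_occurs is not None and derived.max_occurs <= base.max_occurs


# Hypothetical example: a base rule of 0..unbounded narrowed by a project to 1..10.
base = Occurs(0, None)
project = Occurs(1, 10)

print(is_true_restriction(base, project))  # narrowing is a restriction
print(is_true_restriction(project, base))  # widening is not
```

Replacing an unbounded maximum with a finite one, as in this example, is 
also the kind of narrowing that gives implementers a concrete limit to 
design against.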
 
SM: The use of restrictions does not oblige any system to use the derived 
PAIS schemas. Therefore, the restricted schemas do not have to be shared 
with the users of the produced SIPs. The project-specific schemas could even 
be discarded without losing control of the produced SIPs.
 
=> D - (DS) The rightmost column of the table in §4.6.4 shall be renamed 
« Restriction » instead of « Content »
 
MM: It is not clear whether this table should be kept in that form or 
discarded in the end, but the aim should be not to spend too many pages on 
that topic.
 
WG: restrictions may be interesting for implementers and as such should be 
documented, but it is not clear if this should be proposed as a 
recommendation or a best practice.
 
SM: reminded that restricting items such as maxOccurs should 
be a recommended practice, since it can be very difficult to implement 
interoperable software exchanging elements of xs:nonNegativeInteger type.
 
SM: proposed to add prepared templates of restricting XML Schemas in an 
annex, something that could help implementers quickly set up the 
restrictions they need.
WG: adding XML Schemas in an annex may not be so helpful, because cutting 
and pasting from PDF may be very cumbersome.
 
=> D - (JG) these XSDs shall be placed alongside the originals.
 
SM: The problem is similar for other GB resources, such as the use case 
descriptors or the software prototypes.
 
___________________________________________________________________________________________________________
 
Preservation process 
 
DB: LOTAR representative is ok with "preservation", but not with 
"curation" (this word is not used in the community).
 
To be done: analyse and suggest, at different stages of the project 
(during the data lifecycle), what should be done from an archiving point of 
view.
 
Example of a main issue: keep documentation up to date (when changes in 
formats, processing, ... are made to the data).
 
Daniele explains her point of view: develop a "model" (magenta book) 
gathering all the basic components of the preservation activity (selection 
and appraisal, data and metadata preparation, access, maintenance, ...) that 
should cover the whole data lifecycle (even when data don't exist yet), in 
order to be able to answer the following question: what should be done from 
the beginning to the end to be able to preserve data, and at what moment? 
This should be done in a generic way, linking to standards related to the 
basic components when they exist.
 
Suggestion to ask Barbara Sierman about existing work on this topic. 
Mail sent on 7/02 by Daniele.
  
Daniele underlines the need for CNES and LTDP group. CCSDS expertise will 
be very important.
 
The PDS has its internal process; for other agencies (particularly the 
LTDP member agencies) this process doesn't exist and is required.
 
Question on how to make the link between a global process and the PAIMAS 
phases.
 
20131030 Don's email and following discussions:
Don: The process could be focussed on the Archive point of view, and seen 
as an internal OAIS issue for workflow, then using more OAIS concepts.
"Provenance" is a big issue in practice, going back to the original 
information.
"Reprocessing, curation, stewardship" could be mapped to OAIS migrations, 
new versions, ...
 
Update procedures exist at NSSDC.
 
Daniele: appraisal should be the starting point. Workflow, on the Archive 
side, could begin early in a space project, even when data don't yet exist; 
link with the Archive side of the PAIMAS.
 
___________________________________________________________________________________________________________
Other subjects 

OAIS Magenta Book (French version) 
 
Complete validation to be performed now (text and figures)
This version will also be validated by the French National Archives.
Due to the amount of work on the PAIS, and the priority given to the PAIS 
green book, this will be dealt with afterwards.
    
____________________________________________________________________________________________
 
 
 
 
 


