[Moims-dai] Action Items (9th October)

John Garrett garrett at his.com
Tue Oct 28 17:21:26 UTC 2014


Hi,

 

Sorry, I don’t think the action item list made it out while I was traveling.

I’m resending it now in preparation for this week’s telecon.

 

Peace and Prosperity,

-John

 

From: moims-dai-bounces at mailman.ccsds.org
[mailto:moims-dai-bounces at mailman.ccsds.org] On Behalf Of Boucon Daniele
Sent: Thursday, October 9, 2014 12:06 PM
To: 'MOIMS DAI List'
Subject: [Moims-dai] Minutes of today DAI telecon (9th October)

 

Dear all,

 

Please find below the minutes of today's telecon (9th October), with new
actions.

 

Don't hesitate to correct or complete. 

  

Not discussed (next telecon): 

* Core of the PAIS tutorial

* COROT test case

* METS test case: first comments, organization (email from Daniele 13th
August)

 

New topic to be discussed:

* Schedule of CWE projects


Next telecons: 

  

* Thursday 30 October (9h US time, 15h EU time)

* 10th to 14th November: CCSDS fall meeting

  

If available, I propose to use Dave's new number:

Dave Williams
Telecon information 

+1-844-467-6272      USA Toll Free

+1-720-259-6462      Others

Passcode:    841727

 

 

Dave's old number (available today):

Dave Williams
Telecon information 
+1-877-954-3555      USA Toll free 
+1-517-224-3191      Others 
Passcode:    8506950

 

Best regards,

 

Daniele 

____________________________________________________________________________
_____________________________

9th October minutes

 

DB: Daniele Boucon

DG: David Giaretta

DS: Don Sawyer

JG: John Garrett

MM: Mike Martin

SM: Stephane Mbaye

WG: all

 

D=Decision, A=action (other = discussion)

 

PAIS PDS-NSSDC Wrapup Document (email from Mike, 27th August, comments from
the group)

 

MM: need for a stronger commitment or mechanism for getting feedback.

 

A=>-(JG): resend his comments

A=>-(MM): update the document

 

ICP terminology and content: discussion on the document from the LTDP/CCSDS
meeting (minutes sent today 8th October) with CCSDS action, and information
from Mike (email sent the 7th October)

 

Discussion on Data Set, Data Records (and other terms)

 

MM: is PDSC similar to OAIS AIP?

 

DB and JG: not really, probably closer to the set of AIPs.

 

DS: PDSC could be a Collection of AIPs. 

 

MM: Archival Information Collection (AIC)?

 

Data set and Data records: many different possible interpretations.

 

What term shall we use in the ICP?

 

Data Record-> generic “Data Collection”?

 

Data records -> links with preservation (data prepared and ready for
preservation). More context should be put in the definition.

 

Relate this to a timeline of the information flow, from before ingest
until the data is in the archive: Data records.

 

D=> draw up a table with 3 or more columns: generic for ICP, LTDP (Earth
Observation), other communities.

 

A=>-(DB): initiate this table taking into account the timeline – see also
the last slide of the presentation sent by MM (data once in the Archive: use
archival terminology; outside the Archive: more related to
domains/communities).

 

A=>-(all): look at the RASIM documentation, make a link with the ICP.

 

“Digital Preservation” -> changed to “Long Term Preservation”.

 

Basic definitions should not be specific to a domain (as the process should
be generic). For example, terms such as “Calibration”, “L0”, “browse”, ... are more
specific to the domain of images (they could be used as examples).

 

ISEE test case (new version sent today by DS, comments from MM 1st October)

 

Discussion on comments.

 

A=>-(DS): provide new version of ISEE test case with latest comments.

 

ESA SAFE test case (new version sent yesterday by DD)

 

This version includes the comments from Mike.

 

A=>-(MM): check the last version of the ESA SAFE test case.

 

Agenda for CCSDS meeting (draft sent during the telecon)

 

5 hours difference between the US (East Coast) and London (6 with France)

13h30 (London Time) -> 8h30 (US time)
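
The conversion above can be checked with a minimal sketch. It assumes fixed winter-time offsets for the meeting week (GMT for London, CET for France, EST for the US East Coast); the date used is only a hypothetical day within the 10th to 14th November meeting, and the helper name is illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets for mid-November 2014 (no daylight saving in effect).
LONDON = timezone(timedelta(hours=0), "GMT")
FRANCE = timezone(timedelta(hours=1), "CET")
US_EAST = timezone(timedelta(hours=-5), "EST")


def local_time(start: datetime, target: timezone) -> str:
    """Return the same instant expressed in the target zone, as HHhMM."""
    return start.astimezone(target).strftime("%Hh%M")


# Hypothetical session start: 13h30 London time on the first meeting day.
session = datetime(2014, 11, 10, 13, 30, tzinfo=LONDON)

print("US East Coast:", local_time(session, US_EAST))  # 08h30
print("France:", local_time(session, FRANCE))          # 14h30
```

Fixed offsets are used here only to keep the sketch self-contained; a named time zone database would be the safer choice for dates near a daylight saving change.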

 

A=>-(all), by next Monday: review the agenda.

 

Actions review

 

Information given about actions closed.

 

End of 9th October minutes

____________________________________________________________________________
_______________________________

Excerpts from earlier minutes, kept as a reminder

____________________________________________________________________________
_______________________________

 2. GREEN BOOK

        2.1 TOC for test cases

 

* on the content of the sections,

* on the title of the sections

* on the following question: is it better to describe the SIP constraints
with the MOT (in section 6.1.3), or in a global section on SIPs (section
6.1.4)?

  

Discussion: 

Brief introduction about the sub-TOC by Stephane.

John and Don:         6.1.4.1 SIP constraints, included in 6.1.3

 

Current Structure:

6      Use Cases

6.1    NAME – TITLE

Description

6.1.1    TEST_CASE_NAME DATA SET

6.1.2    TEST_CASE_NAME MOT AND SIP CONSTRAINTS

6.1.4    TEST_CASE_NAME SIPS

 

 

Decision on the Structure:

6      Use Cases

6.1    NAME – TITLE

6.1.1    Context And Benefits

         Contains description + what this test case shows

                give explanation at the beginning of each test case of the
specificity of the test case, and how the method applies

6.1.2    Objects to be transferred

         same as TEST_CASE_NAME DATA SET: contains the description of all
the information (data, documents, ...) that has to be transferred, and how
this information is organized on the producer side

6.1.3    Model of Objects for Transfer and SIP Constraints

         Contains the description of the MOT and of the SIP types and
sequencing constraints

6.1.4    SIPs

         Contains the description of SIP implementation and transfer

         Contains an example of SIP manifest.

 

____________________________________________________________________________
_______________________________

2. GREEN BOOK

 

        2.2 Test cases review

  

 

COROT  test case

 

Daniele explains that CNES is pushing to get the CoRoT L0 data use case
(from Daniele and Stephane) – for information this has to be finished by 13
May to get a chance to be used for L1 data.

 

METS  test case

 

Daniele had a meeting (9 April) with BnF, which could help build a METS
implementation of SIPs (documents).

Daniele introduces BnF's need to transfer references to objects
instead of the target objects themselves.

Stephane: this is ok for XFDU

 

=> Action Stephane:   Provide example of XFDU with referenced data objects
(remote URL or URI)

 

Don finds this a priori relevant.

 

Decision: Nothing to do on that topic until further inputs from BnF

 

____________________________________________________________________________
______________________________

 2. GREEN BOOK

 

The Green Book has been split into several files, one for the core, and one
per test case.

 

=> Action Stephane: provide a description of the document breakdown and
links to the shared repository

 

        2.3 Core document

 

Section 4.1.1 reviewed (ok)

Tabular representations are welcome, but a question remains about their
systematic use (is an XML version needed in an annex?)

 

CCSD0014: equivalent to TO Descriptor, and CCSD0015: equivalent to
Collection Descriptor

 

==> Action Stephane from MM: explain CCSD0015 in more detail: how it is
registered, and a reference to where more information can be found on the subject

 

Don:  descriptorModelID has to be changed on any XSD change
(specialization), the Archive has to maintain the versions

 

Section 4.6.1 about PAIS XSD description should remain here for the time
being

 

John: The SANA registry is supposed to reference the (latest?) PAIS XSD
only (TBC). It is certain, however, that it should not contain any other
resources, e.g. software, XML examples, etc.

 

« open » enumeration technique is cumbersome (TBC)

 

==> Action (All): table in section 4.6.4 should be reviewed for (during)
next telecon

 

Comment from previous telecons on method 

* (Daniele): there are steps that conform to the PAIMAS process (first
model, SIP constraints, then transfer and validation).

* Link between the data base on the Archive side, and the PAIS XML elements:
example on how to match both (core document, test case?).

 

The following paragraph will be deleted if agreed:

XML namespace for PAIS 

    

John suggested that we pass the proposal on namespaces by SANA and Nestor
and Peter as XML Co-chairs to make sure they agree.

 

==> action Stephane (20130821): send proposal on namespace to SANA, Nestor
and Peter, and ask if they have any objection to our proposal. 

Use previous email sent by John to introduce Stephane in the group.

 

It is agreed that not every position where a pais:any element is possible
has to be documented in the GB. Only a few examples, or even one, are
necessary.

 

A more concrete example should be provided than the abstract « foo »/« bar »
currently proposed. A typical example would be a Collection holding a
pais:any with the author of the descriptor, the Collection name/ID in the
Producer's semantics, or anything else that could be specific to the Producer
or the Archive side but is not provided in the PAIS definitions.

 

Due to time constraints, the WG jumped to section 4.6.4 of the draft GB.

 

SM: Reminded that « true » XML Schema restrictions guarantee that the
original PAIS XML Schema's rules are still applicable. Any instance that
conforms to a restriction also conforms to the original schema.

 

SM: The use of restrictions does not require any system to use the derived
PAIS schemas. Therefore, the restricted schemas do not have to be shared with
users of the produced SIPs. The project-specific schemas could even be
discarded without losing control of the produced SIPs.

 

=> D – (DS) The rightmost column of the table in §4.6.4 shall be renamed «
Restriction » instead of « Content »

 

MM: It is not clear whether this table should be kept in that form or discarded
in the end, but the aim should be not to spend too many pages on that topic.

 

WG: restrictions may be interesting for implementers and as such should be
documented, but it is not clear if this should be proposed as a
recommendation or a best practice.

 

SM: reminded that restricting elements such as maxOccurrence should be
a recommended practice, since it can be very difficult to implement
interoperable software that exchanges elements of type xs:nonNegativeInteger.
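
The two points made by SM can be illustrated with a minimal sketch. It is not taken from the PAIS schemas: the cap value and the function names are hypothetical, and the cap merely stands in for a facet such as xs:maxInclusive in a project-specific restriction. It shows that an unrestricted xs:nonNegativeInteger has no upper bound, which fixed-width integer implementations cannot honour, and that every value accepted by the restriction is also accepted by the base type.

```python
import re

# Lexical form of xs:nonNegativeInteger: optional '+', then digits.
# ("-0" is also legal in XSD; it is left out to keep the sketch short.)
_NON_NEGATIVE_INTEGER = re.compile(r"\+?[0-9]+")

# Hypothetical project-specific cap, standing in for an xs:maxInclusive facet.
ASSUMED_MAX_INCLUSIVE = 2**31 - 1


def valid_for_base(text: str) -> bool:
    """Accept any value of the unrestricted base type (no upper bound)."""
    return _NON_NEGATIVE_INTEGER.fullmatch(text.strip()) is not None


def valid_for_restriction(text: str, cap: int = ASSUMED_MAX_INCLUSIVE) -> bool:
    """Accept only base-type values that also respect the assumed cap."""
    return valid_for_base(text) and int(text) <= cap


samples = ["0", "10", "+7", "18446744073709551616123", "-1", "ten"]
for value in samples:
    print(value, valid_for_base(value), valid_for_restriction(value))
```

Because the restricted check only ever narrows the base check, a SIP descriptor that passes it remains valid for any consumer that knows only the original rules.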

 

SM: proposed to add prepared templates of restricting XML Schemas in an annex,
something that could help implementers to quickly set up the restrictions
they need.

WG: adding XML Schemas in an annex may not be so helpful, because cut and paste
from PDF may be very cumbersome.

 

=> D – (JG) these XSDs shall be placed alongside the originals.

 

SM: The problem is similar for other GB resources, such as the use case
descriptors or the software prototypes.

 

____________________________________________________________________________
_______________________________

 

Preservation process 

  

DB: LOTAR representative is ok with "preservation", but not with "curation"
(this word is not used in the community).

 

MM: links should be made between MTDP and warehousing. The structure of the
process is not clear for the moment.

 

Discussion on terminology: preservation vs curation.

 

LTDP definitions: 

*        Preservation: aims at the generation of a single, consistent,
consolidated and validated “EO Missions/Sensors Dataset” and at ensuring its
long term integrity, discovery, accessibility and usability. It is focused
on an individual Mission/Sensor or on a multi-mission Dataset (when one
Master Dataset is made up of data coming from different missions/sensors)
and tailored according to its specific preservation/curation requirements.
It consists of all activities needed to ensure “EO Missions/Sensors Dataset”
bit integrity over time and to optimize (in terms of format and coverage)
its (re)use in the long term (e.g. through metadata and catalogue
improvement, algorithms evolutions and related (re)processing, linking and
improvement of context/provenance information).

*        Curation: aims at establishing and increasing the value of “EO
Missions/Sensors Datasets” over their lifecycle, at favouring their
exploitation through the combination with other Datasets and at extending
their user base. It includes the activities for the definition of the
preservation objectives, for the coordination and management of Data Time
Series and Collections (e.g. from similar sensor family) in support to
specific applications. It includes international cooperation activities

 

OAIS definitions:

Long Term Preservation: The act of maintaining information, Independently
Understandable by a Designated Community, and with evidence supporting its
Authenticity, over the Long Term.

Authenticity: The degree to which a person (or system) regards an object as
what it is purported to be. Authenticity is judged on the basis of evidence.

 

Authenticity: in the sense of "original". This is not crucial in the domain
of scientific data, but is an issue. Integrity could be a way to prove
authenticity.

 

We note that the LTDP definitions are neither clear nor completely coherent.
There is a mixture of both preservation/curation concepts in both
definitions.

 

Furthermore, the group thinks that the term "curation" used in the LTDP
definition does not fit the usual usage; another term should be used. The
sense is closer to knowledge management.

Daniele will send tomorrow a summary of her comments.

 

=> Action Daniele: ask David for preservation/curation definition.

 

=> Action all: give comments and proposals as input for the LTDP terminology
and steps of workflow. -> for Monday at the latest.

 

 

Previous discussion:

 

To be done: analyse and suggest, at different stages of the project (during
the data lifecycle), what should be done from an archiving point of view.

 

Example of main issue: keep documentation up to date (when changes in
formats, processing, ... are made to the data).

 

The LTDP has written a document "PDSC" (Preservation Data Set Content),
explaining what kind of information (data, software, documents, ...) should be
collected and at what step of the project.

 

The subject is wide, Mike asks to focus on specific parts.

 

Daniele explains her point of view: develop a "model" (magenta book)
gathering all the basic components of the preservation activity (selection
and appraisal, data and metadata preparation, access, maintenance, ...) that
should cover the whole data lifecycle (even when data don't exist yet), in order
to be able to answer the following question: what should be done from the
beginning to the end to be able to preserve data, and at what moment? This
should be done in a generic way, linking to standards related to the basic
components when they exist.

 

Suggestion to ask Barbara Sierman about existing work concerning this topic.
Mail sent the 7/02 by Daniele.

 

==> Action Daniele: write and send more precise elements on this process to
the group.

  

==> Action all: send comments.

 

Nestor will send the new project for approval once the PAIS BB has been
published.

 

Daniele underlines the need for CNES and LTDP group. CCSDS expertise will be
very important.

 

The PDS has its internal process ; for other agencies (particularly the LTDP
member agencies) this process doesn't exist and is required.

 

Question on how to make the link between a global process and the PAIMAS
phases.

 

20131030 Don's email and following discussions:

Don: The process could be focused on the Archive point of view, and seen as an
internal OAIS workflow issue, thereby using more OAIS concepts.

The "provenance" is in practice a big issue, going back to the original
information.

"Reprocessing, curation, stewardship" could be mapped to OAIS migrations,
new versions, ...


 

Update procedures exist at NSSDC.

 

Daniele: appraisal should be the starting point. Workflow, on the Archive
side, could begin early in a space project, even when data don't yet exist ;
link with the Archive side of the PAIMAS.

 

=> action all: exchange views on the high-level process (3 main steps:
preparation/preservation/maintenance) and its links with the PAIMAS phases, as
feedback to the LTDP group.

 

==> action Daniele: follow the work of LTDP group on preservation process,
and send all available information to the DAI group on this subject.

  

Feedback on the preservation process is needed from all.

 

____________________________________________________________________________
_______________________________

Other subjects 


OAIS Magenta Book (French version) 

 

Daniele has received the complete updated French version.

Complete validation to be performed now (text and figures)

This version will be also validated by the French National Archives.

Due to the amount of work on the PAIS, and the priority given to the PAIS green
book, this will be dealt with afterwards.

    

XML schema for DEDSL 

It seems possible for CNES to write the document, modelled on the other
existing DEDSL standards. At CNES, the XML schema for DEDSL has already been
created and implemented.

Prototypes: CNES has already tested it on an operational tool. This could play
the role of prototype, and this could be enough.

 

Nestor explained to John that a 2nd prototype is required.

 

A possibility could be to produce not a blue book, but a document that won't
require 2 prototypes (orange book). To be discussed further.

 

____________________________________________________________________________
________________

 

 

 

 

 

 

 

 

 
