[Moims-dai] RE: Possible scope reduction
david.giaretta at stfc.ac.uk
Thu Sep 24 06:24:50 UTC 2015
I'm not sure I'll be able to join the call today so I thought I should contribute my view by email.
I think the proposal to cut things down is largely based on a "big project" point of view.
My understanding was that we needed to cover both large and small data creation projects.
Of course we need to map to AIP components because that is what is needed for long term preservation, and that is what all the topic lists have at their core. However, we also need to look at how and when the information is collected, and at how the data creators can provide information that will help and encourage re-use of the data and adding value to it.
From: Mike Martin [mailto:tahoe_mike at sbcglobal.net]
Sent: 22 September 2015 16:39
To: Boucon Daniele <Daniele.Boucon at cnes.fr>
Cc: John Garrett <garrett at his.com>; Giaretta, David (STFC,RAL,RALSP) <david.giaretta at stfc.ac.uk>; Sawyer at acm.org; rdowns at ciesin.columbia.edu; Conway, Esther (STFC,RAL,RALSP) <esther.conway at stfc.ac.uk>; moims-dai at mailman.ccsds.org; stephane.reecht at bnf.fr
Subject: Re: Possible scope reduction
Hi Daniele and others
Overall I think our goal should be to get the topic/issue list down to about 10 topics, otherwise this will be too complicated for us to work on and too complicated for anyone to read. I have included a modified version of my spreadsheet where I have tried to show how your topic list correlates to my list. Here is some discussion.
1. My thought was that line 9, Data Inventory... was included in my line 4 "What types of data (raw, processed etc) will be produced, and the total volume of data which will be produced. What is the schedule for major project milestones and deliveries to the archive."
2. Lines 5, 6, 11, 13, 14, 15, 17 and 27 all deal with standards for and descriptions of data and metadata. I think we need a cleaner way of dealing with these topics. That's why I suggested we focus on the diagram that describes the components of the AIP. The topics in the diagram include:
Data and Representation Info:
other representation information
access rights information
So I was thinking maybe we would have one topic that covered all the components of the AIP, with subtopics for all the components. Or maybe it would be better to have three topics: Data and RI, PDI, and Packaging.
3. I included my line 10 "What special products need to be created..." to try to cover your lines 23, 24 and 25. However, I really think all those topics discuss things that need to be included in AIP components.
4. My line 11 "Processing Workflow" would encompass your lines 12, 29, 34. This topic could be included in the AIP components discussion.
5. My line 12 "Inputs, parameters..." would encompass your lines 7 and 37. This topic could be included in the AIP components discussion.
6. My line 17 "How will the deliveries be packaged" covers your line 41 "handover - pointers to the components to be transferred".
7. If we are focusing on what the data provider must do during the lifecycle then I don't see how your lines 26 or 43 relate to this document. They seem to be internal archive issues.
8. I put line 49 on the list, but now I don't think we have the resources to go into that topic.
Looking at my list, I also think that 13 "What calibration and system test tools and data will be delivered", 14 "Provide a bibliography...", 15 "How do we verify..." and 18 "Who is the owner" should be incorporated in the AIP topic or topics discussion.
On 9/18/2015 3:24 AM, Boucon Daniele wrote:
Hi Mike and all,
As only John and I responded about moving today's ILF telecon, I suggest that we keep a short slot and plan another one soon.
Here are my comments on Mike's proposal:
1. I find the mapping you propose between the life cycle activity and the topics interesting.
2. I find your point of view in the Info life cycles stage interesting. If we adopt
Gathering the lines is OK, given that they will be expanded into sub-bullets.
I'm not convinced that we have to cut the exploitation part.
I'm in favour of keeping the "adding value" and "discoverability" topics, as these are of great interest to the users.
The lines from David's table (topics and subtopics DB) that I have not found in Mike's table are the following:
Line 5: Data | Semantics of the data elements | repInfo? | Needed to support interpretation and interoperability
Line 6: Data | Context documentation | Context | Ex: description of mission, instrument, .
Line 7: Data | Quality specifications | ?? DI ?? | Future confidence in data. Definition of quality parameters. Ex: QA4EO. => related to quality parameters, could be enlarged in Mike's proposal
Line 9: Data | Inventory of data produced/expected | What level of granularity is needed? List of info + relationships among all the info (network)
Line 11: Representation Information | Applicable standards | RepInfo | Complex topic, maybe too big for general document
Line 12: Representation Information | Software dependencies | RepInfo | Minimize as archives don't have resources to maintain software
Line 13: Representation Information | Data dictionaries and other semantics | RepInfo | Need clean mechanism to transfer to archive, support interoperability
Line 14: Representation Information | Format definitions and formal descriptions | RepInfo | Needed to provide access to data.
Line 15: Representation Information | Information Model | RepInfo | Information Model includes relationships between classes - important for automated production, validation, sw development
Line 17: Descriptive Information | Outline of background concepts needed to understand the project | Descriptive Info | Needed for future use of the data.
Line 23: (Adding) value | Related data which may in the future be combined with this data | This may be too broad.
Line 24: (Adding) value | Other software which may be used on the data | Tutorials are very valuable for future access to the data.
Line 25: (Adding) value | Interfaces that are applicable to the data e.g. images, tables. | I don't understand this issue.
Line 26: (Adding) value | Potential value of the data and likely business case for sustainability | Archive Support Plan
Line 27: Proposal | Record of origins of the project e.g. in a CRIS system | Context | Descriptive information, record of commitments, evaluation of project success
Line 29: Provenance | Documentation about the hardware and software used to create the data, including a history of the changes in these over time | Provenance | How do we capture this?
Line 34: Provenance | Record of any special hardware needed | Provenance | Submission Agreement?
Line 37: Authenticity | Who was responsible for each stage of processing | To allow archive to contact preparer? Not generally collected.
Line 41: Handover | Pointers to the components to be transferred to the archive | How does this differ from inventory in Data Topic
Line 43: Handover | Potential preservation aims of the archive | Part of Archive Support Plan
Line 49: Handover | Resident Archives | Intermediary between provider and archive in some cases for maintaining specialized h/w or s/w.
Mike, could you please tell me if this analysis is correct or not? Are some of them included in other lines of your table?
As an aim for our telecon, I suggest that we all give our opinions on this list (grouping, what more to include or expand).
From: Mike Martin [mailto:tahoe_mike at sbcglobal.net]
Sent: Wednesday 16 September 2015 15:32
To: Boucon Daniele
Cc: John Garrett; david.giaretta at stfc.ac.uk; Sawyer at acm.org; rdowns at ciesin.columbia.edu; esther.conway at stfc.ac.uk; moims-dai at mailman.ccsds.org
Subject: Possible scope reduction
If you haven't started looking at the attachments, here are two slightly edited versions.