[Moims-dai] IFL/ICF: 50 topics?
david.giaretta at stfc.ac.uk
Thu Oct 15 12:27:56 UTC 2015
Where did the number of 50 topics come from? Even my list only had 15.
From: John Garrett [mailto:garrett at his.com]
Sent: 04 October 2015 22:47
To: 'Mike Martin' <tahoe_mike at sbcglobal.net>; Giaretta, David (STFC,RAL,RALSP) <david.giaretta at stfc.ac.uk>; Daniele.Boucon at cnes.fr
Cc: Sawyer at acm.org; rdowns at ciesin.columbia.edu; Conway, Esther (STFC,RAL,RALSP) <esther.conway at stfc.ac.uk>; moims-dai at mailman.ccsds.org; stephane.reecht at bnf.fr
Subject: RE: Possible scope reduction
I totally agree with Mike that we need to limit the number of topics. Obviously we cannot adequately cover 50 topics in the time we have. So let's continue to work to get to the correct list of topics for this pass at the standard.
I have not read the book, so my questions may be covered in it.
I would agree that, all other things being equal, having short names rather than long names aids the thinking process.
However, it is more important to understand and think correctly. So when we are speaking to an audience that has differing experience or limited experience with the subject then a longer name may be useful. Or the longer name may be more useful when the shorter name is already mapped to different concepts/objects/etc.
Certainly in some situations where all the listeners share the same understanding, a shorter name is appropriate. That is why we all develop a shorthand to discuss things within our own communities. But when new people enter the community, we must train them up in the jargon of that community.
I think even the example chosen shows that. System 1 has almost no meaning to me. Even automatic system doesn't mean much, but it conveys more to me than System 1 does.
Anyone who has worked around the military (or even government) finds out how often short acronyms are used. Conversations there only make sense once you understand the community and the context, and have the acronyms defined for you.
Live, Laugh, Love, and Work for Peace,
From: Mike Martin [mailto:tahoe_mike at sbcglobal.net]
Sent: Friday, September 25, 2015 4:09 PM
To: david.giaretta at stfc.ac.uk; Daniele.Boucon at cnes.fr
Cc: garrett at his.com; Sawyer at acm.org; rdowns at ciesin.columbia.edu; esther.conway at stfc.ac.uk; moims-dai at mailman.ccsds.org; stephane.reecht at bnf.fr
Subject: Re: Possible scope reduction
Hi David and others
My intent in trying to cut the list down is only to get to something we can work with in the next three months. I think a list with 50 items is intractable for us to even talk about. We started going through the list item by item last Friday and we spent 1.5 hours on 4 or 5 items with no real resolution on any of them. That would mean we would need 10 more meetings just to briefly discuss each topic/issue.
Another issue that we talked about Thursday is my desire to have a set of short names for all the activities in the life cycle. It is linked to the following thought from Daniel Kahneman, "Thinking, Fast and Slow", bottom of page 29, in response to the question "why use short names?"
"The reason is simple. "Automatic system" takes longer to say than "System 1" and therefore takes more space in your working memory. It matters because anything that occupies your working memory reduces your ability to think." I think having a simple clean representation of the life cycle components will make it easier for people to comprehend and remember.
On 9/23/2015 11:24 PM, david.giaretta at stfc.ac.uk wrote:
I'm not sure I'll be able to join the call today so I thought I should contribute my view by email.
I think the proposal to cut things down is largely based on a "big project" point of view.
My understanding was that we needed to cover both large and small data creation projects.
Of course we need to map to AIP components because that is what is needed for long term preservation, and that is what all the topic lists have at their core. However, we also need to look at how and when the information is collected, and at how the data creators can provide information that will help and encourage re-use of the data and adding value to it.
From: Mike Martin [mailto:tahoe_mike at sbcglobal.net]
Sent: 22 September 2015 16:39
To: Boucon Daniele <Daniele.Boucon at cnes.fr>
Cc: John Garrett <garrett at his.com>; Giaretta, David (STFC,RAL,RALSP) <david.giaretta at stfc.ac.uk>; Sawyer at acm.org; rdowns at ciesin.columbia.edu; Conway, Esther (STFC,RAL,RALSP) <esther.conway at stfc.ac.uk>; moims-dai at mailman.ccsds.org; stephane.reecht at bnf.fr
Subject: Re: Possible scope reduction
Hi Daniele and others
Overall I think our goal should be to get the topic/issue list down to about 10 topics, otherwise this will be too complicated for us to work on and too complicated for anyone to read. I have included a modified version of my spreadsheet where I have tried to show how your topic list correlates to my list. Here is some discussion.
1. My thought was that line 9, Data Inventory... was included in my line 4 "What types of data (raw, processed etc) will be produced, and the total volume of data which will be produced. What is the schedule for major project milestones and deliveries to the archive."
2. Lines 5, 6, 11, 13, 14, 15, 17 and 27 all deal with standards for and descriptions of data and metadata. I think we need a cleaner way of dealing with these topics. That's why I suggested we focus on the diagram that describes the components of the AIP. The topics in the diagram include:
Data and Representation Info:
other representation information
access rights information
So I was thinking maybe we would have one topic that covered all the components of the AIP, with subtopics for all the components. Or maybe it would be better to have three topics, Data and RI, PDI and Packaging.
3. I included my line 10 "What special products need to be created..." to try to cover your lines 23, 24 and 25. However, I really think all those topics discuss things that need to be included in AIP components.
4. My line 11 "Processing Workflow" would encompass your lines 12, 29, 34. This topic could be included in the AIP components discussion.
5. My line 12 "Inputs, parameters..." would encompass your lines 7 and 37. This topic could be included in the AIP components discussion.
6. My line 17 "How will the deliveries be packaged" covers your line 41 "handover - pointers to the components to be transferred".
7. If we are focusing on what the data provider must do during the lifecycle then I don't see how your line 26 or 43 relate to this document. They seem to be internal archive issues.
8. I put line 49 on the list, but now I don't think we have the resources to go into that topic.
Looking at my list, I also think that 13 "What calibration and system test tools and data will be delivered", 14 "Provide a bibliography...", 15 "How do we verify..." and 18 "Who is the owner" should be incorporated in the AIP topic or topics discussion.
On 9/18/2015 3:24 AM, Boucon Daniele wrote:
Hi Mike and all,
As only John and I responded about moving today's ILF telecon, I suggest that we keep a short slot, and we can plan another one soon afterwards.
Here are my comments on Mike's proposal:
1. I find the mapping you propose between the life cycle activity and the topics interesting.
2. I find your point of view in the Info life cycles stage interesting. If we adopt
Gathering the lines is OK, given that they will be expanded into sub-bullets.
I'm not convinced that we have to cut the exploitation part.
I'm in favour of keeping the "adding value" and "discoverability" topics, as they are of great interest to the users.
The lines from David's table (topics and subtopicsDB) that I have not found in Mike's table are the following:
Line 5: Data | Semantics of the data elements | repInfo? | Needed to support interpretation and interoperability
Line 6: Data | Context documentation | Context | Ex: description of mission, instrument, .
Line 7: Data | Quality specifications | ?? DI ?? | Future confidence in data. Definition of quality parameters. Ex: QA4EO. => related to quality parameters, could be enlarged in Mike's proposal
Line 9: Data | Inventory of data produced/expected | What level of granularity is needed? List of info + relationships among all the info (network)
Line 11: Representation Information | Applicable standards | RepInfo | Complex topic, maybe too big for general document
Line 12: Representation Information | Software dependencies | RepInfo | Minimize as archives don't have resources to maintain software
Line 13: Representation Information | Data dictionaries and other semantics | RepInfo | Need clean mechanism to transfer to archive, support interoperability
Line 14: Representation Information | Format definitions and formal descriptions | RepInfo | Needed to provide access to data.
Line 15: Representation Information | Information Model | RepInfo | Information Model includes relationships between classes - important for automated production, validation, sw development
Line 17: Descriptive Information | Outline of background concepts needed to understand the project | Descriptive Info | Needed for future use of the data.
Line 23: (Adding) value | Related data which may in the future be combined with this data | This may be too broad.
Line 24: (Adding) value | Other software which may be used on the data | Tutorials are very valuable for future access to the data.
Line 25: (Adding) value | Interfaces that are applicable to the data e.g. images, tables. | I don't understand this issue.
Line 26: (Adding) value | Potential value of the data and likely business case for sustainability | Archive Support Plan
Line 27: Proposal | Record of origins of the project e.g. in a CRIS system | Context | Descriptive information, record of commitments, evaluation of project success
Line 29: Provenance | Documentation about the hardware and software used to create the data, including a history of the changes in these over time | Provenance | How do we capture this?
Line 34: Provenance | Record of any special hardware needed | Provenance | Submission Agreement?
Line 37: Authenticity | Who was responsible for each stage of processing | To allow archive to contact preparer? Not generally collected.
Line 41: Handover | Pointers to the components to be transferred to the archive | How does this differ from inventory in Data Topic
Line 43: Handover | Potential preservation aims of the archive | Part of Archive Support Plan
Line 49: Handover | Resident Archives | Intermediary between provider and archive in some cases for maintaining specialized h/w or s/w.
Mike, could you please tell me if this analysis is correct or not? Are some of them included in other lines of your table?
As an aim for our telecon, I suggest that we all give our opinions on this list (grouping, what more to include or expand).
From: Mike Martin [mailto:tahoe_mike at sbcglobal.net]
Sent: Wednesday, 16 September 2015 15:32
To: Boucon Daniele
Cc: John Garrett; david.giaretta at stfc.ac.uk; Sawyer at acm.org; rdowns at ciesin.columbia.edu; esther.conway at stfc.ac.uk; moims-dai at mailman.ccsds.org
Subject: Possible scope reduction
If you haven't started looking at the attachments here are two slightly edited versions.