[Moims-dai] RE: Possible scope reduction

david.giaretta at stfc.ac.uk david.giaretta at stfc.ac.uk
Thu Oct 15 12:12:30 UTC 2015


I just noticed this in the email:

Daniel Kahneman, "Thinking, Fast and Slow", bottom of page 29, in response to the question "why use short names?"

"The reason is simple.  "Automatic system" takes longer to say than "System 1" and therefore takes more space in your working memory.  It matters because anything that occupies your working memory reduces your ability to think."  I think having a simple clean representation of the life cycle components will make it easier for people to comprehend and remember.

My understanding is that human memory relies on linkages between ideas - just look at any book on memory. Short names with no linkages are very hard to remember unless one uses them every day and everyone is on the same page.

In terms of a standard that goes across disciplines we have to be talking in terms that many can understand at first reading.

..David

From: John Garrett [mailto:garrett at his.com]
Sent: 04 October 2015 22:47
To: 'Mike Martin' <tahoe_mike at sbcglobal.net>; Giaretta, David (STFC,RAL,RALSP) <david.giaretta at stfc.ac.uk>; Daniele.Boucon at cnes.fr
Cc: Sawyer at acm.org; rdowns at ciesin.columbia.edu; Conway, Esther (STFC,RAL,RALSP) <esther.conway at stfc.ac.uk>; moims-dai at mailman.ccsds.org; stephane.reecht at bnf.fr
Subject: RE: Possible scope reduction

Hi,
I totally agree with Mike that we need to limit the number of topics.  Obviously we cannot adequately cover 50 topics in the time we have.  So let's continue to work to get to the correct list of topics for this pass at the standard.

I have not read the book, so my questions may be covered in it.
I would agree that, all other things being equal, having short names rather than long names aids the thinking process.
However, it is more important to understand and think correctly. So when we are speaking to an audience that has differing or limited experience with the subject, a longer name may be useful.  The longer name may also be more useful when the shorter name is already mapped to different concepts/objects/etc.

Certainly in situations where all the listeners share the same understanding, a shorter name is appropriate.  That is why we all develop a shorthand to discuss things within our own communities.  But when new people enter the community, we must train them up in the jargon of that community.

I think even the example chosen shows that.  System 1 has almost no meaning to me.  Even automatic system doesn't mean much, but it conveys more to me than System 1 does.
Anyone who has worked around the military (or even government) finds out how often short acronyms are used.  Conversations there only make sense once you understand the community and the context, and have the acronyms defined for you.

Live, Laugh, Love, and Work for Peace,
-John


From: Mike Martin [mailto:tahoe_mike at sbcglobal.net]
Sent: Friday, September 25, 2015 4:09 PM
To: david.giaretta at stfc.ac.uk; Daniele.Boucon at cnes.fr
Cc: garrett at his.com; Sawyer at acm.org; rdowns at ciesin.columbia.edu; esther.conway at stfc.ac.uk; moims-dai at mailman.ccsds.org; stephane.reecht at bnf.fr
Subject: Re: Possible scope reduction

Hi David and others

My intent in trying to cut the list down is only to get to something we can work with in the next three months.  I think a list with 50 items is intractable for us to even talk about.  We started going through the list item by item last Friday and we spent 1.5 hours on 4 or 5 items with no real resolution on any of them.  That would mean we would need 10 more meetings just to briefly discuss each topic/issue.

Another issue that we talked about Thursday is my desire to have a set of short names for all the activities in the life cycle.  It is linked to the following thought from Daniel Kahneman, "Thinking, Fast and Slow", bottom of page 29, in response to the question "why use short names?"

"The reason is simple.  "Automatic system" takes longer to say than "System 1" and therefore takes more space in your working memory.  It matters because anything that occupies your working memory reduces your ability to think."  I think having a simple clean representation of the life cycle components will make it easier for people to comprehend and remember.

Thanks, Mike
On 9/23/2015 11:24 PM, david.giaretta at stfc.ac.uk wrote:
Dear all

I'm not sure I'll be able to join the call today so I thought I should contribute my view by email.

I think the proposal to cut things down is largely based on a "big project" point of view.
My understanding was that we needed to cover both large and small data creation projects.

Of course we need to map to AIP components because that is what is needed for long term preservation, and that is what all the topic lists have at their core. However, we also need to look at how and when the information is collected, and at how the data creators can provide information that will help/encourage re-use of the data and adding value to it.

...David



From: Mike Martin [mailto:tahoe_mike at sbcglobal.net]
Sent: 22 September 2015 16:39
To: Boucon Daniele <Daniele.Boucon at cnes.fr>
Cc: John Garrett <garrett at his.com>; Giaretta, David (STFC,RAL,RALSP) <david.giaretta at stfc.ac.uk>; Sawyer at acm.org; rdowns at ciesin.columbia.edu; Conway, Esther (STFC,RAL,RALSP) <esther.conway at stfc.ac.uk>; moims-dai at mailman.ccsds.org; stephane.reecht at bnf.fr
Subject: Re: Possible scope reduction

Hi Daniele and others

Overall I think our goal should be to get the topic/issue list down to about 10 topics, otherwise this will be too complicated for us to work on and too complicated for anyone to read.  I have included a modified version of my spreadsheet where I have tried to show how your topic list correlates to my list.  Here is some discussion.

1.  My thought was that line 9, Data Inventory..., was included in my line 4 "What types of data (raw, processed etc) will be produced, and the total volume of data which will be produced.  What is the schedule for major project milestones and deliveries to the archive."

2.  Lines 5, 6, 11, 13, 14, 15, 17 and 27 all deal with standards for and descriptions of data and metadata.  I think we need a cleaner way of dealing with these topics.   That's why I suggested we focus on the diagram that describes the components of the AIP.  The topics in the diagram include:

Data and Representation Info:
    digital object
    structure information
    semantic information
    other representation information
PDI:
    reference information
    provenance information
    context information
    fixity information
    access rights information
Packaging:
    packaging information
    package description

So I was thinking maybe we would have one topic that covered all the components of the AIP, with subtopics for all the components.  Or maybe it would be better to have three topics: Data and RI, PDI, and Packaging.

3.  I included my line 10 "What special products need to be created..." to try to cover your lines 23, 24 and 25.  However, I really think all those topics discuss things that need to be included in AIP components.

4.  My line 11 "Processing Workflow" would encompass your lines 12, 29, 34.  This topic could be included in the AIP components discussion.

5.  My line 12 "Inputs, parameters..." would encompass your lines 7 and 37.  This topic could be included in the AIP components discussion.

6.  My line 17 "How will the deliveries be packaged" covers your line 41 "Handover - pointers to the components to be transferred".

7.  If we are focusing on what the data provider must do during the lifecycle then I don't see how your lines 26 or 43 relate to this document.  They seem to be internal archive issues.

8.  I put line 49 on the list, but now I don't think we have the resources to go into that topic.

Looking at my list, I also think that 13 "What calibration and system test tools and data will be delivered", 14 "Provide a bibliography...", 15 "How do we verify..." and 18 "Who is the owner" should be incorporated in the AIP topic or topics discussion.

Thanks, Mike
On 9/18/2015 3:24 AM, Boucon Daniele wrote:

Hi Mike and all,



As only John and I responded about moving today's ILF telecon, I suggest that we keep a short slot today and plan another one soon.



Here are my comments on Mike's proposal:



1. I find the mapping you propose between the life cycle activities and the topics interesting.



2. I find your point of view in the Info life cycles stage interesting. If we adopt



Gathering the lines is ok, given that they will be expanded into sub-bullets.

I'm not convinced that we have to cut the exploitation part.

I'm in favour of keeping "adding value" and "discoverability", as this is what is of great interest to the users.



The lines from David's table - topics and subtopicsDB - that I have not found in Mike's table are the following:



Line 5: Data | Semantics of the data elements | repInfo? | Needed to support interpretation and interoperability
Line 6: Data | Context documentation | Context | Ex: description of mission, instrument, ...
Line 7: Data | Quality specifications | ?? DI ?? | Future confidence in data. Definition of quality parameters. Ex: QA4EO ... => related to quality parameters, could be enlarged in Mike's proposal
Line 9: Data | Inventory of data produced/expected | | What level of granularity is needed? List of info + relationships among all the info (network)
Line 11: Representation Information | Applicable standards | RepInfo | Complex topic, maybe too big for general document
Line 12: Representation Information | Software dependencies | RepInfo | Minimize as archives don't have resources to maintain software
Line 13: Representation Information | Data dictionaries and other semantics | RepInfo | Need clean mechanism to transfer to archive, support interoperability
Line 14: Representation Information | Format definitions and formal descriptions | RepInfo | Needed to provide access to data.
Line 15: Representation Information | Information Model | RepInfo | Information Model includes relationships between classes - important for automated production, validation, sw development
Line 17: Descriptive Information | Outline of background concepts needed to understand the project | Descriptive Info | Needed for future use of the data.
Line 23: (Adding) value | Related data which may in the future be combined with this data | | This may be too broad.
Line 24: (Adding) value | Other software which may be used on the data | | Tutorials are very valuable for future access to the data.
Line 25: (Adding) value | Interfaces that are applicable to the data e.g. images, tables. | | I don't understand this issue.
Line 26: (Adding) value | Potential value of the data and likely business case for sustainability | | Archive Support Plan
Line 27: Proposal | Record of origins of the project e.g. in a CRIS system | Context | Descriptive information, record of commitments, evaluation of project success
Line 29: Provenance | Documentation about the hardware and software used to create the data, including a history of the changes in these over time | Provenance | How do we capture this?
Line 34: Provenance | Record of any special hardware needed | Provenance | Submission Agreement?
Line 37: Authenticity | Who was responsible for each stage of processing | | To allow archive to contact preparer?  Not generally collected.
Line 41: Handover | Pointers to the components to be transferred to the archive | | How does this differ from inventory in Data Topic
Line 43: Handover | Potential preservation aims of the archive | | Part of Archive Support Plan
Line 49: Handover | Resident Archives | | Intermediary between provider and archive in some cases for maintaining specialized h/w or s/w.



Mike, could you please tell me if this analysis is correct or not? Are some of them included in other lines of your table?



As an aim for our telecon, I suggest that we all give our opinions on this list (grouping, what more to include or expand).



Regards,



Daniele



-----Original Message-----
From: Mike Martin [mailto:tahoe_mike at sbcglobal.net]
Sent: Wednesday, 16 September 2015 15:32
To: Boucon Daniele
Cc: John Garrett; david.giaretta at stfc.ac.uk; Sawyer at acm.org; rdowns at ciesin.columbia.edu; esther.conway at stfc.ac.uk; moims-dai at mailman.ccsds.org
Subject: Possible scope reduction



Hi everyone



If you haven't started looking at the attachments, here are two slightly edited versions.



Thanks, Mike





