[Moims-dai] Response to rejection of SC234

D or C Sawyer Sawyer at acm.org
Tue Aug 28 00:37:40 UTC 2018


Dear All,

This is a brief response to the thread of messages precipitated by my response to the decision to retain SC222.

Despite David’s belief that there is something that I’m missing, I believe the situation is just the opposite.  In fact his response, and then Bruce’s response to David, have highlighted where the major disconnect lies.  In short, I believe that David has a picture for the application of Provenance (and PDI more generally) that is at variance with the actual text and modeling contained in OAIS RM 2012.  Therefore I find that his view of the impact of SC222 is also at variance with the OAIS RM with SC222 incorporated.  I’m providing a short description of the issue in the following text and two accompanying diagrams.

Bruce has made a valuable contribution pointing out the long-standing use of Provenance in the Cultural Heritage community.  Bruce was a valuable contributor to the development of the initial OAIS RM and therefore it is not surprising that we deliberately associated Provenance Information with the Content Information.  The Content Information is the original target of preservation and it is this information, whether physically or digitally based, whose Provenance is of particular interest.  This Content Information is the particular information that the OAIS is most concerned with preserving.

Here is the key point:  there are all kinds of other information that an Archive will need to maintain, and much of it will likely be digital.  We did NOT define Provenance to be associated with any of this other information. Specifically it is NOT defined with respect to a general Information Object, and appropriately so.  (A proposal to define Content Information as any Information Object within the Archive was firmly rejected.) Real archives need to distinguish the information that is the original target of preservation from all other information because that information is treated differently in terms of the human effort and other resources applied.  Thus Provenance is properly focused on the Content Information, and because Content Information is defined as a Content Data Object plus its Representation Information, Provenance can be applied to distinguish the possible different histories of these two components as appropriate.  David’s statement that the OAIS is about the preservation of Information Objects, in general, is simply not true.  This is clear from the Terminology section, the information modeling, and its associated text, even with the incorporation of SC222, which focuses on the Content Data Object and NOT a general Data Object.  There is no room for debate on this point.

David has pointed out, and appropriately so, that information other than Content Information needs to be maintained by the Archive. He prefers to use the term ‘preserved’, which is not incorrect, but this does not mean that it should get the same attention as the Content Information. It has been noted that Provenance Information will have some history and therefore its provenance may be of interest. This provenance is NOT our defined Provenance, even though the underlying concept for its usage is the same.  The interest of an Archive in providing provenance on its Provenance, for example, is going to be much less than that on the original Provenance, and it is very useful to have Provenance defined as it has been while requiring communicators to distinguish when they might mean provenance associated with something other than Content Information or its components.

I’m attaching 2 figures that summarize the PDI association reality. The first, consistent with OAIS RM 2012,  gives an example Content Information object and its associated Provenance Information object. The various additional ways the Provenance Information might be associated are labeled A through F.

In the second figure, the result of incorporating SC222 is shown.  It restricts the association to only the Content Data Object as given by the new definition of Provenance Information (and PDI generally).  This may or may not be the intent of those supporting SC222, but it is the result.  My critique below stands.
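
As a rough illustration of the distinction drawn above, the two readings can be written down as a small Python sketch. This is only a sketch of the argument made in this message, not text from OAIS, SC222, or the attached figures; the names Target, READINGS, ProvenanceAssociation and is_expressible are hypothetical, and the table reflects only the reading argued here (which David disputes in the quoted thread).

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    """What a Provenance Information object is attached to (hypothetical names)."""
    CONTENT_INFORMATION = "Content Information"
    CONTENT_DATA_OBJECT = "Content Data Object"
    REPRESENTATION_INFORMATION = "Representation Information"


# Assumption: this table encodes only the reading argued in this message.
# Under OAIS RM 2012, Provenance may attach to the Content Information or to
# either of its components; with SC222 it attaches to the CDO only.
READINGS = {
    "OAIS RM 2012": frozenset(Target),
    "OAIS RM with SC222": frozenset({Target.CONTENT_DATA_OBJECT}),
}


@dataclass(frozen=True)
class ProvenanceAssociation:
    """One Provenance Information object and the thing it describes."""
    note: str
    target: Target


def is_expressible(assoc: ProvenanceAssociation, reading: str) -> bool:
    """Return True if the given reading allows this association."""
    return assoc.target in READINGS[reading]


# Example: the history of the Representation Information itself.
rep_info_history = ProvenanceAssociation(
    note="history of the Representation Information",
    target=Target.REPRESENTATION_INFORMATION,
)

for reading in READINGS:
    print(reading, "->", is_expressible(rep_info_history, reading))
```

Under these assumptions the example association is expressible in the first reading and not in the second, which is the restriction described for the second figure.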

Cheers-
Don


= = = = = 



> On Aug 27, 2018, at 12:14 PM, David Giaretta <david at giaretta.org> wrote:
> 
> Hi Bruce
>  
> I think we are talking at cross-purposes – see the bold, red text below at your penultimate point, which goes to the heart of the confusion.
>  
> ..David
>  
> From: MOIMS-DAI <moims-dai-bounces at mailman.ccsds.org> On Behalf Of Bruce Ambacher
> Sent: 27 August 2018 16:35
> To: moims-dai at mailman.ccsds.org
> Subject: Re: [Moims-dai] Response to rejection of SC234
>  
> David, 
>  
> My concern is with your labeling Provenance as 1) an object and 2)  Representation Information.
>  
> I am not doing that. What I said was that the way OAIS models things is that a Provenance Information object is made up of a Data Object plus Representation Information. 
>  
> As I noted there may be, and in most cases there will be, no data object that is Provenance, that comes to the archives.  
>  
> Surely there must be _some_ Data Object – either a simple text file or a hand-written note, both of which would count as Data Objects. If no Provenance Information comes from the Producer to accompany some Content Information then at the very least the OAIS will add some Provenance Information in the form of something noting where Content Information comes from.
>  
> There more likely will be Representation Information including Provenance information to accompany a data object but no separate object that is Provenance or a larger Representation Information. 
>  
> I don’t understand what you mean. I think it would help if you used “Provenance Information” and stuck to OAIS terminology, one purpose of which was to avoid misunderstanding – as long as the terms are used consistently.
>  
> As I noted Provenance is an accumulation of information that validates the object to be what it purports to be.  
>  
> Sure – if I translate that into OAIS terms.
>  
> I think in terms of cultural objects whose Provenance is documented by a chain of custody, a deed of purchase or gift, authenticated handwriting, etc.; you think in terms of collected instrument readings or results from a series of experiments whose Provenance is documented by appropriated funds to support the information collection, information about the purpose and results of a mission or experiment, etc.  
>  
> I understand both and OAIS concepts cover both.
>  
> The Provenance of an object can stand without, and be understood without, the object itself.  
>  
> Ah, maybe here is the root of the misunderstanding. 
> When I say Provenance Information consists of a Data Object plus its Representation Information, the Data Object I mean here is what encodes (in a general sense) the Provenance Information, for example, a deed of purchase or gift, authenticated handwriting, a digital signature etc, and the Representation Information is whatever is needed for a member of the Designated Community to understand those things. 
> I do not mean the Data Object whose Provenance we are talking about. 
>  
> Conversely, I do not think it is possible to understand an object without its Representation Information.
>  
> Sure – understanding object as Data Object. Sorry to be picky but if we do not stick to agreed terminology then we will get lost.
>  
> Regards
> 
>  
> 
> ..David
> 
> -----Original Message-----
> From: David Giaretta <david at giaretta.org>
> To: 'MOIMS-Data Archive Interoperability' <moims-dai at mailman.ccsds.org>
> Sent: Mon, Aug 27, 2018 11:09 am
> Subject: Re: [Moims-dai] Response to rejection of SC234
> 
> Hi Bruce
>  
> I wrote “Provenance Information is made up of a Data Object and its Representation Information, and, since it is being preserved, has PDI associated with the Data Object.” 
> Apologies if “Information” was inadvertently dropped at some point further down.
>  
> OAIS defines Data Object and Digital Object in a way which means that, for example, a collection of digital files can be referred to as a single Digital Object (“an object composed of a set of bit sequences”) – we did this to simplify the terminology later on. 
>  
> So I agree with what you wrote about Provenance and I believe that the way I used the term is consistent with that and with the rest of OAIS.
>  
> Regards
>  
> ..David
>  
> From: MOIMS-DAI <moims-dai-bounces at mailman.ccsds.org> On Behalf Of Bruce Ambacher
> Sent: 27 August 2018 15:17
> To: moims-dai at mailman.ccsds.org
> Subject: Re: [Moims-dai] Response to rejection of SC234
>  
> David, 
> I do not accept your view that Provenance is a data object coupled with Representation Information.  Provenance is the accumulated knowledge, possibly codified in Representation Information, that provides the reasons for a data object's existence, who created it, who has been responsible for its longevity, the chain of custody, what preservation actions have been taken prior to its ingest into the long term preservation system, and possibly any special actions taken on the object.
>  
> We must remember Provenance is a term  from the cultural heritage world and is especially important in the art world and in cultural heritage collections to verify the authenticity of the object, be it a painting or a collection of personal papers, and to establish its chain of custody from origin to current "ownership."  We certainly can adapt a term and even trim it around the edges to fit OAIS but we must remain true to the origins and meaning of its original use.
> 
> -----Original Message-----
> From: David Giaretta <david at giaretta.org>
> To: 'MOIMS-Data Archive Interoperability' <moims-dai at mailman.ccsds.org>
> Sent: Mon, Aug 27, 2018 7:03 am
> Subject: Re: [Moims-dai] Response to rejection of SC234
> Hi Don
>  
> You keep missing the main point, which is that OAIS talks about Information Objects (PDI, each of its components and Representation Information are all Information Objects – see Figure 4-12 in OAIS).
> Each Information Object is made up of a Data Object and Representation Information. 
>  
> Any Information Object being preserved by the OAIS also needs PDI – change #222 says this PDI is associated with the Data Object. Perhaps we should clarify the text to make this clearer if it is causing confusion.
>  
> This means, for example, that Provenance Information is made up of a Data Object and its Representation Information, and, since it is being preserved, has PDI associated with the Data Object.
> Representation Information is made up of a Data Object and its own Representation Information, and, if it is being preserved, has PDI associated with the Data Object.
> Fixity Information is made up of a Data Object and its own Representation Information, and, if it is being preserved, has PDI associated with the Data Object.
>  
> OAIS was always meant to be testable – that is why we introduced the Designated Community concept early on. That is why we defined conformance in the way we did. The audit standard was in the OAIS roadmap from the start. OAIS is the keystone to a whole set of standards.
>  
> Change #222 simply means that the model is easier to test by clarifying the concepts, which I think is a good objective, and is clearer as long as one only remembers that we are referring to Information Objects which have Data Objects and Representation Information.
>  
> I’ve put some additional comments below, just to make it clear.
>  
> Regards
>  
> ..David
>  
> -----Original Message-----
> From: MOIMS-DAI <moims-dai-bounces at mailman.ccsds.org> On Behalf Of D or C Sawyer
> Sent: 27 August 2018 01:29
> To: MOIMS DAI List <moims-dai at mailman.ccsds.org>
> Subject: [Moims-dai] Response to rejection of SC234
>  
> Dear All,
>  
> As I was not able to participate in last Tuesday’s teleconference where my proposal to eliminate SC222 was addressed, and given their conclusion that SC222 should be retained, I’m using this note to summarize my response. I believe this decision is short-sighted and I hope that upon further reflection this decision will be reversed. However I will not submit another SC on this topic.
>  
> I believe that you are misunderstanding the change proposed.
>  
> My response is divided into 3 categories:
>  
> - Problems with arguments for SC222
>  
> - Implications for the evolution of the OAIS RM
>  
> - Some problems with the current incorporation of SC222
>  
> A.  Problems with arguments for SC222
>  
> The OAIS RM 2012 identifies an information grouping called PDI (categories of Provenance, Context, Reference, Rights, and Fixity) as being important to the preservation of the associated Content Information.  These categories support preservation by documenting authenticity (Provenance, Fixity), identifying how the associated information is related to other information (Context) and thus improving understanding, identifying restrictions on usage (Rights) that may be in effect, and identifying how information may be uniquely identified (Reference) in support of requests by Consumers.  Content Information itself is composed of the Content Data Object (CDO) and its associated Representation Information (Rep. Info.). This high level PDI association concept says nothing about how PDI is to be associated with the Content Information components CDO and Rep. Info. It also says nothing about how this association might be implemented. The various PDI categories could be applied separately to the CDO and to the Rep. Info., as well as to the combination, as makes sense based on the nature of the CDO and the Rep. Info.  In contrast, SC222 now restricts the PDI to only be associated with the CDO.
>  
> I’ve seen only two arguments put forward in support of this major change.
>  
> 1.       It is argued that it is virtually impossible to apply PDI to Rep. Info. and therefore it makes sense to limit its application in the Information Package to the CDO and thus exclude the Rep. Info. or the Content Information as a whole.  However in one of my email exchanges on this topic I outlined how databases and pointers could be used to track updates to Provenance, Context, and Rights Information for a Representation Network, and they also provide a degree of Fixity.   So clearly PDI application to a Rep.Info. network is implementable.  
>  
> In the very very long email chain, and I think also on the review site, I proposed changes to the wording of OAIS which would show how Fixity could apply to the Content Information as a whole. The particular issue was to apply Fixity Information to a Representation Information Network, which may be distributed. It is not impossible but certainly difficult and would be hard to keep up to date. However that suggestion was rejected. Hence the proposal in #222.
>  
> Also there can be no argument that the preservation of Representation Information, which varies greatly in complexity at the structural and semantic levels, and comes from a wide variety of sources, and often evolves, doesn’t benefit from an appropriate level of PDI application.  
> I can only speculate that the view that PDI for Representation Information is not implementable is based on a view that such implementations are not sufficiently close to the concept. However actual implementations must, by definition, be practical.  Any valid concept, and certainly the application of PDI to Representation Information is a valid preservation concept, will have some type of practical implementation.   
>  
> As noted at the start of this email – Representation INFORMATION which is being preserved must have PDI – see above. 
>  
> 2.       It was also argued that the auditors have not seen PDI being applied to Representation Information, so it is OK to now take away that concept.  
>  
> Your understanding of what is being proposed is entirely wrong – see above.
>  
> Assuming this is true, and most certainly it is not wholly true for all the PDI components, it must surely be strange to suggest that the practices of a few archives should now be taken to be the basis for what is conceptually significant for preservation.  A major, and so far successful, goal of the OAIS RM has been to encourage thought about what is involved in good preservation practices.  The OAIS RM has attempted to be the ‘Gold Standard’ context for the discussion of preservation issues both at its founding and in the previous updates. This has led to the recognition of the need for improvements in implementation practices and subsequently to the ISO auditing effort.  Thus this second argument is, at best, counter to the history of OAIS RM development to this point.
>  
> Your understanding of what is being proposed is entirely wrong – see above.
>  
> It is not a matter of whether applying SC222 is correct or not. It is a matter of whether this application is an improvement for the original objectives set out for the OAIS RM, or for some new objectives. The line of thinking that suggests the ease or difficulty of some aspect of implementation, rightly or wrongly perceived, should impact the evolution of the OAIS RM is totally new.  Following this line of thinking moves one from consideration of the OAIS RM as the conceptual ‘Gold Standard’ for preservation to something less.
>  
> It seems clear, at least to this author, that the use of the OAIS RM as the framework for ISO 16363 auditing, which in this context necessarily puts much of the OAIS RM into an implementation perspective, has resulted in some individuals active in the auditing to push an implementability perspective, rightly or wrongly conceived, back into the OAIS RM itself.   As previously noted, this was never a consideration in its original development or the previous updates.  What might this mean for the evolution of the OAIS RM?
>  
> My view is that we are clarifying the model, just as we did when we introduced Transformational Information Properties. 
>  
> B. Implications for the evolution of the OAIS RM
>  
> It is understandable that the ISO auditing process wants to have as much specificity as possible to aid both the auditors and the Archives. If the OAIS RM is now going to be evolved to be more closely aligned with the detailed experience of auditors, it will logically take a narrower view of what is supported and not supported through its concepts. This will evolve with auditor experience and with technology changes and implementation practices.  This OAIS RM could no longer be considered the conceptual ‘Gold Standard’ for the discussion of preservation issues.  For example, the adoption of SC222 has just removed a very valid preservation concept, the use of PDI in helping preserve Rep. Info., from the OAIS conceptual model.  My view is that the preservation community needs to have a ‘Gold Standard’ reference model and PDI for Rep. Info. needs to be in it.
>  
> Your understanding of what is being proposed is entirely wrong – see above.
>  
> I believe that, lacking such a reference model, there may very well be competing models put forward, particularly in various disciplines, such as the library community, where there is already some concern that the OAIS RM is not always a good fit. One approach to this possibility is to clone an auditing version of OAIS that can evolve to better support the auditing while keeping a ‘Gold Standard’ OAIS that evolves more generally. Of course this would take resources that may not be available. Another approach is to take a rigorous approach to the separation of a ‘Gold Standard’ from the needs of auditing. If others have concerns along these lines, now would be a good time to speak up.  However in the grand scheme of important issues, the future of the OAIS RM is not significant.
>  
> C. Some problems with the current incorporation of SC222
>  
> If SC222 were not a major change to fundamental OAIS concepts, this could pretty well be ignored.  However as John has pointed out, this has resulted in over 200 changes from ‘Content Information’ to ‘Content Data Object’.  If I’ve counted correctly, through section 4.2 there were 33 references to the CDO and now there are about 3 times as many (97). There were about 200 references to the Content Information and now there are about 138. This has put a greatly increased focus on the preservation of the Content Data Object, which for a digital archive is mostly a matter of preserving the bits.  This is totally contrary to our past, successful, efforts to get the importance of the Representation Information more fully recognized. This has to be considered, at a minimum, ‘not helpful’ in this regard.
>  
> Your understanding of what is being proposed is entirely wrong – see above.
>  
> In addition, this change has complicated some relationships that show up in the new Terminology section.
>  
> 1.  AIC definition:  The addition of the view that an AIC must include PDI describing the collection criteria and process, along with the view that all OAISs have at least one AIC which is the collection of all its AIPs, together with the new view that PDI only applies to the CDO and NOT the Rep Info, does not form a consistent picture.   PDI describing the collection criteria and process for a collection of AIPs is not PDI applied to the CDO.
>  
> Your understanding of what is being proposed is entirely wrong – see above – PDI is needed for any Information Object that is being preserved.
>  
> Note also that this proposal, that an AIC must have this type of PDI, overlaps with the existing Collection Description that is supposed to provide this type of information.
>  
> 2. Context definition: This is supposed to apply only to the CDO and not the Content Information.  But what is really going to be documented, in many if not most cases, is the context for the Information, not just the CDO.  Generally people are not going to single out the CDO (when digital) versus the CDO + Rep. Info. when writing this information.  For example, a music performance has a relationship to other performances and this may well be documented Context.  To say this is about the CDO is not believable.  This attempt to focus on the CDO instead of the Information has introduced this contrived awkwardness and is totally unnecessary.  This results from trying to use the CDO as a handle to refer to the Content Information without actually saying so.
>  
> This point deserved further consideration.
>  
> 3.       Fixity Information: The new definition only applies to the CDO.  Apparently it is not needed for Rep. Info., but of course it must be relevant to Rep. Info. preservation.  Any reasonable Archive will take some steps to preserve its Rep. Info. from undocumented alteration.
>  
> Your understanding of what is being proposed is entirely wrong – see above – PDI is needed for any Information Object that is being preserved.
>  
> 4.       Preservation Description Information (PDI). As now proposed, it is only necessary for the preservation of the CDO.  However it clearly is needed for many types of Rep Info. as well.
>  
> Your understanding of what is being proposed is entirely wrong – see above – PDI is needed for any Information Object that is being preserved.
>  
> 5.       Provenance Information: The new definition ignores that the history of Rep. Info. can be very important to its perceived authenticity.
>  
> Your understanding of what is being proposed is entirely wrong – see above – PDI is needed for any Information Object that is being preserved.
>  
> 6. Reference Information:  The new definition, applying only to the CDO, rules out its use for external references to the Rep. Info. and to the Content Information as a whole.  The example of ISBN clearly applies to the Content Information, not just to the CDO.
>  
> Your understanding of what is being proposed is entirely wrong – see above – PDI is needed for any Information Object that is being preserved. By the way I believe that the ISBN points to the book, not its Representation Information.
>  
> 7.       Transformation:  There is a new Note added that is incorrect.  The Content Information can be changed by updating of the associated Rep. Info. without requiring a change to the CDO.  For example, a new, broader, version of a standard format may be linked to the CDO that does not alter those aspects of the CDO that are present.  Another example is a new set of Semantic Information that does not alter the CDO.  According to the new definition of PDI, there is no need to track these changes because the PDI only applies to the CDO.  This seems clearly deficient.
>  
> Your understanding of what is being proposed is entirely wrong – see above – PDI of the Data Object is needed for any Information Object that is being preserved.
>  
> Cheers-
> Don
>  
> _______________________________________________
> MOIMS-DAI mailing list
> MOIMS-DAI at mailman.ccsds.org
> https://mailman.ccsds.org/cgi-bin/mailman/listinfo/moims-dai

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Slide2.jpeg
Type: image/jpeg
Size: 185600 bytes
Desc: not available
URL: <http://mailman.ccsds.org/pipermail/moims-dai/attachments/20180827/26c13de1/attachment.jpeg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Slide3.jpeg
Type: image/jpeg
Size: 170932 bytes
Desc: not available
URL: <http://mailman.ccsds.org/pipermail/moims-dai/attachments/20180827/26c13de1/attachment-0001.jpeg>

