[Moims-dai] A Critique of draft OAIS RM version 650x0w2x1JGG20181009.doc

D or C Sawyer Sawyer at acm.org
Tue Oct 16 05:53:10 UTC 2018

Dear All,

In this critique I address three topics for consideration: Preservation Description Information, AIP fixity, and the Role of OAIS RM in Auditing.

Preservation Description Information:

The most recent version of the draft revised OAIS RM still has the problem that it has removed a long-standing preservation concept despite that concept continuing to be of critical importance.  This concept involves the roles that Fixity, Reference, Provenance, Context, and Rights Information (collectively called Preservation Description Information) play in supporting the long-term preservation of the Content Information, including both the Content Data Object and the Representation Information. The removal of Representation Information from this concept in the current draft revision is an extraordinary development and a major deviation not only from the original and long-standing information modeling in OAIS, but also from the seminal 1996 report “Preserving Digital Information” by the 21-member Task Force on Archiving of Digital Information co-chaired by Don Waters and John Garrett (a different John Garrett than our CCSDS colleague). The report can be found at:  

The terminology and information object concepts presented in the above report (hereafter Report) played a major role in the development of our more formalized AIP concept.  I believe it is useful to briefly review those information object concepts from the Report and to compare them with the original AIP concept.  Key information objects from the Report are given below, each with a very brief summary of the Report's discussion that I have extracted, followed by the corresponding OAIS concept:

content:  It consists of multiple levels of abstraction. At the lowest level of abstraction, digital information objects consist of strings of 0’s and 1’s. At higher levels of abstraction there are the issues of characters and higher organizations of layout and structure, and ultimately including the knowledge or ideas they contain.  

From this, and our collective experience, we formalized the concept of Content Information, in the digital case, as consisting of the digital Content Data Object (CDO) and its Representation Information (RepInfo). An understanding of the RepInfo and its application to the CDO is necessary to unlock the information inherent in the bits of the CDO. Corruption of bits in the CDO or corruption of bits or of understanding of the RepInfo results in corruption of the Content Information.

fixity: Addresses how the content is fixed as a discrete object.

From this, and our collective experience, we formalized Fixity Information as a mechanism applicable to the Content Information. Since the CDO and RepInfo may each be composed of many separable digital objects, the concept implies that they all need the application of Fixity.
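To make the point about separable objects concrete, here is a minimal sketch, assuming the Content Information is held as a set of named byte strings, some belonging to the CDO and some to the RepInfo. The function names, the object names, and the choice of SHA-256 are illustrative assumptions only; the OAIS RM does not prescribe any particular fixity mechanism.

```python
import hashlib


def fixity_manifest(objects: dict[str, bytes]) -> dict[str, str]:
    """Compute one SHA-256 digest per separable object (CDO or RepInfo)."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in objects.items()}


def changed_objects(objects: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return the names of objects that are missing, added, or altered
    relative to a previously recorded manifest."""
    current = fixity_manifest(objects)
    names = manifest.keys() | current.keys()
    return sorted(n for n in names if manifest.get(n) != current.get(n))


# Hypothetical Content Information: one CDO object and one RepInfo object.
content_information = {
    "cdo/report.pdf": b"%PDF-1.4 example bytes",
    "repinfo/format-description.txt": b"Description of the PDF/A format",
}
recorded = fixity_manifest(content_information)
```

In this sketch, corruption of a RepInfo object is detected in exactly the same way as corruption of a CDO object, which is the sense in which Fixity applies to the Content Information as a whole.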

reference: Information objects must have a consistent source of reference. One must be able to locate it definitely and reliably over time. URLs and URNs are some examples given.

From this, and our collective experience, we formalized the concept of Reference Information to be an identifier of the Content Information. Thus there may be any number of actual identifiers depending on the nature of the CDO and the RepInfo.

provenance: Provenance has become one of the central organizing concepts of modern archival science.  The assumption underlying the principle of provenance is that the integrity of an information object is partly embodied in tracing from where it came.  Digital archives must preserve a record of an object's origin and chain of custody, including within the archive itself.  The Report notes that the archival concern with provenance is intimately related to the notion of context as a matter of information integrity.

From this, and our collective experience, we formalized the concept of Provenance Information as the information that documents the source and history of the Content Information.  This is fully consistent with the traditional use of Provenance with non-digital materials and reflects the fact that both the digital CDO and the RepInfo are essential components whose source and history are equally relevant to assessing the authenticity of the resulting Content Information.

context:  This addresses the ways in which information objects interact with elements in the wider digital environment. The Report sees it as involving four dimensions: technical, linkage to other objects, communication, and a wider social dimension.  The technical dimension is about hardware and software dependencies. The linkage dimension is concerned with information objects that have links to other objects and how to preserve those links. The communications dimension is about how communication network features, such as physical media and bandwidth, affect features of the digital objects. Finally, the social dimension is about the purpose of the information objects, such as whether they are intended, for example, for informal or formal communication of information.

From this, and our collective experience, we formalized the concept of Context Information as the information that documents the relationships of the Content Information to its environment. While this is very broad, I believe we have treated their technical dimension using the perspectives of the RepInfo and of the hardware/software needed to display or otherwise present the information.  The linkage issue has been treated from the perspective of needing to clearly define the CDO and then the RepInfo so that the Content Information is clearly defined. I believe their issues of communications have been viewed in OAIS as external factors not directly affecting the Archive’s ability to preserve identified Content Information.  Their wider social dimension seems most relevant, as this understanding, or lack thereof, could significantly affect an understanding of the intent of the information.

The Report refers to this supporting information, collectively, as providing integrity to the preservation of the content information objects, and in our modeling that clearly includes the RepInfo.  As OAIS is a conceptual model for the purpose of communication, there is no basis for removing RepInfo from the concept of needing the supporting information discussed above unless it is unimplementable, which surely is not the case unless one demands a perfect, foolproof implementation.  Its removal leaves a clear integrity hole in the OAIS concepts. Therefore I believe it would be wise to fill this hole by returning to the original concept with Preservation Description Information applicable to the Content Information.

AIP Fixity:

Neither the current OAIS RM nor the draft revision addresses the concept of applying fixity mechanisms to the AIP itself. I recommend augmenting the information modeling to recognize the utility of applying fixity mechanisms to the AIP as a whole, but not including the associated Descriptive Information. By inheritance this would facilitate communication about applying fixity to the Packaging Information, and to the components in the AIP that do not currently have associated fixity: Provenance Information, Reference Information, Context Information, Rights Information, and even Fixity Information. I believe this would be widely understood as a conceptual improvement in the integrity of the AIP.
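As a rough sketch of what AIP-level fixity could look like in one implementation, assume an AIP is represented as a mapping from component name to serialized bytes. The component names, the `aip_fixity` function, and the use of SHA-256 are all hypothetical choices made for illustration; they are not drawn from the OAIS RM or from any particular Archive.

```python
import hashlib


def aip_fixity(aip: dict[str, bytes],
               exclude: tuple[str, ...] = ("descriptive_information",)) -> str:
    """Compute a single SHA-256 digest over every AIP component except
    those named in `exclude` (by default, the Descriptive Information)."""
    digest = hashlib.sha256()
    for name in sorted(aip):  # fixed order so the digest is reproducible
        if name in exclude:
            continue
        encoded_name = name.encode("utf-8")
        data = aip[name]
        # Length prefixes keep component boundaries unambiguous.
        digest.update(len(encoded_name).to_bytes(8, "big"))
        digest.update(encoded_name)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


# Hypothetical AIP with placeholder component contents.
example_aip = {
    "content_information": b"CDO and RepInfo bytes",
    "fixity_information": b"per-object digests",
    "provenance_information": b"source and custody history",
    "reference_information": b"identifiers",
    "context_information": b"relationships to environment",
    "rights_information": b"access rights",
    "packaging_information": b"package layout",
    "descriptive_information": b"finding aids",
}
baseline = aip_fixity(example_aip)
```

Under these assumptions, a change to the Provenance Information or the Packaging Information alters the AIP digest, while a change to the Descriptive Information does not, which matches the scope recommended above.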

Role of OAIS RM in Auditing:

The advent of the ISO auditing process quite naturally makes use of the concepts and terminology in the OAIS RM.  However, it seems clear that the proper place to address constraints on Archive implementations is in auditing documents, not in a conceptual communication model such as the OAIS RM.  I believe the OAIS RM should continue as a conceptual framework by which to discuss preservation, and particularly digital preservation, issues and implementations that are seen to be of general interest. It cannot be expected to address the details of specific implementations but should facilitate communication by allowing the context of a discussion to be narrowed.

For example, consider an Archive that will preserve a number of historical works of fiction expressed in a standard format such as PDF/A. Such an Archive may reasonably argue, for some such works, that it does not need to maintain additional Context Information for these documents because the Provenance Information already provides sufficient context (Provenance is a particular type of Context, as the Report also recognized) and Consumers might easily find additional Context Information in the historical record, should it be desired. I believe an auditor would be wise to consider such a statement as a possible exception to providing explicit Context Information.  This example makes the case that the auditing requirements should not be rigidly tied to every aspect of the conceptual communications framework.  Some flexibility is needed to meet the vast range of implementation circumstances.

Further, I believe the best way to ensure a robust future for the auditing effort is to ensure that each auditing requirement makes clear what intent is to be achieved, and that the intent is one that the Archive, and therefore the auditor, can see has value.

