[Moims-dai] Todays Telecon: PDI on PDI
david at giaretta.org
Tue Jun 26 12:20:26 UTC 2018
I’m afraid that I will not be able to join the Webex today. Therefore, although direct conversation is the best way to resolve issues, I think I must use this email to correct some points that Don has made. He seems to have missed some essential points.
However, before doing that: at the last Webex Eld mentioned that some people on the mailing list were a little disturbed by the tone of the exchanges. Therefore, I should start by offering some reassurance - people should not worry; remember that many of us have been working together on these standards for decades and we know each other very well – well enough to express our views very forcefully without causing offence. Indeed, we have generally found it a very effective way to get to the heart of issues!
That said – let us have at it.
Perhaps it would be clearer to look at some specific claims Don makes:
* Don writes “There is to be no association of PDI with the Representation Information of the Content Information.”
* Incorrect – my original and continued intention is to make it abundantly clear that Representation Information, and indeed the components of PDI itself, are to be preserved and therefore explicitly need all the “metadata” needed for preservation. I had originally hoped to simply make it clear that each of these components could be considered as “Content Information”. Mark and Don objected to this, and Mark clarified that his view arises because what comes from an “external” Producer should be distinguished from the Information Objects created by the OAIS in order to preserve successfully. Yet clearly the Representation Information and PDI need to be preserved, and so the consistent way to do this would be to regard each of those Information Objects as Content Information for the purposes of their own preservation. To meet the point about being clear what the “external” Producer gives to the OAIS, I agreed with Mark’s suggested additional text. Clearly the Data Management Functional Entity would also need to allow the OAIS to distinguish between that “external” content and “internally” produced objects, just as it presumably supports the distinction between AIP Versions etc. There are lots of details which OAIS omits, simply because they are too detailed.
* Why make the change suggested in SC222? The reason is not because of some mis-apprehension over the past 25 years; the current version is entirely consistent – but the point of the update is to improve the standard. In particular since we are now in a position to look at the implications for ISO 16363 and the audit process, it is clear that we need to offer a way to make it abundantly clear how the components of an AIP are themselves to be preserved, and the sorts of evidence an auditor can seek. Making the change proposed is meant to give a way to make this possible. Given Don’s comments, perhaps we do need to add an Annex to explain that Representation Information and PDI are to be preserved using all the Information Model concepts.
* The proposed change does not, as far as I can ascertain, cause any implementation issues for existing repositories since they seem to focus on individual files, for example when calculating the Fixity.
* Don also writes “Perhaps the response will be something like ‘but we will say in the text that some aspects of PDI will be important for Representation Information’.”
* To even imagine that I would say that would be to miss entirely the point at issue, which is that the components needed for preservation, such as Representation Information and Provenance, themselves need to be preserved. In other words they need to be (logically) contained in their own, full, AIPs. It is true that then the natural question arises about how the recursion of AIPs ends – I provided examples in earlier emails.
* I apologise if I have been unclear about this, but I really do think I have reiterated this point in pretty well every email in this exchange.
* Don also wrote “Again, this is all about having a common framework for discussion and not for implementation.”
* Yes, but there is always a balance to be struck. For example, why did we introduce the terms Representation Information and the various components of PDI – why not just use the term “metadata”? Specifying the components does not specify an implementation but instead gives a greater granularity to the terminology, in part so that one can ask, and answer, the question – do I have the “right types” of “metadata” and enough of each type? In the case of SC222 we are essentially providing additional concepts (actually re-using concepts) to make it possible to ask, and answer, the question “how does the repository preserve the Representation Information, PDI and so forth, that it needs to preserve Content Information”?
* Similarly, why did we bother to provide a Functional Model, with all its details? Did we include it to define an implementation? No! It was to provide an answer to questions like “What kinds of functionality should an OAIS provide?”.
* The re-use of concepts in this case is entirely consistent with the re-use of concepts in many aspects of OAIS.
* Don concludes “adopting SC222 will not allow the maintenance of the OAIS RM as a conceptual model intended to be a broadly applicable framework for the purpose of discussion”
* I believe that my responses above and in previous emails show that this conclusion is incorrect and arises from missing the essence of the proposed change.
I have provided detailed responses and examples in response to Don’s comments throughout these email exchanges. I cannot see the same level of explanation of his views in Don’s emails, despite their length; rather Don re-iterates what is in OAIS right now rather than engaging with the potential of the new ideas. The distinction he seems to have between modelling and implementation seems not to recognise what OAIS modelling is there for, namely to answer important questions so that implementations can be guided as well as checked – hence the relationship to ISO 16363.
From: MOIMS-DAI <moims-dai-bounces at mailman.ccsds.org> On Behalf Of Sawyer at acm.org
Sent: 26 June 2018 04:25
To: MOIMS DAI List <moims-dai at mailman.ccsds.org>
Subject: Re: [Moims-dai] Todays Telecon: PDI on PDI
Perhaps I can describe my issue with SC222 more succinctly and hopefully more clearly, and perhaps it can be discussed at the DAI telecon tomorrow.
The OAIS RM was developed as a communication aid to facilitate discussion among a diverse audience who need to be concerned with the preservation of information that has been, or will be, encoded into digital form. The OAIS RM information modeling was chosen to be as broadly applicable as possible so as to cover all likely situations at a general (high) level and thus provide a starting point to discuss archival issues and to compare and contrast implementations. Of course the concepts would not be able to be both broadly applicable and also describe the details of implementations. Cal Lee did his PhD thesis on the development of the OAIS RM and I again recommend his related paper, “Open Archival Information System (OAIS) Reference Model” <https://ils.unc.edu/callee/p4020-lee.pdf>. The OAIS RM modeling has been successful beyond anyone’s expectations. During the previous 5-year review, a few additional concepts were added to cover new topics.
During this current 5-year review there is now a proposal to narrow the Information Package (IP) concept which is the most fundamental OAIS RM information modeling concept apart from the Information Object (Data Object + Representation Information). The IP, when it appears as the AIP in Archival Storage, has all the components of the IP. It is simply a package with Content Information and associated PDI (Context, Provenance, Rights, Reference and Fixity Information), and the package is associated with Descriptive Information that is used to support discovery of the package. The current proposal (SC222) proposes to narrow this concept by restricting the PDI to only be associated with the Content Data Object of the Content Information. There is to be no association of PDI with the Representation Information of the Content Information. Making this change to this fundamental model says that the OAIS RM no longer believes that any archivist or any Archive implementation need be concerned with the Context surrounding any of the Representation components, nor the Authenticity of any of the components, which includes the structural information essential for decoding and, in some cases, semantic information that is critical for understanding. Clearly this can not be true (except possibly for a few selected implementations, but even those need at least Fixity) and I don’t believe any of the current participants think that this reflects reality (after all, we’ve been discussing PDI applied to PDI, so it must be even more important for Representation Information), but this is what the narrowed IP concept says when it appears as an AIP in Archival Storage.
Perhaps the response will be something like ‘but we will say in the text that some aspects of PDI will be important for Representation Information’. Clearly the AIP model will be saying otherwise. And if it is still important, then why has it been removed from the model? Is it to be added somewhere else? But this AIP is THE fundamental model for Content Information preservation and it is supposed to conceptually cover all cases. NOT all cases as implemented, but ALL cases as may exist for the purpose of discussion. Perhaps the problem creating this disconnect is that some are no longer viewing this model as a framework for discussion (i.e., conceptual) and are now thinking (possibly subconsciously) that it is a framework for implementation and THEREFORE all implementations must have all the components - which is not realistic and not what the conceptual modeling is about. Recall that, for the functional modeling, we say that actual implementation may break out functionality differently. Again, this is all about having a common framework for discussion and not for implementation. If there is a desire for an implementation architecture, this should be a separate document. However I can’t imagine there could ever be just one such architecture.
I conclude that adopting SC222 will not allow the maintenance of the OAIS RM as a conceptual model intended to be a broadly applicable framework for the purpose of discussion. I believe this would be a very significant loss.
On Jun 19, 2018, at 11:20 PM, D or C Sawyer <Sawyer at acm.org> wrote:
Hi John, et al.,
Thanks for the response. I think I’ve laid out a reasonably clear summary view of my issue (conceptual vs implementation in the OAIS RM update) in the text (5 short paragraphs) before my signature. However it may be clear to me but not to others and so I’d appreciate hearing about anything in those 5 paragraphs that does not seem clear. I think this would save time prior to any webex session.
The rest of the text with comments on comments are details, apart from the concluding example of PDI applied to Content Information. Of course I’m happy to respond to the need for clarifications for any of my text.
On Jun 19, 2018, at 1:22 AM, John Garrett <garrett at his.com> wrote:
Hi Don (and others),
OK. Have a safe trip.
I think we should probably hold the discussions of this then until hopefully we can get everyone together at once.
I don’t think we’re all that far apart, but there has been so much back and forth that the discussion is becoming hard to follow. I think some of us are misunderstanding the proposals from the others, but it may take a bit of discussion to tease that out. I know that some of my responses have been inexact in the language I used and I see how that has caused complications. A piece of that is because we don’t have terms for PDI-like information that gets recursively applied to PDI Information Objects.
Hopefully a short discussion during a webex will clear this up more easily than a series of ever-expanding email messages.
With a number of these remaining issues, we will need to determine if we can come to a consensus or if there is too much disagreement. If a consensus cannot be reached, we may have to decide what portion we can get consensus on and how soon we can get agreement and agree that is what goes into this issue of OAIS. We can keep a record of the issue and immediately start working it for a following issue of OAIS (we don’t need to wait 5 years to put out a new issue) (but we should try to get out updates more often than every 10 years).
Peace and joy,
From: MOIMS-DAI [mailto:moims-dai-bounces at mailman.ccsds.org] On Behalf Of D or C Sawyer
Sent: Monday, June 18, 2018 2:31 PM
To: MOIMS DAI List <moims-dai at mailman.ccsds.org>
Subject: Re: [Moims-dai] Todays Telecon: PDI on PDI
(Unfortunately I will be unavailable to participate in tomorrow’s DAI telecon as I’ll be traveling to Canada)
Dear David, John, et al.,
I believe the current OAIS RM review has a serious flaw. The comments below from David and John, who are both good friends and colleagues and have been for long before we started the OAIS RM, now make clear that we have a significant disconnect that I can’t ignore. For those who may not be familiar with the history of the OAIS RM, we started it to provide a common communication framework within which to describe the preservation of digital information. (For an independent study of its original development, see Cal Lee’s 2009 paper, “Open Archival Information System (OAIS) Reference Model” <https://ils.unc.edu/callee/p4020-lee.pdf>.) The lack of a common vocabulary and communication framework at that time is why we developed it as a conceptual model and why we required (and still do) so little regarding information modeling for an Archive to call itself an “OAIS”. Although we say this very clearly, the extensive modeling of both information and function has led many to interpret it as an architecture for implementation, not just for communication.
The current 5-year review process for the OAIS RM is the first one where I find some active participants applying implementation perspectives to alter communication of long-standing archival concepts, as evidenced by David and John’s comments below that are endorsing this approach to the original, long standing, Information Package concept. The problem with this approach is that implementations are specific and to an extent unique while archival concepts are expected to be broadly relevant and informative regardless of implementations. Concepts are to be evaluated at the conceptual level, such as whether they are still valid, fit in with other related concepts, and are clearly expressed. Using a few implementations to narrow a concept is virtually certain to leave out other implementations and is thus illogical. This is particularly of concern with the OAIS RM Information Package concept, one of only two information concepts (the other being the Information Object) that an Archive needs to ‘support’ to be considered an ‘OAIS’. In this context, ‘support’ means that the Archive can use the concept to discuss its implementations and can do so in a way consistent with the meaning of the concept. It can’t prescribe any features of an implementation because it is only a concept. I discuss this in more detail below in response to David and John’s comments. I also discuss and refute David’s implication that applying PDI to Representation Information is not implementable, or not reasonably implementable.
In conclusion, I find the proposal (SC222) to modify the Information Package concept, so as to exclude the understanding that PDI could be relevant to Representation Information, to be contrary to valid and widely acknowledged archival concepts and also inconsistent with OAIS RM statements on the role of PDI (Context, Provenance, Reference, Rights, Fixity) in the preservation of Content Information (and by extension any information to be preserved) within an Archive. I don’t know anyone who I think believes that PDI is not important to the preservation of its associated information.
Therefore I call for SC222 to be revisited and rejected. It may be necessary to revisit other SCs if the response to those SCs has not kept the proper perspective between concept and implementation. I would be very uncomfortable with an OAIS RM that mixes Conceptual and Implementation models while calling itself a Conceptual Model. It would open the OAIS RM to significant, and I feel valid, future community criticism.
Responses to John’s and David’s specific comments are included below, ending with an example implementation of PDI application to Content Information, with recursion.
On Jun 12, 2018, at 8:53 AM, David Giaretta <david at giaretta.org> wrote:
From: MOIMS-DAI <moims-dai-bounces at mailman.ccsds.org> On Behalf Of John Garrett
Sent: 12 June 2018 06:39
To: 'MOIMS-Data Archive Interoperability' <moims-dai at mailman.ccsds.org>
Subject: Re: [Moims-dai] Todays Telecon: PDI on PDI
I think we are converging on an understanding of preservation of PDI objects.
I still support SC#222 which we had previously agreed on. Some comments on Don’s comments below.
Peace and joy,
From: MOIMS-DAI [mailto:moims-dai-bounces at mailman.ccsds.org] On Behalf Of D or C Sawyer
Sent: Monday, June 11, 2018 11:57 PM
To: MOIMS DAI List <moims-dai at mailman.ccsds.org>
Subject: Re: [Moims-dai] Todays Telecon: PDI on PDI
I’m pleased that you like the ‘Preserved Data Object’ modeling view (not surprised as it is much of what you’ve been pushing), but I can’t agree to SC222 for the reasons given below. I think my proposal below combines the best of both and should eliminate the Content Information controversy discussed in the last 2 telecons.
On Jun 9, 2018, at 4:47 PM, David Giaretta <david at giaretta.org> wrote:
You almost have it but, to be logically consistent, it needs another step.
The “Preserved Data Object (PDO)” term would be very useful, especially as an intermediate term between the Information Object and AIP.
Yes, I think I expressed this when I mentioned that successive application should lead to an AIP, except that it doesn’t include the Packaging Information and the Descriptive Information which are an important part of our defined AIP.
At first look, I like the idea of PDO. But of course will need more time to think it through.
One concern I have is that there could be lots of places where the concept could be used and it could cascade at this late date into many needed updates in OAIS. So I will reserve my agreement until I can get an idea of how many updates are needed and whether and how quickly we can finish them. Our deadline for completing OAIS was our last CCSDS meeting.
Good point John
I was not proposing to discuss PDO throughout the document, just in an Annex to satisfy David’s desire to have a recursive Data Object view available.
“The only provision I would add is that both the Rep. Info. and the PDI would need to be optional in the sense that a ‘naked’ or ‘partially dressed’ Data Object needs to be allowed in order to stop the recursion. Of course the stopping criteria are different between the Rep. Info. and the PDI.”
Yes, that is why in the Information Object diagram the RepInfo recursion shows 1 to * since * can be anything from 0 upwards. The link to PDI is 1:1 currently but we would have to change this to 1:*. We would need some text to discuss how the recursion ends.
I fully agree.
I disagree. I think an Information Package always requires PDI. Otherwise what is the difference between an Information Object and an Information Package?
John’s comment is missing the context that the discussion is about the AIP, not the more general Information Package. Further, the Information Object concept and the Information Package concept, as defined in Section 2.2, are generic CONCEPTs defined clearly in the OAIS RM to have all the components that are clearly visible. There is no debate about what is included and the Information Object and Information Package CONCEPTS are always distinct. However for an Archive implementation to be an OAIS, we say it must SUPPORT this Information Object concept and Information Package concept. It requires nothing more regarding information modeling. What does this SUPPORT mean? As regards the Information Object, it certainly can be satisfied as long as the Archive recognizes that one of its Data Objects together with its Representation Information is an Information Object. It can’t use Information Object to mean something different. As regards Information Package, the exact same type of recognition satisfies this requirement. It can’t use ‘Information Package’ to mean something different. We kept these requirements to a minimum so as not to constrain implementations. Note that the OAIS RM does not actually use the Information Package concept of Section 2.2 as given, but in Section 2.3 it introduces specialized forms as SIPs, AIPs, and DIPs. When the modeling shows it as a SIP, it may or may not have ALL the components of the Section 2.2 Information Package and also may not be an Information Object. When it is an AIP it MUST have all the components and then it WILL BE an Information Object because it has a Data Object and its Representation Information. When it is a DIP it may not have all the components and may not be an Information Object. These are all CONCEPTS and not implementations. Actual OAIS implementations may or may not follow these concepts exactly. Note that NONE of the more detailed information modeling in Section 4.2 is REQUIRED to be ‘supported’ by an OAIS.
Of course we hope that OAIS implementations will closely follow the SIP, AIP, and DIP models and thus be able to easily address their implementations using these terms. We also hope that the information modeling in Section 4.2 will also be adopted for describing their implementations. We do NOT say that implementations must treat the information modeling as an ARCHITECTURE to be followed, although in many cases this can work. This fact makes it easy for people to slip from concept to implementation views when reviewing the concepts.
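As a small illustration of the conceptual statements above (and only an illustration, since the point being made is that these are concepts and not implementation requirements), the following Python sketch records which components each specialized package form is expected to have at the conceptual level. The four component names come from the Section 2.2 description; the enum, the function name and the idea of a "missing components" check are assumptions made for the example and are not defined anywhere in the OAIS RM.

```python
from enum import Enum
from typing import Iterable, Set

# The four basic information types named for the Section 2.2 Information Package.
IP_COMPONENTS = frozenset({
    "Content Information",
    "PDI",
    "Packaging Information",
    "Descriptive Information",
})


class PackageKind(Enum):
    SIP = "SIP"
    AIP = "AIP"
    DIP = "DIP"


def missing_for_concept(kind: PackageKind, present: Iterable[str]) -> Set[str]:
    """Return the components the CONCEPT expects but the package lacks.

    Hypothetical helper: only the AIP concept is modeled as having all
    components; a SIP or DIP may have fewer, so nothing is reported for them.
    """
    if kind is PackageKind.AIP:
        return set(IP_COMPONENTS) - set(present)
    return set()


# A package with only Content Information and PDI, viewed under each concept.
partial = ["Content Information", "PDI"]
for kind in PackageKind:
    print(kind.value, sorted(missing_for_concept(kind, partial)))
```

Note that a non-empty result for an AIP says only that the package does not match the concept as modeled; per the discussion above, it says nothing about whether an actual implementation is acceptable.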
I was making the analogy with Representation Information – the “0” would only apply where the iteration ends, just as with Representation Information. But it is an interesting point about needing to keep the difference between Info Object and Info Package. For RepInfo the same would apply at the “0” where Data Object and Info Object become indistinguishable. We certainly need to make sure the meaning of “0” is clear.
As noted above, the Information Object and Information Package are ALWAYS unique concepts. I have no problem adding a conceptual model showing PDI recursion. This is not ruled out by the current modeling but it seems useful to make this recursion view explicit. An actual implementation may find many of the PDI components not necessary, apart from Fixity.
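To make the recursion under discussion concrete, here is a minimal Python sketch. It is offered only as an illustration and is not part of the OAIS RM or of any proposal in this thread. It assumes a hypothetical PreservedObject class in which both Representation Information and PDI have multiplicity 0..*, so that an object with neither is the ‘naked’ Data Object where the recursion stops. All class, field and function names are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PreservedObject:
    """Hypothetical stand-in for a Data Object plus what is attached to preserve it."""
    name: str
    # 0..* Representation Information objects, each preservable in its own right.
    rep_info: List["PreservedObject"] = field(default_factory=list)
    # 0..* PDI components (Fixity, Provenance, ...), also preservable objects.
    pdi: List["PreservedObject"] = field(default_factory=list)

    def is_naked(self) -> bool:
        """True where the recursion ends: nothing further is attached."""
        return not self.rep_info and not self.pdi


def depth(obj: PreservedObject) -> int:
    """Number of recursion levels below obj (0 for a 'naked' object)."""
    children = obj.rep_info + obj.pdi
    if not children:
        return 0
    return 1 + max(depth(child) for child in children)


def count_objects(obj: PreservedObject) -> int:
    """Total number of objects reachable from obj, including obj itself."""
    return 1 + sum(count_objects(child) for child in obj.rep_info + obj.pdi)


# A format specification kept only with Fixity; the Fixity itself is 'naked'.
fixity = PreservedObject("digest of format spec")
spec = PreservedObject("format spec", pdi=[fixity])
provenance = PreservedObject("provenance record")
cdo = PreservedObject("image file", rep_info=[spec], pdi=[provenance])

print(depth(cdo), count_objects(cdo))  # prints: 2 4
```

In this sketch the stopping point is simply an empty list, which is the "0" case discussed above. Where the recursion stops for Representation Information and where it stops for PDI remain separate decisions, since the two lists are independent.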
So far so good.
But, in order to actually preserve something, we really do need all the components of an AIP i.e. we need to know how the Data Object, RepInfo and PDI are connected together, and have some overall description. In other words, whatever you want to call it, it is an AIP.
Yes, we need the full AIP and the current modeling provides this, apart from showing PDI can be recursive, which as you note could easily be added.
I don’t agree that we need all the components of an AIP in each and every instance to preserve something. It is definitely useful to have it all, but it is not necessary in every situation for every Information Object. As we’ve all agreed, we need to break the recursion at some point by not having all of PDI for everything.
You are talking about an IMPLEMENTATION. The conceptual model says an AIP has ALL the components. If you can make the case that your AIP implementation doesn’t need all the components, then fine. The current AIP, based on an IP, strictly speaking says nothing about recursion of either the Representation Information or the PDI. However they both can be recursive and implementations will deal with this as they see fit. They may or may not pass an audit but this is a totally different issue. This confusion between the CONCEPTUAL model and an IMPLEMENTATION model, as regards what should appear in the OAIS RM, is the fundamental underlying issue that allows the SC222 proposal.
I think we need to carefully document the “edge” cases and distinguish these from the more general case where all the AIP components are needed.
Again, this is a valid consideration for an implementation but has NO bearing on the conceptual model. Based on the OAIS RM, an AIP implementation does not need to have ALL the components even in the general case (i.e., because the AIP concept is not addressing implementations). This could be made more clear in the OAIS RM. This says nothing about what may be required in an audit.
In terms of an Annex – that would be helpful to contain the bulk of the discussion and examples. If we move the discussion to a Normative Annex, then the only change we need is that agreed in http://review.oais.info/show_bug.cgi?id=222 but there would have to be a small amount of additional explanatory text in the body of the standard.
I cannot agree with SC222. I note that SC222 is titled “Change PDI to be describing CDO rather than Content Information”. This is stated to be a ‘significant’ change and I would say it is a radical, unnecessary, and counterproductive change, for the following reasons:
I on the other hand feel it is a necessary change. I think conceptually the change makes more sense than the current situation.
I also think it a necessary change.
I believe I’ve clarified that not only is it not necessary for the conceptual modeling, but to do so reduces the generality of what is now a perfectly valid and widely accepted archival concept (Information Package) that is one of the two most fundamental information concepts in the OAIS RM.
1. SC222 provides the following rationale: “While discussing other SCs, we are often confronted with situations where applying PDI to Representation Information raises significant problems. I think this change may make it easier to resolve some of the other SCs.”
This statement clearly states that discussions leading to this proposal were concerned with implementations of particular views. Further text discusses issues with possible approaches to applying the Fixity component of PDI to Representation Information and to the combined CDO with Representation Information. All of the text clearly expresses concerns with a set of implementation approaches and therefore should immediately be suspect as a rationale for changes. In fact it appears that not only were the approaches being considered too rigid, but the view of the information models was also too rigid because there are implementation approaches that can work. The OAIS RM presents CONCEPTUAL MODELS and not IMPLEMENTATION MODELS. The OAIS RM was generated to provide a common framework of terms and concepts to facilitate communication. This is easy to forget as it can easily be viewed, and often has been, as an implementation model. For example, Section 2.2 is stated to contain the only information models that an OAIS implementation needs to support in order to be an OAIS. But what do we say in 2.2, and specifically regarding Figure 2-3, in this context? We take a top-down approach and give a very high level (i.e., with little detail) view of an Information Package involving four basic types of information (Content Information, PDI, Packaging Information, and Descriptive Information) and their relationships. We show Packaging Information as a simple container with a small divider, one side holding Content Information and the other PDI, and we state that PDI is needed to preserve the Content Information, to ensure it is clearly identified, and to understand the environment in which it was created. We show Descriptive Information as being associated with this Information Package to facilitate finding the Content Information of interest.
I think the Conceptual Model was designed to be and should reflect real implementations.
This was never the case and should never be the case, as I’ve pointed out. The conceptual modeling in the OAIS RM was based on the insight of those with archival experience, recognizing deficiencies and trying to apply good archival concepts to digital information. That is why we were fortunate to have the participation of Bruce Ambacher from the US National Archives and why the result gained the approval of Ken Thibodeau, then head of the US National Archives. The archival concepts we’ve included have been thoroughly reviewed and accepted by both digital and non-digital archivists. We expected future implementations to adopt the concepts that make sense in their particular situations. The auditing of digital archives may take a more rigid perspective, but that is not the function of the OAIS RM. I think the statement above probably reflects digital auditing concerns.
I agree with John – the Conceptual Model should be implementable and real implementations should be mappable to the Conceptual Model if the implementation conforms.
Almost: An Architectural model should be implementable and real implementation should be mappable to the Architectural model if the implementation conforms. The Conceptual model is at a more abstract level and should be useful as a framework for discussing and comparing concepts, architectural models and implementations.
Note what it does NOT say. It does NOT say how PDI is supposed to be related to Content Information, which is defined to be the Content Data Object and its Representation Information. It could be implemented as only applied to the Content Data Object (as SC222 proposes), it could also be implemented as applied individually to the Content Data Object and the Representation Information. For example, a registry of Representation Information objects must perform preservation and must be concerned with source, version history, and fixity (i.e., PDI) and therefore needs to maintain some level of PDI. One would expect PDI applied to a Content Data Object and to differ from that applied to its Representation Information. There is no OAIS requirement other than being able to describe an OAIS implementation as using these high level concepts, and this can be done regardless of the implementation approach as long as these information types can be identified. On the other hand, adopting SC222 says that a conforming OAIS is expected to be able to relate PDI to its Content Data Object, AND there is no reason to expect any PDI to be applied to the Representation Information. This is limiting the concept of PDI applicability that is contrary to actual implementations (e.g., Representation Information Registries) while such a limitation is not there now and thus would be a step backward.
I think this is stretching understanding of normal readers. I think a normal person would interrupt a statement that PDI is associated with Content Information to mean that PDI applied to the total Content Information and was not applied sometimes to part of the Content Information and sometimes to all of it. And if I can apply PDI to only part of the Content Information, why can’t I just provide PDI for the Representation Information and not the CDO?
Certainly a ‘normal reader’ would understand a statement that ‘PDI associated with Content Information’ to mean that PDI applies to the whole Content Information AT THE CONCEPTUAL LEVEL. This is based on a fundamental and widely accepted archival concept. But how implementations deal with this concept is a different matter. An implementation could apply PDI only to the Representation Information, but this would seem hard to justify and would seem unlikely to pass a creditable audit. Since both the CDO and the Representation Information need preservation, I would apply it to both except where local circumstances suggest otherwise.
Also, changing the relationship of PDI to only the CDO instead of the whole Content Information does not mean that we can’t have PDI-like information applied to Representation Information. We’ve just been agreeing that we can preserve and apply PDI-like information to the components of PDI. What is the difference with doing the same thing for RepInfo?
The underlying question is “How does the OAIS preserve the PDI and the Representation Information?”
An actual OAIS will apply PDI to the Content Information to the extent that makes sense in its particular domain and for the specific Content Information under consideration. This can include applying PDI (some or all of the components) to the Content Data Object and separately to the Representation Information, again to the extent it makes sense. Since some seem to think this can’t be done, I’m providing a brief summary below of an approach I might use if I were designing such an implementation.
2. SC222 proposes that PDI is to be associated only with the Content Data Object. The only way this can be reasonable (given that clearly some Representation Information will have PDI), is to view Content Information and its Content Data Object as ANY Information Object and ANY Data Object.
I missed last week so I guess I’m missing something here. I don’t understand the problem here.
I think what I was saying was that any information that the archive wants/needs to preserve should be viewable as Content Information.
I think that transforming Content Information into a label that would also apply separately to Representation Information and separately to PDI is highly confusing and therefore unproductive, to say the least. That you want to see a recursive view is why I offered the Preserved Data Object conceptual model. However, it cannot take the place of Content Information, which is an extremely useful concept that would be lost if turned into a label that could be applied to any information that the archive wants/needs to preserve.
This would be a very major and clearly controversial (as per your proposal and 2 recent telecons) revision to what almost everyone understands Content Information to be. It is defined as the ‘original target of preservation’, and widely understood to refer to the primary information that external providers are submitting to the OAIS for preservation. I believe it would be a major step backward to lose the ability to clearly refer to this information category.
I think that in most cases the Producers (and the DC) think the “target of preservation” is the CDO (not the CDO as well as the RepInfo). They expect RepInfo in order to understand the CDO, but I don’t think most people are really considering that to be the target of the preservation.
As an example of this: if you migrate the CDO from a particular format (which has an associated standards document describing that base format) to a new, more modern format (which has an associated standards document describing that base format), I don’t think most people would care whether the standards document (or maybe the software decoding it) describing the original format is preserved.
Again, the above comment reflects a particular implementation perspective and cannot refute the obvious fact that information that has been encoded is useless unless the decoding information (e.g., Representation Information) is available. For your example, the decoding information is being preserved elsewhere - e.g., by some standards organization. The reality is not up for debate - it is just a question of who is doing the preservation of the Representation Information and whether it remains adequate. We know standards evolve and, for example, there are multiple versions of PDF/A in use. An archive relying on PDF/A should know which versions it is using and where, and should maintain appropriate associated PDI.
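A minimal sketch of the kind of bookkeeping meant by "know which versions it is using and where". The object identifiers, the version assignments, and the function name are all hypothetical and serve only as illustration; only the existence of several PDF/A versions comes from the discussion.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical holdings: (object identifier, PDF/A version recorded in its PDI).
HOLDINGS: List[Tuple[str, str]] = [
    ("RP1", "PDF/A-1b"),
    ("RP2", "PDF/A-1b"),
    ("ENGLISH-CRITERIA", "PDF/A-2b"),
]


def objects_by_pdfa_version(holdings: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group object identifiers by the PDF/A version they depend on."""
    index: Dict[str, List[str]] = defaultdict(list)
    for object_id, version in holdings:
        index[version].append(object_id)
    return dict(index)


if __name__ == "__main__":
    for version, object_ids in objects_by_pdfa_version(HOLDINGS).items():
        print(version, object_ids)
```

With such an index, an archive can answer "which objects are affected if this PDF/A version needs attention" without inspecting every file.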
3. As you’ve agreed, there is utility in a model that takes a Data Object-centric approach to its preservation, as I described below and which I called a “Preserved Data Object” model. Adding this to an Annex with a proper discussion, and as normative if you’d like, together with maintaining the current information model views augmented with making PDI recursive, provides retention of all the good and historic OAIS information modeling and terminology while allowing for the addition of this alternative view for an enhanced perspective. It is usually productive to have more than one way to view a topic. I see this as a win-win situation and it should remove the Content Information controversy.
As I wrote previously, what we need for logical consistency is SC222 as agreed previously, and a normative annex to explain the issue more fully.
I believe I have logically and practically refuted SC222, and the only reason I can think of for these implementation concerns having intruded is that several participants are very active in the auditing efforts.
EXAMPLE IMPLEMENTATION: PDI APPLICATION TO CONTENT INFORMATION
A couple of participants have taken the view that the application of PDI to Content Information is not implementable or is impractical, despite arguing that conceptually PDI is important for all information being preserved. If it is a valid concept, which it surely is, then some form of it must be implementable as long as one takes a reasonable and practical approach as any Archive must do. Perhaps it is the potentially recursive nature of Representation Information and PDI, along with the possibility that PDI may need Representation Information, etc. that has some concerned. Real implementations, apart from conceptual applications that seem to never end, will deal with these recursions in practical ways. Here I offer one approach, part of which was implemented at the former National Space Science Data Center and managed by John, now augmented to include PDI and its potential recursion in a practical fashion.
The Content Information of interest is a set of NASA satellite observations of local magnetic field values during a period of one minute, each minute. It has been provided to the OAIS as a set of digital files (Content Data Objects, or CDOs) along with a Representation Information component (RP1) addressing the format of a CDO file and a component (RP2) addressing the meaning of the structural fields identified in RP1. These components are in PDF/A format and contain English descriptions where the English criteria, as Archive policy, are also described in a PDF/A file. PDF/A hardware and software are included to terminate representation networks. The CDOs and the Representation Information components all have associated PDI files, apart from the English criteria file, for which PDI is judged to be unnecessary. The PDI files contain their own PDI components of Context, Provenance, and Rights information, and are thus ‘self-recursive’. Fixity checksums are maintained in associated databases, one for the CDOs and one for the Representation and PDI information. All files have object identifiers in their file names. A schematic of the objects, object identifiers, and the relationships is shown in the PDF file attached below this text. The resulting scheme shows a set of AIP components (but without all the Packaging Information), with PDI applicable to the CDOs, Representation Information, and the PDI itself.
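The scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used at the National Space Science Data Center: the identifiers (CDO-0001, PDI-CDO, and so on), the payloads, the use of SHA-256, and all function names are hypothetical. It shows PDI attached separately to a CDO and to each Representation Information component, PDI files that describe themselves so the recursion ends, and two fixity databases.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ArchivedObject:
    object_id: str          # hypothetical identifier, as carried in the file name
    kind: str               # "CDO", "RepInfo", or "PDI"
    payload: bytes          # stand-in for the file contents
    pdi_id: Optional[str]   # identifier of the PDI file describing this object, if any


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_archive() -> Dict[str, ArchivedObject]:
    objects = [
        ArchivedObject("CDO-0001", "CDO", b"one-minute magnetic field values", "PDI-CDO"),
        ArchivedObject("RP1", "RepInfo", b"format of a CDO file", "PDI-RP1"),
        ArchivedObject("RP2", "RepInfo", b"meaning of the fields named in RP1", "PDI-RP2"),
        # The English criteria file is judged not to need PDI.
        ArchivedObject("ENGLISH", "RepInfo", b"English criteria", None),
        # Each PDI file carries its own Context, Provenance, and Rights,
        # so it points at itself: this is what ends the recursion.
        ArchivedObject("PDI-CDO", "PDI", b"PDI for the CDOs", "PDI-CDO"),
        ArchivedObject("PDI-RP1", "PDI", b"PDI for RP1", "PDI-RP1"),
        ArchivedObject("PDI-RP2", "PDI", b"PDI for RP2", "PDI-RP2"),
    ]
    return {obj.object_id: obj for obj in objects}


def fixity_databases(
    archive: Dict[str, ArchivedObject],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (CDO checksums, Representation Information and PDI checksums)."""
    cdo_db: Dict[str, str] = {}
    other_db: Dict[str, str] = {}
    for obj in archive.values():
        target = cdo_db if obj.kind == "CDO" else other_db
        target[obj.object_id] = sha256(obj.payload)
    return cdo_db, other_db


def pdi_chain(archive: Dict[str, ArchivedObject], object_id: str) -> List[str]:
    """Follow PDI links from an object until a PDI file describes itself."""
    chain: List[str] = []
    current = archive[object_id]
    while current.pdi_id is not None and current.pdi_id not in chain:
        chain.append(current.pdi_id)
        current = archive[current.pdi_id]
    return chain


def failed_fixity(
    archive: Dict[str, ArchivedObject],
    cdo_db: Dict[str, str],
    other_db: Dict[str, str],
) -> List[str]:
    """List identifiers whose current checksum differs from the recorded one."""
    recorded = {**cdo_db, **other_db}
    return [
        object_id
        for object_id, obj in archive.items()
        if recorded.get(object_id) != sha256(obj.payload)
    ]


if __name__ == "__main__":
    archive = build_archive()
    cdo_db, other_db = fixity_databases(archive)
    print(pdi_chain(archive, "CDO-0001"))
    print(failed_fixity(archive, cdo_db, other_db))
```

The self-reference on PDI files is one possible way to model "self-recursive" PDI; a real system might instead embed the Context, Provenance, and Rights components directly in each PDI file and store no link at all.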
A schematic of the above implementation is included in this attachment: