[Moims-dai] FW: CDO text

garrett at his.com
Tue Apr 16 05:29:02 UTC 2019


I’ll try one more pass at inserting comments

 

From: MOIMS-DAI <moims-dai-bounces at mailman.ccsds.org> On Behalf Of Mark Conrad
Sent: Friday, April 12, 2019 4:16 PM
To: MOIMS-Data Archive Interoperability <moims-dai at mailman.ccsds.org>; bambacher at verizon.net
Subject: Re: [Moims-dai] FW: CDO text

 

Didn't have time to get back to this earlier in the week. See my inserted comments below. 




Mark Conrad
NARA Information Services

Systems Engineering Division (IT)
The National Archives and Records Administration
Erma Ora Byrd Conference and Learning Center
Building 494, Room 225
610 State Route 956
Rocket Center, WV  26726

Phone: 304-726-7820
Fax: 304-726-7802
Email: mark.conrad at nara.gov

 

 

On Tue, Apr 9, 2019 at 2:52 AM <garrett at his.com> wrote:

Hi,

 

Responses interspersed below.

 

Peace and joy,

-JOhn

 

From: MOIMS-DAI <moims-dai-bounces at mailman.ccsds.org> On Behalf Of Mark Conrad
Sent: Thursday, April 4, 2019 11:43 AM
To: MOIMS-Data Archive Interoperability <moims-dai at mailman.ccsds.org>
Subject: Re: [Moims-dai] FW: CDO text

 

John,

 

Please see my comments interspersed in your text.




Mark Conrad

 

 

On Thu, Apr 4, 2019 at 3:34 AM <garrett at his.com> wrote:

Hi,

 

I favor dropping that sentence.

 

One of my objections is that I would prefer that we define things by their information model, not by how the particular object is used.

 

- This would defeat the entire purpose of this document from an archival perspective. A few of the key sentences expressing the intent of the authors - going back to the first version of the OAIS - are: "For example, archival science focuses on preservation of the ‘record’. This term is not used in the OAIS Reference Model, but one mapping might approximately equate it with ‘Content Information within an Archival Information Package’ (see definitions below, as well as 2.2 and 4.2 for context)." The current proposed revision of OAIS changes Content Information to Content Data Object, but as I noted previously no archives I have personal experience with would consider a Content Data Object without Representation Information a record.

 

JGG:  I don’t understand why the change of PDI terms applying only to CDO rather than CI would nullify the whole OAIS document.   One main objection I keep hearing is that the change implies that an Archive can no longer preserve RepInfo.  The change is just that the target of the PDI term was proposed to be updated to be the CDO instead of Content Information.  Defining PDI as applying to Content Information did not mean that Fixity Information or Provenance Information did not need to be preserved as a normal course of action.  Defining PDI as applying to CDO does NOT mean that RepInfo does not need to be preserved.

 

MC: I believe you are missing the point. From an archival perspective you cannot preserve the original target of preservation (i.e., the Content Information) without preserving the CDO and the RepInfo. In order to preserve the RepInfo you must have PDI for the RepInfo in addition to the PDI for the CDO. You need to preserve the RepInfo to make the CDO understandable. Otherwise you would just have a string of bits with no way of interpreting them. To preserve the RepInfo you need evidence that the RepInfo is the correct RepInfo for understanding the CDO and that the RepInfo has not been tampered with since it was Ingested. To do these things you need PDI for the RepInfo in addition to the PDI for the CDO.
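
A minimal sketch of the point above, assuming a SHA-256 digest recorded at Ingest stands in for Fixity Information. The class name `StoredObject`, the helper functions, and the sample csv and codebook are hypothetical and not taken from OAIS. It only illustrates that the CDO can verify while the RepInfo does not, so the pair cannot be trusted unless both carry their own fixity evidence.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class StoredObject:
    """Hypothetical holder for a bit sequence plus the digest recorded at Ingest."""
    name: str
    data: bytes
    recorded_sha256: str  # stands in for Fixity Information (an assumption)


def fixity_ok(obj: StoredObject) -> bool:
    """Recompute the digest and compare it with the recorded value."""
    return hashlib.sha256(obj.data).hexdigest() == obj.recorded_sha256


def content_information_intact(cdo: StoredObject, rep_info: StoredObject) -> bool:
    """Both halves need their own fixity evidence for the pair to be trusted."""
    return fixity_ok(cdo) and fixity_ok(rep_info)


csv_bytes = b"code,count\nA,3\nB,5\n"
codebook_bytes = b"A=adult\nB=juvenile\n"

cdo = StoredObject("survey.csv", csv_bytes, hashlib.sha256(csv_bytes).hexdigest())
rep = StoredObject("codebook.txt", codebook_bytes, hashlib.sha256(codebook_bytes).hexdigest())

print(content_information_intact(cdo, rep))  # True

# Simulate tampering with the RepInfo only: the CDO still verifies, the pair does not.
tampered = StoredObject(rep.name, b"A=juvenile\nB=adult\n", rep.recorded_sha256)
print(fixity_ok(cdo), content_information_intact(cdo, tampered))  # True False
```

Fixity is only one category of PDI; evidence that this is the correct RepInfo for this CDO (provenance, context) is not modelled here.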

 

 

JGG:  I think I understand your point, specifically that you cannot preserve the information that is intended to be preserved without preserving both the CDO and the associated RepInfo (and also the information that says they’re associated).

You also need to preserve other things to preserve the information intended to be preserved, such as the fixity for the CDO, the provenance for the CDO, etc.  You may also need to have fixity for some pieces of the RepInfo.  However, I don’t think it is as important to have fixity, provenance, etc. for all the pieces of RepInfo as it is to have fixity, provenance, etc. for the CDO.

 

As I’ve said, I think that Content Information (CDO + RepInfo) should be considered a target of preservation, but I still have problems with the intent of the word “original”.  

Several people have argued original means from the Producer.  I’ve argued that at least parts of the RepInfo may not come from the Producer.  And many other pieces of PDI may also come from the Producer.  What makes the portion of RepInfo from the Producer more important than portions of fixity or provenance that the Producer provides?  Those other pieces are also important to convince the DC of the information’s authenticity.

 

For me, original would be better interpreted to mean the information that is the object of what is in the AIP and the primary (or first order) information that is intended to be preserved.  The CDO is the reason the RepInfo is created/supplied.  I would strive to preserve the CDO even if I lost part (or even all) of the RepInfo, with the hope that I could at some point rediscover the RepInfo.  Conversely, I would not have much desire to continue to preserve RepInfo if I lost the CDO.  I don’t think there would be much likelihood of ever recovering the CDO.

 

 

I believe the intent of the original authors was to create a standard that would allow a number of different communities (Archives, Libraries, science information centers, company repositories, military, etc.) to communicate about digital preservation so they could all learn best practices from the others.  Those communities sometimes used different terms for the same things or used the same terms differently.  In such cases, the authors tried to pick terms that did not have as many preconceived ideas associated with them and tried to come to consensus on what they meant. MC: Agreed. This is stated in the text. The authors tried to find ???

JGG: Hmm.  I think the authors tried to find terms that were not overloaded with different meanings in different communities that we hoped would pay attention to the standard.

 

If your objection is primarily to the change in the one sentence giving examples of how the record equates to Content Data Object, I am fine with changing CDO back to Content Information in that instance. MC: That is not my primary objection. See above.  When I proposed that change as one of many updates that we agreed to last year, that was one that I wasn’t sure of and I thought we specifically discussed it.  I certainly am OK with deferring to your judgement as to how the formal Archives community views that term.  But certainly there are differences in the definition of “record”, which is why that term is not used in OAIS.

I think there are many “archives” (but probably not National Archives) where every RepInfo object has full PDI for it. MC: Is there a "not" missing in this sentence? Most National Archives and other institutional archives that I am familiar with insist that RepInfo have full PDI.   

JGG: Yes, I believe National Archives (but probably not a lot of other repositories) require full PDI for the first layers of RepInfo.

However at some point the RepInfo network cuts off and there is no more RepInfo added.  Is there PDI at that last layer?  I suspect that full PDI is no longer provided in many cases for some of the last layers of RepInfo.  
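
A rough sketch of the question above, assuming the RepInfo network can be modelled as a simple tree with no cycles (real networks may share or loop back on nodes). The class `RepInfoNode`, the `has_full_pdi` flag, and the example layers are hypothetical and only illustrate how one might list the layers where full PDI stops being provided.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class RepInfoNode:
    """Hypothetical node in a RepInfo network (modelled here as a tree)."""
    name: str
    has_full_pdi: bool
    described_by: List["RepInfoNode"] = field(default_factory=list)


def layers_without_pdi(node: RepInfoNode, depth: int = 1) -> Iterator[Tuple[int, str]]:
    """Yield (layer, name) for every RepInfo node that lacks full PDI."""
    if not node.has_full_pdi:
        yield depth, node.name
    for child in node.described_by:
        yield from layers_without_pdi(child, depth + 1)


# Illustrative three-layer chain; the last layer is where the network cuts off.
network = RepInfoNode(
    "csv column codebook", True,
    [RepInfoNode(
        "CSV format description", True,
        [RepInfoNode("English-language documentation", False)],
    )],
)

print(list(layers_without_pdi(network)))
# [(3, 'English-language documentation')]
```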

 

I think the DCs for many scientific “archives” are quite happy to just have RepInfo describing the CDO without necessarily caring whether that RepInfo has PDI for it.

 

MC: I hope this is not the case! If I gave a member of this DC a csv file and RepInfo that contained the wrong interpretation of the codes in the csv, I think they would be pretty unhappy.

JGG: That is a red herring.  No one is proposing that wrong RepInfo is provided. And many things beyond the CDO or Content Information need to be appropriately preserved in order to preserve the intended information.

The DC would also be upset if an incorrect fixity value was provided.

The DC would also be upset if incorrect Reference Info or incorrect provenance was provided.

Why is it required that we provide PDI for RepInfo and not for the Fixity Info?

Maybe we consider RepInfo the most important information associated with a CDO.  On the other hand, others may consider having completely accurate Fixity Information more important than having completely accurate RepInfo. Others (e.g., lawyers in a copyright suit or ownership dispute) may consider Provenance Information the most important.

 

 

 

I assume you meant that the current proposed revision of OAIS changes THE APPLICATION OF THE PDI TERMS FROM Content Information to CDO. MC: Yes.

We certainly aren’t trying to equate a CDO with Content Information. 

 

 

It also directly contradicts Section 4.2.1.4 Taxonomy of Information Object Classes Used by OAIS and its sub-sections. "This subsection builds on the discussions in 2.2 about the types of supporting information needed to enable Long Term Preservation and the discussion in the previous subsection on the role of Representation Information. The information modeling in this subsection discusses several types of Information Objects that are used in the OAIS. The objects are categorized by their content and function in the operation of an OAIS including Content Information objects, Preservation Description Information objects, Packaging Information objects, and Descriptive Information objects." Content Information has different content and function in an OAIS than PDI, Packaging Information, Descriptive Information, etc.

 

                JGG: I agree.  I wasn’t trying to say that there are not different types of Information Objects.

 

                I believe the differences in content and function have meaning primarily within a single AIP. MC: What is the basis for this belief? Can you show me some place in the text where it says content and function apply only within an AIP?   Surely identical (or substantially identical) Information Objects COULD be Content Information in one AIP and RepInfo in another AIP. MC: Not sure how to parse this sentence. If you are saying that the Content Information in one AIP could serve as the RepInfo in another AIP, that could happen (e.g., Content Info in one AIP is an English-Arabic dictionary. This could serve as part of the RepInfo for another AIP.)

                All the listed types of Information Objects are defined by how they are used within an AIP. Does that mean that they should not be referred to outside an AIP?  Perhaps that would help clear up some misunderstandings. MC: I have no idea how this makes any difference to whether or not RepInfo needs PDI.

 

   I would prefer to be able to call every combination of a Content Data Object and Representation  Information Content Information (even if that object wasn’t a target of preservation).

 

I assume you mean every combination of a Data Object and Representation Information. Every Content Data Object and its Representation Information is Content Information.  While you might prefer to treat all combinations of a Data Object and its Representation Information as Content Information, to do so contradicts many sections of the OAIS.

 

JGG: I think you are right that every combination of a Data Object and RepInfo is not Content Information.

Any Data Object with its RepInfo is an Information Object.

One type of Information Object is Content Information, which includes a Data Object, which also happens to be a Content Data Object.

MC: Agreed. 
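
A minimal sketch of the taxonomy just agreed, assuming hypothetical Python class names that mirror the OAIS terms and, as a simplification, a plain string standing in for Representation Information. It is an illustration of the three statements above, not a rendering of the OAIS UML.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataObject:
    """Any sequence of bits (digital case only, for brevity)."""
    bits: bytes


@dataclass(frozen=True)
class ContentDataObject(DataObject):
    """A Data Object that is the target of preservation."""


@dataclass(frozen=True)
class InformationObject:
    """Any Data Object together with its Representation Information."""
    data_object: DataObject
    representation_information: str  # simplification: RepInfo as plain text


@dataclass(frozen=True)
class ContentInformation(InformationObject):
    """One type of Information Object: its Data Object must be a CDO."""

    def __post_init__(self) -> None:
        if not isinstance(self.data_object, ContentDataObject):
            raise TypeError("Content Information requires a Content Data Object")


# Any Data Object with its RepInfo is an Information Object ...
dictionary = InformationObject(DataObject(b"kitab=book\n"), "UTF-8 text, key=value")

# ... but only one built on a Content Data Object is Content Information.
record = ContentInformation(ContentDataObject(b"a,b\n1,2\n"), "CSV, comma-delimited")

print(isinstance(record, InformationObject))      # True
print(isinstance(dictionary, ContentInformation)) # False
```

Trying `ContentInformation(DataObject(b"x"), "...")` raises `TypeError` in this sketch, which reflects the point that not every combination of a Data Object and RepInfo is Content Information.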

                       

Additionally to address the other issue from the last telecon, we should probably address the new wording for the Producer definition.

“Producer: The role played by those persons or client systems that provide the information to be preserved. This can include internal or external OAIS persons or systems.” 

 

If people are now understanding that this indicates that the Producer can be inside an OAIS, then that contradicts all the diagrams we have that show the Producer as external to the OAIS.

 

I don't know who has this understanding. The proposed definition says, "The role played by those persons or client systems that provide the information to be preserved. This can include internal or external OAIS persons or systems."  The 2002 definition was, "The role played by those persons, or client systems, who provide the information to be preserved. This can include other OAISs or internal OAIS persons or systems."  The 2012 definition is the same as the 2002 definition. Both these definitions say basically the same thing - an OAIS person or system can take on the role of Producer. It does not say that because an OAIS person or system takes on the role of Producer that the Producer is then internal to the OAIS.

 

To start with see Section 2.1 which starts with our simplest environment diagram and that includes the text,

“Outside the OAIS are Producers, Consumers, and Management.”

 

We have consistently said that any time anything internal to an OAIS takes on the role of Producer, then while it is executing that role it is considered external to the OAIS.  That means it has to have Submission Agreements, etc., and the data that it submits has to undergo checks.

 

I agree. I didn't hear anyone contradict these points.

 

 

JGG: I think we agree.

 

 

 

Peace and joy,

-JOhn

 

From: MOIMS-DAI <moims-dai-bounces at mailman.ccsds.org> On Behalf Of Robert Downs
Sent: Tuesday, April 2, 2019 4:19 PM
To: MOIMS-Data Archive Interoperability <moims-dai at mailman.ccsds.org>
Subject: Re: [Moims-dai] FW: CDO text

 

Please let me add my voice of agreement to Don's point and Mark's agreement that Content Information should be defined as "the original target of preservation".

 

Thanks,

 

Bob

Robert R. Downs, PhD
Senior Digital Archivist and Senior Staff Associate Officer of Research
Acting Head of Cyberinfrastructure and Informatics Research and Development
Center for International Earth Science Information Network (CIESIN),
The Earth Institute, Columbia University
P.O. Box 1000, 61 Route 9W, Palisades, NY 10964 USA
Voice: 845-365-8985; fax: 845-365-8922
E-mail: rdowns at ciesin.columbia.edu
Columbia University CIESIN Web site: http://www.ciesin.columbia.edu
ORCID: 0000-0002-8595-5134

 

 

On Tue, Apr 2, 2019 at 3:30 PM Mark Conrad <mark.conrad at nara.gov> wrote:

Well said, Don. I have become the lone voice at our weekly meetings defending Content Information as the information originally provided by the Producer. 




Mark Conrad

 

 

On Tue, Apr 2, 2019 at 2:47 PM D or C Sawyer <Sawyer at acm.org> wrote:

Hi David,

 

There is no valid reason to change the Content Information definition from ‘the original target of preservation’ to ‘a target of preservation’.  This has been there from the beginning to make clear it is referring to the original information provided by the Producer and not just any information within the Archive.  This is important to ensure that when people are discussing preservation within an OAIS context, everyone understands this is the information originally provided by the Producer.  

 

Of course we understand why you want to make this change because you still want to move Content Information to being ANY information in the Archive so you can claim the whole process is recursive by definition ‘and been there from the beginning’.  This would be a radical, ambiguity-enhancing change to the long-standing common understanding of the OAIS information and functional modeling.  For these reasons this proposed change should be thoroughly rejected by the group.

 

Cheers-

Don

 

 

 

On Apr 2, 2019, at 10:06 AM, David Giaretta <david at giaretta.org> wrote:

 

 

 

From: David Giaretta <david at giaretta.org>
Sent: 02 April 2019 13:12
To: 'Mark Conrad' <mark.conrad at nara.gov>; 'John Garrett' <garrett at his.com>
Subject: CDO text

 

My suggestions for the additional text:

 

Change in current draft from:

Preservation Description Information (PDI): The information, which along with Representation Information, is necessary for adequate preservation of the Content Data Object and which can be categorized as Provenance Information, Context Information, Reference Information, Fixity Information, and Access Rights Information.

Note: Defining PDI (as well as its components - Provenance Information, Context Information, Reference Information, Fixity Information, and Access Rights Information) as relevant to the Content Data Object does not mean that those concerns are any less important for other data objects or at other levels, for example, it is important to apply reference, fixity, provenance, context and access rights to Representation Information, or to any other information the Archive is preserving. Definition of these terms as relevant to the Content Data Object is simply to ease discussion of these concepts at the Content Data Object level.

 

To:

Preservation Description Information (PDI): The information, which along with Representation Information, is necessary for adequate preservation of the Content Data Object and which can be categorized as Provenance Information, Context Information, Reference Information, Fixity Information, and Access Rights Information.

Note: Defining PDI (as well as its components - Provenance Information, Context Information, Reference Information, Fixity Information, and Access Rights Information) as relevant to the Content Data Object does not mean that those concerns are any less important for other data objects or at other levels, for example, it is important to apply reference, fixity, provenance, context and access rights to Representation Information, or to any other information the Archive is preserving. Definition of these terms as relevant to the Content Data Object is simply to ease discussion of these concepts at the Content Data Object level.

 

I suggest deleting the last sentence because it does not make sense to me.

 

Change

 

Content Information: A set of information that is the original target of preservation. It is an Information Object composed of its Content Data Object and its Representation Information.

To

Content Information: A set of information that is a target of preservation. It is an Information Object composed of its Content Data Object and its Representation Information.

There are a few other places where the change from “the original” to “a” would also be needed.

 

Add to the end of section 4.2.1.4 Taxonomy of Information Object Classes Used by OAIS

Content Information is any Information Object which is being preserved by the Archive. 

 

I don’t know an easy way to put this into the UML diagram.

 

Also add to the end of section 4.2.1.4.1 Content Information:

Any Information Object being preserved by the Archive, such as Representation Information, PDI etc., may also be considered to be Content Information.

 

..David

 

 

 

_______________________________________________
MOIMS-DAI mailing list
MOIMS-DAI at mailman.ccsds.org
https://mailman.ccsds.org/cgi-bin/mailman/listinfo/moims-dai

 

