[Moims-rac] Re: Proposed amendment to B1.3

Cal Lee callee at ils.unc.edu
Wed Nov 14 08:04:37 EST 2007


> The repository's written standard operating procedures and actual
> practices must ensure the digital objects are obtained from the expected
> source, that the appropriate provenance has been maintained *prior to
> submission* [delete: and that the objects are the expected objects].

I think this might be overly prescriptive about implementation.  I could
foresee two (potentially legitimate) types of scenarios that would
violate this:

- An individual submits the SIP, and his/her credentials are obtained
after the fact.  In cases where the repository wants to keep the
barrier to entry very low, or where data risk being lost if not
submitted immediately, ex post capture of provenance information
could be preferable to requiring producers to provide all provenance
information in advance.

- Many collecting approaches that are based on crawling networks of
resources (e.g. broad web capture) will almost necessarily require much
of the provenance checking, verification and documentation to happen
after the SIPs have been obtained.  A repository can define the set of
seeds or queries that it will use to capture content, making educated
guesses about the sources of the materials that come back.  But much of
this will need to be determined ex post, such as recognizing cases of
redirects, changes in domain ownership, and embedded files from other
sources.

Perhaps there's a way to reword this so that repositories obtain some
minimal set of provenance information before submission, while
recognizing that a trustworthy repository could also have very
legitimate reasons to provide much of the provenance information, and
to substantially correct, supplement or enhance it, after the
submission has taken place.

- Cal Lee



