[Moims-dai] Desirable preservation ecosystem characteristics

Mark Conrad mark.conrad at nara.gov
Wed Jul 11 19:50:19 UTC 2018

"One example: many organisations are still struggling with the surprises
they get when migrating file formats; why can’t we have tools that
understand the peculiarities of the file formats and can deal with them?"

To add a little detail to this example, here are a few specific issues
related to file format migrations:

Many formats are proprietary so we are not allowed to see the inner
workings of files created in such a format.

"Open" standards for file formats often allow "extensions" that can lead to
surprises when you migrate files written to the standard (e.g., TIFF 6
files with local extensions).

Many formats are actually containers that can include bit streams in
multiple formats (e.g., Word files, PDF and PDF/A files).
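
As a rough illustration of the container problem, the Python sketch below
counts the member files inside a ZIP-based container by extension. It assumes
the newer Word format (DOCX), which is packaged as a ZIP archive; legacy .doc
files use a different container and are not covered. The function name
embedded_types is hypothetical, chosen only for this example.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def embedded_types(container_bytes):
    """Count member files by extension inside a ZIP-based container."""
    with zipfile.ZipFile(io.BytesIO(container_bytes)) as zf:
        return Counter(
            # Members without an extension are grouped under "(none)".
            PurePosixPath(info.filename).suffix.lower() or "(none)"
            for info in zf.infolist()
            if not info.is_dir()
        )
```

Each extension that shows up is a bit stream that a migration tool has to
handle in its own right, and the extension alone says little about what the
member really contains.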

Some "formats" (e.g., ESRI shape files) actually require data to be stored
in multiple files in different formats to create a single "file."
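
To make the shapefile case concrete, here is a minimal Python sketch that
reports which mandatory parts of a shapefile are missing from a list of file
names. It assumes the three mandatory components are the main file (.shp),
the index file (.shx), and the dBASE table (.dbf); optional sidecar files
such as .prj are ignored. The name missing_shapefile_parts is hypothetical.

```python
from pathlib import PurePath

# Mandatory components of a shapefile (assumption stated in the text above).
REQUIRED_PARTS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(filenames, base):
    """Return the mandatory extensions not present for the given base name."""
    present = {
        PurePath(name).suffix.lower()
        for name in filenames
        if PurePath(name).stem.lower() == base.lower()
    }
    return [ext for ext in REQUIRED_PARTS if ext not in present]
```

For example, missing_shapefile_parts(["roads.shp", "roads.dbf", "roads.prj"],
"roads") returns [".shx"], so a transfer or migration that moved only those
three files would have lost part of the "file."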

Existing tools can only accurately *identify* (i.e., by something other
than file name extension) a very small percentage of the tens of thousands
of formats currently in use.
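
A minimal Python sketch of identification by content rather than by file name
extension follows. It matches the leading bytes of a file against a handful
of well-known signatures. The table is deliberately tiny and the function
name identify is hypothetical; real signature registries cover far more
formats, and many formats have no reliable leading signature at all.

```python
# Illustrative signature table only; order matters because the first match wins.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"PK\x03\x04", "ZIP container"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file"),
]


def identify(data):
    """Return a format label based on leading bytes, or 'unknown'."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"
```

Note that the last two entries identify only the container, not what is
inside it, and a TIFF signature says nothing about local extensions.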

Existing tools can only *validate* (i.e., determine if a file is
well-formed according to the specification allegedly used to create the
file) an even smaller percentage of the formats currently in use.
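
To show what even a shallow well-formedness check involves, here is a Python
sketch for one simple, openly specified format (PNG). It checks only the file
signature, the chunk layout, and the per-chunk CRC; it does not check chunk
ordering or whether the image data decodes, so passing it is far from full
validation. The function name png_chunk_errors is hypothetical.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk_errors(data):
    """Return a list of structural problems; an empty list means none found."""
    if not data.startswith(PNG_SIGNATURE):
        return ["missing PNG signature"]
    errors = []
    pos = len(PNG_SIGNATURE)
    seen_iend = False
    while pos < len(data):
        if pos + 8 > len(data):
            errors.append("truncated chunk header")
            break
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 8 + length + 4
        if end > len(data):
            errors.append(f"truncated chunk {ctype!r}")
            break
        body = data[pos + 8:pos + 8 + length]
        (stored_crc,) = struct.unpack(">I", data[pos + 8 + length:end])
        # The CRC covers the chunk type and chunk data, not the length field.
        if zlib.crc32(ctype + body) != stored_crc:
            errors.append(f"bad CRC in chunk {ctype!r}")
        pos = end
        if ctype == b"IEND":
            seen_iend = True
            break
    if not errors and not seen_iend:
        errors.append("missing IEND chunk")
    return errors
```

Writing and maintaining checks like this for every format, including the
proprietary and extended ones, is where the effort goes.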

This is not an exhaustive list.

Mark Conrad
NARA Information Services
Systems Engineering Division (IT)
The National Archives and Records Administration
Erma Ora Byrd Conference and Learning Center
Building 494, Room 225
610 State Route 956
Rocket Center, WV  26726

Phone: 304-726-7820
Fax: 304-726-7802
Email: mark.conrad at nara.gov

On Wed, Jul 11, 2018 at 3:38 AM, Barbara Sierman <Barbara.Sierman at kb.nl>
wrote:

> Dear Mike,
> On my list would be:  practices that have proven to lead to a result that
> is really robust enough to be called a long term solution.
> One example: many organisations are still struggling with the surprises
> they get when migrating file formats; why can’t we have tools that
> understand the peculiarities of the file formats and can deal with them?
> Kind regards,
> Barbara Sierman
> Digital Preservation Consultant, Research Department
> Innovation & IT Division
> E barbara.sierman at kb.nl
> T +31 70 314 01 09
> *Blogs on digitalpreservation.nl <http://digitalpreservation.nl/>*
>   Koninklijke Bibliotheek, National Library of the Netherlands
>   Prins Willem-Alexanderhof 5 | 2595 BE Den Haag
> Postbus 90407 | 2509 LK Den Haag | (070) 314 09 11 | www.kb.nl
> *From:* MOIMS-DAI <moims-dai-bounces at mailman.ccsds.org> *On Behalf Of* Mike
> Kearney
> *Sent:* Tuesday, 10 July 2018 22:36
> *To:* 'MOIMS-Data Archive Interoperability' <moims-dai at mailman.ccsds.org>
> *Subject:* [Moims-dai] Desirable preservation ecosystem characteristics
> Vint Cerf is a preservation evangelist, and he’s considering adding this
> topic to the discussions and presentations he gives:  Desirable
> preservation ecosystem characteristics.  The idea is that if we know what
> we want the preservation environment to be 50 or 100 years from now, we
> will be better equipped to chart a course (at least the first steps)
> to get there.  Vint asked me my opinion on that, and I’m asking the DAI WG
> community.
> I’m doing this off the top of my head right now, so here goes my first cut
> at it:
> Trustworthiness of archives:
> ·         A majority of archives that are certified trustworthy.
> ·         Broad community agreement on what constitutes trustworthiness
> and the process for certifying it.
> ·         Global prioritization of the need for certification of key
> archives.
> Interoperability of archives:
> ·         Well understood and widely available user interface systems
> that function broadly for multi-archive access.
> ·         Archive-to-Archive communications that are easily and quickly
> configured.
> ·         Distributed archives that allow transparent access to all
> preserved data within their distributed “realm.”
> Access to archives (besides above):
> ·         (From the RDA) Data sharing without boundaries.  Social and
> technical infrastructure that enables open sharing of data.
> ·         Well-understood and easily waivable system of Intellectual
> Property management – for example, a system to quickly secure rights to
> download YouTube videos by national archivists.
> ·         Effective archive security that does not interfere with
> authorized user access.
> Future tech:
> ·         Storage capacity:  A better, faster, cheaper Moore’s Law.
> ·         AI Access to preservation archives; AI helpers that can assist
> archive users whether they are part of the Designated Community or not.
> ·         Auto-preservation…  Systems that automatically compile the
> *correct* metadata (Representation Info) as object data is generated.
> ·         AI systems that can generate warnings like “Uh oh… you’re not
> compiling the correct metadata!”
> What’s on your list?
>    -=- Mike
> Mike Kearney
> Huntsville, Alabama, USA
> _______________________________________________
> MOIMS-DAI mailing list
> MOIMS-DAI at mailman.ccsds.org
> https://mailman.ccsds.org/cgi-bin/mailman/listinfo/moims-dai