[Moims-dai] RE: LTDP document
John Garrett
garrett at his.com
Tue Jan 19 05:17:53 UTC 2016
Hi,
Without having read the new version, I think at this point we have to decide what color book we are generating. Is this a Magenta Book, which will provide actual requirements for which we expect conformance, or will it be a tutorial Green Book that is just providing examples? I’ll have to check the CCSDS Publication Manual again (in case they’ve changed it again), but I think Tom will object to having examples in the main text of a MB. So those will probably have to be moved into an annex.
Our intention has been that we will make use of OAIS terminology (and extend it as needed). There were some differences of terminology in the originating LTDP material and part of the conversion was supposed to be changing the terminology to OAIS terminology.
I think the LTDP had more discussion of how the stages were developed (I think we had more of that in an earlier version of the CCSDS document, but we’ve discarded too much of it now).
Yes, as part of our adopting OAIS terminology, we do need to make clear the distinction between the Designated Community and other users (including the project members if they are not part of the Designated Community.)
Wishing you prosperity and peace,
-JOhn
From: David Giaretta [mailto:david at giaretta.org]
Sent: Tuesday, January 12, 2016 3:54 AM
To: 'John Garrett' <garrett at his.com>; 'MOIMS-Data Archive Ingestion' <moims-dai at mailman.ccsds.org>; 'Boucon Daniele' <Daniele.Boucon at cnes.fr>; 'Mike Martin' <tahoe_mike at sbcglobal.net>; 'D or C Sawyer' <Sawyer at acm.org>; 'Mark Conrad' <mark.conrad at NARA.GOV>; 'Robert Downs' <rdowns at ciesin.columbia.edu>
Subject: RE: LTDP document
Hi John
I tried to put the project (i.e. large project) oriented material (mostly Mike’s text) in separate blocks under the heading “Example”. The thought was that we could then either add other examples or else move such text to an Annex.
I also tried to:
1) Make it clear that the terminology comes from OAIS – many of Rosemarie’s comments were about the distinction between data and information and I believe that OAIS provides that distinction
2) Add explanations/rationale for the way we split the stages etc
3) Align the terminology with OAIS e.g. change Mike’s “Descriptive Information” to “Package Description Information”
4) Make a clear distinction between Designated Community and other users – exploitation of the data could involve those other users
I know the document I circulated is not “final” but I hoped that at least the use of terminology was clearer and the “large project” text was sufficiently separate.
What I did not do was to address the Annex of mapping LTDP to this document, but John has pointed out that there was some earlier work done on that. As I indicated above, one option for the “large project” text would be to put it in another annex.
Regards
..David
From: John Garrett [mailto:garrett at his.com]
Sent: 12 January 2016 05:09
To: 'MOIMS-Data Archive Ingestion' <moims-dai at mailman.ccsds.org>; 'David Giaretta' <david at giaretta.org>; 'Boucon Daniele' <Daniele.Boucon at cnes.fr>; 'Mike Martin' <tahoe_mike at sbcglobal.net>; 'D or C Sawyer' <Sawyer at acm.org>; 'Mark Conrad' <mark.conrad at NARA.GOV>; 'Robert Downs' <rdowns at ciesin.columbia.edu>
Subject: LTDP document
Hi,
Yes, I think the current version is still very project oriented. The intention of at least some of the participants was to make it more general, but the source document was from EU Earth Observation projects. I’ll send what I think is the most current version of their document (Rosemarie can let us know if there is a more recent document). I believe this version was also adopted by one of the working groups from the international Committee on Earth Observation Satellites (CEOS).
The approved CCSDS Project was to standardize that document (see project description below).
The purpose of this recommendation is to provide a standard method structured as a complete process to formally define the steps and associated activities required to preserve digital information objects. The process thus defined along with the activities, is linked with the data lifecycle. This activity will work to standardize materials fed into the process by the EU Project - Long Term Digital Preservation (for Earth Science Data). It is likely that participants in EU Project will also participate in the CCSDS efforts on behalf of their Agencies.
But as you’ve seen our current document has changed quite a bit. We need to decide where we are going with the document going forward.
Wishing you Prosperity and Peace,
-JOhn
From: moims-dai-bounces at mailman.ccsds.org [mailto:moims-dai-bounces at mailman.ccsds.org] On Behalf Of Mark Conrad
Sent: Thursday, January 7, 2016 5:02 PM
To: MOIMS-Data Archive Ingestion <moims-dai at mailman.ccsds.org>; David Giaretta <david at giaretta.org>
Subject: Re: [Moims-dai] NASA Guidance (Records Schedule) for Project/Program Files
Hi Mike,
I think it would actually be more useful (and a whole lot easier) to simply make it clear that the information lifecycle described in the document is for a specific context (i.e., project/experiment). If you try to re-write the document for a more generic lifecycle it would be very difficult. The current document is far too prescriptive in terms of workflow/responsibilities for all of the different contexts that records/information/data are created under.
As I said before, I think the document would be very useful for the specific context. I just think the document scope should be qualified to indicate the context in which it can be applied.
Mark
Mark Conrad
NARA Information Services/Applied Research
IXA
The National Archives and Records Administration
Erma Ora Byrd Conference and Learning Center
Building 494 Second Floor
610 State Route 956
Rocket Center, WV 26726
Phone: 304-726-7820
Fax: 304-726-7802
Email: mark.conrad at nara.gov
http://www.facebook.com/NARACAST
http://www.archives.gov/applied-research/
Twitter: @lmc1990
On Thu, Jan 7, 2016 at 4:12 PM, Mike Martin <tahoe_mike at sbcglobal.net> wrote:
Hi Mark
Thanks for your comments. In the "much more generic lifecycle framework" would all the topics still apply? If so, then maybe the paper can be worded to be more inclusive and to make sure that an individual could see that he/she was the "project", and that sometimes the "sponsor" would be one's boss or oneself.
Thanks, Mike
On 1/7/2016 9:58 AM, Mark Conrad wrote:
Hi Mike,
I am also the one responsible for generating the action item from the
December 22nd meeting. As an archivist I am used to a much more
generic information lifecycle framework. Archivists and records managers
use more generic frameworks because we have to deal with
records/information/data that are created in many different contexts.
For example, records/information/data are created in many organizations
on a daily basis in contexts that don't have someone in a formal role of
sponsor. Records/information/data are also generated outside the context
of a particular project.
I guess my main objection was that the title of Information Lifecycle
Framework was not sufficiently qualified to distinguish it from more
generic frameworks like those used by archivists and records managers.
The document as it currently exists could be entitled something like,
Information Lifecycle Framework for Major Projects/Experiments.
I think the document would be very useful in this qualified context.
Many archivists or records managers can tell you horror stories about
receiving calls like, "We have shut down this experiment/project/system,
do you want any of the information?" The archivist ends up doing "data
archaeology" trying to see what can be salvaged. Having information
reuse considered from the initiation of a project would make our lives
so much easier - not to mention making the results of the work
accessible and usable to a much wider audience.
Hope this helps explain where my comments come from.
Mark
On Tue, Jan 5, 2016 at 6:01 PM, Mike Martin <tahoe_mike at sbcglobal.net> wrote:
Hi Mark and others
On 1/5/2016 11:52 AM, Mark Conrad wrote:
Second, the schedule identifies 8 stages of a project - Formulation;
Approval; Design Development; Manufacture, Fabrication and Assembly;
Pre-launch System Integration and Verification; Implementation and
Operations; Observational Data; and Evaluation and Termination.
Related to this, there was an action item from the meeting on the
22nd of Dec:
Action: clarify why we need another lifecycle
I spent many hours going through all the lifecycles in:
http://www.pnamp.org/sites/default/files/data_life_cycle_models_and_concepts.pdf
and looking at other archiving documents, and provided a summary in late
2014 for the DAI group, which is included below.
Most lifecycles don't really consider the interactions of the three
participants (sponsor/project/archive). I wanted our lifecycle to
point out the importance of the sponsor and archive being involved
in the initiation of the project and then to point out the need for
bringing in requirements and tools to the specify and design
stages. The Exploitation activities aren't covered in most
lifecycles. I didn't think that all the themes in the LTDP (PDSC
definition and appraisal, archive operation and organization,
security, ingestion, maintenance, access and interoperability,
exploitation and reprocessing, purge prevention) were applicable to
this document so came up with a shorter list of activities.
Another thing to mention, the topics/issues came from a list David
provided from his work on the Active Data Management Plan, plus
evaluation of all the LTDP Common Guidelines, plus evaluation of all
the activities in the PAIMAS standard, plus looking at the ESDIS
Earth Science Content Specification, plus other issues that group
members raised.
-------------------------------------------------------------------
Nov 20, 2014
Hi Everyone
I've gone through all the reference documents we have seen and the
articles in our bibliography and tried to summarize the unique life
cycles that are presented. Here are some summaries with more
details below:
David's: Planning and Creation Stage->Consolidation Stage->Long
Term Preservation Stage->Adding Value, Re-Use and Sustainability
LTDP: Consolidation->Implementation->Operations
OAIS+: Planning->Collection->Analysis->Packaging->Ingest->Data
Management->Archival Storage->Access->Preservation Planning
DCC: Conceptualize->create or receive->appraise and
select->ingest->preservation action-> store->access->use and
reuse->transform
USGS: Plan->Acquire->Process->Analyze->Preserve->Publish/Share
SDMW: Plan->Collect->Integrate and transform->Publish->Discover and
inform->Archive or discard
DataOne:
Collect->Assure->Describe->Deposit->Preserve->Discover->Integrate
->Analyze
DMF: Planning and Production->Data Management Activities
->Dissemination->Usage Activities
Can we come up with an optimal set of categories based on all these
various views?
Thanks, Mike
More detail from the various documents:
1. The LTDP preservation workflow includes:
Initialization (appraisal, define designated community,
specification of preservation/curation requirements, consolidation
procedure, tailoring content, consult with community, cost and risk
assessment),
Consolidation (implement consolidation, gather missing content and
update), Implementation (data ingestion and catalog generation,
dissemination),
Operations (operations and maintenance, curation and stewardship -
adding value).
2. The OAIS model includes Ingest, Data Management, Archival
Storage, Access, Management and Preservation Planning. It is
missing Planning (meaning enterprise planning), Collecting (Mission
Operations, building and running the enterprise), Analyzing
(producing knowledge) and maybe Packaging. All these occur prior to
OAIS, but OAIS should be involved. Consolidation could be part of
Ingest or possibly a separate activity outside the OAIS. Adding
Value could be part of or a combination of Preservation Planning or
Access. This model syncs up with RASIM which builds advanced
information management objects in terms of five services which
correlate with OAIS components, archive service (ingest), repository
service (archival storage), registry service (data management),
product service (access plus archival storage), and query service
(access plus data management).
3. The Digital Curation Centre life cycle includes conceptualize,
create or receive, appraise and select (with potential to dispose),
ingest, preservation action (migrate or reappraise), store, access,
use and reuse, transform (with potential to migrate).
4. The NOAA Environmental Data Life Cycle Functions include
planning new systems, then stewardship which includes observing
operations, archive, access, use. Overarching themes are
governance, requirements management, architecture management,
security; developing rich metadata; and mechanisms for user
requirements and feedback. Each of the major categories has many
sub-activities.
5. The Global Change Science Requirements for Long-Term Archiving
Workshop (USGCRP) identified the following components: User
Involvement, Data Administration, Documentation, Data Ingest and
Verification,
Data Preservation and Maintenance, Data Processing/Reprocessing,
Data Access and User Support.
6. The USGS Life Cycle includes Plan, Acquire, Process, Analyze,
Preserve, Publish/Share with three activities running through all
phases: Describe (Metadata and Documentation), Manage Quality,
Backup and Secure.
7. The ESA Heterogeneous Missions Accessibility Report really
focuses on data access and not the other phases.
8. The Harnessing the Power of Digital Data: Taking the Next Step,
Science Data Management Workshop report provides a number of models:
FGDC life cycle: Define, Inventory/Evaluate, Obtain, Access,
Maintain, Use/Evaluate, Archive.
Linear data lifecycle: Plan, Collect, Integrate and Transform,
Publish, Discovery with two activities running through all phases,
Governance and Stewardship and Communications.
Basic science model: plan, collect, integrate and transform,
publish, discover and inform, archive or discard.
The topics that are identified in the report include: data
governance, stewardship, sharing, access, security, version control,
metadata management, content and format, document and content
management, preservation, transfer of responsibility, data
architecture, database operations management, reference and master
data management, data warehousing and business intelligence, data
quality management, provenance, usability, value added services,
workflow systems.
9. The LPDAAC Lifecycle Plan identifies the phases: Inception,
Active Archive, Long-Term Archive which each have four elements,
characterization, critical data and information, applicable
standards, transition.
The WBS is broken into phases, inception-planning (embed in producer
team, provide data management plan), inception-production (liaison to
science stakeholders, collection inception checklist, support
production, repeat experiment, determine approach to tools/services,
authorize to migrate, provide NASA data template), active archive
transition from producer (obtain authorization to migrate, plan
migration, install new product line, migrate, advertise new
products, assume primary access and discovery role), active archive
transition to long-term (obtain authorization to migrate, plan
migration), long term archive transition to long-term (enable
migration, execute migration, advertise new products, transfer
primary access and discovery role, obtain authorization for
certification, sunset products).
10. DataOne includes Collect, Assure, Describe, Deposit, Preserve,
Discover, Integrate, Analyze
11. Jeff de La Beaujardière's Data Management Framework:
Planning and Production (Requirements Definition, Planning,
Development, Deployment, Operations);
Data Management Activities (Collection, Processing, Quality Control,
Documentation, Dissemination, Cataloging, Preservation, Stewardship,
Usage Tracking, Final Disposition);
Usage Activities (Discovery, Reception, Analysis, Product
Generation, User Feedback, Citation, Tagging, Gap Analysis).
_______________________________________________
Moims-dai mailing list
Moims-dai at mailman.ccsds.org
http://mailman.ccsds.org/mailman/listinfo/moims-dai