File-based storage of Digital Objects and constituent datast.ppt
Disclaimer
  The term Digital Object (DO) will be used as in Kahn/Wilensky:
    - Compound object
    - Multiple datastreams of different MIME types
    - Secondary information pertaining to object and datastreams
    - Identifiers for object (and datastreams)
  This roughly corresponds to OAIS Content Information

XML-based representation of DOs
  Growing interest in XML-based representation of DOs in Digital Library architectures:
    - Platform independence, industry support
    - Longevity, potential migration paths
    - Processing tools, validation capabilities
  XML-based Compound Object formats:
    - ISO/IEC 21000-2 MPEG-21 DID (DIDL)
    - METS
    - IMS/CP
    - CCSDS XFDU
  Typical functionality:
    - By-Value (base64) and/or By-Reference provision of constituent datastreams
    - By-Value and/or By-Reference provision of secondary information
    - Provision of identifiers

Storing XML-based representations of DOs
  Existing approaches:
    - Storage of the XML representations as individual files in a file system:
        Poor access performance
        Poor backup performance
    - Storage of the XML representations in (SQL, XML, object) databases:
        Long term? Data are dependent on the underlying system
    - Storage of the XML representations by concatenating many such documents into a single file such as tar or zip:
        Not XML-aware, hence no use of off-the-shelf XML tools
        Increasing storage space (base64 encoding of the constituent datastreams)

aDORe XMLtape/ARCfile solution
  Part of the LANL aDORe repository effort:
    - Standards-based, modular repository architecture
    - Distributed architecture
    - Protocol-based interactions between modules
    - Usable to create interoperable federations of heterogeneous repositories
    - Actual implementation of the architecture at LANL
    - Components of the aDORe software will be released
  Inspired by the Internet Archive ARC file approach:
    - File-based mechanism to store datastreams resulting from Web crawling
    - Concatenation of multiple datastreams into a single file
    - Metadata as separators between datastreams
  But not OK to store XML-based representations of DOs:
    - Metadata capabilities very li
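
The slides mention By-Value (base64) versus By-Reference provision of datastreams, and list growing storage space as a cost of base64 encoding. The sketch below illustrates that trade-off under stated assumptions: the element and attribute names (`digitalObject`, `datastream`, `mimeType`, `ref`), the identifier, and the reference URI are all hypothetical and do not follow the DIDL, METS, IMS/CP, or XFDU schemas.

```python
import base64
import xml.etree.ElementTree as ET


def datastream_by_value(data: bytes, mime_type: str) -> ET.Element:
    """Embed the datastream bytes inside the XML as base64 text (By-Value)."""
    el = ET.Element("datastream", {"mimeType": mime_type, "encoding": "base64"})
    el.text = base64.b64encode(data).decode("ascii")
    return el


def datastream_by_reference(uri: str, mime_type: str) -> ET.Element:
    """Point at a datastream stored elsewhere (By-Reference)."""
    return ET.Element("datastream", {"mimeType": mime_type, "ref": uri})


def build_object(identifier: str, datastreams: list) -> bytes:
    """Wrap the datastreams in a hypothetical compound-object element."""
    root = ET.Element("digitalObject", {"id": identifier})
    root.extend(datastreams)
    return ET.tostring(root)


# A 3072-byte binary payload standing in for an image or PDF datastream.
payload = bytes(range(256)) * 12

by_value = build_object(
    "info:example/do-1",
    [datastream_by_value(payload, "application/octet-stream")],
)
by_ref = build_object(
    "info:example/do-1",
    # Hypothetical locator; a real system would define its own URI scheme.
    [datastream_by_reference("file:///hypothetical/store/do-1.bin",
                             "application/octet-stream")],
)

# base64 emits 4 output characters for every 3 input bytes.
encoded_len = len(base64.b64encode(payload))
print(len(payload), encoded_len, len(by_value), len(by_ref))
```

In this sketch the By-Value document carries the payload at about 4/3 of its binary size, while the By-Reference document stays small and leaves the binary datastream to be stored separately.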