Microsoft Compound File .net component - pure C# - netstandard 2.0
Structured Storage Xplorer
Introduction:
Structured Storage Xplorer is a .NET/C# library designed to manipulate Compound File Binary Format (CFBF) files. It provides developers with tools to work with structured storage in various file formats, including Microsoft Office documents, Outlook messages, and more.
Key Features:
Support for Large Files: Handles files up to 16 TB using major format version 4.
Transactional Operations: Allows committing or reverting changes to ensure data integrity.
Consolidation: Reduces on-disk size by removing free sectors.
Compatibility: Supports netstandard2.0 and net8.0 for broad compatibility.
Audience & Benefit:
Intended for developers and organizations that need to read or modify structured storage files. It simplifies complex file operations, supports data integrity through transactions, and reduces disk space usage through consolidation.
Installation Note:
Structured Storage Xplorer can be installed via winget.
This tool empowers users with robust capabilities for handling compound files efficiently, making it an essential resource for developers working with structured storage formats.
Compound files include multiple streams of information (document summary, user data) in a single container, and are used
as the basis for many different file formats:
Advanced Authoring Format (.aaf)
Microsoft Office (.doc, .xls, .ppt)
Outlook messages (.msg)
Visual Studio Solution Options (.suo)
Windows thumbnails cache files (Thumbs.db)
OpenMcdf v3 has a rewritten API and supports:
An idiomatic dotnet API and exception hierarchy
Fast and efficient enumeration and manipulation of storages and streams
File sizes up to 16 TB (using major format version 4 with 4096 byte sectors)
Transactions (i.e. commit and/or revert)
Consolidation (i.e. reclamation of space by removing free sectors)
Nullable attributes
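The 16 TB figure in the list above follows from sector addressing. As a rough sketch, assuming sector IDs are 32-bit values and ignoring the few reserved IDs and the file header, the ceiling is the number of addressable sectors multiplied by the sector size. `SectorMath` and `MaxAddressableBytes` are hypothetical names used only for illustration; they are not part of OpenMcdf.

```csharp
using System;

// Hypothetical helper for illustration; not part of OpenMcdf.
public static class SectorMath
{
    // Rough upper bound on file size: 2^32 sector IDs times the sector size.
    // Ignores the few reserved sector IDs and the file header.
    public static long MaxAddressableBytes(int sectorSize)
    {
        const long addressableSectors = 1L << 32;
        return addressableSectors * sectorSize;
    }

    public static void Main()
    {
        long small = MaxAddressableBytes(512);   // 512-byte sectors (major version 3)
        long large = MaxAddressableBytes(4096);  // 4096-byte sectors (major version 4)
        Console.WriteLine($"512-byte sectors:  {small / (1L << 40)} TiB");
        Console.WriteLine($"4096-byte sectors: {large / (1L << 40)} TiB");
    }
}
```

The eightfold difference in sector size is why the larger limit requires major format version 4.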
Limitations
Limited error tolerance/recovery
No support for single writer, multiple readers
No support for red-black tree balancing
Directory entries are stored in a perfect binary tree where the entries are sorted but the tree is not balanced, i.e.
the tree is "all-black", which is a valid red-black tree but has suboptimal performance for traversing large trees
(though still considerably faster than some other clients).
Clients such as LibreOffice create trees with red-violations, which OpenMcdf tolerates when reading and writing.
Files with balanced red-black trees such as those created by Microsoft implementations will currently become unbalanced
upon adding or removing directory entries. Fortunately, since other clients are also tolerant of trees that are either
unbalanced or have red-violations, this should not be a major issue. The Wine implementation also has the same
limitation.
To create a new compound file:
byte[] b = new byte[10000];
using var root = RootStorage.Create("test.cfb");
using CfbStream stream = root.CreateStream("MyStream");
stream.Write(b, 0, b.Length);
To open an Excel workbook (.xls) and access its main data stream:
using var root = RootStorage.OpenRead("report.xls");
using CfbStream workbookStream = root.OpenStream("Workbook");
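Once a stream is open, its contents can be read like any other stream. The following is a minimal round-trip sketch, assuming `CfbStream` derives from `System.IO.Stream` (so `CopyTo` is available), which is consistent with the `Write` call shown earlier. The `RoundTripExample` class, the temporary file name, and the stream name are hypothetical and chosen only for illustration.

```csharp
using System;
using System.IO;
using System.Linq;
using OpenMcdf;

// Hypothetical example class; only the OpenMcdf calls come from this README.
public static class RoundTripExample
{
    public static bool RoundTrip()
    {
        string path = Path.Combine(Path.GetTempPath(), "roundtrip.cfb");
        byte[] written = new byte[10000];
        new Random(0).NextBytes(written);

        // Write the payload into a new compound file.
        using (var root = RootStorage.Create(path))
        using (CfbStream stream = root.CreateStream("MyStream"))
        {
            stream.Write(written, 0, written.Length);
        }

        // Reopen read-only and copy the stream into memory.
        // Assumption: CfbStream is a System.IO.Stream, so CopyTo is available.
        byte[] read;
        using (var root = RootStorage.OpenRead(path))
        using (CfbStream stream = root.OpenStream("MyStream"))
        using (var buffer = new MemoryStream())
        {
            stream.CopyTo(buffer);
            read = buffer.ToArray();
        }

        return read.SequenceEqual(written);
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip() ? "Contents match" : "Contents differ");
    }
}
```

The same reading pattern should apply to the `Workbook` stream of an .xls file, with the stream's bytes then passed to whatever parser understands that format.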
To create or delete storages and streams:
using var root = RootStorage.Create("test.cfb");
root.CreateStorage("MyStorage");
root.CreateStream("MyStream");
root.Delete("MyStream");
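To check what a storage contains after creating or deleting entries, its children can be enumerated. This is a sketch under stated assumptions: it relies on `Storage.EnumerateEntries()` returning `EntryInfo` values with `Name` and `Type` members, which is my reading of the v3 API rather than something this README states. The `EnumerateExample` class and its method are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using OpenMcdf;

// Hypothetical example class for illustration.
public static class EnumerateExample
{
    public static List<string> ListEntryNames(string path)
    {
        using var root = RootStorage.Create(path);
        root.CreateStorage("MyStorage");
        // Dispose the stream handle straight away; only the entry is needed here.
        using (root.CreateStream("MyStream")) { }

        var names = new List<string>();
        // Assumption: EnumerateEntries() yields EntryInfo values
        // exposing Name and Type.
        foreach (EntryInfo entry in root.EnumerateEntries())
        {
            Console.WriteLine($"{entry.Type}: {entry.Name}");
            names.Add(entry.Name);
        }
        return names;
    }

    public static void Main()
    {
        ListEntryNames("test.cfb");
    }
}
```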
For transacted storages, changes can either be committed or reverted:
using var root = RootStorage.Create("test.cfb", StorageModeFlags.Transacted);
// Either persist the pending changes...
root.Commit();
// ...or discard them
root.Revert();
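A slightly fuller sketch of the commit/revert flow may help. It uses the same `RootStorage.Create` call form as the snippet above and assumes that `Revert` discards everything changed since the last `Commit`, and that `EnumerateEntries()` with an `EntryInfo.Name` member is available for inspection. The `TransactionExample` class and the stream names `Kept` and `Discarded` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using OpenMcdf;

// Hypothetical example class for illustration.
public static class TransactionExample
{
    public static List<string> Run(string path)
    {
        byte[] data = new byte[256];
        using var root = RootStorage.Create(path, StorageModeFlags.Transacted);

        // First change set: written and then committed.
        using (CfbStream kept = root.CreateStream("Kept"))
        {
            kept.Write(data, 0, data.Length);
        }
        root.Commit();

        // Second change set: written and then reverted.
        using (CfbStream discarded = root.CreateStream("Discarded"))
        {
            discarded.Write(data, 0, data.Length);
        }
        root.Revert();

        // Assumption: only entries from the committed change set remain.
        var names = new List<string>();
        foreach (EntryInfo entry in root.EnumerateEntries())
        {
            names.Add(entry.Name);
        }
        return names;
    }

    public static void Main()
    {
        foreach (string name in Run("test.cfb"))
        {
            Console.WriteLine(name);
        }
    }
}
```

Disposing each stream before calling `Commit` or `Revert` keeps the example independent of how open stream handles behave across a transaction boundary, which this README does not specify.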
A root storage can be consolidated to reduce its on-disk size:
root.Flush(consolidate: true);
Object Linking and Embedding (OLE) Property Set Data Structures
Support for reading and writing OLE Properties is available via the OpenMcdf.Ole package. However, the API is experimental and subject to change.
OlePropertiesContainer co = new(stream);
foreach (OleProperty prop in co.Properties)
{
...
}
OpenMcdf runs happily on the Mono platform and multi-targets netstandard2.0 and net8.0 to maximize client compatibility and support modern dotnet features.