What's the end goal / big picture? Search federation or enterprise search? This sounds a lot like CMIS, which seemed to fizzle out after a few years of hype.
As for accessing a document directly, there really isn't a way. You could store the object name plus the byte offset and length, but the data would still be compressed in a proprietary format, so you'd have to call CMOD anyway.
You absolutely *can* use globally unique document IDs or document hashes as metadata for searching for a *specific* document, but that just adds to the overhead rather than reducing it.
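
To make the overhead point concrete, here is a minimal Python sketch of what a content hash as an extra metadata field might look like. Everything in it is an assumption for illustration: the `doc_hash` field name, the in-memory `index` dict, and the helper functions are hypothetical stand-ins, not anything CMOD provides.

```python
import hashlib

# Hypothetical in-memory stand-in for a metadata index (not a CMOD API).
index = {}

def doc_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of the document bytes."""
    return hashlib.sha256(data).hexdigest()

def add_document(doc_id: str, data: bytes, fields: dict) -> None:
    """Store the usual index fields plus one extra hash field per document."""
    entry = dict(fields)
    entry["doc_hash"] = doc_hash(data)  # extra field to compute, store, and index
    index[doc_id] = entry

def find_by_hash(digest: str) -> list:
    """Look up document IDs by hash; this finds the entry, not the content."""
    return [doc_id for doc_id, entry in index.items() if entry["doc_hash"] == digest]
```

The hash costs one more computed and indexed field on every document, and a hit still only tells you *which* document you want. Fetching the content is the same retrieval call as before.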
-JD.