Sunday, March 08, 2009

CMIS - EMC's role and vision for the future

First off, what on earth does CMIS stand for, and why should any content management person care? Here's the easy part, what it stands for: "Content Management Interoperability Services." What it promises is a way for customers (and vendors, and others) to begin sharing content usefully between different vendors' repositories. That is a huge thing, since right now most companies have several, maybe hundreds, of different document repositories under their enterprise roof (and may not even know how many).

To write my column on this subject ("Building Content Bridges") I interviewed EMC and Day Software. The former is one of the original writers of the specification; the latter is a vendor that is keenly supportive of content management standards. The following notes are taken from my EMC interview.

On the 23rd of October, 2008, I spoke with two representatives from EMC about the emerging standard CMIS: Patricia Anderson, Sr. Marketing Manager, Documentum Platform Marketing, Content Management & Archiving, and Dr. David Choy, Sr. Consultant. "CC" below refers to my comment on statements in the interview -- "Content Curmudgeon."

I was curious about the timeline for CMIS to be implemented (assuming it succeeds), and why CMIS is important either to EMC or to the content management space in general. Following are my notes from that interview.

Dr. Choy: Nobody knows how long the process will take, but about a year or more for a full-fledged standard. Eight companies participated in validating the current version of the CMIS spec for interoperability (IBM, EMC, Microsoft and five others). The eight proved that the spec could be used to assure interoperability. After that the team sent the proposed standard to OASIS. The formal process for discussing the standard takes time, but in the meantime we at EMC intend to make the prototype available for the public to play with.

Security raises issues both on the administrative side (the mechanisms are proprietary to each vendor) and in the runtime space, where security policies reign. Each vendor has its own security model, so in the near term CMIS security and access control are out of scope. Security policy is now reduced to the lowest common denominator (CRUD), which every vendor supports.


CC: By CRUD, Dr. Choy means the four basic operations: Create, Read, Update and Delete. Every content management system provides at minimum those same operations. How they determine who can do those things is a separate issue, and CMIS assumes each system manages its own security in its own way. If the administrator of a CMIS-compliant system gives you one of these rights, then from your own CMIS-compliant system you can access and perform operations on content in that system.
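
To make the "lowest common denominator" idea concrete, here is a rough sketch in Python. It is not the CMIS API and not any vendor's code; the class InMemoryRepository, the grants table and the function copy_document are all hypothetical names invented for illustration. The point is only that each repository decides for itself who may do what, while the cross-repository code is written against nothing but the four shared operations.

```python
from dataclasses import dataclass, field


class PermissionDenied(Exception):
    """Raised when a repository's own security model refuses an operation."""


@dataclass
class InMemoryRepository:
    """Hypothetical stand-in for one vendor's repository (not a real product).

    `grants` maps a user name to the set of CRUD operations that this
    repository's administrator has allowed. How that decision is reached
    is vendor-specific and sits outside the shared interface.
    """
    name: str
    grants: dict = field(default_factory=dict)
    _docs: dict = field(default_factory=dict, init=False, repr=False)

    def _check(self, user, operation):
        # Each repository enforces its own policy in its own way.
        if operation not in self.grants.get(user, set()):
            raise PermissionDenied(f"{user} may not {operation} in {self.name}")

    def create(self, user, doc_id, content):
        self._check(user, "create")
        self._docs[doc_id] = content

    def read(self, user, doc_id):
        self._check(user, "read")
        return self._docs[doc_id]

    def update(self, user, doc_id, content):
        self._check(user, "update")
        if doc_id not in self._docs:
            raise KeyError(doc_id)
        self._docs[doc_id] = content

    def delete(self, user, doc_id):
        self._check(user, "delete")
        del self._docs[doc_id]


def copy_document(user, doc_id, source, target):
    """Cross-repository operation that uses only the four shared calls."""
    target.create(user, doc_id, source.read(user, doc_id))
```

With two such repositories, a user granted "read" on one and "create" on the other can copy a document across, while an attempt to delete from a repository where no delete right was granted raises PermissionDenied. Anything richer than these four operations would need agreement between the vendors' security models, which is exactly what is out of scope for now.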


Patricia: One of the questions is “What caused the need for this standard in the first place?” Organizations would set up more than one repository platform, perhaps by department or as the result of M&As. We realized that it was difficult getting to this other information. This also hampered development that was cross-divisional or cross-platform. Then with Web 2.0 mashups, it became even more difficult to leverage use of information. ECM folks realized that it was a hindrance that affected all vendors. We looked at different standards but wanted a standard that was platform-agnostic and services-based, to unlock information in different repositories. Serious discussions began in October 2006. Other committees like iECM tried to develop such standards, but they needed to start fresh.

David: iECM is an AIIM consortium that tried to create something similar to CMIS. That group wasn’t set up for highly technical interoperability standards. Very few concrete results emerged. iECM is still looking at best practices and standards, not technical areas.


CC: Clearly you need both and without either there is no bridge between the repositories.


Patricia: For users, CMIS can expand the available applications and open the market for developers to write cross-repository applications. It is an open protocol and supports all repositories that support the standard. This provides customers lots of investment protection.

David: Enterprise Content Integrated Services is an example of an application that can facilitate cross-repository work: federated search, mashups, business process workflows across repositories.

Patricia: This is the first and only web services standard. An insurance company could have separate subsidiaries across the world, and writing to a standard would enable access and update to the repository information. In a distributed environment such as a franchise, it would also facilitate sharing of information outside each organization.

Patricia: The 3 originals were the first tier; then we included others such as Alfresco (participated), Oracle, SAP, OpenText; now Day Software. This standard is comparable to what SQL did for databases years ago.

David: The importance is how widely a standard is adopted. The spec is publicly available. Interested parties (after the technical committee is formed) can send comments to the technical committee. They’d need to join the technical committee. Enterprise customers (the first group) can benefit from CMIS and need to tap into different repositories. The second group is between repositories and vendors, allowing them to access each other's content. The third group interested in CMIS is Independent Software Vendors.

Patricia: Another way customers benefit is from having a broad suite of applications for their vertical markets, since a developer could develop for all.

David: Road maps for CMIS are difficult because CMIS is not a full-fledged standard yet. My rough guesstimate would be about a year, after the standard is released. We do intend to make prototypes available for the public before then, and those would be built on Documentum Foundation Services. So those interfaces are close.

Patricia: This proposed specification is already 2 years in development and vendors have done interoperability testing. We didn't just send paper to OASIS; we sent working prototypes. “What should I do today?” When you are evaluating the specification, when you go to your next purchase or RFI, ask if vendors support that standard.


Anonymous said...


I find the CMIS dialogue to be very interesting and certainly think it is past time for the evolution of a consistent standard across the ECM industry.

However, I have some very sincere questions and concerns about CMIS:

1. Beyond the marketing hype, what is the real benefit for the customer/consumer? Perhaps I am being a bit naive, but I would assume that the vendors that will support the standard are engineering CMIS support into their current shipping versions (or planned new releases). Patricia states that the need for the standard evolved from the fact that most organizations (either through multiple departmental deployments or through M&A) have deployed multiple content repositories. However, how many of these repositories will actually support the CMIS standard? In my experience, what companies really struggle with (and want to integrate to) are legacy imaging and document management platforms. Unless I miss my guess, most vendors aren't going to go back to their legacy installs and retrofit them to the CMIS standard.
2. Does CRUD really go far enough? One of the more common use cases for content integration is to allow vendors to apply records management policies to content in existing ECM repositories. DOD-certified records management functionality requires more than simple search, access and retrieval. If I remember correctly, deletions need to include information overwrites and users need to be able to execute "holds" and "locks" for records. This seems to be beyond simple CRUD requirements and, again, many legacy content repositories don't support this functionality in their API sets.
3. Unstructured information is exactly that -- unstructured. Simply being able to connect to and search across a target content repository may not be enough. What ensures that the content in the remote repository is in any way "intelligible"? It's a virtual certainty that the metadata models for the "host" and "target" ECM systems are different and I think we've all had experience with full-text search results. Further, what if the target content is images or digital assets? Then, our cross-repository search results are only as good as the metadata that describes the content.

What occurs to me is that the real beneficiary of the CMIS standard is the ISV or solution provider who builds business solutions on various vendors' ECM platforms. Principally, if the ISV or solution provider develops a solution against the CMIS standard, then the solution should be able to work across various vendors' platforms. Of course, there are always nuances to how a standard is implemented, but the CMIS standard should simplify cross-platform solution development, which will benefit the customer/consumer by not forcing them to select a certain ECM system based on available ISV partnerships/industry solutions.

Anyway, Bob, I apologize for the long post. Would greatly appreciate it if you, or the vendors referenced above, could provide some clarity regarding any of the above.

Content Curmudgeon/Green Hornet said...

I'm pleased to find such passion about a subject that puts most people to sleep (until they realize its implications).

More thoughts when I have more time, but my first comment re CRUD is that you have to start somewhere, and building a "wider" (more functional) bridge is very difficult. In part, that is as Choy says, due to their different security models. Finding precise equivalents to Documentum's 7 levels of security in SharePoint, for example, is not possible. It is a little bit like trying to map the colors between a Monet painting and a paint-by-numbers version of that painting.

Larry Donahue said...

As a federated search provider, we're watching CMIS with considerable interest. Currently, a significant component of our business is devoted to developing, monitoring and maintaining connectors (the middleware necessary for our software to communicate with many other sources of information simultaneously).

Almost all sources of information are different in some way, and while our software does a good job of avoiding the pitfall of sinking to the lowest common denominator, doing so takes quite a bit of expense. And this expense is hard to justify with some clients. CMIS could definitely help alleviate unnecessary expense for our clients, provided it gains traction.