Tuesday, June 01, 2010

Exploring the iPad: First Impressions


Well, like two million other folks, I bought an iPad (Wi-Fi, 32 GB). Like others, I couldn't resist the allure of this seductive device, and I'm suffering from Windows weariness:


  • Bloat
  • Security issues
  • Complexity
  • Slow performance
  • Reliability issues
  • Battery life between charges
  • Lack of openness to standards
  • Etc…
Why did I buy the iPad? Primarily as an e-book reader but also as a quicker, lighter web access tool. What do I think of the iPad as an e-Book reader? Marginal, but more about that assessment later. First a few words about my first impressions.

The iPad is seductively beautiful, and I don't find the 1.5 pounds excessive; it is about the weight of a hardback novel. Getting used to it, though, if you are not a Mac-head and don't own an iPhone, is difficult. Using Safari and the iPad is a little like joining a club where you haven't been told the secret handshakes. HELP isn't built in, although there is an iPad help site automatically listed in your web favorites. After a while I learned to use two fingers to pinch or spread the screen; tapping twice in the middle of the screen enlarges and centers the page (unless you tap on a link that happens to be there). How you do a simple string-search "find" (Ctrl+F in most other browsers) I still haven't figured out, and I'm beginning to guess that feature is just missing.

I'm also realizing that I've traded one vendor's nose-thumbing at standards (or de facto standards), Microsoft, for another vendor acting the same way: Apple. You'd be surprised, for example, how many sites you cannot use because Apple refuses to support Flash. Forget Hulu and most news video clips from major news sites. They almost all use Flash, since it is ubiquitous and has a light footprint.

If you have a Windows PC (I have several), how do you get files from it to the iPad for viewing? First, I was amazed how little the Apple folks (both in the local Bethesda Apple store and the online Apple geniuses) know about, or maybe have even thought about, working with Windows machines. The store rep told me I could use iTunes and essentially drag and drop my files, or just email them to myself. Or I could subscribe to the fee-based MobileMe service to store these files (no thanks). Email all 500 files? No thanks to that either. Drag and drop? Sorry, I misunderstood. It turns out that you can drag and drop; as always, there's an app for that. The surprise was that it was built into an inexpensive app I had already purchased to supplement the iPad's mediocre e-reading abilities: GoodReader.

The notion of using iTunes to get files over to the iPad is itself revealing. I'd never used iTunes before, but its name rightly suggests music, tunes. So you can get your downloaded MP3s and the like to the iPad, and it will also transfer photos. But how about transferring a PDF file, or a folder of PDF files? No, but there's an app for that. Matter of fact, there are many apps for that, some with one-star ratings, some with four-star ratings. Buy one and try it out. If that doesn't work, buy another and try that one out. (I have no idea how you uninstall apps, but presumably I'll learn the secret handshake for that after I've bought several redundant apps.)


And remember: I bought the iPad as an e-book reader, a reader for all my Microsoft Office, OpenOffice, EPUB, and PDF files. I also plan to compare it as an e-reader to Plastic Logic's Que ProReader. How well does the iPad work natively as an e-book reader? What about the app I picked? Details about that later, as I learn more. Full details in an upcoming review comparing both products, assuming I get my hands on a Que, which appears again to be behind schedule.


 

NOTE: Unfortunate Plastic Logic Que ProReader update. As of late June, as announced on Que's LinkedIn group, Que has stopped shipping the ProReader due to changing market realities. I pointed out a couple of those in my interview with their marketing staff: lack of color (not a complete Que-killer, IMHO) and lack of support for any browser (a REAL Que-killer). The lack of a browser means you must purchase subscriptions for online products such as the WSJ even if you already have a subscription. And if you have subscriptions to some niche publication for which they don't offer a Que version, you're out of luck.


 

I wish Que luck, and I'm told I'll get an evaluation unit when it is ready, but the rumor on the street is that this could be several years in the offing. As I said in my recent column, "the clock is ticking," and in this case time is not Que's friend.


 

But back to the iPad.


 

Here are a couple sneak preview pictures. First, here is a screenshot of one slide from a recent presentation I gave.



 

And here is how it looks rendered on my iPad (with both the native iPad viewer and with the $1 app). The picture isn't great, but it was an early Sunday morning shot in natural light with my Canon EOS. (Figuring out how to do iPad screenshots isn't easy; I found out by Googling. Once you know the multiple secret handshakes, it is easy.)

What's wrong with that picture? Quite a few things as you can see.

I'll try to ratchet down my curmudgeon index a bit before my next iPad post.

I Hate Comcast

Oh, did I mention I hate Comcast?
Unfortunately I must have broadband, even if I'd gladly give up their cable "services." Rich media is content too, and it is really unfortunate to be at the mercy of this cable company to access that content without constraint, including adding our own (subscription-free) devices to use it.

I spent 4 separate days waiting for them to fix the problem they caused with their little "digital to analog" gadget.

Do you hate Comcast too?

Believe it or not, there is a Facebook page devoted to (and named) just that: I hate Comcast. If you'd like to read the details of my rant and those of others, click here.

About all any of us can do is press our elected senators and representatives to vote FOR net neutrality, and reduce cable strangleholds as much as we can. I'd do the same, but unfortunately, living in DC, I have no senator or representative.

Sunday, April 11, 2010

E-Books, E-Readers, and Peak Oil


Huh? Isn't this kind of like mixing oil and water? Not really.

For related reasons, having to do in part with resource constraints, the cost of print subscriptions continues to rise, sometimes becoming prohibitive. After 40 years as a continuous WSJ print subscriber, I canceled my print subscription. It cost nearly $400/year, and I already have an online subscription that costs around $100. The WSJ is great, but not $400/year great, especially in this economy, when I also have the online edition. So I cut the cord and went completely online. With online access I can of course search, save articles, and print them to PDF, the way I used to clip print articles. My paper press archive goes back 30 years, but PDF lasts forever, right? Another advantage of online news: the news is always fresh. Besides, I'm helping save the environment, or at least I hope so. I reduce the number of plastic bags (which you can't recycle), and I eliminate the need to recycle the paper itself. There is even a potential business advantage to the right e-reader: it can preserve into the indefinite future the opportunity to view important documents. What could be nicer?

Well, there are advantages to print. Print never crashes. You can read print even when broadband is down or out of reach. You can fall asleep in a chair, drop the newspaper, and not have to buy a new one; can't do that with a laptop. Print is very easy to read, indoors or out in bright sunlight. And print is graphically rich, uses color, and is still more familiar and comfortable. Spouse says, "I miss the WSJ print edition." Uh oh.

I tell her to wait; I'll find an e-reader that is nearly as good as or even better than print. It will meet my kind of Turing test for print: doesn't crash, very portable, etc., but also preserves the benefits of online: searching, always current. The iPad is here; Plastic Logic's Que reader is coming. We'll find something (but we haven't bought anything yet). Now the limitations begin to appear, both from others' reviews and from my own discussions with vendors.

The iPad is ever so cool, has Apple's trademark usability, color… what could be better? For one thing, it tries to be everything a netbook can be, way more than just an e-reader. I don't care if it can run my iPhone apps, because I don't have an iPhone. In fact, I don't want to be nickel-and-dimed (more like dollared) into buying lots of little apps to fill in the iPad's gaps (like being able to print or use USB). And the iPad doesn't run Flash, which is commonly used on many web sites, including the online WSJ. This feels a little like the "microsofting" of Apple: you can run anything, but not without add-ons that may not play well together. So I can buy another iPad-custom WSJ subscription, right? And do I do that for every subscription? Oh well, at least the iPad has a (downsized) browser, so I can get to the WSJ in some fashion if I decide to spring for one. But what about the other constraints? Early reviews say that that beautiful 1.5-pound product begins to feel very heavy after a while and can even make your wrists hurt. And what kind of netbook wannabe is only single-threaded?

So I've now talked with a marketing rep from Plastic Logic about the Que, and expect to get an evaluation device as soon as they become available. Yes, I know it will not display color (hey, the WSJ didn't start using color until it became common in other print editions). And it is very light and cool in its own way; it even has a screen that is more book-like, roughly 8½ by 11 inches. It reads virtually every document format known to humankind and has huge amounts of space for all my books. But wait: it doesn't do Flash either, and apparently has no browser, not even a limited one.

Maybe I misunderstood. And maybe when I finally get my hands on Que, I'll discover other advantages that cancel out the negatives.

Or quite possibly there is no perfect e-Reader. I'm guessing that's the case, since this is the real world. And if that's the case, I have to figure out exactly what a document is, and what attributes are optional (like Flash). That is no easy decision, since it requires peering into the future and guessing exactly what I'll be willing to do without.

And that's where the similarity with Peak Oil comes in. Liquid fossil fuels provide 95% or so of the world's transportation fuel needs. Yet liquid fuels will eventually run out, and before they do, they will become erratically less available and more expensive. So we'll also have to figure out which transportation options are critical, which optional. I'm guessing SUVs are optional, and public transit is critical.

And this may even have some bearing on e-Readers: they depend on electricity and broadband. Those are critical resources too, right?

What's your guess, about which transportation choices are critical and which are optional?

What's your guess about what constitutes the essence of a document, so it can be preserved and read generations hence?

Monday, January 18, 2010

Justifying eDiscovery Systems

As I said in my Information Insider October 2009 column, "The landmark 2006 Federal Rules of Civil Procedure Rule 26 and its updates make all electronically stored information (ESI) subject to legal discovery, and ESI continues its unbridled growth." Given the nation's increasing litigiousness, and the exploding amount of electronic information everywhere that could be subject to 2006 FRCP Rule 26, I am surprised how little we've heard about such litigation. Is it simply that our attention is elsewhere (whether the US health care debate, two wars, global warming (or is it global cooling?), the earthquake in Haiti…)? Or is eDiscovery yet another ticking time bomb that will burst onto the news when we least expect it? Well, the vendors supplying eDiscovery solutions have plenty to say about that.

And what is special about eDiscovery? Why not just buy the very best search system available and use it to do all the "e-lectronic" discovery that you want? After all, isn't it all about "search"? I spoke with Ursula Talley, VP of Marketing at StoredIQ, to gather expert opinions on this subject. Here are excerpts from her comments, which I find pretty illuminating.

First, "Enterprise Search and eDiscovery Search technology do share a set of core capabilities, specifically crawling, indexing and searching data across a multitude of various applications and storage systems. Enterprise Search is designed to assist knowledge workers with information access and retrieval. The end result is that a user can find some files with information that can help that user complete a task." So what's the difference? Ursula went on to say, "eDiscovery Search is designed to support a workflow that can be legally defended in court. The end result is a set of data files that is preserved (saved to a new, target location without any changes to the metadata, and recording every system and location where each data file was originally located)." This kind of quarantining of content goes over and above what you can do with any enterprise search system. Moreover, she says that search performed by eDiscovery systems must also be very robust. Such eDiscovery searching can require queries with between 25 and 300 search terms. Moreover (for those of you who have ever posed a complex query on an enterprise search system, then gone to have a cup of coffee while you waited for the result to return), eDiscovery search must be able to copy large volumes of found content, "if necessary hundreds to thousands of gigabytes, without disrupting user productivity."
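That preserve-and-record collection step, the part a plain search engine doesn't do, can be sketched roughly like this in Python (the manifest format and file naming here are my own invention, not StoredIQ's):

```python
import csv
import shutil
import socket
from pathlib import Path

def preserve(files, target_dir, manifest_path):
    """Copy responsive files to a quarantine area, keeping file metadata
    (shutil.copy2 preserves timestamps) and logging where each came from."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with open(manifest_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["system", "original_path", "preserved_as"])
        for i, src in enumerate(Path(p) for p in files):
            dest = target / f"{i:06d}_{src.name}"  # prefix avoids name collisions
            shutil.copy2(src, dest)                # copies content plus metadata
            writer.writerow([socket.gethostname(), str(src.resolve()), str(dest)])
    return manifest_path
```

A real collection would also record a cryptographic hash of each file, so the preserved copy can later be proven bit-identical to the original.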

While it's at it, a robust eDiscovery system such as StoredIQ's can de-duplicate email and user files (saving space and attorney time poring over the same redundant files) while keeping a record of every location where those items originally resided, in case the judge asks. Lastly, searching email systems alone can be a real pain, since they are so big and threaded. Even the best search is often like sorting through low-grade ore, tons of it. eDiscovery systems can also extract both metadata and content from email and export them in a database format that can be queried and re-used in legal document review applications.
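The de-duplication-with-an-audit-trail behavior described above can be sketched in a few lines of Python (a toy version, not how StoredIQ or anyone else actually implements it):

```python
import hashlib
from collections import defaultdict

def dedupe(items):
    """items: iterable of (location, content_bytes) pairs.
    Returns one representative per unique content, plus every location
    where each duplicate originally resided, in case the judge asks."""
    locations = defaultdict(list)  # content hash -> all original locations
    unique = {}                    # content hash -> the one copy kept for review
    for location, content in items:
        digest = hashlib.sha256(content).hexdigest()
        locations[digest].append(location)
        unique.setdefault(digest, content)
    return unique, dict(locations)
```

Three copies of the same attachment collapse to one reviewable item, while all three mailbox locations stay on record.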

So how do you go about justifying the purchase of an eDiscovery system? Not by claiming you can add features to an existing or new enterprise search system. Instead, focus on the other features that you'll need if a lawsuit comes a-calling. Unfortunately, getting your eDiscovery house in order may be like getting your electronic records management house in order: really hard to justify until after the lawsuit. Still, at least you can avoid the trap of thinking that enterprise search can do all you need to find and quarantine your information for a credible eDiscovery defense.

Tuesday, October 06, 2009

And Now For Something Completely Different

... actually 8 things. What is different is that I normally use this blog for details that I couldn’t squeeze into my eContent Magazine column, Info Insider. The eight things I’m referring to are in AIIM’s recent (free) e-book describing the eight reasons you need a strategy for managing information.



John Mancini has a knack for writing simply, and this e-book (free for the downloading here) is well done. Although it is 95 pages long, don’t be put off by that; the pages are small ;-). Not only that, but the content, distilled from various “8 things” blogs, provides truly useful perspectives on Information Management. Here’s one gem from the section “Tidal Wave of Information.”

“A study by IDC a few years back concluded that there are currently 281 billion exabytes of information in the Digital Universe. So how much is this? Well…an exabyte is a million million megabytes. Thanks a lot. To put it in a bit of perspective, a small novel contains about a megabyte of information. So in other words, the Digital Universe is equal to 12 stacks of novels (fewer if the chosen novel is a big fat one like Harry Potter 6 or one of those Ken Follett Pillars of the Earth deals) stretching from the earth to the sun. So it's a big number, whatever it is.”
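For what it's worth, the quoted "281 billion exabytes" looks like a slip for 281 exabytes (about 281 billion gigabytes), the figure from the IDC study. Taking the e-book's own yardstick of roughly one megabyte per small novel, the conversion is easy to check:

```python
# Unit check, assuming IDC's figure of 281 exabytes and ~1 MB per small novel.
MEGABYTE = 10**6   # bytes
EXABYTE = 10**18   # bytes; an exabyte is indeed a million million megabytes

digital_universe = 281 * EXABYTE
novels = digital_universe // MEGABYTE  # how many 1 MB novels it holds
print(novels)  # prints 281000000000000, i.e. 281 trillion novels
```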
Go ahead, download a copy and enjoy the read.

Tuesday, May 05, 2009

It's TAXonomy Time


TAXonomy Time – Why the interest in Taxonomies?

I'm hearing the word “taxonomy” more and more often in ECM projects, often uttered by business people in the same sentence as “metadata.” Can it be that business people are becoming comfortable with these terms? If you know you've got a serious information overload problem, where do you start with taxonomies to tame and organize your content? Everybody starts with Excel for metadata and Visio or similar graphical tools to sketch out taxonomies. Those tools are available, sometimes free, and well understood. But they are fundamentally static. Do you need more? What are some best practices and alternatives?

As part of my latest column, “It's TAXonomy Time,” in EContent Magazine, I spoke with Carol Hert, PhD, Chief Taxonomist and Consultant for SchemaLogic Inc., to get her take on trends in taxonomy projects. Here are my questions and Hert's responses.

1) What is the state of client awareness of the value and urgency of developing taxonomies? What is the trend – use the Gartner “hype cycle” stages if you’d like. Do you see increasing interest in taxonomies, and –if so—why? Is the “information explosion” itself motivating this interest?

We typically work with large corporations that have already developed and deployed multiple taxonomies across their organizations. These companies are well aware of the cost and limitations of trying to manage these taxonomies in a dynamic environment that includes many consuming systems. Some of the organizations we work with are focused on taxonomy harmonization: integrating single-use taxonomies into one or several related taxonomies that can be utilized enterprise-wide.

We continue to see increased interest in taxonomies with the further proliferation of SharePoint and other collaboration systems, the need to increase the efficiency of the information worker, and the continued interest in enterprise information findability. Also the need to meet compliance requirements for large amounts of unstructured information continues to increase the need to govern and manage information more effectively.

2) What are typical approaches to taxonomy development?

a. Use an existing taxonomy only

b. Build on existing taxonomies

c. Enterprise versus single-application (tactical) approach

d. Use tools not available from current application vendors (e.g., EMC Documentum) for possible use with multiple vendors, or vendor-specific tools?

Our customers usually have multiple taxonomies deployed across their organizations. They have issues with managing and coordinating multiple taxonomies, especially in a dynamic environment. The first thing we do is to collect these multiple taxonomies and model them in our metadata management platform. We can then work with the customer to connect and optimize these taxonomies and then extend them as well. Some of our customers approach this from an enterprise wide perspective, while others choose to focus on a single department, function or business process and then expand.

Because complexity increases as the number of business stakeholders expands, most organizations are working to strike a balance between the optimal goal of enterprise-wide taxonomies and single-application taxonomies. All our customers use SchemaLogic's metadata management platforms to build and manage their taxonomies. Our systems are designed to allow customers to model enterprise-wide taxonomies and publish those taxonomies to multiple applications such as SharePoint and Documentum, as well as to search engines such as FAST and auto-classification systems such as Teragram.
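Hert's model-once, publish-to-many approach is essentially publish/subscribe. Here's a bare-bones sketch of the idea in Python (the consuming systems are stand-ins; none of this resembles SchemaLogic's actual product):

```python
class TaxonomyHub:
    """Central taxonomy model; consuming systems subscribe and receive updates."""
    def __init__(self):
        self.terms = set()
        self.subscribers = []  # e.g. SharePoint, Documentum, search connectors

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def add_term(self, term):
        self.terms.add(term)
        for notify in self.subscribers:  # push the change to every consumer
            notify(term)

# Two hypothetical consuming systems, each keeping its own copy in sync.
sharepoint_terms, search_terms = [], []
hub = TaxonomyHub()
hub.subscribe(sharepoint_terms.append)
hub.subscribe(search_terms.append)
hub.add_term("Financial Reports")
```

One change in the central model reaches every subscribing application, which is the whole argument against maintaining the same taxonomy separately in each system.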

3) What trends do you see in the evolution of taxonomy development? In supporting technologies (such as SOA or SaaS)

There continues to be a need to manage taxonomies in a more dynamic way. The need to collaborate across the enterprise, locate and share information, and improve information governance at the same time is putting pressure on organizations to develop a more flexible approach to managing information. The distributed nature of SOA and SaaS architectures puts further pressure on companies to establish an enterprise-wide taxonomy that can be accessed by multiple applications.

4) What are best practices for developing taxonomies? What are some approaches to avoid?

Books could be (and have been) written on this topic, but a short list of best practices might include:

  • Understand the ultimate uses to which the taxonomies will be put (there is no one perfect taxonomy).
  • Incorporate business and technical stakeholders in the development process to ensure that the final product will meet requirements.
  • Conduct a “taxonomy audit” prior to developing any new taxonomies, to understand what already exists and might be leveraged.
  • Consider taxonomy maintenance and governance during development, to ensure that the taxonomy can be maintained and that there are clear lines of responsibility.
  • Look for externally available taxonomies, but be cautious, as they have not been designed for the particular goals of the organization in question. Participate in industry-wide organizations where taxonomy development efforts might be occurring.

5) Are there any emerging or existing standards other than ISO 2788 for developing or expressing taxonomies? Is ISO 2788 relevant (I gather it is oriented towards human indexers) and who tends to use it?

ISO 2788 is relevant in terms of providing extensive guidance on term forms and other such matters. Since most organizations work in networked environments and want to transfer taxonomic information electronically, most will need to explore approaches to structuring taxonomic data for electronic transmission. Some of the standards to be aware of are RDF, OWL, Topic Maps, and SKOS. Additionally, since taxonomies might reside in metadata repositories, standards such as ISO 11179 may be relevant.
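Of the standards Hert mentions, SKOS maps most directly onto taxonomies: each term becomes a concept, linked upward by skos:broader. A toy emitter in Python shows the shape of the output (the example terms are mine, and real SKOS would use full URIs rather than fragment identifiers):

```python
def to_skos_turtle(terms):
    """terms: dict mapping each term to its parent (None for top terms).
    Emits a minimal SKOS fragment in Turtle syntax."""
    lines = ["@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."]
    for term, parent in terms.items():
        lines.append(f'<#{term}> a skos:Concept ; skos:prefLabel "{term}" .')
        if parent:
            lines.append(f"<#{term}> skos:broader <#{parent}> .")
    return "\n".join(lines)

turtle = to_skos_turtle({"Assets": None, "Equipment": "Assets"})
```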

6) What are some common exports from taxonomy tools (e.g., Excel)? Are there any common formats for importing existing taxonomies or developing them in taxonomy tools? For example, are there XML DTDs or Schemas?

CSV is a good common baseline, as some organizations still manage a number of their taxonomies in Excel. Some taxonomy management vendors have XML formats (as we do), but these may be proprietary and need some translation into an XML format another application can use. Standards such as RDF, OWL, and Topic Maps might be used in this context as well.
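A CSV interchange format for a taxonomy usually boils down to term/parent pairs, and rebuilding the hierarchy on import takes only a few lines. A sketch in Python (the column names are assumed, not any vendor's actual format):

```python
import csv
import io

def load_taxonomy(csv_text):
    """Parse term,parent rows into a parent -> children mapping.
    A blank parent column marks a top-level term."""
    children = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        parent = row["parent"] or None
        children.setdefault(parent, []).append(row["term"])
    return children

sample = "term,parent\nProducts,\nHardware,Products\nSoftware,Products\n"
tree = load_taxonomy(sample)
```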

7) Can you provide client case studies?

Yes. We have published several customer case studies and would be happy to work with you on additional case studies in the future.

Now About Tools

1) What are typical costs for acquiring and implementing taxonomy products?

The costs of taxonomy products vary greatly based on the particular application. Simple taxonomy modeling tools can cost less than $1,000, while enterprise-wide taxonomy management and governance systems can cost over $500,000. These larger systems provide highly scalable modeling capability, complete change management and governance, integration with full suites of enterprise applications, and metadata compliance monitoring. We have deployed systems that range in price from less than $50,000 to over $1M.

2) What are three key features in taxonomy tools; what are three unique features in yours?

Three key features:

1. Support for a variety of relationships between terms (at minimum, the term relationship types specified by ISO 2788).

2. Allow unlimited hierarchical structures.

3. Provide import and export features.

Three unique features in ours:

1. Extensive change management component that enables changes in taxonomies to be automatically subjected to governance.

2. A set of productized connectors that can automatically provide updated taxonomy information to consuming applications, plus the ability to create custom connectors.

3. Ability for end-user administrators of the interface to create custom properties on terms and taxonomies.

3) How would you assess the current state of the art for automatic classification features?

Auto-classification systems continue to improve, but they still lack the precision and accuracy provided by a managed taxonomy. Taxonomies have been found to be useful frameworks upon which an auto-classification system can be built, rather than having the auto-classification tool start from scratch. A combination of taxonomy management, to provide structure and manage term relationships, with auto-classification methods has proven to be the most effective solution.
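That taxonomy-plus-auto-classification combination can be illustrated with a deliberately crude scorer, where the managed taxonomy supplies the categories and their evidence terms and the classifier just counts matches (real products use far more sophisticated methods):

```python
def classify(text, taxonomy):
    """taxonomy: dict of category -> evidence words (the managed vocabulary).
    Returns the category whose evidence appears most often in the text."""
    words = text.lower().split()
    scores = {
        category: sum(words.count(w) for w in evidence)
        for category, evidence in taxonomy.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None  # no evidence -> unclassified

# Illustrative vocabulary; a real managed taxonomy would be far richer.
taxonomy = {
    "Legal": ["contract", "liability", "indemnify"],
    "Finance": ["invoice", "ledger", "audit"],
}
```

The point of the hybrid is that the categories and relationships come from the governed taxonomy, so the classifier never invents its own structure from scratch.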

4) Do you provide “connectors” to work with enterprise content management systems such as EMC Documentum and Microsoft SharePoint?

We provide connectors that allow our customers to publish taxonomies out to subscribing systems such as Documentum and SharePoint. We also publish taxonomic metadata to search engines, auto-classification systems, portals, and other enterprise applications.

---
So there you have it from an expert. And if you happen to use, or are interested in using, Documentum or SharePoint (or both), here's a way to move beyond graphical tools and spreadsheets to manage and leverage your taxonomies.

Sunday, March 08, 2009

CMIS - EMC's role and vision for the future

First off, what on earth does CMIS stand for, and why should any content management person care? Here's the easy part, what it stands for: "Content Management Interoperability Services." What it promises is a way for customers (and vendors, and others) to begin usefully sharing content between different vendor repositories. That is a huge thing, since right now most companies have several, maybe hundreds, of different document repositories under their enterprise roof (and maybe they don't even know how many).

To write my column on this subject ("Building Content Bridges"), I interviewed EMC and Day Software. The former is one of the original authors of the specification; the latter is a vendor that is keenly supportive of content management standards. The following notes are taken from my EMC interview.

On October 23, 2008, I spoke with two representatives from EMC about the emerging CMIS standard: Patricia Anderson, Sr. Marketing Manager, Documentum Platform Marketing, Content Management & Archiving, and Dr. David Choy, Sr. Consultant. "CC" below refers to my own comments on statements in the interview ("Content Curmudgeon").

I was curious about the timeline for CMIS to be implemented (assuming it succeeds), and why CMIS is important either to EMC or to the content management space in general. Following are my notes from that interview.

Dr. Choy: Nobody knows how long the process will take, but figure about a year or more for a full-fledged standard. Eight companies participated in validating the current version of the CMIS spec for interoperability (IBM, EMC, Microsoft, and five others). The eight proved that the spec could be used to assure interoperability. After that, the team sent the proposed standard to OASIS. The formal process for discussing the standard takes time, but in the meantime EMC intends to make the prototype available for the public to play with.


Security has administrative issues (mechanisms are proprietary to each vendor) and runtime issues, where security policies reign. CMIS security and access control is out of scope at this point; each vendor has its own security model, and in the near term that remains outside the scope of CMIS. Security policy is for now reduced to the lowest common denominator (CRUD), but every vendor supports those operations.

---

CC: By CRUD, Dr. Choy means the basic four operations, Create, Read, Update or Delete. Every content management system provides at minimum those same operations. How they determine who can do those things is a separate issue, and CMIS assumes each system manages its own security in its own way. If the administrator of a CMIS-compliant system gives you one of these rights, then from your own CMIS-compliant system you can access and perform operations on content in that system.
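In code terms, that lowest common denominator amounts to an interface every repository can implement, whatever its internal security model. A minimal in-memory sketch (the class and method names are illustrative, not taken from the CMIS spec):

```python
class Repository:
    """In-memory stand-in for a CMIS-style repository exposing CRUD."""
    def __init__(self):
        self._docs = {}
        self._next_id = 0

    def create(self, content):
        self._next_id += 1
        doc_id = f"doc-{self._next_id}"
        self._docs[doc_id] = content
        return doc_id

    def read(self, doc_id):
        return self._docs[doc_id]

    def update(self, doc_id, content):
        self._docs[doc_id] = content  # each vendor enforces its own ACLs first

    def delete(self, doc_id):
        del self._docs[doc_id]
```

A cross-repository application written against an interface like this works the same whether the back end is Documentum, SharePoint, or anything else that speaks CMIS.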

---


Patricia: One of the questions is "What caused the need for this standard in the first place?" Organizations would set up more than one repository platform, perhaps by department or as the result of M&As. We realized that it was difficult getting to this other information. This also hampered development that was cross-divisional or cross-platform. Then, with Web 2.0 mashups, it became even more difficult to leverage information. ECM folks realized that this was a hindrance affecting all vendors. We looked at different standards but wanted one that was platform-agnostic and services-based, to unlock information in different repositories. Serious discussions began in October 2006. Other committees like iECM tried to develop such standards, but they needed to start fresh.


David: iECM is an AIIM consortium that tried to create something similar to CMIS. That group wasn't set up for highly technical interoperability standards, and few concrete results emerged. iECM is still looking at best practices and standards, not technical areas.

---

CC: Clearly you need both; without either one there is no bridge between the repositories.

---

Patricia: For users, CMIS can expand the available applications and open the market for developers to write cross-repository applications. It is an open protocol and supports all repositories that support the standard. This provides customers lots of investment protection.


David: Enterprise Content Integrated Services is an example of an application that can facilitate cross repository work. Federated search, mashups, business process workflows across repositories.


Patricia: This is the first and only web services standard for content repositories. An insurance company could have separate subsidiaries across the world, and writing to a standard would enable access to and updating of the repository information. A distributed environment such as a franchise would likewise facilitate sharing of information outside each organization.


Patricia: The three originals (EMC, IBM, and Microsoft) were the first tier; then we included others such as Alfresco, Oracle, SAP, and OpenText, and now Day Software. This standard is comparable to what SQL did for databases years ago.


David: What matters is how widely a standard is adopted. The spec is publicly available, and interested parties can send comments to the technical committee once it is formed, or join the committee themselves. The first group that benefits from CMIS is enterprise customers, who need to tap into different repositories. The second is the repository vendors themselves, allowing their systems to access each other's content. The third group interested in CMIS is independent software vendors.


Patricia: Another way customers benefit is from having a broad suite of applications for their vertical markets, since a developer could develop for all.


David: Road maps for CMIS are difficult because CMIS is not a full-fledged standard yet. My rough guesstimate would be about a year until the standard is released. We do intend to make prototypes available to the public before then, and those would be built on Documentum Foundation Services, so the interfaces are close.


Patricia: This proposed specification is already two years in development, and vendors have done interoperability testing. We didn't just send paper to OASIS; we sent working prototypes. "What should I do today?" Evaluate the specification, and when you go to your next purchase or RFI, ask whether vendors support the standard.




Saturday, January 10, 2009

Enterprise Search Summit Program

Do any of you feel like you can't keep up with the latest trends in search, or you just feel like you could wring more value out of your investment but aren't sure how? Or maybe you don't get the connection between Web 2.0 and Search? Whether you are responsible for your Intranet, your commercial site, or the various repositories inside your firewall, I heartily recommend the annual Enterprise Search Summit to be held this May in NYC.

I've attended this conference in the past as a paid attendee (my "day job" employer considered it that worthwhile!), not gratis as a columnist for EContent magazine, which is part of the Information Today, Inc. portfolio. Michelle is the editor of EContent and designs and runs the Search Summit. I like this conference a lot. To learn more, click here.

Sunday, September 07, 2008

XML 10th Anniversary

In an upcoming Information Insider column, I invite XML to an intimate party where we can celebrate its 10th anniversary. I also invited Alexander Falk, CEO of Altova, and an XML aficionado if ever there were one (here's his blog: http://www.xmlaficionado.com/). Here are some of the questions I asked Alexander as background for the column. I hope you'll find this interview interesting. After all, celebrating a "double digits" anniversary doesn't happen often. Alexander's responses to my questions are shown in blue text.

Question: The XML Recommendation is now 10 years old. XML led to hundreds of additional specifications, yet its adoption rate in publishing and word processing software (and XHTML in web pages) seems slow. What is your assessment of XML adoption, and what do you see for the next 10 years?

Ten years is a mighty long time to make forecasts for – my crystal ball is only rated for 2-3 years max…
What we’ve seen with XML over the last 10 years is a huge adoption in all areas that are data-centric, rather than content-centric. XML has become the lingua franca of data exchange and interchange and has made a whole class of enterprise applications possible, because you can now move data fairly freely between disparate systems.

The benefits of XML in a pure content-creation scenario – be it publishing, word processing, or Web design – are only realizable if you have a large amount of content and use it with some content management system. That is not something that most small- or medium-size businesses would do, and that has, I believe, led to a somewhat slower rate of adoption in those areas.

Question: OOXML is essentially "Rich Text Format" expressed as XML, rather than leveraging existing XML standards such as MathML. MS Office is expensive; OpenOffice (based on ODF, which leverages other XML standards) is free. Yet MS Office maintains its share of the office market. What gives?

This is an interesting conundrum. From a purely academic perspective I would agree with your statement that leveraging existing XML standards is desirable. But the reality is that 95% of the world’s office documents are MS Office documents today, and people want to continue working with those documents – and want to reuse the content that exists in those documents in other applications, and by opening the file format up and having them be XML-based rather than binary format, such reuse is now possible. I can tell you from our experience that we have received countless requests from our customers that they want to be able to work with OOXML documents, and not a single request for ODF. Also, when I look at e-mail that I receive from others, I have yet to encounter a single e-mail that came with an ODF attachment. I don’t necessarily like Microsoft’s near-monopoly on the office market, but to deny its existence and standardize on a file format like ODF that nobody actually uses in the real world doesn’t make much sense either.

Here we disagree a bit; my question to Alexander followed by his response.

Question: OOXML (which today looks like it will become an ISO Standard) is still essentially just an XML expression of Microsoft’s internal word processing format, “Rich text format.” What value does such a use of XML provide to potential applications?

Actually, I need to disagree on that one. OOXML is not just RTF in disguise. OOXML includes separate and distinct markup languages for expressing word processing documents, spreadsheets, and presentations. WordprocessingML is somewhat related to RTF because it is based on a similar concept (runs of characters with styles applied to them), but that is where the similarity ends. We found that it is very easy to use XSLT (or XQuery) to extract content from either wordprocessingML or spreadsheetML documents in OOXML that were created in Office 2007 (or other OOXML-compatible apps), and likewise it is very easy for us to generate OOXML content in both of those formats from our applications. For example, our data mapping tool MapForce makes it very easy for people to map data from a variety of data sources (including EDI, databases, Web services, XML, etc.) into spreadsheetML documents that they can then open with Excel 2007. Likewise, our stylesheet design tool StyleVision makes it very easy for people to produce stylesheets that render reports from XML or database data not just in HTML or PDF, but now also in wordprocessingML for use in Word 2007.
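Alexander's point about pulling content out of wordprocessingML is easy to try even without XSLT or XQuery. A .docx file is just a ZIP archive whose main part is word/document.xml, so a few lines of stdlib Python can extract the running text. This is my own sketch, not an Altova product, and the "document" built here is a hand-made stand-in rather than a real Office file:

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def docx_text(data: bytes) -> str:
    """Extract plain text from the main document part of a .docx archive."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        xml_bytes = zf.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    # Text lives in <w:t> elements inside runs (<w:r>) inside paragraphs (<w:p>)
    return " ".join(t.text or "" for t in root.iter(W + "t"))

# Build a tiny stand-in .docx in memory to demonstrate (not a full Office file)
doc_xml = (
    '<w:document xmlns:w='
    '"http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    "<w:body><w:p><w:r><w:t>Hello</w:t></w:r>"
    "<w:r><w:t>world</w:t></w:r></w:p></w:body></w:document>"
)
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word/document.xml", doc_xml)

print(docx_text(buf.getvalue()))  # Hello world
```

A real document has many more parts (styles, relationships, media), but the text-extraction idea is exactly this simple, which is the reuse story Alexander is describing.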

Still, what does OOXML offer that didn't exist in earlier editions as Rich Text Format? If Office 2007 simply uses XML as a replacement for RTF, I don't see the added value. Sure, you can search for table captions (if you want), but the richness of ODF is not there and won't be (can't be, due to compatibility with earlier versions).

Question: HTML 5 seems like a step backward from XML and XHTML. Is this a sign of eroding support for XML? One reason for HTML 5 (to quote the W3C) is that "new elements are introduced based on research into prevailing authoring practices." Wasn't XHTML sufficient, or maybe too difficult for "prevailing authoring practices"?

I’m afraid that the reality is that a lot of HTML is still created by hand: people creating some HTML in Web-tools like Dreamweaver or other HTML editors and then going into the HTML and messing around in it in text editing mode. Since those tools have been very slow to enforce XHTML compliance, people continue to generate sloppy HTML pages, and so there is unfortunately a real need out there to at least standardize on what authoring practices exist in the real world.

The much better approach is, of course, to generate XHTML by means of an XSLT stylesheet from XML source pages, which is what we do, e.g., for the http://www.altova.com/ Web site.

Question: XQuery is a standard co-developed by the developers of SQL. What’s your prediction for widespread adoption and use of XQuery?

I initially thought that XQuery had a lot of promise, too, which is why Altova was very quick to provide an implementation of XQuery in our products, including an XQuery editor, debugger, and the ability in our mapping tool to produce XQuery code. However, we've found that the adoption of XQuery in the real world is happening much more slowly than we and many others had anticipated. I think that one of the issues is that there isn't yet a clear and consistent XQuery implementation level and API across all database systems that people can rely on. The beautiful thing about SQL is that – for the most part – you can throw the same SQL query against an Oracle, IBM DB2, SQL Server, or even MySQL database, and you will get back the same result. The same is not true for XQuery yet, and until we reach that level of widespread adoption in the database servers, it has no chance to be as widely adopted by database users and application developers.

The reality is that we see a lot more interest in XSLT 2.0 from our customers than XQuery.

Sad but true, Alexander. I had high hopes for XQuery, but I don't hear much about it these days.

Question: Will XBRL be one of the “next big things” leading to a major use of XML by investors via a new set of prosumer applications? Enterprise processes and financial systems? What role will XQuery provide in these contexts?

I do indeed see XBRL as being the next big thing. The fact that both the Europeans and the SEC are mandating XBRL for financial reports from publicly listed companies will be a huge driver of XBRL adoption on a global scale. I am convinced that XBRL will be essential in financial systems and will find its way into enterprise applications fairly swiftly. When it comes to the use of XBRL by investors as prosumer applications, I’m a little bit more skeptical. It is certainly clear that investment professionals will use XBRL to better compare data between different companies in a certain market and to derive some key financial figures much easier than before, because the financial reports don’t have to be re-keyed into their systems. But I don’t think that this effect will transcend the investment professionals and become easily available for consumers anytime soon. As to what role XQuery will play: it might play some role, but I’m thinking of XBRL more as a standardized data transport mechanism and am expecting investment firms to map the XBRL into their internal decision-making and analysis applications and do the querying there.

On this we agree. This might be XML's first great opportunity to transform significant amounts of content -- and the processes to generate that content -- outside the tech doc arena.

Question: I know some subscribers to online financial services are wondering if they will be able to supplement (or even skip) certain of these services by analyzing sets of XBRL files themselves. What are the practical limitations to such analysis? Is there an inherent limitation to max numbers of XBRL files that can be XQueried at once?

There aren’t really any limitations that I’m aware of. The problem is more one of: how will you use the data? An investor who is very accounting-savvy can probably easily use XBRL to extract some key financial indicators for a company and compare several possible investment candidates in an industry group. But most investors I know rather want the key financial indicators automatically calculated by somebody else rather than directly work with the raw XBRL data. So I am skeptical that individual investors will be able to skip their subscriptions. Augmenting them is, however, a possibility and I indeed see the ability for some people to get a more in-depth look at some numbers than what they can currently get from Bloomberg or similar services.


Saturday, February 16, 2008

Update on Office 2007 Compatibility etc.

Julie ("funnybroad") has updated her slide show about her Office 2007 compatibility findings. Here is an excerpt from what she said:

I've replaced my original Office 2007 Compatibility Mode Confusion paper on slideshare.net with an updated version. I had to delete and re-create the existing one, so the link to it from your blog is now broken (click here for Julie's updated info)....everything has been re-tested with Service Pack 1, and sadly, compatibility still sucks. So go to the new link, not the older one.

-----

While I'm on the subject of Office 2007: when I tested and reviewed the product, I was happy to see a weird longstanding behavior removed. You print a document, then exit and are asked if you want to save changes. Most people simply say "yes," fearing they forgot whatever change they'd made and don't want to lose it. Others say "no," thinking they made a change inadvertently and don't want it to stick. Well, I was happy to see that dumb "feature" removed, but recently --several automated patch upgrades later, I guess-- I see the "feature" is back. So we've got compatibility with pre-2007 suites, but this is one compatibility "feature" they could have dropped, and dropping it would have made the product better.

Monday, January 21, 2008

How Green Are Your Documents?

Over the past 6 months, I've seen some of my hunches about growing awareness of environmental issues and concern about fossil fuel supplies (and prices) confirmed. Although oil never did close at $100/barrel, prices are sky high by anyone's estimate. In the autumn of 2007 I tried a different theme in my Information Insider column – one that I believe has never been done. I laid the groundwork for this series with the EContent 100 annual issue, in a column titled “Content 2.0 Converges.” I titled the follow-on column in this series “How Green Are Your Documents?” (the editor since changed that to “It Ain't Easy Being Green” -- a fine alternative). I sent out queries to a variety of vendors for any thoughts they had about their products and the green theme, and waited. And waited. And began to think that this was the craziest idea I'd ever had and wondered how I'd meet the deadline with a different (unplanned) column in case this didn't pan out. Then the vendors began to respond, all except Google, but I blame that on the difficulty of finding the right contact there rather than Google's lack of interest – since Google is indeed showing itself to be very green indeed.

Who did respond? Adobe, MarkLogic, and Olive Software – the latter a vendor I'd never heard of but found (yes) with a Google search. And the responses amounted to an avalanche of interest.

Let's start with Adobe. One obvious Adobe product is Acrobat, which has become a default electronic document standard, bulked-up with collaborative features in version 8 with Acrobat Connect, formerly Macromedia’s Breeze web conferencing but now integrated with Acrobat. I get the idea that web conferencing can cut down travel and thus save travel and carbon costs, but I was looking for more, and Adobe provided it. First, they've done as Google and now Microsoft have also done: begun adding online documents to their product set. In this case, Adobe acquired Buzzword, a web-based text editor. Interesting, but not the green lead I was looking for. Then it got interesting.

Adobe's new AIR (Adobe Integrated Runtime) lets web applications run offline – key, IMHO, to assuring the acceptance of online, collaborative documents and reducing the use of paper (with all the energy savings that implies). AIR is a cross-OS SDK, a mashup of Flash, HTML, Ajax, etc. AIR can target applications to the desktop, combining the rich abilities expected of local clients with those of the web. The key here is that you get persistent presence on the desktop, offline and online, with re-synching of web content when you go back online. Traditional media has been moving to the web for some time; now the web is also moving to the desktop, with functions traditionally handled in a browser or on paper migrating to the desktop. Financial services, for example, could deliver reports and also handle processes that used to require paper, such as loan applications that once meant swapping Excel spreadsheets back and forth.

Developers (not end users) are beginning to develop AIR-based online-offline catalogs --that bane of the mailbox. You download an application that includes the catalog, navigate through it, sort and search, flag items within the application, and get notifications when they're available (reminders when items are back in stock). As you walk through the catalog, you could add electronic notes, share them with friends, send an email with the relevant information, or collaborate across different desktops. Adobe says that Linux support for AIR is coming.

The whole AIR application could be the size of a 10 MB PDF catalog, and with progressive images or assets loaded on demand, the “catalog application” could actually end up smaller than the PDF.

Hot AIR? Yes, but in a good sense – reducing global warming in its own way.





Monday, July 09, 2007

Office Suites and XML - Vendor feedback

In my latest Info Insider column, I mentioned contacting two vendors to get their take on the impact of the two major office suites, OpenOffice/Star Office 8 (ODF) and Microsoft Office 2007 (OOXML), using XML internally. The vendors I contacted were Altova and MarkLogic. Here are the questions I asked them, followed by their responses.

Now that OpenOffice and Office 2007 both use XML natively, what new opportunities are there for analyzing or transforming Office documents?

Do you have any examples of customers using your products (or those of your technology partners) to analyze or transform OpenOffice/StarOffice or MS Office 2007 documents, leveraging their use of XML?

In essence, both vendors seem poised to provide ways for customers to extract extra value from their document repositories, although the current state is a “chicken and egg” problem. For now, there are no XML office-document repositories to speak of, so there is no rush to buy new products to extract this value. However, sooner or later the enterprise chickens will be forced to lay the XML eggs (see below).

MarkLogic

Following are the responses from MarkLogic, specifically John Kreisa, Director of Product Marketing for MarkLogic. Regarding opportunities for analyzing or transforming Office documents (whether ODF or OOXML), John says:

"Microsoft’s choice of XML as a core form for Office 2007 means that everybody using Office will be authoring directly in XML – Office becomes a direct means for creating XML content. We believe there is a significant opportunity for customers to leverage the ever-increasing amount of XML content by combining Office 2007 with an XML content server, like MarkLogic. Doing so will allow users to exploit the XML within the content in two ways. First they can combine all their content into one common repository, which is the first step to getting more value from the content. Then second, they can build content applications to repurpose the content, dynamically publish the content in new ways, and perform analytic functions they haven’t been able to do before.

Loading all of their content into a content server lets organizations analyze their entire content in new ways, including understanding term frequency, word counts, page counts, etc., and understanding the relationships within the content, like citation analysis between articles, among many other areas of analysis. What we typically see is that once organizations take a platform approach to their content, they immediately find new ways to exploit it and generate new business opportunities."
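To make Kreisa's term-frequency point concrete, here is a minimal sketch of my own (an illustration, not MarkLogic code) that tallies word frequencies across a small, hypothetical repository of extracted document text:

```python
from collections import Counter

def term_frequencies(texts):
    """Count word occurrences across a collection of document texts."""
    counts = Counter()
    for text in texts:
        # Naive tokenization: lowercase and split on whitespace
        counts.update(text.lower().split())
    return counts

# Hypothetical repository contents after extracting text from Office files
repo = [
    "content server loads XML content",
    "analyze content across the repository",
]
freq = term_frequencies(repo)
print(freq.most_common(2))  # 'content' appears three times
```

A real content server would of course do far more (indexing, stemming, citation analysis), but this is the kind of corpus-wide question that becomes trivial once everything sits in one repository.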


Of course this raises the question: when will there be enough XML content to put into a repository? Adoption rates are currently low, though as users upgrade to OOXML or switch to ODF, they will generate documents for such a repository. And in the case of OOXML, users who stick with Microsoft will have no choice but to upgrade eventually, since sooner or later Microsoft will stop releasing free security patches for its earlier Office products.

Kreisa confirmed the problem of the current adoption rate in his response to my second question, asking for examples of customers using MarkLogic products (or those of its technology partners) to analyze or transform OpenOffice/StarOffice or MS Office 2007 documents, leveraging their use of XML:

"While Mark Logic does not currently have any customers using MarkLogic Server with MS Office 2007, we do anticipate that as adoption of Office 2007 increases, our customers will leverage the XML content they create with Office 2007 by combining it with MarkLogic to create new content, repurpose existing content into multiple formats, and republish this content, and to mine the content to find previously undiscovered information.

Our senior VP of products demonstrated our Office 2007-related capabilities in a general session at our User Conference in May, and the audience was very impressed – lots of nodding and clapping. When people see what we can do, it generates interest in upgrading to Office 2007.

We have not heard much from our customer base regarding OpenOffice. However, Mark Logic's fundamental value proposition remains the same: we can load, query, manipulate, and render the XML from StarOffice in the same manner we do for Microsoft Office 2007.

In response to your question about how presentational XML facilitates text analytics in Microsoft Office, it really depends on the goal of the user. Highly marked up XML can complicate or confuse tools that are not capable of handling this kind of deep XML. MarkLogic Server, on the other hand, can easily handle this kind of content and separate the markup from the text. For example, if a user wants to know how many places a certain word is in bold or how many words are tagged as <title1> style, we can help with that kind of analysis. We see this as potentially relevant for technical documentation organizations, for example, who want to make sure that they have consistency across their different documents."
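The bold-word analysis Kreisa describes can be sketched with stdlib Python over wordprocessingML markup. This is my own toy illustration (hand-built sample XML, not MarkLogic Server): it looks for runs whose run properties contain a <w:b/> element, which is how wordprocessingML marks bold text:

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace, as used in OOXML word-processing documents
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def bold_words(document_xml: str):
    """Return the text of runs whose run properties include <w:b/> (bold)."""
    root = ET.fromstring(document_xml)
    words = []
    for run in root.iter(W + "r"):
        props = run.find(W + "rPr")
        if props is not None and props.find(W + "b") is not None:
            words.extend(t.text or "" for t in run.iter(W + "t"))
    return words

sample = (
    '<w:document xmlns:w='
    '"http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    "<w:body><w:p>"
    "<w:r><w:rPr><w:b/></w:rPr><w:t>Important</w:t></w:r>"
    "<w:r><w:t>plain text</w:t></w:r>"
    "</w:p></w:body></w:document>"
)
print(bold_words(sample))  # ['Important']
```

This is exactly the "separate the markup from the text" trick: the presentational tagging that confuses naive tools becomes just another queryable axis.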

Altova

Altova is the vendor that created the famous XMLSpy product line, providing lots of ways to create, analyze, and manipulate XML on desktop PCs. Here are responses to the same questions from Alexander Falk, President, CEO, and Co-Founder of Altova.

"Organizations save vast amounts of information in Microsoft Word documents and Microsoft Excel spreadsheets, but until now, that content could not be re-used in an extensible, programmatic way. With the Open XML document formats, that data is now standards-based; and the new capabilities in Altova XMLSpy allow developers to extract, edit, query, and transform XML data from within documents that use Office Open XML Formats - the new file type used by the 2007 Microsoft Office release - to make the data highly interoperable and easy to process. This provides huge advantages to business people and application developers.

Because XMLSpy's support for Office Open XML was released only a few weeks ago, it's too early to provide feedback."

I followed up to ask about the issue of XML quality in the two office suites, and whether one offers greater potential for leveraging the new XML internals. Office 2007's XML is almost exclusively presentational, while OpenOffice goes beyond that with support for additional standards: Scalable Vector Graphics, MathML, and XML Forms.

"Yes, that is an old argument. In an ideal world, the content authors would be motivated to create content with semantically meaningful tagging, e.g. DocBook or DITA. But the reality is that in today’s world most content is created in Office documents, so it is better to be able to extract and process that content with Office Open XML than to continue to wait until all content creators use semantically meaningful tags. Furthermore, the Office Open XML formats are not just for Word documents. Extracting data from the millions of Excel spreadsheets that get created, and processing it further in XML, opens the door to a huge opportunity for information reuse and repurposing."
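Alexander's point about mining Excel spreadsheets can also be sketched in a few lines. A worksheet part inside an .xlsx archive is spreadsheetML; the toy below is my own simplification (it reads raw <v> values directly and ignores the shared-string lookups that real files typically use), pulling cell references and values from a hand-built worksheet fragment:

```python
import xml.etree.ElementTree as ET

# SpreadsheetML namespace, as used in OOXML worksheet parts
S = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"

def sheet_values(sheet_xml: str):
    """Pull (cell reference, value) pairs from a spreadsheetML worksheet.

    Simplified: reads <v> values only; real .xlsx files often store text
    in a shared-strings table that this sketch does not resolve.
    """
    root = ET.fromstring(sheet_xml)
    cells = []
    for c in root.iter(S + "c"):
        v = c.find(S + "v")
        if v is not None:
            cells.append((c.get("r"), v.text))
    return cells

sample = (
    '<worksheet xmlns='
    '"http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    '<sheetData><row r="1">'
    '<c r="A1"><v>1500</v></c><c r="B1"><v>2750</v></c>'
    "</row></sheetData></worksheet>"
)
print(sheet_values(sample))  # [('A1', '1500'), ('B1', '2750')]
```

Once spreadsheet data is this accessible, the reuse-and-repurposing opportunity Alexander describes follows naturally: the numbers locked in binary .xls files become plain, queryable XML.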

So there you have it. OOXML will likely have the largest installed base. In fact, the Massachusetts Information Technology Division (ITD), the agency that essentially stuck its finger in Microsoft's eye, has released a new draft of its Enterprise Technical Reference Model. This draft now includes OOXML as an acceptable open format. The discussion period ends on 20 July 2007, but I'm betting the draft will be approved. For an expert insight into the issues with the Massachusetts ITD, go to:

http://www.consortiuminfo.org/standardsblog/article.php?story=20070702101415578&mode=print

And there are still persuasive arguments that OOXML is fundamentally inferior to ODF, and how that plays out over the next several years will be fascinating to watch in the abstract -- if only the future of our office document content weren't so important. I've expressed my opinions on the XML quality issue in my Information Insider columns at EContent magazine for some time. Here is O'Reilly's take on the issue:

http://www.onlamp.com/pub/a/onlamp/2007/06/14/achieving-openness-a-closer-look-at-odf-and-ooxml.html

It is right for both of the above vendors to profess no preference for one format or the other, since both suites use XML and their products can and will work with each. Still, quality and openness matter. We'll see how this plays out.


Friday, July 06, 2007

More Evidence of Content 2.0 - Blogging with StarOffice 8

Sun Web Logging! I just received a blog-publishing plug-in for StarOffice Writer called Sun Weblog Publisher (go to sun.com/products-n-solutions/edu/solutions/staroffice.html for details about StarOffice 8). I am publishing this blog entry using the Weblog Publisher tool. I just installed it, and already I'm in love with it. I have to admit, when I first heard about the product from Sun's PR, I wasn't quite sure why I'd want it. Then, as I thought about it, the many reasons became very clear. Among them:


  • Use the word processor interface that you're accustomed to and use many times each day.

  • Create your blog offline, and publish it when you're ready to.

  • Leave your HTML skills at the door (and use them when you really need to, but in a robust environment such as DreamWeaver).

  • And hey –let's admit it-- it's getting hard to remember each blog's username/password pair. (I have a database of over 400 passwords – more than most people, I'm sure-- but every one you don't have to remember is a big help.)


I tried to include a screenshot from a portion of the Sun Weblog brochure: a great picture of the ants carrying big leaves is a perfect metaphor for the blogosphere.

Apparently you can't do that with this tool, even though Blogspot allows you to upload an image either from a location on the web or from your local computer. Oh well, it's a minor thing, and this is, after all, version 1.0. For a list price of $9.95, the Weblog Publisher tool is still a terrific value.

One last thing -- this tool is powerful, and lets you blog to different blogs on the same blog server (like blogspot) or on different blog servers. You can even download a posted blog entry, edit it, and push it back to the blog. Nice.

Tuesday, April 03, 2007

Office 2007 Packaging

This weekend, I received the official MS Office Professional 2007 package, the same that consumers or SMBs would get when they buy the product. Now I admit I have trouble with contemporary packaging of all sorts -- razor blades, anything that is meant to prevent shoplifting, or the electronic equivalent for bootlegged software, especially anything with the Microsoft label. I completely sympathize with Microsoft's aggressive stance vis-a-vis bootlegged software. However, I've seen a couple of things lately --including the packaging for MS Office-- that I think go a bit over the top.

First there was the prompt to download important security updates. It turns out that that was a piece of software to determine whether or not my copy of Windows was genuine. Of course it was, since I was using review software that I'd received from Microsoft, but I think that procedure is a bit devious.

Now on to the more physical side of security: The package I received containing MS Office 2007 Professional. There were two sticky labels, one on the top and one on the body, indicating I should pull the one on the top and then somehow open the package. Problem was, pulling the top tab looked like it would damage the license key that was firmly affixed to the top. So I tugged and pulled, did my best not to damage anything, then moved on to the main seals. After much tugging (and using heavy-duty shears to cut what looked like a pop-rivet on the side), I realized that this package is intended to swivel downward, getting you to the software and manual. Inside and attached to the inside packaging was a set of graphics about the contents (Excel, Word, etc.) with a headline "Manage analyze and communicate..." I can't tell you exactly what the rest of the headline was, because to read it I'd have to bend and maybe break the outside plastic shell that houses the swivel-down housing with the CD, manual, etc.

Truthfully, this packaging looks like it was built by a committee, and "Security" got to veto "Ease of Use."


Sunday, March 18, 2007

HELP - online only?

So far as I can see, there are two ways to activate HELP in Office 2007 applications: The tiny little question mark in the upper right side, and the old standby F1. Both seem to get you only Microsoft online help. What happens if I lose (or temporarily do not have) an online connection? Am I stuck?

Actually, this de-emphasized HELP suggests to me that Microsoft believes the new ribbon interface is so clear that you won't need help. And secondly, that if you need help, you always have a broadband connection. I'm not sure both assumptions are true.

Anyway, here is the answer to the question I received from Microsoft's rapid response team:
"...the question mark button does, by default, take the user to Microsoft Office Online for Help. But if you click on the “Connected to Microsoft Office Online” button at the bottom of the box, you can choose to “Show Content Only from this computer” and that allows the user to see help content when not connected to the internet.

Super Tooltips, a feature of the Microsoft® Office Fluent™ user interface in the 2007 Microsoft Office system, integrates Help topics into the product in a new way to make the experience easier for new customers. One of the main problems that people have with Help topics today is that they don’t know the terms used to describe features. Super Tooltips are integrated help tips that provide quick access to information about a command directly from the command’s location in the Office Fluent user interface. One of the biggest innovations that began with version 2003 was the opportunity to get feedback on our Help. We use this feedback to drive the development of new content and to update current help topics as needed. We also use the feedback to identify trends that assist us in creating better Help for new features. The 2007 Office system Help was developed with the benefit from having feedback from thousands of Office customers."

So if you think to go to the bottom of the HELP box, you'll figure out how to get information without being online.

The right-brain, aesthetic side of Office 2007; the left-brain view of PowerPoint

I've been so caught up in looking at new features, or at where my old Office features now reside, that I've overlooked one important point. Microsoft has clearly expended a lot of effort to achieve two important benefits: a truly elegant set of styles (themes), along with some new fonts, and much-improved consistency among the various Office programs.

Across all the Office programs, there is a new, softer look that subliminally suggests you can approach the new system comfortably. That new friendliness is true across all the applications, from the Outlook email program through Word, Excel and PowerPoint (the only applications I'm currently evaluating in Office Professional). This right-brain improvement in all the applications isn't something you'll see in feature checklists, or if you do it may sound like marketing hype. But seeing is believing.

On the consistency side, one of my past pet peeves with the Office suite was inconsistency. If I created a table in Word and imported it into PowerPoint, or vice versa, I'd always get something different. And if the direction was from Word to PowerPoint, I'd get a "dumbed down" table because that's all PowerPoint could handle. Now I've found that you can create complex (and beautiful) tables in PowerPoint with all the horizontal and vertical cell merges you want, and export them accurately into Word. Not only the power of the new table model, but this consistency across applications, is a very strong inducement to work with the new Office 2007.

Now it is Sunday evening, and it appears I spoke too soon about how well PowerPoint uses styles and about its consistency. It appears that if you have existing objects (e.g., bullets) and change the bullet styles via the master, it doesn't apply those changes to the existing bullets, only to new ones. In fact, PowerPoint Help confirms this: "It is a good idea to create a slide master before you start to build individual slides, rather than after. When you create the slide master first, all of the slides that you add to your presentation are based on that slide master. However, if you create a slide master after you start to build individual slides, some of the items on the slides may not conform to the slide master design."

Thus IMHO, PowerPoint styles miss the point: A truly styles-based system would let you change your mind about the look and feel of a particular kind of object, then apply your change to all the objects of that type.
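The distinction is easy to model. Below is a minimal Python sketch (an illustration of the principle, not anything from PowerPoint's actual object model) of a reference-based style system: every object holds a pointer to a shared style, so editing the style retroactively restyles existing objects as well as new ones.

```python
# Toy model of a reference-based style system.
# Each Bullet keeps a *reference* to a shared Style rather than a copy,
# so a later change to the style flows through to every existing object.

class Style:
    """A named bag of formatting attributes, shared by reference."""
    def __init__(self, **attrs):
        self.attrs = attrs

class Bullet:
    """A slide object that defers its formatting to a shared style."""
    def __init__(self, text, style):
        self.text = text
        self.style = style  # reference, not a snapshot

    def render(self):
        return f"{self.style.attrs['marker']} {self.text}"

body_style = Style(marker="-")
bullets = [Bullet("First point", body_style),
           Bullet("Second point", body_style)]

print(bullets[0].render())        # "- First point"

# Change the "master" after the objects already exist...
body_style.attrs["marker"] = "*"

# ...and existing objects pick up the change automatically.
print(bullets[0].render())        # "* First point"
```

In a copy-based system (which is what the master-then-slides behavior described above resembles), each bullet would snapshot the style at creation time, and a later master edit would reach only objects created afterward.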

One last observation: Your editing view of PowerPoint slides, where you can see and edit the objects, is called "Normal." Why not "Draft," since Microsoft changed the name "Normal" to "Draft" in MS Word? Another inconsistency. Naughty, naughty.

Sunday, March 11, 2007

Interchanging documents and email between 2007 and Office 2003

Here I have very good news. When Word or Excel 2007 opened documents I had created in an earlier version of Office, I found absolutely no errors. I stress-tested the process with some very complex Word documents and some equally complicated Excel spreadsheets. Everything worked exactly as it should in 2007.

I also interchanged email, calendar appointments, and the like between Outlook 2003 and 2007. Again, interchange worked flawlessly.

Other little Installation Issues -- and a side reflection on Acrobat

As I said earlier, I hadn't planned to remove my earlier versions of Word, Excel, etc. -- only to install the new products so I could test each on the same machine. One product I had hoped not to install was Outlook, but the installation process gave it to me anyway. In some ways that was a good mistake, because there are some minor advantages to the new Outlook -- for example, adding a sender to your Safe Senders list so that graphics from that sender download automatically (downloading images from an unknown sender is a give-away to spammers that they've caught a live email account). However, there were some downsides to the Outlook install that you may experience as well. First, I found my Palm device no longer synchronized with Outlook. That strongly suggests the format of the PST file has changed. Yet the new PST file is not appreciably smaller than the one it replaced, whereas the compressed XML formats used by the other main Office applications produce significantly smaller files. Palm, to its credit, quickly updated its synchronization software, so that now works fine.

Another thing that stopped working was my Acrobat 8 plug-in to Outlook, and with it the ability to select an email folder and PDF the entire contents. (Likewise, the PDF plug-in to other office products also stopped working.) I contacted Adobe about this, and they said they were working on it.

One disappointing Adobe Acrobat side-note: Months ago I got tired of Internet Explorer and switched to Mozilla. However, Adobe has built no PDF plug-in for Mozilla/Firefox comparable to the one it provides for IE. That's a real disappointment; "PDF capturing" a site from within the browser is a facility I used frequently. You'd think that with its commitment to openness, Adobe would provide the same Acrobat plug-in for Mozilla that it does for IE.

Client Side or Web-Based Editing -- interesting observation

I am working on several different PCs as I write and print my Information Insider review of Office 2007 (and its companion column). I have to; Office 2007 doesn't run under Windows 2000, which is the system my printer is attached to. So I work at each machine and try to keep track of the latest versions, the names of files, and where they are. This is not easy.

So with this review, I've decided to reverse things a bit. In the past, I've used this Blog as the place for my "cutting room floor" -- a place to put materials that I thought were important but didn't fit in the printed versions of my work. Now my strategy is to put all my observations in a single place (this blog) and select from the blog whatever I want to put into the final review.

That suggests one element of the strategic decision about office tools: whether to use client-based tools (residing on the computers where you work) or on-line tools, as I am doing here.