Friday, February 18, 2011

Woo hoo! Smart search is possible

Watson had not been on my mind when I wrote my last blog post. In fact, if you were to shake me awake in the middle of the night and say, "Watson!", my immediate response would have been "Elementary, Holmes!" Despite being vaguely aware of a project called Watson underway in my company, the name didn't really mean "Wow" to me till, well, last night. But before I tell you why, let me go back a bit - and talk about my last blog post. In that post, I had implied that 'storage' and 'retrieval' are two different things and that, while tagging and indexing might work when one is storing something, the same tag or index could fail when one is retrieving the data. I had put forward a question - should content be context-insensitive for it to be reused effectively? Conventional wisdom - that which runs search engines and SEO jobs - says, "No".

And then along came Watson, and beat two people at Jeopardy to win an indecent amount of money, which, it was declared, would be given to charity. A computer beat two awesome humans.


Somewhere halfway through this video, when analysing why Watson got 'Chicago' wrong, you'll see the IBM engineer say something like "the info was stored in sectors but this was not about discrete compartments" or something to that effect.

Exactly! The human brain does not process information in linear paths. It hops, skips, jumps, runs around in circles. And, because we're getting there - with Watson's help - I think in my lifetime at least I'll see content being reused in ways that I (the writer) never imagined it could be, because the user (with help from Watson) is retrieving just that much - and only that much - information that the user needs. Correctly, every time, the first time. Smart user!

Thursday, February 10, 2011

Lose that tag please

It all started last night when I tweeted some Hindi film dialogues (In India, we have two major religions - films and cricket).
And then, we got around to discussing the bestest comedy film that's ever come out of Bombay. And, this was the tweet that got me to writing this blog post.
How is this relevant to techwriting?

Well, if I were an indexer, I'd have probably tagged all my film content with some of these words: bombay, bollywood, hindifilm, <nameOFfilm>, <nameOFstar>, and so on. Very search friendly and all. But say my reader was a sociologist researching corruption in India (ahem!). This dialogue ("Thoda khao, thoda phenko" #epic) would never even show up in the search results, yet it contains exactly what the sociologist is looking for - the entire social milieu from which the phenomenon has sprung (including the defence mechanisms people employ to forget the misery it produces).
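
For the technically inclined, here is a rough sketch of that failure in Python. Everything in it is made up for illustration: the `content` store, the item id `dialogue-1`, and the helper names `build_tag_index` and `search` are my own assumptions, not how any real search engine works. It only shows that a lookup keyed on the indexer's tags can never return an item for a concept the indexer did not think of.

```python
from collections import defaultdict

# Hypothetical content store: each item carries only the tags the indexer chose.
content = {
    "dialogue-1": {
        "text": "Thoda khao, thoda phenko",
        "tags": {"bombay", "bollywood", "hindifilm"},
    },
}


def build_tag_index(items):
    """Map each tag to the set of item ids that carry it."""
    index = defaultdict(set)
    for item_id, item in items.items():
        for tag in item["tags"]:
            index[tag].add(item_id)
    return index


def search(index, query):
    """Exact tag lookup: returns only items tagged with the query word."""
    return sorted(index.get(query.lower(), set()))


index = build_tag_index(content)

# The film buff finds the dialogue; the sociologist gets nothing back.
film_hits = search(index, "bollywood")
sociologist_hits = search(index, "corruption")
print(film_hits, sociologist_hits)
```

The dialogue is sitting right there in the store, but the second search comes back empty because nobody stored it under "corruption".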

So, how did I link this dialogue to scams (that are occupying the entire front page - and more - of newspapers these past weeks)? Because in my mind, my content database is neither indexed nor tagged. I can pull out random references, tag them to anything random, and yet make them look relevant.
Me: To copy, perchance to paste; Aye, there's the rub for in that paste what copies might come when we have shuffled the platforms and the versions out of the filters.
Colleague: Coils. Not filters. Coils.
Me: Coils.
<silence for one minute>

Colleague: A coil is a wrapper, right?
Me: Right.
Colleague: So, if we put a wrapper to call the boolean....
So, here I am, wondering if there is more to indexing than meets the eye. More to content reuse? More to "context"? Thoughts?