Watson, Gordy and Paris Hilton: How Artificial Intelligence is Innovating ECM

Sometimes it feels like we’re a little spoiled by today’s fast-moving technology. Remember when the ability to brew a cup of coffee in less than two minutes felt like the cutting edge of sci-fi? Now I stand in front of my Keurig every morning with the timer on my Apple Watch running, tapping my fingers and saying, “Really, a minute and 11 seconds? What is this, the Dark Ages?”

Similarly, voice recognition devices are everywhere, but we’ve gone from amazement at being able to say “Alexa, turn on the TV” and have it happen to annoyance that Alexa responded to my voice command of “Play ‘Lord of the Rings’” by turning the TV to “Fellowship of the Ring” when I wanted to watch “Return of the King.” What’s the point of artificial intelligence if it can’t anticipate my every need?

Enter IBM’s Watson. You may know Watson best as the computer that beat the humans on Jeopardy, leading reigning champion Ken Jennings to “welcome our new computer overlords.” In case you don’t remember, an article from TechRepublic offers a great overview of the Jeopardy event and also explains how Watson evolved into something that could serve as an asset to business (initially, healthcare). It sums up the change this way:

“Jeopardy Watson had one task — get an answer, understand it, and find the question that went with it. It was a single user system — had three quizmasters put three answers to it, it would have thrown the machine into a spin. Watson had to be retooled for a scenario where tens, hundreds, however many, clinicians would be asking questions at once, and not single questions either — complex conversation with several related queries one after the other, all asked in non-standard formats. And, of course, there was the English language itself with all its messy complexity.”

Watson gets smarter

This brings us to Watson in 2018: an older and wiser Watson, if you will. In 2016, IBM unveiled Watson solutions for professions: cognitive solutions for marketing, commerce, supply chain and human resources. You’ll read a lot about that last one in Workflow; the IBM Talent Acquisition folks have appeared in these pages quite a bit. With all of these solutions you’ll notice two key phrases: “cognitive analytics” and “natural language understanding.”

Let’s look at cognitive analytics first. We frequently hear that phrase in conjunction with big data, largely because cognitive analytics involves collecting insights from all kinds of data, both structured and unstructured, and then using those insights to aid human understanding of the data. According to Techopedia, “The general concept here is that enterprises collect or aggregate large amounts of data from very diverse sources. Specific software programs or other technologies analyze these in depth to provide specific results that help a business get a better view of its own internal processes, how the market receives its products and services, customer preferences, how customer loyalty is generated or other key questions where accurate answers are used to provide a business with a competitive edge.”

This brings us to Watson Natural Language Understanding, which extracts metadata from unstructured content and incorporates it into a variety of services that help users derive understanding from that content. Since it may be one of those things that’s easier to see than read about, I recommend checking out the demo on IBM’s website.

The best example of the need for and complexity of natural language understanding is known as the “Paris Hilton problem.” Is someone who performs a search for “Paris Hilton” seeking a hotel room in France, or are they researching social anthropology, celebrity sex tapes and how the Kardashians became famous? There are all kinds of nuance at play. Natural language understanding can help parse the ambiguity of human language, without judging your content consumption habits.

What does it look like in practical application? We spoke to Aarti Borkar, vice president, Product Management and Design for IBM Watson Talent and Collaboration, who explained how the Watson Candidate Assistant utilizes the technology. “We recognized some serious pain points in the candidate experience, particularly related to the onerous process of searching for jobs via keyword,” she said. “Watson Candidate Assistant uses natural language understanding to gather concepts, skills, and keywords from both resumes and job descriptions to provide the best job matches for a candidate. The API used enables candidates to have a conversation with Watson as if they were talking to a recruiter.”

What did the results look like? “We piloted the tool within IBM and saw three times the amount of applications compared to our standard keyword search tool,” she said. “Most recently, an independent digital media company piloted the tool in partnership with Uncubed, and saw that candidates using Watson during the pilot were 34 percent more likely to progress to a face-to-face interview.” (Read more of our interview with Borkar here).

Intelligent ECM

After spending enough time learning about the evolution of Watson and its many applications for business, we were understandably intrigued when the Gordon Flesch Company announced “AskGordy” — bringing Watson’s cognitive capabilities to the world of ECM.

Mike Adams, the brains behind AskGordy, is the manager of development for GFConsulting, the business arm that was created when Gordon Flesch acquired Cambridge Connections in late 2015. A longtime customer of Gordon Flesch, Cambridge Connections had been leveraging Laserfiche content management software to offer custom technology integrations along with cloud-based services to 24 different vertical markets for more than 30 years. An IBM business partner for most of those 30 years, Adams recognized some potential new applications when he saw the Watson toolkit becoming available two years ago.

“Our background is in document management and workflow, and we presumed there would be a market for an application that used all those Watson tools in the background but would be pointed to various content and file sharing systems,” he said. Feeling it would serve commercial and government agency users, Gordon Flesch began building exactly that, starting with a small pilot proof of concept that built up to a project using publications from the USDA.

Why the USDA? They were already a content management customer of Gordon Flesch, said Adams, using the Laserfiche ECM, and their publications themselves are public and available on the web. “All those publications get to that website through our content management and workflow system,” said Adams. “It gave us a background of publications that we were familiar with, that were available to the public, and where we had a use case where we could already see the search results users were getting from standard search.”

The results are quite different. Standard search is what we’re all probably familiar with: looking for words and roots of words, and doing an exact match. On the Watson side, the query is made in natural language and, as the Paris Hilton problem illustrates, it’s all about intent. The Watson engine extracts the intent of the query based on the training it has been given and, using natural language understanding, returns results based on that intent. While a standard search returns documents containing the terms you searched for, a Watson query returns results or responses based on the intent of your query rather than its exact words. “The results can be startlingly different,” Adams confirms. “We believe the Watson front end with our training is more likely to know what you mean.”

How does Watson get its training? It varies by use case. “We train the engine to understand in the context of the publications we’re providing it — how those entities and mentions and relationships would be commonly recognized by a reader in that particular document series,” explained Adams. “We can use some of our proprietary methods to pre-train Watson from a series of dictionaries, and explicitly define things that might be out of the ordinary.”

An example of this from the USDA documents is the abbreviation “FV.” Type “FV” into Google and you’ll get a lot of results pertaining to “future value” — the most typical meaning of FV. In agriculture, however, FV usually stands for “fruits and vegetables,” and so for the USDA use-case scenario, Watson has been trained to understand that its search is being performed in an agriculture series of documents. Additionally, Watson would know that the term “vegetables” includes individual vegetables — it would also return responses relating to discrete vegetables (zucchini, artichoke, peas) rather than just the aggregate term “vegetables.”

This is the case for most Watson projects, in which subject matter experts help define things particular to their expertise. For example, one application relating to oncology has thousands of doctors helping to train it. Those experts then help build a dictionary that, presuming it is not proprietary, can continue to be expanded.
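For readers who want to see the dictionary idea in miniature, here is a toy Python sketch of the “FV” example. It is not Watson’s API or AskGordy’s implementation, and it does no real natural language understanding; it only shows how a small, hand-built domain dictionary can change what a search returns. The dictionary entries, sample documents and function names are all hypothetical, invented for illustration.

```python
import re

# Hypothetical domain dictionary for an agriculture document series.
ABBREVIATIONS = {"fv": "fruits and vegetables"}
CATEGORY_MEMBERS = {"vegetables": ["zucchini", "artichoke", "peas"]}
STOPWORDS = {"and", "the", "of", "for"}

# Hypothetical sample documents (not real USDA publications).
DOCUMENTS = {
    "doc1": "Zucchini acreage rose across the Midwest last season.",
    "doc2": "Calculating the future value of farm equipment loans.",
    "doc3": "FV inspection schedules for regional packing houses.",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def match(terms, documents):
    """Return the sorted ids of documents that share at least one term."""
    return sorted(
        doc_id for doc_id, text in documents.items()
        if terms & set(tokenize(text))
    )


def exact_search(query, documents):
    """Standard search: match only the literal words in the query."""
    return match(set(tokenize(query)), documents)


def expand_query(query):
    """Add abbreviation expansions and category members to the query terms."""
    terms = set(tokenize(query))
    for term in list(terms):
        if term in ABBREVIATIONS:
            terms.update(tokenize(ABBREVIATIONS[term]))
    for term in list(terms):
        terms.update(CATEGORY_MEMBERS.get(term, []))
    return terms


def expanded_search(query, documents):
    """Dictionary-aware search: match the expanded set of terms."""
    return match(expand_query(query), documents)


print(exact_search("FV", DOCUMENTS))     # only the document containing "FV"
print(expanded_search("FV", DOCUMENTS))  # also finds the zucchini document
```

In this sketch the exact search finds only the document that literally contains “FV,” while the expanded search also surfaces the zucchini document and still ignores the one about future value. A production system would learn far richer relationships than a lookup table, but the principle of defining domain terms up front is the same.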

How valuable is this type of cognitive computing to the ECM world? According to some estimates, as much as 85 percent of all data contained in documents is “dark data,” meaning it is not searchable by traditional methods. The ability to mine that data and return intuitive, relevant results (data that might otherwise have been lost entirely) would seem to be invaluable, or at least to have a high price tag attached. Large organizations may employ data scientists or business intelligence experts to delve into this data and bring it into the light in a useful, searchable manner for analysis, but not all companies can afford that luxury. The ability to mine the dark data hidden in the typical ECM system with a scalable, programmable machine-learning solution would certainly hold value for many such organizations.

The idea of robots and machine learning is often met with a combination of fear, skepticism and anticipation in today’s society: we joke that the robots are taking over, we worry about what Alexa overhears and we complain that Siri doesn’t understand us. But the truth is that there are many advantages to these advances as well. A 2017 IBM and Oxford Economics study surveyed 6,050 C-suite executives worldwide and found that 88 percent of the highest-performing organizations reported that cognitive computing was inevitable in their industry, and that their organizations either had already embraced the technology or were ready to do so. Of those high performers, nearly half were already piloting, implementing or operating cognitive technologies, versus only 11 percent of their lower-performing peers.

There are lessons to be learned from Watson, Gordy and, yes, Paris Hilton: the future is here, and it’s getting smarter every day. Embrace it.

is BPO Media and Research’s editorial director. As a writer and editor, she has specialized in the office technology industry for more than 20 years, focusing on areas including print and imaging hardware and supplies, workflow automation, software, digital transformation, document management and cybersecurity.