Artificial intelligence is where the competition is in IT, with Microsoft and Google both parading powerful, always-available AI tools for the enterprise at their respective developer conferences, Build and I/O, in May.
It’s not just about work: AI software can now play chess, go, and some retro video games better than any human — and even drive a car better than many of us. These superhuman performances, albeit in narrow fields, are all possible thanks to the application of decades of AI research — research that is increasingly, as at Build and I/O, making it out of the lab and into the real world.
Meanwhile, the AI-powered voice technologies behind virtual assistants like Apple’s Siri, Microsoft’s Cortana, Amazon.com’s Alexa and Samsung Electronics’ Bixby may offer less-than-superhuman performance, but they also require vastly less power than a supercomputer to run. Businesses can dabble on the edges of these, for example developing Alexa “skills” that allow Amazon Echo owners to interact with a company without having to dial its call center, or jump right in, using the various cloud-based speech recognition and text-to-speech “-as-a-service” offerings to develop full-fledged automated call centers of their own.
Some of the earliest work on AI sought to explicitly model human knowledge of the world in a form that computers could process and reason from, if not actually understand. That led to the commercialization of the first text-based “expert systems.” Those early systems didn’t come by their expertise the way humans do, learning by experience over the course of their career. Instead, the experience was spoon-fed to them following a laborious process of humans interviewing other humans, and distilling their implicit knowledge into explicit rules.
The biggest advances in AI research in recent years, and the ones most applicable in the enterprise, have involved machines learning from experience to gain their knowledge and understanding. Improvements in machine learning led directly to the dramatic 4-1 defeat last year of 18-time world go champion Lee Sedol by AlphaGo, a program developed by Google’s DeepMind subsidiary.
Machine learning began with the creation of neural networks: computational models that mimic the way nerve cells, or neurons, transmit information around our bodies. Our brains contain around 100 billion neurons, each connected to about 1,000 others. An artificial neural network models a collection of these cells, each with its own inputs (incoming data) and outputs (the results of simple calculations on that data). The neurons are organized into layers, each layer taking input from the previous one and passing its output to the next. When the network correctly solves a problem, additional weight is given to the outputs of the neurons that correctly predicted the answer, and so the network learns.
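To make the layered structure concrete, here is a minimal sketch in Python of how data might flow through such a network. The function names, the sigmoid squashing function and every weight value are illustrative assumptions, not taken from any real system, and the sketch covers only the flow of data through the layers, not the learning step.

```python
import math

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs, then a squash."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)

def layer(inputs, weight_rows, biases):
    """A layer is a list of neurons that all see the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

def forward(inputs, layers):
    """Pass data through each layer in turn, output feeding the next input."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer(activations, weight_rows, biases)
    return activations

# Hypothetical network: two inputs -> two hidden neurons -> one output neuron.
# The weights are made up for illustration; a real network would learn them.
NETWORK = [
    ([[0.5, -0.5], [0.3, 0.8]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]

output = forward([1.0, 0.0], NETWORK)
print(output)
```

Real deep networks follow the same pattern with many more layers and millions of weights, which is where the computational cost comes from.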
Networks with many layers — so-called deep neural networks — can be more accurate. They are also computationally more expensive — prohibitively so, in their early days. They were saved from being a research curiosity by the parallel processing capacity of the GPU, previously used mainly to display games, not to play them.
Transistors are doing it for themselves
Those advances are giving businesses new ways to deal with their big data problems — but creating the necessary technology is, to some extent, a big data problem itself.
One of our strengths is that we can learn from just a few examples, Google engineering director Ray Kurzweil told attendees at the Cebit Global Conference in March.
“If your significant other or your boss tells you something once or twice, you might actually learn from that, so that’s a strength of human intelligence,” he said.
But in the field of deep learning there’s a saying that “life begins at a billion examples,” he said.
In other words, machine learning technologies such as deep neural networks need to observe a task a billion times to learn to do it better than a human.
Finding a billion examples of anything is a problem in itself: AlphaGo’s developers scoured the Internet for records of thousands of go games played by human players to provide the initial training for their 13-layer neural network but, as it became stronger, they resorted to making it play against other versions of itself to generate new game data.
AlphaGo drew on two types of machine learning to win its match. The human games were analyzed using supervised learning, in which the input data is tagged with the response the neural network should learn — in this case, that playing these moves leads to victory.
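As a toy illustration of supervised learning, the sketch below trains a single artificial neuron on tagged examples. It uses the classic perceptron update rule, chosen here for brevity; it is not AlphaGo's training method, and the data set (the logical OR function), the learning rate and the function names are all assumptions made for the example.

```python
def predict(weights, bias, features):
    """Answer 1 if the weighted sum of the inputs is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if total > 0 else 0

def train(examples, epochs=20, rate=0.1):
    """Nudge the weights toward the tagged answer whenever a guess is wrong."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, bias, features)  # -1, 0 or +1
            weights = [w + rate * error * x for w, x in zip(weights, features)]
            bias += rate * error
    return weights, bias

# Each example pairs the input data with the response the neuron should learn.
LABELED = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

weights, bias = train(LABELED)
for features, label in LABELED:
    print(features, "expected", label, "got", predict(weights, bias, features))
```

The tags do the teaching: without the second element of each pair, the training loop would have no error to correct.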
When AlphaGo was left to play against itself, another technique, called reinforcement learning, was used. The goal of winning the match was still explicit, but there was no input data. AlphaGo was left to generate and evaluate that for itself, using a second neural network in which the neurons began with the same weightings as the supervised learning network, but gradually modified them as it discovered superhuman strategies.
A third technique, unsupervised learning, is useful in business but less useful in games. In this mode, the neural network is given no information about its goal but is left to explore a data set on its own, grouping the data into categories and identifying links between them. Machine learning used this way becomes just another analytics tool: It may identify that a game can be played or end in several ways, but it leaves the judgment of what to do about it to a human supervisor.
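The grouping behavior described above can be sketched with k-means clustering, a simple unsupervised technique used here as a stand-in for a neural network. The order values, the two starting guesses and the function name are all hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group numbers around the nearest center, then move each center to the
    average of its group. No labels or goals are supplied."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # An empty group keeps its old center rather than dividing by zero.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical order values in dollars; nothing says which are "small" or "large".
ORDERS = [12, 15, 14, 95, 102, 99]

centers, groups = kmeans_1d(ORDERS, centers=[0.0, 50.0])
print(centers)
print(groups)
```

The algorithm finds two clusters of orders, but deciding what those clusters mean, and what to do about them, is still left to a person.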
There are plenty of companies, big and small, offering some of the building blocks of AI for use in enterprise applications and services. The smaller companies often focus on specific tasks or industries; the bigger ones on the big picture, and tools that can be used for general applications.
Thanks in large part to the barrage of publicity surrounding its Watson offering, IBM is one of the first vendors of AI to spring to mind — although it prefers the term “cognitive computing.”
The Watson range includes tools for creating chat bots, discovering patterns and structure in textual data, and extracting knowledge from unstructured text. IBM has also trained some of its Watson services with industry-specific information, tailoring the offering for use in health care, education, financial services, commerce, marketing and supply-chain operations.
IBM and its partners can help integrate these with existing business processes, or developers can dig in for themselves, as most of the tools are also available as APIs on IBM’s Bluemix cloud services portal.
Cognitive is Microsoft’s preferred term too. Under the Microsoft Cognitive Services brand, it offers developers access to APIs for incorporating machine learning technologies into their own applications. These include tools for converting speech to text and understanding its intent; detecting and correcting spelling mistakes in a text; translating speech and text; and exploring relationships between academic papers, their authors and the journals that publish them. There’s also a service for building chat bots and connecting them to Slack, Twitter, Office 365 mail and other services, called Bot Framework. Microsoft also offers an open-source toolkit businesses can download to train their deep learning systems using their own massive datasets.
At Build in early May, it offered production versions of services previously only available in preview, including a face-tagging API and an automated Content Moderator that can approve or block text, images and videos, forwarding difficult cases to humans for review. There’s also a new custom image recognition service that businesses can train to recognize objects of interest to them, such as parts used in a factory.
Google offers many of the machine learning technologies it uses internally as part of its Google Cloud Platform. The systems are available either already trained for particular tasks or as blank slates that can be trained on your data, and include image, text and video analysis, speech recognition and translation. There’s also a natural language processing tool for extracting sentiment and meaning from text that can be used in chat bots and call centers. There’s even a super-focused job search tool that attempts to match jobseekers with vacancies based on their location, seniority and skills.
As for Amazon Web Services, it allows businesses to create new “skills” or voice-controlled apps for Alexa, the digital assistant embedded in Amazon Echo devices, and offers many of the technologies behind Alexa “as a service.” Its latest is a call center as a service, Amazon Connect, billed per call and per minute. This offers integrations with Amazon’s speech recognition and understanding services, allowing businesses to create more sophisticated interactive voice-response (IVR) systems.
When tomorrow comes
Those services are all in production, but there are plenty of others waiting in the wings.
Microsoft, for example, is already inviting businesses to test “preview” versions of several other services. These include the Emotion API image analysis tool that can identify the emotion expressed by faces in photos, assigning relative probabilities to anger, contempt, disgust, fear, happiness, sadness and surprise. (You can send it a selfie to try it out.) Coming enhancements to the company’s speech tools will allow businesses to tune the engine to specific regions or environments (Custom Speech Service), and even to recognize the speaker.
A new tool called QnA Maker extracts frequently asked questions from a corpus of text, and serves them up as answers for a chat bot. The results so far are somewhat obtuse, but that could be a problem with the source text rather than QnA Maker, which in all likelihood has not yet read a billion FAQs to learn its art.
At Google’s Cloud Next ’17 conference in San Francisco in March, the company unveiled a private beta test of its Cloud Video Intelligence API, which will allow beta testers to find relevant video clips by searching for nouns or verbs describing the content. Google hopes to stimulate further demand for its services with a new machine learning startup competition it is running with venture capital firms Data Collective and Emergence Capital, and with the opening of its Machine Learning Advanced Solution Lab in Mountain View, California, where customers can work with Google experts to apply machine learning to their own problems.
Two months later, at Google I/O, the company showed the TensorFlow Lite platform for mobile phones, and a beefier processor for running machine learning workloads, the Cloud TPU (Tensor Processing Unit). It also published details of some of the machine learning APIs it had been using internally.
The big companies don’t have a monopoly on research into AI, but competition for qualified personnel is fierce. Facebook, which has its own internal AI research division, organizes internal training events to raise awareness of machine learning among its staff.
Some of the biggest companies engaged in AI research are showing a willingness to publish their results and to release much of their code under open-source licenses. Even the notoriously secretive Apple published its first research paper in the field late last year.
But they’re not giving away the crown jewels. Those machine learning toolkits and cloud services are all very well, but it’s clear that an untrained neural network is about as useful to the typical enterprise as a 16-year-old high-school drop-out.
Experience counts, just as it does in recruiting, and companies like Google, Facebook, Amazon, and even Apple and Microsoft, are gathering those billions of little examples that Kurzweil spoke of. Every search result clicked on or shopping recommendation accepted, each photo tagged or sports score asked for is added to the collection.
Of course, a billion examples may not always be necessary: Computers can learn to do some things nearly as well as a human with a lot less data, and for many tasks today, nearly may be good enough, especially if the computer is able to refer situations it can’t deal with to a human supervisor.
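One common way to hand off such cases is a confidence threshold, sketched below in Python. The threshold value, the labels and the function name are assumptions for illustration; they do not describe any particular vendor's service.

```python
def route(label, confidence, threshold=0.8):
    """Act on the model's answer only when it is confident enough;
    otherwise pass the case to a human supervisor."""
    if confidence >= threshold:
        return ("automated", label)
    return ("human_review", label)

# Hypothetical model outputs: a predicted label and a confidence from 0 to 1.
print(route("refund_request", 0.93))   # confident, so handled automatically
print(route("legal_complaint", 0.41))  # uncertain, so referred to a person
```

Lowering the threshold automates more cases at the cost of more mistakes, so where to set it is a business decision as much as a technical one.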
Right by your side
That’s what many of the organizations building AI-powered chat bots are counting on, in any case. They have far fewer than one billion data points to go on, but they’re still hoping that services like Microsoft’s QnA Maker will help them serve customers in new ways.
One such is Arthritis Research UK, a charitable organization that funds medical studies of joint inflammation and provides advice to sufferers. It is using IBM’s Watson Conversation API to build a virtual assistant that will answer questions about joint pain and suggest appropriate exercises to alleviate symptoms.
The organization’s goals are twofold: To reduce the load on its existing telephone support staff, and to create a new conversational channel through which it might deliver other services in the future.
The assistant has already learned 1,000 answers to common questions about 50 musculoskeletal conditions.
“We will be extending its capabilities to include information about medical and surgical treatments as well as diet in due course,” said Shree Rajani, communications campaign manager at Arthritis Research UK.