An important side effect of digital transformation is that your network is likely to become a digital crime scene. As such, you need a systematic approach to identifying the culprit. In this analogy, the crime is a network outage or gray failure. And this is where intent-based networking (IBN) can help.
The general approach in solving a crime like this is to collect as much information as possible, as soon as possible, and to narrow down the pool of suspects. So, let’s see via an example what role IBN plays in all this.
Digital crime scene profiling
Without “intent” you don’t even know that a crime has been committed. Finding traces of blood in a room in a blood bank or hospital is expected. Finding traces of blood in a room in the home of a missing person is a different matter. But without intent it’s hard to distinguish a blood bank from a home. In a similar manner, dropping a packet from an intruder or forbidden traffic source is a good thing. Dropping a customer’s packet because of a misconfigured ACL is a bad thing. Intent helps you differentiate the two.
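To make the distinction concrete, here is a minimal sketch of judging packet drops against declared intent. The `Intent` and `Drop` types, the `classify_drop` function, and the source names are all hypothetical and exist only for illustration; a real IBN system would model intent far more richly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    # Hypothetical intent: the traffic sources that are supposed to be served.
    allowed_sources: frozenset


@dataclass(frozen=True)
class Drop:
    source: str  # who sent the dropped packet
    device: str  # where the drop was observed


def classify_drop(drop: Drop, intent: Intent) -> str:
    """Return 'violation' if intent says this source should have been served."""
    if drop.source in intent.allowed_sources:
        return "violation"  # a customer's packet was dropped: a "crime"
    return "expected"       # forbidden traffic was dropped: working as intended


intent = Intent(allowed_sources=frozenset({"customer-a", "customer-b"}))
drops = [Drop("intruder-x", "leaf1"), Drop("customer-a", "leaf2")]
verdicts = {d.source: classify_drop(d, intent) for d in drops}
print(verdicts)
```

The same drop counter on a switch means two different things here, and only the intent tells them apart.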
But even when the intent is known, there will be things you don’t know. For example, you have an algorithm to optimally distribute application workloads across racks in a data center. But you don’t know exactly how each application behaves in terms of generating traffic, nor how a particular combination of applications behaves in terms of traffic burstiness. As a result, minor faults (called “latent failures” in a Microsoft paper about gray failures) may occur. These may include micro-bursts, ECMP imbalance, or temporary link overloads, and you need to be able to detect them.
Detecting latent failures is one thing. What you do about them is an entirely different matter. To focus your digital crime scene investigation, you need to collect as much relevant data as possible. The key word here is relevant, for reasons explained later. And ideally, you need to do this while the crime is happening. To achieve this in your network, you need an intent-based framework that reacts to an anomaly by starting a drill-down for more information or by performing some action. IBN is the framework that reasons about the change in the context of the intent. In law enforcement, this would be the equivalent of spawning an FBI investigator in the home of a victim the moment blood hits the floor. The benefit of doing this in real time, while the “crime” is happening, is twofold:
- First, you may not be able to collect this next level of detailed information all the time and on all the resources. And even if you could, would you want to? Even if law enforcement had a warrant to search the home of every person on Earth, they might not be able to do it, and even if they could, it might not be cost-effective given the low probability of finding good information. For example, a recent and well-known security breach at Target went undetected because the operator missed the alert that flagged it, which was buried among thousands of innocuous alerts.
- Second, even if you managed to do the above, you would end up spending most of your time searching for useful information in a sea of irrelevant data. I remember a scene in a TV crime show in which an FBI investigator sees the case he is working on broadcast on national TV news, with an appeal for “anyone who knows anything” to come forward. His reaction was one of despair and frustration: “Now, all the garbage of this country is going to end up in our laps.” He was clearly anticipating an avalanche of false leads, which would increase the likelihood of missing the right lead and was sure to drain valuable resources.
Going back to our example, upon detecting latent failures associated with the placement of application workloads, an intent-driven probe is spawned to collect information about the relevant context, such as the specific applications that were active on the server at the time of the latent failure and the status of related resources (protocol state, MAC/routing/TCAM table state, queue occupancy, CPU/memory). This information is invaluable as it is:
(a) timely (collected during the “crime”)
(b) relevant (collected on resources related in a specific way to the latent failure).
And to get the resources that are “related in a specific way” to the latent failure, you need intent as a single source of truth that reveals these relationships. In a similar fashion, FBI profilers provide the ever-important first step in any investigation: focus, by narrowing down the pool of suspects or, in the case of a digital crime scene, possible root causes.
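As a rough sketch of how intent can scope such a probe, the example below treats the intent as a small dependency graph and collects data only from resources within a few hops of the anomaly. The graph contents, `related_resources`, `spawn_probe`, and the `collect` callback are hypothetical names chosen for illustration; they do not describe any particular product's API.

```python
from collections import deque

# Hypothetical intent graph: each node lists the resources it depends on.
INTENT_GRAPH = {
    "app-web": ["server-1"],
    "app-db": ["server-2"],
    "server-1": ["leaf1:eth1"],
    "server-2": ["leaf2:eth1"],
    "leaf1:eth1": ["leaf1"],
    "leaf2:eth1": ["leaf2"],
    "leaf1": ["spine1"],
    "leaf2": ["spine1"],
    "spine1": [],
}


def related_resources(graph, start, max_hops):
    """Breadth-first search over relationships, in either direction."""
    neighbors = {node: set(deps) for node, deps in graph.items()}
    for node, deps in graph.items():
        for dep in deps:
            neighbors.setdefault(dep, set()).add(node)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop budget
        for nxt in neighbors[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return seen


def spawn_probe(graph, anomaly_node, collect, max_hops=2):
    """Collect detailed state only from resources related to the anomaly."""
    scope = sorted(related_resources(graph, anomaly_node, max_hops))
    return {resource: collect(resource) for resource in scope}


# Stand-in collector; a real one would read queue occupancy, table state, etc.
snapshot = spawn_probe(INTENT_GRAPH, "leaf1:eth1", collect=lambda r: {"resource": r})
print(sorted(snapshot))
```

With a micro-burst reported on `leaf1:eth1`, the probe touches the attached server, its application, the leaf, and the spine, and leaves the unrelated `app-db` side of the fabric alone.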
This timely information may lead to identifying the culprit. If not, it can be stored for post-mortem analysis and used to recognize the “signatures” that lead to latent failures. These “signatures” can then be fed back into an IBN system’s validation logic, which from that point on can recognize the formation of these signatures and deal with them proactively. Again, the key enabler for this validation is IBN’s ability to reason about a single source of truth in the presence of changes that lead to the formation of these signatures.
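One simple way to picture this feedback loop is to treat a signature as a named set of conditions over a telemetry snapshot, which validation logic can then check on every new snapshot. This is only an illustrative sketch: the signature name, the metric keys, and the thresholds below are invented, and real signatures would be derived from post-mortem data.

```python
# Hypothetical signature store: name -> conditions over snapshot metrics.
SIGNATURES = {
    "microburst-precursor": {
        "queue_occupancy": lambda v: v > 0.8,  # assumed threshold
        "ecmp_imbalance": lambda v: v > 0.3,   # assumed threshold
    },
}


def matching_signatures(snapshot, signatures):
    """Return the names of signatures whose conditions all hold."""
    matches = []
    for name, conditions in signatures.items():
        if all(
            key in snapshot and check(snapshot[key])
            for key, check in conditions.items()
        ):
            matches.append(name)
    return matches


forming = {"queue_occupancy": 0.9, "ecmp_imbalance": 0.4}
healthy = {"queue_occupancy": 0.5, "ecmp_imbalance": 0.4}
print(matching_signatures(forming, SIGNATURES))
print(matching_signatures(healthy, SIGNATURES))
```

A match on a forming signature is the cue to act before the latent failure turns into an outage.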
How IBN helps
To deal with Day-2 operations of your network, a common approach is to collect what is easy to collect. You have a warrant to search every house but no leads. This typically results in a lot of data, but not necessarily semantically rich data as it contains little context and little knowledge can be extracted out of it. In fact, too much data may be an outright bad thing as it obscures good data.
With intent-based analytics the situation is reversed — you can extract a lot of knowledge by collecting less data. This is possible because with IBN you know a lot upfront. You know how things should be working. And you are testing proactively for problems informed by the context in the intent.
Network reachability (how a packet traverses a network) is an essential aspect of a network service. It is what blood is for humans. There are already quite a few tools that can, after they discover the forwarding state, model how any packet would traverse a network and what path it would take. They can answer any question about reachability you ask. This is equivalent to being able to answer many questions about the health of the culprit, given that you have a blood sample. But it is just one piece of the puzzle. Solving a crime has gone way beyond just blood analysis. What if you knew (from witness testimony) that the culprit was bald, had a patch on his eye, had skin discoloration, wore size 11 boots, used a gun of a specific caliber and frequented a certain food joint?
It is how you leverage and compose this context that will narrow down and focus your investigation. In this situation, intent-based analytics would translate into CCTV cameras detecting the image of the suspect, which then automatically triggers an analysis of his credit card transactions and phone usage in the area where, and at the time when, the image was captured, all to track his movements and get more data points. This is something you could not afford to do for every person with blood type AB negative.
Detecting a failure signature requires composing information along multiple dimensions, and it is this composition aspect that is at the heart of IBN. These are the early days of IBN, so some of the initial solutions coming from established vendors are still one-dimensional. Putting multiple one-dimensional solutions together doesn’t make them multi-dimensional. It makes them a set of isolated solutions with multiple sources of truth. The complexity of composing them and asking the right question is still the responsibility of the operator.
The ability to answer any question is a fantastic benefit, but it is not enough. The real key is being able to ask the right question at the right time. Or better yet, to have IBN and IBN-generated analytics do this automatically on your behalf.
This article is published as part of the IDG Contributor Network.