Privacy and Visibility – The dichotomy of encryption and inspection

May 16, 2017

The encoding or encryption of communications and information is a very old practice. The concept is relatively simple. One of the easiest examples is simply to reverse the alphabet: A for Z, B for Y, and so on. The reverse mapping is the ‘key’ to deciphering the message. We needn’t go into the detailed but fascinating history of the evolution of cryptography and of the key itself. Instead we only need to touch on a few key historical milestones and how they have shaped the world today.
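The reversed-alphabet scheme described above is historically known as the Atbash cipher, and it can be sketched in a few lines of Python:

```python
import string

def atbash(text: str) -> str:
    """Map A->Z, B->Y, ... using the reversed alphabet as the 'key'."""
    upper = string.ascii_uppercase
    lower = string.ascii_lowercase
    table = str.maketrans(upper + lower, upper[::-1] + lower[::-1])
    return text.translate(table)

# The mapping is its own inverse: applying it twice restores the message.
ciphertext = atbash("ATTACK AT DAWN")
print(ciphertext)          # ZGGZXP ZG WZDM
print(atbash(ciphertext))  # ATTACK AT DAWN
```

Because the key is fixed (the reversal itself), anyone who knows the scheme can decipher the message, which is exactly why the secrecy of the key became so important.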

Cryptography is indeed an old practice. The ancient Spartans would write encrypted messages on strips of cloth wrapped around wooden staffs of various widths, then send just the cloth strip with the courier. Only if the right staff was used could the message be deciphered. Here the ‘key’ is the width of the staff. That information would be known or communicated to the receiver ahead of time so that they would have the right staff on hand to decipher the message. Obviously, anyone who intercepted both the message and the information about the staff width could decipher the message as well. So we see that the secrecy of the key is critically important in all of this.

Let’s move forward to the Mongol Empires of Asia. When one Khan wished to send a secret message to another, they would shave the heads of two boys. On one they would tattoo the key; on the other, the encrypted message. They would then let the boys’ hair grow back and send them on separate caravans to wherever they needed to be sent, once again with the ‘key’ child sent ahead of the ‘message’ child. Once the children were received, their heads were shaved, the information was recorded, and they were placed into the service of the receiving Khan or used to carry a return response. Here we start to see an intentional effort to hide the very existence of both the key and the message. This practice is known as steganography. It begins a long line of intrigue and secrecy that is still prevalent in cryptography in cyberspace today, where key cracking and hash reversal to obtain clear-text passwords remain important methods.

If we move forward again to the Second World War, we have the legendary Enigma machines of Nazi Germany. This code was effectively unbreakable in the earlier portions of the war.
This was due not only to the increased complexity of the method, which involved geared disks and dials of various ratios within a machine, but also to the sophistication of the key distribution method: little black books carried by every field commander who handled communications. Obviously it was a very big leap forward for the Allies to obtain one of these books, but even then, without detailed knowledge of the Enigma method it was of little use, and the books were updated and redistributed on a regular basis, so each was valid for only a limited time. Nevertheless, through the diligent reverse engineering of Marian Rejewski, Alan Turing and others, the Enigma code was cracked, and this breakthrough contributed strongly to the Allied victory. What many folks don’t realize is that this work was also the foundation of modern computing as we know it today. That’s right: modern computing started as a need to crack a code, to decipher the encrypted messages of the enemy. So from time immemorial, the pattern had not changed.
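Returning for a moment to the scytale described earlier: wrapping a strip around a staff and writing along its length is equivalent to a columnar transposition, with the staff width as the key. A toy sketch (not a secure cipher, and it assumes the message length is a multiple of the width):

```python
def scytale_encrypt(message: str, width: int) -> str:
    # Unwinding the strip reads every `width`-th letter, starting at each
    # successive offset -- a plain columnar transposition.
    return "".join(message[i::width] for i in range(width))

def scytale_decrypt(ciphertext: str, width: int) -> str:
    # Decryption is the same transposition with the complementary width.
    # (Assumes len(ciphertext) is a multiple of `width`.)
    return scytale_encrypt(ciphertext, len(ciphertext) // width)

cipher = scytale_encrypt("ATTACKATDAWN", 4)
print(cipher)                      # ACDTKATAWATN
print(scytale_decrypt(cipher, 4))  # ATTACKATDAWN
```

Notice how small the keyspace is: an interceptor only has to try a handful of staff widths, which is why such ciphers fell so easily once the method was known.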

Guess what? – Times have changed!

Today we use encryption without even thinking about it, and that is the intention. It’s a part of daily life, transparent to us as users. There is encryption of data in motion, encryption of data at rest, and encryption for access authentication and transactions. As an example, the disk on the laptop I’m using is encrypted. I never have to deal with it other than seeing a brief message about it during boot-up. It is the same with web sites: when I go to a secure site, the encryption and key methods are automatic, and I don’t need to deal with the complexities of the key exchange. This is all very convenient, but it also creates an infrastructure that can potentially be compromised by someone with the right skill and tools. So we witness an ‘arms race’ of sorts: stronger encryption methods with much longer keys that are regenerated very frequently, with the evolution of cracking methods following close behind. Quantum key generation (QKG) is a recent step in this evolution, since two things make a key hard to crack, its length and the randomness of its generation, and QKG can easily provide both. Indeed, with the advent of developments around true quantum encryption, a truly unbreakable cipher will finally be achieved that is beyond the known powers of modern computing to crack. So it sounds like we won, right? It sounds like we have beaten the bad guys and will soon have secure communications without fear of interception. While this sounds good, like all things in security, not quite.
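The point about key length can be made concrete with a back-of-the-envelope estimate. The sketch below assumes a brute-force attacker testing one trillion keys per second, a generous but entirely arbitrary figure:

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365
GUESSES_PER_SECOND = 10**12  # assumed attacker speed: one trillion keys/sec

def years_to_search(key_bits: int) -> float:
    """Expected years to find a key by brute force (half the keyspace)."""
    keyspace = 2 ** key_bits
    return (keyspace / 2) / GUESSES_PER_SECOND / SECONDS_PER_YEAR

for bits in (56, 128, 256):
    print(f"{bits}-bit key: ~{years_to_search(bits):.2e} years")
```

A 56-bit key (old DES) falls in hours at this rate, while a 128-bit key takes on the order of 10^18 years, which is why the arms race favors longer, more randomly generated keys.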

This has alarmed authorities. Many law enforcement and national justice agencies are raising the concern that unbreakable (or at least computationally impractical) ciphers are essentially impossible to eavesdrop on in anything near real time. This becomes a very attractive avenue for criminal and terrorist organizations to establish and maintain communications with little risk of being monitored, and it is already done on a regular basis with existing ‘strong’ cryptographic methods. Law enforcement is reduced to analyzing communications patterns, which is eerily similar to the situation the Allies were in at the early part of WWII. So we come to the following dichotomy, as illustrated below.

  • GOOD – There are malicious entities that strongly desire to extract or intercept and read your data. They would also like to impersonate you. As a consequence you desire secrecy through the use of encryption.
  • BAD – There are malicious entities that wish to communicate with one another in secret; as a consequence, they too desire secrecy through the use of encryption.
  • GOOD – Law enforcement agencies desire the ability to intercept and decrypt criminal or terrorist communications in order to gain better insight into their activities and plans.
  • BAD – Malicious entities also desire the ability to intercept and decrypt your communications in order to gain better insight into your data and identity.
  • BAD – Malicious entities also desire to use your own encryption methods as a cover of darkness to move laterally and northbound toward your sensitive, high-risk assets.

Two things to note here: first, the bad points outnumber the good! Second, point five is a very scary thought if it has never occurred to you before. Many folks make the easy but incorrect assumption and invest in a strong outer perimeter defense plus the use of encryption within that perimeter to protect sensitive data both in motion and at rest. At first glance this seems like good prudence, but if the practice is taken too far it can be a very bad thing from a security perspective.

There is now the realization that cyber-communications requires some sort of independent inspection visibility into the end-to-end data path. The reasons for this are manifold, as two simple cases illustrate:

1). User password and account access privileges can be compromised, and the intruder uses this compromise to gain access to encrypted services. From there, the normal encrypted channels provide avenues for further infection and the establishment of command and control.

2). A user’s device could be compromised, allowing the attacker to use the user’s identity to further infiltrate the network and attached systems. There are several ways this can occur with mobile edge devices, which at times yield vulnerabilities due to user behavior, a shortcoming in the edge protection mechanisms, or a combination of both.

There are many examples in both categories, and others, that illustrate that assuming total end-point protection is not a realistic proposition. If there is a determined intruder, they will get access. The question is then: how far do they get, and what is the impact? How deep can they penetrate into the network, and how much damage can they do before they are discovered? The data is scary. Roughly 80% of breaches are discovered not by internal IT security but by third-party resources such as law enforcement agencies or financial partners, and though this number has improved recently (it shows folks are listening!), the ratio is still far too high. Add to this the fact that the average dwell time is around 240 days prior to discovery. Again, this number is improving, but not nearly fast enough, and it is never likely to reach ‘zero day’. These are disturbing results. They mean that much of the security infrastructure we have built is being used against us in ways we didn’t intend, allowing intruders to move or extract content at will, particularly if we are not diligent in practice and become complacent about the technology.

This clearly demonstrates that we require visibility into network information flows in order to see anomalies quickly and investigate their cause. Because one thing is certain: if you are a target, they will get in – eventually. So it is no longer only a question of maintaining a secure perimeter with a strong inspection and sandboxing environment. It’s about detecting the threat before it gets too far into the secure domains of interest. These could hold credit card data, health care personal records, individual criminal justice records, industrial control networks, intellectual property, etc. The list could go on and on, and it will only grow with the growth of the Internet of Things. If you haven’t noticed, the whole paradigm has shifted from encryption and secrecy to visibility and inspection. Indeed, there is an almost Yin-Yang relationship between the two in a truly comprehensive security practice.


Visibility into the SDN Fx Fabric

SDN Fx Fabric Connect yields what is referred to as a ‘stealth’ network topology. Two key terms need to be pointed out here: stealth, the quality of being undetectable or undiscoverable, and topology, the actual layout of the network infrastructure and its internal switching paths. As pointed out earlier, it is important to realize that stealth networking is focused on concealment alone; network data is not encrypted by default.

All accepted encryption methods work over SDN Fx, however, so an IT architect could in theory encrypt end to end across the stealth network. The question, then, is where do we get the visibility we require?

This is where our work began with various security partners. Visibility into network traffic is critical to a proper security practice. What we have arrived at is part technology, part practice, and it is based on three points.

Point one – End to end encryption is a bad thing as a general practice

The simple fact of the matter is that the bad guys use encryption as well. Many will argue that they could pick out these covert sessions, but that position is increasingly untenable. If end points that use valid encryption become compromised, the very same ‘trusted’ encrypted session can be used to further infiltrate and compromise, or for the exfiltration of data. If you’re encrypting end to end, there is no place to see any indication of abnormality other than changes in traffic patterns.
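To illustrate what watching ‘changes in traffic patterns’ might mean in practice, here is a minimal volumetric baseline check. The byte counts and the three-sigma threshold are invented purely for illustration:

```python
import statistics

def flags_anomaly(history, new_value, threshold=3.0):
    """Flag a per-interval byte count more than `threshold` standard
    deviations from the historical mean (a toy volumetric baseline)."""
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history) or 1.0  # avoid divide-by-zero
    return abs(new_value - mean) / stdev > threshold

baseline = [100, 110, 95, 105, 98, 102, 107, 99]  # bytes/sec, invented
print(flags_anomaly(baseline, 104))  # False: within normal variation
print(flags_anomaly(baseline, 900))  # True: possible exfiltration burst
```

A check this crude can only catch gross volumetric shifts; subtler misuse of an encrypted channel needs the inspection zones discussed next.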


Point two – There needs to be enforcement of encryption free zones

As a response to point one, when encryption is used there always has to be a point for the inspection of clear data. Ideally, this should be coordinated at the most efficient and cost-effective level, which requires some thought about implementation and purpose. Further below in this article we will provide some deeper insights into this challenge.

Point three – There needs to be coordination between the security and network infrastructure to deliver the points highlighted above

In order for the above two points to be realized, the network service paths must be coordinated to enforce that the required data is decrypted, inspected and, if required, re-encrypted. This brings about the concept of a network security service chain, which can be very powerful if properly leveraged with the right technology sets.
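As a sketch of what such a service chain could look like in software, the pipeline below composes decrypt, inspect and re-encrypt stages in order. The stage implementations are deliberately fake (byte reversal stands in for real cryptography, and the ‘signature’ is invented); only the chaining pattern is the point:

```python
from typing import Callable, List

# A chain stage takes a payload and returns it (possibly transformed),
# raising an exception if the traffic must be dropped.
Stage = Callable[[bytes], bytes]

def decrypt(payload: bytes) -> bytes:
    return payload[::-1]  # stand-in for a real decryption step

def inspect(payload: bytes) -> bytes:
    if b"malware-signature" in payload:
        raise ValueError("threat detected: dropping flow")
    return payload

def reencrypt(payload: bytes) -> bytes:
    return payload[::-1]  # stand-in for re-encryption toward the destination

def run_chain(payload: bytes, stages: List[Stage]) -> bytes:
    # Clear data exists only between stages, never outside the chain.
    for stage in stages:
        payload = stage(payload)
    return payload

print(run_chain(b"olleh", [decrypt, inspect, reencrypt]))
```

The key property mirrored here is that inspection sees clear data while everything entering and leaving the chain stays encrypted.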

There is a resulting triangulated relationship, displayed below. In the lower left corner there is the category of ‘Stealth’: the ability to reduce or eliminate topological interpretation of the network infrastructure, as well as movement within it. This is married with ‘Hyper-segmentation’, the capacity for micro-segmentation with dynamic elasticity. On the lower right we have Encryption, which on the surface might seem an unqualified good (in the end it is, I promise… stay tuned) but, as we’ve seen, can be a bad thing as well: malicious entities can turn the very encryption you trust against you.

At the apex of the triangle we see the requirement for visibility into the traversing traffic patterns of the network. Without this capability the overall security practice is drastically compromised. The holistic aspect of this approach is obvious: if we can intercept or corral information into proper inspection boundaries, then we have the ability to enforce encryption free zones where security instrumentation can be applied with the most effective results.

The following diagram illustrates several SDN Fx topologies where encryption free zones can be engineered. Any allowed encrypted data patterns should require a strict policy exception and should be monitored by a separate encryption free zone further into, or out of, the network. This requires service chaining the zones into whatever form the service in question requires.

At the traditional data center demarcation we see the use of two forms of encryption free zones. The first is at the User to Network Interface (UNI) boundary, which deals with normal 802.1Q tagged data. The second is found at the Network to Network Interface (NNI) level, which is based upon 802.1ah framed data; SDN Fx itself is based upon 802.1ah transport. This provides two potential areas where encryption free zones can be enforced for total visibility into traversing traffic. Any use of encryption there should be prohibited, or at least require policy exceptions and alternative methods of inspection. While this approach covers the majority of north-south user traffic, there is also the need to inspect any incoming traffic from the DMZ or other, perhaps federated, demarcations.
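As a rough sketch of how a tap at these two boundaries might distinguish the frame types, the Ethertype following the outer MAC addresses separates 802.1Q customer frames (0x8100) from 802.1ad/802.1ah backbone frames (0x88A8 B-tag, 0x88E7 I-tag). This is a simplified classifier for illustration, not production frame parsing:

```python
import struct

# TPID/Ethertype values from the IEEE 802.1Q/802.1ad/802.1ah standards.
ETH_8021Q  = 0x8100  # customer VLAN tag (UNI side)
ETH_8021AD = 0x88A8  # service/backbone B-tag
ETH_8021AH = 0x88E7  # 802.1ah I-tag (NNI / MAC-in-MAC side)

def classify(frame: bytes) -> str:
    """Classify a raw Ethernet frame by the first tag after DA + SA."""
    if len(frame) < 14:
        return "runt"
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == ETH_8021Q:
        return "UNI (802.1Q tagged)"
    if ethertype in (ETH_8021AD, ETH_8021AH):
        return "NNI (802.1ah backbone)"
    return "untagged/other"

# Two fabricated frames: 12 zero bytes of MACs, then the tag Ethertype.
uni_frame = bytes(12) + struct.pack("!H", ETH_8021Q) + bytes(4)
nni_frame = bytes(12) + struct.pack("!H", ETH_8021AD) + bytes(4)
print(classify(uni_frame))  # UNI (802.1Q tagged)
print(classify(nni_frame))  # NNI (802.1ah backbone)
```

In practice a real tap would walk the full tag stack, but the UNI/NNI split above is the distinction the two inspection boundaries rely on.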

We also see that encryption free zones can be implemented in certain edge situations where there is a high volume of encrypted peer-to-peer traffic. This is increasingly a requirement in certain distributed IoT frameworks, due to the increasing push of both computing and data to the network edge, a phenomenon often termed ‘fog computing’. The ability to inspect these types of traffic patterns requires encryption free zones relatively close to the network edge. Here the UNI/NNI boundary is again the best place to implement inspection, and it may be very specific to a function or a service. Note also that with the recent introduction of VOSS 6.0, SDN Fx can now mirror I-SIDs (Virtual Service Networks) from the UNI side to monitoring systems, which provides an excellent method to focus on certain edge behaviors in a very dedicated and discreet fashion. This approach does, however, require that the encryption function be decoupled from the actual device or end station. If that cannot be accomplished, the traffic should go to a demarcation where such an encryption free zone can be enforced. This, incidentally, is the approach of our Surge 2.0 Secure IOT solution offering.

The notion of a network security service chain is a very powerful one and not necessarily new. I first came upon this issue back in 2006 and filed a patent on the notion of mid-com interception for inspection, granted as US Pat# 7,739,728. SDN Fx provides the ability to control service paths and accomplish this concept of ‘data corralling’: controlled network hyper-segments are directed into encryption free zones where inspection can then occur. The diagram below illustrates a security service chain, now in development, for the interception of clear data for inspection within a single network element.

As can be seen, an encrypted hyper-segment comes into one side of the network element, where it is decrypted, cataloged and then placed into a security inspection service chain that can involve full packet capture as well as threat and intrusion services. Additionally, there may be other services, such as load balancing or firewalls, that are desired, particularly at major traffic junctions. On the outbound side of the network element the data is re-encrypted and sent on to its destination. Note that clear data is never on the wire: the whole service chain can occur within a single network element. This is the core essence of the security inspection point highlighted in patent # 7,739,728. But the true evolution is in the embodiment of a platform that supports such functions in a truly virtualized fashion, something that has now been achieved in our development labs.

Security technology partners can provide the threat intelligence and analytics placed within these zones, whether UNI or NNI. The end result is a holistic secure framework that incorporates encryption for content security, but in a controlled and monitored fashion that always retains a point of inspection for traversing data. Why is this important? We are all aware of the severe damage that the WannaCry ransomware attack wrought. It took down much of the UK’s National Health Service to the point where only emergency cases were being handled, and many other countries were affected as well. The reality is that encryption offers no protection against such a worm; as a matter of fact, the worm will use your encryption services to propagate quietly and undetected. SDN Fx makes it very difficult for the worm to propagate, due to hyper-segmentation and the isolation of critical communities. So while you might be impacted by it, if the network is properly designed your critical assets should remain safe. But this is clearly not enough. You need to be able to detect it before it does any significant damage. Encryption free inspection zones offer this capability.

This brings a new level of meaning to the motto – “Prevention is an ideal, but detection is an absolute must!”



Advanced Persistent Threats

November 23, 2016

Coming to a network near you, or maybe your network!


There are things that go bump in the night and that is all they do. But once in a while things not only go bump in the night, they can hurt you. Sometimes they make no bump at all! They hurt you before you even realize that you’re hurt. No, we are not talking about monsters under the bed or real home intruders; we are talking about Advanced Persistent Threats. This is a major trend that has been occurring at a terrifying pace across the globe. It targets not the typical servers in the DMZ or the Data Center, but the devices at the edge. More importantly, it targets the human at the interface. In short, the target is you.

Now I say ‘you’ to highlight the fact that it is you, the user, who is the weakest link in the security chain. And like all chains, the security chain is only as good as its weakest link. I also want to emphasize that it is not you alone, but myself, or anyone, or any device for that matter, that accesses the network and uses its resources. The edge is the focus of the APT. Don’t get me wrong: if they can get in elsewhere, they will. They will use whatever avenue they find available. That is another point: the persistence. They will not go away. They will keep at it until eventually they find a hole, however small, and exploit it. Once inside, however, they will be as quiet as a mouse. Being unknown and undetected is the APT’s biggest asset.

How long this next phase lasts is indeterminate; it is very case specific. Many times it’s months, if not years. The reason is that it is not about attacking; it’s about the exfiltration of information from your network and its resources, and/or totally compromising your systems and holding you hostage. This will obviously be specific to your line of business. In the last article we made it plain that regardless of the line of business, there are common rules and practices that can be applied to data discovery. This article hopes to achieve the same goal: to not only edify you as to what the APT is, but to illustrate its various methods and, of course, provide advice for mitigation.

We will obviously speak to the strong benefits of SDN Fx and Fabric Connect for the overall security model. But as in the last article, the technology takes a back seat to the primary practices, whatever the technology in use, and to the people and policies that are mandated. In other words, a proper security practice is a holistic phenomenon that is transient and is only as good as the moment in space and time it occupies. We will close by talking about our ability, and perhaps soon the ability of artificial intelligence (AI), to think beyond the current threat landscape and perhaps even learn to better predict the next steps of the APT. So this will be an interesting ride. But it’s time you took it.

What is the Advanced Persistent Threat?

In the past we have dealt with all sorts of viruses, Trojans and worms. Many folks ask: what is different now? In a nutshell, in the past these things were largely automated pieces of software that were not really discerning about the actual target. In other words, if you were a worm meant for a particular OS or application and you found a target that was not updated with the appropriate protection, you nested there. You installed and then looked to ‘pivot’ or ‘propagate’ within the infected domain. Past malicious software was opportunistic and non-discretionary in the way it worked. The major difference with the APT is that it is targeted. APTs are also typically staffed and operated by a dark IT infrastructure. They will still use the tools, the viruses, the Trojans, the worms, but they will do so with stealth, and the intent is not to kill but to compromise, perform exfiltration and even establish control. They will often set traps so that, once it is clear they have been discovered, they run a ransomware exploit as they leave the target. This gives them a lasting influence and extended impact.

In short, this is a different type of threat. It is like moving from the marching columns of ancient Roman armies to the fast and flexible mounted assaults of the steppe peoples of Asia. The two were not well suited to one another. In open country, horseback was optimal; in populated farmland, and particularly in the cities, the Roman method proved superior. This went on for centuries until history and biology decided the outcome. Afterwards came a new morphing, the mounted knight: a method which took the best from both worlds and combined them, creating a military system that lasted for almost a thousand years. So we have to say that it had a degree of success and staying power.

We face a similar dilemma. The players are different, as are the weapons, but the scenario is largely the same. The old is passing away and the new is the threat on the horizon. But I also want to emphasize that probably no one throughout the evolution of warfare threw a weapon away unless it was hopelessly broken. Folks still used swords and bows long after guns were invented. The point is that the APT will use all weapons, all methods of approach, until they succeed. So how do you succeed against them?

Well, this comes back to another dilemma. Most folks cannot account for what is on their networks. As a result, they have no idea what a normal baseline of behavior is. If you do not have that awareness, how do you think you will catch the transient anomalies of the APT? This is the intention of this article: to get you to think in a different mode.

The reality is that APTs can come from anywhere. They can come from any country, even from inside your organization! They can serve any purpose: monetary, political, and so on. They will also tend to source in the country where the target is and use the ambiguity of DNS mapping to trace ‘home’. This is what makes them advanced. They have well educated and trained staffs mounting a series of strong attack phases against your infrastructure. Their goal is to gain command and control (C2) channels, for either the exfiltration of information or actual control of certain subsystems. They are not out to expose themselves by creating issues.

As a curious parallel, there has been a noted decrease in DoS and DDoS attacks on networks as the APT trend has evolved. It’s not that these attacks aren’t used anymore; they are now used in a very limited and targeted fashion, which makes them far more dangerous: often to cover up some other clandestine activity the APT is executing, and then only as a very last resort. For the APT, stealth is key to long-term success, so the decrease in these types of attacks makes sense when looked at holistically. But note that a major IoT DDoS attack recently occurred using home video surveillance equipment. Was it just an isolated DDoS, or was it meant to turn folks’ attention toward it? We may never know.

These organizations may be nation states, political or terrorist groups, even corporations involved in industrial espionage. The APT can be anywhere, and it can put its targets on anything, anywhere, at any time according to its directives. The reason they are so dangerous is that they are actual people, organized, using their intelligence and planning against you. In short, if they know more about your network than you do… you lose. Pure and simple.

So what are the methods?

There has been a lot of research on the methods that APTs use. Because this is largely human-driven, the range can be very wide and dynamic. Basically, it all comes down to extending the traditional kill chain, a concept first devised by Lockheed Martin to footprint a typical cyber-attack. This is shown in the illustration below.


Figure 1. The traditional ‘kill chain’

The concept of infiltration needs to occur in a certain fashion; an attacker can’t just willy-nilly their way into a network. Depending on the type of technology, the chain might be rather long. As an example, compare a simple WEP hacking exercise against a full-grade enterprise WPA implementation with strong micro-segmentation: there are many degrees of difference in the complexity of the two. Yet many still run WEP. The APT will choose the easiest and most transparent method.
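For reference, the stages of the Lockheed Martin kill chain just mentioned can be captured in a few lines; the helper below simply reports how much of the chain remains once an intrusion is detected at a given stage (the defender's payoff for catching things early):

```python
# The seven stages of the Lockheed Martin Cyber Kill Chain, in order.
KILL_CHAIN = [
    "reconnaissance",
    "weaponization",
    "delivery",
    "exploitation",
    "installation",
    "command-and-control",
    "actions-on-objectives",
]

def stages_remaining(observed_stage: str) -> list:
    """Stages the attacker still has ahead when detected at `observed_stage`:
    the earlier the detection, the more of the chain is cut off."""
    idx = KILL_CHAIN.index(observed_stage)
    return KILL_CHAIN[idx + 1:]

print(stages_remaining("delivery"))
# ['exploitation', 'installation', 'command-and-control', 'actions-on-objectives']
```

The APT's whole craft is stretching and disguising these stages so that detection, if it happens at all, happens as late as possible.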


In the first phase of identifying a target, a dark IT staff is called together as a team. This is known as the reconnaissance or information gathering phase. In the past, this phase was treated lightly at best by security solutions; even now, with heightened interest in this area, it tends to remain the attacker’s main avenue of knowledge acquisition. The reason is that much of this intelligence gathering can take place ‘off line’. There is no need to inject probes or pivots at this point; that would be like shooting into a dark room and hoping you hit something. Instead the method is to gain as much intelligence about the targets as possible. This may go on for months or even years, and it continues as the next step, and even the steps after that, occur. Note how I say ‘targets’: the target, when analyzed, will resolve into a series of potential target systems. In the past these were typically servers, but now this may not be the case. The APT is more interested in users and edge devices, which are typically more mobile, with a wider range of access media types. There is also another key thing about many of these devices: they have you or me at the interface.


Once the attacker feels there is enough to move forward, the next step is to try to establish a beachhead in the target. In the past this was typically a server somewhere, but folks have been listening to and following the advice of the security communities: they have been hardening their systems and keeping up to date and consistent with code releases. Score one for us.

There is the other side of the network, though. This is more of a Wild West scenario. In the old west of the United States, law was a tentative thing. If you were in a town out in the middle of nowhere and some dark character came into town, your safety was only as good as the sheriff, who typically didn’t last the first night. Your defense was ‘thin’. Our end points are much the same way. As a result, truly persistent, professional, advanced teams will target the edge, and more specifically the human at the edge. No one is immune. In the past a phishing attempt was easier to see. This has changed: many of these attempts are now launched as a disguised email or other correspondence with an urgent request, and the correspondence will appear very legitimate. Remember, the APT has done its research. The message has the right format and headers; it is also, apparently, from your manager. He refers to a project you are currently working on, with a link, and indicates that he needs to hear back immediately as he is in a board meeting. The link might be a spreadsheet, a word document… the list goes on. Many people would click on this well devised phish. Many have. There are also many other ways, some of which, in the right circumstances, do not even require the user to click.

There are also methods to create ‘watering holes’: the infiltration of websites known to be popular with, or required by, the target. Cross-site scripting is a very common set of methods for making this jump. Once the site is visited, the proper scripts are run and the infiltration begins. A nice note is that this vector has fallen off due to improvements in the JRE.

There are also physical means: USB ‘jump sticks’. These devices can carry malware that can literally jump into any designated system interface. There is no need to log on to the computer; only access to the USB port is necessary, and even then only momentarily. In the right circumstances, a visitor could wreak a huge amount of damage. In the past this would have been felt immediately. Now you might not feel anything at all. But it is now inside your network. It is wreaking no damage. It remains invisible.

Exploitation (now the truth of the matter is that it’s complicated)

When the APT does what it does, if it is successful you will not know it. The exploit will occur and, if undiscovered, continue on. It is scary to note that most APT infiltrations are only pointed out to the target after the fact, by a third party such as a service provider or law enforcement. This is sad: it means that both the infiltration and exploitation capabilities of the APT are very high. How does this get accomplished? Each phase in the chain yields information and demands decisions about the next best steps in the attack: the next step down the tree. As shown in the figure below, there are multiple possible exploits and further infiltrations that can be leveraged off the initial vector. It is in reality a series of decisions that take the intruder closer and closer to the target.


Figure 2. The Attack Tree

Depending upon what the APT finds as it moves forward, its strategy will change and optimize over time. In reality it will morph to your environment in a very specific and targeted way. So while many folks think that exploitation is the end of the story, it's really not. In the past it was visible; now it's not. The exploitation phase is used to implant further into the network.


Execution or Weaponization

In this step a method is established for the final phase, which is either data exfiltration or complete command and control (C2). Note again that these steps may be linked and traced back. This is important, as we shall see shortly. Execution is a process with a multitude of methods, ranging from complete encryption (ransomware) to simple probes or port and keystroke mappers used to gain yet further intelligence. Nothing is done to expose its presence. Ideally, it will gain access to the right information and then begin the next phase.



This is one of the options. The other is command and control (C2), which to some degree is required for exfiltration anyway. So APTs will do both. Hey, why not? Seeing as you are already in the belly of the beast, why not leverage all avenues available to you? It turns out that both share a common trait: an outbound traffic requirement. At this point, if the APT wants to pull the desired data out of the target it must establish an outbound communication. This is also referred to as a 'phone home' or 'call back'. These channels are often very stealthy; they are typically encrypted and mixed within the profile of the normal data flow. Remember, while there are well-known assigned ports that we all should comply with, an individual with even limited skills can generate a payload with 'counterfeit' port mappings. DNS, ICMP and SMTP are three very common protocols for this type of behavior. It's key to look for anomalies in behavior at these levels. The reality is that you need some sort of normalized baseline before you can judge whether there is an anomaly. This makes total sense.
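As a small illustration of what "looking for anomalies at these levels" can mean in practice, here is a sketch of one common heuristic for spotting DNS tunneling: flagging query labels that are unusually long or have unusually high character entropy, since encoded payloads tend to score high on both. The threshold values are illustrative assumptions, not tuned figures, and a real deployment would combine this with the baselining discussed below.

```python
import math
from collections import Counter

def shannon_entropy(s):
    """Bits of entropy per character; encoded/tunneled labels tend to score high."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def suspicious_query(qname, entropy_threshold=3.5, length_threshold=40):
    # Tunneling tools often pack encoded payloads into long, high-entropy
    # leftmost labels of the query name. Thresholds here are illustrative.
    label = qname.split(".")[0]
    return len(label) > length_threshold or shannon_entropy(label) > entropy_threshold

print(suspicious_query("www.example.com"))  # ordinary lookup: False
print(suspicious_query("dGhpcyBpcyBleGZpbHRyYXRlZGRhdGEwMDAxMjM0NTY3.evil.example"))  # True
```

The same shape of check (length, entropy, frequency) applies to ICMP payload sizes or SMTP header oddities; the protocol changes, the statistics do not.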

If you bring me to the edge of a river and say, "Ed, tell me the high and low levels," I could not reliably provide you with that information given only what I am seeing. I would need to monitor the river for a length of time, to 'normalize' it, in order to tell you the highs and the lows; and even then with the possibility of extreme outliers. It is very much the same with security. We need to normalize our environments in order to see anomalies. If we can see these odd outbound behaviors early, then we can cut the intruder off and prevent the exploit from completing.
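The river analogy maps directly onto a simple statistical baseline: learn the mean and spread of a metric during a period believed to be clean, then flag values that fall far outside it. This is a deliberately minimal sketch; the byte counts are made-up sample data and the z-score threshold is an assumption.

```python
import statistics

def build_baseline(samples):
    """Normalize: learn the mean and spread of a metric (e.g. outbound
    bytes per hour) from a window believed to be clean."""
    return statistics.mean(samples), statistics.stdev(samples)

def is_anomalous(value, baseline, z_threshold=3.0):
    """Flag values more than z_threshold standard deviations from the mean."""
    mean, stdev = baseline
    if stdev == 0:
        return value != mean
    return abs(value - mean) / stdev > z_threshold

# Hypothetical hourly outbound byte counts from a "known good" week
clean = [1200, 1350, 1100, 1280, 1330, 1250, 1190, 1310]
baseline = build_baseline(clean)
print(is_anomalous(1300, baseline))   # within the normal highs and lows: False
print(is_anomalous(25000, baseline))  # possible exfiltration spike: True
```

Real traffic needs seasonality handling (nights, weekends, quarter-end), but the principle is the same: no baseline, no anomaly.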

The APT needs systems to communicate in order for the tools to work for them. This means that they need to leave some sort of ‘footprint’ as they look to establish outbound channels. They will often use encryption to maintain a cloak of darkness for the transport of the data.

Remember, unlike the typical traditional threat, which you probably are well prepared for, the APT will look to establish a 'permanent' outbound channel. The reason I put quotes around permanent is that these channels may often jump sessions, port behaviors or even whole transit nodes if the APT has built enough supporting malicious infrastructure into your network. Looking at the figure below, if the APT has compromised a series of systems, it has a choice in how to establish outbound behaviors.


Figure 3. Established exfiltration channels

The larger the footprint the APT has the better it can adjust and randomize its outbound behaviors, which makes it much more difficult to tease out. So catching the APT early is very key. Otherwise it’s much like trying to stamp out a fire that is growing out of control.


Command and Control (C2)

This is the second option. Sometimes the APT wants more than just data from you; sometimes it wants to establish C2 channels. This can serve multiple purposes. As in the case above, it might be to establish a stealthy outbound channel network to support exfiltration of data. On the other end of the spectrum it might be complete command and control. Think power grids, high-security military systems, intelligent traffic management systems, automated manufacturing, subways, trains, airlines. The list goes on and on.

The reality is that once the APT is inside most networks it can move laterally. This could be through the network directly, but it might also be through social venues that traverse normal segment boundaries. So the lateral movement could be at the user account level, at the device level, or completely random based on a set of rules. Also, let's not forget the old list of viruses, web bots and worms that the APT can use internally within the target on a very focused basis. It has the vectors for transport and execution. Note that I do not say outright propagation; in this case it is much more controlled. As noted above, once the APT has established a presence at multiple toeholds it is very tough to knock it out of the network.

A truly comprehensive approach is required to mitigate these outbound behaviors. It starts at the beginning, the infiltration. Ideally we need to catch it there. But the reality is that in some instances this will not be the case. I have written about this in the past. The offense has the advantage of surprise: the APT can come up with a novel method that has not been seen before. So we are always vulnerable to infiltration to some degree. But if we cannot cut it off before it enters, we can work to prevent the exploit and the later phases of attack. While not perfect, this has merit. If we can make the infiltration limited and transient in nature, the later steps become much more difficult to accomplish. We will speak to this later, as it is a very key defense tactic that, if done properly, is very difficult to get past. Clearly, these outbound behaviors are not the time to finally detect something, particularly if you pick them out of weeks of logs. By then the APT has already established its infrastructure and you are in reaction mode.

The overall pattern (hint – it's data centric)

By now hopefully you are seeing a strong pattern. It is still nebulous and quite frankly it always will be. The offense still has a lot of flexibility. For us to think that the APT will not evolve is foolish. So we need to figure out a way to somehow co-exist with its constant and impinging presence. Due to its advanced and persistent nature (hence the APT acronym) the threat cannot be absolutely eliminated. To do so would make systems totally isolated. And while this might be desired to a certain level for certain systems as we will cover later, we have to expose some systems to the external Internet if we wish to have any public presence.

Perhaps this is another realization: we should strongly limit our public systems and strongly segment them, with no confidential data access. When you get down to it, the APT is not about running a DDoS attack on your point of sale. It's not even about absconding with credit card data in a one-time hit. None of these are good for you, obviously. But the establishment of a persistent, dark, covert channel out of your network is one of the worst scenarios that could evolve. By this time you should be seeing a pattern: it's all about the data. They are not after general communications or other such data unless they are doing further reconnaissance. They are about moving specific forms of information out, or executing C2 on specific systems within the environment. Once we recognize this, we see that the intent of the APT is long-term residence, preferably totally stealthy. The figure below shows a totally different way to view these decision trees.


Figure 4. A set of scoped and defined decision trees

Each layer from outer to center represents a different phase in the extended kill chain. As can be seen, they move from external (access), to internal (pivot compromise), to target compromise kill chains. You can also see that the external points are exposed vulnerabilities that the APT could leverage. These might be targeted and tailored email phishing or extensive watering holing. There may also be explicit attacks against discovered service points. The goal is to establish a network of pivot points that allows for better exposure of the target. The series of decision trees all fall inward towards the target and, if the APT gets its way and goes undiscovered, this will be the footprint of its web within the target. It is always looking to expand and extend that web, but not at the cost of losing secrecy. Its major strength lies in its invisibility.

So the concept of a linear flow to the attack has to go out the window. Again, persistence is the key term here. The attack is very cyclic in the way it evolves over time. The OODA loop comes to mind, which is typically taught to military pilots and quick-response forces: Observe, Orient, Decide, Act. The logic that the APT uses is very similar, because it is raw constructive logic. Trying to break OODA down any further becomes counterproductive; believe me, many have tried. So you can see that the OODA principle is well established by the APT. Remain stealthy, morph and move. But common to all of this is the target. Note how everything revolves around that center set of goals. If you are starting to see a strategy of mitigation and you haven't read my previous article, then my hat is off to you. If you have read my article and see the strategy, then my hat is off to you as well. If you have not read my article and are puzzled, hang on. If you have read my last article and you are still puzzled, I need to say it emphatically: it's all about the data!

We should also start to see and understand another pattern. As shown in simpler terms in the diagram above, there is an inbound, a lateral and an outbound movement to the APT. This is the signature of the APT. While it looks simple, the mesh of pivots that the APT establishes can be quite sophisticated. But from this we can begin to discern that, if we have enough knowledge of how our network normally behaves, we can perhaps tease out these anomalies, which obviously did not exist before the APT gained residence. Note the statement I just made: normalization means normalization against a known secure environment. A good time to establish this might be after compliance testing, for example. You want to see the network as it should be.
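One simple way to act on a known-secure snapshot is to record the set of flows seen during the clean window and diff live traffic against it. This is a sketch only; the host names, the external address and the ports are invented for illustration, and a real tool would work from flow logs rather than hard-coded sets.

```python
# Snapshot the (source, destination, port) flows during a known-clean window
# (say, right after compliance testing), then diff live flows against it.
def new_flows(baseline_flows, current_flows):
    """Flows present now that were never seen in the clean baseline."""
    return sorted(set(current_flows) - set(baseline_flows))

baseline_flows = {
    ("ws-14", "fileserver", 445),
    ("ws-14", "dns-1", 53),
    ("ws-14", "proxy", 8080),
}
current_flows = {
    ("ws-14", "fileserver", 445),
    ("ws-14", "dns-1", 53),
    ("ws-14", "proxy", 8080),
    ("ws-14", "203.0.113.9", 443),  # never seen in the baseline: investigate
}
print(new_flows(baseline_flows, current_flows))
```

A never-before-seen outbound encrypted flow is exactly the inbound-lateral-outbound signature discussed above surfacing at the outbound stage.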

Once you have that, you should with the right technologies and due diligence be able to see any anomalies. We will talk later about these in detail, but it can range from odd DNS behavior to random encrypted outbound channels. We will speak to methods of mitigation, detection as well as provide a strategic roadmap on goals against the APT realizing that we have limited resources available in our IT budgets.

So is this the end of IT Security as we know it?

Given all of the trends that we have seen in the industry, one is tempted to throw up one's hands and give up. Firewalls have been shown to have shortcomings and compromises; encryption has been abused as a normal mode of operation by the APT. What good is anti-virus in any of this? Many senior executives are questioning the value of the investment that they have made in security infrastructure, particularly executives of organizations that have been recently compromised.

After all, encryption is now being used by the bad guys, as are many other 'security' approaches. The target has shifted from the server to the edge. Does this mean that we jettison all of what we have built because it is no longer up to the challenge? Absolutely not! It does, however, indicate that we need to rethink how we are using these technologies and how they can be combined with newer technologies that are coming into existence. Basically, the concept of what a perimeter is needs to change, and we will discuss this in detail later on. Additionally, we need to start thinking more aggressively in our security practice. We can no longer be sheep sitting behind our fences; we must learn to be more like the wolves. This may sound provocative, but take a look at the recent news on the tracking and isolation of several APT groups, not only down to the country of origin but to the actual organization and, in some instances, even the site! This is starting to change the rules on the attackers.

But this is the stuff of advanced nation-state cyber-warfare; what can the 'normal' IT practitioner do to combat this rising threat? Well, it turns out there is quite a bit. And it turns out that, aside from launching your own attacks (which you obviously shouldn't do), there is not much that the nation states can do that you can't. So let's put on some different hats for this article. Let's make them not black, but a very nice dark gray. The reason I say this is that in order to be really effective in security today you need to think like the attacker. You need to do research; you should attempt penetration and exploitation yourself (in a nice, safe, ISOLATED lab of course!). In short, you need to know them better than they know you, because in the end it's all about information. We will return to this very shortly.

But we also need to realize that we need to create a security practice that is 'data centric'. It needs to place the most investment in the protection of critical data assets, which are often tiered in importance. Gone are the days of the hard static perimeter and the soft gooey core. We need to carry micro-segmentation to the nth degree. The micro-segments need to correspond not just strongly but exactly to the tiers of risk assets mentioned earlier. Assets with more risk should be 'deeper' and 'darker' and should require stronger authentication and much more granular monitoring and scrutiny. All of this makes sense, but it only makes sense if you have your data in order and have knowledge of its usage, movement and residence. This gets back to the subject of my previous article, and it sets the stage well for this next conversation. If you have not read it, I strongly urge you to do so before you continue.


Information and warfare

This is a relationship that is very ancient, as ancient as warfare itself. The basic premise is threefold. First, aggressors (and hence weapon technology, to a large part) have had the advantage in the theory of basic conflict. After all, it's difficult to design defenses against weapons that you do not know about yet. But that doesn't mean the defense lacks the ability to innovate. As a matter of fact, with a little ingenuity almost anything used in offense can be used for defense as well. So we need to think aggressively in defense; we cannot be passive sheep. Second, victory is about expectation: expectation of a plan, of a strategy of some sort to achieve an end goal. In essence, very few aggressive conflicts have no rationale; there is always a reason and a goal. Third, information is king. It will, to a very large degree, dictate the winners and the losers in any conflict, whether it's Neolithic or modern-day cyberspace. If the attacker knows more than you do, then you are likely to lose.

"OK Ed!" you might be saying. "Wow! We are talking spears and swords here!" Well, the point is that not much has changed since the inception of conflict itself. Spying and espionage go back as far as history, perhaps further. Let us not forget that it was espionage, according to legend, that was the downfall of the Spartan 300. I can give you dozens (and dozens) of examples of espionage throughout history, right up to modern times. Clandestine practice is certainly nothing new. But there may be a lot of things that we as security folk have forgotten along the way; things that the attackers might still remember. In today's world, if the APT knows more about your network and applications than you do, if they know more about your data than you do, you are going to lose.

Here you may be startled at the comment. How dare I! But extend the question to: "Do you have a comprehensive data inventory? Is it mapped to relevant systems and validated? Do you know where it resides? Who has access?" Many cannot answer these questions. The problem is that the APT can. They know where your data is and they know how it moves through your network, or at least they are in a constant effort to understand that. They also understand where they can exfiltrate the data. If they know and you don't, they could be pulling information for quite a long time and you will not know. Do you think I am kidding? Well, consider this: about 90% of the information compromises that occur are not discovered by internal IT security staff; the victims are notified of them by third parties such as their service providers or law enforcement agencies. Here is another sobering fact: on average, the APT had residence in the victim's network for 256 days.

So clearly things are changing. The ground, as it were, is shifting underneath our feet. The traditional methods of security are somehow falling short. Or perhaps they always were and we just didn't realize it until the rules changed. In any event, the old 'keep 'em out' strategy is no longer sufficient. We need to realize that our networks will at some point be compromised. We will talk a little later about some of the methods. Because of this, we need to shift our focus to detection. We need to identify the foreign entity and hopefully remove it before it does too much damage or gains too much knowledge. So IT security as we know it will not go away. We still require firewalls and DMZs; we still require encryption and strong identity policy management, as well as intrusion detection technologies. We will just need to learn to use them differently than we have in the past. We also have to utilize new technologies and concepts to create discrete visibility into the secure data environments. New architectures and practices will evolve over time to address these imminent demands. This article is intended to provide baseline insight into these issues and how they can be addressed.


It’s all about the user (and I’m not talking about IT quality of experience!)

Whenever you see a movie about hacking you always see someone standing in front of several consoles, cracking into various servers and doing their mischief. It’s fast moving and very intense. I always laugh because this is most definitely not the case. Slow and steady is always best and the server is most definitely not the place to start. It’s you. You are the starting point.

Think about it: you move around. You have multiple devices. You probably have less stringent security practices than the IT staff that maintains the server. You are also human. You are the weakest link in the security chain. Now, I've spoken about this before, but it has always been from the perspective of IT professionals who are not as diligent as they should be in the security practice of their roles. Here we are talking about the normal user, who may not be very technically savvy at all. Also, let's consider that as humans we are all different. Some are more impulsive. Some are more trusting. Some simply don't care. This is the major avenue, or rather set of avenues, that an attacker can use to gain compromised access into the network. Let's look at a couple.

Deep Sea Phishing –

Many folks are aware of the typical 'phishing' email that says, 'Hey, you've won a prize! Click on the URL below!' Hopefully, most folks now know not to click. But the problem is that this has moved into new dimensions, with whole orders of magnitude of increase in the intelligence behind these types of attacks. As I indicated earlier, much of the reconnaissance that an APT does is totally off of your network. They use publicly posted information: news updates, social media, blog posts (yikes, I'm writing one now!). They will not stop there either. There is a lot of financial data and profiling, as well as the tagging of individuals to certain organizational chains and projects. Once the right chain is identified, the phishing attack is launched. The target user receives a rather normal-looking email from his or her boss. The email is about a project that they are currently working on; the boss needs to hear back on some new numbers that are being crunched. Could they take a look and get back by the end of the day? Time is of the essence as we are coming to the end of the quarter. Many would open the spreadsheet, and understandably so. HTML-enabled email makes it even worse, in that the SMTP service chain is obscured, making it difficult to see the odd chain; and even then, many users wouldn't notice. Many data breaches have occurred in just such a scenario. Once the URL is clicked or the document is opened, the malicious code goes to work and establishes two things. The first is command and control back to the attacker; the second is evasion and resilience. From that point of presence the attacker will usually escalate privileges on the local machine and then utilize it as a launching point to gain access to other systems.

The Poisoned Watering Hole or Hot Spot –

We all go out on the web, and we all probably have sites that we hit regularly. We all go out to lunch, and most of us probably go to our favorite places regularly. This is another thing attackers can leverage: the fact that we are creatures of habit. So let's change the scenario. Let's say the attacker gets a good profile of the target's web behavior. They also learn where the target goes for lunch. But they don't even need to know that; typically they will select a place that is popular with multiple individuals in the target organization. That way the probability of hits is greater. Then they will emulate the local hot spot with aggressive parameters to force the targets to associate with it. Once that occurs, the targets gain internet access as always, but now the attacker is in the middle. As the targets go about using the web, they can be redirected to poisoned sites. Once the hit occurs, the attacker shuts down the rogue hot spot and then waits for the malicious code that is now resident on the targets to dial back. From the target user's perspective, the WLAN dropped and they simply re-associate to the real hot spot. Once the users go back to work, they log on, and as a part of it they establish an outbound encrypted TCP connection to the APT. These will not be full standing sessions, however, but intermittent ones. This makes the behavior seem more innocuous; the last thing the APT wants is to stand out. From there the scenario proceeds much like before.

In both of these scenarios the user is the target. There are dozens of other examples that could be given, but I think these two suffice. The human behavior dimension is just too wide to expect technology to fulfill the role, at least at this point. Until then we need firm, clear policies that are well understood by everyone in the organization. There also needs to be firm enforcement of the policies in order for them to be effective. This is all in the human domain, not the technology domain. But technology can help.


It’s all about having a goal as well

When an advanced persistent threat organization first puts you in its sights, it usually has a pretty good idea of what it is looking for or what it wants to do. Only amateurs gain compromised access and then rummage or blunder about. It's not that an APT wouldn't take information it comes across if it found it useful, but it usually has a solid goal and a corresponding target data set. What that is depends on what the target does. Credit card data is often a candidate, but it could be patient records or confidential financial or research information; the list can be endless. We discussed this in my previous article on data discovery and micro-segmentation practices. It is critical that this data gets identified and accounted for, because you can bet that the APT has.

This means that there is deliberate action on behalf of the APT. Again, only amateurs are going to bungle about. The other thing is that time is, unlike in the movies, not of the essence! The average residency number that I quoted earlier illustrates this. In short, they are highly knowledgeable about their targets, they are very persistent, they will wait many months for the right opportunity to move, and they are very quiet.

This means that you need to get your house in order on the critical data that you need to protect. You need to know how it moves through your organization and you need to establish a solid idea of what normal is within those data flows. Then you need to move to fight to protect it.

The Internet – The ultimate steel cage

When you think about it, you are in the ultimate steel cage. You have to have a network. You have to have an Internet presence of some sort. You need to use it. You cannot go away; if you do, you will go out of business. You are always there, and so is the APT. The APT also will not go away. It will try and wait, and wait and try, on and on until it succeeds in compromising access. This paradigm means that you cannot win. No matter what you as a security professional do in your practice, the war can never be won. But the APT can win. It can win big. It can win to the point of putting you out of business.

This creates a very interesting set of gaming rules, if you are interested in that sort of thing. In a normal zero-sum game, there is a set of pieces or tokens that can be won. Two players can sit down and, with some sort of rules and maybe some random devices such as dice, play the game. The winner is the first player to win all of the tokens. But if we remove the dice we have a game more like chess, where the players lose or win pieces based on skill. This is much more akin to the type of 'game' we like to think we play in information security; most security architects I know do not use dice in their practice. Now, in a normal game of chess each player is more or less equal, with the only real delta being skill. But remember, you are sitting at the board with the APT. So here are the new rules. You cannot win all of his or her pieces. You may win some, but even if you come down to the last one, you need to give it back. What's more, there will not be just one; there will be 'some number' of pieces that you cannot win. Let's say a quarter, or maybe even half, of the pieces are 'unwinnable'. Well, it is pretty clear that you are in a losing proposition. You cannot win. The best you can do is stay at the board for as long as you can. Then also consider that the APT's skill and resources may be just as great as yours, if not greater.
Does that help put things in perspective?

So the scenario is stark, but it is not hopeless. The game can actually go on for quite some time if you are smart in the way you play. Remember, I said 'some number' of pieces that you cannot win; I did not say which types. If you look at a chess board you will note that the power pieces and the pawns are exactly half the count each. This means that you could win all or most of the power pieces and leave the opponent with a far reduced ability to do damage to you, as long as you aren't careless. So mathematically the scenario is not hopeless, but it is not bright either. While you can never win, you can establish a position of strength that allows you to stand indefinitely.

Realize that the perimeter is now everywhere

Again, the old notion that we can somehow draw a line around our network and systems is becoming antiquated. The trends in BYOD, mobility, virtualization and cloud have forever changed what a security perimeter is. We have to realize that we are in a world of extreme mobility. Users crop up everywhere, demanding access from almost anywhere, with almost any consumer device. These devices are of consumer grade, with little or no thought given to systems security. As a result these devices, if not handled correctly with the appropriate security practices, become a very attractive vector for malicious behavior.

This means that the traditional idea of a network perimeter that can be protected is no longer sufficient. We need to realize that there are many perimeters, and these can be dynamic due to the demands of wireless mobility. This doesn't mean that firewalls and security demarcations are no longer of any use; it just means that we need to relook at the way we use them and combine them with new technologies that can vastly empower them.

It is becoming more and more accepted that micro-segmentation is one of the best strategies for a comprehensive security practice, and for making things as difficult as possible for the APT. But this can't be a simple set of segments off of a single firewall; it must be multiple tiered segments with traffic inspection points that can view the isolated data sets within. The segmentation provides two things. First, it creates a series of hurdles for the attacker, both on the way in and on the way out as they seek to exfiltrate data. Second, and perhaps less obvious, segmentation provides isolated traffic patterns with very narrow application profiles and sets of interacting systems. In short, these isolated segments are much easier to 'normalize' from a security perspective. Why is this important? Because in the current environment, 100% prevention is not a realistic proposition. If an APT has targeted you, they will get in. You are dealing with a very different beast here. The new motto you need to learn is: "Prevention is an ideal, but detection is a MUST!"
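A toy model makes the tiering idea concrete: segments are ranked from public to dark, and traffic between segments is blocked unless an explicit rule allows it (default deny). The segment names, ports and rules below are invented for illustration; this is a sketch of the policy logic, not a real enforcement engine.

```python
# Toy tiered micro-segment model. Higher rank = 'deeper' and 'darker'.
SEGMENTS = {"public_web": 0, "app_tier": 1, "pci_data": 2}

# Explicit allow list between segment pairs: everything else is denied.
ALLOWED = {
    ("public_web", "app_tier"): {443},
    ("app_tier", "pci_data"): {5432},  # only the database port, nothing else
}

def permitted(src_seg, dst_seg, port):
    """Default deny: traffic passes only if the (src, dst) pair has an
    explicit rule that includes this port."""
    return port in ALLOWED.get((src_seg, dst_seg), set())

print(permitted("app_tier", "pci_data", 5432))   # allowed database call: True
print(permitted("public_web", "pci_data", 443))  # web tier can never reach data: False
```

Because each segment carries so few legal flows, any flow outside the allow list is, by construction, an anomaly worth inspecting, which is exactly why narrow segments are easier to normalize.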

In order to detect, you need to know what is normal. To make this clear, let's use the mundane example of a shoplifter in a store. The shoplifter wants to look like any other normal shopper; they will browse and try on various items like anyone else. In other words, they strongly desire to blend into the normal behavior of the rest of the shoppers in the store. An APT is no different. They want to blend into the user community and appear like any other user on the network. As a matter of fact, they will often commandeer normal users' machines by the methods discussed earlier. They will learn the normal patterns of behavior and try as much as possible to match them. But at some point, in order to shoplift, the shopper needs to diverge from the normal behavior. They need to use some sort of method to take items out of the store undetected. To do this they need to avoid the direct views of video surveillance and find a time when they can 'lift' the items. But regardless of the technique, there needs to be a delta. Point A, product; point B, no product. The question is, will it be noticed? This is what detection is all about. In a retail environment it is also accepted that a certain amount of loss must be 'accepted' as normal business risk for operations, the reason being that there is a cost point beyond which further expense on prevention and detection makes no fiscal sense.

It is very much the same with APTs. You simply cannot seal off your borders; they will get in. The question is how far they penetrate, how much they are able to discover about you, and what information they are able to pull out. There is a common joke in the security industry; it goes like this: "If you want a totally secure computer, unplug all network connections. Seal it off physically with thick walls, including any and all RF, with no entrance. Then take several armed guards and an equivalent number of very large attack dogs and place them around the perimeter 24x7. Also be sure that you have total independence of power, which means you need a totally separate micro-grid that in turn cannot be compromised, protected by the same methods." Like all tech-sector jokes, the humor is dry at best and serves to show the irony of a thought process. Such a perfectly secure computer would be perfectly useless! We, like the shop owner, need to assume and accept a certain amount of risk and exposure to be online. It is simply the reality of the situation, hence the steel cage analogy I used earlier. So detection is of absolute key importance to the overall security model.

How to catch a thief

So the next question is: how do you detect that an APT is in your network? And how do you do it as early as possible, taking into consideration that time is on the attacker's side – not yours? Once again, it serves to revisit the analogy of the shoplifter. Retail outfits usually have store detectives. These individuals are specialists in retail security. They know the patterns of behavior and inflections of movement that will draw attention to a certain individual. Many have a background in psychology and have been specifically trained to watch for telltale signs. Note that such indicators cannot justify arrest or even ejection from the store. They can only signal that additional attention is needed on a certain individual. Going further, there are often controls on dressing rooms, with items counted before entry and upon exit. This can be viewed as both a preventative and a detective measure. There are also usually RF tags that will trigger an alarm if an item is removed from the premises. Often these tags are ink loaded so that they will ruin the product if removal is attempted without the correct tool. All of this can be more or less replicated in the cyber environment. The key is what to look for and how to spot it.

A compromised system

This is the obvious thing to look for, as it generally all starts here. But the problem is that APTs are pretty good at hiding and staying under cover until the right time. So the key is to look for patterns of behavior that are unusual from a historical standpoint. This gets back to the concept of normalization. In order to know that a user's behavior is abnormal, it is important to have a good idea of what the normal behavior profile is. Some things to look for are unusual patterns of session activity, such as lots of peer-to-peer activity where in the past there was little or none. Port scanning and the use of discovery methods should be monitored as well. Look for unusual TCP connections, particularly peer-to-peer or outbound encrypted connections.
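To make the normalization idea concrete, here is a minimal sketch (all names and the single session-count metric are hypothetical) of how a per-user behavioral baseline might be built from historical activity, with anything beyond a few standard deviations flagged for a closer look:

```python
from statistics import mean, stdev

def build_baseline(history):
    """history: {user: [daily session counts]} gathered over a training window.
    Returns the per-user mean and standard deviation of activity."""
    baseline = {}
    for user, counts in history.items():
        sd = stdev(counts) if len(counts) > 1 else 0.0
        baseline[user] = (mean(counts), sd)
    return baseline

def is_anomalous(baseline, user, todays_count, sigma=3.0):
    """Flag a user whose activity deviates more than `sigma` standard
    deviations from their own historical norm."""
    mu, sd = baseline.get(user, (0.0, 0.0))
    if sd == 0.0:
        return todays_count > mu  # no variance observed: any increase stands out
    return abs(todays_count - mu) > sigma * sd
```

A real analytics product would baseline many dimensions at once (ports, peers, bytes moved, time of day), but the principle is the same: learn normal first, then look for the delta.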

Remember that there is a theory to all types of intrusion. First, an attacker needs to compromise the perimeter to gain access to the network. Unless the attacker is very lucky, they will not be where they need or want to be. This means that a series of lateral and northbound moves will be required in order to establish a foothold and command and control. This is why it is not always a good idea to take a suspicious or malicious node off the network. You can gain quite a bit by watching it. As an example, if a newly compromised system begins to run a series of scans and nothing else, then it is probably an isolated or early compromise. If the same behavior is accompanied by a series of encrypted TCP sessions, then there is a good probability that the attacker has an established footprint and is working to expand their presence.

Malicious or suspicious activities

Once again, normalization is required in order to flag unusual activities on the network. If you can set up a lab to provide an idealized 'clean' runtime environment, a known good pattern and corresponding signature can be developed. This idealized implementation provides a clean reference that is normalized by its very nature. After all, you don't want to normalize an environment with an APT in it, now do you? Once this clean template is created, it is easy to spot deltas and unusual patterns of behavior. These should be investigated immediately. Systems should be located and identified, along with the corresponding user if appropriate. Equipment may or may not be confiscated. As pointed out earlier, sometimes it is desirable to monitor the attacker's activities in a controlled fashion, with the option of quarantine at any point.
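One way to put the clean-template idea into practice is to snapshot a known-good system image and diff later states against it. A minimal sketch, covering only file hashes (a real template would also cover processes, services and configuration):

```python
import hashlib
import os

def snapshot(root):
    """Hash every file under `root` to form the known-good signature of
    the clean template built in the lab."""
    sig = {}
    for dirpath, _, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                sig[os.path.relpath(path, root)] = hashlib.sha256(f.read()).hexdigest()
    return sig

def deltas(clean, current):
    """Compare a current snapshot against the clean reference and report
    files added, removed or modified: the deltas worth investigating."""
    added = set(current) - set(clean)
    removed = set(clean) - set(current)
    modified = {p for p in set(clean) & set(current) if clean[p] != current[p]}
    return added, removed, modified
```

Any non-empty result from `deltas` against the clean reference is, by definition, a divergence from normal and a candidate for immediate investigation.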


Exfiltration & C2 – There must be some kind of way out of here (said the joker to the thief)

In order for any information to leave your organization, there has to be an outbound exfiltration channel that is set up beforehand. Obviously, this is something that the APT has been working to accomplish since the initial phases of compromise. Again, going back to the analogy of the shoplifter, this is another area where the APT has to diverge from the normal behavior of a user. The APT needs to establish a series of outbound channels to move the data out of the organization. In the earlier days, a single outbound encrypted TCP channel would be established to move data as quickly as possible. But now that most threat protection systems are wise to this, attackers tend to establish networks that utilize a series of shorter-lived outbound sessions, each moving only a small portion of the data so as to blend into the normal activities of the network. But even with this improvement in technique, they still have to diverge from the normal user pattern. If you are watching closely enough, you will catch it. But you have to watch closely, and you have to watch 24x7.
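Such a 'low and slow' pattern can still be surfaced by aggregating outbound activity per host over a window rather than judging each session alone. A hedged sketch, with the field names and thresholds invented for the example:

```python
from collections import defaultdict

def aggregate_outbound(sessions):
    """sessions: iterable of (host, bytes_out) tuples observed in some
    window, e.g. 24 hours. Returns total bytes and session count per host."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for host, bytes_out in sessions:
        totals[host] += bytes_out
        counts[host] += 1
    return totals, counts

def flag_low_and_slow(totals, counts, byte_threshold, session_threshold):
    """A host that moved a lot of data through many small sessions deserves
    a closer look, even though no single session looked remarkable."""
    return sorted(h for h in totals
                  if totals[h] > byte_threshold and counts[h] > session_threshold)
```

The point of the aggregation is exactly the 24x7 vigilance described above: no individual session crosses a tripwire, but the sum over the window does.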

Here is a list of things that you want to look for:

1). Logon activity

Logons to new or unusual systems can be a flag of malicious behavior. New or unusual session types are also an important flag to watch for, particularly new or unusual outbound encrypted sessions. Other flags are unusual times of day or locations. Watch also for jumps in activity or velocity, as well as shared account usage or the use of privileged accounts.

2). Program execution

Look for new or unusual program executions, execution of programs at unusual times of the day or from new or unusual locations, or execution of a program under a privileged account rather than a normal user account.

3). File access

You want to catch data acquisition attempts before they succeed in gaining access, but if you can't, you at least want to catch the data as it attempts to leave the network. Look for unusually high-volume access to file servers or unusual file access patterns. Also be sure to monitor cloud-based sharing uploads, as these are a very good way to hide in the flurry of other activity.

4). Network activity

New IP addresses or secondary addresses can be a flag. Unusual DNS queries should be looked into, particularly those to domains with a bad reputation or no reputation at all. Look for correlation between the above points and new or unusual network connection activity. Also look for unusual or suspicious application behaviors, such as dark outbound connections combined with internal lateral movement. Many C2 channels are established in this fashion.

5). Database access

Most users do not need to access the database directly, so direct access is an obvious flag; but also look for manipulated application calls that perform sensitive table access, modifications or deletions. Also be sure to lock down the database environment by disabling many of the added options that most modern databases provide. Be aware that many of them are enabled by default. Know which services are exposed out of the database environment. An application proxy service should be implemented to prevent direct access in a general fashion.

6). Data Loss Prevention methods

Always monitor sensitive data movement. As pointed out in the last blog, if you have performed your segmentation design correctly according to the confidential data footprint, then you should already have isolated communities of interest that you can monitor very tightly, particularly at the ingress and egress of the microsegments. Always monitor FTP usage, as well as the cloud services mentioned earlier.

Analysis, but avoid the paralysis

The goal is to arrive at a risk score based on the aggregate of the above. This involves the session serialization of hosts as they access resources. As an example, a new secondary IP address is created and an outbound encrypted session is established to a cloud service; but earlier in the day, or perhaps during the wee hours, that same system accessed several sensitive file servers with the administrator profile. Now this is a very obvious set of flags; in practice they can and will be increasingly subtle and difficult to tease out. This is where security analytics enters the picture. There are many vendors who can provide products and solutions in this space, and several firms and consortiums provide ratings for these vendors, so we will not attempt to replicate that here. The goal of this section is to show how to use such analytics.
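A toy illustration of such an aggregate score follows; the indicator names and weights are invented for the example, and a real product would tune them continuously:

```python
# Hypothetical weights per observed indicator.
WEIGHTS = {
    "new_secondary_ip": 15,
    "outbound_encrypted_to_cloud": 25,
    "off_hours_activity": 10,
    "sensitive_file_access": 25,
    "privileged_account_use": 25,
}

def risk_score(indicators):
    """Sum the weights of the indicators observed for a host, capped at 100."""
    return min(100, sum(WEIGHTS.get(i, 0) for i in set(indicators)))

def triage(score):
    """Map the aggregate score to a next action."""
    if score >= 70:
        return "quarantine or monitor"
    if score >= 40:
        return "investigate"
    return "log"
```

The obvious scenario above, a new secondary IP plus an outbound encrypted cloud session plus off-hours administrator access to sensitive file servers, maxes out such a score; the value of analytics is in the subtler combinations that only cross a threshold in aggregate.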

The problem with us humans is that when we are barraged with tons of data and forced to pick out the significant pieces, we are woefully inefficient. First of all, we have a very large capacity for missing certain data sets. How often have you heard the saying, "Another set of eyes"? It's true; though we don't like to admit it, when faced with large data sets we can miss certain patterns that others will see, and vice versa. This brings two lessons. First, never manually analyze data alone; always have another set of eyes go over it. Second, perhaps we are not the best choice for this type of activity. There is another reason to consider, though. It's called bias. We are emotional beings. While we like to think we are always intellectual in our decisions, this has been proven not to be the case. As a matter of fact, many neurology researchers are saying that without emotions we really can't make a decision at all. At its root, decision making for us is an emotional endeavor.

So enter computers and the science of data analytics. Computers and algorithms do not exhibit the same shortcomings as us humans, but they exhibit others. They are extremely good at sifting through large sets of data, identifying patterns and analyzing them against rules such as those noted above. They are also extremely fast at these tasks when compared to us. What they offer will be unadulterated and pure, without bias, IF and only if the algorithms are written correctly and do not induce any bias in their design. This whole subject warrants another blog article sometime, but for now suffice it to say that algorithms, theories of operation and application designs are all produced by us. So the real fact of the matter is that there will be biases embedded into any solution. But there is one thing that computers do not do well yet. They can't look at patterns and emotionally 'suspect' an activity, 'knowing' the normal behavior of a user. As an example, to say to itself, "Fred just wouldn't do this type of thing. Perhaps his machine has been compromised. I think I should give him a call before I escalate this. We can confiscate the machine if this is true, get him a replacement and get the compromised unit into forensics." Note that I say for now. Artificial intelligence is moving forward at a rapid pace, but who is to say that AI won't eventually hit the same roadblock of bias that we have? Many cognitive researchers are now coming to this conclusion. So it is clear that we and computers will be co-dependent for the foreseeable future, each side keeping the other from invoking bias. The real fact is that there will always be false negatives and false positives. The cyber-security universe simply moves too fast to assume otherwise. So the concept of setting and forgetting is not valid here. These systems will need assistance from humans, particularly once a system has been identified as 'suspect'.

Automation and Security

At Avaya we have developed a shortest path bridging networking fabric, which we refer to as SDN Fx, that is based on three basic self-complementary security principles.


Hyper-segmentation

This is a new term that we have coined to indicate the primary deltas of this new approach from traditional network micro-segmentation. First, hyper-segments are extremely dynamic and lend themselves well to automation and dynamic service chaining, as is often required with software defined networks. Second, they are not based on IP routing and therefore do not require traditional route policies or access control lists to constrict access to the micro-segment. These two traits create a service that is well suited to security automation.



Stealth

We have spoken to this many times in the past. Because SDN Fx is not based on IP, it is dark from an IP discovery perspective. Many of the topological aspects of the network, which are of key importance to an APT, simply cannot be discovered by traditional port scanning and discovery techniques. So the hyper-segment holds the user or intruder in a narrow and dark community which has little or no ability to communicate with the outside world except through well-defined security analytic inspection points.


Elasticity

This refers to the dynamic component. Because we are not dependent on IP routing to establish service paths, we can extend or retract secure hyper-segments based on authentication and proper authorization. Just as easily, SDN Fx can retract a hyper-segment, perhaps based on an alert from security analytics that something is amiss with a suspect system. But as we recall, we may not want to simply cut the intruder off, but rather place them into a forensic environment where we can watch their behavior and perhaps gain insight into the methods used. There may even be the desire to redirect them into honey pot environments, where whole networks can be replicated in SDN Fx at little or no cost from a networking perspective.

Welcome to my web (It’s coated with honey! Yum!)

If we take the concept of the honey pot and extend it with SDN Fx, we can create a situation where the APT no longer has complete confidence about where they are or whether they are looking at real systems. Recall that the APT relies on shifting techniques that evolve over time, even during a single attack scenario. There is no reason why you could not do the same. Modern virtualization of servers and storage, along with the dynamic attributes of SDN Fx, creates an environment where we can keep the APT guessing and ALWAYS without a total scope of knowledge about the network. Using SDN Fx we can automate paths within the fabric to redirect suspect or known malicious systems to whatever type of forensic or honey pot service is required.

Avaya has been very active in building out the security ecosystem in an open system approach, with a networking fabric based on IEEE standards. The concept of closed loop security now becomes a reality. But we need to take it further. Humans still need to communicate and interact about these threats on a real-time basis. The ability to alert staff to threats, and even to set up automated conferencing where staff can compare data and decide on the next best course of action, is now possible, as such services can be rendered in only a couple of minutes in an automated fashion.

Figure 6. Hyper-segmentation, Stealth and Elasticity to create the ‘Everywhere Perimeter’

All of this places the APT in a much more difficult position. As the illustration above shows, hyper-segmentation creates a series of hurdles that need to be compromised before access to a given resource is possible. Then it becomes necessary to create outbound channels for the exfiltration of data across the various hyper-segment boundaries and associated security inspection points. Also note that, as the figure above illustrates, you can create hyper-segments where there simply is no connectivity to the outside world. For all intents and purposes they are totally and completely orthogonal. The only way to gain access is to actually log into the segment. This creates even more difficulty for the APT, as exfiltration becomes harder and, if you are watching, easier to catch.

In summary

One could say, and most probably should say, that this was an occurrence that was bound and destined to happen. While I don't like the term 'destined', I must admit that it is particularly apt here. As our ability to communicate and compute has increased, it has created a new avenue for illegal and illegitimate usage. The lesson here is that the Internet does not make us better people. It only makes us better at being what we already are. It can provide immense transformative power to convert folks to perform unspeakable acts, and it can, on a few hours' notice, bring a global enterprise to its knees.

But it can also be a force for a very powerful good. As an example, I am proud to be involved in the effort, on behalf of colleagues such as Mark Fletcher and Avaya in the wider sense, to support Kari's Law for the consistent behavior of 9-1-1 emergency services. Mark is also actively engaged abroad in the subject of emergency response, as I am for security. The two go hand in hand in many respects, because the next thing the APT will attempt is to take out our ability to respond. The battle is not over. Far from it.









Establishing a confidential Service Boundary with Avaya’s SDN Fx

June 10, 2016



Security is a global requirement. It is also global in the fashion in which it needs to be addressed. But the truth is, regardless of the vertical, the basic components of a security infrastructure do not change. There are firewalls, intrusion detection systems, encryption, networking policies and session border controllers for real-time communications. These components also plug together in rather standard fashions, or service chains, that look largely the same regardless of the vertical or vendor in question. Yes, there are some differences, but by and large these modifications are minor.

So the question begs: why is security so difficult? As it turns out, it is not really the complexities of the technology components themselves, although they certainly have complexity. The real challenge is deciding exactly what to protect, and here each vertical will be drastically different. Fortunately, the methods for identifying confidential data or critical control systems are also rather consistent, even though the data and applications being protected may vary greatly.

In order for micro-segmentation as a security strategy to succeed, you have to know where the data you need to protect resides. You also need to know how it flows through your organization: what systems are involved and which ones aren't. If this information is not readily available, it needs to be created by data discovery techniques and then validated as factual.

This article is intended to provide a series of guideposts on how to go about establishing a confidential footprint for such networks of systems. As we move forward into the new era of the Internet of Things and the advent of networked critical infrastructure it is more important than ever before to have at least a basic understanding of the methods involved.

Data Discovery

Obviously the first step in establishing a confidential footprint is identifying the systems, and the data they exchange, that need to be protected. Sometimes this can be rather obvious. A good example is credit card data and PCI. The data and the systems involved in the interchange are fairly well understood, and the pattern of movement or flow of data is rather consistent. Other examples may be more difficult to determine. A good example is the protection of intellectual property. Who is to say what qualifies as intellectual property? Who is to establish a risk value for a given piece of IPR? In many instances this type of information may reside in disparate locations, stored with various methods and probably various levels of security. If you do not have a quantified idea of the volume and location of such data, you will probably not have a proper handle on the issue.

Data Discovery is a set of techniques to establish a confidential data footprint. This is the first established phase of identifying exactly what you are trying to protect. There are many products on the market that can perform this function. There are also consulting firms that can be hired to perform a data inventory. Fortunately, this is something that can be handled internally if you have the right individuals with proper domain expertise. As an example, if you are performing data discovery on oil and gas geologic data, it is best to have a geologist involved with the proper background in the oil and gas vertical. Why? Because they would have the best understanding of what data is critical, confidential or superfluous and inconsequential.

Data Discovery is also critical in establishing a secure IoT deployment. Sensors may be generating data that is critical to the feedback actuation of programmable logic controllers. The PLCs themselves might also generate information on their performance. It is important to understand that much of process automation has to do with closed loop feedback mechanisms. The feedback loops are critical for the proper functioning of the automated IoT framework. An individual who could intercept or modify the information within this closed loop environment could adversely affect the performance of the system, even to the point of making it do exactly the opposite of what was intended.

As pointed out earlier, though, fortunately there are some well understood methods for establishing a confidential service boundary. It all starts with a simple checklist.

Establishing a Confidential Data Footprint – IoT Security Checklist for Data

1). What is creating the data?

2). What is the method for transmission?

3). What is receiving the data?

4). How/where is it stored?

5). What systems are using the data?

6). What are they using it for?

7). Do the systems generate ‘emergent’ data?

8). If yes, then is that data sent, stored, or used?

9). If yes, then is that data confidential or critical?

10). If so, then go to step 1.

No, step 10 is not a sick joke. When creating secure footprints for IoT frameworks it is important to realize that your data discovery will often loop back on itself. With closed loop system feedback, this is the nature of the beast. Also be prepared to do this several times, as these feedback loops can be relatively complex in fully automated systems environments. So it comes down to some basic detective work. Let's grab our magnifier and get going. But before we begin, let's take a closer look at each step in the discovery process.
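The looping checklist is essentially a graph traversal: follow confidential data from its producers through every system it touches, and stop revisiting a system once it has been inventoried. A minimal sketch, with a hypothetical flow map:

```python
def confidential_footprint(flows, seeds):
    """flows: {system: [downstream systems it sends confidential data to]}.
    Starting from the seed producers (step 1), walk the data-flow graph.
    Tracking visited systems is what lets the 'go to step 1' loop terminate
    even when IoT feedback loops cycle back on themselves."""
    footprint = set()
    worklist = list(seeds)
    while worklist:
        system = worklist.pop()
        if system in footprint:
            continue  # already inventoried; a feedback loop just closed
        footprint.add(system)
        worklist.extend(flows.get(system, []))
    return footprint
```

The resulting set is a first cut at the confidential service boundary: everything inside it touches the data, and everything outside it should have no access at all.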

What is creating the data?

This is the start of the confidential data chain. Usually it will be a sensor of some type, or a controller that has a sensing function embedded in it. It could also be something as simple as a point of sale location for credit card data. Another possible case would be medical equipment relaying both critical and confidential data. This is where domain expertise is a key attribute that you need on your team. These individuals will understand what starts the information service chain from an application services perspective. This information will be crucial in establishing the start of the 'cookie crumb' trail.

What is the method of transmission?

Obviously, if something is creating data there are three choices. First, the device will store the data. Second, the device may use the data to actuate an action or control. Third, the device will transmit the data. Sometimes a device will do all three. Using video as an example, a wildlife camera off in the woods will usually store the data it generates until a wildlife manager or hunter comes to access the content, whereas a video surveillance camera will usually transmit the data to a server, a digital video recorder or a human viewer in real time. Some video surveillance cameras may also store recent clips, or even feed back into the physical security system to lock down an entry or exit zone. When something transmits the information, it is important to establish the methods used. Is it IP or another protocol? Is it unicast or multicast? Is it UDP (connectionless) or TCP (connection oriented)? Is the data encrypted in transit? If so, how? If it is encrypted, is a proper chain of trust established and validated? In short, if the information moves out of the device and you have deemed that data to be confidential or critical, then it is important to quantify the nature of the transmission paths and the nature (or lack) of security for them.

What is receiving the data?

Obviously, if the first system element is transmitting data then there has to be a system or set of systems receiving it. Again, this may be fairly simple and linear, such as the movement of credit card data from a point of sale system to an application server in the data center. In other instances, particularly in IoT frameworks, the information flow will be convoluted and loop back on itself to facilitate the closed loop communication required for systems automation. In other words, as you extend your discovery you will begin to discern characteristics, or a 'signature', to the data footprint. Establishing the transmitting and receiving systems is a critical part of this process. A bit later in the paper we will look at a simple linear data flow and compare it to a simple closed loop data flow in order to clarify this precept.

Is the data stored? How is it stored?

When folks think about storage, they typically think about hard drives, solid state storage or storage area networks. So there are considerations to be made here. Is the storage a structured database, or is it a simple NAS? Perhaps it is something based on Google File System (GFS) or Hadoop for data analytics. But the reality is that data storage is much broader than that. Any device that holds data in memory is in actuality storing it. Sometimes the data may be transient. In other words, it might be a numerical data point that represents an intermediate mathematical step toward an end calculation. Once the calculation is completed, the data is no longer needed and the memory space is flushed. But is it really flushed? As an example, some earlier vendor applications for credit card processing did not properly flush PINs or CVC values from prior transactions. If transient data is being created, it needs to be determined whether that data is critical or confidential; if so, it should either be deleted upon termination of the session or, if stored, stored with the appropriate security considerations. In comparison, the transient numerical value of a mathematical function may not be confidential, because outside of its context that data value would be meaningless. But keep in mind that this might not always be the case. Only someone with domain expertise will know. Are you starting to see some common threads?
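The 'is it really flushed?' question applies in code as well as on disk. In a language like Python, immutable strings holding a PIN or CVC can linger in memory until garbage collection; keeping the transient value in a mutable buffer at least allows a best-effort overwrite when it is no longer needed. A sketch, with an invented helper name:

```python
def with_transient_secret(secret_bytes, use):
    """Hold a transient secret (e.g. a PIN or CVC) in a mutable buffer,
    hand it to `use`, and overwrite the buffer as soon as the work is done.
    Any copies that `use` makes are outside this best-effort zeroization."""
    buf = bytearray(secret_bytes)
    try:
        return use(buf)
    finally:
        for i in range(len(buf)):
            buf[i] = 0  # flush the transient value for real
```

This is only one layer of defense: swap files, core dumps and interpreter internals can still leak copies, which is exactly why the earlier vendor applications mentioned above failed audits.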

What systems are using the data and what are they using it for?

Again, this may sound like an obvious question, but there are subtle issues and most probably assumptions that need to be validated and vetted. A good example might be data science and analytics. As devices generate data, that data needs to be analyzed for traits and trends. In the case of credit card data it might be analysis for fraudulent transactions. In the case of IoT for automated production it might be the use of sensor data to tune and actuate controllers, with an analytic process in the middle to tease out pertinent metrics for systems optimization. In the former example, it is an extension of a linear data flow; in the latter, the analytics process is embedded into the closed loopback data flow. Knowing these relationships allows one to establish the proposed 'limits' of the data footprint. Systems beyond this footprint simply have no need to access the data, and consequently no access should be provided to them.

Do those systems generate ‘emergent’ data?

I get occasional strange looks when I use this term. Emergent data is data that did not exist prior to the start of the compute/data flow. Examples of emergent data are transient numerical values used for internal computation within a particular algorithmic process. Others are intermediate data metrics that provide actual input into a closed loop behavior pattern. In the area of data analysis this movement is referred to as 'shuffle': the movement of data across the top-of-rack environment in an east/west fashion to facilitate the mathematical computation that often accompanies data science analytics. Any of the resultant data from the analysis process is 'new', or emergent, data.

If yes, is that data sent, stored or used?

Unless you have a very poorly designed solution set, any system that generates emergent data will do something with it (one of the three actions mentioned above). If you find that this is not the case, then the data is superfluous and the process could possibly be eliminated from the end to end data flow. So let's assume that the system in question will do at least one of the three. In the case of a programmable logic controller, it may use the data to more finely tune its integral and atomic process. The same system (or its manager) may store at least a certain span of data for historical context and systems logs. In the case of tuning, the data may be generated by an intermediate analytics process that arrives at more optimal settings for the controller's actuation and control. So remember, these data metrics could come from anywhere in the looped feedback system.

If yes, then is that data confidential or critical?

If your answer to this question is yes, then the whole process of investigation needs to begin again until all possible avenues of inter-system communication are exhausted and validated. So in reality we are stepping into another closed loop of systems interaction and information flow within the confidential footprint. Logic dictates that if all of the data up to this point is confidential or critical, then it is highly likely that this loop will be as well. It is highly unlikely that one would go through a complex loop process with confidential data and conclude that there are no security concerns about the emergent data or the actions that result from the system. Typically, if things start as confidential and critical, they usually – but not always – end up as such within an end to end data flow. Particularly if it is something as critical as the meaning of the universe, which we all know is '42'.


Linear versus closed loop data flows

First, let's remove the argument of semantics. All data flows that are acknowledged are closed loops. A very good example is TCP, with its acknowledgements of transmissions. That is a closed loop in its proper definition. But what we mean here is a bit broader. We are talking about the general aspects of the confidential data flow, not the protocol mechanics used to move the data; those were addressed already in step two. Again, a very good example of a linear confidential data flow is PCI, whereas automation frameworks provide a good example of looped confidential data flows.

Linear Data Flows

Let's take a moment and look at a standard data flow for PCI. First you have the start of the confidential data chain, which is obviously the point of sale system. From the point of sale system the data is either encrypted or, more recently, tokenized into a transaction identifier by the credit card firm in question. This tokenization provides yet another degree of abstraction, avoiding the need to transmit actual credit card data. From there the data flows up to the data center demarcation, where the flow is inspected and validated by firewalls and intrusion detection systems, and is then handed to the data center environment, where a server runs an appropriately designed PCI DSS application to handle the card and transaction data. In most instances this is where it stops. From there the data is uploaded to the bank via a dedicated and encrypted services channel. Most credit card merchants do not store cardholder data. As a matter of fact, PCI DSS 3.0 advises against it unless there are strong warrants for such practice, because the extended practices required to protect stored cardholder data further complicate compliance. Again, an example might be analysis for fraudulent practices. When this is the case, the data analytics sandbox needs to be considered an extension of the actual PCI cardholder data domain. But even then, it is a linear extension of the data flow. Any feedback is likely to end up in a report meant for human consumption and follow up. In the case of an actual credit card vendor, however, this may be different. There may be the ability and the need to automatically disable a card based on the recognition of fraudulent behavior. In that instance the data analytics is actually a closed loop data flow at the end of the linear data flow. The closing of the loop is the analytics system flagging to the card management system that the card in question should be disabled.

Looped Data Flows

In the case of a true closed loop IoT framework, a good simplified example is a simple three-loop public water distribution system. The first loop is created by a flow sensor that measures the gallons-per-second flow coming into the tank. The second loop is created by a flow sensor that measures the gallons-per-second flow out of the tank. Obviously the two loops feed back on one another and actuate pumps and drain flow valves to maintain a match to the overall flow of the system, with a slight favor to the tank-filling loop. After all, it’s not just a water distribution system but a water storage system as well. In ideal working conditions, as the tank reaches the full point the ingress sensor feeds back to reduce the speed of, and even shut down, the pump. There is also a third loop involved. This is a failsafe that will actuate a ‘pop off’ valve in the case that a mismatch develops due to systems failure (the failure of one of the drain valves, for instance). Once the fill level of the tank or the tank’s pressure reaches a level established in advance, the pop-off valve is actuated, relieving the system of additional pressure that could cause further damage and even complete system failure. It is obviously critical for the three loops to have continuous and stable communications. These data paths also have to be secure, as anyone who could gain access to the network could mount a denial of service attack on one of the feedback loops. Additionally, if actual systems access is obtained, then the rules and policies could be modified to horrific results. A good example is that of a public employee a few years ago who was laid off and subsequently gained access and modified certain rules in a metro sewer management system. The attack resulted in sewage backups that went on for months until the attack and malicious modifications were recognized and addressed. So this brings us now to the aspect of systems access and control.
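The three loops can be sketched in code. The thresholds, capacities and command names below are purely illustrative assumptions, not taken from any real control system:

```python
# Hedged sketch of the three-loop tank control described above.
TANK_CAPACITY = 10000.0   # gallons (illustrative)
POP_OFF_LEVEL = 9500.0    # pre-established failsafe threshold

def control_step(level, inflow_gps, outflow_gps):
    """One iteration of the feedback loops; returns actuation commands."""
    commands = {"pump": "run", "drain_valve": "open", "pop_off": "closed"}

    # Loop 1: ingress sensor -- slow, then stop, the pump as the tank fills.
    if level >= TANK_CAPACITY * 0.95:
        commands["pump"] = "stop"
    elif level >= TANK_CAPACITY * 0.80:
        commands["pump"] = "slow"

    # Loop 2: egress sensor -- keep drain flow slightly below inflow, since
    # the system favors filling (it is a storage system as well).
    if outflow_gps > inflow_gps:
        commands["drain_valve"] = "throttle"

    # Loop 3: failsafe -- if a mismatch has driven the level past the
    # pre-established threshold, actuate the pop-off valve.
    if level >= POP_OFF_LEVEL:
        commands["pop_off"] = "open"

    return commands

assert control_step(9800, 5.0, 1.0)["pop_off"] == "open"
assert control_step(5000, 5.0, 6.0)["drain_valve"] == "throttle"
```

The security point follows directly: an attacker who can disrupt any one of these sensor feeds, or rewrite the thresholds, subverts the whole system.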


But you’re not done yet…

You might have noticed that certain confidential data may be required to leave your administrative boundary. This could be anything from uploading credit card transactions to a bank to sharing confidential or classified information between agencies for law enforcement or homeland defense. In either case, this qualifies as an extension of the confidential data boundary and needs to be properly scrutinized as a part of it. But the question is how?

This tends to be one of the biggest challenges in establishing control of your data. When you give it to someone else, how do you know that it is being treated with due diligence and is not being stored or transferred in a non-secure fashion, or worse yet, being sold for revenue? Fortunately, there are things that you can do to assure that ‘partners’ are using proper security enforcement practices.

1). A contract

The first obvious thing is to get some sort of assurance contract in place that holds the partner to certain practices in the handling of your data. Ask your partner to provide documentation as to how those practices are enforced and what technologies are in place for assurance. It may also be a good idea to request a visit to the partner’s facilities to meet directly with staff and tour the site in question.

2). Post Contract

Once the contract is signed and you begin doing business, it is always wise to do a regular check on your partner to ensure that there has been no ‘float’ between what is assumed in the contract and what is reality. Short of the onerous requirement of a full-scale security audit (and note that there may be instances where that is in fact required), there are some things you can do to ensure the integrity and security of your data. It is probably a good idea to establish regular or semi-regular meetings with your partner to review the service they provide (i.e. transfer, storage, or compute) and its adherence to the initial contract agreement. In some instances it might even warrant setting up direct site visits in an ad hoc fashion, with little or no notification. This will provide better assurance of the proper observance of ‘day to day’ practice. Finally, be sure to have a procedure in place to address any infractions of the agreement, as well as contingency plans for alternative tactical methods to provide assurance.


Systems and Control – Access logic flow

So now that we have established a proper scope for the confidential or critical data footprint, what about the systems? The relationship between data and systems is strongly analogous to that between musculature and skeletal structure in animals. There is a very strong synergy between muscle structure and skeletal processes: simply put, muscles only attach to skeletal processes, and skeletal processes do not develop in areas where muscles do not attach. You can think of the data as the muscles and the systems that use or generate the data as the processes.

This also should have become evident in the data discovery section above. Identifying the participating systems is a key point of the discovery process. This gives you a pre-defined list of systems elements involved in the confidential footprint. But it is not always just a simple one-to-one assumption. The confidential footprint may be encompassed by a single L3 VSN, but it may not. As a matter of fact, in IoT closed loop frameworks this most probably will not be the case. These frameworks will often require tiered L2 VSNs to keep certain data loops from ‘seeing’ other data loops. A very good example of this is production automation frameworks, where there may be a higher level Flow Management VSN with several automation managers tiered ‘below’ it in smaller dedicated VSNs that communicate up to the higher level management environment. At the lowest level you would have very small VSNs or, in some instances, dedicated ports to the robotics drives. Obviously it is of key importance to make sure that systems are authenticated and authorized before being placed into the proper L2 VSN within the overall automation hierarchy. Again, someone with systems and domain experience will be required to provide this type of information.

Below is a high-level logic flow diagram of systems and access control within SDN Fx. Take a quick look at the diagram, and we will touch on each point in the logic flow in further detail.


Figure 1. SDN Fx Systems & Access Control

There are a few things to note in the diagram above. First, in the earlier stages of classifying a device or system there is a wide variety of potential methods available, which the process winnows down to a single method through which validation and access occur. It is also important to point out that all of these methods could be used concurrently within a given Fabric Connect network. It is best, however, to be consistent in the methods you use to access the confidential data footprint and the corresponding Stealth environment that will eventually encompass it. Let’s take a moment and look a little closer at the overall logic flow.

Device Classification

When a device first comes online in a network, it is a link state on a port and a MAC address. There is generally no quantified idea of what the system is unless the environment is manually provisioned and record keeping is scrupulously maintained. That is not a real-world proposition, so there is the need to classify the device, its nature and its capabilities. We see that there are two main initial paths. Is it a user device, like a PC or a tablet? Or is it just a device? Keep in mind that this could still be a fairly wide array of potential types. It could be a server, or it could be a switch or WLAN access point. It could also be a sensor or controller, such as a video surveillance camera.

User Device Access

This is a fairly well understood paradigm. For details, please reference the many TCGs and documents that exist on Avaya’s Identity Engines and its operation; there is no need to recreate them here. At a high level, IDE can provide for varying degrees and types of authentication. As an example, normal user access might be based on a simple password or token, while other, more sensitive types of access might require stronger authentication such as RSA. Beyond that, there may be guest users who are allowed temporary access to guest portal type services.

Auto Attach Device Access

Auto-attach (IEEE 802.1Qcj), known in Avaya as Fabric Attach, supports a secure LLDP signaling dialog between the edge device running the Fabric Attach (auto attach) client and the Fabric Attach proxy or server, depending upon topology and configuration. IDE may or may not be involved in the Fabric Attach process. In the case of a device that supports auto attach, there are two main modes of operation. The first is the pre-provisioning of VLAN/I-SID relationships on the edge device in question; IDE can be used to validate that the particular device warrants access to the requested service. There is also a NULL mode in which the device does not present a VLAN/I-SID combination request but instead lets IDE handle all or part of the decision (i.e. Null/Null or VLAN/Null). This might be the mode that a video surveillance camera or sensor system supporting auto attach would use. There are also enhanced security methods used within the FA signaling that significantly mitigate the possibility of MAC spoofing and provide for security of the signaling data flows.


802.1X Access

Obviously 802.1X is used in many instances of user device access. It can also be used for devices alone; a very good example again is video surveillance cameras that support it. 802.1X is based on three major elements: supplicants, those wishing to gain access; authenticators, those providing the access, such as an edge switch; and an authentication server, which for our purposes would be IDE. From the supplicant to the authenticator, the Extensible Authentication Protocol, or EAP (or one of its variants), is used. The authenticator and the authentication server support a RADIUS request/challenge dialog on the back end. Once the device is authenticated, it is authorized and provisioned into whatever network service is dictated by IDE, whether stealth and confidential or otherwise.

MAC Authentication

If we arrive at this point in the logic flow, we know that it is a non-user device and that it does not support auto attach or 802.1X. At this point the only method left is simple MAC authentication. Note that this box is highlighted in red due to concerns over valid access security, particularly to the confidential network. MAC authentication can be spoofed by fairly simple methods; consequently, it is generally not recommended as a means of access into secure networks.

Null Access

This is actually the starting point in the logic flow as well as a termination. Every device that attaches to the edge when using IDE gets access for authentication alone. If the authentication loop fails (whether FA or 802.1X), the network state reverts to this mode. No network access is provided, but there is the ability to address possible configuration issues; once those are addressed, the authentication loop would proceed again, with access granted as a result. On the other hand, if this point in the logic flow is reached because nothing else is supported or provisioned, then manual configuration is the last viable option.

Manual Provisioning

While this is certainly a valid method for providing access, it is generally not recommended. Even if the environment is accurately documented and the record keeping scrupulously maintained, there is still the risk of exposure. This is because VLANs are statically provisioned at the service edge, with no inspection and no device authentication. Anyone could plug into the edge port, and if DHCP is configured on the VLAN they are on the network and no one is the wiser. Compare this with the use of IDE in tandem with Fabric Connect, where someone who unplugs a system and plugs in their own to try to gain access will obviously fail. As a result this box is shown in red as well; it is not a recommended method for stealth network access.
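The access logic flow above can be condensed into a rough decision-tree sketch. The device attributes and outcome names here are hypothetical labels for illustration; in a real deployment these decisions are made by IDE policy and the switch edge, not by application code:

```python
# Sketch of the access-control decision chain described above (hypothetical
# attribute and outcome names; real policy lives in Identity Engines).
def classify_access(device):
    if device.get("is_user_device"):
        return "ide_user_auth"      # password/token, or stronger (e.g. RSA)
    if device.get("supports_auto_attach"):
        return "fabric_attach"      # secure LLDP dialog, optionally with IDE
    if device.get("supports_dot1x"):
        return "dot1x"              # EAP to the authenticator, RADIUS to IDE
    if device.get("mac_auth_allowed"):
        return "mac_auth"           # weak: MAC addresses are easily spoofed
    return "null_access"            # no network access; manual provisioning
                                    # is the last (not recommended) resort

assert classify_access({"is_user_device": True}) == "ide_user_auth"
assert classify_access({"supports_dot1x": True}) == "dot1x"
assert classify_access({}) == "null_access"
```

Note how the chain degrades in assurance as it proceeds, which is exactly why the last two branches are flagged in red in the diagram.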


How do I design the Virtual Service Networks required?

Up until now we have been focusing on the abstract notions of data flow and footprint. At some point someone has to sit down and design how the VSNs will be implemented and what relationships, if any, exist between them. At this point, if you have done due diligence in the data discovery process outlined earlier, you should have:

1). A list of transmitting and receiving systems

2). How those systems are related and their respective roles

a). Edge Systems (sensors, controllers, users)

b). Application Server environments (App., DB, Web)

c). Data Storage

3). A resulting flow diagram that illustrates how data moves through the network

a). Linear data flows

b). Closed loop (feedback) data flows

4). Identification of preferred or required communication domains.

a). Which elements need to ‘see’ and communicate with one another?

b). Which elements need to be isolated and should not communicate directly?

As an example of linear data flows, see the diagram below. It illustrates a typical PCI data footprint. Notice how the data flow is primarily from the point of sale systems to the bank. While there are some minor flows of other data in the footprint, it is by and large dominated by the credit card transaction data as it moves to the data center and then to the bank, or even directly to the bank.


Figure 2. Linear PCI Data Footprint

Given that the linear footprint is monolithic, the point of sale network can be handled by one L3 IP VPN Virtual Service Network. This VSN would terminate at a standard security demarcation point with a mapping to a single dedicated port. In the data center, a single L2 Virtual Service Network could provide the required environment for the PCI server application and the uplink to the financial institution. Alternatively, some customers have utilized Stealth L2 VSNs to provide connectivity to the point of sale systems, which are in turn collapsed to the security demarcation.


Figure 3. Stealth L2 Virtual Service Network


Figure 4. L3 Virtual Service Network

A Stealth L2 VSN is nothing more than a normal L2 VSN that has no IP addresses assigned at the VLAN service termination points. As a result, the systems within it are much more difficult to discover and hence exploit. L3 VSNs, which are I-SIDs associated with VRFs, are stealth by nature. The I-SID replaces traditional VRF peering methods, creating a much simpler service construct.

To look at looped data flows, let’s use a simple two-layer automation framework, as shown in the figure below.


Figure 5. Looped Data Footprint for Automation

We can see that we have three main element types in the system: two sensors (S1 & S2), a controller or actuator, and a sensor/controller manager, which we will refer to as the SCM. We can also see that the sensor feeds information on the actual or effective state of the control system to the SCM. For the sake of clarity, let’s say that it is a flood gate. The sensor (S2) can measure whether the gate is open or closed or in any intermediate position. The SCM can in turn control the state of the gate by actuating the controller. The system might be even more sophisticated, managing the local gate according to upstream water level conditions as well; for that there is a dedicated sensor element that allows the system to monitor the water level, sensor S1. So we see a closed loop framework, but we also see some consistent patterns: the sensors never talk directly to the controllers. Even S2 does not talk to the controller; it measures the effective state of it. Only the SCM talks to the controller, and the sensors only talk to the SCM. As a result we begin to see a framework of data flow and which elements within the end-to-end system need to see and communicate with one another. This in turn will provide us with insight into how to design the supporting Virtual Service Network environment, as shown below.


Figure 6. Looped Virtual Service Network design

Note that the design is self-similar, in that it is replicated at various points of the watercourse it is meant to monitor and control. Each site location provides three L2 VSN environments for S1, S2 and A/C. Each of these feeds up to the SCM, which coordinates the local sensor/control feedback. Note that S1, S2 and A/C have no way to communicate directly, only through the coordination of the SCM. There may be several of these loopback cells at each site location, all feeding back into the site SCM; note also that there is a higher level communication channel provided by the SCM L3 VSN, which allows SCM sites to communicate upstream state information to downstream flood control infrastructure.

The whole system becomes a series of interrelated atomic networks that have no way to communicate directly, and yet have the ability to convey a state of awareness of the overall end-to-end system that can be monitored and controlled in a very predictable fashion, as long as it is within the engineered limits of the system. But also note that each critical element is effectively isolated from any inbound or outbound communication other than that which is required for the system to operate. Now it becomes easy to implement intrusion detection and firewalls with a very narrow profile on what is acceptable within the given data footprint. Anything outside it is dropped, pure and simple.
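The coordination rules just described (sensors report only to the SCM, and only the SCM commands the actuator) can be sketched as follows. The water-level threshold and gate positions are purely illustrative assumptions:

```python
# Sketch of the SCM coordination pattern. S1 reports upstream water level,
# S2 reports the gate's effective position; only the SCM decides commands.
def scm_step(upstream_level, gate_position):
    """Decide a gate command from S1 (water level) and S2 (gate state)."""
    desired = 1.0 if upstream_level > 8.0 else 0.0  # open fully on high water
    if abs(gate_position - desired) < 0.05:
        return "hold"       # S2 confirms the effective state matches intent
    return "open" if desired > gate_position else "close"

assert scm_step(9.2, 0.0) == "open"   # high upstream water, gate closed
assert scm_step(3.0, 0.0) == "hold"   # low water, gate already closed
```

The key design point survives the simplification: neither sensor ever addresses the actuator, so compromising a sensor VSN yields no direct path to the controller.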


Know who is who (and when they were there (and what they did))!

The prior statement applies not only to looped automation flows but also to any confidential data footprint. It is important to consider not only the validation of the systems but also of the users who will access them. But it goes much further than network and systems access control; it extends to proper auditing of that access and associated change control. This becomes a much stickier wicket, and there is still no easy answer. It really comes down to a coordination of resources, both cyber and human. Be sure to think out your access control policies with respect to the confidential footprint. Be prepared to buck standard access policies or demands from users that all services be available everywhere. As an example, it is not acceptable to mix UC and PCI point of sale communications in one logical network. This does not mean that a sales clerk cannot have a phone, and of course we assume that a contact center worker has a phone. It means that UC communications will traverse a different logical footprint than the PCI point of sale data. The two systems might be co-resident at various locations, but they are ships in the night from a network connectivity perspective. As a customer recently commented to me, “Well, with everything that has been going on, users will just need to accept that it’s a new world.” He was right. In order to properly lock down information domains, there needs to be stricter management of user access to those domains and exactly what users can and cannot do within them. It may even make sense to have whole alternate user IDs with alternate, stronger methods of authentication. This provides an added hurdle to a would-be attacker who might have gained access to a general user’s account. Alternate user accounts also provide for easier and clearer auditing of those users’ activities within the confidential data domain.
Providing a common policy and directory resource for both network and systems access controls can allow for consolidation of audits and logs. By syncing all systems to a common clock and using tools such as the ELK stack (Elasticsearch, Logstash and Kibana), entries can be easily searched against those alternate user IDs and the systems that are touched or modified. There is still some extra work to generate the appropriate reports, but having the data in an easily searchable utility is a great help.
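As a sketch of what such a search might look like, the following builds an Elasticsearch query against one of those alternate user IDs. The index pattern, field names and user ID are assumptions; they depend entirely on how Logstash is configured to parse your logs:

```python
# Hypothetical audit query: all activity for the alternate user ID
# "jsmith-secure" over the last 24 hours, oldest first.
audit_query = {
    "query": {
        "bool": {
            "must": [
                {"term": {"user.keyword": "jsmith-secure"}},   # alternate user ID
                {"range": {"@timestamp": {"gte": "now-24h"}}}  # last day's entries
            ]
        }
    },
    "sort": [{"@timestamp": {"order": "asc"}}],
}

# With the elasticsearch Python client this would run as something like:
#   es.search(index="logstash-*", body=audit_query)
assert audit_query["query"]["bool"]["must"][0]["term"]["user.keyword"] == "jsmith-secure"
```

Because every system is synced to a common clock, sorting on `@timestamp` yields a coherent cross-system timeline for that user.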

Putting you ‘under the microscope’

Even in the best of circumstances there are times when a user or a device will begin to exhibit suspicious or abnormal behavior. As previously established, having an isolated information domain allows anomaly-based detection to function with a very high degree of accuracy. When exceptions are found they can be flagged and highlighted. A very powerful capability of Avaya’s SDN Fx is its unique ability to leverage stealth networking services to move the offending system into a ‘forensics environment’ where it is still allowed to perform its normal functions but is monitored to assure proper behavior or determine the cause of the anomaly. In the case of malicious activity, the offending device can be placed into quarantine with the right forensics trail. Today we have many customers who use this feature on a daily basis in a manual fashion. A security architect can take a system, place it into a forensics environment and then monitor it for suspect activity. But a human needs to be at the console to see the alert. Recently, Avaya has been working with SDN Fx and the Breeze development workspace to create an automated framework. Working with various security systems partners, Avaya is creating an automated systems framework to protect the micro-segmented domains of interest. Micro-segmentation not only provides the isolated environment for anomaly detection, but also the ability to lock down and isolate suspected offenders.

Micro-segmentation ‘on the fly’ – No man is an island… but a network can be!

Sometimes there is the need to move confidential data quickly and in a totally secret and isolated manner. In response to this need there arose a series of secure web services known as Tor or onion sites. These sites were initially introduced and intended for research and development groups, but over time they have been co-opted by drug cartels and terrorist organizations, and as a result the space has become known as the ‘dark web’. The use of strong encryption in these services is now a concern among the likes of the NSA and FBI, as well as many corporations and enterprises. These sites are now often blocked at security demarcations due to concerns about masked malicious activity and content. Additionally, many organizations now forbid strong encryption on laptops or other devices as concerns over its misuse have grown significantly. But clearly, there is a strong benefit to closed networks that are able to move information and provide communications with total security. There has to be some compromise that could allow for this type of service, but provide it in a manner that is well mandated and governed by an organization’s IT department. Avaya has been doing research into this area as well. Dynamic team formation can be facilitated, once again, with SDN Fx and the Breeze development workspace. Due to the programmatic nature of SDN Fx, completely isolated Stealth network environments can be established in a very quick and dynamic fashion. The Breeze development platform is used to create a self-provisioning portal where users can securely create a dynamic stealth network with required network services. These services would include required utilities such as DHCP, but also optional services such as secure file services, Scopia video conferencing, and internal security resources to ensure proper behavior within the dynamic segment. A secure invitation is sent out to the invitees with a URL to join the dynamic portal with authenticated access.
During the course of the session, the members are able to work in a totally secure and isolated environment where confidential information and data can be exchanged, discussed and modified with total assurance. From the outside, the network does not exist: it cannot be discovered and cannot be intruded into. Once users are finished with the resource, they simply log out of the portal and are automatically placed back into their original networks. Additionally, the dynamic Virtual Service Network can be encrypted at the network edge, either by a device like Avaya’s new Open Network Adapter or by a partner such as Senetas, who is able to provide secure encryption at the I-SID level. With this type of solution, the security of Tor and onion sites can be provided, but in a well-managed fashion that does not require strong encryption on the laptops. Below is an illustration of the demonstration that was publicly held at the recent Avaya Technology Forums across the globe.


Figure 7. I-SID level encryption demonstrated by Senetas

In summary

Many security analysts, including those from the likes of the NSA, are saying that micro-segmentation is a key element of proper cyber-security practice. It is not a hard point to understand. Micro-segmentation can limit east-west movement of malicious individuals and content. It can also provide isolated environments that form an inherently strong complement to traditional security technologies. The issue that most folks have with micro-segmentation is not the technology itself but deciding what to protect and how to design the network to do so. Avaya’s SDN Fx Fabric Connect can drastically ease the deployment of a micro-segmented network design. Virtual Service Networks are inherently simple service constructs that lend themselves well to software defined functions. It cannot, however, assist in deciding what needs to be protected. Hopefully, this article has provided insight into methods that any organization can adopt to do the proper data discovery and arrive at the scope of the confidential data footprint. From there, the design of the Virtual Service Networks to support it is extremely straightforward.

As we move forward into the new world of the Internet of Things and smart infrastructures, micro-segmentation will be the name of the game. Without it, your systems are simply sitting ducks once the security demarcation has been compromised, or worse yet, when the malice comes from within.







What’s the Big Deal about Big Data?

July 28, 2014


It goes without saying that knowledge is power. It gives one the power to make informed decisions and avoid miscalculations and mistakes. In recent years the definition of knowledge has changed slightly. This change is the result of increases in the ease and speed of computation, as well as the sheer volume of data that these computations can be exercised against. Hence, it is no secret that the rise of computers and the Internet has contributed significantly to enhancing this capability.
The term that is often bandied about is “Big Data”. This term has gained a certain mystique comparable to that of cloud computing. Everyone knows that it is important. Unless you have been living in a cave, you most certainly have at least read about it. After all, if such big names as IBM, EMC and Oracle are making a focus of it, then it must have some sort of importance to the industry and market as a whole. When pressed for a definition of what it is, however, many folks will often struggle. Note that the issue is not that it deals with the computation of large amounts of data, as its name implies, but that many folks struggle to understand what it would be used for.
This article is intended to clarify the definitions of Big Data and Data Analytics/Data Science and what they mean. It will also talk about why they are important and will become more important (almost paramount) in the very near future. Also discussed will be the impact that Big Data will have on the typical IT department and what it means for traditional data center design and implementation. In order to do this, we will start with the aspect of knowledge itself and the different characterizations of it that have evolved over time.

I. The two main types of ‘scientific’ knowledge

To avoid getting into an in-depth discussion of epistemology, we will limit this section of the article to the areas of ‘scientific’ knowledge or, even more specifically, ‘knowledge of the calculable’. This is not to discount other forms of knowledge. There is much to be offered by spiritual and aesthetic knowledge, as well as many other classifications, including some that would be deemed scientific, such as biology*. But here we are concerned with knowledge that is computable, or knowledge that can be gained by computation.

* This is rapidly changing however. Many recent findings show that many biological phenomena have mathematical foundations. Bodily systems and living populations have been shown to exhibit strong correlations to non-linear power law relationships. In a practical use example, mathematical calculations are often used to estimate the impact of an epidemic on a given population.

Evolving for centuries, but coming to fruition with Galileo in the late 16th century, was the discovery that nature could be described and even predicted in mathematical terms. The dropping of balls of different sizes and masses from the Tower of Pisa is a familiar myth to anyone with even a slight background in the history of science. I say myth, because it is very doubtful that this ever literally took place. Instead, Galileo used inclined planes and ‘perfect’ spheres of various densities to establish that gravitational acceleration is constant regardless of size or mass. Lacking an accurate timekeeping device, he would sing a song to keep track of the experiments; being an accomplished musician, he had a keen sense of timing, and the inclined planes provided the extended time such a method required. He correctly realized that it was resistance, or friction, that caused the deltas we see in the everyday world. While everyone knows that when someone drops a cannonball and a feather off a roof the cannonball will strike the earth first, it is not common sense that in a perfect vacuum both the feather and the cannonball will fall at the exact same rate. It actually takes a video to prove it to the mind, and one can be found readily on the Internet. The really important thing is that Galileo calculated this from his work with spheres and inclined planes, and that the actual experiment was not carried out until many years after his death, as the ability to generate a near-perfect vacuum did not exist in his time. I find this very interesting, as it says two things about calculable knowledge. First, it allows one to explain why things occur as they do. Second, and perhaps more importantly, it allows one to predict the results once one knows the mathematical pattern of behavior. Galileo realized this.
Even though he was not able to create a perfect vacuum, by meticulous calculation of the various values involved (with rather archaic mathematics; the equal sign had not yet been invented, nor most of the symbols we now find familiar) he was able to arrive at this fact. Needless to say, this went against all common sense and experience. So much so that this, as well as his work in the fledgling science of astronomy, almost landed him on the hot seat (or stake) with the Church. As history attests, however, he stuck to his guns, and even after the Inquisitional Council had him recant his theories on the heliocentric nature of the solar system, he whispered of the earth, “Yet it still moves.”
If we fast forward to the time of Sir Isaac Newton, this insight was made crystalline by Newton’s laws of motion, which described the movement of ‘everything’, from the falling of an apple (no myth; this actually did spark his insight, though it did not hit him on the head) to the movement of the planets, with a few simple mathematical formulas. Published as the ‘Philosophiae Naturalis Principia Mathematica’, or simply the ‘Principia’, in 1687, this was the foundation of modern physics as we know it. The concept that the world was mathematical, or at least could be described in mathematical terms, was now not only validated but demonstrable. This set of events led to the eventual ‘positivist’ concept of the world, which reached its epitome with the following statement made by Pierre Laplace in 1814.
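In fact, Newton’s laws make Galileo’s result a one-line derivation: the gravitational force on a body is proportional to its mass, so the mass cancels when computing its acceleration.

```latex
F = \frac{G M m}{r^{2}}, \qquad
a = \frac{F}{m} = \frac{G M}{r^{2}} \approx 9.8~\mathrm{m/s^{2}}
\quad \text{at the Earth's surface}
```

The acceleration is independent of the falling body’s mass m, which is exactly why the feather and the cannonball land together in a vacuum.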
“Consider an intelligence which, at any instant, could have knowledge of all forces controlling nature together with the momentary conditions of all the entities of which nature consists. If this intelligence were powerful enough to submit all of this data to analysis, it would be able to embrace in a single formula the movements of the largest bodies in the universe and those of lighter atoms; for it, nothing would be uncertain; the future and the past would be equally present to its eyes.”

Wow. Now THAT’s big data! Sounds great! What the heck happened?

Enter Randomness, Entropy & Chaos

In roughly the same time frame as Laplace, many engineers were using these ‘laws’ in attempts to optimize new inventions like the steam engine. One such researcher was a French scientist by the name of Nicolas Léonard Sadi Carnot. His research focused on the movement of heat within the engine, with the goal of conserving as much of the energy as possible for work. In the process he came to realize that there was a cycle within the engine that could be described mathematically and even monitored and controlled. He also realized that some heat is always lost. It simply radiates out and away from the system and is unusable for the work of the engine; as anyone who has stood next to a working engine of any type will attest, they tend to get hot. This cycle bears his name as the Carnot cycle. This innovative view led to the foundation of a new branch of physics (with the follow-on help of Ludwig Boltzmann) known as thermodynamics: the realization that all change in the world (and the universe as a whole) is the movement of heat, more specifically from hot to cold. Without going into detail on the three major laws of thermodynamics, the main point for this discussion is that change, as it occurs, is irreversible. Interestingly, the more recently developed field of information theory validates this, showing that order can be interpreted as ‘information’ and that over time this information is lost to entropy as order is lost. Entropy is, as such, a measurement of disorder within a system. This brings us to the major inflection point of our subject. As change occurs, it cannot be run in reverse like a tape and arrive back at the same inherent values. This is problematic, as the laws of Newton are not reversible in practice, though they may be on a piece of paper. As a matter of fact, many such representations up to modern times, such as the Feynman diagrams used to illustrate the details of quantum interactions, are in fact reversible. What gives?
The real crux of this quick discussion is the realization that reversibility is largely a mathematical expression that starts to fall apart as the number of components in the overall system gets larger. A very simple example is one with two billiard balls on a pool table. It is fairly straightforward to use the Newtonian laws to reverse the equation, and we can also do so in practice. But now let us take a single cue ball and strike a large number of other balls. Reversing the calculation is not nearly so straightforward. The number of variables to be considered begins to go beyond our ability to calculate, much less control. They most certainly are not reversible in the everyday sense. In the same sense, I can flip a deck of playing cards into the air and bet with ultimate confidence that the cards will not come down in the same order (or even the same area!) in which they were thrown. Splattered eggs do not fall upwards to reassemble on the kitchen counter. And much to our chagrin, our cars do not repair themselves after we have had a fender bender. This is entropy at work: the 2nd law of thermodynamics states that some energy within a system is always lost to friction and heat. This dissipation can be minimized but never eliminated; as a result, the less entropy an engine generates, the more efficient it is in its function. Hmmmm, what told us that? A lot of data, that’s what, and back then things were done with paper & pencil! A great and timely discovery for its time, as it helped move us into the industrial age. The point of all of this, however, is that in some (actually most) instances, information on history is important in understanding the behavior of a system.
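To put a number on that card-deck bet, here is a quick back-of-the-envelope sketch in Python (the numbers are illustrative, not a proof of irreversibility):

```python
import math

# Number of distinct orderings of a standard 52-card deck.
orderings = math.factorial(52)

# Probability that a randomly shuffled deck lands in any one
# specific order, e.g. the exact order in which it was thrown.
p_same_order = 1 / orderings

print(f"52! = {orderings:.3e}")            # roughly 8.07e+67 orderings
print(f"P(same order) = {p_same_order:.3e}")
```

With more possible orderings than there are atoms in our galaxy, betting against a repeat is about as safe as a bet gets, which is the everyday face of the 2nd law.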

The strange attraction of Chaos

We need to fast forward again, now to the early 1960s and a meteorologist by the name of Edward Lorenz. He was interested in the enhanced computational abilities that new computing technology could offer toward the goal of predicting the weather. Never mind that it took five days’ worth of calculation to arrive at the forecast for the following day; at least the check was self-evident, as the weather in question had already occurred four days earlier!
As the story goes, he was crunching some data one evening and one of the machines ran out of paper tape. He quickly refilled the machine and restarted it from where the calculations had left off, manually, by typing them in. He then went off and grabbed a cup of coffee to let the machine churn away. When he returned he noticed that the computations were way off the values that the sister machines were producing. In alarm he looked over his work to find that the only real difference was the decimal precision of the initial values (the printout carried only three decimal places, while the actual calculation ran with six). As it turns out, the rounded values he typed in manually produced a wildly different result from the same calculation. This brought about the realization that many if not most systems are sensitive, at times extremely so, to what are now termed ‘initial conditions’.
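Lorenz’s rounding mishap is easy to reproduce. The sketch below uses the logistic map, a textbook chaotic system, as a stand-in for his weather model (his actual equations need a differential-equation solver); the six-place starting value echoes the one usually quoted for his experiment:

```python
def logistic(x, r=4.0):
    """One step of the logistic map; fully chaotic at r = 4."""
    return r * x * (1.0 - x)

x_full = 0.506127   # six-decimal-place initial condition
x_round = 0.506     # the same value rounded to three places

diffs = []
for _ in range(60):
    x_full = logistic(x_full)
    x_round = logistic(x_round)
    diffs.append(abs(x_full - x_round))

# The trajectories track each other at first, then diverge completely.
print(f"difference after 1 step: {diffs[0]:.6f}")
print(f"largest difference seen: {max(diffs):.3f}")
```

A difference of about one part in ten thousand at the start grows until the two runs bear no resemblance to one another, exactly what Lorenz saw on his paper tape.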
There is something more, however. Lorenz discovered that if some systems are looked at long enough, and with the proper granularity of focus, a quasi-regular or quasi-periodic pattern becomes discernible that allows for a general qualitative description of a system and its behavior, without the ability to say quantitatively what the state of any particular part of the system will be at a given point in time. These are termed mathematical ‘attractors’ within a system: a certain set of power-law-based formulas toward which a system, if left unperturbed, is drawn and maintained. These attractors are quite common. They are more or less required for all dissipative systems. In essence, it is a behavior that can be described mathematically and that by its nature keeps a system a system, with just enough energy coming in to offset the entropy that must inevitably go out. The whole thing is fueled by the flow of energy (heat) through it. By the way, both you and I are examples of dissipative systems, and yes, we are based on a lot of information. But here is something to consider: stock markets are dissipative systems too. The only difference is that energy is replaced by money.
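A rough numerical sketch of Lorenz’s own system shows the attractor property: the trajectory never repeats, yet never escapes a bounded region. This uses a crude Euler integration with the classic parameter values (the step size and iteration count are arbitrary illustrative choices):

```python
# The Lorenz system: three coupled equations with the classic
# parameters (sigma = 10, rho = 28, beta = 8/3).
def lorenz_step(x, y, z, dt=0.001, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return x + dx * dt, y + dy * dt, z + dz * dt

x, y, z = 1.0, 1.0, 1.0
extreme = 0.0
for _ in range(50_000):          # 50 units of simulated time
    x, y, z = lorenz_step(x, y, z)
    extreme = max(extreme, abs(x), abs(y), abs(z))

# Qualitatively predictable (bounded), quantitatively unpredictable.
print(f"largest coordinate magnitude seen: {extreme:.1f}")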

The problem with Infinity

The question is: how sensitive do we have to be, and what level of focus will reveal a pattern? How many decimal places can you leave off and still have faith in the resulting calculations? This may sound like mere semantics, but the rounding in Lorenz’s work created results that were wildly different. (Otherwise he might very well have dismissed it as noise*)

* Actually, in the electronics and communications fields this is exactly what the phenomenon was termed for decades. Additionally, it was deemed ‘undesirable’, and engineers sought to remove or reduce it, so its nature was never researched further. Only recently have efforts to leverage these characteristics begun to be investigated.

Clearly the accuracy of a given answer depends on how accurately the starting conditions are measured. Again, one might say: OK, perhaps this is the case for a minority of systems, but in most cases any difference will be minor. Alas, this is not true. Most systems are like this. The term is ‘non-linear’. Small degrees of inaccuracy in the initial values of the calculations in non-linear systems can result in vastly different end results. One of the reasons is that with the seemingly unassociated concept of infinity, we touch on a very sticky subject. What is an infinitely accurate initial condition? As an example, I can take a meter and divide it by 100 to arrive at centimeters, then take a centimeter and divide it further to arrive at millimeters, and so forth. It would seem this process could go on forever! Actually it cannot, but the answer is not appeasing to our cause. We can continue to divide only until we arrive at the Planck length, the smallest recognizable unit of difference before the very existence of space and time becomes meaningless: in essence, a foam of quantum probability from which existence as we know it emerges.
The practical question must be: when I make a measurement, how accurate do I need to be? Well, if I am cutting a two-by-four for the construction of some macro-level structure such as a house or shed, I only need to be accurate to the 2nd, maybe 3rd, decimal place. On the other hand, if I am talking about cutting a piece of scaffolding fabric to fit surgically into a certain locale within an organ, to provide a substrate for regenerative growth, the required accuracy increases by orders of magnitude, possibly out to 6 or 8 decimal places. So the question to ask is: how do we know how accurate we have to be? Here comes the pattern part! We know this by the history of the system we are dealing with. In the case of a house, we have plenty of history (a strong pattern: we have built a lot of houses) to deduce that we need only be accurate to a certain degree and the house will successfully stand. In the case of micro-surgery we may have less history (a weaker pattern: we have not done so many of these new medical procedures), but enough to know that a couple of decimal places will just not cut it. Going further, we even have things like the weather, where we have lots and lots of historical data but the exactitude and density of the information still limit us to only a few days of relatively accurate predictive power. In other words, quite a bit of our knowledge depends on the granularity and focus with which it is analyzed. Are you starting to see a thread? Wink, wink.

Historical and Ahistorical knowledge

It all comes down to the fact that calculable knowledge depends on us having some idea of the history and conditions of a given system. Without these we cannot calculate. But how do we arrive at these initial values? Well, by experiment of course. We all recall the days back in school, with the tedious hours of experimentation in exercises where we knew full well the result. But think of the first time this was realized by the likes of, say, Galileo. What a great moment it would have been! But an experiment by definition cannot be a one-time thing. One has to run an experiment multiple times with ‘exactly’ the same conditions, or varying the conditions slightly in a controlled fashion, depending on what one is trying to prove. This brings about a strong concept of history. The experimental operations have been run, and we know that such a system behaves in such a way from historical and replicable examples. Now we plug those variables into the mathematics and let it run. We predict from those calculations and then validate with further experiments. Basic science works on these principles, so we should say that all calculable knowledge is historic in nature. But it could be argued that certain immutable ‘mathematical truths’ constitute ahistorical knowledge. In other words, like Newton’s laws* and the Feynman diagrams, some knowledge just doesn’t care about the nature or direction of time’s arrow. Be that as it may, it can further be argued that any of these require historical knowledge in order to interpret their meaning, or even to discover that they exist!

* Newton’s laws are actually approximations of reality. In normal everyday circumstances the linear laws work quite well. When speed or gravity is brought to extremes, however, the laws fail to yield a correct representation. Einstein’s theories of relativity provide a more accurate way to represent the non-linear reality under these extreme conditions (actually the relativistic effects exist all the time, but in normal environments the delta from the linear is so small as to be negligible). The main difference: in Newton’s laws, space and time are absolute. The clock ticks the same regardless of motion or location, hence linear. In Einstein’s theories, space and time are mutable and dynamic. The clock ticks differently for different motions or even locations. Specifically, time slows with speed as the local space contracts, hence non-linear.

As an example, you can toss me a ball from about ten feet away. Given the angle and the force of the throw, I can properly calculate where the ball will be at a certain point in time; I have the whole history of the system from start to finish. I may use an ahistorical piece of knowledge (i.e. the ball is in the air and moving towards me), but without knowledge of the starting conditions for this particular throw I have little data and will likely not catch the ball. In retrospect, it’s amazing that our brains can make this ‘calculation’ all at once. Not explicitly, of course, but implicitly. We know that we have to back up or run forward to catch the ball. We are not doing the actual calculations in our heads (at least I’m not). But if I were to run out onto the field and see the ball you threw already in mid-air, with no knowledge of the starting conditions, I would essentially be dealing with point zero in knowledge of a pre-existing system. Sounds precarious, and it is, because this is the world we live in. But wait! Remember that I have a history in my head of how balls in the air behave! I can reference this library and get a chunk of history in very small sample periods (the slow-motion effect we often recall) and yes, perhaps I just might catch that ball, provided that the skill of the thrower was commensurate with the skill of those I have knowledge of. Ironically, the more variability there is in my experience with throwers of different skill levels, the higher the probability of my catching the ball in such an instance. And it’s all about catching the ball! But it also says something important about calculable knowledge.
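The implicit ‘calculation’ our brains perform has an explicit Newtonian counterpart. A minimal sketch, with hypothetical throw values (speed, angle and release height are made up for illustration):

```python
import math

def landing_distance(v0, angle_deg, h0=0.0, g=9.81):
    """Horizontal distance a thrown ball travels before returning
    to height zero, ignoring air resistance (the 'friction' that
    Galileo identified as the source of everyday deltas)."""
    theta = math.radians(angle_deg)
    vx = v0 * math.cos(theta)
    vy = v0 * math.sin(theta)
    # Solve h0 + vy*t - g*t^2/2 = 0 for the positive flight time.
    t = (vy + math.sqrt(vy**2 + 2 * g * h0)) / g
    return vx * t

# A gentle ~6 m/s toss at 45 degrees from ground level.
print(f"{landing_distance(6.0, 45.0):.2f} m")
```

Knowing the initial conditions, the landing point falls out of a few lines of algebra; missing them, as in running onto the field mid-flight, we are back to pattern matching against history.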

Why does this balloon stay round? The law of large numbers

Thankfully, we live in a world full of history. But ironically, too much history can be a bad thing. More properly put, too specific a history about a component within a system can be a bad thing. This was made apparent by Ludwig Boltzmann in his studies of gasses and their inherent properties. While it is not only impractical but impossible to measure the exact mass and velocity of each and every constituent particle at each and every instant, it is still possible to determine their overall behavior. (He was making this proposition based on the assumed existence of then-unproven molecules and atoms.) As an example, if we have a box filled with air on one side and no air (a vacuum) on the other, we can be certain that if we lift the divider between the two halves, the particles of air will spread or ‘dissipate’ into the other side of the box. Eventually the gas in the now-expanded box will have diffused to every corner. At this point any changes will be random; there is no ‘direction’ in which the particles have to go. This is the realization of equilibrium. As we pointed out earlier, this is simply entropy reaching its ultimate goal within the limits of the system. Now let us take this box and make it a balloon. If we blow into it, the balloon will inflate and there will be an equal distribution of whatever is used to fill it. Note that the balloon is now a ‘system’. After it cools to a uniform state the system will reach equilibrium. But the balloon stays inflated. Regardless of the fact that there is no notable heat movement within the balloon, it remains inflated by the heat contained within the equilibrium. After all, we did not say that there was no heat; we just said that there was no heat movement, or rather that it has been slowed drastically. In actuality, it was realized that it is the movement of the molecules and this residual energy (i.e. the balloon at room temperature) that produces the pressure keeping the balloon inflated.*

* An interesting experiment: blow up a balloon and then place it in the freezer for a short while.

Boltzmann, as a result of this realization, was able to manipulate the temperature of a gas to control its pressure in a fixed container, and vice versa. This showed that an increase in heat actually causes more movement within the constituent particles of the gas. He found that while it was futile to try to calculate what occurs to a single particle, it was possible to represent the behavior of the whole mass of particles in the system by the use of what we now call statistical analysis. An example is shown in Figure 1. What it illustrates is that as the gas heats up, the familiar bell curve flattens, widening the spread of speeds and energies at which a given particle may be found.

Figure 1

Figure 1. Flattening Bell curves to temperature coefficients
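The flattening sketched in Figure 1 can be reproduced from the Maxwell-Boltzmann speed distribution. A quick sketch in Python (the molecular mass is an assumed, nitrogen-like value; units are SI):

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
M = 4.65e-26         # approximate mass of an N2 molecule, kg

def mb_density(v, T):
    """Maxwell-Boltzmann probability density for speed v (m/s) at
    temperature T (K): 4*pi*(m/(2*pi*k*T))^1.5 * v^2 * exp(-m*v^2/(2*k*T))."""
    a = M / (2 * K_B * T)
    return 4 * math.pi * (a / math.pi) ** 1.5 * v ** 2 * math.exp(-a * v ** 2)

# As T rises, the most probable speed shifts right and the peak
# drops: the curve flattens and widens, as in Figure 1.
for T in (100, 300, 1000):
    peak_v = math.sqrt(2 * K_B * T / M)   # most probable speed
    print(f"T={T:>5} K  peak at {peak_v:5.0f} m/s  "
          f"density {mb_density(peak_v, T):.2e}")
```

No single molecule is tracked, yet the whole population’s behavior is captured in one formula: Boltzmann’s insight in a dozen lines.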

This was a grand insight, and it has enabled a whole new branch of knowledge which, for better or worse, has helped shape our modern world. Note that I am not gushing over the virtues of statistics, but when properly used it has strong merits, and it has enabled us to see things to which we would otherwise be blind. And after all, this is what knowledge is all about, right? But wait: I have more to say about statistics. It’s not all good. As it turns out, even when used properly, it can have blind spots.

Those pesky Black Swans…

There is a neat book on the subject by a gentleman named Nassim Nicholas Taleb*. In it he artfully speaks to the improbable but possible: those events that occur every once in a while and to which statistical analysis is often blind. These events are termed ‘Black Swans’. He goes on to show that these events are somewhat invisible to normal statistical analysis in that they are improbable events on the ‘outside’ of the bell curve (termed ‘outliers’). He also goes on to indicate what he thinks is the cause. We tend to get myopic on the trends and almost convince ourselves of their dependability. We also do not like to think of ourselves as wrong or somehow flawed in our assumptions. He points out that in today’s world there is almost too much information, and that you can find stats or facts just about anywhere to fit and justify your belief in that dependability. He is totally correct; statistics is vulnerable to this. Yet I need to qualify that just a bit. It’s not statistics that is at fault. The fault lies with those using it as a tool.

* The Black Swan – Random House

Further, Taleb provides some insight into things that might serve as flags or ‘telltales’ for Black Swans. As an example, he notes that prior to drastic market declines, markets behaved in a spiky, intermittent fashion that, while still within the Gaussian norm, carried an associated ‘noise’ factor. Note that parallel phenomena exist within electronics, communications and, yes, you guessed it, the weather! This ‘noise’ tends to indicate instability, where the system is about to change in a major topological fashion to another phase. These are handy things to know. Note how they deal with the overall ‘pattern’ of behavior, not the statistical mean or even the median.

Why is this at all important?

At this point you might be asking yourself: where am I going with all of this? Well, it’s all about Big Data! As we pointed out, all knowledge is historical, even when gained by ahistorical (law) insight. Properly understanding a given system means that one needs to understand not only the statistical trends, but also the higher-level patterns of behavior that might foretell outliers and black swans. All of this requires huge amounts of data, of potentially wide varieties as well. Think of a simple example of modeling for a highway expansion. You go through the standard calculation, and then decide you want to add the local seasonal weather patterns into consideration. The computation and data store requirements have just increased exponentially. This is what the challenge of Big Data is all about. It is not intended for handling the ‘simple’ questions; it is intent on pushing out the bounds of what is deemed tractable or calculable in the sense of knowledge. It’s not that the mathematics did not exist in the past. It’s just that the capability is now within ‘everyday’ computational reach. Next let’s consider the use cases for Big Data and perhaps touch on a few actual implementations that you could run in your own data center.


II. Big Data – What’s it good for? Absolutely everything! Well, almost…

If you will recall, we spoke about dissipative systems. As it turns out, almost everything is dissipative in nature: the weather, the economy, the stock market, international political dynamics, our bodies, one could even say our own minds. Clearly, there is something to consider in all of that. The way humans behave is a particularly quirky thing. They (we) are also, as a result, the primary drive and input into many of the other systems, such as economics, politics, the stock market and, yes, even the weather. Further understanding in these areas could be, and actually has proven to be, profound.
These are important things to know, and we will talk a little later about these lofty goals. But in reality Big Data can have far more modest goals and interests. A good real-world example is retail sales. It gets back to the age-old adage, “Know your customer.” But in today’s cyber-commerce environment that’s often easier said than done. Fortunately, there are companies working in this area, and one of the real founders of the space is Google. Google is an information company at its heart. When one thinks about the sheer mass of information it possesses, it is simply boggling. Yet Google needs to leverage and somehow make sense of that data, while facing practical limits on computational power and its associated costs. Out of these competing and contradictory requirements came a parallel compute infrastructure that leverages off-the-shelf commodity systems. It was initially introduced to the public in a series of white papers describing the Google File System (GFS) and ‘sister’ technologies such as MapReduce, which provides for key/value mappings, and Bigtable, which represents structured data within the environment. This technology has since been embraced by the open source community as Apache Hadoop, whose storage layer is the Hadoop Distributed File System, or HDFS. The figure below shows the evolution of these efforts into the open source community.
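The MapReduce idea can be shown in miniature. This toy word count runs in a single process; Hadoop executes the same map/shuffle/reduce shape across thousands of machines (the input lines here are made up):

```python
from collections import defaultdict

def map_phase(lines):
    """'Map': emit a (key, value) pair for every word seen."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    """'Shuffle' groups pairs by key; 'reduce' folds each group."""
    grouped = defaultdict(int)
    for key, value in pairs:
        grouped[key] += value
    return dict(grouped)

counts = reduce_phase(map_phase(["big data", "Big Table", "data data"]))
print(counts)   # {'big': 2, 'data': 3, 'table': 1}
```

Because every map call is independent and every key reduces independently, both phases parallelize almost for free, which is precisely why the pattern scales to cluster size.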

Figure 2

Figure 2. Hadoop outgrowth and evolution into the open source space

The benefits of these developments are important, as they provide the springboard for the use of big data and data analytics in the typical enterprise IT environment. Since this inception an entire market sector has sprung up, with major vendors such as EMC and IBM but also startups such as Cloudera and MapR. This article will not go into the details of these different vendor architectures, but suffice it to say that each has its own spin and secret sauce that differentiates its approach. Feel free to look into these vendors and research others. For the purposes of this article we are concerned more with the architectural principles of Hadoop and what they mean for a data center environment. In data analytics a lot of data has to be read very fast; the longer the read time, the longer the overall analytics process. HDFS leverages parallel processing at a very low level to provide a highly optimized read-time environment.

Figure 3

Figure 3. A comparison of sequential and parallel reads

In the above we show the same 1 terabyte data file being read by a conventional serial read process versus a Hadoop HDFS cluster, which reduces the read time by an order of ten. Note that the same system type is being used in both instances; in the HDFS scenario there are just a lot more of them. Importantly, the actual analytic programming runs in parallel as well. Note also that this is just an example; the typical HDFS block size is 64 or 128 MB. This means that relatively large amounts of data can be processed extremely fast with a somewhat modest infrastructure investment. As an additional note, HDFS also provides for redundancy and resiliency by replicating the distributed data blocks within the cluster.
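The order-of-ten comparison in Figure 3 is straightforward arithmetic. A back-of-the-envelope sketch with assumed, not benchmarked, disk rates:

```python
FILE_MB = 1_000_000   # a 1 TB file, expressed in MB
DISK_RATE = 100       # assumed sustained read rate per disk, MB/s
BLOCK_MB = 128        # typical HDFS block size, MB

def read_time_s(size_mb, readers):
    """Idealized read time if the file is striped across `readers`
    disks reading in parallel (ignores network and scheduling)."""
    return size_mb / (DISK_RATE * readers)

blocks = -(-FILE_MB // BLOCK_MB)   # ceiling division: HDFS block count
print(f"{blocks} blocks of {BLOCK_MB} MB")
print(f"serial read:           {read_time_s(FILE_MB, 1):7.0f} s")
print(f"10-node parallel read: {read_time_s(FILE_MB, 10):7.0f} s")
```

Roughly three hours versus seventeen minutes for the same terabyte; the real-world gap is smaller once network and coordination overheads are counted, but the scaling logic holds.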
The main point is that HDFS leverages a distributed data footprint rather than a singular SAN environment. Very often HDFS farms are comprised entirely of direct-attached storage systems that are tightly coupled via the data center network.

How the cute little yellow elephant operates…

Hadoop is a strange name, and the cute little yellow elephant it has as an icon is even more puzzling. As it turns out, one of the key developers’ young sons had a yellow stuffed elephant that he had named Hadoop. The father decided it would make a neat internal project name. The name stuck, and the rest is history. True story, strange as it may seem.
Hadoop is not a peer-to-peer distribution framework. It is hierarchical, with certain master and slave roles within its architecture. The components of HDFS are fairly straightforward and are shown in simplified form in the diagram below.

Figure 4

Figure 4. Hadoop HDFS System Components

The overall HDFS cluster is managed by an entity known as the namenode. You can think of it as the library card index for the file system. More properly, it generates and manages the metadata for the HDFS cluster. As a file gets broken into blocks and placed into HDFS, it is the namenode that indicates where, and the namenode that tracks and replicates as required. The metadata always provides a consistent map of where specific data resides in the distributed file system. This is used not only for writing into or extracting out of the cluster, but also for data analytics, which requires reading the data for its execution. It is important to note that in first-generation Hadoop the namenode was a single point of failure. The secondary namenode in generation 1 Hadoop is actually a housekeeping process that extracts the namenode’s run-time metadata and copies it to disk in what is known as a namenode ‘checkpoint’. Recent versions of Hadoop now offer redundancy for the namenode; Cloudera, for instance, provides high availability for the namenode service.
There is a second master node known as the jobtracker. This service tracks the various jobs required to maintain and run over the HDFS environment. Both of these are master-role nodes, reflecting the hierarchical rather than peer-to-peer nature of the architecture.
In the slave role are the datanodes. These are the nodes that actually hold the data residing within the HDFS cluster; in other words, the blocks of data mapped by the namenode reside on these systems’ disks. Most often datanodes use direct-attached storage and leverage SAN only to a very limited extent. The tasktracker is a process that runs on each datanode; it is managed by, and reports back to, the jobtracker for the various executions that occur within the Hadoop HDFS cluster.
And lastly, one of these nodes, referred to as the ‘edge node’, will have an ‘external’ interface that exposes the HDFS environment so that PCs running the Hadoop HDFS client can be provided access.

Figure 5

Figure 5. HDFS Data Distribution & Replication

HDFS is actually fairly efficient in that it incorporates replication into the write process. As shown above, when a file is ingested into the cluster it is broken up into a series of blocks. The namenode utilizes a distribution algorithm to map where the actual data blocks will reside within the cluster. An HDFS cluster has a default replication factor of three. This means that each individual block will be replicated three times and then placed algorithmically. The namenode in turn develops a metadata map of all blocks resident within the distributed file system. This metadata is a key requirement for the read function, which is itself a requirement for analytics.
If a datanode fails within the cluster, HDFS will ‘respawn’ the lost data to meet the distribution and replication requirements. All of this means east/west data movement, but it also means consistent distribution and replication, which is critical for parallel processing.
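What the replication factor means for capacity planning can be sketched quickly (the block size and file size below are illustrative, matching the typical 128 MB block mentioned earlier):

```python
BLOCK_MB = 128      # typical HDFS block size
REPLICATION = 3     # HDFS default replication factor

def cluster_footprint(file_mb):
    """Return (logical blocks, replicated blocks, raw MB consumed)
    for a file of `file_mb` megabytes under default replication."""
    blocks = -(-file_mb // BLOCK_MB)   # ceiling division
    return blocks, blocks * REPLICATION, file_mb * REPLICATION

blocks, replicas, raw_mb = cluster_footprint(1_000_000)   # a 1 TB file
print(f"{blocks} blocks -> {replicas} replicated blocks, "
      f"{raw_mb / 1_000_000:.1f} TB of raw disk")
```

Every logical terabyte ingested consumes three terabytes of raw disk across the cluster, which is the price paid for the resiliency and parallel-read benefits described above.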
HDFS is also rack-aware. By this we mean that the namenode can be programmed to recognize that certain datanodes share a rack, and to take this into consideration during the block distribution and replication process. This awareness is not automatic; it must be configured, typically by batch or Python script. Once it is done, however, it allows the placement algorithm to put the first data block on a certain rack and then place the two replicated blocks together in a separate rack. As shown in the figure below, data blocks A and B are distributed evenly across the cluster racks.

Figure 6

Figure 6. HDFS ‘Rack Awareness’

Note that while the default replication factor for HDFS is three, it can be increased or decreased at the directory or even the file level. As the replication factor is adjusted for a certain data set, the namenode ensures that data is replicated, spawned or deleted according to the adjusted value.
HDFS uses pipelined writes to move data blocks into the cluster. In Figure 7, an HDFS client executes a write for file.txt; as an example, the user might use the copyFromLocal command. The request is sent to the namenode, which responds with a series of metadata telling the client where to write the data blocks. Datanode 1 is the first in the pipeline, so it receives the request and sends a ready request to nodes 7 and 9. Nodes 7 and 9 respond, and the write process begins by placing the data block on datanode 1, from which it is pipelined to datanodes 7 and 9. The write is not complete until all datanodes respond with a write success. Note that most data center topologies utilize a spine-and-leaf design, meaning that most of the rack-to-rack data distribution must flow up and through the data center core nodes. In Avaya’s view this is highly inefficient and can lead to significant bottlenecks that limit the parallelization capabilities of Hadoop.

Figure 7

Figure 7. HDFS pipelined writes
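The pipelined write in Figure 7 can be modeled as a toy, with hypothetical node names; the real protocol (ready/ack packets, partial-failure handling, pipeline recovery) is considerably more involved:

```python
def pipelined_write(block, pipeline, storage):
    """Toy HDFS write: the client hands the block to the first
    datanode only; each node forwards it downstream, and the write
    succeeds only when every node in the pipeline holds a copy."""
    for node in pipeline:                 # data flows dn1 -> dn7 -> dn9
        storage.setdefault(node, []).append(block)
    acks = [block in storage[node] for node in pipeline]
    return all(acks)                      # success requires every ack

cluster = {}                              # node name -> list of blocks held
ok = pipelined_write("file.txt#blkA", ["dn1", "dn7", "dn9"], cluster)
print("write complete:", ok)
print("replicas held by:", sorted(cluster))
```

The point of the pipeline is that the client sends each block over the network once; the replication traffic is the east/west, rack-to-rack flow the surrounding text is concerned with.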

Additionally, recent recommendations are to move to 40 Gbps interfaces for this purpose. These interfaces are most certainly NOT cheap. With the leaf-and-spine approach, rack-to-rack growth requires a large cap-ex outlay at each expansion. Suddenly, the prospect of Big Data and data science for the common man is becoming a myth! The network starts to become the key investment as the cluster grows, and with big data, clusters always grow. We at Avaya have been focusing on this east/west capacity issue within the data center top-of-rack environment.
Reads within the HDFS environment happen in a similar fashion. When the Hadoop client requests to read a given file, the namenode responds with the appropriate metadata so that the client can in turn request the separate data blocks from the HDFS cluster. It is important to note that the metadata for a given block is an ordered list. In the diagram below the namenode responds with metadata for data block A as being on datanodes 1, 7 and 9. The client will request the block from the first datanode in the list, and only after a failed response will it attempt to read from the other datanodes.

Figure 8

Figure 8. HDFS ordered reads
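The ordered-read fallback just described can be mirrored in a short sketch. Again this is a toy model under stated assumptions: the datanode names and the `down_nodes` set are invented for illustration.

```python
# Toy sketch of an HDFS ordered read: the namenode returns an ordered
# replica list and the client tries each datanode in turn, falling
# back to the next replica only after a failure.

def read_block(block_id, replica_list, down_nodes):
    """Try replicas in namenode order; fall back only on failure."""
    for node in replica_list:
        if node not in down_nodes:
            return node                    # data served by this replica
    raise IOError("all replicas unreachable for " + block_id)

# Block A lives on datanodes 1, 7 and 9; here datanode 1 is down,
# so the client falls back to the second replica in the list.
served_by = read_block("blk_A", ["dn1", "dn7", "dn9"], down_nodes={"dn1"})
```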

Another important note is that the read requests for data blocks B & C occur in parallel. Only after all data blocks have been confirmed and acknowledged is a read request deemed complete. Finally, similar to the write process, any rack-to-rack east/west flows must traverse the core switch in a typical spine and leaf architecture. It is important to note, however, that most analytic processes will not utilize this type of methodology for ‘reading’. Instead, ‘jobs’ are sent in and partitioned into the environment, where the read and compute processes occur on the local datanodes and are then reduced into an output from the system as a whole. This provides the true ‘magic’ of Hadoop, but it requires a relatively large east/west (rack-to-rack) capacity, and that capacity only grows as the cluster grows.
We at Avaya have anticipated this change in data center traffic patterns. As such we have taken a much more straightforward approach, which we call Distributed Top of Rack or “D-ToR”. ToR switches are directly interconnected using very high bandwidth backplane connections. These 80G+ connections provide ultra-low latency, direct connections to other ToRs to address the expected growth. The ToRs are also connected to the upstream core, which allows for the use of L3 and IP VPN services to ensure security and privacy.

Figure 9

Figure 9. Distributed Top of Rack benefits for HDFS

Note that the D-ToR approach is much better suited for high capacity east/west data flows rack to rack within the data center. Growth of the cluster no longer depends on continual investment in the leaf-spine topology; new racks are simply extended into the existing fabric mesh. Going further, by using front port capacity, direct east/west interconnects between remote data centers can be created. We refer to this as Remote Rack to Rack. One of the unseen advantages of D-ToR is the reduction of north/south traffic. Where many architects were looking at upgrading to 40G or even 100G uplinks, Avaya’s approach negates this requirement by allowing native L2 east/west server traffic to stay at the rack level. The ports required for this are already in the ToR switches. This provides relief to these strained connections. It also allows for seamless expansion of the cluster without the need for continual capital investment in high speed interfaces.
Another key advantage of D-ToR is the flexibility it provides:
• Server to server connections, in rack, across rows, building to building or even site to site! The architecture is far superior to other approaches in supporting advanced clustering technologies such as Hadoop HDFS.
• Traffic stays where it needs to be, reserving the north/south links for end user traffic or for advanced L3 services. Only traffic that classifies as such needs to traverse the north/south paths.
• The end result is a vast reduction in the traffic on those pipes as well as a significant performance increase for east/west data flows, at far lesser cost.

Figure 10

Figure 10. Distributed Top of Rack modes of operation

Avaya’s Distributed Top of Rack can operate in two different modes:
• Stack-Mode can dual connect up to eight D-ToR switches. The interconnect is 640 Gb without losing any front ports! Additionally, dual D-ToR stacks can be used to scale up to 16 switches, giving a maximum east/west profile of 10 Tb/s.
• Fabric-Mode creates a “one hop” mesh which can scale up to hundreds of D-ToR switches! The port count tops out at more than ten thousand 10 Gig ports, with a maximum east/west capacity of hundreds of terabits.

Figure 11

Figure 11. A Geo-distributed Top of Rack environment

Avaya’s D-ToR solution can scale in either mode. Whether the needs are small, large or unknown, D-ToR and Fabric Connect provide unmatched scale, flexibility and, perhaps most importantly, the capability to solve the challenges we face, even the unknown ones. As the HDFS farm grows, the seamless expansion capability of Avaya’s D-ToR environment can accommodate it without major architectural design changes.
Another key benefit is that Avaya has solved the complex PCI and HIPAA compliance issues without having to physically segment networks or add layers of firewalls. The same can be said for any sensitive data environment that might be using Hadoop, such as patient medical records, banking and financial information, smart power grid or private personal data. Avaya’s Stealth networking technology (referred to in the previous “Dark Horse” article) can keep such networks invisible and self-enclosed. As a result, any attack or scanning surfaces of the data analytics network are removed. The reason for this is that Fabric Connect as a technology is not dependent upon IP as a protocol to establish an end-to-end service path. This removes one of the primary scaffolds for all espionage and attack methods. As a result the Fabric Connect environment is ‘dark’ to the IP protocol: IP scanning and other topological scanning techniques will yield little or no information.

Using MapReduce to extract meaningful data

Now that we have the data effectively stored and retrievable, we will obviously want to exercise certain queries against the data and hopefully receive meaningful answers. MapReduce is the original methodology documented in the Google white papers. Note that it is also a utility within HDFS, used to chunk and create metadata for the stored information within the HDFS environment. Data can also be analyzed with MapReduce to extract meaningful secondary data, such as hit counts and trends, which can serve as the historical foundation for predictive analytics.

Figure 12

Figure 12. A Map Reduce job

Figure 12 shows a MapReduce job being sent into the HDFS environment. The HDFS cluster runs the MapReduce program against the data set and provides a response back to the client. Recall that HDFS leverages parallel read/write paths; MapReduce builds on this foundation. As a result, east/west capacity and latency are important considerations in the overall solution.
• Avaya’s D-ToR solution provides easy and consistent scaling of the rack-to-rack environment as the Hadoop farm grows.

The components of MapReduce are relatively simple.

First there is the Map function, which provides the metadata context within the cluster. It performs an independent record transformation that is a representation of the actual data, including deletions and replications to the system. For analytics, the function is performed against key/value (K,V) pairs. The best way to describe it is with an example. Let’s say we want to see how often a given word, say ‘cow’, appears in a document or a given set of documents. The word becomes the ‘key’. Every time the map function ‘reads’ the word cow it ticks a ‘value’ of 1. As the function proceeds through the read job these ticks are appended into a list of key/value pairs such as cow,31, meaning there are 31 instances of the word ‘cow’ in the document or set of documents. For this type of job the Reduce function is a method to aggregate the results from the Map phase and provide a list of key/value pairs that are to be construed as the answer to the query.
Finally, there is the framework function, which is responsible for scheduling and re-running tasks. It also provides utility functions such as splitting the input, which becomes more apparent in the figure below; this refers to the chunking functionality that we spoke of earlier as data is written into HDFS. Typically, these queries are constructed into a larger framework. The figure shows a simple example of a query framework.

Figure 13

Figure 13. A simple Map Reduce word count histogram

Above we see a simple word count histogram, which is the exact process we talked about previously. The upper arrow shows data flow across the MapReduce process chain. As data is ingested into the HDFS cluster it is chunked into blocks as previously covered. The map function performs its read against the individual blocks of data. For purposes of optimization there are copy, sort and merge functions that provide the ability to aggregate the resulting lists of key/value pairs. This is referred to as the shuffle phase, and it is accomplished by leveraging east/west capacity within the HDFS cluster. From this the reduce function reduces the received key/value outputs to a single statement (i.e. cow,31).
In the example above we show a construct to count three words: Cow, Barn and Field. The details for two of the key value queries are shown; the third is simply an extension of what is shown. From this we can infer that among these records ‘cow’ appears with ‘field’ more often than with ‘barn’. This is obviously a very simple example with no real practical purpose, unless you are analyzing dairy farmer diaries. But it illustrates the potential of the clustering approach in facilitating data farms that are well suited to the process of analytics, which leans very heavily on read performance.
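The map, shuffle and reduce phases described above can be mirrored in a few lines of plain Python, with no Hadoop involved. The sample "blocks" are illustrative stand-ins for HDFS data blocks.

```python
# A minimal map/shuffle/reduce word count mirroring the (key, value)
# flow of the histogram example. Pure Python; no Hadoop required.
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word read."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group values by key, as the copy/sort/merge (shuffle) step does."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Aggregate each key's list of ticks into a single count."""
    return {key: sum(values) for key, values in grouped.items()}

# Two 'blocks' of data, mapped independently then shuffled and reduced.
blocks = ["the cow left the barn", "the cow reached the field"]
pairs = [pair for block in blocks for pair in map_phase(block)]
counts = reduce_phase(shuffle(pairs))
```

In a real cluster the map calls run on the datanodes holding each block, and the shuffle step is what generates the east/west traffic discussed earlier.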
In another more practical example, let’s say that we want to implement an analytics function for customer relationship management. We would want to know how often key words such as ‘refund’ or ‘dissatisfied’, or even terms like ‘crap’ and ‘garbage’, come up in queries of emails, letters or even transcripts of voice calls. Such information is obviously valuable and can give insight into customer satisfaction levels.
As one might guess, things could very quickly get unwieldy dealing with large numbers of atomic key/value queries. YARN, which stands for ‘Yet Another Resource Negotiator’, allows for the building of complex tasks that are represented and managed by application masters. The application master starts and recycles tasks and also requests resources from the YARN resource manager. As a result a cycling, self-managing job can be run. Weave is an additional developing overlay that provides more extensive job management functions.

Figure 14

Figure 14. Using Hadoop and Mahout to analyze for credit fraud

The figure above illustrates a practical use of the technology. Here we are monitoring incoming credit card transactions for flagging to analysts. Transaction data will be parsed into flagged key/value pairs; indeed there may be dozens of key/value pairs that are part of this initial operation. This provides the consistent input into the rest of the flow. LDA scoring, based on Latent Dirichlet Allocation, allows for a comparative function against the normative set. It can also play a predictive role. This step provides a scoring function on the generated key/value pairs; at this point LDA provides a percentile of anomaly for a transaction. From there further logic can then impact a given merchant score.
All of this is based on yet another higher level construct known as Mahout. Mahout provides an orchestration and API library set that can execute a wide variety of operations, such as LDA.
Examples are Matrix Factorization, K-Means and Fuzzy K-Means, Logistic Regression, Naïve Bayes and Random Forest, all of which are in essence packaged algorithmic functions that can be performed against the resident data for analytical and/or predictive purposes. Further, these can be cycled, as in the example above, which would operate on each fresh batch presented to it.
Below is a quick definition of each set of functions for reference:

Matrix Factorization –
As its name implies, this function involves factorizing matrices, which is to say finding two or more matrices that, when multiplied, yield the original matrix. This can be used to discover latent features between entities. (Factoring into more than two matrices requires the use of tensor mathematics, which is more complicated.) A good example of its use is in movie popularity and ratings matching, as done by Netflix. Film recommendations can be made fairly accurately based on identifying these latent features: a subscriber’s ratings, their interests and the ratings of those with similar interests can yield an accurate set of recommended films that the subscriber is likely to enjoy.
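A toy version of the Netflix-style use case can be sketched with a single latent factor fit by gradient descent. This is a minimal sketch under stated assumptions: real recommenders use many factors, regularization and far more data, and the ratings matrix here is invented.

```python
# Toy latent-factor factorization of a ratings matrix (one factor,
# plain gradient descent). Illustrative only, not a production recommender.
import random

def factorize(R, steps=5000, lr=0.01):
    """Find user vector u and item vector v so that u[i]*v[j] ~ R[i][j],
    fitting only the known (non-None) ratings."""
    random.seed(0)
    n, m = len(R), len(R[0])
    u = [random.random() for _ in range(n)]
    v = [random.random() for _ in range(m)]
    for _ in range(steps):
        for i in range(n):
            for j in range(m):
                if R[i][j] is None:       # unknown rating: skip
                    continue
                err = R[i][j] - u[i] * v[j]
                u[i] += lr * err * v[j]   # gradient step on each factor
                v[j] += lr * err * u[i]
    return u, v

# Two users with proportional tastes; user 0's third rating is missing.
R = [[4.0, 2.0, None],
     [2.0, 1.0, 1.0]]
u, v = factorize(R)
predicted = u[0] * v[2]    # fill in the missing rating from latent factors
```

Because user 0's known ratings are exactly twice user 1's, the single latent factor recovers the missing rating as roughly 2.0.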

K-Means –
K-Means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. K-Means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells. These cells are based on common attributes or features that have been identified. Uses for this include learning common aspects or attributes of a given population so that it can be subdivided or partitioned into various sub-populations. From there, techniques like logistic regression can be run on the sub-populations.
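The assign-then-recompute loop at the heart of K-Means fits in a few lines. This sketch works in one dimension for readability; the points and starting means are illustrative.

```python
# Minimal 1-D K-Means: assign each point to the nearest mean, then
# recompute each mean from its cluster; repeat for a fixed number of rounds.
def kmeans_1d(points, means, rounds=20):
    clusters = {}
    for _ in range(rounds):
        clusters = {m: [] for m in range(len(means))}
        for p in points:
            nearest = min(range(len(means)), key=lambda m: abs(p - means[m]))
            clusters[nearest].append(p)
        # New mean of each cluster (keep the old mean if a cluster empties).
        means = [sum(c) / len(c) if c else means[m]
                 for m, c in clusters.items()]
    return means, clusters

# Two obvious groups of points; deliberately poor starting means.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
means, clusters = kmeans_1d(points, [0.0, 5.0])
```

The two means migrate to the centers of the two groups, which in one dimension is exactly the Voronoi partition described above.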

Fuzzy K-Means –
K-Means clustering is what is termed ‘hard clustering’. In hard clustering, data is divided into distinct clusters, where each data element belongs to exactly one cluster and only one. In fuzzy clustering, also referred to as soft clustering, data elements can belong to more than one cluster, and associated with each element is a set of membership levels. These indicate the strength of the association between that data element and a particular cluster. Fuzzy clustering is a process of assigning these membership levels, and then using them to assign data elements to one or more clusters. A particular data element can then be rated as to its strongest memberships within the partitions that the algorithm develops.

Logistic Regression –
In statistics, logistic regression is a type of probabilistic statistical classification model. Logistic regression measures the relationship between a categorical dependent variable and one or more independent variables, which are usually (but not necessarily) continuous, by using probability scores as the predicted values of the dependent variable. Logistic regression is hence used to analyze probabilistic relationships between different variables within a particular set of data.
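As a rough illustration, here is a minimal logistic regression with one continuous predictor and a binary outcome, fit by stochastic gradient ascent on the log-likelihood. The data, learning rate and step count are illustrative assumptions.

```python
# Minimal logistic regression sketch: learn w, b so that
# sigmoid(w*x + b) approximates P(y = 1 | x).
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, steps=2000, lr=0.1):
    w, b = 0.0, 0.0
    for _ in range(steps):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            w += lr * (y - p) * x    # gradient of the log-likelihood
            b += lr * (y - p)
    return w, b

# Outcomes flip from 0 to 1 as x grows.
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0,   0,   0,   1,   1,   1]
w, b = fit_logistic(xs, ys)
p_low  = sigmoid(w * 1.0 + b)    # probability of y=1 at a small x
p_high = sigmoid(w * 4.0 + b)    # probability of y=1 at a large x
```

The fitted model outputs a probability score for any x, which is precisely the "probability scores as predicted values" behavior described above.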

Naïve Bayes –
In machine learning environments, naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes’ theorem with strong (naive) assumptions of independence between the features. Naive Bayes is a popular baseline method for categorizing text, the problem of judging documents as belonging to one category or another (such as spam or legitimate, classified terms, etc.), with word frequency as a large part of the features considered. This is very similar to the usage and context information provided by Latent Dirichlet Allocation.
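The spam-versus-legitimate example can be made concrete with a tiny naive Bayes classifier over word frequencies. The four training documents are invented for illustration, and add-one (Laplace) smoothing is used so unseen words do not zero out a class.

```python
# Tiny naive Bayes text classifier over word counts, with Laplace smoothing.
import math
from collections import Counter

def train(docs):
    """docs: list of (label, text). Returns per-class word counts,
    per-class word totals and per-class document counts (priors)."""
    counts, totals, priors = {}, Counter(), Counter()
    for label, text in docs:
        priors[label] += 1
        counts.setdefault(label, Counter()).update(text.split())
        totals[label] += len(text.split())
    return counts, totals, priors

def classify(text, counts, totals, priors):
    """Pick the label maximizing log P(label) + sum log P(word | label),
    treating words as independent (the 'naive' assumption)."""
    vocab = set(w for c in counts.values() for w in c)
    best, best_score = None, float("-inf")
    for label in counts:
        score = math.log(priors[label])
        for w in text.split():
            score += math.log((counts[label][w] + 1) /
                              (totals[label] + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

docs = [("spam", "win money now"), ("spam", "free money offer"),
        ("ham", "meeting agenda attached"), ("ham", "lunch meeting today")]
model = train(docs)
label = classify("free money", *model)
```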

Random Forest –
Random Forests are another method for learning and classification of large sets of data, from which further regression techniques can be used. Random Forests are in essence ensembles of decision trees that are induced in a process known as training. Data is then run through the forest, and various decisions are made to learn and classify the data. When building out large forests, subsets of decision trees can be formed; weights can then be given to each subset, and from these weighted results further decisions can be made.

The end result of all of these methods is a very powerful environment capable of machine learning phenomena. The best part is that it is accomplished with off-the-shelf technologies. No supercomputer required; just a solid distributed storage/compute framework and superior east/west traffic capacity in the top of rack environment. Big Data and analytics can open our eyes to relationships between phenomena that we would otherwise be blind to. It can even provide us insight into causal relationships. But here we need to tread a careful course. Just because two features are related in some way does not necessarily mean that one causes the other.

A word of caution –

While all of this is extremely powerful, the last comments above should raise a flag. Even with lots of data and all of these fancy mathematical tools at your disposal, you can still make some very bad decisions if your assumptions about the meaning of the data are somehow flawed. In other words, good data plus good math with bad assumptions will still yield bad decisions. We also need to remember Mr. Taleb and his black swans. Just because a system has behaved in the past within a certain pattern or range does not mean that it will continue to do so ad infinitum. Examples of these types of systems range from stock exchanges to planetary orbits to our very own bodies! In essence, most systems exhibit this behavior. Does that mean that all of the powerful tools referred to above are rendered invalid and impotent? Absolutely not. But we must remember that knowledge without context is somewhat useless, and knowledge with incorrect context is worse than ignorance. Why? Because we are confident about what it tells us. We like sophisticated mathematical tools that tell us, in oracle-like fashion, the secrets of knowledge within a given system. We have confidence in their findings because of their accuracy. But no amount of accuracy will make an incorrect assumption correct. This is where trying to prove ourselves wrong about our assumptions is very important. One might wonder why there are so many methods that sometimes appear to do the same thing but from a different mathematical perspective. The reason is that these various methods are often run in parallel to yield comparative data sets with multiple replicated studies. By generating large populations of comparative sets, another hierarchy of trends and relationships becomes visible. Consistency of the sets will generally (but not always) indicate sound assumptions about the original data.
Wild variations between sets in turn will usually indicate that something is flawed and needs to be revisited. Note that we are now talking about analyzing the analytical results. But this is not always done. Why? Because many times we don’t want to prove our own assumptions wrong. We want them to be right… no let’s go further – we need them to be right.
A good example is the market crash of 2007-2009. Many folks don’t know it, but there is a little equation that actually holds a portion of the blame. Well, not really. As it turns out, equations are a lot like guns: they are only dangerous when someone dangerous is using them. The equation in question is the Black-Scholes equation. Some have called it one of the most beautiful equations in mathematics; it is a very elegant piece. Others would call it that because it had another name, the Midas equation. It made folks a ton of money! That is, until…
The Black-Scholes equation was an attempt to bring rationality to the futures market. This sounds good, but it is based on the concept of creating a systematic method of establishing a value for options before they mature. This might not be a bad thing if your assumptions about the market are correct. But if there are things that you don’t know (and there always are), then those blind spots could affect your assumptions in an adverse way. As an example, if you are trading on the futures of a given commodity and something happens in the market to affect demand that you did not consider, or perhaps you weighed its impact incorrectly, then guess what… That’s right, you lose money!
In the last market crash that commodity was real estate. As one looks into the detailed history of the crash, we can see multiple flawed assumptions that built upon one another. Then, to compound the problem, the market began to create obscurity through the use of blocks or bundles of mortgages that had absolutely no window into the risk factors associated with those assets. While buying blind, the banks were of the mind that foreclosures would be a minority and that a foreclosed home could always be sold for the loan value or perhaps greater. To the banks it seemed that they couldn’t lose. We all know what happened. Even though the mathematics was elegant and accurate, the conclusions and the advice given as a result were drastically flawed and cost the market billions. The lesson: Big Data can lead us astray. It reminds us of the flawed premise of Laplace’s rather arrogant comment back in 1814. There is always something we don’t know about a given system, such as a scope of history that we do not know, or levels of detail that are unknown to us or perhaps even beyond our measurement. This does not disable data analytics, but it puts a limit on its tractability in dealing with real world systems. In the end Big Data does not replace good judgment, but it can complement it.

So how do I build it and how do I use it?

Hadoop is actually fairly easy to install and set up, and the major vendors in this space have gone much further in making it easy and manageable as a system. But there are a few general principles that should be followed. First, be sure to size your Hadoop cluster and maintain that sizing ratio as the cluster grows. The basic formula is 4 x D, where D is the data footprint to be analyzed. Now one might say, ‘What? I have to multiply my actual storage requirements by a factor of four!?’ But do not forget about the MapReduce flow. The shuffle phase requires datanodes that will act as transitory nodes for the job flow, and this extra space needs to be available. So while it might be tempting to shave this number, it’s best not to. Below are a few other design recommendations to consider.

Figure 15

Figure 15. Hadoop design recommendations
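The 4 x D rule above reduces to simple arithmetic. This sketch works one example through; the 100 TB footprint and the per-node capacity are illustrative values, not vendor recommendations.

```python
# Quick check of the 4 x D sizing rule: raw cluster capacity for a
# given analyzed data footprint, and the datanode count that implies.

def hdfs_cluster_size_tb(data_footprint_tb, factor=4):
    """Raw capacity needed: replication plus shuffle/transitory space."""
    return data_footprint_tb * factor

def datanodes_needed(cluster_tb, per_node_tb):
    """Whole datanodes required at a given per-node raw capacity."""
    return -(-cluster_tb // per_node_tb)     # ceiling division

cluster_tb = hdfs_cluster_size_tb(100)   # 100 TB to analyze -> 400 TB raw
nodes = datanodes_needed(cluster_tb, 48) # e.g. 12 x 4 TB drives per node
```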

Another issue to consider is the sizing of the individual datanodes within the HDFS cluster. This is a soft set of recommendations that greatly depends on the type of analytics. In other words, are you looking to gauge customer satisfaction, or model climate change or the stock market? These are many degrees of complexity apart from one another. So it is wise to think about your end goals for the technology. Below is a rough sizing chart that provides some higher level guidance.

Figure 16

Figure 16. Hadoop HDFS Sizing Guidelines

Beyond this, it is wise to refer to the specific vendor’s design guidelines and requirements, particularly in the areas of high availability for master node services.
Another question that might be asked is “How do I begin?” In other words, you have installed the cluster and are ready for business but, what to do next? Actually this is very specific to usage and expectations. But we can at least boil it down to a general cycle of data ingestion, analytics and corresponding actions. This is really very similar to well-known systems management theory. A diagram of such a cycle is shown below.

Figure 17

Figure 17. The typical data analytics cycle

Aside from the workflow detail, it cannot be stressed enough: “Know your data”. If you do not know it, then make sure that you are working very closely with someone who does. The reason for this is simple. If you do not understand the overall meaning of the data sets that you are analyzing, then you are unlikely to be able to identify the initial key values that you need or should be focusing on. Often data analytics is done on a team basis, with individuals from various backgrounds within the organization; the data analytics staff works in concert with this disparate group to identify the key questions that need to be asked as well as the key data values that will help lead toward the construction of an answer to the query. Remember that comparative sets allow for the validation of both the assumptions made in the data model and the techniques being used to extract and analyze the data sets in question. While it is tempting to jump to conclusions on initial findings, it is always wise to do further studies to validate those findings, particularly if a key strategic decision will result from the analysis.

In summary

We have looked at the history of analytics from its founding fathers to its current state. Throughout, many things have remained consistent. This is comforting. Math is math. Four plus four in Galileo’s time gave the same answer as it does today. But we must remember that math is not the real world; it is merely our symbolic representation of it. This was shown by the various discoveries on the aspects of randomness, chaos and infinitudes. We have gone further in the article to show that the proper manipulation of large sets of data, placed against a historical context, can yield insights that might not otherwise be apparent. Recent trends are to establish methods to visualize the data and the resulting analytics by the use of graphic displays. Companies such as Tableau provide the ability to generate detailed charts and graphs that give a visual view of the results of the various analytic functions noted above. Now a long table or spreadsheet of numbers becomes a visible object that can be manipulated and conjectured against. Patterns and trends can much more easily be picked out and isolated for further analysis. These and other trends are accelerating in the industry and becoming more and more available to the common user or enterprise.
We also talked about the high east/west traffic profiles that are required to support Hadoop distributed data farms and the work that Avaya is doing to facilitate this in the data center top of rack environment. We talked about the relatively high costs of leaf-spine architectures and Avaya’s approach to the top of rack environment as the data farm expands. Lastly, we spoke to the need for security in data analytics, particularly in the analysis of credit card or patient record data. Avaya’s Stealth Networking Services can effectively provide a cloak of invisibility over the analytics environment. This creates a Stealth Analytics environment in which the analysis of sensitive data can occur with minimal risk.
We also looked at some of the nuts and bolts of analytics and how, once data is teased out, it may be analyzed. We spoke to various methods and procedures, which are often worked in concert to yield comparative data sets. These comparative data sets can then be used to check assumptions made about the data and hence the analytic results; they help us measure the validity of the analytics that have been run, or more importantly the assumptions we have made. In this vein we wrapped up with a word of warning as to the use of big data and data analytics. It is not a panacea, nor is it a crystal ball, but it can provide us with vast insights into the meaning of the data that we have at our fingertips. With these insights, if the foundational assumptions are sound, we can make better informed decisions. It can also enable us to process and leverage the ever growing data that we have at our disposal at the pace required for it to be of any value at all! Yet, in all of this we are only at the beginning of the trail. As computing power increases and our algorithmic knowledge of systems grows, the technology of data science will reap larger and larger rewards. But it is likely never to provide the foundation for Laplace’s dream.


‘Dark Horse’ Networking – Private Networks for the control of Data

September 14, 2013

Next Generation Virtualization Demands for Critical Infrastructure and Public Services



In recent decades communication technologies have realized significant advancement. These technologies now touch almost every part of our lives, sometimes in ways that we do not even realize. As this evolution has and continues to occur, many systems that have previously been treated as discrete are now networked. Examples of these systems are power grids, metro transit systems, water authorities and many other public services.

While this evolution has brought a very large benefit to both those managing and those using the services, there is the rising spectre of security concerns and the precedent of documented attacks on these systems. This has brought about strong concerns about this convergence and what it portends for the future. This paper will begin by discussing these infrastructure environments, which, while varied, have surprisingly common theories of operation and actually use the same set or class of protocols. Next we will take a look at the security issues and some of the reasons why they exist. We will provide some insight into some of the attacks that have occurred and what impacts they have had. Then we will discuss the traditional methods for mitigation.

Another class of public services is more focused on the consumer space but can also be used to provide services to ‘critical’ devices. This mix and match of ‘cloud’ within these areas is causing a rise in concern among security and risk analysts. The problem is that the trend is well under way. It is probably best to start by examining the challenges of a typical metro transit service. Obviously the primary need is to control the trains and subways. These systems need to be isolated, or at the very least very secure. The transit authority also needs to provide fare services, employee communications and, of course, public internet access for passengers. We will discuss these different needs and the protocols involved in providing these services. Interestingly, we will see some paradigms of reasoning as we do this review, and these will in turn reveal many of the underlying causes for vulnerability. We will also see that as these different requirements converge onto common infrastructures, conflicts arise that are often resolved by completely separate network infrastructures. This leads to increasing cost and complexity, as well as increasing risk of the two systems being linked at some point in a way that would be difficult to determine. It is here that the backdoor of vulnerability can occur. Finally, we will look at new and innovative ways to address these challenges and how they can take our infrastructure security to a new level without abandoning the advancement that remote communications has offered. The fact is, sometimes you do NOT want certain systems and/or protocols to ‘see’ one another. Or at the very least there is the need to have very firm control over where and how they can see one another and inter-communicate. So, this is a big subject and it straddles many different facets. Strap yourself in; it will be an interesting ride!

Supervisory Control and Data Acquisition (SCADA)

Most process automation systems are based on a closed loop control theory. A simple example of a closed loop control theory is a gadget I rigged up as a youth. It consisted of a relay that would open when someone opened the door to my room. The drop in voltage would trigger another relay to close causing a mechanical lever to push a button on a camera. As a result I would get a snapshot of anyone coming into my room. It worked fairly well once I worked out the kinks (they were all on the mechanical side by the way). With multiple siblings it came in handy. This is a very simple example of a closed loop control system. The system is actuated by the action of the door (data acquisition) and the end result is the taking of a photograph (control). While this system is arguably very primitive it still demonstrates the concept well and we will see that the paradigm does not really change much as we move from 1970’s adolescent bedroom security to modern metro transit systems.

In the automation and control arena there are a series of defined protocols, both standards based and proprietary in nature. These protocols are collectively referred to as SCADA, which is short for Supervisory Control and Data Acquisition. Examples include Modbus, BACnet and LonWorks, as well as industry standards such as IEC 61131 and IEC 60870-5-101 (IEC 101). Using the established simple example of a closed loop control, we will take the concept further by looking at a water storage and distribution system. The figure below shows a simple schematic of such a system and demonstrates the concepts of SCADA effectively. We will then use that basis to extend the concept further to other uses.


Figure 1. A simple SCADA system for water storage and distribution

The figure above illustrates a closed loop system. Actually, it is comprised of two closed loops that exchange state information between them. The central element of the system is the water tank (T). Its level is measured by sensor L1 (which could be as simple as a mechanical float attached to a potentiometer). As long as the level of the tank is within a certain range, it will keep the LEVEL trace ON. This trace is provided to a device called a Programmable Logic Controller (PLC) or Remote Terminal Unit (RTU). In the case of the diagram it is provided to PLC2. As a result, PLC2 sends a signal to a valve servo (V1) to keep it in the OPEN state. If the level were to fall below a defined value in the tank, the PLC would turn the valve off. There may be additional 'blow off' valves that the PLC might invoke if the level of the tank grew too high, but this would be a precautionary emergency action. In normal working conditions this would be handled by the other closed loop. In that loop there is a flow meter (F1) that provides feedback to PLC1. As long as PLC1 is receiving a positive flow signal from the sensor, it will keep the pump (P1) running and hence feeding water into the system. If the rate on F1 falls below a certain value, it is determined that the tank is nearing full and PLC1 will tell the pump to shut down. As an additional precaution there may be an alternate feed from sensor L1 that will raise a flag to shut down the pump if the tank level reaches full. This is known as a second loop failsafe. As a result, we have a closed loop, self-monitoring system that in theory should run on its own without any human intervention. Such systems do. But they are usually monitored by Human Machine Interfaces (HMI). In many instances these will literally show the schematic of the system with a series of colors (for example, yellow for off, orange and red for warning and alarm, green for running).
In this way, an operator has visibility into the 'state' of the working system. HMIs can also offer human control of the system. As an example, an operator might shut off the pump and override the valve closed to drain the system for maintenance. In that example the closed loop would be extended to include a human who could provide an 'ad hoc' input to the system.
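The two loops described above can be sketched in a few lines of code. This is only an illustrative model of Figure 1, not real PLC ladder logic, and the threshold values are invented for the example.

```python
# A minimal sketch of the two closed loops in Figure 1. The sensor and
# actuator names mirror the diagram; the numeric thresholds are invented
# for illustration only.

LOW_LEVEL = 20.0    # tank level (%) below which PLC2 closes valve V1
FULL_LEVEL = 95.0   # second-loop failsafe: L1 reading that stops pump P1
MIN_FLOW = 0.5      # flow on F1 (l/s) below which the tank is near full

def plc2_valve_state(level_l1: float) -> str:
    """PLC2: keep the outlet valve V1 OPEN while the tank holds water."""
    return "OPEN" if level_l1 >= LOW_LEVEL else "CLOSED"

def plc1_pump_state(flow_f1: float, level_l1: float) -> str:
    """PLC1: run pump P1 while flow is positive; the alternate L1 feed is
    the second-loop failsafe that stops the pump when the tank is full."""
    if level_l1 >= FULL_LEVEL:        # failsafe overrides the flow loop
        return "OFF"
    return "ON" if flow_f1 >= MIN_FLOW else "OFF"

print(plc2_valve_state(60.0), plc1_pump_state(2.0, 60.0))  # OPEN ON
print(plc1_pump_state(2.0, 96.0))                          # OFF (failsafe)
```

The point of the sketch is the shape of the logic: each loop acts only on its own sensor input, with one cross-feed acting as the failsafe, exactly as in the diagram.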

The utility of these protocols is obvious. They control everything from water supplies to electrical power grid components. They are networked, and need to be, due to the very large geographic areas that they are often required to cover. This is as opposed to my bedroom security system (it was never really intended for security – it was just a kick to get photos of folks who were unaware), which was a 'discrete' system. In such a system, the elements are hardwired and physically isolated. It is hard to get into such a room to circumvent the system; one would literally have to climb in through the window. This offers a good analogy for what SCADA-like systems are experiencing. But one also has to realize that discrete systems are very limited. As an example, it would be a big stretch to use a discrete system to manage a municipal water supply. One would argue that it would be so costly as to make no sense. So SCADA systems are a part of our lives. They can bring great benefit, but there is still the spectre of security vulnerability.

Security issues with SCADA

Given that SCADA systems are used to control facilities such as oil, power and public transportation, it is important to ensure that they are robust and have connectivity to the right control systems and staff. In other words, they must be networked. Many implementations of SCADA are L2 only, using, for example, Ethernet for transport. More recently, there are TCP/IP extensions to SCADA that allow for true Internet connectivity. One would think that this is where the initial concerns for security would lie, but actually they are just a further addition to the systems' vulnerabilities. There are a number of reasons for this.

First, there was a general lack of concern for security, as many of these environments were at one time fairly discrete. As an example, a PLC is usually used in local control type scenarios. A Remote Terminal Unit does just what its name says: it creates a remote PLC that can be controlled over the network. While this extension of geography has obvious benefits, along with it creeps in a window for unauthorized access.

Second, there was and still is the general belief that SCADA systems are obscure and not well known. Their protocol constructs are not widely published, particularly in the proprietary versions. But as is well known, 'security by obscurity' is a partial security concept at best, and many true security specialists would say it is a flawed premise.

Third, initially these systems had no connectivity to the Internet. But this is changing. Worse yet, it does not have to be the SCADA system itself that is exposed. All an attacker needs is access to a system that can in turn reach the SCADA system. This brings about a much larger problem.

Finally, because these networks were physically secure, it was assumed that some form of cyber-security was realized. But as the above reason points out, this is a flawed and dangerous assumption.

Given that SCADA systems control some of our most sensitive and critical infrastructure, it should be no surprise that there have been several attacks. One example is a SCADA control for sewer flow where a disgruntled ex-employee gained access to the system and reversed certain control rules. The end result was a series of sewage flooding events into local residential and park areas. Initially it was thought to be a system malfunction, but eventually the hacker's access was discovered and the culprit was nabbed. This can even reach international scales. As critical systems such as power grids become networked, the security concern can grow to the level of national security interests.

While these issues are not new, they are now well known. Security by Obscurity is no longer a viable option. Systems isolation is the only real answer to the problem.


The Bonjour Protocol

On the other side of the spectrum we have a service that is often required at public locations and that is the antithesis of the prior discussion. This is a protocol that WANTS services visibility. The protocol is known as Bonjour. Created by Apple™, it is an open system protocol that allows for services resolution. Again, it is best to give a case point example. Let's say that you are a student at a university and you want to print a document from your iPad. You can simply hit the print icon and the Bonjour service will send a service (SRV) query for @PRINTER to the Bonjour multicast address of 224.0.0.251. The receiver of the multicast group address is the Bonjour DNS resolution service, which will reply to the request with a series of local printer resources for the student to use. To go further, if the student were to look for an off-site resource such as a software upgrade or application, the Bonjour service would respond and provide a URL to an Apple download site. The diagram shows a simple Bonjour service exchange.
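Under the hood, Bonjour is DNS-Based Service Discovery carried over multicast DNS (mDNS), so the "@PRINTER" name above is shorthand: a real client asks for PTR records of a service type such as `_ipp._tcp.local`, sent to the well-known mDNS address 224.0.0.251 on UDP port 5353. Here is a sketch that builds such a query packet by hand; it only constructs the bytes and does not put them on the wire.

```python
import struct

MDNS_GROUP, MDNS_PORT = "224.0.0.251", 5353   # well-known mDNS address/port

def mdns_ptr_query(service: str) -> bytes:
    """Build a one-question mDNS query, e.g. PTR for '_ipp._tcp.local'."""
    # DNS header: id=0 (per mDNS convention), flags=0, 1 question, 0 answers
    header = struct.pack(">HHHHHH", 0, 0, 1, 0, 0, 0)
    # QNAME: each dot-separated label prefixed by its length, then a 0 byte
    qname = b"".join(struct.pack("B", len(label)) + label.encode()
                     for label in service.split(".")) + b"\x00"
    question = qname + struct.pack(">HH", 12, 1)   # QTYPE=PTR(12), QCLASS=IN(1)
    return header + question

packet = mdns_ptr_query("_ipp._tcp.local")
# In a real client this packet would be sent via UDP to (MDNS_GROUP,
# MDNS_PORT); nearby printers answer with PTR/SRV/TXT records naming
# their service instances.
print(len(packet))  # 33
```

Note there is no authentication anywhere in this exchange – any host on the local link can ask, and any host can answer. That openness is the whole point of the protocol, and also the source of the security tension discussed below.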


Figure 2. A Bonjour Service Exchange

Bonjour also has a way for services to 'register' with it as well. A good example, as shown above, is the case of iMusic. As can be seen, the player system can register to the local Bonjour service as @Musicforme. Now when a user wishes to listen, they simply query the Bonjour service for @Musicforme and the system will respond with the URL of the player system. This paradigm has obvious merits in the consumer space. But we need to realize that the consumer space is rapidly spilling over into the IT environment. This is the trend that we typically hear of as 'Bring Your Own Device' or BYOD. The university example is easy to see, but many corporations and public service agencies are dealing with the same pressures. Additionally, some true IT level systems are now implementing the Bonjour protocol as an effective way to advertise services and/or locate and use them. As an example, some video surveillance cameras will use the Bonjour service to perform software upgrades or for discovery. Take note that Bonjour really has no conventions for security other than the published SRV. All of this has the security world in a maelstrom. In essence, we have disparate protocols, evolved out of completely different environments for totally different purposes, coming to nest in a shared environment that can be of a very critical nature. This has the makings of a Dan Brown novel!



Meanwhile, back at the train station…

Let’s now return to our transit authority, which runs a high speed commuter rail service as part of its offering. As a part of this service they offer business services such as Internet access and local business office services such as printing and scanning. They also have a SCADA system to monitor and control the railways. In addition, they obviously have a video surveillance system and, you guessed it, those cameras use the Bonjour service for software upgrade and discovery. They also have the requirement to run Bonjour for the business services as well.

In legacy approaches the organization would need to either implement totally separate networks or a multi-services architecture via the use of Multi-Protocol Label Switching, or MPLS. This is an incredibly complex suite of protocols with very well known CapEx and OpEx requirements, and they are high. Running an MPLS network is probably the most challenging IT financial endeavor that an organization can take on. The figure below illustrates the complexity of the MPLS suite. Note that it also shows a comparison to Shortest Path Bridging (IEEE 802.1aq and RFC 6329) as well as the IETF drafts to extend L3 services across the Shortest Path Bridging fabric.


Figure 3. A comparison between MPLS and SPB

There are two major points to note. First, there is a dramatic consolidation of the dependent overlay control planes into a single integrated one provided by IS-IS. Second, as a result of this consolidation, the mutual dependence of the service layers is broken into mutually independent service constructs. An underlying benefit is that services are also extremely simple to construct and provision. Another benefit is that these service constructs are correspondingly simpler from an elemental perspective. Rather than requiring a complex and coordinated set of service overlays, SPB/IS-IS provides a single integrated service construct element known as the I-Component Service ID, or I-SID.

In previous articles we have discussed how an I-SID is used to emulate end to end L2 service domains as well as true L3 IP VPN environments. Additionally, we covered how I-SIDs can be used dynamically to provide solicited demand services for IP multicast. In this article, we will be focusing on their inherent traits of services separation and control, as well as how these traits can be used to enhance a given security practice.

For this particular project we developed the concept of three different network types. Each network type is used to provide for certain protocol instances that require services separation and control. They are listed as follows:

1). Layer 3 Virtual Service Networks

These IP VPN services are used to create a general services network access for general offices and internet access.

2). Local User Subnets (within the L3 VSN)

These are local L2 broadcast domains that provide normal internet 'guest' access for railway passengers. These networks can also support 'localized' Bonjour services for the passengers, but the service is limited to the station scope and not allowed to be advertised or resolved outside of that local subnet boundary.

3). Layer 2 Virtual Service Networks

These L2 domains are used at a more global level. Due to SPB's capability to extend L2 service domains across large geographies without the need to support end to end flooding, L2 VSNs become very useful for supporting extended L2 protocol environments. Here we are using dedicated L2 VSNs to support both the SCADA and Bonjour protocols. Each protocol will enjoy a private, non-IP-routed L2 environment that can be placed anywhere within the end to end SPB domain. As such, these can provide globally separated L2 service domains simply by not assigning IP addresses to the VLANs. IP can still run over the environment, as Bonjour requires it, but that IP network will not be visible or reachable within the IS-IS link state database (LSDB) via VRF0.


Figure 4. Different Virtual Service Networks to provide for separation and control.

The figure above illustrates the use of these networks in a symbolic fashion. As can be seen, there are two different L3 VSNs. The blue L3 VSN is used for internal transit authority employees and services. The red L3 VSN is used for railway passenger internet access. Note that there are two things of significance here. First, this is a one way network for these users. They are given a default network gateway to the Internet and that is it. There is no connectivity from this L3 VSN to any other network or system in the environment. Second, each local subnet also allows for local Bonjour services so that users can use their different personal device services without concern that they will go beyond the local station or interfere with any other service at that station.

There are then two L2 VSNs that are used to provide inter-station connectivity for the transit authority's use. The green L2 VSN provides the SCADA protocol environment while the yellow L2 VSN provides for the Bonjour protocol. Note that unlike the other Bonjour L2 service domains for the passengers, this L2 domain can be distributed not only within the stations but between the stations as well. As a result, we have five different types of service domains, each one separated, scoped and controlled over a single network infrastructure. Note that in the case of a passenger at a station who is bringing up their Bonjour client, they will only see other local resources, not any of the video surveillance cameras that also use Bonjour but do so in a totally separate L2 service domain that has absolutely no connectivity to any other network or service. Note also that the station clerk has a totally separate network service environment that gives them confidential access to email, UC and other internal applications that tie back into the central data center resources. In contrast, the passengers at the station are provided Internet access only, for general browsing or VPN usage. There is no viable vector for any would-be attacker in this network.
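The separation property described above can be captured in a toy model. The endpoint names and VSN assignments below are hypothetical; the point is simply that two endpoints can exchange traffic only if they share a virtual service network, and no route exists otherwise.

```python
# A toy model of the service separation in Figure 4. Endpoint-to-VSN
# assignments are hypothetical examples; in the real design, membership
# is what an I-SID provisioning action grants or revokes.

VSN_MEMBERS = {
    "blue-L3VSN (employees)":         {"station-clerk-pc", "hq-email-server"},
    "red-L3VSN (passenger internet)": {"passenger-phone", "internet-gateway"},
    "green-L2VSN (SCADA)":            {"plc-station-7", "scada-hmi"},
    "yellow-L2VSN (camera Bonjour)":  {"camera-12", "camera-upgrade-server"},
}

def can_communicate(a: str, b: str) -> bool:
    """True only if a and b sit in at least one common service domain."""
    return any(a in members and b in members for members in VSN_MEMBERS.values())

assert can_communicate("plc-station-7", "scada-hmi")        # same L2 VSN
assert not can_communicate("passenger-phone", "camera-12")  # no shared I-SID
assert not can_communicate("passenger-phone", "plc-station-7")
print("separation holds")
```

Unlike a firewall rule set, there is no 'deny' entry to get wrong here: the absence of a shared service domain means the forwarding path simply does not exist.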

Now the transit authority enjoys the ability to deploy these service environments at will, anywhere they are required. Additionally, if requirements for new service domains come up (entry and exit systems, for example), they can be easily created and distributed without a major upheaval of the existing networks that have been provisioned.


Seeing and Controlling are two different things…

Sometimes one service can step on another. High bandwidth, resource-intensive services such as multicast based video surveillance can tend to break latency sensitive services such as SCADA. In a different example project, these two applications were in direct conflict. The IP multicast environment was unstable, causing loss of camera feeds and recordings in the video surveillance application, while the SCADA based traffic light control systems experienced daily outages. In a traditional PIM protocol overlay we require multiple state machines that run in the CPU. Additionally, these state machines are full time, meaning that they need to consider each IP packet separately and forward accordingly. For multicast packets there is an additional state machine requirement where there may be various modes of behavior based on whether the node is serving a source or a receiver and whether or not the tree is currently established or being extended. These state machines are complex and they must run for every multicast group being serviced.


Figure 5. Legacy PIM overlay

Each PIM router needs to perform this hop by hop computation, and this needs to be done by the various state machines in a coordinated fashion. In most applications this is acceptable. As an example, for IP television delivery there is a relatively high probability that someone is watching the channels being multicast (if not, they are usually promptly identified and removed; ratings will determine the most viewed groups). In this model, if there is a change to the group membership, it is minor and at the edge; minor in the sense that one single IP set top box has changed the channel. The point here is that this is a minor topological change to the PIM tree and might not even impact it at all. Also, the number of sources is relatively small compared to the community of viewers (200-500 channels to thousands if not tens of thousands of subscribers).

The problem with video surveillance is that this model reverses many of these assumptions and this causes havoc with PIM. First, the ratio of sources to receivers is reversed.  Also, the degree of the ratio changes as well.  As an example, in a typical surveillance project of 600 cameras there could be instances as high as 1,200 sources with transient spikes that will go higher during state transitions. Additionally, video surveillance applications typically have the phenomenon of ‘sweeps’, where a given receiver that is currently viewing a large group of cameras (16 to 64) will suddenly change and request another set of groups.

At these points the amount of required state change in PIM can be significant. Further, there may be multiple instances of this occurring at the same time in the PIM domain. These instances could be humans at viewing consoles or they could be DVR type resources that automatically sweep through sets of camera feeds on a cyclic basis. So as we can see, this can be a very heavy lift for PIM, and tests have validated this. SPB offers a far superior method for delivering IP multicast.
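A quick back-of-envelope calculation shows why sweeps hurt. The console and camera counts below are assumptions in line with the 600-camera example above; the arithmetic simply counts group-membership transitions, each of which forces PIM state machine work on every router along the affected tree.

```python
# Back-of-envelope illustration of the 'sweep' problem. A console viewing
# a set of feeds that switches to another set triggers one leave and one
# join per group, and every PIM router on each affected tree must then
# re-run its state machines hop by hop.

cameras = 600            # multicast sources in the surveillance domain
sweep_size = 64          # feeds per viewing console (upper end of 16-64)
consoles_sweeping = 4    # hypothetical consoles/DVRs sweeping simultaneously

state_changes = consoles_sweeping * sweep_size * 2   # leaves + joins
print(state_changes)  # 512 group-membership transitions in one sweep cycle
```

Contrast this with the IPTV case above, where a channel change is a single leave/join at one edge: here hundreds of transitions land on the domain at once, and cyclic DVR sweeps repeat the load continuously.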

Now let us consider the second application: the use of SCADA to control traffic lights, often referred to as Intelligent Traffic Systems or ITS. Like all closed loop applications, there is a failsafe state, which is the familiar red and yellow flashing lights that we see occasionally during storms and other impediments to the system. This assures that the traffic light will never fail in a state of permanent green or permanent red. As soon as communication times out, the failsafe loop is engaged and maintained until communication is restored.

During normal working hours the traffic light is obviously controlled by some sort of algorithm. In certain high volume intersections this algorithm may be very complex and based on the hour of the day. In most other instances the algorithm is rather dynamic and based on demand. This is accomplished by placing a sensing loop at the intersection (older systems were weight based while newer systems are optical). As a vehicle pulls up to the intersection its presence is registered and a 'wait set' period is engaged. This presumably allows enough time for passing traffic to move through the intersection. In rural intersections this wait set period will be 'fair': each direction will have equal wait sets. In urban situations where minor roads intersect with major routes, the wait set period will be in strong favor of the major route, with a relatively large wait set period for the minor road. The point in all of this is that these loops are expected to be fairly low latency and there is not expected to be a lot of loss in the transmission channel. Consequently, SCADA tends towards very small packets that expect a very fast round trip with minimal or no loss. You can see where I am going here. The two applications do not play well together. They require separation and control.
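The failsafe behavior described above amounts to a watchdog around the normal control cycle, and can be sketched as follows. The timeout value and phase cycle are invented for illustration; a real ITS controller is far more elaborate.

```python
# A minimal sketch of the ITS failsafe loop: while SCADA keepalives arrive
# within the timeout, the controller follows its normal cycle; when
# communication times out it drops to flashing until comms return.

COMMS_TIMEOUT = 3  # ticks without a keepalive before the failsafe engages

class SignalController:
    def __init__(self):
        self.ticks_since_keepalive = 0
        self.cycle = ["GREEN", "YELLOW", "RED"]
        self.phase = 0

    def tick(self, keepalive_received: bool) -> str:
        if keepalive_received:
            self.ticks_since_keepalive = 0
        else:
            self.ticks_since_keepalive += 1
        if self.ticks_since_keepalive >= COMMS_TIMEOUT:
            return "FLASHING"          # failsafe: never stuck green or red
        self.phase = (self.phase + 1) % len(self.cycle)
        return self.cycle[self.phase]

c = SignalController()
states = [c.tick(k) for k in (True, True, False, False, False, True)]
print(states)  # failsafe engages on the third missed keepalive, then recovers
```

Notice that a couple of missed keepalives are tolerated but a sustained outage drops the light to flashing, which is exactly why SCADA's small, latency-sensitive packets cannot afford to queue behind a multicast video burst.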


Figure 6. Separation of IP multicast and Scada traffic by the use of I-SIDs

As was covered in a previous article (circa June 2012), and as shown in the illustration above, SPB uses dynamically built I-SIDs with a value greater than 16M to establish IP multicast distribution trees. Each multicast group uses a discrete and individual I-SID to create a deterministic reverse path forwarding environment. Note also that the SCADA traffic is delivered via a discrete L2 VSN that is not enabled for IP multicast, or any IP configuration for that matter. As a result, the SCADA elements are totally separated from any IP multicast or unicast activity. There is no way for any traffic from the global IP route or IP VPN environment to get forwarded into the SCADA L2 VSN. There is simply no IP forwarding path available. The figure above illustrates a logical view of the two services.
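The group-to-I-SID relationship can be modeled simply. To be clear, the fabric allocates these dynamic I-SIDs internally; the base value and allocation scheme below are assumptions for illustration only, capturing the one-group-one-I-SID property.

```python
# Illustrative model of mapping IP multicast groups onto dynamically
# built I-SIDs above 16M. The allocation scheme here is hypothetical;
# the point is that each group gets its own discrete I-SID, and hence
# its own deterministic distribution tree.

DYNAMIC_ISID_BASE = 16_000_000
_next_offset = 0
GROUP_TO_ISID: dict[str, int] = {}

def isid_for_group(group: str) -> int:
    """Return the (stable) dynamic I-SID assigned to a multicast group."""
    global _next_offset
    if group not in GROUP_TO_ISID:
        _next_offset += 1
        GROUP_TO_ISID[group] = DYNAMIC_ISID_BASE + _next_offset
    return GROUP_TO_ISID[group]

print(isid_for_group("239.1.1.1"))  # 16000001
print(isid_for_group("239.1.1.2"))  # 16000002
print(isid_for_group("239.1.1.1"))  # 16000001 (same group, same tree)
```

Because each group rides its own I-SID, a sweep that joins 64 new groups simply activates 64 pre-computable shortest path trees, rather than triggering the coordinated hop by hop PIM state churn described earlier.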

The end result of the conversion changed the environment drastically. Since then they have not lost a single camera or had any issues with SCADA control. This is a direct testament to the forwarding plane separation that occurs with SPB. As such, both applications can be supported with no issues or concerns that one will 'step on' the other. It also enhances security for the SCADA control system. As there is no IP configuration on the L2 VSN (note that IP could still 'run' within the L2 VSN – as is possible, for example, with the SCADA HMI control consoles), there is no viable path for spoofing or launching a DoS attack.

What about IP extensions for SCADA?

As was mentioned earlier in the article, there are methods to provide for TCP/IP extension of SCADA. Due to the critical nature of these systems, however, this is seldom used, owing to the costs of securing the IP network from threat and risk. As with any normal IP network, protecting them to the required degree is difficult and costly, particularly since the intention of the protocol overlay is to provide for things like mobile and remote access to the system. Doing this with traditional legacy IP networking would be a big task.

With SPB, L3 VSNs can be used to establish a separated IP forwarding environment that can then be directed to appropriate secure 'touch points' at predefined locations in the topology of the network. Typically, this will be a data center or a secured DMZ adjunct to it. There, all remote access is facilitated through a well defined series of security elements: firewalls, IPS/IDS and VPN service points. As this is the only valid ingress into the L3 virtual service environment, it is hence much easier and less costly to monitor and mitigate any threats to the system, with clear forensics in the aftermath. The illustration below shows this concept. The message is that while SPB is not a security technology in and of itself, it is clearly a very strong complement to those technologies. If used properly it can provide the first three of the 'series of gates' in the layered defense approach. The diagram below shows how this operates.


Figure 7. SPB and the ‘series of gates’ security concept

In a very early article on this blog I spoke to the issues and paradigms of trust and assurance (see Aspects and characteristics of Trust and its impact on Human Dynamics and E-Commerce – June 2009). There I introduced the concept of composite identities and the fact that all identities in cyber-space are as such. This basic concept is rather obvious when it speaks to elemental constructs of device/user combinations, but it gets smeared when the concept extends to applications or services. Or it can extend further to elements such as location or the systems that a user is logged into. These are all elements of a composite instance of a user, and they are contained within a space/time context. As an example, I may allow user 'A' to access application 'A' from location 'A' with device 'A'. But any other location, device or even time combination may invoke a totally different authentication and consequent access approach. This composite approach is very powerful, particularly when combined with the rather strong path control capabilities of SPB. This combination yields an ability to determine network placement based on user behavior patterns – those expected and within profile, but more importantly those that are unusual and outside the normal user's profile. These instances require additional challenges and consequent authentications.

As noted in the figure above, the series of gates concept merges well with this construct. The first gate provides identification of a particular user/device combination. From this elemental composite, network access is provided according to a policy. From there the user is limited to the particular paths that provide access for a normal profile. As a user goes to invoke a certain secure application, the network responds with an additional challenge. This may be an additional password, or perhaps a secure token and biometric signature, to reassure identity for the added degree of trust. This is all normal. But in the normal environment the access is provided at the systems level, thereby increasing the 'smear' of the user's identity. A critical difference in the approach I am referring to is that the whole network placement profile of the user changes. In other words, in the previous network profile the system that provides the said application is not even available by any viable network path. It is by the renewal of challenge and additional tiers of authentication that such connectivity is granted. Note that I do not say access but connectivity. Certainly, systems access controls would remain, but by and large they would be the last and final gate. At the user edge, whole logical topology changes occur that place the user into a dark horse IP VPN environment where secure access to the application can be obtained.
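The composite-placement idea can be sketched as a simple policy lookup: the (user, device, location) tuple, not the user alone, selects the network profile, and an out-of-profile composite lands in a restricted placement pending further challenge. The users, devices and profile names below are entirely hypothetical.

```python
# A sketch of composite-identity network placement. The policy entries
# are hypothetical; the point is that the whole composite selects the
# logical topology the user is placed into, not just the username.

PLACEMENT_POLICY = {
    ("alice", "corp-laptop", "hq"):      "blue-L3VSN",
    ("alice", "corp-laptop", "station"): "blue-L3VSN",
}
DEFAULT_PROFILE = "guest-internet-only"   # out-of-profile composites land here

def place(user: str, device: str, location: str) -> str:
    """Return the network profile for a given user/device/location composite."""
    profile = PLACEMENT_POLICY.get((user, device, location))
    if profile is None:
        # Unusual composite: restrict placement and challenge again before
        # granting any further connectivity.
        return DEFAULT_PROFILE
    return profile

print(place("alice", "corp-laptop", "hq"))      # blue-L3VSN
print(place("alice", "personal-phone", "hq"))   # guest-internet-only
```

The key property is that the secure application's network simply is not in the out-of-profile placement at all; connectivity, not merely access, is what the additional authentication tier grants.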

Wow! The noise is gone

In this whole model something significant occurs. Users are now in communities of interest where only certain traffic pattern profiles are expected. As a result, zero day alerts from anomaly based IPS/IDS systems become something other than white noise. They become very discrete resources with an expected monitoring profile, and any anomalies outside of that profile will flag as a true alert that should be investigated. This enables zero day threat systems to work far more optimally, as their theory of operation is to look for patterns outside of the expected behaviors that are normally seen in the network. SPB complements this by keeping communities strictly separate when required. With a smaller isolated community it is far easier to use such systems accurately. The diagram below illustrates the value of this virtualized security perimeter. Note how any end point is logically on the 'outer' network connectivity side. Even though I-SIDs traverse a common network footprint, they are 'ships in the night' in that they never see one another or have the opportunity to inter-communicate except by formal, monitored means.


Figure 8. An established ‘virtual’ Security Perimeter

Firewalls are also notoriously complex when they are used for community separation or multi-tenant applications. The reason for this is that all of the separation is dependent on the security policy database (SPD) and how well it covers all given applications and port calls. If a new application is introduced and it needs to be isolated, the SPD must be modified to reflect it. If this gets missed or the settings are not correct, the application is not isolated and no longer secure. Again, SPB and dark horse networking help in controlling users' paths and keeping communities separate. Now the firewall is white listed, with a blanket deny-all policy after that. As new applications get installed, unless they are added to the white list they will be isolated by default within the community in which they reside. There is far less manipulation of the individual SPDs and far less risk of an attack surface developing in the security perimeter due to a missed policy statement.
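The white-list-with-default-deny posture boils down to a membership test, which the sketch below illustrates. The flow tuples are hypothetical; the design point is that anything not explicitly listed is denied, so a newly installed application is isolated by default rather than exposed by an overlooked rule.

```python
# A sketch of white listing with default deny. Each allowed flow is an
# explicit (source VSN, destination zone, port) tuple; everything else
# falls through to deny without needing a rule of its own.

WHITELIST = {
    ("blue-L3VSN", "dmz", 443),   # hypothetical: employee HTTPS to DMZ
    ("blue-L3VSN", "dmz", 25),    # hypothetical: mail relay
}

def allowed(src_vsn: str, dst_zone: str, port: int) -> bool:
    """Permit only whitelisted flows; deny everything else by default."""
    return (src_vsn, dst_zone, port) in WHITELIST

assert allowed("blue-L3VSN", "dmz", 443)
assert not allowed("blue-L3VSN", "dmz", 8080)  # new app: isolated until listed
assert not allowed("red-L3VSN", "dmz", 443)    # other community: denied
print("default deny holds")
```

Forgetting to add an entry here fails closed (the new application stays isolated), whereas forgetting a deny rule in a conventional SPD fails open.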


Time to move…

There is another set of traits that are very attractive about SPB, and particularly what we have done with it at Avaya in our Fabric Connect. It is something termed mutability. In the last article, on E-911 evolution, we talked about this a little bit. Here I would like to go into it in more detail. IP VPN services are nothing new. MPLS has been providing such services for years. Unlike MPLS, however, SPB is very dynamic in the way it handles new services or changes to existing services. Where the typical MPLS infrastructure might require hours or even days for the provisioning process, SPB can accomplish the same service in a matter of minutes or even seconds. And this does not take into account that MPLS also requires the manual provisioning of alternate paths. With SPB, not only are the service instances intelligently extended across the network by the shortest path, they are also provided all redundancy and resilience by virtue of the SPB fabric. If alternate routes are available they will be used automatically during times of failure. They do not have to be manually provisioned ahead of time. The fabric has the intelligence to reroute by the shortest path automatically. At Avaya, we have tested our fabric to a reliable convergence of 100ms or under, with the majority of instances falling into the 50ms range. As such, mutability becomes a trait that Avaya alone can truly claim. But in order to establish what that is, let's realize that there are two forms.

1). Services mutability

This was covered to some degree in the previous article, but to review the salient points: it really boils down to the fact that a given L3 VSN can be extended anywhere in the SPB network in minutes. The principles pointed out in the previous article illustrate that membership of a given dark horse network can be rather dynamic and can not only be extended but retracted as required. This is something that comes as part and parcel of Avaya's Fabric Connect. While MPLS based solutions may provide equivalent type services, none are as nimble, quick or accurate in prompt services deployment as Avaya's Fabric Connect based on IEEE 802.1aq Shortest Path Bridging.

2). Nodal mutability

This is something very interesting, and if you ever have the chance for hands on experience, please try it. It is very, very profound. Recall from previous articles that each node holds a resident 'link state database' generated by IS-IS that reflects its knowledge of the fabric from its own relative perspective. This knowledge not only scopes topology but resident provisioned services, as well as those of other nodes. This creates a situation of nodal mutability. Nodal mutability means that a technician out at the far edge of the network can accidentally swap the two (or more) uplink ports and the node will still join the network successfully. Alternatively, if a node were already up and running and for some reason port adjacencies needed to change, it could be accommodated very easily with only a small configuration change. (Try it in a lab. It is very cool!) Going further on this logic, the illustration below shows that a given provisioned node could be unplugged from the network and then driven hundreds of kilometers to another location.


Figure 9. Nodal and Services Mutability

At that location, they could plug the node back into the SPB network and the fabric will automatically register the node and all of its provisioned services. If all of these services are dark horse, then there will be authentication challenges into the various networks that the node provides as users access services. This means, in essence, that dark horse networks can be extremely dynamic. They can be mobile as well. This is useful in many applications where mobility is desired but the need to re-provision is frowned upon or simply impossible. Use cases such as emergency response, military operations or mobile broadcasting are just a few areas where this technology would be useful. But there are many others, and the number will increase as time moves forward. There is no corresponding MPLS service that can provide for both nodal and services mutability. SPB is the only technology that allows for it, via IS-IS, and Avaya's Fabric Connect is the only solution that can provide this for not only L2 but L3 services, as well as for IP VPN and multicast.

Some other use cases…

Other areas where dark horse networks are useful are networks that require full privacy for PCI or HIPAA compliance. L3 Virtual Service Networks are perfect for these types of applications or solution requirements. Figure 8 could easily be an illustration of a PCI-compliant environment in which all subsystems are within a totally closed L3 VSN IP VPN environment. The only ingress and egress are through well defined virtual security perimeters that allow for the full monitoring of all allowed traffic. This combination yields an environment that, when properly designed, will easily pass PCI compliance scanning and analysis. In addition, these networks are not only private – they are invisible to external would-be attackers. The attack surface is reduced to the virtual security perimeter only. As such, it is practically non-existent.

In summary

While private IP VPN environments have been around for years, they are typically clumsy and difficult to provision. This is particularly true for environments where quick dynamic changes are required. As an example, a typical MPLS IP VPN provisioning instance will require approximately 200 to 250 command lines, depending on the vendor and the topology. Interestingly, much of this CLI activity is not in provisioning MPLS but in provisioning other supporting protocols such as IGPs and BGP. Also, consider that all of this is for just the initial service path. Any redundant service paths must then be manually configured. Compare this with Avaya’s Fabric Connect, which can provide the same service type with as little as a dozen commands. Additionally, there is no requirement to engineer and provision redundant service paths, as they are already provided by SPB’s intelligent fabric.

As a result, IP VPNs can be provisioned in minutes and very dynamically moved or extended according to requirements. Again, the last article on the evolution of E-911 speaks to how an IP VPN morphs over the duration of a given emergency, with different agencies and individuals coming into and out of the IP VPN environment on a fairly dynamic basis based on their identity, role and group associations.

Furthermore, SPB nodes are themselves mutable. Once again, IS-IS provides for this feature. An SPB node can unplug from the network and move to the opposite end of the topology, which can be hundreds or even thousands of kilometers away. There it can plug back in, and IS-IS will communicate the nodal topology information as well as all provisioned services on the node. The SPB network will in turn extend those services out to the node, thereby giving complete portability to that node as well as its resident services.

In addition, SPB can provide separation for non-IP data environments as well. Protocols such as SCADA can enjoy an isolated non-IP environment through the use of L2 VSNs, and further, they can be isolated so that there is simply no viable path into the environment for would-be hackers.

This combination of privacy and fast mutability of both services and topology lends itself to what I term a Dark Horse Network. They are dark, so that they cannot be seen or attacked, due to the lack of surface for such an endeavor. They are swift in the way they can morph through services extensions, and they are extremely mobile, providing the ability for nodes to make wholesale changes to the topology and still connect to relevant provisioned services without any need to re-configure. Any other IP VPN technology would be very hard pressed to make such claims, if indeed it can make them at all! Avaya’s Fabric Connect, based on IEEE 802.1aq, sets the foundation for the true private cloud.

Feel free to visit my new YouTube channel! Learn how to set up and enable Avaya’s fabric technology in a few short step-by-step videos.

The evolution of E-911

November 2, 2012

NG911 and the evolution of ESInet


If you live within North America and have ever been in a road accident or had a house fire, then you are one of the fortunate ones who had the convenience and assurance of 911 services. I am old enough to remember how these types of things were handled prior to 911. Phones (dial phones!) had dozens of stickers for Police, Fire and Ambulance. If there were no stickers, then one had to resort to a local phone book that hopefully had an emergency services section. To think of how many lives have been saved by this simple three-digit number is simply boggling. Yet to a large degree we all now take this service for granted and assume it will just work as it always has, regardless of the calling point. We also seem to implicitly assume that all of the next generation capabilities and intelligence available today can just automatically be utilized within its framework. This article is intended to provide a brief history of 911 services and how they have evolved up to the current era of E911. It will also talk about the upcoming challenges of extending the service into a true multi-tenant, multi-service framework that can leverage the latest technology offerings. In short, we are talking about the advent of Next Generation (NG911) Emergency Services infrastructure.

Conceptually, 911 is very simple. As the figure below illustrates, a person reporting an emergency calls the three-digit number. The original intent was to provide the public with a single point of contact for all emergencies. Prior to 911, you would have a number for Police, a number for Fire, and a number for Medical; and to make matters worse, each jurisdiction would have its own unique numbers. That would be a dozen numbers to remember for your town and three of your neighboring towns. It was out of this that “E”911 was born to deliver even more functionality. In addition to providing a single ubiquitous number regardless of where you were located, it provided ‘selective routing’, or automatic routing based on the originating number’s documented location in the telephone company database. It also provided some new intelligence on the wire, called Automatic Number Identification, or ANI. You are probably more familiar with its street name of ‘Caller ID’.

Figure 1. Traditional 911 PSAP

This, however, was back in the days of landline phone services, which are an ever-shrinking minority in this age of mobile communications. Originally embodied by the advent of cellular phones, the industry has evolved to facilitate both local and wide area wireless technologies as well as PDAs, tablets and, yes, still cell phones. The problem is that the old original 911 model became increasingly broken and in need of an update to handle this new mobile phenomenon. Think of it: if I am driving down Interstate 90 in Boston and I call 911, how do they know that I am in Boston and not in Ontario, New York, where my billing address is? At first, there was no such capability, and some folks lost their lives due to the longer response times incurred. For a while, the first thing the 911 call taker needed to do was validate and confirm the location, assuming that if the call was mobile there was no other way. Fortunately, this led to the evolution of Phase 1 Cellular E911 services, which allowed for the correlation of cellular 911 calls to a particular antenna face on a tower. Each cellular carrier typically has three antennas on each tower, each serving a 120-degree arc of the compass. When a call is received in a particular sector, it is routed to the PSAP that has primary coverage of that sector. PSAPs can also transfer calls between themselves, so if a call was misrouted once in a while, it could easily be warm-transferred to the proper authority. There are several technologies that allow for this, and they are summarized briefly in the illustration below.

Figure 2. Methods for mobile device location

As one can readily imagine, a wireless provider can tell which cell your device is operating in when the call to 911 is made. This can be a fairly vast geography, however. The actual number varies depending on the technology but can typically be a radius of 10 to 20 miles. Accuracy is gained by leveraging different radio antenna sources through a method known as triangulation, where a closer proximity can be gained by using multiple signal points of reference. Lately, in the newer Droids and iPhones, additional GPS capabilities lend an accuracy of meters.
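The geometric idea behind using multiple signal points of reference can be shown with a minimal 2-D trilateration sketch: given distances to three known antenna sites, solve for the caller’s position. The tower coordinates and ranges below are invented; real Phase II location uses timing and angle measurements with error models, not clean distances.

```python
# Minimal 2-D trilateration: intersect three range circles around known
# antenna positions to recover a single location.

def trilaterate(p1, r1, p2, r2, p3, r3):
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Subtracting the circle equations pairwise yields two linear equations
    # in x and y, which we solve with a 2x2 determinant.
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = r1**2 - r2**2 - x1**2 + x2**2 - y1**2 + y2**2
    a2, b2 = 2 * (x3 - x2), 2 * (y3 - y2)
    c2 = r2**2 - r3**2 - x2**2 + x3**2 - y2**2 + y3**2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-9:
        raise ValueError("towers are collinear; position is ambiguous")
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y

# Three towers at known coordinates; ranges measured to a caller at (3, 4).
x, y = trilaterate((0, 0), 5.0, (10, 0), 65 ** 0.5, (0, 10), 45 ** 0.5)
assert abs(x - 3) < 1e-6 and abs(y - 4) < 1e-6
```

A single tower narrows the caller to a cell; two towers narrow it to two candidate points; the third resolves the ambiguity, which is why accuracy improves with each additional signal reference.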

Another evolution is Assisted GPS, or A-GPS. A-GPS works on the merging of GPS and network related technologies to increase the accuracy and decrease the ‘fix time’ to determine location. A-GPS uses network related resources and in turn uses satellite services when signal conditions are poor due to signal weakness or interference. The typical A-GPS device will have not only GPS hardware but Internet access as well. Most modern smartphones and PDAs fit this capability mix. As a result, a mobile user’s location can be determined with a great degree of accuracy.


But that was then…

Recent (within the decade) large scale emergencies, both man-made and otherwise, have taught us a few things about events of this proportion. First, infrastructure is damaged, and along with it communications elements. At times, communications can be lost altogether for extended periods of time. Second, events of this scale require coordinated logistics between multiple organizations and their resources. When we put these two things together, we see a real issue in that coordinated logistics requires reliable communications! Events like NYC 9/11, Katrina and even the BP oil spill crisis have shown that no single agency can address all of the needs that require response. In short, the ability to communicate effectively is paramount to effective large scale emergency response, particularly in events of widespread geographic proportions.

Nonetheless, these events serve to remind us that they can render useless much of the technology we take for granted today. Additionally, the traditional E911 network is closed in architecture and very regional in the way it is deployed. This makes wide scale geographic coordination of information and resources very difficult. Emergencies that cross PSAP boundaries will often require additional impromptu ad hoc communications that often lack context or clarity.

Now let’s add in the new abilities that technology brings to the table. Big Data analytics is my personal and professional favorite. In emergency situations, information is essential, but too much information without context will tend to slow down the emergency response. Contextual, prioritized information and its timely delivery have been shown to increase both the timeliness and the accuracy of the response. A later example will clarify. The major point here is that E911, which was architected to handle the mobile emergency call, is still effective for that purpose but not for these upcoming challenges. NG911 is intended for the ‘other side’ of the equation: the agencies and services (fire, hazmat, medical response) that will require detailed and reliable communications and information to most effectively deal with the situation at hand.

All of this means that the supporting network must be capable of multi-service and multi-tenancy. We will cover these two terms in the next few paragraphs. Both are part of the normal service provider nomenclature. Multi-service is the ability of the network to deliver appropriate service level assurance for the proper operation of end to end applications. The categories most often thought of are voice, video and data, but they can be more granular to include data for certain application types, so that some applications can be prioritized over others. Multi-tenancy is the ability to support multiple user, service or even application groups and keep the resources that they use totally separate from one another. At the same time, there may be applications that do have the requirement to cross tenant boundaries, such as IP voice or email, but these will be constrained to cross over a security demarcation where such rules can be enforced. Rule number one of multi-tenancy is that tenant A should never see tenant B’s traffic, or vice versa, unless otherwise provisioned to do so as per above. Also, tenant A should never be able to impinge on the resources allocated to tenant B, again unless otherwise provisioned. These are not easy bars to reach with traditional networking technology and practices. Typically, in order to do this at the scale required, we need a complex mix of technologies such as those shown in the diagram below. MPLS IP VPN service has really been the only technology up to par to meet these requirements. Unfortunately, this means that many state and local governments are either forced to depend on a 3rd party public service provider or to implement MPLS directly themselves. Those that do find that the technology is expensive, complex and requires an inordinately high staff count to properly implement and maintain.
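The two multi-tenancy rules above can be captured in a toy policy model: traffic stays within its tenant unless an explicitly provisioned rule permits a crossing at the security demarcation. The tenant names, endpoints and services below are invented for illustration.

```python
# Toy model of multi-tenancy rule one: tenant A never sees tenant B's
# traffic unless a cross-tenant rule has been explicitly provisioned.

class TenantFabric:
    def __init__(self):
        self.tenants = {}         # tenant name -> set of endpoints
        self.cross_rules = set()  # (src_tenant, dst_tenant, service)

    def add_endpoint(self, tenant, endpoint):
        self.tenants.setdefault(tenant, set()).add(endpoint)

    def allow(self, src_tenant, dst_tenant, service):
        # Provision a crossing at the security demarcation.
        self.cross_rules.add((src_tenant, dst_tenant, service))

    def permitted(self, src, dst, service):
        src_t = next(t for t, eps in self.tenants.items() if src in eps)
        dst_t = next(t for t, eps in self.tenants.items() if dst in eps)
        if src_t == dst_t:
            return True  # intra-tenant traffic is always allowed
        # Cross-tenant traffic must match an explicitly provisioned rule.
        return (src_t, dst_t, service) in self.cross_rules

fabric = TenantFabric()
fabric.add_endpoint("police", "pd-dispatch")
fabric.add_endpoint("fire", "fd-dispatch")
fabric.allow("police", "fire", "voip")  # e.g. shared IP voice

assert fabric.permitted("pd-dispatch", "pd-dispatch", "email")     # same tenant
assert fabric.permitted("pd-dispatch", "fd-dispatch", "voip")      # provisioned
assert not fabric.permitted("pd-dispatch", "fd-dispatch", "file")  # blocked
```

The default-deny posture between tenants is the key property; any visibility between agencies exists only because someone provisioned it.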

Recently, however, another technology has been ratified by the IEEE, known as ‘Shortest Path Bridging’ or IEEE 802.1aq. This standard provides for a radical evolution of the Ethernet forwarding control plane that allows for both multi-tenancy and multi-service capabilities without the complexities of legacy approaches. Previous articles have discussed both the methods and the services that allow for these capabilities, so we will not go into these areas in any depth here. To summarize, this is all achieved by introducing a link state protocol (IS-IS) to Ethernet switching, as well as the concept of provisioned service paths. These innovations, when combined with a MAC encapsulation method known as MAC-in-MAC (IEEE 802.1ah) that serves as a universal forwarding label, allow for a radical change to the Ethernet switching control plane without abandoning its native dichotomy of control and data forwarding within the network element itself. This means that the switch remains an autonomous forwarding element, able to make its own decisions as to how to forward data most efficiently and effectively. Yet, at the same time, the new stateful nature of the 802.1aq control plane allows for very deterministic control of the data forwarding environment. The end result is a vast simplification of the Ethernet control plane that yields a very stateful and deterministic environment.

The figure below shows a comparison between MPLS and SPB. Note that there is a vast simplification in the number of protocol state machines required in order to support a given service. This simplification not only results in ease of use but also drastically increases the reachable scale for Ethernet as well. This is important for ESInets as the number of agencies and entities that will require access will increase as time moves on and NG911 technology evolves.

Figure 3. A comparison of MPLS to SPB


Ships in the Night, but I may want to jump ship if required…

As we look closer at the concepts of multi-tenancy for emergency services, we see that the requirements can be fairly dynamic. As an example, during normal working operations entities may be quite separate from one another. Normal day to day operations might not require a lot of cross communications. There may be some common services, such as email or Voice over IP, as is often the case with State and Local Government, but by and large each agency’s applications as well as traffic are largely separate.

During emergencies, however, this normal pattern may not apply. Certain entities may need to be in very tight logistical coordination and, as a result, have to communicate in a very seamless fashion with applications that may straddle agency boundaries. A good example is a hazardous chemical spill. In a typical scenario you will have a large number of agencies or entities involved in the response. For instance, there will obviously be the police to cordon off the area and maintain a ‘do not cross’ line. You will also have the Fire Dept. with particular HazMat teams that are matched according to experience. You may also have several area hospitals that are alerted and set up with triage teams to handle the exposed victims, as well as ambulance services to provide transport. Obviously, the teams selected should have previous experience with such events, and preferably even with the particular substance involved. The ability to match experience to requirements is a key element of a successful response. This is where data analytics plays a key role. Another key element is to enable these teams to communicate effectively and with as much context and supporting data as possible, but it has to be filtered so as not to overload response personnel with superfluous information.

The figure below illustrates some of the potential that SPB could bring to the table to address these requirements. As shown, each entity in question has its own isolated L3 IP VPN environment that provides for normal day to day operations. As an emergency occurs, however, a new L3 IP VPN environment can be created for the event response teams. Members of these teams will be selected and provided with enhanced credentials to access this new IP VPN environment. Note that these teams will have bi-directional communication capabilities. Both normal day to day services, such as email, as well as dedicated or special services for the emergency response can be provided to this team. Additionally, as they use these dedicated services, they are isolated from the other VPN environments from both a service and a resource perspective. This is important, as the applications being used during the emergency response might be high bandwidth, such as video, or insistent, such as east/west flows within the data centers to support outbound data for field application use. In either case, they most definitely will be critical and will require an absolute guarantee of service reliability.

Figure 4. A hazardous material spill emergency

As the figure above shows, this new L3 IP VPN environment will exist for as long as required by the emergency response teams, and can even exist for as long after the event as necessary for forensics and/or audit investigations. Further, if additional entities are discovered to be required during the course of the event, or for investigations afterwards, it becomes very easy and straightforward to extend the L3 IP VPN to include these new members without the need for major re-architecting of the service. As shown below, investigatory units from both the police and fire departments are required after the event has transpired. At each agency, new memberships to the special L3 IP VPN environment are added, and the personnel assigned to the investigation units are provided access via centralized or distributed access controls. These virtual service networks are then added to the L3 IP VPN environment to facilitate their ability to communicate with the wider team. Note also that certain critical real-time elements, such as 911 dispatch, ambulance and emergency triage, are no longer required in the post event L3 VPN, so they are effectively dropped from the membership but can easily be added again if required. The main point in all of this is that unlike MPLS, which has very complicated and somewhat rigid provisioning practices that prohibit such dynamic behavior, SPB, due to its vast simplification of the protocol substrate, allows for quick re-provisioning of the network environment without the complexity. Indeed, the whole solution approach has a profound consequence: service membership becomes largely a practice of identity management and federation.

Figure 5. Post event forensics L3 VPN


When the world is falling apart…

As we have learned from various wide scale emergencies, both man-made and otherwise, such as NYC 9/11, Katrina and more recently Sandy, there is often significant infrastructure damage that occurs with a disaster event. Such damage can be a critical impediment to the responding emergency teams. Often, complex logistical data is provided by response data centers that correlate and filter information out to the field teams. Failures in the response center, or in the network path between the field teams and the response center, can cause a major set of logistical complications and possibly cost additional lives.

In one of my previous articles, titled “Data Storage: The Foundation and potential Achilles Heel of Cloud Computing”, I illustrated the critical importance of the data footprint and the requirement for mobile virtual machines to have access to these data stores regardless of location. Also, many applications are composite instances that are the result of several server exchanges on the data center back end. This is further complicated by the fact that, in order to provide a truly resilient data fabric, multiple data centers are required at geographically dispersed locations. As a result, these data stores need to be replicated and updated on a very consistent basis, sometimes up to a full data journaling or copy-on-write requirement. Additionally, virtual machines need to be migrated, or at the very worst, wholesale site recovery must be initiated. As this occurs, mapping to data stores must be preserved, including all required network paths. Also, as the migration of the VM, cluster or whole data center occurs, users will require adequate communication paths to seamlessly continue using the applications they need to do their jobs. The figure below illustrates these critical relationships and the communication paths required to facilitate them.

Figure 6. Required Services and Communication Paths

Interestingly, SPB provides a very optimal solution in that the networking technology is ‘topology aware’. As such, its convergence time is extremely fast, ranging in the hundreds of milliseconds. This includes not only layer 2 services like VLANs but layer 3 services such as IP VPNs and IP multicast as well. As major outages occur within the fabric, each individual SPB node will natively make the forwarding decision based on its shortest path knowledge of the network. If a path exists, SPB will use it. As the diagram below shows, several major outages can occur at multiple points in the end to end topology, but if the mesh fabric is engineered correctly, there will always be an alternate route available for use. As a result, whole regions of the network can fail without an overall failure of the network as a whole. Redundant links can be wireline (optical) or wireless, such as microwave. As long as they provide point to point communications links for the SPB nodes and allow for the protocol to establish adjacencies, they are candidate technologies for transport linkage.

Figure 7. Shortest Path Resiliency

As shown above, both data centers and users have valid communication paths available despite the fact that a good portion of the network is down. This is an important trait for reliable communication infrastructures, particularly those used during emergencies. Note that through all of this, the normal NG911 service runs as usual with no disruption or outage of call services.
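The "if a path exists, SPB will use it" behavior can be sketched with a plain Dijkstra computation over a tiny mesh: fail a link, recompute from the surviving topology, and an alternate route emerges with no central coordination. The four-node mesh below is invented; a real IS-IS link state database carries far more detail per adjacency.

```python
# Sketch of per-node shortest-path rerouting: each node can independently
# recompute Dijkstra over the links it still sees in its topology database.
import heapq

def shortest_path(links, src, dst):
    # links: {(a, b): cost}, treated as bidirectional adjacencies.
    adj = {}
    for (a, b), cost in links.items():
        adj.setdefault(a, []).append((b, cost))
        adj.setdefault(b, []).append((a, cost))
    dist, heap = {src: 0}, [(0, src, [src])]
    while heap:
        d, node, path = heapq.heappop(heap)
        if node == dst:
            return path
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, cost in adj.get(node, []):
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt, path + [nxt]))
    return None  # no surviving path at all

# A small square mesh: A-B-D and A-C-D are equal-cost alternatives.
mesh = {("A", "B"): 1, ("B", "D"): 1, ("A", "C"): 1, ("C", "D"): 1}
assert shortest_path(mesh, "A", "D") == ["A", "B", "D"]
# Fail the A-B link: the node reroutes via C without any outside help.
del mesh[("A", "B")]
assert shortest_path(mesh, "A", "D") == ["A", "C", "D"]
```

Because every node holds the same link state view, each one converges to a consistent forwarding answer on its own, which is what keeps convergence in the sub-second range.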

Give me the Bull Horn please

In emergencies, it is often very strongly desired to be able to broadcast alerts to all members of a given team or set of teams. This capability can increase the effectiveness of the field response teams and may very well save their lives. In the past, this feature was delivered via LMR, or Land Mobile Radio. While such technology still has valid use and is often gatewayed at the edge for voice communications, other packet based technologies can deliver richer information such as video and graphics, including weather and radar maps or building blueprints. The major limitation of these newer forms of wireless communication is that they require an IP multicast infrastructure, which is difficult to scale and support. Additionally, major network outages tend to adversely affect the multicast service, often to the point of rendering it unusable. As mentioned earlier, SPB can provide convergence of multicast services on the order of hundreds of milliseconds. This is accomplished by eliminating the typical protocol overlay model of networking shown in figure 3 and creating a collapsed route switching substrate, which is Shortest Path Bridging. As the network is shortest path tree aware, it is also multicast distribution tree aware. My previous article discusses multicast in SPB and the major advancements in scale, performance and convergence time it provides. The diagram below shows a more symbolic representation of a major alert going out from the response center, not only to the field response teams but to the NG911 PSAPs as well.

Figure 8. SPB Multicast used to provide all points alerts via multicast

With traditional networking technologies, this would be a very difficult proposition, requiring the interaction of multiple virtualized PIM domains within MPLS. With SPB, its inherent multi-tenant capabilities lend themselves to the easy distribution of multicast trees, each in a separate VPN environment within one network domain. Additionally, there is the benefit of sub-second convergence of the network in the event of failure or outage, which is fast enough to be totally transparent to the multicast services running over it, whether they are audio, video, graphics or data. These traits are highly desirable and lend themselves well to critical communications infrastructure. Real time services become much more reliable when a resilient, scalable networking technology is used as the infrastructure substrate. This also complements the various layers of resiliency that can be built into other functional parts of the end to end solution, such as servers and storage as well as whole data centers. The end result is a very strong resiliency plan that can withstand the worst of impacts and still survive… as long as valid SPB links exist.

In Summary

Let’s face it. No one ever wants to call 911. But when an emergency occurs, we are always thankful for the service it provides. Like many, I recall a time prior to the service. Many more have known nothing but it. As this critical civil service moves into the future and begins to leverage the new technologies available, it will become more and more important to pay attention to the network infrastructure that will support them. The ‘Cloud’ works by the reach of the network. The services remain up through the resiliency that the network provides. In reality, this is nothing new. Service providers have been using such practices for years. What is really new is that IEEE 802.1aq Shortest Path Bridging provides an infrastructure that is no longer out of reach for the many State and Local Governments now analyzing the network requirements for true NG911 and ESInet evolution.

I would like to thank my esteemed colleague Mark Fletcher, a fellow Avaya Engineer for his input and mentoring for this article. Mark has extensive experience in E911 and is an industry recognized expert in his field.

How would you like to do IP Multicast without PIM or RPs? Seriously, let’s use Shortest Path Bridging and make it easy!

June 8, 2012


Why do we need to do this? What’s wrong with today’s network?

Anyone who has deployed or managed a large PIM multicast environment will relate to the response to this question. PIM works on the assumption of an overlay protocol model. PIM stands for Protocol Independent Multicast, which means that it can utilize any IP routing table to establish a reverse path forwarding tree. These routes can be created with any independent unicast routing protocol such as RIP or OSPF, or even be static routes or combinations thereof. In essence, there is an overlay of the different protocols to establish a pseudo-state within the network for the forwarding of multicast data. As any network engineer who has worked with large PIM deployments will attest, they are sensitive beasts that do not lend themselves well to topology changes or expansions of the network delivery system. The key word in all of this is the term ‘state’. If it is lost, then the tree truncates and the distribution service for that length of the tree is effectively lost. Consequently, changes need to be done carefully and be well tested and planned. And this is all due to the fact that the state of IP multicast services is effectively built upon a foundation of sand.

The first major point to realize is that most of today’s Ethernet switching technology still operates with the same basic theory of operation as the original IEEE 802.1d bridges. Sure, there have been enhancements such as VLANs and tagged trunking that allow us to slice a multi-port bridge (which is what an Ethernet switch really is) into virtual broadcast domains and extend those domains outside of the switch and between other switches. But by and large, the original operational process is based on ‘learning’. The concept of a learning bridge is shown in the simple illustration below. As a port on a bridge receives an Ethernet frame, the bridge remembers the source MAC address as well as the port it came in on. If the destination MAC address is known, it will forward the frame out the port on which that address was last seen. As shown in the example below, source MAC “A” is received on port 1. As the destination MAC “B” is known to be on port 2, the bridge will forward accordingly.


Figure 1. Known Forwarding

But MAC “A” also sends out a frame to destination MAC “C”. Since MAC “C” is unknown to the bridge, it will flood the frame to all ports. As a result of the flooding, MAC “C” responds and is found to be on port 3. The bridge records this information in its forwarding information base and forwards frames accordingly from that point on. Hence, this method of bridging is known as ‘flood based learning’. As one can readily see, it is a critical function for normal local area network behavior. No one argues the value, or even the necessity, of learning in the bridged or switched environment. The problem is that the example above was circa 1990.

Figure 2. Unknown Flooding
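The flood-based learning just described fits in a few lines of code. This is a minimal sketch of the classic 802.1d behavior; the port numbers and MAC labels mirror the figures above rather than any real switch implementation.

```python
# Sketch of an IEEE 802.1d learning bridge: learn source MAC -> port on
# every frame, forward known destinations out one port, flood unknowns.

class LearningBridge:
    def __init__(self, ports):
        self.ports = set(ports)
        self.fib = {}  # forwarding information base: MAC -> last known port

    def receive(self, in_port, src_mac, dst_mac):
        self.fib[src_mac] = in_port              # learn the source
        if dst_mac in self.fib:
            return [self.fib[dst_mac]]           # known: forward to one port
        return sorted(self.ports - {in_port})    # unknown: flood all others

bridge = LearningBridge(ports=[1, 2, 3])
# MAC A on port 1 sends to unknown MAC C: the frame floods ports 2 and 3.
assert bridge.receive(1, "A", "C") == [2, 3]
# MAC C replies from port 3; the bridge learns it and unicasts toward A.
assert bridge.receive(3, "C", "A") == [1]
# From now on, A -> C is forwarded rather than flooded.
assert bridge.receive(1, "A", "C") == [3]
```

Note what is missing: there is no notion of state beyond this per-MAC cache, which is exactly the weakness the rest of the article develops.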

As the figure below shows, adding in VLANs and multi-port high speed switches makes things much more complex. The reality is that as the networking core grows larger, the switches in the middle get busier and busier. The forwarding tables need to become larger and larger, to the point where end to end VLANs are no longer tractable, so layer 3 boundaries via IP routing are introduced to segment the network domains. In the end, little MAC “A” is just one of the tens of thousands of addresses that traverse the core. In essence, there is no ‘state’ for MAC “A” (or any other MAC address, for that matter).

Figure 3. Unknown Flooding in a Routed VLAN topology

Additionally, recall that multicast is a destination address paradigm. IP multicast groups translate to destination MAC addresses at the Ethernet forwarding level. Because it is a destination address, there needs to be a resolution to a unicast source address. This is not a straightforward process. It involves the overlay of services on top of the Ethernet forwarding environment. These services provide for the resolution of the source, the building of a reverse path forwarding environment, and the joining of that path to any pre-existing distribution tree. In essence, these overlay services embed a sort of ‘state’ into the multicast forwarding service. These overlays are also very dependent on timers for the operating protocols, and on the fine tuning of these timers according to established best practice to maintain the state of the service. When this state is lost or becomes ambiguous, however, nasty things happen to the multicast service. This is the primary reason why multicast is so problematic in today’s typical enterprise environment.

The protocol most often used to establish unicast routing service is OSPFv2 or v3 (Open Shortest Path First – v2 for IPv4 and v3 for IPv6), which establishes the unicast routing tables for IP. OSPF runs over Ethernet and establishes end to end forwarding paths on top of the stateless, frame based flood and learn environment below. On top of this, PIM (Protocol Independent Multicast) is run to establish the actual multicast forwarding service. Source resolution is provided by a function known as an ‘RP’, or Rendezvous Point. This is an established service that registers sources for multicast and provides the ‘well known’ point within the PIM domain for source resolution. As a result, in PIM sparse mode all first joins to a multicast group from a given edge router are always via the RP. Once the edge router begins to receive packets, it is able to discern the actual unicast IP address of the sending source. With this information, the edge PIM router, or the designated router (DR), will then build a reverse path forwarding tree back to the source or to the closest topological leg of an existing distribution tree. At the L2 edge, end stations signal their interest in a given service via a protocol known as the Internet Group Management Protocol, or simply IGMP. In addition, most L2 switches can be aware of this protocol and allow for selective forwarding to interested receivers without flooding to all ports in a given VLAN. This process is known as IGMP snooping. In PIM sparse mode, the version of IGMP typically used is IGMPv2, which is non-source-specific. (This is *,G mode, where * means that the source address is not known.) Once the source is resolved via the RP, the state changes to S,G – where the source is now known. All of this is shown in the diagram below.


Figure 4. Protocol Independent Multicast Overlay Model
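To make the state transition concrete, here is a minimal Python sketch (not a real PIM implementation; the class, method names and addresses are purely illustrative) of how an edge router's entry for a group moves from (*,G) via the RP to (S,G) once the source is learned:

```python
class PimSmEdgeState:
    """Toy model of PIM-SM edge state; not a real protocol implementation."""

    def __init__(self, rp_address):
        self.rp = rp_address   # the Rendezvous Point for the PIM domain
        self.routes = {}       # group -> {'source', 'upstream'}

    def igmp_join(self, group):
        # First join: the source is unknown, so traffic is pulled down the
        # shared tree rooted at the RP -- the (*, G) state.
        self.routes[group] = {"source": "*", "upstream": self.rp}

    def first_packet(self, group, source_ip):
        # Once packets arrive via the RP, the real source is discerned and
        # the entry switches to source-specific (S, G) state, with the
        # reverse path now built toward the source itself.
        self.routes[group] = {"source": source_ip, "upstream": source_ip}

edge = PimSmEdgeState(rp_address="")     # illustrative RP address
edge.igmp_join("")                    # (*, G): upstream is the RP
print(edge.routes[""])
edge.first_packet("", "")
print(edge.routes[""])                # (S, G): upstream is the source
```

The fragility the article describes lives in exactly this hand-off: every router on the path must agree on which of the two states is current.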

As can be readily seen, this is a complex mix of technologies to establish a single service offering. As a result large multicast environments tend to be touchy and require a comparatively large operational budget and staff to keep running. Large changes to network topology can wreak havoc with IP multicast environments. As a result such changes need to be thought through and carefully planned out. Not all changes are planned, however. Network outages force topological changes that can often adversely affect the stability of the IP multicast service. The reason for this is the degree of protocol overlay and the need for correlation of the exact state of the network. As an example, a flapping unicast route could adversely affect an end to end multicast service. Additionally, this problem could be caused at the switch element level by a faulty link, port or module. Mutual dependencies in these types of solutions lend themselves to difficult troubleshooting and diagnostics. This translates to longer mean time to repair and overall higher operational expense.


 There must be a better way…

As we noted previously, IP multicast is all about state. Yet at the lowest forwarding element level the operational aspects are stateless. It seems that a valid path forward is to evolve this lowest level to become more stateful and deterministic in the manner in which traffic is handled. In essence, the control plane of Ethernet Switching needs to evolve.

Control Plane Evolution

IEEE has established a set of standards that allows for the evolution of the Ethernet switching control plane into a much more stateful and deterministic model. There are three main innovations that enable this evolution.

Link State Topology Awareness – IS-IS

Universal Forwarding Label –The B-MAC

Provisioned Service Paths – Individual Service Identifiers

This is all achieved by introducing link state protocol (IS-IS) to Ethernet switching as well as the concept of provisioned service paths. These innovations, when combined with a MAC encapsulation method known as MAC in MAC (IEEE 802.1ah) allow for a radical change to the Ethernet switching control plane without abandoning its native dichotomy of control and data forwarding within the network element itself. This means that the switch remains an autonomous forwarding element, able to make its own decisions as to how to forward data most effectively. Yet, at the same time the new stateful nature of the control plane allows for very deterministic control of the data forwarding environment. The end result is a vast simplification of the Ethernet control plane that yields a very stateful and deterministic environment. This environment can then optionally be equipped with a provisioning server infrastructure that provides an API environment between the switching network and any applications that require resources from it. As applications communicate their requirements through the API, the server instructs the network on how to provision paths and resources. Yet importantly, if the network experiences failures, the switch elements know how to behave and have no need to communicate back to the provisioning server. They will automatically find the best path to facilitate any existing sessions and will use this modified topology for any new considerations.  In this model the best of both worlds is found. There is deterministic control of network services, but the network elements remain in control of how to forward data and react to changes in network topology.

 Figure 5. Stateful topology with the use of IS-IS

This technology is known as Shortest Path Bridging, the IEEE standard 802.1aq. As its name implies, it is an Ethernet switching technology that switches by the shortest available path between two end points. The analogy here is to the IP link state routing protocols OSPFv2 for IPv4 and OSPFv3 for IPv6. In link state protocols each node advertises its state as well as any extended reachability. By these updates, each node gains a complete perspective of the network topology. Each element then runs the Dijkstra shortest path algorithm to identify the shortest loop free path to every point within the network.
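The shortest path computation itself is classic Dijkstra. Here is a minimal sketch (the four-switch topology and link metrics are made up for the example) of what each node computes from the shared link-state view:

```python
import heapq

def dijkstra(adjacency, root):
    """Compute loop-free shortest-path costs from 'root' to every node,
    as each SPB/IS-IS node does against its copy of the topology."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue                      # stale heap entry, skip
        for neighbor, cost in adjacency[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

# Illustrative four-switch topology with symmetric link metrics
topology = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1, "D": 5},
    "C": {"A": 4, "B": 1, "D": 1},
    "D": {"B": 5, "C": 1},
}
print(dijkstra(topology, "A"))  # {'A': 0, 'B': 1, 'C': 2, 'D': 3}
```

Because every node runs the same algorithm over the same database, all nodes independently agree on the same loop-free paths without any per-flow signaling.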

When one looks at the stateless methods of Ethernet forwarding and the need for antiquated protocols such as Spanning Tree, one cannot help but see link state as a path of promise. The problem is that OSPFv2 and OSPFv3 are 'monolithic' routing protocols, meaning that they were designed exclusively to route IP. IEEE knew this of course and found a very good link state protocol that was open and extensible. That protocol is IS-IS (Intermediate System – Intermediate System) from the OSI suite. One of the first areas of interest is that IS-IS establishes adjacencies with L2 Hellos, NOT L3 LSAs like OSPF. The second is that it uses extensible Type-Length-Values (TLVs) to move information between switch elements, such as topology, provisioned paths or even L3 network reachability. In other words, the switches are 'topology aware'. Once we have this stateful topology of Ethernet switches, we can determine what network path data are to take for different application services.
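The TLV idea is what makes IS-IS extensible: any new kind of information can ride in a typed, length-prefixed container, and nodes that do not understand a type simply skip it. A minimal sketch of the encoding (the type codes here are invented for illustration; real IS-IS TLV numbers are assigned by the standards):

```python
import struct

def encode_tlv(tlv_type, value: bytes) -> bytes:
    # One-byte type, one-byte length, then the value itself
    return struct.pack("!BB", tlv_type, len(value)) + value

def decode_tlvs(data: bytes):
    """Walk a buffer of concatenated TLVs; unknown types can simply be
    skipped by the receiver, which is the extensibility point."""
    tlvs, i = [], 0
    while i < len(data):
        t, length = struct.unpack_from("!BB", data, i)
        tlvs.append((t, data[i + 2:i + 2 + length]))
        i += 2 + length
    return tlvs

# Two made-up TLVs riding in one update
pdu = encode_tlv(1, b"area-49") + encode_tlv(2, b"reachability")
for t, v in decode_tlvs(pdu):
    print(t, v.decode())
```

This is how the same protocol machinery can carry Ethernet topology, provisioned service paths and L3 reachability side by side without redesign.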

The next step IEEE had to deal with was implementing a universal labelling scheme for the network that provides all of the information that a switch element needs to forward the data. Fortunately, there was a pre-existing standard, IEEE 802.1ah (MAC-in-MAC) that provides just this type of functionality. The standard was initially established as a provider/customer demarcation for metro Ethernet managed service offerings. The standard works on the concept of encapsulation of the outer edge (customer) Ethernet frame (C-MAC) into an inner core (provider) frame (B-MAC) that is transported and then stripped off on the other end of the inner core to yield a totally transparent end to end service. This process is shown in the illustration below.


Figure 6. The use of 802.1ah B-MAC as a universal forwarding label in conjunction with IS-IS

The benefits of this model are the immense amount of scalability and optimization that happens in the network core. Once a data frame is encapsulated, it can be transported anywhere within the SPB domain without the need to flood and learn. This is accomplished by combining 802.1ah and IS-IS together with another modification and extension of virtualization, which we will cover next.
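The encapsulation step can be sketched very simply (field layout is simplified here; a real 802.1ah frame carries more fields, and the names are illustrative):

```python
def encapsulate(customer_frame, b_src, b_dst, i_sid):
    """Wrap a customer (C-MAC) frame in a backbone (B-MAC) header at the
    ingress edge, as 802.1ah MAC-in-MAC does."""
    return {"b_dst": b_dst, "b_src": b_src, "i_sid": i_sid,
            "payload": customer_frame}

def decapsulate(backbone_frame):
    # The egress edge strips the backbone header, yielding a transparent
    # end to end service: the customer frame emerges untouched.
    return backbone_frame["payload"]

c_frame = {"c_dst": "00:aa:bb:cc:dd:01", "c_src": "00:aa:bb:cc:dd:02",
           "data": b"hello"}
b_frame = encapsulate(c_frame, b_src="BEB-1", b_dst="BEB-2", i_sid=200)
# Core switches forward on b_dst and i_sid only; the C-MACs are opaque payload.
print(b_frame["i_sid"])           # 200
print(decapsulate(b_frame) == c_frame)  # True
```

The key design point is that the core never inspects the inner frame, which is why core forwarding tables stay small regardless of how many end stations exist at the edge.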

Recall that IS-IS allows for the establishment of adjacencies at the L2 Hello level and that information moves through these updates by the use of Type-Length-Values, or TLVs. As we pointed out earlier, some of these TLVs are used for network reachability of those adjacencies. These adjacencies are all based on the B-MACs of the SPB switches within the domain. Only those addresses are populated into the forwarding information databases, at the establishment of adjacency and the running of the Dijkstra algorithm to establish loop-free shortest paths to every point on the network. As a result, the core Link State Database (LSDB) is very small and is only updated on new adjacencies such as new interfaces or switches. The important point is that it is NOT updated with end system MAC addresses. As a result, a core can support tens of thousands of outer C-MACs while only requiring a hundred or so B-MACs in the network core. The end result is that any switch in the SPB network can look at the B-MAC frame and know exactly what to do with it, without the need to flood and learn or reference some higher level fabric controller.

There is one last thing required, however. Remember that we still need to learn MACs. At the edge of the SPB network we need to assume that there are normal IEEE 802.3 switches and end systems that need to be supported. So how does one end system establish connectivity across the SPB domain without flooding? This is where the concept of constrained multicast comes in. The simplest way to discuss constrained multicast is based on the concept of provisioned service paths. These provisioned paths, or I-SIDs (Individual Service Identifiers), are similar to VLANs in that they contain a broadcast domain, but they operate differently as they are based on subsets of the Dijkstra forwarding trees mentioned previously. As the example below shows, when a station wishes to communicate with another end system, it simply sends out an ARP request. That ARP request is then forwarded out to all required points for the associated I-SID.


Figure 7. The ‘Constrained Multicast’ Model using 802.1ah and IS-IS

The end system on the other side receives the request and then responds, establishing a unicast session over the same shortest path. As a result, the normal Ethernet 'flood and learn' process can still be facilitated on the outside of the SPB domain without the need to flood and learn in the core. This vastly simplifies the network core, allows for deterministic forwarding behavior and provides for the ability to offer separated virtual network services. The reason for this is shown in the diagram below, with a little better detail on the B-MAC for SPB and the legacy standards that it builds upon. As can be seen, the concept of the I-SID is a pseudo evolution of the parent Q tag in the 802.1Q-in-Q standard. The I-SID value is contained within the actual B-MAC frame and consequently tells a core switch everything it needs to know, including whether or not it needs to replicate the frame for constrained multicast functionality. Note that the two most difficult problems of multicast distribution are solved: the first being source resolution and the second being the RPF build.


Figure 8. IEEE 802.1ah and its relation to other ‘Q’ standards
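The 'constrained' part of constrained multicast can be sketched as a replication rule: a core switch copies a frame only to the ports that the Dijkstra tree for that frame's I-SID covers, rather than to every port in a VLAN. (The port names and I-SID-to-port mappings below are invented for illustration.)

```python
class SpbCoreSwitch:
    """Toy model of constrained multicast replication in an SPB core node."""

    def __init__(self):
        # I-SID -> this node's ports on the shortest-path tree for that service
        self.isid_ports = {}

    def provision(self, i_sid, ports):
        self.isid_ports[i_sid] = set(ports)

    def forward(self, frame, in_port):
        # Replicate only along the tree for this I-SID, never back out the
        # ingress port -- no VLAN-wide flood, no RPF check needed.
        ports = self.isid_ports.get(frame["i_sid"], set())
        return sorted(ports - {in_port})

sw = SpbCoreSwitch()
sw.provision(i_sid=200, ports=["1/1", "1/2", "2/1"])
arp = {"i_sid": 200, "data": b"who-has"}
print(sw.forward(arp, in_port="1/1"))  # ['1/2', '2/1']
```

An unknown I-SID yields an empty port set, which is the opposite of classic Ethernet behavior: the default is silence, not a flood.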

Once these technologies were merged together into a cohesive standard framework known as IEEE 802.1aq Shortest Path Bridging (MAC-in-MAC), or SPBm, the result is a very stateful and scalable switching infrastructure that lends itself very well to the building and distribution of multicast services. In addition, SPB can offer many other different types of services, ranging from full IP routing to private IP VPN services, all provisioned at the edge as a series of managed services across the network core. With these layer three services comes the need for the distribution of multicast services across L3 boundaries. This is true L3 IP multicast routing. Interestingly, SPBm provides some very unique approaches to solving the problem. Again, let us take note that the two most important problems have already been solved.

The figure below shows an SPBm network that is providing multicast distribution between two IP subnets. One of the subnets is a L2 VSN (an I-SID that is associated with VLANs). The other subnet is a peripheral network that is reachable by IP shortcuts via IS-IS. Note that as a stream becomes active in the network, the BEB that has the source dynamically allocates an I-SID to the multicast stream, and that information becomes known via the distribution of IS-IS TLVs. At the edge of the network the Backbone Edge Bridges (BEBs) are running IGMP snooping out to the L2 Ethernet edge. The edge SPB BEB in effect becomes the querier for the L2 edge. As receivers signal their interest in a given IP multicast group, they are handled by the BEB to which they are connected, which searches the IS-IS LSDB (Link State Database) for advertisements of the multicast stream within the context of the VSN to which the receiver belongs. Once the BEB advertising the stream and the I-SID are found in the LSDB, the BEB connected to the receiver uses standard ISIS-SPB TLVs to receive traffic for the stream. The dynamically assigned I-SID values start at 16,000,001 and work upward. Provisioned services use values less than 16,000,000. In the case of the L3 traversal, the I-SID is dynamically extended to provide for the build of the L3 multicast distribution tree. 802.1aq supports up to 16,777,215 I-SIDs.

Figure 9. IP Multicast with SPB/IS-IS using IP Shortcuts and L2 VSN

As the diagram above shows, for an end station to receive multicast from the source, the network merely uses this dynamic I-SID to extend the service to end stations that are members of the same subnet over the L2 VSN. Conversely, a receiver will use the same dynamic I-SID, built using the information provided by IS-IS, to establish the end to end reverse forwarding path. In this model, IP multicast becomes much more stateful and integrated into the switch forwarding element. This results in a far greater build out capacity for the multicast service. It also provides for a much more agile multicast environment when dealing with topology changes and network outages. Switch element failures are handled with ease because the layered mutual dependence model has been removed. If a failure occurs within the core or edge of the network, the service is able to heal seamlessly due to the fact that the information required to preserve service is already known by all of the elements involved. Because the complete SPBm domain is topology aware, each switch member knows what it has to do in order to maintain established service. As long as a path exists between the two end points, Shortest Path Bridging will use it to maintain service. This is the result of true integration of link state routing into the Ethernet forwarding control plane.
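The I-SID numbering just described (provisioned services below 16,000,000, dynamic streams from 16,000,001 up to the 24-bit ceiling of 16,777,215) can be sketched as a simple allocator. This is a hypothetical model of the bookkeeping; a real BEB manages it internally:

```python
DYNAMIC_START = 16_000_001   # first dynamically assigned stream I-SID
ISID_MAX = 16_777_215        # 2**24 - 1, the 24-bit I-SID field limit

class IsidAllocator:
    """Toy model of dynamic I-SID assignment for active multicast streams."""

    def __init__(self):
        self.next_dynamic = DYNAMIC_START

    def allocate_stream(self):
        if self.next_dynamic > ISID_MAX:
            raise RuntimeError("dynamic I-SID space exhausted")
        i_sid = self.next_dynamic
        self.next_dynamic += 1
        return i_sid

alloc = IsidAllocator()
print(alloc.allocate_stream())   # 16000001
print(alloc.allocate_stream())   # 16000002
print(ISID_MAX == 2**24 - 1)     # True
```

Keeping the provisioned and dynamic ranges disjoint means an operator can tell at a glance whether an I-SID in a trace was configured or created on demand for a stream.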

What goes on behind closed doors…

In addition to providing constrained and L3 multicast, SPB also provides for the ability to deliver 'ship in the night' IP VPN environments. With SPBm's native capabilities it becomes very easy to extend multicast distribution into these environments as well. Normally, multicast distribution within an IP VPN environment is notoriously complex, dealing with yet more overlays of technology. Within SPBm networks, however, the task is comparatively simple. As the diagram below illustrates, a L3 VSN (IP VPN) is nothing more than a set of VRFs that are associated with a common I-SID. Here we run IGMP on the routed interfaces that connect to the edge VLANs. Note that IGMP snooping is not used here, as the local BEB interface will be a router. IGMP, SPB and IS-IS perform as before, and the dynamic I-SID simply uses the established Dijkstra path to provide the multicast service between the VRFs. Important to note though is that this service is invisible to the rest of the IP forwarding environment. It is a dark network that has no routes in and no routes out. Such networks are useful for video surveillance networks that require absolute separation from the rest of the networking environment. Note though that some services may be required from the outside world. This can be accommodated by policy based routing.


Figure 10. IP Multicast with SPB/IS-IS using L3 VPN

As the figure illustrates, the users within the L3VSN have access to a set of subnets within the network, which is useful for services that require complete secure isolation such as IP multicast based video surveillance. The end result is a very secure closed system multicast environment that would be very difficult to build with legacy technology approaches.

I can see clearly now…

Going back to figure 4, which illustrates the legacy PIM overlay approach, we see that there are several demarcations of technology that tend to obscure the end to end service path. This creates complexities in troubleshooting and overall operations and maintenance. Note that at the edge we are dealing with L2 Ethernet switching and IGMP snooping; then we hop across the DR to the world of OSPF unicast routing. Over this, and at the same demarcation, we have the PIM protocol. Each demarcation and layer introduces another level of obscurity where the service has to be 'traced and mapped' into each technology domain. As a result, intermittent multicast problems can go on for quite some time until the right forensics are gathered to resolve the root cause of the problem.

With SPB, many if not all of these demarcations and overlays are eliminated. As a result, something that is somewhat of a Holy Grail in networking occurs. This is called 'services transparency'. The end to end network path for a given service can be readily established and diagnosed without referring to protocol demarcations and 'stitch points'. As previously shown, IP multicast services are a primary beneficiary of this network evolution. The elimination of protocol overlays provides for a stateful data forwarding model at the level where it makes the most sense: at the data forwarding element itself.

Network diagnostics become vastly simplified as a result. Measuring end to end latency and connectivity becomes a very straightforward endeavor. Additionally, diagnosing the multicast service path, something that is notoriously nasty with PIM, becomes very straightforward and even predictable. Tools such as IEEE 802.1ag and ITU Y.1731 provide diagnostics on network paths, end to end and nodal latencies, and all of this can be established end to end along the service path without any technology demarcations.

In Summary

IEEE 802.1aq Shortest Path Bridging is proving itself to be much more than a next generation data center mesh protocol. As previous articles have shown, the extensive reach of the technology lends itself well to metro and regional distribution as well as true wide area use. Additional capabilities added to SPB, such as the ability to deliver true L3 IP multicast without the use of a multicast routing overlay such as PIM, clearly demonstrate the extensibility of the protocol as well as its extremely practical implementation uses. The convergence of the routing intelligence directly into the switch forwarding logic results in an environment which can provide for extremely fast (sub-second) stateful convergence, which is of definite benefit to the IP multicast service model. As such, IP multicast environments can benefit from enhanced state, which in turn results in increased performance and scale.

End to end services transparency provides for a clear diagnostic environment that eliminates the complexities of protocol overlay models. This drastic simplification of the protocol architecture results in the ability for direct end to end visibility of IP multicast services for the first time.

So when someone asks “IP Multicast without PIM? No more RP’s?” You can respond with “With Shortest Path Bridging, of course!”

I would also urge you to follow the blog of my esteemed colleague Paul Unbehagen, chair and author of the IEEE 802.1aq "Shortest Path Bridging" standard. You can find it at:


For more information please feel free to visit

Also please visit our VENA video on YouTube that provides further detail and insight. You can find this at:


Seamless Data Migration with Avaya’s VENA framework

November 23, 2011

There are very few technologies that come along which actually make things easier for IT staff. This is particularly true with new technology introductions. Very often, the introduction of a new technology is problematic from a systems service up time perspective. With networking technologies in particular, new introductions often involve large amounts of intermittent down time and a huge amount of human resources to properly plan the outages and migration processes to assure minimal down time. More so than any other, network core technologies tend to be the most disruptive due to their very nature and function. MPLS is a good example: it requires full redesign of the network infrastructure as well as very detailed design within the network core itself to provide connectivity. While some argue that things like MPLS-TP help to alleviate this, it is not without cost – and the disruption remains.

IEEE 802.1aq or Shortest Path Bridging (SPB for short) is one of those very few technologies that can be introduced in a very seamless fashion with minimal disruption or down time. It can also be introduced with minimal redesign of the existing network if so desired. A good case in point is a recent project that we have been working on with a large health care provider in the northeast US. This was a long time Avaya networking customer who had an installed base of existing ERS 8600 routing switches. There was a particular portion of the topology that interconnected the customer's two data centers, which were located in separate geographic locations. This was the portion of the network topology that they chose to upgrade and introduce shortest path bridging.

The original intention was to upgrade the existing backbone switches to code that could support shortest path bridging (v7.1). They would then build out a parallel routed core in the resulting new ISIS routing plane. The ISIS environment would be kept latent and secondary by setting its global route preference to a value less preferred than OSPF. Typically, this value is set at 130 (the default for ISIS is 7, and lower values are preferred). Once the parallel routed core is built out as a mirror to OSPF, the systems are checked for validity, and once assured of stability, the preference of ISIS is then reset back to its default value of 7. ISIS then becomes the primary routed plane and OSPF is relegated to a secondary role. After system checks and validation, the OSPF network can be kept as secondary for as long as required. Then, at a later point in time, it can be decommissioned to leave ISIS as the sole core routing protocol for the enterprise core. This is a very seamless migration that provides for zero downtime to the overall networking core.
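The cutover mechanism above boils down to route preference selection: the route table prefers whichever protocol carries the lowest preference value. A minimal sketch (the OSPF preference value of 20 here is illustrative, not taken from the project):

```python
def best_route(candidates):
    """Pick the winning protocol from (protocol, preference) pairs;
    the lowest preference value wins, as in the ERS route table model."""
    return min(candidates, key=lambda c: c[1])[0]

# During migration: IS-IS preference raised to 130, so it stays latent
during_migration = [("ospf", 20), ("isis", 130)]
# After cutover: IS-IS restored to its default of 7, so it becomes primary
after_cutover = [("ospf", 20), ("isis", 7)]

print(best_route(during_migration))  # 'ospf'
print(best_route(after_cutover))     # 'isis'
```

The appeal of this approach is that the cutover is a single preference change, and it is just as easy to flip back if validation fails.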

After a survey of the equipment, however, it became obvious that due to its age (circa 2000-2001) and slot density requirements, it would need to be completely upgraded – including the switch chassis. Rather than view this as an impediment, we quickly realized that by implementing a parallel routed core infrastructure the upgrade and migration of the critical path could be accomplished with little or no down time to the network core. This was in comparison to a gradual swap out and upgrade of the existing core, which would have meant multiple outage occurrences for each chassis swap out.

The theory was based on the diagram below, which shows the existing OSPF routed core running in parallel to a new SPB based ISIS routed core. By using a series of migration techniques which we will cover shortly, both routed cores would work in tandem with networks gradually migrated over to the new ISIS routed core in a controlled and phased approach.

Figure 1. Parallel OSPF and ISIS routed cores


The first step in the project was to account for the various VLANs that were provisioned in the existing OSPF routed core. Part of this was to identify which of two types each belonged to. The first was VLANs that did not traverse the routed core by the use of Q-tagged trunks. These we identified as 'peripheral VLANs' in that the only Q-tagged trunks that they ran over were along the edge and over the SMLT 'Inter-Switch Trunks'. The second type was a VLAN that existed in multiple places in the routed core and hence traversed the routed core by the use of Q-tagged trunks. These we labeled as 'traversal VLANs'. Figure 2 illustrates the difference between the two VLAN types. This was an important step in the investigation because, as one will see, it largely determined the migration method for a given VLAN.
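The survey step amounts to a simple classification rule: does the VLAN ride any Q-tagged trunk that crosses the routed core, or only edge and IST trunks? A sketch of that audit (trunk names, VLAN IDs and the core trunk set are all invented for illustration):

```python
# Hypothetical set of Q-tagged trunks that cross the routed core
CORE_TRUNKS = {"core-1/2", "core-3/4"}

def classify(vlan_trunks):
    """'traversal' if the VLAN rides any core-crossing trunk,
    'peripheral' if it only touches edge and IST trunks."""
    return "traversal" if vlan_trunks & CORE_TRUNKS else "peripheral"

vlans = {
    100: {"edge-A", "ist-1"},              # edge + IST only
    200: {"edge-B", "ist-1", "core-1/2"},  # crosses the routed core
}
for vid, trunks in vlans.items():
    print(vid, classify(trunks))
# 100 peripheral
# 200 traversal
```

Peripheral VLANs could later be moved with IP shortcuts alone, while traversal VLANs needed L2 VSNs to preserve their multiple points of presence, so getting this audit right up front determined the migration method per VLAN.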
As is noted in other white papers, SPB offers various provisioning options. These are listed below for the convenience of the reader.

                L2 Virtual Service Network

This is a provisioned path across SPB, known as an I-SID in IEEE terms, that inter-connects VLANs at the SPB edge. Taken as such it can be termed a VLAN extension method somewhat analogous to Q-tagged extensions.

                L3 Virtual Service Network

This is a provisioned path across SPB, known as an I-SID in IEEE terms, that inter-connects VRFs at the SPB edge. Taken as such it can be termed an IP VPN method somewhat analogous to VRF lite.

Inter-VSN routing

This is a method of interconnecting Virtual Service Networks by the use of external routers or other devices. A good usage example is in a data center topology where user or 'dirty' VSNs interconnect to data center or 'clean' VSNs by the use of security perimeter technologies such as firewalls and intrusion protection type devices.

IP Shortcuts

This final method does not involve the use of VSNs at all but instead works on the injection of IP routing directly into ISIS, utilizing ISIS as an actual interior gateway protocol or IGP.


For the purposes of the migration we chose to use a combination of IP shortcuts in order to implement ISIS as the replacement core routed topology and L2 VSN’s to facilitate the connectivity to support the ‘traversal VLAN’s’ which would require multiple points of presence across the routed core.

In essence the network core migration involved three major steps:

1). Build out parallel network segments that match in almost every sense topologically. The new segment will run ISIS/SPBm as its core protocol. A migration link will be set up between the two routed domains to provide for a communication path during the migration. This link will be an MLT configuration for both bandwidth capacity and resiliency.

2). Redistribute VLAN’s and IP routes into the SPBm ISIS core on a switch by switch and VLAN by VLAN basis. Both ISIS and OSPF routing domains will be utilized throughout the migration process.

3). After all network migrations are completed the OSPF network core is to be dismantled.

If properly orchestrated and implemented, we strongly felt that this could be accomplished with zero network downtime for the local core network. There would however be short outages for each switch as it is migrated over to the SPBm/ISIS core. There would also be short outages for the individual VLAN’s during the final migration steps over to the new ISIS core. These however would be minimal and could also be scheduled during opportune windows that the IT staff had on a regular basis. The rest of this document will provide a more detailed outline of the three project phases listed above.

The diagram below illustrates the various types of VLANs and how they relate to the overall parallel routed cores. Note that with the introduction of SPB there is an additional type of VLAN (subnet) that is introduced, which is a traversal VLAN that is in the process of migrating to the new routed core but still uses OSPF as its IGP. This required a number of items to work successfully. First we needed to interconnect the VLAN by the use of L2 VSNs (I-SIDs) across the SPB ISIS routed core. This provided connectivity, but because of the L2 nature of the extension back into the OSPF environment, it did NOT involve the use of ISIS in an L3 sense. Additionally, we added OSPF to ISIS and ISIS to OSPF redistribution at the migration link interface between the new and existing cores. This provided the ability for the migrating VLAN (subnet) to have routed connectivity into the new ISIS routed core via redistribution but still use OSPF as its IGP. As the resident switches and systems were migrated over, the VLAN (subnet) would eventually be redistributed direct into ISIS and effectively decommissioned from the OSPF routed core. Again, by the use of the OSPF to ISIS and ISIS to OSPF redistribution, the completely migrated network would still have connectivity over to the older OSPF routed core and vice versa. With the exception of the actual movement of switches and the decommissioning of the subnet from OSPF and redistribution into ISIS, the network downtime would be zero. More importantly, there would never be a time when the network core was not functional in a holistic sense.

Figure 2. Various migration VLAN types

Taking a closer look at the ISIS side in the illustration below will provide a better feel for the actual topology in action. As noted in the diagram, we show the three VLAN types in the new SPB ISIS environment. First, the completely migrated dual homed VLAN is simply redistributed into ISIS and routed accordingly. Because it is provisioned as a Q-tagged VLAN over the edge SMLT IST, there is no use of VSNs; the peripheral VLAN is simply redistributed direct into ISIS by the use of IP shortcuts.

In the case of the traversal VLANs, the illustration shows VLAN A, which is a completely migrated traversal VLAN that is set up with VRRP Master Backup at various points for router redundancy. The VLAN (subnet) is then redistributed direct into the ISIS routed core by the use of IP shortcuts. This provides for the multiple points of presence required in the routed core by the use of L2 VSNs, and for the IP connectivity into the ISIS routed core by the use of IP shortcuts. The third VLAN type (VLAN C) is a migrating VLAN (subnet). As pointed out above, this is a VLAN that is extended over from the old routed core by the use of Q-tagging (old side) and L2 VSNs (new side). As the diagram also shows, the migrating VLAN C (subnet) will continue to use OSPF as its routing protocol until all systems are moved over to the new core. At that point in time, the subnet is decommissioned in OSPF and redistributed direct into ISIS, and will mirror VLAN A with the exception that there will be four VRRP instances and not two.

Figure 3. A closer view of the new ISIS core and various VLAN types

Normally in such a scenario, one would have to deal with prefix lists and route policies to suppress the advertisement of the networks on one side or the other, as they are co-resident in both routed cores. We were able to avoid this by simply not assigning IP addresses to the VLANs in the new core during the migration. Without IP addresses, the VLANs would simply not be redistributed direct into ISIS, and all systems connected to the subnet would use OSPF for all IP routing until the final migration step.

Prior to the actual migration project we thought it prudent to test out the migration scenario as well as use the environment to provide knowledge transfer to the customer. As a result we set up an OSPF environment in the lab prior to actual deployment that looked like the topology below in figure 4. Note that both VLAN types (peripheral and traversal) are represented in the diagram. The switch in the lower left hand portion of the illustration provided the OSPF routed environment in the lab test. Note that OSPF is also supported on the SPB core in the form of an ASBR function. In this example, the lab subnets have connectivity to one another via OSPF, with a dedicated subnet at the migration link serving as the redistribution subnet, and the remaining subnets reaching each other by the use of OSPF to ISIS and ISIS to OSPF redistribution. In summary, all IP subnets have routed connectivity to one another.

Figure 4. Existing provisioned SPB ISIS core.

The next step was to introduce a migration VLAN into the lab test. We did this by creating a new VLAN on the OSPF side and assigning it an IP address. As the figure below shows, we were able to extend that VLAN out across the migration link by the use of Q-tags on the OSPF side and L2 VSNs in the new SPB ISIS core. We then emulated system moves over to the new core. Note that during this time the migrating VLAN utilized the OSPF protocol as its IGP. As a final step to the migration, the VLAN was deleted from the OSPF environment, including any Q-tag extensions, and was then assigned IP addresses and VRRP Master Backup on the ISIS side and redistributed direct into ISIS.

Figure 5. Migration VLAN case point example

The migration steps can be summarized as follows:

1). VLAN is extended over migration MLT from OSPF side

2). VLAN is assigned at required points of presence. NO IP addresses configured yet!

3). Add port members as required and create I-SID to connect VLANs together

4). Migration can now proceed (systems are moved over to new core)

5). Upon completion, decommission network from legacy OSPF side (short outage)

6). Assign IP addresses at the required VLAN POP’s, set up & enable VRRP with Backup Master

7). Remove VLAN from migration MLT (clean up)

The following shows the CLI sequence to perform these steps. Note that ISIS redistribute direct is already set up in the environment. For clarity and reference, the redistribution method for DC3-8800-1 is shown below:





ISIS to OSPF redistribution.

ip ospf redistribute isis create

ip ospf redistribute isis enable

ip ospf redistribute direct create

Direct to OSPF redistribution. The “suppress_IST” route policy is used to not advertise the IST subnet

ip ospf redistribute direct route-policy “suppress_IST”

ip ospf redistribute direct enable

OSPF to ISIS redistribution.

ip isis redistribute ospf create

ip isis redistribute ospf metric 1

ip isis redistribute ospf enable

ip isis redistribute direct create

ip isis redistribute direct metric 1

Direct to ISIS redistribution. The “suppress_NETS” route policy is used to not advertise the IST subnet as well as others that may require suppression during migration

ip isis redistribute direct route-policy “suppress_NETS”

ip isis redistribute direct enable




Simple accept policy to ignore advertisements coming from its IST peer (DC3-8800-2). This avoids suboptimal IP routes.



ip ospf accept adv-rtr create

ip ospf accept adv-rtr enable

ip ospf accept adv-rtr route-policy “reject”

As a result of the above, as soon as the VLAN is assigned IP addressing and VRRP with Backup Master it will have routed connectivity into ISIS; no other steps are required. Also note that the subnet needs to be decommissioned in OSPF BEFORE being provisioned into the ISIS environment. This will involve a short outage (minutes) for the given subnet. It will then have connectivity back into the OSPF side by the use of the route redistribution occurring at the migration link point, which again has already been configured as per the above.
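Conceptually, the suppression route policies shown above act as a deny list applied during redistribution: a direct route is advertised only if it is not on the list. A minimal Python sketch of that behavior (the subnet values are illustrative, not taken from the configuration above):

```python
# Hypothetical model of a "suppress" route policy: direct routes are
# redistributed only if they do not match a denied prefix.
from ipaddress import ip_network

def routes_to_redistribute(direct_routes, suppressed):
    """Return the direct routes that pass the suppression policy."""
    deny = {ip_network(s) for s in suppressed}
    return [r for r in direct_routes if ip_network(r) not in deny]

# The IST peering subnet must never be advertised, so the policy drops it.
direct = ["10.0.3.0/24", "10.0.4.0/24", "192.168.255.0/30"]
advertised = routes_to_redistribute(direct, ["192.168.255.0/30"])
# advertised -> ["10.0.3.0/24", "10.0.4.0/24"]
```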

1). Set up VID 3 on the MLT (new side) both DC3-8800-1 & DC3-8800-2 (assuming this is done on the 5510)

            config vlan 3 create byport 1 name "TEST_MIG1"

            config vlan 3 add-mlt 2

2). Set up port & I-SID configuration. Both DC3-8800-1&2

            config vlan 3 ports add <members> (i.e. 10/9-10/10)

            config vlan 3 i-sid 3

3). Set up port & I-SID configuration on each required DC1-8800-1 & DC1-8800-2.

config vlan 3 create byport 1 name "TEST_MIG1"

config vlan 3 ports add <members> (i.e. 10/9-10/10)             

config vlan 3 i-sid 3

4). Migrate systems as appropriate (note – during migration the subnet still uses OSPF, since no IP addresses are yet assigned on the ISIS side)

5). Once migration is complete, the VLAN (VID 3) is decommissioned from the 5510 (legacy OSPF environment)

6). Assign IP addresses. Enable VRRP and Backup Master on DC3-8800-1&2 and DC1-8800-1&2

            config vlan 3 ip create 10.0.13.*/

            config vlan 3 ip dhcp-relay enable

            config vlan 3 ip vrrp 3 address

            config vlan 3 ip vrrp 3 backup-master enable

            config vlan 3 ip vrrp 3 enable


* is 3, 4 or 5 as required.


MIGRATION IS COMPLETE! The subnet should now be visible to the 5510 across the OSPF-ISIS redistribution point. Every subnet will have routed connectivity to the others.







            As can be seen by the example provided here, what can be a very complex migration project is greatly simplified into a concise set of simple steps by the use of Shortest Path Bridging and VENA. The OpEx improvements when compared to other network virtualization technologies such as MPLS are substantial. Moreover, network downtime is predictable, controllable and very short in comparison.

            Avaya’s VENA architecture facilitates a flexible yet powerful infrastructure that allows for this type of capability. It is also important to note that only a subset of the network services offered by VENA is used in this case point example. Very few technologies can claim such ease of introduction and actually ease the migration that they themselves require in order to be effectively used.

For more information please feel free to visit

Also please visit our VENA video on YouTube, which provides further detail and insight. You can find this at:

 Happy Holidays to all!

With the very best wishes for the New Year!



Next Generation Mesh Networks

June 10, 2011


The proper design of a network infrastructure should allow for a number of key traits that are very desirable in an overall network design. First, the infrastructure needs to provide redundancy and resiliency without a single point of failure. Second, the infrastructure must be scalable in both geographic reach as well as bandwidth and throughput capacity.

Ideally, as one facet of the network is improved, such as resiliency, it should also improve bandwidth and throughput capacity as well. Certain technologies work on the premise of an active/standby method: there is one primary active link, and all other links are in a standby state that only becomes active upon the primary link’s failure. Examples of this kind of approach are 802.1D Spanning Tree and its descendants, Rapid and Multiple Spanning Tree, in the Layer 2 domain, and non-equal-cost distance vector routing technologies such as RIP.

While these technologies do provide resiliency and redundancy, they do so on the assumption that half of the network infrastructure sits unused and that a state of failure needs to occur in order to leverage those resources. As a result, it becomes highly desirable to implement active/active resiliency wherever possible so that these resources can be used in the day-to-day operation of the network.


Active/Active Mesh Switch Clustering


The figure below illustrates a very simple active/active mesh fabric. As in all redundancy and resiliency methods, topological separation is a key trait. As shown in the diagram below, the two bottom switches are interconnected by a type of trunk known as an ‘inter-switch trunk’ or IST, which allows for the virtualization of the forwarding database across the core switches. The best and most mature iteration of this technology is Avaya’s Split Multi-Link Trunking or SMLT. First introduced in 2001 and now moving into its 3rd generation, this effectively creates a virtualized switch that is viewed as a single switch by the edge switches in the diagram. Because of this, the edge switches can utilize de facto or industry standard multiple-link technologies such as Multi-Link Trunking (MLT) or link aggregation (LAG) respectively. Because the virtualized switch cluster appears as a single chassis, these links can be dual homed to the two different switches at the top of the diagram to deliver active/active load-balanced connectivity out to the edge switches.


Fig.1 A simple Active/Active Mesh Switch Topology

 Because all links are utilized, there is far better utilization of network resources. Additionally, because of this active/active mesh design, the resiliency and failover times offered are significantly faster than comparable active/standby methods.
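The active/active load sharing described above can be sketched as a per-flow hash: each flow is deterministically pinned to one member link, so every link carries traffic and only flows on a failed link move. This is an illustrative model only, not Avaya’s actual MLT/SMLT hashing algorithm; the port names and MAC addresses are hypothetical.

```python
# Illustrative per-flow hashing for an active/active MLT/LAG bundle.
import zlib

def pick_link(src_mac: str, dst_mac: str, active_links: list) -> str:
    """Map a MAC pair onto one of the currently active member links."""
    h = zlib.crc32(f"{src_mac}->{dst_mac}".encode())
    return active_links[h % len(active_links)]

# Four uplinks dual homed across both switches of the cluster.
links = ["1/1", "1/2", "2/1", "2/2"]
chosen = pick_link("00:1b:4f:aa:01:02", "00:1b:4f:bb:03:04", links)

# On a link failure, only the flows on the failed link are re-hashed onto
# the survivors; no spanning tree convergence is involved.
survivors = [l for l in links if l != chosen]
rerouted = pick_link("00:1b:4f:aa:01:02", "00:1b:4f:bb:03:04", survivors)
```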

While the diagram above illustrates a very simple triangulated topology, active/active mesh designs can become much more sophisticated, such as box, full mesh and mesh ladder topologies. These additional topologies are shown in the diagram below. The benefit of these is that as the network topology is extended, both resiliency and capacity need not be sacrificed.

                                        box                   full mesh              ladder mesh

Fig. 2 Extended Active/Active Mesh Topologies

 As can be seen in the diagram above, these topologies can be very sophisticated and provide a very high degree of resiliency while enhancing the overall capacity of the network.


Topological Considerations for Active/Active Mesh Designs –


Most network topologies consist of various regions that provide certain functions. Depending on the region, there may be different features required that are specific to that region. As an example, within the network core, high-capacity load-sharing trunks are a requirement, whereas at the network edge, features like Power over Ethernet (PoE) are required in order to supply DC power to VoIP handsets or other such devices.

Typically, these regions are divided into three sections of the topology; the network Core, Distribution and Edge. Below are short descriptions of each region and the role that they play. It should be noted that the distribution region is not required in all instances and should be viewed as an option.


The Network Core –


In a typical topology model, the individual network regions are interconnected using a core layer. The core serves as the backbone for the network, as shown in Figure 3. The core needs to be fast and extremely resilient because every network region depends on it for connectivity. Hence, active/active mesh topologies such as SMLT play a very valuable role here. Even though the Core and Distribution Layer may be the same hardware, their roles are different and they should be looked at as logically different layers. Also, as noted above, the distribution layer is not always required. In the core of the network a “less is more” approach should be taken. A minimal configuration in the core reduces configuration complexity, limiting the possibility for operational error. Ideally the core should be implemented and remain in a stable state with minimal adjustments or changes.

Fig 3. Simple Two Tier Switch Core

 The following are some of the other key design issues to keep in mind:

Design the core layer as a high-speed, Layer 3 (L3) or Layer 2 (L2) switching environment utilizing only hardware-accelerated services. Active/active mesh core designs are superior to routed and other alternatives because they provide:

  • Faster convergence around a link or node failure.

  • Increased scalability because neighbor relationships and meshing are reduced.

  • More efficient bandwidth utilization.

Use active/active meshing as well as topological distribution to enhance the overall resiliency of the network design.

Avoid L2 loops and the complexity of L2 redundancy, such as Spanning Tree Protocol (STP), and indirect failure detection for L3 building block peers.

If the topology requires, utilize L3 switching in the active/active mesh core to provide for optimal sizing of the MAC forwarding table within the network core.

The Distribution Layer –

Due to the scale and capacity of active/active mesh core designs, the distribution layer is optional. It is far more efficient to dual home the network edge directly to the network core. This approach negates any aggregation or latency considerations that come into play with the use of a distribution layer. The active/active mesh topology provides better utilization of trunk feeds, and capacity can be scaled by multiple links in a dual homed fashion.

While the ideal topology is what is termed a two tier design, it is sometimes necessary to introduce a distribution layer to address certain topology or capacity issues. Instances where a distribution layer might be entertained in a design are as follows:

  • Where the required reach is outside of available trunk distances.
  • Where the port count capacity in that portion of the network core cannot support all of the edge connections without expansion, and expansion in the core is not desired.
  • Where logical topology issues such as Virtual LAN’s or port aggregation require it.

It should be noted, though, that all of the above instances could be addressed by the expansion of the network core. Examples of this are moving from a dual to a quad core design or, going further, moving to a mesh ladder topology as shown in figure 2.
In any instance it is more desirable to maintain a two tier rather than a three tier design if possible. The overall design of the network is far more efficient and resiliency convergence times become optimized. The diagram below shows a three tier design that utilizes an intermediate distribution or aggregation layer.

Fig. 4. Simple Three Tier Network

Note that topologies can be hybrid. As an example, most of the network can be designed around a two tier architecture with one or two regions that are interconnected by distribution layers for one or more of the reasons noted above.

The Network Edge

The access layer is the first point of entry into the network for edge devices, end stations, and IP phones (see Figure 5). The switches in the access layer are connected to two separate distribution layer switches for redundancy. If the connection between the distribution layer switches is to an active/active mesh, then there are no loops and all uplinks actively forward traffic.

A robust edge layer provides the following key features:

High availability (HA) supported by many hardware and software attributes.

Inline power for IP telephony and wireless access points, allowing customers to converge voice onto their data network and providing roaming WLAN access for users.

Foundation services.

The hardware and software attributes of the access layer that support high availability include the following:

Default gateway redundancy using dual active/active connections to redundant systems (core or distribution layer switches) that run industry standard or vendor specific load balancing or virtual gateway protocols such as VRRP, Avaya’s VRRP with Backup Master, or R/SMLT. This provides fast failover of the default gateway and IP paths. Note that with an active/active core or distribution mesh topology, link and node resiliency and convergence are handled by the L2 topology, which is much faster than any form of L3 IP routing convergence. As a result, any failover within the active/active mesh completes well within the L3 routing timeout.

Operating system high-availability features, such as Link Aggregation or Multi-Link Trunking, which provide higher effective bandwidth by leveraging the active/active mesh while reducing complexity.

Prioritization of mission-critical network traffic using QoS. This provides traffic classification and queuing as close to the ingress of the network as possible.
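The default gateway redundancy item above rests on standard VRRP master election: the highest configured priority wins, with ties broken by the highest IP address. A simplified sketch of that election, using the switch names from the earlier example with hypothetical priority and address values:

```python
# Simplified VRRP master election. With Avaya's Backup Master the backup
# also forwards traffic, but the election itself is standard VRRP behavior.
def elect_master(routers):
    """routers: list of (name, priority, ip_as_int); returns the winner."""
    return max(routers, key=lambda r: (r[1], r[2]))[0]

# Hypothetical priorities: DC3-8800-1 is preferred (200 > 100).
peers = [("DC3-8800-1", 200, 0x0A000301), ("DC3-8800-2", 100, 0x0A000302)]
master = elect_master(peers)   # "DC3-8800-1"
```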

Figure 5 illustrates a build-out of a hybrid two/three tier network, showing active/active load-sharing interconnections with all network edge components.

Fig 5.  Full Resilient Active/Active Network Topology

Also note that, as shown in figure 5, active/active connections can also be established within the Data Center via top-of-rack switching to facilitate load-sharing, highly resilient links down to server nodes. Again, such resiliency is provided at L2 and is totally independent of the overlying IP topology or addressing.



Provisioned Virtual Network Topologies –

An evolution of active/active mesh topologies is provided by the ratification of IEEE 802.1aq “Shortest Path Bridging” or SPBm for short (the ‘m’ standing for MAC-in-MAC – IEEE 802.1ah). This technology is an evolution of earlier carrier grade implementations of Ethernet bridging that were designed for metro and regional level reach and scale. The major drawback of these earlier methods was that they were based on modified spanning tree architectures that made the network complex to design and scale. IEEE 802.1aq resolves these issues with the implementation of link state adjacencies within the L2 switch domain, in the same manner as L3 link state protocols such as IS-IS and OSPF. All nodes within the SPB domain (which use ISIS to establish adjacencies) then run Dijkstra to establish the shortest path to all other nodes in the active/active mesh cloud. Reverse Path Forwarding Checks prevent loops in all data forwarding instances in a manner that is very similar to that provided in L3 routing. IEEE 802.1aq provides a cornerstone technology for Avaya’s Virtual Enterprise Network Architecture or VENA. The VENA framework utilizes SPBm as a foundational technology for many next generation cloud service models that are either offerable today or currently under development at Avaya.
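The shortest path computation described above can be illustrated with a minimal Dijkstra implementation over a small, purely hypothetical four-node SPB mesh:

```python
# Minimal Dijkstra sketch: every SPB node computes shortest paths to every
# other node from the IS-IS link-state database. Topology is illustrative.
import heapq

def dijkstra(graph, source):
    """Return the cost of the shortest path from source to each node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue          # stale entry; a shorter path was found already
        for neigh, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neigh, float("inf")):
                dist[neigh] = nd
                heapq.heappush(heap, (nd, neigh))
    return dist

# Two edge bridges connected through two core bridges: BEB-A reaches BEB-B
# at cost 10 via either BCB, i.e. equal-cost paths exist through the mesh.
mesh = {
    "BEB-A": {"BCB-1": 5, "BCB-2": 5},
    "BEB-B": {"BCB-1": 5, "BCB-2": 5},
    "BCB-1": {"BEB-A": 5, "BEB-B": 5},
    "BCB-2": {"BEB-A": 5, "BEB-B": 5},
}
paths = dijkstra(mesh, "BEB-A")   # paths["BEB-B"] == 10
```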

This next generation virtualization technology will revolutionize the design, deployment and operations of the Enterprise Campus core networks along with the Enterprise Data Center. The benefits of the technology will be clearly evident in its ability to provide massive scalability while at the same time reducing the complexity of the network. This will make network virtualization a much easier paradigm to deploy within the Enterprise environment.

Shortest Path Bridging eliminates the need for multiple protocols in the core of the network by separating the connectivity services from the protocol infrastructure. By reducing the core to a single protocol, the idea of build it once and don’t have to touch it again becomes a true reality. This simplicity also aides in greatly reducing time to service for new applications and network functionality.

The design of networks has evolved throughout the years with the advent of new technologies and new design concepts. IT requirements drive this evolution and the adoption of any new technology is primarily based on the benefit it provides versus the cost of implementation.

The cost in this sense is not only cost of physical hardware and software, but also in the complexity of implementation and on-going management. New technologies that are too “costly” may never gain traction in the market even though in the end they provide a benefit.

In order to change the way networks are designed, the new technologies and design criteria must be easy to understand and easy to implement. When Ethernet evolved from a simple shared media with huge broadcast domains to a switched media with segregated broadcast domains, there was a shift in design. The ease of creating a VLAN and assigning users to that VLAN made it commonplace and a function that went without much added work or worry. In the same sense, Shortest Path Bridging allows for the implementation of network virtualization in a true core distribution sense.



The key value propositions for IEEE 802.1aq SPBm include:



  • IEEE 802.1aq standard

  • Unmatched resiliency

  • Single robust protocol with sub-second failover

  • Optimal network bandwidth utilization

  • One protocol for all network services

  • Plug & Play deployment reduces time to service

  • Evolved from carrier technology with Enterprise-friendly features

  • Separates infrastructure from connectivity services

  • No constraints on network topology

  • Easy to implement virtualization

There are some major features within SPBm that lend themselves well to a scalable and resilient enterprise design. Two major points are as follows:

1). Separation of the Core and the Edge

SPBm implements IEEE 802.1ah ‘MAC-in-MAC’, which provides for a boundary separation between data forwarding methods in the network core versus the edge. It provides a clear delineation between the normal Ethernet ‘learning bridge’ environment, which is required for local area network operations, and the SPBm core cut-through switching environment, where performance and optimal path selection are the most important criteria. As a result, the use of SPBm creates smaller edge forwarding environments where the MAC tables are effectively isolated. Within the SPBm core network itself, the only MAC addresses in the forwarding tables are those of the SPBm switches themselves. As a result, the IEEE 802.1aq SPBm core is very high performance and very scalable. It is also able to utilize multiple forwarding paths and provide for clear delineation between the network core and edge.
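The separation described above can be sketched as a simple encapsulation model: the customer frame travels untouched inside a backbone header, so core switches only ever see backbone MACs. The field names below are illustrative, not the actual IEEE 802.1ah frame layout.

```python
# Toy model of 802.1ah MAC-in-MAC encapsulation at a Backbone Edge Bridge.
from dataclasses import dataclass

@dataclass
class Frame:
    dst_mac: str
    src_mac: str
    payload: str

@dataclass
class MacInMacFrame:
    b_dst: str        # backbone MAC of the egress BEB
    b_src: str        # backbone MAC of the ingress BEB
    i_sid: int        # service instance the customer frame belongs to
    inner: Frame      # untouched customer frame

def encapsulate(frame, ingress_beb, egress_beb, i_sid):
    return MacInMacFrame(egress_beb, ingress_beb, i_sid, frame)

cust = Frame("00:00:00:00:00:02", "00:00:00:00:00:01", "data")
core_frame = encapsulate(cust, "beb-a", "beb-b", 3)
# Core switches forward on core_frame.b_dst only; the customer MAC table
# entries never appear in the backbone.
```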

2). Virtual Provisioning Fabric

As noted earlier, IEEE 802.1aq evolved from earlier carrier grade implementations of Provider Backbone Bridging. Two things are key to a provider based offering. First, no customer should ever see another customer’s traffic. There needs to be complete and total service separation. Second, there must be a robust and detailed method for Operation and Maintenance (OAM) and Configuration and Fault Management (CFM), which is addressed by IEEE 802.1ag and is used by SPBm for those purposes.

The first requirement is addressed by SPBm’s ability to create isolated data forwarding environments in a manner similar to VLAN’s in the traditional learning bridge fashion. In the SPBm core there is no learning function required. As such, these forwarding paths provide for total separation and allow for deterministic forwarding to associated resources across the SPBm core. These paths, termed Service Instance Identifiers or I-SID’s, allow for the ability to provision virtual network topologies of a very wide variety of forms.

In addition, due to the established topology of the SPBm domain, these I-SID’s are provisioned at the edge of the SPBm cloud. There is no need to do any provisioning in the core to establish the end to end connectivity. This contrasts with normal VLANs, which require each and every node to be configured properly.

The figure below shows the dichotomy of these two features and how they relate to the network edge and in this case a distribution layer.

Fig. 6  MAC-in-MAC and I-SID’s within SPBm

As an example, I-SID’s can be used to connect Data Centers together with very high performance cut through dedicated paths for things such as Virtual Machine Migration, Stretch Server Clusters or Data Storage Replication. The figure below illustrates the use of L2 I-SID in this fashion

 Fig. 7. End to end IEEE 802.1aq L2 I-SID providing a path for V-Motion

Additionally, complete Data Center architectures can be built that provide all of the benefits of traditional security perimeter design, but with the benefits of full virtualization of the network infrastructure. The figure below shows a typical Data Center design implemented by interconnected I-SID’s in a Shortest Path Bridging network. This effectively shows that not only is SPBm an ideal core network technology, it is also an optimal data center bridging fabric.

Fig. 8. Full Data Center Security Zone


Finally, complex L3 topologies can be built on top of SPBm that utilize traditional routing technologies and protocols, or that provide for the network’s L3 forwarding requirements by the use of the native L2 link state routing within SPBm provided by IS-IS. The illustration below shows a network topology in which all methods are utilized to provide for a global enterprise design.

Fig. 9  Full end to end Virtualized Network Topology over an IEEE802.1aq cloud

Shortest Path Bridging Services Types

Avaya’s implementation of Shortest Path Bridging provides a tremendous level of flexibility to support multiple service types simultaneously, singly or in tandem.

One of the key advantages of the SPB protocol is the fact that network virtualization provisioning is achieved by just configuring the edge of the network, thus the intrusive core provisioning that other Layer 2 virtualization technologies require is not needed when new connectivity services are added to an SPB network.

Shortest Path Bridging Layer 2 Virtual Services Network (L2 VSN)

Layer 2 Virtual Services Networks are used to transparently extend VLANs through the backbone.  An SPB L2 VSN topology is simply made up of a number of Backbone Edge Bridges (BEB’s) used to terminate Layer 2 VSNs. The control plane uses IS-IS, with forwarding performed at Layer 2. Only the BEB’s are aware of any VSN and associated edge MAC addresses, while the backbone bridges simply forward traffic at the backbone MAC (B-MAC) level.

Figure 10. L2 Virtual Service Networks

A backbone Service Instance Identifier (I-SID), used to identify the Virtual Services Network, will be assigned on the BEB to each VLAN. All VLANs in the network sharing the same I-SID will be able to participate in the same VSN.
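The VLAN-to-I-SID mapping described above amounts to a small service table on each BEB: two attachment points belong to the same L2 VSN exactly when their VLANs map to the same I-SID. A hedged sketch, with illustrative switch names and I-SID values:

```python
# Toy BEB service table: (BEB, VLAN) -> I-SID.
def same_vsn(service_table, attach_a, attach_b):
    """True if two (BEB, VLAN) attachment points belong to one L2 VSN."""
    return service_table[attach_a] == service_table[attach_b]

services = {
    ("BEB-1", 10): 20010,
    ("BEB-2", 10): 20010,   # same I-SID: one stretched L2 VSN
    ("BEB-2", 30): 20030,   # different I-SID: a fully isolated service
}
```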


Shortest Path Bridging Inter-VSN Routing (Inter-ISID Routing)

Inter-VSN Routing allows routing between IP networks on Layer 2 VLANs with different I-SIDs. As illustrated in the diagram below, routing between VLAN 10, VLAN 100 and VLAN 200 occurs on one of the SPB core switches in the middle of the diagram. 

Figure 11. Inter-VSN routing

Although in the middle of the network, this switch provides “edge services” and has I-SIDs and VLANs provisioned on it, and therefore is designated as a BEB switch.  End users from the BEB switches as shown on the right and left of the diagram are able to forward traffic between their respective VLANs via the VRF instance configured on the switch shown.  For additional IP level redundancy, Inter-VSN Routing may also be configured on another switch and both can be configured with VRRP to eliminate single points of failure.


Shortest Path Bridging Layer 3 Virtual Services Network (L3 VSN)

An SPB L3 VSN topology is very similar to an SPB L2 VSN topology, with the exception that the backbone Service Instance Identifier (I-SID) is assigned at a Virtual Router (VRF) level instead of at a VLAN level. All VRFs in the network sharing the same I-SID will be able to participate in the same VSN. Routing within a single VRF in the network occurs normally, as one would expect.  Routing between VRF’s is possible by using redistribution policies and injecting routes from another protocol, e.g., BGP, even if BGP is not used within the target VRF.

Figure 12. L3 Virtual Service Networks

Layer 3 Virtual Service Networks provide a high level of flexibility in network design by allowing IP routing functionality to be distributed among multiple switches without proliferation of multiple router-to-router transit subnets.


SPB Native IP shortcuts

The services described to this point require the establishment of Virtual Service Networks and their associated I-SID identifiers.  IP Shortcuts enables additional flexibility in the SPB network to support IP routing across the SPB backbone without configuration of L2 VSNs or L3 VSNs.


Figure 13. Native IP GRT Shortcuts

IP shortcuts allow routing between VLANs in the global routing table/network routing engine (GRT). No I-SID configuration is used.

Although operating at Layer 2, IS-IS is a dynamic routing protocol.  As such, it supports route redistribution between itself and any IP route types present in the BEB switch’s routing table.  This includes local (direct) IP routes and static routes as well as IP routes learned through any dynamic routing protocol including RIP, OSPF and BGP.

IP routing is enabled on the BEB switches, and route redistribution is enabled to redistribute these routes into IS-IS.  This provides normal IP forwarding between BEB sites over the IS-IS backbone.


 BGP-Based IP VPN and IP VPN Lite over Shortest Path Bridging

Avaya’s implementation of Shortest Path Bridging has the flexibility to support not only the L2 and L3 VSN capabilities and IP routing capabilities as described above, but also supports additional IP VPN types.  BGP-Based IP VPN over SPB and IP VPN Lite over SPB are features supported in the Avaya implementation of Shortest Path Bridging. 

Figure 14. BGP IP VPN over IS-IS

BGP IP VPNs are used in situations where it is necessary to leak routes into IS-IS from a number of different VRF sources.  Additionally, using BGP IP VPN support over SPB, it is possible to provide hub and spoke configurations by manipulating the import and export Route Target (RT) values. This allows, for example, a server farm in a central site to have connectivity to all spokes, but no connectivity between the spoke sites. BGP configuration is only required on the BEB sites; the backbone switches have no knowledge of any Layer 3 VPN IP addresses or routes.


Resilient Edge Connectivity with Switch Clustering Support

As earlier described, the boundary between the MAC-in-MAC SPB domain and 802.1Q domain is handled by the Backbone Edge Bridges (BEBs). At the BEBs, VLANs are mapped into I-SIDs based on the local service provisioning.

Figure 15. Resilient edge switch cluster

Redundant connectivity between the VLAN domain and the SPB infrastructure is achieved by operating two SPB switches in Switch Clustering (SMLT) mode. This allows dual homing of any traditional link aggregation capable device into a SPB network. 

Switch Clustering provides the ability to dual home any edge device that supports standards-based 802.1ad LACP link aggregation, Avaya’s MLT link aggregation, EtherChannel or any similar link aggregation method.  With Switch Clustering, the capability is provided to fully load balance all VLANs across the multiple links to the switch cluster pair.  If either link as depicted fails, all traffic will instantly fail over to the remaining link.  Although two links are depicted, Switch Clustering supports LAGs up to 8 ports for additional resiliency and bandwidth flexibility. 


Quality of Service Support and Traffic Policing and Shaping Support

Quality of Service (QoS) is maintained in an SPB network the same way as in any IEEE 802.1Q based network. Traffic ingressing an SPB domain that is either already 802.1p marked (within the C-MAC header), or is being marked by an ingress policy (remarking), has its B-MAC header p-bits marked to the appropriate value.

Figure 16. QoS & Policing over SPB

The traffic in the SPB core is scheduled, prioritized and forwarded according to the 802.1p values in the outer backbone packet header. Where traffic is routed at any of the SPB nodes, the IP Differentiated Services (DSCP) values are taken into account as well.
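A node applying the marking described above needs a DSCP-to-802.1p table. The sketch below uses a common convention (EF for voice, AF classes for video and signaling); it is purely illustrative and is not a quote of any vendor default.

```python
# Illustrative DSCP -> 802.1p p-bit mapping for marking the backbone header.
DSCP_TO_PBIT = {
    46: 5,   # EF: voice
    34: 4,   # AF41: interactive video
    26: 3,   # AF31: signaling
    0:  0,   # best effort
}

def backbone_pbit(dscp: int) -> int:
    """p-bit value written into the backbone header for a routed packet."""
    return DSCP_TO_PBIT.get(dscp, 0)   # unknown markings fall to best effort
```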

The number of I-SID’s available in an SPBm domain are virtually limitless (16 million). Additionally, this technology can be effectively extended over many forms of transport such as dark or dim optics, CWDM or DWDM, MPLS L2 pseudo-wires, ATM and others. This means that it can effectively cover vast geographies in its native form and provide all of the virtualization benefits where ever it reaches.

In instances where required, however, an SPBm domain can effectively interface to a traditional routed WAN by the use of standard interior and border gateway protocols.

Provider Type Services offerings and larger regional topologies

In instances where larger geographic coverage is desired to leverage IEEE 802.1aq and its inherent provisioned core approach, the traditional mesh topology has difficulty scaling due to the costs of optical infrastructure and points of presence. In these instances ring based topologies make the most sense. IEEE 802.1aq can not only support ring topologies but can also support various interesting iterations such as dual core rings or the more esoteric 3D torus topology, which is intended to support very high core port densities.

The next section of this document will discuss the various ring topology options as well as the combination of their use. The diagram below illustrates the basic components for the dual core ring. There are two basic assumptions in the design. First, the core ring topology is populated with only Backbone Core Bridges (BCB’s). This optimizes one of the key traits of Shortest Path Bridging – separation of core and edge. The result is a design of immense scale from a services perspective. Second, all provisioned service paths are applied at the edge in the Backbone Edge Bridges (BEB’s) which provides the interface to the customer edge.

Figure 17. Basic Dual Core components

As we look below at a complete topology, we can see that a very efficient design emerges which uses minimal node and fiber counts and effectively leverages shortest paths across the topology. Each BEB is dual homed back into the ring fabric by SPB trunks. As such there are multiple options for dual homing the BEB node back into the ring topology.

Figure 18.  A Basic Dual Core Ring

An additional level of differentiation can be provided by the use of a dual home active/active mesh service edge. In this type of edge shown below, there are two BEB’s which are trunked together with active/active Inter-Switch Trunks. These two switches then provide a clustered edge that interoperates with any industry standard dual homing trunk method such as MLT or LAG. The end result is a very high level of mesh resiliency directly down to the customer service edge.

Figure 19. Dual Homed Active/Active Mesh Edge

The diagram below shows a dual core ring design that implements various forms of dual homed resiliency. These can range from simple dual homing of the BEB to a very highly resilient inter-area active/active edge design that can provide sub-second failover into the provider cloud. Again, this supports industry standard methods for active/active dual homing of the Ethernet service edge.

 Figure 20. Dual Core Ring with various methods of dual homed resiliency

More complex topologies can be designed when higher densities of backbone core ports are required. The topology below illustrates a 3D torus design that links together triad nodal areas to build a very highly resilient and dense core port capacity ring.

Figure 21. 3D Torus Ring

As the diagram below shows, the basic construct of the 3D torus is fairly simple and is comprised of only six core nodes. The dotted lines show optional SPB trunks to provide enhanced shortest path meshing. With these optional trunks every node is directly connected for shortest path forwarding.

Figure 22. 3D Torus Section

These sections can be linked together to build a complete torus as shown above, or used in a hybrid fashion as shown below to build up or down core port densities as required by subscriber population. The illustration below shows a hybrid ring topology that scales up or down according to population and subscriber density requirements.

Figure 23. Hybrid Ring Topology

As this section illustrates, IEEE 802.1aq is an excellent technology for regional and metropolitan area networks. It allows for scalability and reach as well as a great degree of flexibility in supported topologies. Moreover, these different degrees of scale can be accomplished in the same network without any degree of sacrifice to the overall resiliency of the whole.

Provisioned Virtual Service Networks

As mentioned earlier, IEEE 802.1aq offers several methods of service connectivity across the SPB cloud. In the context of a service offering however, the use of I-SID’s will have a different focus. Rather than a departmental or organizational focus as was used in the above example, here we are concerned with shared service offerings or services separation. As an example, in the area of voice service offerings, a service may be shared in that it is much like the PSTN only over IP. In contrast, a service might be offered for a virtual PBX service for a private company that would expect that service to be dedicated. The figure below shows how IEEE 802.1aq can easily provide the dedicated service paths for both modes of service offering. The PSTN service I-SID offering is shown in green while the private virtual PBX service I-SID is shown in red.

Figure 24.  Shared vs. Dedicated Services


In a typical deployment an offering of services might be as follows –

Private Sector – Voice/Shared – Video/Shared – Data/Shared

Business – Voice/Private – Video/Shared – Data/Private

These are of course general and can be customized to any degree. The diagram below shows how the use of IEEE 802.1aq I-SID’s allows for the support of both service models with no conflict. Note that the private sector shares a common I-SID for video services with the business sector. Also note that the business sector profile allows for the use of a dedicated virtual PBX service that is private to that business.
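The way shared and private profiles map onto I-SID’s can be sketched in a few lines. This is a hypothetical illustration only: the I-SID numbers are invented, and the point is simply that shared service modes reuse a single I-SID while private modes receive their own.

```python
# Hypothetical service profiles from the text; "shared" modes reuse one
# I-SID across sectors, "private" modes get a dedicated I-SID per customer.
services = {
    "private_sector": {"voice": "shared", "video": "shared", "data": "shared"},
    "business":       {"voice": "private", "video": "shared", "data": "private"},
}

isids = {}          # (service, mode[, sector]) -> I-SID number (invented)
next_isid = 100     # arbitrary starting point for the sketch
for sector, profile in services.items():
    for svc, mode in profile.items():
        key = (svc, mode) if mode == "shared" else (svc, mode, sector)
        if key not in isids:
            isids[key] = next_isid
            next_isid += 1

# The shared video I-SID is allocated once and reused by both sectors,
# while the business voice and data services get dedicated identifiers.
print(isids)
```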

Figure 25.  Voice and Video I-SID’s across SPB

Figure 26.  Multiple ‘Service Separated’ data service paths across SPB

The illustration above highlights the data networking services. Note that the private sector is using a shared I-SID (shown in green) much as is done today with DOCSIS type solutions. Note also that the business is using L3 I-SID’s with VRF’s to build out a separate private and dedicated IP topology over the IEEE 802.1aq managed offering. This creates separate and discrete data forwarding environments that are true ‘ships in the night’. They have no ability to support end to end communications unless the routing topology explicitly allows it. As such all of the traditional IT security frameworks such as firewalls and intrusion detection and prevention come into play and are used in a rather traditional fashion to protect key corporate resources. In the private residential space, end point anti-virus and protection are used, as is typical with ISP’s today.


IP Version 6 Support

Introducing new technology is always a move into the unknown. IPv6 is no different. While the technology has been under development for some time (over ten years), there has been no great impetus motivating large scale adoption. This is changing now that IANA/ARIN has announced that the last contiguous block of IPv4 addresses has been allocated. Now it is down to non-contiguous blocks and recycling of address blocks. These efforts will not provide any significant extension to the availability of IPv4 addresses. With these events, many organizations are now actively investigating how IPv6 can be deployed into their networks.


This section is intended to provide an overview of a tested topology over shortest path bridging (IEEE 802.1aq) environments for the distribution of globally routable IPv6 addressing using L2 VSN’s and inter-VSN routing.

The high level results of the work demonstrate that an enterprise can effectively use SPB to provide for the overlay of a routed IPv6 infrastructure that is independent of the existing IPv4 topology. Furthermore, with IPv4 default gateways resident on the L2 VSN’s, dual stack end stations can have full end to end hybrid connectivity without the use of L3 transition methods such as 6to4, ISATAP, or Teredo. This results in a clean and simple implementation that allows for the use of allocated globally routable IPv6 addresses in a native fashion.


IPv6 in General –


IPv6 is the next generation form of IP addressing. Replacing IPv4, it is intended to provide greatly enhanced address space as well as the end to end transparency that was becoming more and more difficult to maintain with the increasing use of Network Address Translation (NAT) in IPv4. NAT was created to provide for the use of ‘private’ IPv4 addressing within an organization, with a gateway device interfacing out to the public Internet. Even this technology, however, could not forestall the unavoidable event that occurred earlier this year: contiguous blocks of IPv4 addresses have run out.
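The scale of the change is easy to quantify: IPv4 is a 32-bit address space while IPv6 is 128-bit, and NAT stretched the small IPv4 pool by reusing private ranges.

```python
import ipaddress

# IPv4 offers a 32-bit address space; IPv6 offers a 128-bit one.
v4_total = 2 ** 32    # 4294967296 addresses in total
v6_total = 2 ** 128   # roughly 3.4e38 addresses in total
print(v6_total // v4_total)  # 2**96 IPv6 addresses per IPv4 address

# NAT conserved the small IPv4 pool by hiding private ranges such as 10/8:
print(ipaddress.ip_network("10.0.0.0/8").num_addresses)  # 16777216
```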

Currently, there are address recycling efforts that will provide some reprieve, but in the imminent future even this effort will be exhausted.

These events have caused a recent surge of interest in IPv6. Many enterprises that had it on the back burner are now taking a new look at this technology and the requirements that need to be met for their organizations to deploy it. For the first-time investigator this can be a daunting task. Beyond the knowledge of IPv6 itself, one needs to learn all of the methods required to co-exist with an IPv4 network environment. This is a strict requirement because no one will forklift their complete communications environment, and even if they could, there are issues with contact to the outside world that need to be addressed. The reason for this is that the IPv6 suite is NOT directly backwards compatible with IPv4. This complication has taken quite a bit of effort within the IETF to resolve. There are a number of RFC’s and drafts, as well as deprecated drafts, that cover a wide variety of translation or transition methods. Each has its own set of complications and security or resiliency issues that need to be dealt with. At the end of the day, most IT personnel walk away with a headache and wish for the good old days of just IPv4.


During the time since IPv6 was first introduced, different schools of thought evolved as to how this co-existence between IPv4 and IPv6 could be addressed. Network Address Translation – Protocol Translation (NAT-PT) came into vogue but has since faded into deprecation as the approach largely proved to be intractable. Other methods have stayed and even become ‘default’. As an example, all Microsoft OS’s running IPv6 run the 6to4, ISATAP and Teredo tunneling methods.

So it has become clear that one school has won out, and that school of thought is dual stack in the end stations with tunneling across the IPv4 network to tie IPv6 islands together. These methods work but, as pointed out earlier, they all have complications and issues that need to be dealt with.

If one looks at the evolution long enough, though, something else becomes apparent. If you could provide the paths between IPv6 islands by Layer 2 methods, things like 6to4, ISATAP and Teredo would no longer be required. Furthermore, without these methods an enterprise is free to use formally allocated, globally routable address space. The only requirement for the dual stack hosts is that they have clear default routes for both IPv6 and IPv4. With typical VLAN based networks, however, this design, while feasible, does not scale and quickly becomes intractable due to the complications of tagged trunk design within the network core. With the evolution of Shortest Path Bridging (IEEE 802.1aq) this scalable layer two method is now available. The rest of this solution guide will describe the test bed environment and then discuss the ramifications that this work has on larger network infrastructures.


The IPv6 over SPB Example Topology –


The figure below shows the minimal requirements for a successful hybrid IPv6 deployment over shortest path bridging. As can be seen, the requirements are fairly concise and simple. You require an SPB Virtual Service Network which is then associated with edge VLAN’s. These VLAN’s will host dual stack end stations.

Additionally, this VSN will need to attach to IPv6 and IPv4 default gateways. Again, this would occur by the use of edge VLAN’s that interface to the relevant devices.


Figure 27. Required elements for a native hybrid IPv6 deployment over SPB


So as one can see the requirements are straightforward and easy to understand. We implemented the following topology in a lab to demonstrate the proposed configuration.

The diagram below illustrates this topology in a simplified form for clarity. 

 Figure 28. Native IPv6 Dual Stack over L2 VSN Test bed


In the test bed we implemented a common VSN that would support the IPv6 deployment. This was for simplicity only. More complicated IPv6 routed topologies can easily be achieved by using inter-VSN routing; examples illustrating this will be shown later in the brief. In the lab we created VLAN ID 500 at the three different key points at the edge of the SPB domain. A Virtual Service Network was created within the SPB domain (also using 500 as its identifier) that ties the different VLAN’s together. At one edge VLAN a Win7 end station running dual stack was assigned an IPv4 address and the IPv6 address 3000::2. The end station’s IPv4 default gateway was likewise configured, and its IPv6 default gateway was 3000::1. The IPv6 default gateway is also attached to VLAN 500 and is able to provide directly routable paths in and out of the VSN. Additionally, the IPv4 default gateway is also attached and reachable as well. The dual stack end station enjoys end to end hybrid connectivity to both IPv6 and IPv4 environments without the use of any L3 transition method. In the topology shown in the figure below, we show that from the dual stack end station’s perspective, there is complete hybrid connectivity and available routed paths to both IPv4 and IPv6 environments. Because formally allocated global addressing is used, there is connectivity out into INET2 to native IPv6 resources.
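The lab addressing can be sanity checked with the standard library. This is a hedged sketch that assumes the edge VLAN carries a 3000::/64 prefix (the prefix length is not stated in the text); under that assumption the end station and its gateway are on-link neighbors within the same L2 VSN.

```python
import ipaddress

# Assumed prefix for the VSN's edge VLAN (the lab brief gives only the
# host addresses 3000::2 and 3000::1; /64 is a typical on-link prefix).
vsn_prefix = ipaddress.ip_network("3000::/64")
end_station = ipaddress.ip_address("3000::2")
gateway = ipaddress.ip_address("3000::1")

# Both addresses fall inside the same prefix, so the end station reaches
# its IPv6 default gateway directly across the L2 VSN, no tunnel needed.
print(end_station in vsn_prefix)  # True
print(gateway in vsn_prefix)      # True
```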

Figure 29. Dual Stack end stations perspective on default routed paths


The ramifications on larger IPv6 deployments


One of the major drawbacks of L3 transition methods for IPv6 is that they bind the IPv6 topology to IPv4. Many find this undesirable. After all, why implement a new globally routed protocol and then lock it down to an existing legacy topology? As a result, it was realized very early on that if you could run IPv6 as ships in the night with IPv4 it would be a very good solution. The problem with this was that the only method to accomplish this was by the use of VLAN’s and tagged trunks or with routed overlays. As a result, while the test bed shown previously was feasible and provable, the approach quickly suffers from complexity in larger topologies and does not lend itself well to scale.

With Shortest Path Bridging these issues are vastly simplified making this approach tractable on an enterprise scale. The reason for this is that the IPv6 deployment becomes an overlay L3 environment that rides on top of SPB. As such, there is no need to make detailed configuration changes to the network core to deploy it. This original ‘ships in the night’ vision can now be realized in real world designs.


The diagram below shows a large network topology that interconnects two data centers. The topology in blue shows the IPv6 native dual stack deployment. The topology in green shows the IPv4 legacy routed environment. Note that while there are common touch points between the two environments for legacy dual stack IPv4 use, the two IP topologies are quite independent of one another.

Figure 30. Totally Independent IP topologies



In Summary –


This document has provided a review of active/active mesh network topologies and the significant benefits that they bring to an overall network design. With networking speeds now at 10 Gb/s and beyond, it is no longer sufficient to have very high speed, expensive switch ports sitting in a totally passive state waiting for a network failure. It is also no longer sufficient to tolerate failover times in the range of seconds or even tenths or hundredths of seconds. The amount of data loss and the performance impacts are just too serious. Active/active mesh networking addresses this by providing for multiple load sharing paths across the network topology. Additionally, due to the active nature of the trunking method, SMLT can very easily provide for failovers in the subsecond range. As a note, recent testing of Avaya’s 3rd generation of SMLT reliably shows failovers in the range of 6 ms. This is practically instantaneous from the perspective of the overall network. This failover speed is unrivaled in the industry and is a testament to Avaya’s dedication to this technology space.

Additionally, newer active/active mesh technologies are being introduced, such as IEEE 802.1aq Shortest Path Bridging, a key foundational component of Avaya’s VENA framework, that promise to take active/active mesh network topologies into a new era of scale and flexibility never before realized with switched Ethernet topologies. The provisioned virtual network capability of VENA allows for one touch provisioning of the network service paths with zero touch requirements on the transport core. This new innovation not only vastly simplifies administration and reduces configuration errors; it can provide for dramatic improvements in IT OP/EX costs in that changes that would normally take hours are brought down to minutes, with an exponential reduction in the probability for error.

In addition, this paper has shown that this new addition to active mesh networking is totally compatible and complementary with older active/active mesh switched Ethernet topologies such as SMLT. The result of the combination is a flexible core meshing technology that allows for almost unlimited permutations of topologies and a very highly resilient dual homed edge with sub-second failover.

Another more mundane but equally important aspect of Avaya’s SPBm offering is that it can be easily migrated to within the existing Ethernet Routing Switch 8600. The result of this upgrade is to make it the equivalent of an Ethernet Routing Switch 8800, which can participate in an SPBm domain as either a Backbone Edge Bridge (BEB) or a Backbone Core Bridge (BCB), including all service modes detailed earlier in this article. This means that an existing ERS 8600 customer can implement the technology without the need for a forklift upgrade.

Even when considering networks with alternative vendors, Avaya’s SPBm VENA framework – due to its strict compliance with IEEE 802.1aq and other IEEE standards – allows for the seamless introduction of SPBm into the network as a core distribution technology with minimal disruption to the network edge. Additionally, network edges that are Spanning Tree based today because of core networking limitations can then move to implement the active/active dual homing model described earlier by the use of LAG or MLT at the edge, both of which are widely supported throughout the industry.

The end result is a technology that brings immense value.  It is easy to implement in both new and existing networks, and migration can be virtually seamless.

Could it be that the days of spanning tree have finally passed?

I would like to extend both credit and thanks to my esteemed Avaya colleagues, Steve Emert and John Vant Erve for both input and use of facilities for solution validation.

IPv6 Deployment Practices and Recommendations

June 7, 2010

Communications technologies are evolving rapidly. This pace of evolution, while slowed somewhat by economic circumstances, still moves forward at a dramatic pace. This is indicative of the fact that while the ‘bubble’ of the 1990’s is past, society and business as a whole have arrived at the point where communications technologies and their evolution are a requirement for proper and timely interaction with the human environment.

This has profound impact on a number of foundations upon which the premise of these technologies rest. One of the key issues is that of the Internet Protocol, commonly referred to simply as ‘IP’. The current widely accepted version of IP is version 4. The protocol, referred to as IPv4, has served as the foundation of the current Internet since its practical inception in the public arena. As the success of the Internet attests, IPv4 has performed its job well and has provided the evolutionary scope to adapt over the twenty years that have transpired. Like all technologies, though, IPv4 is reaching the point where further evolution will become difficult and cumbersome, if not impossible. As a result, IPv6 was created as a next generation evolution of the IP protocol to address these issues.

Many critics cite the length of time that IPv6 has been in development. It is, after all, a project with over a ten year history in the standards process. However, when one considers the breadth and complexity of the standards involved, a certain maturity is conveyed that the industry can now leverage. The protocol has evolved significantly since the first proposals for its predecessor, IPng. Many or most of the initial shortcomings and pitfalls have been addressed to the point where actual deployment is a very tractable proposition. Along this evolution several benefits have been added to the suite that directly benefit the network staff and end user populace. Some of these benefits are listed below. Note that this is not an exhaustive list.

  • Increased Addressing Space
  • Superior mobility
  • Enhanced end to end security
  • Better transparency for next generation multimedia applications & services

Recently, there has been quite a bit of renewed activity and excitement around IP version 6. The recent announcements by the United States Federal Government for IPv6 deployment by 2008 and the White House Civilian Agency mandate by 2012 have helped greatly to fuel this. Also, many if not most of the latest projects being implemented by providers in the Asia Pacific regions are calling for mandatory IPv6 support. Clearly the protocol’s time is coming. We are seeing the two vectors of maturity and demand meeting to result in market and industry readiness.

There is a cloud on this next generation horizon however. It is known as IPv4. From a practical context all existing networks are either based on or in some way leverage IPv4 communications. Clearly, if IPv6 is to succeed, it must do so in a phased approach that allows hybrid co-existence with it. Fortunately, many in the standards community have put forth transition techniques and methodologies that allow for this co-existence.  A key issue to consider in all of this is that the benefits of IPv6 are somewhat (sometimes severely) compromised by their usage. However, like all technologies, if usage requirements and deployment considerations are considered prior to implementation the proposition is realistic and valid.

Setting the Foundation

IPv6 has several issues and dependencies in common with IPv4. However, the differences in address format and methods of acquisition require modifications that need to be considered. Much of the hype in the industry is on the aspects of support within the networking equipment. While this is of obvious importance, it is critical to realize that there are other aspects that need to be addressed to assure a successful deployment.

The first Block – DNS & DHCP Services

While IPv6 supports auto-configuration of addresses, it also allows for managed address services. DNS does not, from a technical standpoint, require DHCP, but the two are often offered in the same product suite.

When considering the new address format (128-bit, colon-delimited hexadecimal), it is clear that it is not human friendly. A Domain Name System (DNS) infrastructure is needed for successful coexistence because of the prevalent use of names (rather than addresses) to refer to network resources. Upgrading the DNS infrastructure consists of populating the DNS servers with records to support IPv6 name-to-address and address-to-name resolutions. After the addresses are obtained using a DNS name query, the sending node must select which addresses are used for communication. This is important to consider both from the perspective of the service (which address is offered as primary) and the application (which address is used). It is obviously important to consider how a dual addressing architecture will work with naming services. Again, the appropriate due diligence needs to be done by investigating product plans, but also in limited and isolated test bed environments, to assure predictable and stable behavior with the operating systems as well as the applications being considered.
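The dual resolution behavior described above can be observed directly with the standard sockets API: on a dual stack host a name query returns both A (IPv4) and AAAA (IPv6) results, and the node's selection policy chooses among them. A minimal sketch, using `localhost` so no external DNS is required:

```python
import socket

# getaddrinfo returns one tuple per (family, address) a name resolves to.
# On a dual stack host, "localhost" typically yields both ::1 and 127.0.0.1;
# the order reflects the host's address selection policy.
for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        "localhost", 80, type=socket.SOCK_STREAM):
    label = "IPv6" if family == socket.AF_INET6 else "IPv4"
    print(label, sockaddr[0])
```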

As mentioned earlier, DHCP services are often offered in tandem with DNS services in many products. In instances where IPv6 DHCP services are not supported, but DNS services are, it is important to verify that it will work with standard auto-configuration options.

The second Block – Operating Systems

Any operating system being considered for the IPv6 deployment should be investigated for compliance and tested so that the operations staff are familiar with any new processes or procedures that IPv6 will require. Tests should also occur between the operating systems and the DNS/DHCP services using simple network utilities such as ping and FTP to assure that all of the operating elements, including the operating systems, interoperate at the lowest common denominator of the common IP applications.

It is important to test behaviors of dual stack hosts (hosts that support both IPv4 and IPv6). Much of the industry supports a dual stack approach as being the most stable and tractable approach to IPv6 deployments. Later points in this article will illustrate why this is the case.

The third Block – Applications

Applications should be considered first off to establish the scope of operating systems and the extent to which IPv6 connectivity needs to be offered. Detailed analysis and testing however should occur last after the validation of network services and operating systems. The reason for this is that the applications are the most specific testing instances and strongly depend on the stable and consistent operation of the other two foundation blocks. It is also important to replicate the exact intended mode of usage for the application so that the networking support staff are aware of any particular issues around configuration and or particular feature support. On that note, it is important to consider if there are any features that do not work in IPv6 and what impact that they will have on the intended mode of usage for the application. Finally, considerations need to be made for dual stack configurations and how precedence is set for which IP address to use.

The fourth Block – Networking Equipment

Up to this point all of the validation activity referred to can be performed on a ‘link local’ basis. As a result a typical layer two Ethernet switch would suffice. A real world deployment requires quite a bit more, however. It is at this point that the networking hardware needs to be considered. It is important to note that many pieces of equipment, particularly layer two type devices, will forward IPv6 data. If management via IPv6 is not an express requirement, then these devices could be used in the transition plans provided they are used appropriately in the network design.

Other devices such as routers, layer three switches, firewalls and layer 4 through 7 devices will require significant upgrades and modification to meet requirements and perform effectively. Due diligence should be done with the network equipment provider to assure that requirements are met and timelines align with project deployment timelines.

As noted previously in the other foundation blocks, dual stack support is highly recommended and will greatly ease transition difficulties as will be shown later. With networking equipment things are a little more complex in that in addition to meeting host system requirements for IPv6 communications of the managed element, the requirements of data forwarding, route computation and rules bases need to be considered. Again, it is important to consider any features that will not be supported in IPv6 and the impact that this will have on the deployment. The figure below illustrates an IPv6 functional stack for networking equipment.

Figure 1. IPv6 network element functional blocks

As shown above, there are many modifications that need to occur at various layers within a given device. The number of layers as well as the specific functions implemented within each layer is largely determined by the type of networking element in question. Simpler layer two devices are only required to provide dual host stack support, primarily for management purposes, while products like routers and firewalls will be much more complex. When looking at IPv6 support in equipment it makes sense to establish the role that the device performs in the network. This role based approach will best enable an accurate assessment of the real requirements and features that need to be supported rather than industry or vendor hype.

The burden of legacy – Dual stack or translation?

The successful deployment of IPv6 will strongly depend on a solid plan for co-existence and interoperability with existing IPv4 environments. As covered earlier, the use of dual stack configurations whenever possible will greatly ease the transition. Today the burden is on any device supporting IPv6 to speak to IPv4 devices. As time moves on, however, the burden will shift to IPv4 devices to speak to IPv6 devices. As we shall see, only a certain set of applications requires dual stack down to the end point. Most client server applications will work fine in a server-only dual stack environment supporting both IPv4 only and IPv6 only clients, as shown in the figure below.

Figure 2. A dual stack client server implementation
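One common way to realize the server side of the dual stack arrangement above is a single IPv6 listening socket with the `IPV6_V6ONLY` option cleared, so IPv4 clients arrive as v4-mapped addresses on the same socket. This is a hedged sketch; whether the option may be cleared (and its default) varies by operating system.

```python
import socket

# One listening socket serving both IPv6 and IPv4 clients. With
# IPV6_V6ONLY cleared, IPv4 connections appear as ::ffff:a.b.c.d.
srv = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
srv.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
srv.bind(("::", 0))   # port 0: let the OS pick an ephemeral port
srv.listen()
print(srv.getsockname()[1] > 0)  # True: bound and listening on one port
srv.close()
```

The alternative is two sockets, one per family, which some platforms require; application frameworks generally hide this detail.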

As shown above both IPv4 and IPv6 client communities have access to the same application server each served by their own native protocol. In the next figure however we see that there are some additional complexities that occur with certain applications and protocols such as multimedia and SIP. In the illustration below we see that there are not only client/server dialogs but client to client dialogs as well. In this instance, at least one of the clients needs to support a dual stack configuration in order to establish the actual media exchange.

Figure 3. A peer to peer dual stack implementation

As shown above, with one end point supporting a dual stack configuration and the appropriate logic to determine protocol selection, end to end multimedia communications can occur. Note that this dual stack scenario will typically serve in lieu of IPv6 only devices until these become more prevalent over time.

There are many benefits to the dual stack approach. By analyzing applications and mandating dual stack usage, a very workable transition deployment can be attained.

There are arguments that address space, one of the primary benefits of IPv6, is drastically compromised by this approach. After all, by using dual stack you do not remove any IPv4 addresses. In fact, you are forced to add IPv4 addresses to accommodate an IPv6 deployment. The truth of this is directly related to the logic of the approach in deployment. By understanding the nature of the applications and giving preference to the innovative (IPv6 only) population, these arguments can be mitigated. The reason is that you are only adding IPv6 addresses to existing IPv4 hosts that require communication with IPv6. If this happens to be the whole IPv4 population, so be it. There are plenty of IPv6 addresses to go around! As new hosts and devices are deployed they should preferentially be IPv6 only, or dual stack if required, but NOT IPv4 only.

An alternative to the dual stack approach is the use of intermediate gateway technologies to translate between IPv6 and IPv4 environments. This approach is known as NAT-PT. The diagram below illustrates a particular architecture for NAT-PT usage that will provide for the multimedia scenario used previously.

Figure 4. Translation Application Layer Gateway

In this approach the server is supporting a dual stack configuration and is using native protocols to support the client/server dialogs to each end point. Each end point is single stack, one IPv4 and the other IPv6. In order to establish end to end multimedia communications, there is an intermediate NAT-PT gateway function that provides for the translation between IPv4 and IPv6. There are many issues and caveats with this approach; these can be researched in the IETF records. As a result, there has been work towards deprecating NAT-PT to experimental status. It should be noted that a recent draft revision has been submitted, so it is worth keeping on the radar.

Tunnel Vision

There has been quite a bit of activity around another set of transition methods known as tunneling. In a typical configuration, there are two IPv6 sites that require connectivity across an IPv4 network. The use of tunneling would involve the encapsulation of the IPv6 data frames into IPv4 transport. All IPv6 traffic between the two sites would traverse this IPv4 tunnel. It is a simple and elegant, but correspondingly limited, approach that provides co-existence, not necessarily interoperability, between IPv4 and IPv6. In order to achieve interoperability we need to invoke one of the approaches (dual stack vs. NAT-PT) discussed earlier. Tunneling by itself only provides the ability to link IPv6 sites and networks over IPv4.

This is a very important point; taken to its logical conclusion, it indicates that if the network deployment is appropriately engineered, the use of transition tunneling methods can be greatly reduced and controlled, if not eliminated. Before we take this course in logic, however, it is important to consider the technical aspects of tunneling and why it needs to be thought out prior to use.

The high level use of tunneling is reviewed in RFC 2893 for those interested in further details. Basically there are two types of tunnels. The first is configured tunnels: IPv6-in-IPv4 tunnels that are set up manually on a point to point basis. Because of this, configured tunnels are typically used in router to router scenarios. The second type is automatic tunnels, which use various methods to derive IPv4/IPv6 address mappings dynamically in order to support automatic tunnel setup and operation. As a result, automatic tunnels can be used not only in router to router scenarios but for host to router or even host to host tunneling as well. This allows us to build a high level summary table of the major accepted tunneling methods.

Method               Usage                            Risk

Configured           Router to router                 Low

Automatic (6 to 4)   Router to router /               Medium
                     Host to router

Automatic            Host to host                     High

Without going into deep technical detail on each automatic tunneling method's behavior, we can assume that there is some sort of promiscuous behavior that activates the tunneling process on recognition of a particular pattern (IP protocol 41, IPv6-in-IPv4). This promiscuous behavior is what warrants the increased security risk associated with the automatic methods. RFC 3964 goes into detail on the security issues around automatic tunneling methods. At a high level, there is the potential for denial of service attacks on the tunnel routers as well as the ability to spoof addresses into the tunnel, breaching integrity. The document offers recommendations on risk reduction practices, but they are difficult to implement and maintain properly.
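One of the recommended mitigations is ingress filtering at the tunnel endpoint: accept protocol-41 packets only from known peers and drop everything else. The sketch below is a crude stand-in for that check; the TRUSTED_PEERS set and the addresses in it are hypothetical values for illustration (drawn from the documentation ranges), not part of any specification.

```python
import ipaddress

# Hypothetical allowlist of tunnel peers for this endpoint
TRUSTED_PEERS = {ipaddress.IPv4Address("198.51.100.7")}

def accept_tunnel_packet(ipv4_packet: bytes) -> bool:
    """Accept an IPv4 packet for decapsulation only if it carries
    protocol 41 AND its source address is a known tunnel peer."""
    if len(ipv4_packet) < 20 or ipv4_packet[9] != 41:
        return False                               # not IPv6-in-IPv4
    src = ipaddress.IPv4Address(ipv4_packet[12:16])  # outer source address
    return src in TRUSTED_PEERS
```

Checks like this blunt spoofing into the tunnel, but as the text notes they are awkward to maintain at scale, since every legitimate peer must be tracked.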

An effective workaround to these issues is to use IPsec VPN branch routing over IPv4 to establish secure, encrypted site to site connectivity and then to run the automatic tunneling method inside the IPv4 IPsec tunnel.

The figure below shows a scenario where two 6 to 4 routers have a tunnel set up to establish site to site connectivity inside an IPv4 IPsec VPN tunnel. With this approach, any IP traffic will have site to site connectivity via the VPN branch office tunnels, and the IPv6 hosts will have access to one another via the 6 to 4 tunnels. Any promiscuous activity required by 6 to 4 can now occur with relative assurances of integrity and security. The drawback to this approach is that additional features or devices are required to complete the solution.

Figure 5. Using Automatic Tunneling inside IPv4 IPSec VPN

The primary reason for using transition tunnel methods is to transport IPv6 data over IPv4 networks. In essence, the approach ties together islands of IPv6 across IPv4 and allows for connectivity to the IPv6 network. If we follow this logic, then the use of transition tunneling can be reduced or even eliminated by providing direct connectivity to the IPv6 Internet through at least one IPv6 enabled router in a given organization's network. The figures below illustrate the difference between the two approaches. In the top example, the organization does not have direct access to the IPv6 Internet, so transition tunneling must be used to attain connectivity. In the lower example, the organization has a router that is directly attached to the IPv6 Internet, so there is no need to invoke transition tunneling. By using layer two technologies such as virtual LANs, IPv6 hosts can acquire connectivity to the dual stack native router.

Figure 6. Using transition tunneling to extend IPv6 connectivity

Figure 7. Using L2 VLANs to extend IPv6 connectivity

Within the organization – Use what you already have

As established, by providing direct connectivity to the IPv6 Internet the use of transition tunneling can be eliminated on the public side. Within the organization, before implementing transition tunneling it makes sense to review the methods that may already exist in the network to attain connectivity.

All of the issues in dealing with IPv6 transition revolve around the use of layer 3 approaches. By using layer 2 networking technologies, transparent transport can be provided. There are multiple technologies that can be used for this approach. Some of these are listed below:

  • Optical Ethernet
  • Ethernet virtual LANs
  • ATM
  • Frame Relay

As listed above, there are many layer two technologies that can be used to extend IPv6 connectivity within an organization's network.

Virtual LANs can be used to extend link local connectivity to IPv6 enabled routers in a campus environment. The data will traverse the IPv4 network without the complexities of layer 3 transition methods. For the regional and wide area, optical technologies can extend the L2 virtual LANs across significant distances and geographies, again with the goal of reaching an IPv6 enabled router. Similarly, traditional L2 WAN technologies such as ATM and frame relay can extend IPv6 local links across circuit switched topologies. As the diagram above illustrates, by placing the IPv6 dual stack routers strategically within the network and interconnecting them with L2 networking topologies, an IPv6 deployment can be implemented that co-exists with IPv4 without any transition tunnel or NAT-PT methods.

The catch, of course, is that these layer two paths cannot traverse any IPv4-only routers or layer 3 switches. As long as this topology rule is adhered to, this simplified approach is entirely feasible. By incorporating dual stack routers, both IPv4 and IPv6 virtual LAN boundaries can effectively be terminated and in turn propagated further with virtual LANs or other layer two technologies on the other side of the routed element. A further evolution is to use policy based virtual LANs that determine membership according to the IP version of the data received on a given edge port. As the figure below illustrates, dual stack hosts will have access to all required resources in both protocol environments.
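A policy based VLAN of this kind only needs to look at the EtherType field of each incoming frame: 0x0800 marks IPv4 and 0x86DD marks IPv6. The sketch below shows the classification logic; the VLAN ID values are hypothetical, chosen purely for illustration.

```python
# Hypothetical VLAN assignments for this example
IPV4_VLAN = 10
IPV6_VLAN = 20
DEFAULT_VLAN = 1

def classify_vlan(frame: bytes) -> int:
    """Assign a VLAN based on the Ethernet frame's EtherType.
    For an untagged frame, the EtherType sits at bytes 12-13,
    after the 6-byte destination and 6-byte source MAC addresses."""
    ethertype = int.from_bytes(frame[12:14], "big")
    return {0x0800: IPV4_VLAN, 0x86DD: IPV6_VLAN}.get(ethertype, DEFAULT_VLAN)
```

Because the decision is made entirely at layer two, a dual stack host's IPv4 and IPv6 traffic can be steered onto separate VLANs from the same edge port, each terminating at the appropriate router.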

Figure 8. Using Policy Based VLANs to support dual stack hosts

In essence, where dual stack capability is provided end to end, layer three transition methods can be avoided altogether. While it is unlikely that this can be achieved in most networks, such logic can greatly reduce any layer three transition tunnel usage. By taking additional considerations regarding application network behaviors and characteristics, as noted at the beginning of this article, the use of intermediate protocol and address translation methods like NAT-PT can also be mitigated.

In conclusion

This article was written to clarify deployment issues for IPv6, with a particular focus on interoperability and co-existence with IPv4. The deployment considerations can now be summarized step by step as follows:

1). Build the foundation

There are four basic foundation blocks that need to be established prior to deployment. Details on each foundation block were provided earlier. In summary they are:

  • DNS/DHCP services
  • Network operating systems
  • Applications
  • Network equipment

As pointed out several times, plan for dual stack support wherever possible in all of the foundation blocks. Such an approach will greatly ease the transition issues around deployment. Ongoing work on multiple routing and forwarding planes, such as multi-topology OSPF (OSPF-MT) and Multiprotocol BGP (MBGP), may have beneficial and simplifying merits: these protocols can interconnect dual stack routing elements, identify them explicitly, and build forwarding overlays or route policies based on traffic type (IPv4 vs. IPv6). While the OSPF-MT work is in preliminary draft phases, it has strong merits in that, in combination with MBGP, it can effectively displace MPLS-type approaches to accomplish the same goal. Again, no transition methods would be required within the OSPF-MT boundary as long as overlay routes exist between the dual stack routing elements.

2). Establish connectivity

Once the foundations are in place, the next step is to establish how connectivity will be made between different sites. Assuming that dual stack routers are available, it makes sense to closely analyze campus topologies and establish connectivity with layer two networking technologies wherever possible. Only once all available methods have been exhausted and it is clear that one is dealing with an IPv6 'island' should one look at using one of the IPv6 transition tunneling methods, with configured tunneling being the most secure and conservative approach and the appropriate one for this type of site to site usage. Host to router tunneling may have valid usage in remote access VPN applications, particularly where local Internet providers do not offer IPv6 networking services. Host to host tunneling should be used only in initial test bed or pilot environments and, because of manageability and scaling issues, is not recommended for general practice.

To connect sites across a wide area network, layer two circuit switched technologies such as frame relay and ATM can extend connectivity between the dual stack enabled sites. In some next generation wide area deployments, layer two virtual LANs can be extended across RPR optical cores to accomplish the end to end connectivity requirements. Again, only after all other options have been exhausted should the use of IPv6 transition tunneling methods be entertained.

At this point, a dual stack native mode deployment has been achieved with only minimal use of tunneling methods. Only now should the use of any NAT-PT functions be entertained, to accommodate any applications that do not comply with the deployment. It is strongly urged that such an approach be used in a very limited form and be relatively temporary in the overall deployment. Timelines should be established to move away from the temporary usage by incorporating a dual stack native approach as soon as feasible.

3). Test, test, test

As noted at several points throughout this article, testing is critical to deployment success. The reason is that requirements are layered and interdependent; consequently, it is important to validate all embodiments of an implementation. Considerations need to be made according to node type, operating system, and application, as well as any variations required for legacy components. Murphy's law applies here: the implementation that you do not test will be the one to have problems.