
Advanced Persistent Threats

November 23, 2016


Coming to a network near you, or maybe your network!

 

There are things that go bump in the night and that is all they do. But once in a while things not only go bump in the night, they can hurt you. Sometimes they make no bump at all! They hurt you before you even realize that you're hurt. No, we are not talking about monsters under the bed or real home intruders; we are talking about Advanced Persistent Threats. This is a major trend that is growing at a terrifying pace across the globe. It targets not the typical servers in the DMZ or the Data Center, but the devices at the edge. More importantly, it targets the human at the interface. In short, the target is you.

Now I say 'you' to highlight the fact that it is you, the user, who is the weakest link in the security chain. And like all chains, the security chain is only as good as its weakest link. I also want to emphasize that it is not you alone, but me, or anyone, or any device for that matter that accesses the network and uses its resources. The edge is the focus of the APT. Don't get me wrong, if they can get in elsewhere they will. They will use whatever avenue they find available. That is the other point: the persistence. They will not go away. They will keep at it until they eventually find a hole, however small, and exploit it. Once inside, however, they will be as quiet as a mouse. Being unknown and undetected is the biggest asset of the APT.

How long this next phase lasts is not determinable; it is very case specific. Many times it's months, if not years. The reason is that this is not about attacking; it's about exfiltrating information from your network and its resources, and/or totally compromising your systems and holding you hostage. This will obviously be specific to your line of business. In the last article we made it plain that regardless of the line of business there are some common rules and practices that can be applied to the practice of data discovery. This article hopes to achieve the same goal: to not only edify you as to what the APT is, but to illustrate its various methods and, of course, provide advice for mitigation.

We will obviously speak to the strong benefits of SDN Fx and Fabric Connect to the overall security model. But as in the last article, that will take a back seat to the primary practices and use of technology, regardless of its type, as well as the people, policies and practices that are mandated. In other words, a proper security practice is a holistic phenomenon that is transient and is only as good as the moment in space and time it occupies. We will talk to our ability, and perhaps soon the ability of artificial intelligence (AI), to think beyond the current threat landscape and even learn to better predict the next steps of the APT. That is how we will close. So, this will be an interesting ride. But it's time you took it.

What is the Advanced Persistent Threat?

In the past we have dealt with all sorts of viruses, Trojans and worms. Many folks ask, what is different now? Well, in a nutshell, in the past these things were largely automated pieces of software that were not really discerning about the actual target. In other words, if you were a worm meant for a particular OS or application and you found a target that was not updated with the appropriate protection, you nested there. You installed and then looked to 'pivot' or 'propagate' within the infected domain. In other words, in the past this malicious software was opportunistic and non-discretionary in the way it worked. The major difference with the APT is that its attacks are targeted. They are also typically staffed and operated by a dark IT infrastructure. They will still use the tools, the viruses, the Trojans, the worms. But they will do so with stealth, and the intent is not to kill but to compromise, perform exfiltration and even establish control. They will often set up traps so that once it is clear they have been discovered, they can run a ransomware exploit as they leave the target. This gives them a lasting influence and extension of impact.

In short, this is a different type of threat. This is like moving from the marching columns of ancient Roman armies to the fast and flexible mounted assaults of the steppe populations out of Asia. The two were not well suited to one another. In the open lands, the mounted approach was optimal. But in the populated farm areas, and particularly in the cities, the Roman method proved superior. This went on for centuries until history and biology decided the outcome. But afterwards there was a new morphing, the mounted knight: a method which took the best from both worlds, attempted to combine them, and by that created a military system that lasted for almost a thousand years. So we have to say that it had a degree of success and staying power.

We face a similar dilemma. The players are different, as are the weapons, but the scenario is largely the same. The old is passing away and the new is the threat on the horizon. But I also want to emphasize that no one throughout the evolution of warfare probably threw a weapon away unless it was hopelessly broken. Folks still used swords and bows long after guns were invented. The point is that the APT will use all weapons, all methods of approach, until they succeed. So how do you succeed against them?

Well, this comes back to another dilemma. Most folks cannot account for what is on their networks. As a result they have no idea of what a normal baseline of behavior is. If you do not have any awareness of that, how do you think you will catch and see the transient anomalies of the APT? This is the intention of this article: to get you to think in a different mode.

The reality of it is that APTs can come from anywhere. They can come from any country, even from inside your organization! They can be for any purpose: monetary, political, etc. They will also tend to source their attacks in the same country as the target and use the ambiguity of DNS mapping to make tracing 'home' difficult. This is what makes them advanced. They have very well educated and trained staffs who mount a series of strong phases of attack against your infrastructure. Their goal is to gain command and control (C2) channels to either exfiltrate information or take actual control of certain subsystems. They are not out to expose themselves by creating issues. As a curious parallel, there has been a noted decrease in DoS and DDoS attacks on networks as the APT trend has evolved. It's not that these attacks aren't used anymore; it's just that they are now used in a very limited and targeted fashion, which makes them far more dangerous. Often they cover up some other clandestine activity that the APT is executing, and even then only as a very last resort. For the APT, being stealthy is key to long-term success, so the decrease in these types of attacks makes sense when looked at holistically. But note that a major IoT DDoS attack just occurred using home video surveillance equipment. Was it just an isolated DDoS, or was it meant to get folks to turn their attention to it? We may never know. These organizations may be nation states, political or terrorist groups, even corporations involved in industrial espionage. The APT has the potential to be anywhere and it could put its targets on anything, anywhere, at any time according to its directives. The reason they are so dangerous is that they are actual people who are organized and who use their intelligence and planning against you. In short, if they know more about your network than you do… you lose. Pure and simple.

So what are the methods?

There has been a lot of research on the methods that APTs use. Because this is largely driven by humans, the range can be very wide and dynamic. Basically it all comes down to extending the traditional kill chain. This concept was first devised by Lockheed Martin to footprint a typical cyber-attack. It is shown in the illustration below.


Figure 1. The traditional ‘kill chain’

Infiltration needs to occur in a certain fashion; an attacker can't just willy-nilly their way into a network. Depending on the type of technology, the chain might be rather long. As an example, compare a simple WEP hacking exercise against a full-grade enterprise WPA implementation with strong micro-segmentation. There are many degrees of difference in the complexity of the two methods. Yet many still run WEP. The APT will choose the easiest and most transparent method.

Reconnaissance

In the initial phase of identifying a target, a dark IT staff is called together as a team. This is known as the reconnaissance or information gathering phase. In the past, this was treated lightly at best by security solutions. Even now, with heightened interest in this area by security solutions, it remains the main extended avenue of knowledge acquisition. The reason for this is that much of this intelligence gathering can take place 'off line'. There is no need to inject probes or pivots at this point; that would be like shooting into a dark room and hoping you hit something. Instead the method is to gain as much intelligence about the targets as possible. This may go on for months or even years, and it continues as the next step and even the ones after it occur. Note how I say 'targets'. The target, when analyzed, will result in a series of potential target systems. In the past these were typically servers, but now this may not be the case. The APT is more interested in the users or edge devices. These devices are typically more mobile, with a wider range of access media types. There is also another key thing about many of these devices. They have you or me at the interface.

Infiltration

Once the attacker feels that there is enough to move forward, the next step is to try to establish a beachhead in the target. In the past this was typically a server somewhere, but folks have been listening to and following the advice of the security communities. They have been hardening their systems and keeping up to date and consistent with code releases. Score one for us.

There is the other side of the network though. This is more of a Wild West type of scenario. In the old west of the United States, law was a tentative thing. If you were in a town out in the middle of nowhere and some dark character came into town, your safety was only as good as the sheriff, who typically didn't last past the first night. Your defense was 'thin'. Our end points are much the same way. As a result, truly persistent professional teams that are advanced in nature will target the edge; more specifically, the human at the edge. No one is immune. In the past a phishing attempt was easier to see. This has changed recently in that many of these attempts are launched from a disguised email or other correspondence with an urgent request. The correspondence will appear very legitimate. Remember, the APT has done their research. It appears to have the right format and headers; it is also from your manager. He is also referring to a project that you are currently working on, with a link, indicating that he needs to hear back immediately as he is in a board meeting. The link might be a spreadsheet, a word document… the list goes on. Many people would click on this well devised phish. Many have. There are also many other ways, some that in the right circumstances do not even require the user to click.

There are also methods to create 'watering holes', which is basically the infiltration of websites that are known to be popular with or required by the target. Cross-site scripting is a very common set of methods to make this jump. Once the site is visited, the proper scripts are run and the infiltration begins. A nice note is that this has fallen off due to improvements in the JRE.

There are also physical means: USB 'jump sticks'. These devices can carry malware that can literally jump into any system interface it was designed for. There is no need to log on to the computer. Only access to the USB port is necessary, and even then only momentarily. In the right circumstances a visitor could wreak a huge amount of damage. In the past this would have been felt immediately. Now you might not feel anything at all. But it is now inside your network. It is wreaking no damage. It remains invisible.

Exploitation (now the truth of the matter is that it’s complicated)

When the APT does what it does, if it is successful, you will not know it. The exploit will occur and, if undiscovered, continue on. It is a scary point to note that most APT infiltrations are only pointed out to the target after the fact by a third party such as a service provider or law enforcement. This is sad. It means that both the infiltration and exploitation capabilities of the APT are very high. The question is, how does this get accomplished? The reality of it is that each phase in the chain will yield information and the need to make decisions as to the next best steps in the attack. This is the next step in the tree. As shown in the figure below, there are multiple possible exploits and further infiltrations that could be leveraged off of the initial vector. It is in reality a series of decisions that take the intruder closer and closer to its target.


Figure 2. The Attack Tree

Depending upon what the APT finds as it moves forward, its strategy will change and optimize over time. In reality it will morph to your environment in a very specific and targeted way. So while many folks think that exploitation is the end, it's really not. In the past it was visible. Now it's not. The exploitation phase is used to implant further into the network.

 

Execution or Weaponization

In this step some method is established for the final phase, which is either data exfiltration or complete command and control (C2). Note again that these steps may be linked and traced back. This is important, as we shall see shortly. Note that execution is a process that can use a multitude of methods, ranging from complete encryption (ransomware) to simple probes or port and keystroke mappers to gain yet further intelligence. Nothing is done to expose its presence. Ideally, it will gain access to the right information and then begin the next phase.

 

Exfiltration

This is one of the options. The other is command and control (C2), which to some degree is required for exfiltration anyway. So APTs will do both. Hey, why not? Seeing as you are already in the belly of the beast, why not leverage all avenues available to you? It turns out that both require a common trait: an outbound traffic requirement. At this point, if the APT wants to pull the desired data out of the target it must establish an outbound communication. This is also referred to as a 'phone home' or 'call back'. These channels are often very stealthy; they are typically encrypted and mixed within the profile of the normal data flow. Remember, while there are well-known ports assigned that we all should comply with, an individual with even limited skills can generate a payload with 'counterfeit' port mappings. DNS, ICMP and SMTP are three very common protocols for this type of behavior. It's key to look for anomalies in behavior at these levels. The reality of it is that you need some sort of normalized baseline before you can judge whether there is an anomaly. This makes total sense.
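To make this concrete, here is a minimal sketch of what 'looking for anomalies at these levels' might look like for DNS. It assumes you can already export DNS query logs as (client, query name) pairs; the entropy and volume thresholds are illustrative placeholders, not tuned values.

import math
from collections import Counter, defaultdict

def label_entropy(label: str) -> float:
    # Shannon entropy of a DNS label; tunneled/encoded labels tend to score high.
    if not label:
        return 0.0
    counts = Counter(label)
    return -sum((c / len(label)) * math.log2(c / len(label)) for c in counts.values())

def flag_dns_anomalies(queries, entropy_threshold=3.5, volume_threshold=500):
    # Flag clients issuing unusually many queries or random-looking labels.
    # Thresholds are illustrative and should be tuned against your own baseline.
    per_client = defaultdict(list)
    for client_ip, qname in queries:
        per_client[client_ip].append(qname)
    suspects = {}
    for client_ip, names in per_client.items():
        high_entropy = [n for n in names
                        if label_entropy(n.split('.')[0]) > entropy_threshold]
        if len(names) > volume_threshold or high_entropy:
            suspects[client_ip] = {"total_queries": len(names),
                                   "high_entropy_samples": high_entropy[:5]}
    return suspects

# Example: one client hammering long, random-looking subdomains.
sample = [("10.1.1.23", "a9x2kq0vbl3mz817fjw4.cdn-sync.example") for _ in range(40)]
print(flag_dns_anomalies(sample, volume_threshold=20))

Nothing in this sketch is specific to DNS tunneling kits; it simply surfaces the kind of outbound behavior that deserves a second look against your baseline.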

If you bring me to the edge of a river and say "Ed, tell me the high and low levels", I could not reliably provide you with that information given what I am seeing. I would need to monitor the river for a length of time, to 'normalize' it, in order to tell you the highs and the lows; even then with the possibility of extreme outliers. It is very much the same with security. We need to normalize our environments in order to see anomalies. If we can see these odd outbound behaviors early, then we can cut the intruder off and prevent the exploit from completing.

The APT needs systems to communicate in order for the tools to work for them. This means that they need to leave some sort of ‘footprint’ as they look to establish outbound channels. They will often use encryption to maintain a cloak of darkness for the transport of the data.

Remember, unlike the typical traditional threat, which you probably are well prepared for, the APT will look to establish a 'permanent' outbound channel. The reason I use quotes around permanent is that these channels may often jump sessions, port behaviors or even whole transit nodes if the APT has built enough supporting malicious infrastructure into your network. Looking at the figure below, if the APT has compromised a series of systems, it has a choice in how to establish outbound behaviors.


Figure 3. Established exfiltration channels

The larger the footprint the APT has, the better it can adjust and randomize its outbound behaviors, which makes them much more difficult to tease out. So catching the APT early is key. Otherwise it's much like trying to stamp out a fire that is growing out of control.

 

Command and Control (C2)

This is the second option. Sometimes the APT wants more than just data from you. Sometimes they want to establish C2 channels. This can be for multiple purposes. As in the case above, it might be to establish a stealthy outbound channel network to support exfiltration of data. On the other side of the spectrum, it might be complete command and control. Think power grids, high security military, intelligent traffic management systems, automated manufacturing, subways, trains, airlines. The list goes on and on.

The reality of it is that once the APT is inside most networks it can move laterally. This could be through the network directly, but it might also be through social avenues that traverse normal segment boundaries. So the lateral movement could be at the user account level, the device level, or completely random based on a set of rules. Also, let's not forget the old list of viruses, web bots and worms that the APT can use internally within the target on a very focused basis. It has the vectors for transport and execution. Note how I do not say outright propagation; in this case it is much more controlled. As noted above, once the APT has established a presence at multiple toeholds it's very tough to knock it out of the network. A truly comprehensive approach is required to mitigate these behaviors. It starts at the beginning, the infiltration. Ideally we need to catch it there. But the reality is that in some instances this will not be the case. I have written about this in the past. The offense has the advantage of surprise. The APT can come up with a novel method that has not been seen before by us. So we are always vulnerable to infiltration to some degree. But if we cannot cut it off before it enters, we can work to prevent the exploit and the later phases of attack. While not perfect, this has merit. If we can make the infiltration limited and transient in its nature, the later steps become much more difficult to accomplish. We will speak to this later, as it is a very key defense tactic that, if done properly, is very difficult to penetrate past. Clearly these outbound behaviors are not the time to finally detect something, particularly if you pick them out of weeks of logs. By then the APT has already established its infrastructure and you are in reaction mode.

The overall pattern (hint – it's data centric)

By now hopefully you are seeing a strong pattern. It is still nebulous and quite frankly it always will be. The offense still has a lot of flexibility. For us to think that the APT will not evolve is foolish. So we need to figure out a way to somehow co-exist with its constant and impinging presence. Due to its advanced and persistent nature (hence the APT acronym), the threat cannot be absolutely eliminated. To do so would require making systems totally isolated. And while this might be desired to a certain level for certain systems, as we will cover later, we have to expose some systems to the external Internet if we wish to have any public presence.

Perhaps this is another realization. We should strongly limit our public systems and strongly segment them, with no confidential data access. When you get down to it, the APT is not about running a DDoS attack on your point of sale. It's not even about absconding with credit card data in a one-time hit. None of these are good for you, obviously. But the establishment of a persistent dark covert channel out of your network is one of the worst scenarios that could evolve. By this time you should be seeing a pattern. It's all about the data. They are not after general communications or other such data unless they are doing further reconnaissance. They are about moving specific forms of information out or executing C2 on specific systems within the environment. Once we recognize this we see that the intent of the APT is long term residence, preferably totally stealthy. The figure below shows a totally different way to view these decision trees.


Figure 4. A set of scoped and defined decision trees

Each layer from outer to center represents a different phase in the extended kill chain. As can be seen, they move from external (access), to internal (pivot compromise), to target compromise kill chains. You can also see that the external points are exposed vulnerabilities that the APT could leverage. These might be targeted and tailored email phishing or extensive water holing. There may also be explicit attacks against discovered service points. The goal is to establish a network of pivot points that can allow for better exposure of the target. The series of decision trees all fall inward towards the target and, if the APT gets its way and goes undiscovered, this will be the footprint of its web within the target. It is always looking to expand and extend it, but not at the cost of losing secrecy. Its major strength lies in its invisibility.

So the concept of a linear flow to the attack has to go out the window. Again, this is the key to the term persistence. It is very cyclic in the way it evolves over time. The OODA loop comes to mind, which is typically taught to military pilots and quick response forces: Observe, Orient, Decide, Act. The logic that the APT uses is very similar. This is because it is raw constructive logic. Trying to break down OODA any further becomes counterproductive; believe me, many have tried. So you can see that the OODA principle is well established by the APT. Remain stealthy, morph and move. But common to all of this is the target. Note how everything revolves around that center set of goals. If you are starting to see a strategy of mitigation and you haven't read my previous article, then my hat is off to you. If you have read my article and see the strategy, then my hat is off to you as well. If you have not read my article and are puzzled – hang on. If you have read my last article and you are still puzzled, I need to say it emphatically: it's all about the data!!!

We also should start to see and understand another pattern. This is shown in simpler terms in the diagram above; there is an inbound, a lateral and an outbound movement to the APT. This is the signature of the APT. While it looks simple, the mesh of pivots that the APT establishes can be quite sophisticated. But from this we can begin to discern that if we have enough knowledge of how our network normally behaves, we can perhaps tease out these anomalies, which obviously did not exist before the APT gained residence. Note the statement I just made. Normalization means normalization against a known secure environment. A good time to establish this might be right after compliance testing, for example. You want to see the network as it should be.

Once you have that, you should, with the right technologies and due diligence, be able to see any anomalies. We will talk about these in detail later; they can range from odd DNS behavior to random encrypted outbound channels. We will speak to methods of mitigation and detection, as well as provide a strategic roadmap of goals against the APT, realizing that we have limited resources available in our IT budgets.
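As a small illustration of what normalizing and then watching for deltas could look like in practice, here is a sketch that treats a known-good window of flow records as the baseline and flags any conversation that was never seen during it. The flow tuple format and the sample addresses are assumptions for the example only.

def build_baseline(flow_records):
    # Capture the set of conversations seen during a known-good window,
    # for example right after compliance testing.
    return {(src, dst, port, proto) for src, dst, port, proto in flow_records}

def new_conversations(baseline, current_records):
    # Return conversations that never appeared in the normalized baseline.
    return [f for f in set(current_records) if f not in baseline]

baseline = build_baseline([
    ("10.1.1.5", "10.2.2.10", 443, "tcp"),
    ("10.1.1.5", "10.1.1.53", 53, "udp"),
])
today = [
    ("10.1.1.5", "10.2.2.10", 443, "tcp"),
    ("10.1.1.5", "203.0.113.77", 8443, "tcp"),   # new outbound encrypted channel
]
for flow in new_conversations(baseline, today):
    print("investigate:", flow)

A real deployment would feed this from NetFlow/IPFIX or firewall logs and age the baseline carefully, but the principle is the same: no baseline, no anomaly.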

So is this the end of IT Security as we know it?

Given all of the trends that we have seen in the industry, one is tempted to throw up one's arms and give up. Firewalls have been shown to have shortcomings and compromises; encryption has been abused as a normal mode of operation by the APT. What good is anti-virus in any of this? Many senior executives are questioning the value of the investment that they have made in security infrastructure, particularly if they are executives of an organization that has been recently compromised.

After all, encryption is now being used by the bad guys, as are many other 'security' approaches. The target has shifted from the server to the edge. Does this mean that we jettison all of what we have built because it is no longer up to the challenge? Absolutely not! It does however indicate that we need to rethink how we are using these technologies and how they can be used with newer technologies that are coming into existence. Basically, the concept of what a perimeter is needs to change, and we will discuss this in detail later on; but additionally we need to start thinking more aggressively in our security practice. We can no longer be sheep sitting behind our fences. We must learn to be more like the wolves. This may sound parochial, but take a look at the recent news on the tracking and isolation of several APT groups, not only down to the country of origin but the actual organization and in some instances even the site! This is starting to change the rules on the attackers.

But this is the stuff of advanced nation-state cyber-warfare; what can the 'normal' IT practitioner do to combat this rising threat? Well, it turns out there is quite a bit. And it turns out that, aside from launching your own attacks (which you obviously shouldn't do), there is not much that the nation states can do that you can't do. So let's put on some different hats for this article. Let's make them not black, but very nice dark gray. The reason I say this is that in order to be really effective in security today you need to think like the attacker. You need to do research; you should attempt penetration and exploitation yourself (in a nice safe ISOLATED lab of course!). In short, you need to know them better than they know you, because in the end it's all about information. We will return to this very shortly. But we also need to realize that we need to create a security practice that is 'data centric'. It needs to place the most investment in the protection of critical data assets, which are often tiered in importance. Gone are the days of the hard static perimeter and the soft gooey core. We need to carry micro-segmentation to the nth degree. The microsegments need to correspond not only strongly but exactly to the tiers of the risk assets mentioned earlier. Assets with more risk should be 'deeper' and 'darker' and should require stronger authentication and much more granular monitoring and scrutiny. All of this makes sense, but it only makes sense if you have your data in order and have knowledge as to its usage, movement and residence. This gets back to the subject of my previous article, and it sets the stage well for this next conversation. If you have not read it, I strongly urge you to do so before you continue.

 

Information and warfare

This is a relationship that is very ancient, as ancient as warfare itself. The basic premise is threefold. First, aggressors (and hence weapon technology, to a large part) have had the advantage in the theory of basic conflict. After all, it's difficult to design defenses against weapons that you do not know about yet. But it doesn't mean the defense lacks the ability to innovate either. As a matter of fact, with a little ingenuity almost anything used in offense can be used for defense as well. So we need to think aggressively in defense. We cannot be passive sheep. Second, victory is about expectation: expectation of a plan, of a strategy of some sort to achieve an end goal. In essence, very few aggressive conflicts have no rationale. There is always a reason and a goal. Third, information is king. It will, to a very large degree, dictate the winners and the losers in any conflict, whether it's Neolithic warfare or modern-day cyberspace. If the attacker knows more than you do, then you are likely to lose.

OK Ed! You might be saying, wow! We are talking spears and swords here! Well, the point is that not much has changed since the inception of conflict itself. Spying and espionage go back as far as history, perhaps further. Let us not forget that it was espionage, according to legend, that was the downfall of the Spartan 300. I can give you dozens (and dozens) of examples of espionage throughout history right up to modern times. Clandestine practice is certainly nothing new. But there may be a lot of things that we as security folk have forgotten along the way; things that the attackers might still remember. In today's world, if the APT knows more about your network and applications than you do, if they know more about your data than you do, you are going to lose.

Here you may be startled at the comment. How dare I. But if the question is extended to "Do you have a comprehensive data inventory? Is it mapped to relevant systems and validated? Do you know where its residence is? Who has access?", many cannot answer these questions. The problem is that the APT can. They know where your data is and they know how it moves through your network, or at least they are in a constant effort to understand that. They also understand where they can exfiltrate the data as well. If they know and you don't, they could be pulling information for quite a long time and you will not know. Do you think I am kidding? Well, consider this. About 90% of the information compromises that occur are not discovered by internal IT security staff; the victims are notified of them by third parties such as their service providers or law enforcement agencies. Here is another sobering fact: the APT on average has residence in the victim's network for 256 days.

So clearly things are changing. The ground, as it were, is shifting underneath our feet. The traditional methods of security are somehow falling short. Or perhaps they always were and we just didn't realize it until the rules changed. In any event, the old 'keep 'em out' strategy is no longer sufficient. We need to realize that our networks will at some point be compromised. We will talk a little later about some of the methods. Because of this, we need to shift our focus to detection. We need to identify the foreign entity and hopefully remove it before it does too much damage or gains too much knowledge. So IT security as we know it will not go away. We still require firewalls and DMZs, we will still require encryption and strong identity policy management as well as intrusion detection technologies. We will just need to learn to use them differently than we have in the past. We also have to utilize new technologies and concepts to create discrete visibility into the secure data environments. New architectures and practices will evolve over time to address these imminent demands. This article is intended to provide baseline insight into these issues and how they can be addressed.

 

It’s all about the user (and I’m not talking about IT quality of experience!)

Whenever you see a movie about hacking you always see someone standing in front of several consoles, cracking into various servers and doing their mischief. It’s fast moving and very intense. I always laugh because this is most definitely not the case. Slow and steady is always best and the server is most definitely not the place to start. It’s you. You are the starting point.

Think about it: you move around. You have multiple devices. You probably have less stringent security practices than the IT staff that maintains the server. You are also human. You are the weakest link in the security chain. Now I've spoken about this before, but it has always been from the perspective of IT professionals who are not as diligent as they should be in the security practice of their roles. Here we are talking about the normal user, who may not be very technically savvy at all. Also, let's consider that as humans we are all different. Some are more impulsive. Some are more trusting. Some simply don't care. This is the major avenue, or rather set of avenues, that an attacker could use to gain compromised access to the network. Let's look at a couple.

Deep Sea Phishing –

Many folks are aware of the typical 'phishing' email that says 'Hey, you've won a prize! Click on the URL below!' Hopefully, most folks now know not to click on the URL. But the problem is that this has moved into new dimensions, with orders of magnitude more intelligence behind these types of attacks. As I indicated earlier, much of the reconnaissance that an APT does is totally off of your network. They use publicly posted information: news updates, social media, blog posts (yikes – I'm writing one now!). They will not stop there either. There is a lot of financial data and profiling, as well as the tagging of individuals to certain organizational chains and projects. Once the right chain is identified, the phishing attack is launched. The target user receives a rather normal looking email from his or her boss. The email is about a project that they are currently working on and says that the boss needs to hear back on some new numbers that are being crunched. Could they take a look and get back by the end of the day? Time is of the essence as we are coming to the end of the quarter. Many would open the attachment, and understandably so. HTML-enabled email makes it even worse, in that the SMTP service chain is obscured, making it difficult to see the odd delivery path; and even then, many users wouldn't notice it. Many data breaches have occurred in just such a scenario. Once the URL is clicked or the document is opened, the malicious code goes to work and establishes two things. The first is command and control back to the attacker, the second is evasion and resilience. From that point of presence the attacker will usually escalate privileges on the local machine and then utilize it as a launching point to gain access to other systems.
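As a modest technical aid, here is a sketch of the kind of cheap header sanity checks a mail pipeline could run on such a message, using Python's standard email library. The specific headers checked and the example message are illustrative; this is in no way a replacement for a proper secure email gateway or for user awareness.

from email import message_from_string
from email.utils import parseaddr

def header_flags(raw_message: str):
    # Surface simple mismatches between the visible sender and the routing headers.
    msg = message_from_string(raw_message)
    flags = []
    from_domain = parseaddr(msg.get("From", ""))[1].rpartition("@")[2].lower()
    reply_domain = parseaddr(msg.get("Reply-To", ""))[1].rpartition("@")[2].lower()
    return_path = parseaddr(msg.get("Return-Path", ""))[1].rpartition("@")[2].lower()

    if reply_domain and reply_domain != from_domain:
        flags.append(f"Reply-To domain ({reply_domain}) differs from From ({from_domain})")
    if return_path and return_path != from_domain:
        flags.append(f"Return-Path domain ({return_path}) differs from From ({from_domain})")
    if len(msg.get_all("Received", [])) > 8:
        flags.append("unusually long Received chain")
    return flags

raw = (
    "From: Boss <boss@example.com>\r\n"
    "Reply-To: boss@examp1e-mail.net\r\n"
    "Return-Path: <bounce@examp1e-mail.net>\r\n"
    "Received: from a (a) by b; Mon, 21 Nov 2016 10:00:00 +0000\r\n"
    "Subject: Need these numbers before the board meeting\r\n"
    "\r\nPlease open the attached sheet.\r\n"
)
print(header_flags(raw))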

The Poisoned Watering Hole or Hot Spot –

We all go out on the web and we all probably have sites that we hit regularly. We all go out to lunch and most of us probably go to our favorite places regularly. This is another thing that attackers can leverage: the fact that we are creatures of habit. So let's change the scenario. Let's say that the attacker gets a good profile of the target's web behavior. They also learn where the target goes for lunch. But they don't even need to know that. Typically they will select a place that is popular with multiple individuals in the target organization. That way the probability of a hit is greater. Then they will emulate the local hot spot with aggressive parameters to force the targets to associate with it. Once that occurs, the targets gain internet access as always, but now the attacker is in the middle. As the targets go about using the web they can be redirected to poisoned sites. Once the hit occurs, the attacker shuts down the rogue hot spot and then waits for the malicious code that is now resident on the targets to dial back. From the target user's perspective the WLAN dropped and they simply re-associate to the real hot spot. Once the users go back to work, they log on and, as a part of it, the malware establishes an outbound encrypted TCP connection to the APT. These will not be full standing sessions, however, but intermittent. This makes the behavior seem more innocuous. The last thing that the APT wants is to stand out. From there the scenario proceeds much like before.
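One way to hunt for that intermittent dial-back behavior is to look for outbound connections whose timing is suspiciously regular. The sketch below assumes you have outbound connection logs as (timestamp, source, destination) records; the jitter and event-count thresholds are illustrative.

from collections import defaultdict
from statistics import mean, pstdev

def find_beacons(conn_log, min_events=6, max_jitter_ratio=0.2):
    # Flag (src, dst) pairs whose connection intervals are suspiciously regular,
    # a common trait of intermittent call-back channels.
    by_pair = defaultdict(list)
    for ts, src, dst in conn_log:
        by_pair[(src, dst)].append(ts)

    beacons = []
    for pair, times in by_pair.items():
        times.sort()
        if len(times) < min_events:
            continue
        gaps = [b - a for a, b in zip(times, times[1:])]
        avg = mean(gaps)
        if avg > 0 and pstdev(gaps) / avg < max_jitter_ratio:
            beacons.append((pair, round(avg, 1)))
    return beacons

# Example: a host calling back roughly every 300 seconds.
log = [(i * 300 + (i % 3), "10.1.1.23", "203.0.113.50") for i in range(12)]
print(find_beacons(log))

Well-built implants add jitter on purpose, so treat a check like this as one signal among many rather than a verdict.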

In both of these scenarios the user is the target. There are dozens of other examples that could be given, but I think these two suffice. The human behavior dimension is just too wide to expect technology to fulfill the role, at least at this point. Until then we need firm, clear policies that are well understood by everyone in the organization. There also needs to be firm enforcement of the policies in order for them to be effective. This is all in the human domain, not in the technology domain. But technology can help.

 

It’s all about having a goal as well

When an advanced persistent threat organization first starts to put you in their sights, they usually have a pretty good idea of what they are looking for or what they want to do. Only amateurs gain compromised access and then rummage or blunder about. It's not that an APT wouldn't take information that it comes across if it found it useful, but they usually have a solid goal and a corresponding target data set. What that is depends on what the target does. Credit card data is often a candidate, but it could be patient record data, confidential financial or research information; the list can be endless. We discussed this in my previous article on data discovery and micro-segmentation practices. It is critical that this data gets identified and accounted for. Because you can bet that the APT has.

This means that there is deliberate action on behalf of the APT. Again, only amateurs are going to bungle about. The other thing is that time is, unlike in the movies, not of the essence! The average residency number that I quoted earlier illustrates this. In short, they are highly intelligent about their targets, they are very persistent and will wait many months for the right opportunity to move, and they are very quiet.

This means that you need to get your house in order on the critical data that you need to protect. You need to know how it moves through your organization and you need to establish a solid idea of what normal is within those data flows. Then you need to move to fight to protect it.

The Internet – The ultimate steel cage

When you think about it, you are in the ultimate steel cage. You have to have a network. You have to have an Internet presence of some sort. You need to use it. You cannot go away. If you do you will go out of business. You are always there and so is the APT. The APT also will not go away. It will try and wait, and wait and try, and go on and on until it succeeds in compromising access. This paradigm means that you cannot win. No matter what you as a security professional do in your practice, the war can never be won. But the APT can win. It can win big. It can win to the point of putting you out of business. This creates a very interesting set of gaming rules, if you are interested in that sort of thing. In a normal zero sum game, there is a set of pieces or tokens that can be won. Two players can sit down and, with some sort of rules and maybe some random devices such as dice, play the game. The winner is established by the first player to win all of the tokens. But if we remove the dice we have a game more like chess, where the players lose or win pieces based on skill. This is much more akin to the type of 'game' that we like to think we play in information security. Most security architects I know do not use dice in their practice. Now in a normal game of chess, each player is more or less equal, with the only real delta being skill. But remember, you are sitting at the board with the APT. So here are the new rules. You cannot win all of his or her pieces. You may win some, but even if you come down to the last one, you need to give it back. What's more, there will not be just one. There will be 'some number' of pieces that you cannot win. Let's say that a quarter or maybe even half of the pieces are 'unwinnable'. Well, it is pretty clear that you are in a losing proposition. You cannot win. The best you can do is stay at the board for as long as you can. Then also consider that the APT's skill and resources may be just as great as, if not greater than, yours. Does that help put things in perspective?

So the scenario is stark, but it is not hopeless. The game can actually go on for quite some time if you are smart in the way you play. Remember, I said 'some number' of pieces that you cannot win; I did not say which types. If you look at a chess board you will note that the power pieces and the pawns are exactly half the count each. This means that you could win all or most of the power pieces and leave the opponent with a far reduced ability to do damage to you, as long as you aren't stupid. So mathematically the scenario is not hopeless, but it is not bright either. While you can never win, you can establish a position of strength that allows you to stand indefinitely.

Realize that the perimeter is now everywhere

Again, the old notion that we can somehow draw a line around our network and systems is becoming antiquated. The trends in BYOD, mobility, virtualization and cloud have forever changed what a security perimeter is. We have to realize that we are in a world of extreme mobility. Users crop up everywhere, demanding access from almost anywhere, with almost any consumer device. These devices are of consumer grade, with little or no thought given to systems security. As a result these devices, if not handled correctly with the appropriate security practices, become a very attractive vector for malicious behavior.

This means that the traditional idea of a network perimeter that can be protected is no longer sufficient. We need to realize that there are many perimeters and that these can be dynamic due to the demands of wireless mobility. This doesn't mean that firewalls and security demarcations are no longer of any use; it just means that we need to rethink the way we use them and pair them with new technologies that can vastly empower them.

It is becoming more and more accepted that micro-segmentation is one of the best strategies for a comprehensive security practice and for making life as difficult as possible for the APT. But this can't be a simple set of segments off of a single firewall; it must be multiple tiered segments with traffic inspection points that can view the isolated data sets within. The segmentation provides two things. First, it creates a series of hurdles for the attacker, both on the way in and on the way out as they seek the exfiltration of data. Second, and perhaps less obviously, segmentation provides isolated traffic patterns with very narrow application profiles and a narrow set of interacting systems. In short, these isolated segments are much easier to 'normalize' from a security perspective. Why is this important? It is important because in the current environment 100% prevention is not a realistic proposition. If an APT has targeted you, they will get in. You are dealing with a very different beast here. The new motto you need to learn is that "Prevention is an ideal, but detection is a MUST!"

In order to detect, you need to know what is normal. To make this clear, let's use a mundane example of a shoplifter in a store. The shoplifter wants to look like any other normal shopper; they will browse and try on various items like anyone else. In other words, they strongly desire to blend into the normal behavior of the rest of the shoppers in the store. An APT is no different. They want to blend into the user community and appear like any other user on the network. As a matter of fact, they will often commandeer normal users' machines by the methods discussed earlier. They will learn the normal patterns of behavior and try as much as possible to match them. But at some point, in order to shoplift, the shopper needs to diverge from the normal behavior. They need to use some sort of method to take items out of the store undetected. In order to do this, they need to avoid the direct view of video surveillance and find a moment when they can 'lift' the items. But regardless of the technique, there needs to be a delta. Point A, product… point B, no product. The question is, will it be noticed? This is what detection is all about. In a retail environment it is also accepted that a certain amount of loss needs to be 'accepted' as the normal business risk of operations. The reason for this is that there is a cost point where further expense in the areas of prevention and detection does not make any fiscal sense.

It is very much the same thing with APTs. You simply cannot seal off your borders. They will get in. The question is how far they penetrate, how much they are able to discover about you and what information they are able to pull out. There is a common joke in the security industry; it goes like this. "If you want a totally secure computer, unplug all network connections. Seal it off physically with thick walls, including all and any RF, with no entrance. Then take several armed guards and an equivalent number of very large attack dogs and place them around the perimeter 24 x 7. Also be sure that you have total independence of power, which means you need a totally separate micro grid that in turn cannot be compromised, secured using the above methods." Like all tech sector jokes, the humor is dry at best and serves to show the irony of a thought process. Such a perfectly secure computer would be perfectly useless! We, like the shop owner, need to assume and accept a certain amount of risk and exposure to be on line. It is simply the reality of the situation, hence the steel cage analogy I used earlier. So detection is of absolute key importance to the overall security model.

How to catch a thief

So the next question is, how do you detect that an APT is in your network? Additionally, how do you do it as early as possible, taking into consideration that time is on the attacker's side – not yours? Once again, it serves to revisit the analogy of the shoplifter. Retail outfits usually have store detectives. These individuals are specialists in retail security. They know the patterns of behavior and inflections of movement that will draw attention to a certain individual. Many of these individuals have a background in psychology and have been specifically trained to watch for telltale signs. Note that such indicators cannot cause arrest or even ejection from the store. They can only serve to highlight that additional attention is needed on a certain individual. Going further, there are often methods to control access to dressing rooms and count items before entry and upon exit. This could be viewed both as a preventative and as a detective measure. There are also usually RF tags that will raise an alarm if the item is removed from the premises. Often these tags are ink loaded so that they will despoil the product if removal is attempted without the correct tool. All of this can be more or less replicated in the cyber environment. The key is what to look for and how to spot it.

A compromised system

This is the obvious thing to look for, as it generally all starts here. But the problem is that APTs are pretty good at hiding and staying under cover until the right time. So the key is to look for patterns of behavior that are unusual from a historical standpoint. This gets back to the concept of normalization. In order to know that a user's behavior is abnormal, it is important to have a good idea of what the normal behavior profile is. Some things to look for are unusual patterns of session activity: lots of peer-to-peer activity where in the past there was little or none. Port scanning and the use of discovery methods should be monitored as well. Look for unusual TCP connections, particularly peer-to-peer or outbound encrypted connections.
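Here is a minimal sketch of two such checks, assuming you can summarize connection logs as (source, destination, destination port) records; the fan-out thresholds and the '10.1.' user subnet are purely illustrative.

from collections import defaultdict

def scan_like_hosts(conn_log, port_fanout=100, host_fanout=50):
    # Flag hosts that touch an unusually large number of distinct ports or peers,
    # a typical signature of internal discovery or port scanning.
    ports, peers = defaultdict(set), defaultdict(set)
    for src, dst, dst_port in conn_log:
        ports[src].add(dst_port)
        peers[src].add(dst)
    return [h for h in ports
            if len(ports[h]) > port_fanout or len(peers[h]) > host_fanout]

def new_peer_to_peer(conn_log, baseline_pairs, user_subnet="10.1."):
    # Flag workstation-to-workstation sessions that were absent from the baseline.
    return [(s, d) for s, d, _ in conn_log
            if s.startswith(user_subnet) and d.startswith(user_subnet)
            and (s, d) not in baseline_pairs]

baseline_pairs = {("10.1.1.23", "10.1.1.40")}
log = [("10.1.1.23", "10.1.1.%d" % i, 445) for i in range(2, 80)]
print(scan_like_hosts(log))
print(new_peer_to_peer(log, baseline_pairs)[:3])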

Remember that there is a theory to all types of intrusion. First, an attacker needs to compromise the perimeter to gain access to the network. Unless the attacker is very lucky, they will not be where they need or want to be. This means that a series of lateral and northbound moves will be required in order to establish a foothold and command and control. This is why it is not always a good idea to take a suspicious or malicious node off of the network. You can gain quite a bit by watching it. As an example, if a newly compromised system begins to run a series of scans and no other behavior, then it is probably an isolated or early compromise. If the same behavior is accompanied by a series of encrypted TCP sessions, then there is a good probability that the attacker has an established footprint and is working to expand their presence.

Malicious or suspicious activities

Once again, normalization is required in order to flag unusual activities on the network. If you can set up a lab to provide an idealized 'clean' runtime environment, a known good pattern and corresponding signature can be developed. This idealized implementation provides a clean reference that is normalized by its very nature. After all, you don't want to normalize an environment with an APT in it, now do you? Once this clean template is created, it is easy to spot deltas and unusual patterns of behavior. These should be investigated immediately. Systems should be located and identified with the corresponding user if appropriate. There may or may not be confiscation of equipment. As pointed out earlier, sometimes it is desirable to monitor the attacker's activities in a controlled fashion, with the option of quarantine at any point.
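A simple way to express that clean-template idea in code is a set comparison between the lab reference and a live system. The inventory fields below (processes and listening ports) are illustrative stand-ins for whatever your endpoint tooling actually reports.

def clean_template(lab_inventory):
    # Record the processes and listening ports observed on a clean reference build.
    return {
        "processes": set(lab_inventory["processes"]),
        "listening_ports": set(lab_inventory["listening_ports"]),
    }

def deltas(template, live_inventory):
    # Anything present on the live system but absent from the clean template
    # deserves immediate investigation.
    return {
        "unexpected_processes":
            set(live_inventory["processes"]) - template["processes"],
        "unexpected_ports":
            set(live_inventory["listening_ports"]) - template["listening_ports"],
    }

template = clean_template({"processes": ["sshd", "nginx"], "listening_ports": [22, 443]})
print(deltas(template, {"processes": ["sshd", "nginx", "kworker_helper"],
                        "listening_ports": [22, 443, 4444]}))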

 

Exfiltration & C2 – There must be some kind of way out of here (said the joker to the thief)

In order for any information to leave your organization there has to be an outbound exfiltration channel that has been set up prior. Obviously, this is something that the APT has been working to accomplish in the initial phases of compromise. Again, going back to the analogy of the shoplifter, this is another area where the APT has to diverge from the normal behavior of a user. The APT needs to establish a series of outbound channels to move the data out of the organization. In the earlier days, a single outbound encrypted TCP channel would be established to move data as quickly as possible. But now that most threat protection systems are privy to this, they tend to establish networks that utilize a series of shorter lived outbound sessions, moving only smaller portions of the data so as to blend in to the normal activities of the network. But even with this improvement in technique, they still have to diverge from the normal user pattern. If you are watching closely enough you will catch it. But you have to watch closely and you have to watch 24 by 7.

Here is a list of things that you want to look for:

1). Logon activity

Logons to new or unusual systems can be a flag of malicious behavior. New or unusual session types are also an important flag to watch for, particularly new or unusual outbound encrypted sessions. Other flags are unusual time of day or location. Watch also for jumps in activity or velocity, as well as shared account usage or the use of privileged accounts.
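A rough sketch of how such logon flags could be raised, assuming per-user logon history is available; the working-hours window and field names are illustrative.

def logon_flags(event, history, work_hours=range(7, 20)):
    # Flag logons to systems a user has never touched, or logons at odd hours.
    user, host, hour = event
    flags = []
    if host not in history.get(user, set()):
        flags.append(f"{user}: first ever logon to {host}")
    if hour not in work_hours:
        flags.append(f"{user}: logon at {hour:02d}:00, outside normal hours")
    return flags

history = {"fred": {"files01", "mail01"}}
print(logon_flags(("fred", "hr-db01", 3), history))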

2). Program execution

Look for new or unusual program executions, or the execution of programs at unusual times of the day or from new or unusual locations. Also watch for the execution of a program from privileged account status rather than a normal user account.

3). File access

You want to catch data acquisition attempts before they succeed, but if you can't, you at least want to catch the data as it attempts to leave the network. Look for unusually high-volume access to file servers or unusual file access patterns. Also be sure to monitor cloud-based sharing uploads, as these are a very good way to hide in the flurry of other activity.

4). Network activity

New IP addresses or secondary addresses can be a flag. Unusual DNS queries should be looked into, particularly those to destinations with a bad reputation or no reputation at all. Look for correlation between the above points and new or unusual network connection activity. Also look for unusual or suspicious application behaviors. These could be dark outbound connections that use lateral movement internally. Many C2 channels are established in this fashion.

5). Database access

Most users do not need to access the database directly. Direct access is an obvious flag, but also look for manipulated application calls that perform sensitive table access, modifications or deletions. Also be sure to lock down the database environment by disabling many of the added options that most modern databases provide; be aware that many of them are enabled by default. Be aware of which services are exposed out of the database environment. An application proxy service should be implemented to prevent direct access in a general fashion.
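As a small sketch of enforcing that last point analytically, the check below flags any session to a common database port that does not originate from the sanctioned application proxy tier. The port list and proxy addresses are illustrative assumptions.

DB_PORTS = {1433, 1521, 3306, 5432}
APP_PROXY_HOSTS = {"10.3.3.10", "10.3.3.11"}   # the only sanctioned DB clients

def direct_db_access(conn_log):
    # Flag any session to a database port that does not come from the proxy tier.
    return [(src, dst, port) for src, dst, port in conn_log
            if port in DB_PORTS and src not in APP_PROXY_HOSTS]

print(direct_db_access([
    ("10.3.3.10", "10.4.4.5", 5432),   # expected: app proxy to DB
    ("10.1.1.23", "10.4.4.5", 5432),   # a workstation talking straight to the DB
]))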

6). Data Loss Prevention methods

Always monitor sensitive data movement. As pointed out in the last blog, if you have performed your segmentation design correctly according to the confidential data footprint, then you should already have isolated communities of interest that you can monitor very tightly, particularly at the ingress and egress of the microsegments. Always monitor FTP usage as well as, as mentioned earlier, cloud services.

Analysis, but avoid the paralysis

The goal is to arrive at a risk score based on the aggregate of the above. This involves the session serialization of hosts as they access resources. As an example, a new secondary IP address is created and an outbound encrypted session is established to a cloud service, but earlier in the day, or perhaps during the wee hours, that same system accessed several sensitive file servers with the administrator profile. Now this is a very obvious set of flags; in practice these can and will be increasingly more subtle and difficult to tease out. This is where security analytics enters the picture. There are many vendors out there who can provide products and solutions in this space. There are several firms and consortiums that provide ratings for these various vendors, so we will not attempt to replicate that here. The goal of this section is to show how to use such analytics.
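To illustrate the idea, here is a minimal sketch of aggregating the indicator categories above into a per-host risk score. The weights and the escalation threshold are illustrative; in a real deployment they would be tuned to your environment and fed by your analytics platform rather than hard-coded.

INDICATOR_WEIGHTS = {
    "logon": 2,
    "program_execution": 2,
    "file_access": 3,
    "network": 3,
    "database": 5,
    "dlp": 5,
}

def risk_score(host_flags):
    # Aggregate weighted indicator hits into a single per-host score.
    return sum(INDICATOR_WEIGHTS.get(kind, 1) * len(flags)
               for kind, flags in host_flags.items())

def triage(hosts, escalate_at=8):
    # Rank hosts and return those worth a human analyst's immediate attention.
    scored = sorted(((risk_score(f), h) for h, f in hosts.items()), reverse=True)
    return [(h, s) for s, h in scored if s >= escalate_at]

hosts = {
    "10.1.1.23": {"logon": ["odd hour"], "network": ["new encrypted outbound"],
                  "file_access": ["bulk reads on finance share"]},
    "10.1.1.40": {"logon": [], "network": ["new secondary IP"]},
}
print(triage(hosts))

The point of the threshold is not to automate the verdict, but to decide which hosts get a human's attention first, which leads directly into the next section.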

The problem with us humans is that if we are barraged with tons of data and forced to pick out the significant data, we are woefully inefficient. First of all, we have a very large capacity for missing certain data sets. How often have you heard the saying, "Another set of eyes"? It's true, though we don't like to admit it: when faced with large data sets we can miss certain patterns that others will see, and vice versa. This brings two lessons. First, never manually analyze data alone; always have another set of eyes go over it. Second, perhaps we are not the best choice for this type of activity. There is another reason to consider, though. It's called bias. We are emotional beings. While we like to think we are always intellectual in our decisions, this has been proven not to be the case. As a matter of fact, many neuroscience researchers are saying that without emotions we really can't make a decision. At its root, decision making for us is an emotional endeavor.

So enter computers and the science of data analytics. Computers and algorithms do not exhibit the same shortcomings as us humans. But they exhibit others. They are extremely good at sifting through large sets of data, identifying patterns and then analyzing them against certain rules such as those noted above. They are also extremely fast at these tasks when compared to us. What they offer will be unadulterated and pure, without bias, IF and only if the algorithms are written correctly and do not induce any bias in their design. This whole subject warrants another blog article sometime, but for now suffice it to say that algorithms and theories of operation, as well as application design, are all done by us. So the real fact of the matter is that there will be biases embedded in any solution. But there is one thing that computers do not do well yet. They can't look at patterns and emotionally 'suspect' an activity, 'knowing' the normal behavior of a user. As an example, to say to itself, "Fred just wouldn't do this type of thing. Perhaps his machine has been compromised. I think I should give him a call before I escalate this. We can confiscate the machine if this is true, get him a replacement and get the compromised unit into forensics." Note that I say for now. Artificial intelligence is moving forward at a rapid pace, but who is to say that AI will not eventually hit a roadblock on bias just like we have? Many cognitive researchers are now coming to this conclusion. So it is clear that we and computers will be co-dependent for the foreseeable future, each side keeping the other from invoking bias. The real fact is that there will always be false negatives and false positives. The cyber-security universe simply moves too fast to assume otherwise. So the concept of setting and forgetting is not valid here. These systems will need assistance from humans, particularly once a system has been identified as 'suspect'.

Automation and Security

At Avaya we have developed a shortest path bridging network fabric, which we refer to as SDN Fx, that is based on three basic, mutually complementary security principles.

Hyper-segmentation

This is a new term that we have coined to indicate the primary differences between this new approach and traditional network micro-segmentation. First, hyper-segments are extremely dynamic and lend themselves well to automation and dynamic service chaining, as is often required with software defined networks. Second, they are not based on IP routing and therefore do not require traditional route policies or access control lists to constrict access to the micro-segment. These two traits create a service that is well suited to security automation.

 

Stealth

We have spoken to this many times in the past. Because SDN Fx is not based on IP, it is dark from an IP discovery perspective. Many of the topological aspects of the network, which are of key importance to an APT, simply cannot be discovered by traditional port scanning and discovery techniques. So the hyper-segment holds the user or intruder within a narrow and dark community that has little or no communications capability with the outside world except through well-defined security analytic inspection points.

Elasticity

This refers to the dynamic component. Because we are not dependent on IP routing to establish service paths, we can extend or retract secure hyper-segments based on authentication and proper authorization. Just as easily, SDN Fx can retract a hyper-segment, perhaps based on an alert from security analytics that something is amiss with the suspect system. But as we recall, we may not want to simply cut the intruder off; we may instead place them into a forensic environment where we can watch their behavior and perhaps gain insight into the methods used. There may even be the desire to redirect them into honey pot environments, where whole networks can be replicated in SDN Fx at little or no cost from a networking perspective.

Welcome to my web (It’s coated with honey! Yum!)

If we take the concept of the honey pot and extend it with SDN Fx, we can create a situation where the APT no longer has complete confidence of where they are and whether they are looking at real systems. Recall that the APT relies on shifting techniques that evolve over time, even during a single attack scenario. There is no reason why you could not do the same. Modern virtualization of servers and storage, along with the dynamic attributes of SDN Fx, creates an environment where we can keep the APT guessing and always working without a total scope of knowledge about the network. Using SDN Fx we can automate paths within the fabric to redirect suspect or known malicious systems to whatever type of forensic or honey pot service is required.

Avaya has been very active in building out the security ecosystem in an open systems approach with a networking fabric based on IEEE standards. The concept of closed loop security now becomes a reality. But we need to take it further. Humans still need to communicate and interact about these threats on a real time basis. The ability to alert staff to threats, and even to set up automated conferencing where staff can compare data and decide on the next best course of action, is now possible, as such services can be rendered in only a couple of minutes in an automated fashion.

Figure 6. Hyper-segmentation, Stealth and Elasticity to create the ‘Everywhere Perimeter’

All of this places the APT in a much more difficult position. As the illustration above shows, hyper-segmentation creates a series of hurdles that need to be compromised before access to a given resource is possible. It then becomes necessary to create outbound channels for the exfiltration of data across the various hyper-segment boundaries and associated security inspection points. Also note that, as the figure above illustrates, you can create hyper-segments where there simply is no connectivity to the outside world. For all intents and purposes they are totally and completely orthogonal. The only way to gain access is to actually log into the segment. This creates even more difficulty for the APT, as exfiltration becomes harder and, if you are watching, easier to catch.

In summary

One could say, and most probably should say, that this was an occurrence that was bound to happen. While I don't like the term 'destined', I must admit that it is particularly apt here. As our ability to communicate and compute has increased, it has created a new avenue for illegal and illegitimate usage. The lesson here is that the Internet does not make us better people. It only makes us better at being what we already are. It can provide immense transformative power to convert folks to perform unspeakable acts, and it can, on a few hours' notice, take a global enterprise to its knees.

But it can also be a force for a very powerful good. As an example, I am proud to be involved, alongside colleagues such as Mark Fletcher and Avaya in the wider sense, in the effort to support Kari's Law for consistent 9-1-1 emergency services behavior. Mark is also actively engaged abroad in the subject of emergency response, as I am in security. The two go hand in hand in many respects, because the next thing the APT will attempt is to take out our ability to respond. The battle is not over. Far from it.


What’s the Big Deal about Big Data?

July 28, 2014


It goes without saying that knowledge is power. It gives one the power to make informed decisions and avoid miscalculations and mistakes. In recent years the definition of knowledge has changed slightly. This change is the result of increases in the ease and speed of computation, as well as the sheer volume of data that these computations can be exercised against. Hence, it is no secret that the rise of computers and the Internet has contributed significantly to this capability.
The term that is often bandied about is “Big Data”. This term has gained a certain mystique that is comparable to cloud computing. Everyone knows that it is important. Unless you have been living in a cave, you most certainly have at least read about it. After all, if such big names as IBM, EMC and Oracle are making a focus of it, then it must have some sort of importance to the industry and market as a whole. When pressed for a definition of what it is, however, many folks will often struggle. Note that the issue is not that it deals with the computation of large amounts of data, as its name implies, but more that many folks struggle to understand what it would be used for.
This article is intended to clarify the definition of Big Data and Data Analytics/Data Science and what they mean. It will also talk about why they are important and will become more important (almost paramount) in the very near future. Also discussed will be the impact that Big Data will have on the typical IT departments and what it means to traditional data center design and implementation. In order to do this we will start first with the aspect of knowledge itself and the different characterizations of it that have evolved over time.

I. The two main types of ‘scientific’ knowledge

To avoid getting into an in-depth discussion of epistemology, we will limit this section of the article to just the areas of ‘scientific’ knowledge or, even more specifically, ‘knowledge of the calculable’. This is not to discount other forms of knowledge. There is much to be offered by spiritual and aesthetic knowledge as well as many other classifications, including some that would be deemed scientific, such as biology*. But here we are concerned with knowledge that is computable or knowledge that can be gained by computation.

* This is rapidly changing however. Many recent findings show that many biological phenomena have mathematical foundations. Bodily systems and living populations have been shown to exhibit strong correlations to non-linear power law relationships. In a practical use example, mathematical calculations are often used to estimate the impact of an epidemic on a given population.

Evolving for centuries but coming to fruition with Galileo in the 16th century, it was discovered that nature could be described and even predicted in mathematical terms. The dropping of balls of different sizes and masses from the tower of Pisa is a familiar myth to anyone with even a slight background in the history of science. I say myth, because it is very doubtful that this ever literally took place. Instead, Galileo used inclined planes and ‘perfect’ spheres of various densities to work out that the acceleration due to gravity is constant regardless of size or mass. Lacking an accurate timekeeping device, he would sing a song to keep track of the experiments; being an accomplished musician, he had a keen sense of timing, and the inclined planes provided the extended time such a method required. He correctly realized that it was resistance, or friction, that caused the deltas that we see in the everyday world. Everyone knows that when someone drops a cannon ball and a feather off a roof, the cannon ball will strike the earth first. It is not common sense that in a perfect vacuum both the feather and the cannonball will fall at the exact same rate. It actually takes a video to prove it to the mind, and one can be found readily on the Internet.

The really important thing is that Galileo calculated this from his work with spheres and inclined planes, and that the actual experiment was not carried out until many years after his death, as the ability to generate a near perfect vacuum did not exist in his time. I find this very interesting as it says two things about calculable knowledge. First, it allows one to explain why things occur as they do. Second, and perhaps more importantly, it allows one to predict the results once one knows the mathematical pattern of behavior. Galileo realized this. Even though he was not able to create a perfect vacuum, by the meticulous calculation of the various values involved (with rather archaic mathematics – the equal sign had not even been invented yet, nor most of the symbols that we now find familiar) he was able to arrive at this fact. Needless to say, this goes against all common sense and experience. So much so that this, as well as his work in the fledgling science of astronomy, almost landed him on the hot seat (or stake) with the Church. As history attests, however, he stuck to his guns, and even after the Inquisitional Council had him recant his theories on the heliocentric nature of the solar system, he whispered of the earth… “Yet it still moves”.
If we fast forward to the time of Sir Isaac Newton, this insight was made crystalline by Newton’s laws of motion, which described the movement of ‘everything’ from the falling of an apple (no myth – this actually did spark his insight, though it did not hit him on the head) to the movement of the planets with a few simple mathematical formulas. Published as the ‘Philosophiae Naturalis Principia Mathematica’, or simply ‘Principia’, in 1687, this was the foundation of modern physics as we know it. The concept that the world was mathematical, or at least could be described in mathematical terms, was now something that was not only validated but demonstrable. This set of events led to the eventual ‘positivist’ concept of the world that reached its epitome with the following statement made by Pierre Laplace in 1814.
“Consider an intelligence which, at any instant, could have knowledge of all forces controlling nature together with the momentary conditions of all the entities of which nature consists. If this intelligence were powerful enough to submit all of this data to analysis, it would be able to embrace in a single formula the movements of the largest bodies in the universe and those of lighter atoms; for it, nothing would be uncertain; the future and the past would be equally present to its eyes.”

Wow. Now THAT’s big data! Sounds great! What the heck happened?

Enter Randomness, Entropy & Chaos

In roughly the same time frame as Laplace, many engineers were using these ‘laws’ to attempt to optimize new inventions like the steam engine. One such researcher was a French scientist by the name of Nicolas Léonard Sadi Carnot. His research focused on the movement of heat within the engine and on conserving as much of the energy as possible for work. In the process he came to realize that there was a feedback cycle within the engine that could be described mathematically and even monitored and controlled. He also realized that some heat is always lost. It just gets radiated out and away from the system and is unusable for the work of the engine. As anyone who has stood next to a working engine of any type will attest, they tend to get hot. This cycle bears his name as the Carnot cycle. This innovative view led to the foundation of a new branch of physics (with the follow-on help of Ludwig Boltzmann) known as thermodynamics: the realization that all change in the world (and the universe as a whole) is the movement of heat, more specifically from hot to cold. Without going into detail on the three major laws of thermodynamics, the main point for this discussion is that as change occurs it is irreversible. Interestingly, more recently developed information theory validates this, as it shows that order can actually be interpreted as ‘information’ and that over time this information is lost to entropy; that is, there is a loss of order. Entropy is as such a measurement of disorder within a system. This brings us to the major inflection point of our subject. As change occurs, it cannot be run in reverse like a tape and arrive at the same inherent values. This is problematic, as the laws of Newton are not reversible in practice, though they may be on a piece of paper. As a matter of fact, many such representations up to modern times, such as the Feynman diagrams used to illustrate the details of quantum interactions, are in fact reversible. What gives?
The real crux of this quick discussion is the realization that reversibility is largely a mathematical expression that starts to fall apart as the number of components in the overall system gets larger. A very simple example is one with two billiard balls on a pool table. It is fairly straightforward to use the Newtonian laws to reverse the equation. We can also do so in practice. But now let us take a single cue ball and strike a large number of other balls. Reversing the calculation is not nearly so straightforward. The number of variables to be considered begins to go beyond our ability to calculate, much less control. They most certainly are not reversible in the everyday sense. In the same sense, I can flip a deck of playing cards in the air and bet you with ultimate confidence that the cards will not come down in the same order (or even the same area!) in which they were thrown. Splattered eggs do not fall upwards to reassemble on the kitchen counter. And much to our chagrin, our cars do not repair themselves after we have had a fender bender. This is entropy, captured by the second law of thermodynamics, which states that some energy within a system is always lost to friction and heat. This dissipation can be minimized but never eliminated. As a result, the less entropy an engine generates the more efficient it is in its function. Hmmmm, what told us that? A lot of data, that’s what, and back then things were done with paper and pencil! A great and timely discovery for its time, as it helped move us into the industrial age. The point of all of this, however, is that in some (actually most) instances, information on history is important in understanding the behavior of a system.

The strange attraction of Chaos

We need to fast forward again. Now we are in the early 1960’s with a meteorologist by the name of Edward Lorenz. He was interested in the enhanced computational abilities that new computing technology could offer toward the goal of predicting the weather. Never mind that it took five days’ worth of calculation to arrive at the forecast for the following day. At least the check was self-evident, as the weather in question had already occurred four days earlier!
As the story goes, he was crunching some data one evening and one of the machines ran out of paper tape. He quickly refilled the machine and restarted it from where the calculations left off… manually, by typing them in. He then went off and grabbed a cup of coffee to let the machine churn away. When he returned he noticed that the computations were way off the values that the sister machines were producing. In alarm he looked over his work to find that the only real difference was the decimal offset of the initial values (the interface only allowed three decimal places while the actual calculation was running with six). As it turns out, the rounded values he typed in manually produced a wildly different result from the same calculation. This brought about the realization that many if not most systems are sensitive, at times extremely so, to what are now termed ‘initial conditions’.
There is something more, however. Lorenz discovered that if some systems are looked at long enough and with the proper focus of granularity, a quasi-regular or quasi-periodic pattern becomes discernible that allows for a general qualitative description of a system and its behavior, without the ability to say quantitatively what the state of any particular part of the system will be at a given point in time. These are termed mathematical ‘attractors’ within a system: a set of power-law-based relationships toward which a system, if left unperturbed, is drawn and at which it will be maintained. These attractors are quite common. They are more or less required for all dissipative systems. In essence, it is a behavior, describable mathematically, that by its nature keeps a system a system, with just enough energy coming in to offset the entropy that must inevitably go out. The whole thing is fueled by the flow of energy (heat) through it. By the way, both you and I are examples of dissipative systems and yes, we are based on a lot of information. But here is something to consider: stock markets are dissipative systems too. The only difference is that energy is replaced by money.
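
As a minimal sketch of this sensitivity (assuming NumPy and SciPy are available; the parameters are the canonical textbook values and the one-millionth nudge is purely illustrative), the classic Lorenz equations can be integrated from two nearly identical starting points and the growing separation printed out:

# Two runs of the Lorenz system that differ only by a 1e-6 nudge in the
# initial x value; their separation grows until it is as large as the
# attractor itself.
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

t_span = (0.0, 40.0)
t_eval = np.linspace(*t_span, 4000)

run_a = solve_ivp(lorenz, t_span, [1.0, 1.0, 1.0], t_eval=t_eval)
run_b = solve_ivp(lorenz, t_span, [1.000001, 1.0, 1.0], t_eval=t_eval)

separation = np.linalg.norm(run_a.y - run_b.y, axis=0)
print("separation near t=5 :", separation[500])   # still tiny
print("separation near t=40:", separation[-1])    # comparable to the attractor's size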

The problem with Infinity

The question is how sensitive do we have to be, and what level of focus will reveal a pattern? How many decimal places can you leave off and still have faith in the calculations that result? This may sound like mere semantics, but the truncation in Lorenz’s work created results that were wildly different. (Otherwise he might very well have dismissed it as noise*)

* Actually, in the electronics and communications area this is exactly what the phenomenon was termed for decades. Additionally, it was deemed ‘undesirable’ and engineers sought to remove or reduce it, so its nature was never researched further. More recently, efforts to leverage these characteristics are being investigated.

Clearly the accuracy of a given answer is dependent on how accurately the starting conditions are measured. Again, one might say that perhaps this is the case for a minority of systems, but that in most cases any difference will be minor. Alas, this is not true. Most systems are like this. The term is ‘non-linear’. Small degrees of inaccuracy in the initial values of the calculations in non-linear systems can result in vastly different end results. One of the reasons for this is that, with the seemingly unassociated concept of infinity, we touch on a very sticky subject. What is an infinite, or infinitely accurate, initial condition? As an example, I can take a meter and divide it by 100 to arrive at centimeters, and then take a centimeter and divide it further to arrive at millimeters, and so forth… This process can seemingly go on forever! Actually, it cannot, but the answer is not a comforting one for our cause. We can continue to divide until we arrive at the Planck length, the smallest meaningful unit of distance, below which the very notions of space and time become meaningless! In essence, a foam of quantum probability from which emerges existence as we know it.
The practical question must be: when I make a measurement, how accurate do I need to be? Well, if I am cutting a two by four for the construction of some macro level structure such as a house or shed, I only need to be accurate to the 2nd, maybe 3rd decimal place. On the other hand, if I am talking about cutting a piece of scaffolding fabric to fit surgically into a certain locale within an organ to facilitate a substrate for regenerative growth, the required accuracy increases by orders of magnitude, possibly out to 6 or 8 decimal places. So the question to ask is: how do we know how accurate we have to be? Here comes the pattern part! We know this by the history of the system we are dealing with! In the case of a house, we have plenty of history (a strong pattern – we have built a lot of houses) to deduce that we need only be accurate to a certain degree and the house will successfully stand. In the case of micro-surgery we may have less history (a weaker pattern – we have not done so many of these new medical procedures), but enough to know that a couple of decimal places will just not cut it. Going further, we even have things like the weather, where we have lots and lots of historic data but the exactitude and density of the information still limits us to only a few days of relatively accurate predictive power. In other words, quite a bit of our knowledge is dependent on the granularity and focus with which it’s analyzed. Are you starting to see a thread? Wink, wink.

Historical and Ahistorical knowledge

It all comes down to the fact that calculable knowledge is dependent on us having some idea of the history and conditions of a given system. Without these we cannot calculate. But how do we arrive at these initial values? Well, by experiment of course. We all recall the days back in school with the tedious hours of experimentation in exercises where we knew full well the result. But think of the first time that this was realized by the likes of, say, Galileo. What a great moment it would have been! But an experiment by definition cannot be a ‘one-time thing’. One would have to run an experiment multiple times with ‘exactly’ the same conditions, or varying the conditions slightly in a controlled fashion depending on what one was trying to prove. This brings about a strong concept of history. The experimental operations have been run, and we know that such a system behaves in such a way due to historical and replicable examples. Now we plug those variables into the mathematics and let it run. We predict from those calculations and then validate with further experiments. Basic science works on these principles, so as such we should say that all calculable knowledge is historic in nature. But it could also be argued that for certain immutable ‘mathematical truths’ some knowledge is ahistorical. In other words, like Newton’s laws* and like the Feynman diagrams, some knowledge just doesn’t care about the nature or direction of time’s arrow. Be that as it may, it could further be argued that any of these would require historical knowledge in order to interpret their meaning or even to find that they exist!

* Newton’s laws are actually approximations of reality. In normal everyday circumstances the linear laws work quite well. When speed or acceleration is brought to extremes, however, the laws fail to yield a correct representation. Einstein’s General Theory of Relativity provides a more accurate way to represent the non-linear reality under these extreme conditions (actually they exist all the time, but in normal environments the delta from the linear is so small as to be negligible). The main difference: in Newton’s laws space and time are absolute. The clock ticks the same regardless of motion or location, hence linear. In Einstein’s theory space and time are mutable and dynamic. The clock ticks differently for different motions or even locations. Specifically, time slows with speed as the local space contracts, hence non-linear.

As an example, you can toss me a ball from about ten feet away. Depending on the angle and the force of the throw, I can properly calculate where the ball will be at a certain point in time. I have the whole history of the system from start to finish. I may use an ahistorical piece of knowledge (i.e. the ball is in the air and moving towards me), but without knowledge of the starting conditions for this particular throw I am left with little data and will likely not catch the ball. In retrospect though, it’s amazing that our brains can make this ‘calculation’ all at once. Not explicitly of course, but implicitly. We know that we have to back up or run forward to catch the ball. We are not doing the actual calculations in our heads (at least I’m not). But if I were to run out onto the field and see the ball that you threw in mid-air with no knowledge of the starting conditions, I am essentially dealing with point zero in knowledge of a system that is pre-existing. Sounds precarious, and it is, because this is the world we live in. But wait! Remember I have a history in my head of how balls in the air behave! I can reference this library and get a chunk of history in very small sample periods (the slow motion effect we often recall) and yes, perhaps I just might catch that ball – provided that the skill of the thrower was commensurate with the skill of those I have knowledge of. Ironically, the more variability there is in my experience with throwers of different skill levels, the higher the probability of my catching the ball in such an instance. And it’s all about catching the ball! But it also says something important about calculable knowledge.

Why does this balloon stay round? The law of large numbers

Thankfully, we live in a world full of history. But ironically, too much history can be a bad thing. More properly put, too specific a history about a component within a system can be a bad thing. This was made apparent by Ludwig Boltzmann in his studies of gasses and their inherent properties. While it is not only impractical but impossible to measure the exact mass and velocity of each and every constituent particle at each and every instant, it is still possible to determine their overall behavior. (He was making the proposition based on the assumed existence of as-yet-unproven molecules and atoms.) As an example, if we have a box filled with air on one side and no air (a vacuum) on the other, we can be certain that if we lift the divider between the two halves, the particles of air will spread or ‘dissipate’ into the other side of the box. Eventually, the gas in the now expanded box will have diffused to every corner. At this point any changes will be random. There is no ‘direction’ in which the particles have to go. This is the realization of equilibrium. As we pointed out earlier, this is simply entropy reaching its ultimate goal within the limits of the system. Now let us take this box and make it a balloon. If we blow into it, the balloon will inflate and there will be equal distribution of whatever is used to fill it. Note that now the balloon is a ‘system’. After it cools to a uniform state the system will reach equilibrium. But the balloon still stays inflated. Regardless of the fact that there is no notable heat movement within the balloon, it remains inflated by the heat contained within the equilibrium. After all, we did not say that there was no heat. We just said that there was no heat movement, or more precisely that it has slowed drastically. In actuality, it was realized that it is the movement of the molecules and this residual energy (i.e. the balloon at room temperature) that causes the pressure that keeps the balloon inflated.*

* Interesting experiment… blow up a balloon and then place it in the freezer for a short while.

Boltzmann, as a result of this realization, was able to manipulate the temperature of a gas to control its pressure in a fixed container, and vice versa. This showed that the increase in heat actually caused more movement within the constituent particles of the gas. He found that while it was futile to try to calculate what occurs to a single particle, it was possible to represent the behavior of the whole mass of particles in the system by the use of what we now call statistical analysis. An example is shown in figure 1. What it illustrates is that as the gas heats up, the familiar bell curve flattens and widens the probability that a given particle will be at a given speed and energy level.

Figure 1. Flattening Bell curves to temperature coefficients
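
As a minimal sketch of the idea behind Figure 1 (assuming NumPy is available; the molecular mass and the temperatures are illustrative values, not taken from Boltzmann's own work), the Maxwell-Boltzmann speed distribution can be evaluated at a few temperatures to show the curve flattening and widening as the gas heats up:

# Evaluate the Maxwell-Boltzmann speed distribution at several temperatures;
# a lower peak density corresponds to a flatter, wider curve.
import numpy as np

K_B = 1.380649e-23       # Boltzmann constant, J/K
MASS = 4.65e-26          # approximate mass of a nitrogen molecule, kg (illustrative)

def maxwell_boltzmann(speed, temperature):
    # Probability density of molecular speed at a given temperature
    a = MASS / (2.0 * K_B * temperature)
    return 4.0 * np.pi * (a / np.pi) ** 1.5 * speed ** 2 * np.exp(-a * speed ** 2)

speeds = np.linspace(0.0, 2000.0, 2001)   # m/s
for temp in (100, 300, 1000):             # kelvin
    density = maxwell_boltzmann(speeds, temp)
    print(f"T = {temp:>4} K: peak density {density.max():.2e} (lower peak = flatter, wider curve)")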

This was a grand insight, and it has enabled a whole new branch of knowledge which, for better or worse, has helped shape our modern world. Note I am not gushing over the virtues of statistics, but when properly used it does have strong merits and it has enabled us to see things to which we would otherwise be blind. And after all, this is what knowledge is all about, right? But wait, I have more to say about statistics. It’s not all good. As it turns out, even if used properly, it can have blind spots.

Those pesky Black Swans…

There is a neat book written on the subject by a gentleman by the name of Nassim Nicholas Taleb*. In it he artfully speaks to the improbable but possible: those events that occur every once in a while to which statistical analysis is often blind. These events are termed ‘Black Swans’. He goes on to show that these events are somewhat invisible to normal statistical analysis in that they are improbable events on the ‘outside’ of the bell curve (termed ‘outliers’). He also goes on to indicate what he thinks is the cause. We tend to get myopic on the trends and almost convince ourselves of their dependability. We also do not like to think of ourselves as wrong or somehow flawed in our assumptions. He points out that in today’s world of information there is almost too much of it, and that you can find stats or facts just about anywhere to fit and justify your belief in that dependability. He is totally correct. Statistics is vulnerable to this. Yet I need to correct that just a bit. It’s not statistics that is at fault. The fault lies with those using it as a tool.

* Nassim Nicholas Taleb, The Black Swan – Random House

Further, Taleb provides some insight into things that might serve as flags or ‘telltales’ for Black Swans. As an example, he notes that prior to all drastic market declines the markets exhibited a spiky, intermittent behavior that, while still within the Gaussian norm, carried an associated ‘noise’ factor. Note that parallel phenomena exist within electronics, communications and, yes you guessed it, the weather! This ‘noise’ tends to indicate ‘instability’, where the system is about to change in a major topological fashion to another phase. These are handy things to know. Note how they deal with the overall ‘pattern’ of behavior, not the statistical mean or even the median.

Why is this at all important?

At this point you might be asking yourself: where am I going with all of this? Well, it’s all about Big Data! As we pointed out, all knowledge is historical, even if gained by ahistorical (law) insight. Properly understanding a given system means that one needs to understand not only the statistical trends, but also higher level patterns of behavior that might foretell outliers and black swans. All of this requires huge amounts of data, of potentially wide varieties as well. Think of a simple example of modeling for a highway expansion. You go through the standard calculation and then decide that you also want to take into consideration the local seasonal weather patterns. The computation and data storage requirements have just increased exponentially. This is what the challenge of Big Data is all about. It is not intended for handling the ‘simple’ questions. It is intent on pushing out the bounds of what is deemed tractable or calculable in the sense of knowledge. It’s not that the mathematics did not exist in the past. It’s just that now the capability is within ‘everyday’ computational reach. Next let’s consider the use cases for Big Data and perhaps touch on a few actual implementations that you could actually run in your data center.

 

II. Big Data – What’s it good for? Absolutely everything! Well, almost…

If you will recall, we spoke about dissipative systems. As it turns out, almost everything is dissipative in nature. The weather, the economy, the stock market, international political dynamics, our bodies, one could even say our own minds. Clearly, there is something to consider in all of that. The way humans behave is a particularly quirky thing. They (we) are also, as a result, the primary drive and input into many of the other systems such as economics, politics, the stock market and yes, even the weather. Further understanding in these areas could prove, and actually has proven, to be profound.
These are important things to know and we will talk a little later about these lofty goals. But in reality Big Data can have far more modest goals and interests. A good real world example is retail sales. It gets back to the age-old adage… “Know your customer.” But in today’s cyber-commerce environment that’s often easier said than done. Fortunately, there are companies working in this area. One of the real founders of this space is Google. Google is an information company at its heart. When one thinks about the sheer mass of information that it possesses, it is simply boggling. Yet Google needs to leverage and somehow make sense of that data. At the same time, it has practical limits on computational power and its associated costs. Out of these competing and contradictory requirements came the realization of a parallel compute infrastructure that leverages off-the-shelf commodity systems. It was initially introduced to the public in a series of white papers on the Google File System, or GFS, and ‘sister’ papers such as MapReduce, which provides for key/value mappings, and Bigtable, which represents structured data within the environment. This technology has since been embraced by the open source community and lives on in the Apache Hadoop project and its Hadoop Distributed File System, or HDFS. The figure below shows the evolution of these efforts into the open source community.

Figure 2. Hadoop outgrowth and evolution into the open source space

The benefits of these developments are important as they provide the springboard for the use of big data and data analytics in the typical Enterprise IT environment. Since this inception an entire market sector has sprung up, with major vendors such as EMC and IBM but also startups such as Cloudera and MapR. This article will not go into the details of these different vendor architectures, but suffice it to say that each has its spin and secret sauce that differentiates its approach. You can feel free to look into these different vendors and research others. For the purposes of this article we are concerned more with the architectural principles of Hadoop and what they mean to a Data Center environment. In data analytics a lot of data has to be read very fast. The longer the read time, the longer the overall analytics process. HDFS leverages parallel processing at a very low level to provide a highly optimized read time environment.

Figure 3. A comparison of sequential and parallel reads

In the above we show the same 1 terabyte data file being read by a conventional serial read process versus a Hadoop HDFS cluster, which improves the read time by roughly a factor of ten. Note that the same system type is being used in both instances, but in the HDFS scenario there are just a lot more of them. Importantly, the actual analytic programming runs in parallel as well. Note also that this is just an example. The typical HDFS block size is 64 or 128 MB. This means that relatively large amounts of data can be processed extremely fast with a somewhat modest infrastructure investment. As an additional note, HDFS also provides for redundancy and resiliency of data by replicating the distributed data blocks within the cluster.
The main point is that HDFS leverages a distributed data footprint rather than a singular SAN environment. Very often HDFS farms are composed entirely of direct attached storage systems that are tightly coupled via the data center network.
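
As a back-of-the-envelope sketch of why this matters (the disk throughput and cluster sizes below are illustrative assumptions, not benchmark figures), the read time for the same file shrinks roughly in proportion to the number of datanodes reading their local blocks in parallel:

# Illustrative arithmetic only: the same 1 TB file read serially versus spread
# across an HDFS-style cluster where each datanode reads its local blocks at once.
FILE_SIZE_GB = 1024          # 1 TB file
DISK_READ_MB_S = 100         # assumed sustained read rate of a single disk
BLOCK_SIZE_MB = 128          # typical HDFS block size

serial_seconds = FILE_SIZE_GB * 1024 / DISK_READ_MB_S

for datanodes in (1, 10, 40):
    # Blocks are spread across the cluster, so each node reads ~1/N of the file
    parallel_seconds = serial_seconds / datanodes
    print(f"{datanodes:>3} datanodes: ~{parallel_seconds / 60:.1f} minutes "
          f"({FILE_SIZE_GB * 1024 // BLOCK_SIZE_MB} blocks in total)")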

How the cute little yellow elephant operates…

Hadoop is a strange name, and a cute little yellow elephant as its icon is even more puzzling. As it turns out, one of the key developers’ young sons had a yellow stuffed elephant that he had named Hadoop. The father decided it would make a neat internal project name. The name stuck and the rest is history. True story, strange as it may seem.
Hadoop is not a peer to peer distribution framework. It is hierarchical, with certain master and slave roles within its architecture. The components of HDFS are fairly straightforward and are shown in simplified form in the diagram below.

Figure 4. Hadoop HDFS System Components

The overall HDFS cluster is managed by an entity known as the namenode. You can think of it as the library card index for the file system. More properly, it generates and manages the meta-data for the HDFS cluster. As a file gets broken into blocks and placed into HDFS, it’s the namenode that indicates where they go, and the namenode that tracks and replicates them if required. The meta-data always provides a consistent map of where specific data resides within the distributed file system. This is used not only for writing into or extracting out of the cluster, but also for data analytics, which requires a reading of the data for its execution. It is important to note that in first generation Hadoop the namenode was a single point of failure. The secondary namenode in generation 1 Hadoop is actually a housekeeping process that extracts the namenode run-time metadata and copies it to disk in what is known as a namenode ‘checkpoint’. Recent versions of Hadoop now offer redundancy for the namenode. Cloudera, for instance, provides high availability for the namenode service.
There is a second master node known as the jobtracker. This service tracks the various jobs required to maintain and run over the HDFS environment. Both of these are master role nodes; as noted earlier, Hadoop is not a peer to peer clustering technology, it is hierarchical.
In the slave role are the datanodes. These are the nodes that actually hold the data that resides within the HDFS cluster. In other words, the blocks of data that are mapped by the namenode reside on these systems’ disks. Most often datanodes use direct attached storage and only leverage SAN to a very limited extent. The tasktracker is a process that runs on the datanodes and is managed by, and reports back to, the jobtracker for the various executions that occur within the Hadoop HDFS cluster.
And lastly, one of these nodes, referred to as the ‘edge node’, will have an ‘external’ interface that exposes the HDFS environment so that PCs running the Hadoop HDFS client can be provided access.

Figure 5. HDFS Data Distribution & Replication

HDFS is actually fairly efficient in that it incorporates replication into the write process. As shown above, when a file is ingested into the cluster it is broken up into a series of blocks. The namenode utilizes a distribution algorithm to map where the actual data blocks will reside within the cluster. An HDFS cluster has a default replication factor of three. This means that each individual block will be replicated three times and then placed algorithmically. The namenode in turn develops a meta-data map of all resident blocks within the distributed file system. This meta-data is in turn a key requirement for the read function, which is a requirement for analytics.
If a datanode were to fail within the cluster, HDFS will ‘respawn’ the lost data to meet the distribution and replication requirements. All of this means east/west traffic, but it also means consistent distribution and replication, which is critical for parallel processing.
HDFS is also rack aware. By this we mean that the namenode can be programmed to recognize that certain datanodes share a rack, and that this should be taken into consideration during the block distribution or replication process. This awareness is not automatic. It must be configured, typically through a topology script (a shell or Python script, for example). Once it is done, it allows the placement algorithm to put the first data block on a certain rack and then place the two replicated blocks together on a separate rack. As shown in the figure below, data blocks A and B are distributed evenly across the cluster racks.

Figure 6. HDFS ‘Rack Awareness’
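
For illustration, a rack-topology script might look like the minimal sketch below. The subnets and rack names are hypothetical; Hadoop invokes the script configured under the net.topology.script.file.name property with one or more datanode addresses as arguments and expects one rack path printed per argument.

#!/usr/bin/env python
# A minimal sketch of a Hadoop rack-topology script (hypothetical subnets and
# rack names). Prints one rack path per address passed on the command line.
import sys

RACK_BY_SUBNET = {
    "10.1.1.": "/dc1/rack1",
    "10.1.2.": "/dc1/rack2",
}

for node in sys.argv[1:]:
    rack = "/default-rack"
    for subnet, rack_path in RACK_BY_SUBNET.items():
        if node.startswith(subnet):
            rack = rack_path
            break
    print(rack)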

Note that while the default replication factor for HDFS is three, it can be increased or decreased at the directory or even the file level. As the replication factor is adjusted for a certain data set, the namenode ensures that data is replicated, spawned or deleted according to that adjusted value.
HDFS uses pipelined writes to move data blocks into the cluster. In figure 7, an HDFS client executes a write for file.txt. As an example, the user might use the copyFromLocal command. The request is sent to the namenode, which responds with a series of metadata telling the client where to write the data blocks. Datanode 1 is the first in the pipeline, so it receives the request and sends a ready request to nodes 7 and 9. Nodes 7 and 9 respond, and the write process begins by placing the data block on datanode 1, from which it is pipelined to datanodes 7 and 9. The write process is not complete until all datanodes respond with a write success. Note that most data center topologies utilize a spine and leaf design, meaning that most of the rack to rack data distribution must flow up and through the data center core nodes. In Avaya’s view this is highly inefficient and can lead to significant bottlenecks that will limit the parallelization capabilities of Hadoop.

Figure 7. HDFS pipelined writes

Additionally, recent recommendations are to move to 40 Gbps interfaces for this purpose. These interfaces most certainly are NOT cheap. With the leaf and spine approach this means rack to rack growth requires a large capex outlay at each expansion. Suddenly, the idea of Big Data and Data Science for the common man starts to become a myth! The network costs become the key investment as the cluster grows, and with big data, clusters always grow. We at Avaya have been focusing on this east/west capacity issue within the data center top of rack environment.
Reads within the HDFS environment happen in a similar fashion. When the Hadoop client requests to read a given file, the namenode will respond with the appropriate meta-data so that the client can in turn request the separate data blocks from the HDFS cluster. It is important to note that the meta-data for a given block is an ordered list. In the diagram below the namenode responds with meta-data for data block A as being on datanodes 1, 7 and 9. The client will request the block from the first datanode in the list. Only after a failed response will it attempt to read from the other datanodes.

Figure 8. HDFS ordered reads

Another important note is that the read requests for data blocks B and C occur in parallel. It is only after all data blocks have been confirmed and acknowledged that a read request is deemed complete. Finally, similar to the write process, any rack to rack east/west flows need to travel over the core switch in a typical spine and leaf architecture. It is important to note, however, that most analytic processes will not utilize this type of methodology for ‘reading’. Instead, ‘jobs’ are sent in and partitioned into the environment, where the read and compute processes occur on the local datanodes and are then reduced into an output from the system as a whole. This provides the true ‘magic’ of Hadoop, but it requires a relatively large east/west (rack to rack) capacity, and that capacity only grows as the cluster grows.
We at Avaya have anticipated this change of data center traffic patterns. As such we have taken a much more straightforward approach. We call it Distributed Top of Rack or “D-ToR”. ToR switches are directly interconnected using very high bandwidth backplane connections. These 80G+ connections provide ultra-low latency, direct connections to other ToRs to address the expected growth. The ToRs are also connected to the upstream core which can allow for the use of L3 and IP VPN services to ensure security and privacy.

Figure 9. Distributed Top of Rack benefits for HDFS

Note that the D-ToR approach is much better suited for high capacity east/west data flows rack to rack within the data center. Growth of the cluster no longer depends on continual investment in the leaf and spine topology; new racks are simply extended into the existing fabric mesh. Going further, by using front port capacity, direct east/west interconnects between remote data centers can be created. We refer to this as Remote Rack to Rack. One of the unseen advantages of D-ToR is the reduction of north/south traffic. Where many architects were looking at upgrading to 40G or even 100G uplinks, Avaya’s approach negates this requirement by allowing native L2 east/west server traffic to stay at the rack level. The ports required for this are already in the ToR switches. This provides relief to these strained connections. It also allows for seamless expansion of the cluster without the need for continual capital investment in high speed interfaces.
Another key advantage of D-ToR is the flexibility it provides:
• Server to server connections, in rack, across rows, building to building or even site to site! The architecture is far superior to other approaches in supporting advanced clustering technologies such as Hadoop HDFS.
• Traffic stays where it needs to be, reserving the north/south links for end user traffic or for advanced L3 services. Only traffic that classifies as such need traverse the north/south paths.
• The end result is a vast reduction in the traffic on those pipes as well as a significant performance increase for east/west data flows, at far lesser cost.

Figure 10. Distributed Top of Rack modes of operation

Avaya’s Distributed Top of Rack can operate in two different ways:
• Stack-Mode can dual connect up to eight D-ToR switches. The interconnect is 640 Gbps without losing any front ports! Additionally, dual D-ToR switches can be used to scale up to 16, giving a maximum east/west profile of 10 Tb/s.
• Fabric-Mode creates a “one hop” mesh which can scale up to hundreds of D-ToR switches! The port count tops out at more than ten thousand 10 Gig ports and a maximum east/west capacity of hundreds of terabits.

Figure 11. A Geo-distributed Top of Rack environment

Avaya’s D-ToR solution can scale in either mode. Whether the needs are small, large or unknown, D-ToR and Fabric Connect provide unmatched scale, flexibility and, perhaps most importantly, the capability to solve the challenges, even the unknown ones, that most of us face. As the HDFS farm grows, the seamless expansion capability of Avaya’s D-ToR environment can accommodate it without major architectural design changes.
Another key benefit is that Avaya has solved the complex PCI and HIPAA compliance issues without having to physically segment networks or add layers and layers of firewalls. The same can be said for any sensitive data environment that might be using Hadoop, such as patient medical records, banking and financial information, smart power grid or private personal data. Avaya’s Stealth networking technology (referred to in the previous “Dark Horse” article) can keep such networks invisible and self-enclosed. As a result, any attack or scanning surface against the data analytics network is removed. The reason for this is that Fabric Connect as a technology is not dependent upon IP as a protocol to establish an end to end service path. This removes one of the primary scaffolds for all espionage and attack methods. As a result the Fabric Connect environment is ‘dark’ to the IP protocol. IP scanning and other topological scanning techniques will yield little or no information.

Using MapReduce to extract meaningful data

Now that we have the data effectively stored and retrievable we will obviously want to exercise certain queries against the data and hopefully receive meaningful answers. MapReduce is the original methodology documented in the Google white papers. Note that it is also a utility within HDFS and is used to chunk and create meta-data for the stored information within the HDFS environment. Data can also be analyzed with MapReduce to extract meaningful secondary data such as hit counts & trends which can serve as the historical foundation for predictive analytics.

Figure 12. A MapReduce job

Figure 12 shows a MapReduce job being sent into the HDFS environment. The HDFS cluster runs the MapReduce program against the data set and provides a response back to the client. Recall that HDFS leverages parallel read/write paths. MapReduce builds on this foundation. As a result, east/west capacity and latency are important considerations in the overall solution.
• Avaya’s D-ToR solution provides easy and consistent scaling of the rack to rack environment as the Hadoop farm grows.

The components of MapReduce are relatively simple.

First there is the Map function, which provides the meta-data context within the cluster: an independent record transformation that produces a representation of the actual data, including deletions and replications within the system. For analytics, the function is performed against key/value (K,V) pairs. The best way to describe it is to give an example. Let’s take a word, and say we want to see how often it appears in a document or a given set of documents. Let’s say that we are looking for the word ‘cow’. This becomes the ‘key’. Every time the Map function ‘reads’ the word cow it ticks a ‘value’ of 1. As the function proceeds through the read job, the various ticks are appended into a list of key/value pairs such as (cow, 31), meaning there are 31 instances of the word ‘cow’ in the document or set of documents. For this type of job the Reduce function is the method that aggregates the results from the Map phase and provides a list of key/value pairs that are to be construed as the answer to the query.
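
As a minimal sketch of the idea (written in the style of a Hadoop Streaming job; the target words are hypothetical, and a production job would typically use the Java MapReduce API or a higher level tool), a word count map and reduce pair might look like this:

# With Hadoop Streaming the mapper and reducer would normally be two separate
# scripts fed via stdin/stdout by the framework; they are combined here for brevity.
import sys
from itertools import groupby

TARGET_WORDS = {"cow", "barn", "field"}   # hypothetical keys we are counting

def mapper(lines):
    """Emit (word, 1) for every occurrence of a target word."""
    for line in lines:
        for word in line.lower().split():
            if word in TARGET_WORDS:
                yield word, 1

def reducer(pairs):
    """Aggregate the sorted (word, 1) pairs into (word, total) results."""
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    # Locally this reads text from stdin; under Hadoop the framework handles
    # the split, shuffle and sort between the two phases.
    for key, value in reducer(mapper(sys.stdin)):
        print(f"{key},{value}")
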
Finally, there is the framework function, which is responsible for the scheduling and re-running of tasks. It also provides utility functions such as splitting the input, which becomes more apparent in the figure below; this refers back to the chunking functionality that we spoke of earlier as data is written into HDFS. Typically, these queries are constructed into a larger framework. The figure shows a simple example of a query framework.

Figure 13. A simple MapReduce word count histogram

Above we see a simple word count histogram, which is the exact process we talked about previously. The upper arrow shows data flow across the MapReduce process chain. As data is ingested into the HDFS cluster it is chunked into blocks, as previously covered. The Map function makes its read against the individual blocks of data. For purposes of optimization there are copy, sort and merge functions that provide the ability to aggregate the resulting lists of key/value pairs. This is referred to as the shuffle phase, and it is accomplished by leveraging east/west capacity within the HDFS cluster. From this, the Reduce function reduces the received key/value outputs to a single statement (e.g. cow,31).
In the example above we show a construct to count three words: cow, barn and field. The details for two of the key/value queries are shown; the third is simply an extension of what is shown. From this we can infer that among these records cow appears with field more often than with barn. This is obviously a very simple example with no real practical purpose unless you are analyzing dairy farmer diaries. But it illustrates the potential of the clustering approach in facilitating data farms that are well suited to the process of analytics, which leans very heavily on read performance.
In another more practical example, let’s say that we want to implement an analytics function for customer relationship management. We would want to know things like how often key words such as ‘refund’ or ‘dissatisfied’, or even terms like ‘crap’ and ‘garbage’, come up in emails, letters or even transcripts of voice calls. Such information is obviously valuable and can yield insight into customer satisfaction levels.
As one might guess, things could very quickly get unwieldy dealing with large numbers of atomic key/value queries. YARN, which stands for ‘Yet Another Resource Negotiator’, allows for the building of complex tasks that are represented and managed by application masters. The application master starts and recycles tasks and also requests resources from the YARN resource manager. As a result, a cycling, self-managing job can be run. Weave is an additional developing overlay that provides more extensive job management functions.

Figure 14. Using Hadoop and Mahout to analyze for credit fraud

The figure above illustrates a practical functional use of the technology. Here we are monitoring incoming credit card transactions for flagging to analysts. Transaction data will be mapped into flagged key/value pairs; indeed there may be dozens of key/value pairs that are part of this initial operation. This provides consistent input into the rest of the flow. LDA scoring, based on Latent Dirichlet Allocation, allows for a comparative function against the normative set. It can also provide a predictive role. This step provides a scoring function on the generated key/value pairs. At this point LDA provides a percentile of anomaly for a transaction. From there, further logic can then impact a given merchant score.
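
As an illustrative stand-in for that scoring step (the pipeline described above would use Mahout on Hadoop; here scikit-learn's LatentDirichletAllocation is applied to toy transaction feature counts purely to show the idea of scoring new records against a normative model, and the threshold is arbitrary):

# Fit a topic model on 'normal' historical transactions, then score a new
# batch; unusually low likelihoods suggest an anomalous mix of features.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)

# Rows = historical transactions, columns = counts of categorical features
# (merchant category, country bucket, amount bucket, ...). Toy data.
normal_history = rng.poisson(lam=3.0, size=(500, 12))

lda = LatentDirichletAllocation(n_components=4, random_state=0)
lda.fit(normal_history)

new_batch = rng.poisson(lam=3.0, size=(5, 12))
new_batch[0] = rng.poisson(lam=15.0, size=12)   # one deliberately odd transaction

scores = np.array([lda.score(row.reshape(1, -1)) for row in new_batch])
threshold = np.percentile(scores, 20)           # illustrative cut-off
for i, s in enumerate(scores):
    print(f"transaction {i}: log-likelihood {s:.1f} {'FLAG' if s <= threshold else 'ok'}")
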
All of this is based on yet another higher level construct known as Mahout. Mahout provides for an orchestration and API library set that can execute a wide variety of operations, such as LDA.
Examples are Matrix Factorization, K-Means and Fuzzy K-Means, Logistic Regression, Naïve Bayes and Random Forest, all of which in essence are packaged algorithmic functions that can be performed against the resident data for analytical and/or predictive purposes. Further, these can be cycled, as in the example above, which would operate on each fresh batch presented to it.
Below is a quick definition of each set of functions for reference:

Matrix Factorization –
As its name implies, this function involves factorizing matrices, which is to say finding two or more matrices that, when multiplied, will yield the original matrix (the factors, as a result, are lower-dimensional representations of the original). This can be used to discover latent features between entities. Factoring into more than two matrices requires the use of tensor mathematics, which is more complicated. A good example of use is in matching movie popularity and ratings, such as is done by Netflix. Film recommendations can be made fairly accurately based on identifying these latent features. A subscriber’s ratings, their interests in genres and the ratings of those with similar interests can yield an accurate set of recommended films that the subscriber is likely to enjoy.
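
A minimal sketch of the idea (toy ratings and plain NumPy SVD rather than the alternating-least-squares style solvers a Mahout deployment would typically use):

# Factor a tiny subscriber-by-film ratings matrix into two low-rank matrices;
# their product fills in predicted scores for unrated cells.
import numpy as np

ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)          # rows = subscribers, columns = films, 0 = not rated

u, s, vt = np.linalg.svd(ratings, full_matrices=False)
k = 2                                   # number of latent features to keep
user_factors = u[:, :k] * s[:k]         # subscribers described by latent features
item_factors = vt[:k, :]                # films described by the same features

predicted = user_factors @ item_factors
print(np.round(predicted, 1))           # unrated cells now carry predicted scores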

K-Means –
K-Means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. K-Means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster. This results in a partitioning of the data space into what are termed Voronoi cells. These cells are based on common attributes or features that have been identified. Uses for this include learning common aspects or attributes of a given population so that it can be subdivided or partitioned into various sub-populations. From there, things like logistic regression can be run on the sub-populations.
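
A minimal sketch (toy two-dimensional data; a Mahout deployment would run the equivalent job across HDFS rather than in memory like this):

# Partition a synthetic population with two obvious groupings into k=2 clusters.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

population = np.vstack([
    rng.normal(loc=(0, 0), scale=0.5, size=(100, 2)),
    rng.normal(loc=(5, 5), scale=0.5, size=(100, 2)),
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(population)
print("cluster centers:", np.round(kmeans.cluster_centers_, 2))
print("first ten assignments:", kmeans.labels_[:10])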

Fuzzy K-Means –
K-Means clustering is what is termed ‘hard clustering’. In hard clustering, data is divided into distinct clusters, where each data element belongs to exactly one cluster and only one. In fuzzy clustering, also referred to as soft clustering, data elements can belong to more than one cluster, and associated with each element is a set of membership levels. These indicate the strength of the association between that data element and a particular cluster. Fuzzy clustering is a process of assigning these membership levels, and then using them to assign data elements to one or more clusters. A particular data element can then be rated as to its strongest memberships within the partitions that the algorithm develops.

Logistic Regression –
In statistics, logistic regression (sometimes called logit regression) is a type of probabilistic statistical classification model. It measures the relationship between a categorical dependent variable and one or more independent variables, which are usually (but not necessarily) continuous, by using probability scores as the predicted values of the dependent variable. Logistic regression is hence used to analyze probabilistic relationships between different variables within a particular set of data.
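A minimal sketch with scikit-learn, scoring the probability of fraud from a single invented feature (transaction amount); the data and labels are purely illustrative.

import numpy as np
from sklearn.linear_model import LogisticRegression

amounts = np.array([[12], [25], [40], [60], [300], [450], [800], [1200]])
is_fraud = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression().fit(amounts, is_fraud)
print(clf.predict_proba([[90], [700]])[:, 1])   # probability score of fraud for new amounts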

Naïve Bayes –
In machine learning environments, naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes’ theorem with strong (naive) assumptions of independence between the features; in other words, each feature is assumed to contribute to the outcome independently of the others. Naive Bayes is a popular (baseline) method for categorizing text, the problem of judging documents as belonging to one category or another (such as spam or legitimate, classified terms, etc.), with word frequency as a large part of the features considered. This is very similar to the usage and context information provided by Latent Dirichlet Allocation.
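A minimal text-classification sketch with scikit-learn, using word counts as the features, as described above; the documents and labels are invented.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

docs = ["win a free prize now", "free money claim now",
        "meeting agenda for tuesday", "quarterly report attached"]
labels = ["spam", "spam", "legit", "legit"]

vec = CountVectorizer()
clf = MultinomialNB().fit(vec.fit_transform(docs), labels)

print(clf.predict(vec.transform(["claim your free prize"])))   # -> ['spam']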

Random Forest –
Random Forests are another method for learning and classification over large sets of data, from which further regression techniques can be used. Random Forests are in essence constructs of decision trees that are induced in a process known as training. Data is then run through the forest and various decisions are made to learn and classify the data. When building out large forests, the trees can be grouped into subsets; weights can then be given to each subset, and from that further decisions can be made.
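A minimal ensemble sketch with scikit-learn; the two features and the labels are invented, and the feature importances hint at the weighting idea mentioned above.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

X = np.array([[20, 1], [25, 2], [30, 1], [400, 9], [500, 8], [450, 7]])
y = np.array([0, 0, 0, 1, 1, 1])   # 0 = normal, 1 = suspicious

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(forest.predict([[35, 2], [480, 9]]))     # class decisions from the forest
print(forest.feature_importances_)             # relative weight of each feature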

The end result of all of these methods is a very powerful environment that is capable of machine learning style analysis. The best part is that it is accomplished with off the shelf technologies. No supercomputer required; just a solid distributed storage/compute framework and superior east/west traffic capacity in the top of rack environment. Big Data and Analytics can open our eyes to relationships between phenomena that we would otherwise be blind to. It can even provide us insight into causal relationships. But here we need to tread a careful course. Just because two features are related in some way does not necessarily mean that one causes the other.

A word of caution –

While all of this is extremely powerful, the last comments above should raise a flag for you. Even with lots of data and all of these fancy mathematical tools at your disposal, you can still make some very bad decisions if your assumptions about the meaning of the data are somehow flawed. In other words, good data plus good math with bad assumptions will still yield bad decisions. We also need to remember Mr. Taleb and his black swans. Just because a system has behaved in the past within a certain pattern or range does not mean that it will continue to do so ad infinitum. Examples of these types of systems range from stock exchanges to planetary orbits to our very own bodies! In essence, most systems exhibit this behavior. Does that mean that all of the powerful tools referred to above are rendered invalid and impotent? Absolutely not. But we must remember that knowledge without context is somewhat useless, and knowledge with incorrect context is worse than ignorance. Why? Because we are confident about what it tells us. We like sophisticated mathematical tools that tell us, in oracle-like fashion, what the secrets of knowledge are within a given system. We have confidence in their findings because of their accuracy. But no amount of accuracy will make an incorrect assumption correct. This is where trying to prove ourselves wrong about our assumptions is very important.

One might wonder why there are so many methods that sometimes appear to do the same thing but from a different mathematical perspective. The reason is that these various methods are often run in parallel to yield comparative data sets with multiple replicated studies. By generating large populations of comparative sets, another level or hierarchy of trends and relationships becomes visible. Consistency of the sets will generally (but not always) indicate sound assumptions about the original data. Wild variations between sets, in turn, will usually indicate that something is flawed and needs to be revisited. Note that we are now talking about analyzing the analytical results. But this is not always done. Why? Because many times we don’t want to prove our own assumptions wrong. We want them to be right… no let’s go further – we need them to be right.
A good example is the recent market crash of 2007-2009. Many folks don’t know it, but there is a little equation that actually holds a portion of the blame. Well, not really. As it turns out, equations are a lot like guns: they are only dangerous when someone dangerous is using them. The equation in question is the Black-Scholes equation. Some have called it one of the most beautiful equations in mathematics; it is a very elegant piece. Others knew it by another name, the Midas equation. It made folks a ton of money! That is, until…
The Black-Scholes equation was an attempt to bring rationality to the futures market. This sounds good, but it is based on the concept of creating a systematic method for establishing a value for options before they mature. This also might not be a bad thing if your assumptions about the market are correct. But if there are things that you don’t know (and there always are), then those blind spots could in reality affect your assumptions in an adverse way. As an example, if you are trading on the futures of a given commodity and something happens in the market to affect demand that you did not consider, or whose impact you weighed incorrectly, then guess what… That’s right, you lose money!
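For reference, the closed-form Black-Scholes price of a European call can be sketched in a few lines of Python; the inputs below are illustrative. The elegance is real, but so are the assumptions baked into it (constant volatility, log-normal returns), which is exactly the point being made here.

from math import log, sqrt, exp, erf

def norm_cdf(x):
    # standard normal cumulative distribution function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes_call(S, K, T, r, sigma):
    """S: spot, K: strike, T: years to maturity, r: risk-free rate, sigma: volatility."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

print(round(black_scholes_call(S=100, K=105, T=0.5, r=0.02, sigma=0.25), 2))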
In the last market crash that commodity was real estate. As one looks into the detailed history of the crash, we can see multiple flawed assumptions that built upon one another. Then, to compound the problem, the market began to create obscurity through the use of blocks or bundles of mortgages that offered absolutely no window into the risk factors associated with those assets. While the banks were buying blind, they were of the mind that foreclosures would be a minority and that a foreclosed home could always be sold for the loan value or perhaps greater. To the banks it seemed that they couldn’t lose. We all know what happened. Even though the mathematics was elegant and accurate, the conclusions and the advice that was given as a result were drastically flawed and cost the market billions. The lesson: Big Data can lead us astray. It reminds us of the flawed premise of Laplace’s rather arrogant comment back in 1814. There is always something we don’t know about a given system, such as a scope of history that we do not know, or levels of detail that are unknown to us or perhaps even beyond our measurement. This does not disable data analytics, but it puts a limit on its tractability in dealing with real world systems. In the end Big Data does not replace good judgment, but it can complement it.

So how do I build it and how do I use it?

Hadoop is actually fairly easy to install and set up. The major vendors in this space have gone much further in making it easy and manageable as a system. But there are a few general principles that should be followed. First, be sure to size your Hadoop cluster and maintain that sizing ratio as the cluster grows. The basic formula is 4 x D, where D is the data footprint to be analyzed. Now one might say ‘what?’ I have to multiply my actual storage requirements by a factor of four!? But do not forget about the MapReduce flow. The shuffle phase requires datanodes that will act as transitory nodes for the job flow. This extra space needs to be available. So while it might be tempting to shave this number down, it’s best not to. Below are a few other design recommendations to consider.

Figure 15

Figure 15. Hadoop design recommendations
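As a quick back-of-the-envelope check on the 4 x D rule above, a sketch like the following can be used. The 24 TB usable-per-node figure is an assumption for illustration only; substitute your own hardware profile.

def hdfs_raw_capacity_tb(data_footprint_tb, factor=4):
    """Raw cluster capacity needed for a given analyzed data footprint (the 4 x D rule)."""
    return data_footprint_tb * factor

def datanodes_needed(data_footprint_tb, usable_tb_per_node=24, factor=4):
    raw = hdfs_raw_capacity_tb(data_footprint_tb, factor)
    return -(-raw // usable_tb_per_node)   # ceiling division

print(hdfs_raw_capacity_tb(100))   # 100 TB of data -> 400 TB of raw capacity
print(datanodes_needed(100))       # node count at 24 usable TB per datanode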

Another issue to consider is the sizing of the individual datanodes within the HDFS cluster. This is actually a soft set of recommendations that greatly depends on the type of analytics. In other words, are you looking to gauge customer satisfaction, or to model climate change or the stock market? These are obviously separated by many degrees of complexity. So it is wise to think about your end goals with the technology. Below is a rough sizing chart that provides some higher level guidance.

Figure 16

Figure 16. Hadoop HDFS Sizing Guidelines

Beyond this, it is wise to refer to the specific vendors design guidelines and requirements, particularly in the areas of high availability for master node services.
Another question that might be asked is “How do I begin?” In other words, you have installed the cluster and are ready for business but, what to do next? Actually this is very specific to usage and expectations. But we can at least boil it down to a general cycle of data ingestion, analytics and corresponding actions. This is really very similar to well-known systems management theory. A diagram of such a cycle is shown below.

Figure 17

Figure 17. The typical data analytics cycle

Aside from the workflow detail, it cannot be stressed enough: “Know your data”. If you do not know it, then make sure that you are working very closely with someone who does. The reason for this is simple. If you do not understand the overall meaning of the data sets that you are analyzing, then you are unlikely to be able to identify the initial key values that you need or should be focusing on. Very often data analytics is done on a team basis, with individuals from various backgrounds within the organization, and the data analytics staff works in concert with this disparate group to identify the key questions that need to be asked as well as the key data values that will help lead towards the construction of an answer to the query. Remember that comparative sets allow for the validation of both the assumptions made in the data model and the techniques being used to extract and analyze the data sets in question. While it is tempting to jump to conclusions on initial findings, it is always wise to do further studies to validate those findings, particularly if a key strategic decision will result from the analysis.
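As a minimal sketch of the comparative-sets idea, the same clustering can be repeated on perturbed copies of the data and the agreement between runs measured. Consistently high agreement suggests the partitioning, and the assumptions behind it, is stable; wild variation is a flag. This assumes scikit-learn and invented data.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])

labels = [KMeans(n_clusters=2, n_init=10, random_state=s).fit_predict(
              X + rng.normal(0, 0.3, X.shape))        # run on a perturbed copy
          for s in range(5)]

scores = [adjusted_rand_score(labels[0], run) for run in labels[1:]]
print(np.round(scores, 2))   # values near 1.0 indicate consistent partitions across runs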

In summary

We have looked at the history of analytics from its founding fathers to its current state. Throughout, many things have remained consistent. This is comforting. Math is math. Four plus four gave the same answer in Galileo’s time as it does today. But we must remember that math is not the real world; it is merely our symbolic representation of it. This was shown by the various discoveries on the aspects of randomness, chaos and infinitudes. We went on to show that the proper manipulation of large sets of data, placed against a historical context, can yield insights that might not otherwise be apparent. Recent trends are to establish methods to visualize the data and the resulting analytics by the use of graphic displays. Companies such as Tableau provide the ability to generate detailed charts and graphs that give a visual view of the results of the various analytic functions noted above. Now a long table or spreadsheet of numbers becomes a visible object that can be manipulated and conjectured against. Patterns and trends can much more easily be picked out and isolated for further analysis. These and other trends are accelerating in the industry and becoming more and more available to the common user or enterprise.
We also talked about the high east/west traffic profiles that are required to support Hadoop distributed data farms and the work that Avaya is doing to facilitate this in the Data Center top of rack environment. We talked about the relatively high costs of leaf-spine architectures and Avaya’s approach to the top of rack environment as the data farm expands. Lastly, we spoke to the need for security in data analytics, particularly in the analysis of credit card or patient record data. Avaya’s Stealth Networking Services can effectively provide a cloak of invisibility over the analytics environment. This creates a Stealth Analytics environment from which the analysis of sensitive data can occur with minimal risk.
We also looked at some of the nuts and bolts of analytics and how, once data is teased out, it may be analyzed. We spoke to various methods and procedures, which are often worked in concert to yield comparative data sets. These comparative data sets can then be used to check the assumptions made about the data and hence the analytic results; they help us measure the validity of the analytics that have been run, or more importantly the assumptions we have made. In this vein we wrapped up with a word of warning as to the use of big data and data analytics. It is not a panacea, nor is it a crystal ball, but it can provide us with vast insights into the meaning of the data that we have at our fingertips. With these insights, if the foundational assumptions are sound, we can make decisions that are better informed. It can also enable us to process and leverage the ever-growing data that we have at our disposal at the pace required for it to be of any value at all! Yet, in all of this, we are only at the beginning of the trail. As computing power increases and our algorithmic knowledge of systems grows, the technology of data science will reap larger and larger rewards. But it is likely never to provide the foundation for Laplace’s dream.

 

‘Dark Horse’ Networking – Private Networks for the control of Data

September 14, 2013

Next Generation Virtualization Demands for Critical Infrastructure and Public Services

 

Introduction

In recent decades communication technologies have seen significant advancement. These technologies now touch almost every part of our lives, sometimes in ways that we do not even realize. As this evolution continues, many systems that were previously treated as discrete are now networked. Examples of these systems are power grids, metro transit systems, water authorities and many other public services.

While this evolution has brought a very large benefit to both those managing and those using the services, there is the rising spectre of security concerns and a precedent of documented attacks on these systems. This has brought about strong concerns about this convergence and what it portends for the future. This paper will begin by discussing these infrastructure environments, which, while varied, have surprisingly common theories of operation and actually use the same set or class of protocols. Next we will take a look at the security issues and some of the reasons why they exist. We will provide some insight into some of the attacks that have occurred and what impacts they have had. Then we will discuss the traditional methods for mitigation.

Another class of public services is focused more on the consumer space but can also be used to provide services to ‘critical’ devices. This mix and match of ‘cloud’ within these areas is causing a rise in concern among security and risk analysts. The problem is that the trend is well under way. It is probably best to start by examining the challenges of a typical metro transit service. Obviously the primary need is to control the trains and subways. These systems need to be isolated or at the very least very secure. The transit authority also needs to provide for fare services, employee communications and of course public internet access for passengers. We will discuss these different needs and the protocols involved in providing for these services. Interestingly, we will see some paradigms of reasoning as we do this review, and these will in turn reveal many of the underlying causes of vulnerability. We will also see that as these different requirements converge onto common infrastructures, conflicts arise that are often resolved by completely separate network infrastructures. This leads to increasing cost and complexity as well as increasing risk of the two systems being linked at some point in a way that would be difficult to determine. It is here where the backdoor of vulnerability can occur. Finally, we will look at new and innovative ways to address these challenges and how they can take our infrastructure security to a new level without abandoning the advancement that remote communications has offered. The fact is, sometimes you do NOT want certain systems and/or protocols to ‘see’ one another. Or at the very least there is the need to have very firm control over where and how they can see one another and inter-communicate. So, this is a big subject and it straddles many different facets. Strap yourself in; it will be an interesting ride!

Supervisory Control and Data Acquisition (SCADA)

Most process automation systems are based on a closed loop control theory. A simple example of a closed loop control theory is a gadget I rigged up as a youth. It consisted of a relay that would open when someone opened the door to my room. The drop in voltage would trigger another relay to close causing a mechanical lever to push a button on a camera. As a result I would get a snapshot of anyone coming into my room. It worked fairly well once I worked out the kinks (they were all on the mechanical side by the way). With multiple siblings it came in handy. This is a very simple example of a closed loop control system. The system is actuated by the action of the door (data acquisition) and the end result is the taking of a photograph (control). While this system is arguably very primitive it still demonstrates the concept well and we will see that the paradigm does not really change much as we move from 1970’s adolescent bedroom security to modern metro transit systems.

In the automation and control arena there is a series of defined protocols of both standards-based and vendor-proprietary origin. These protocols are collectively referred to as SCADA, which is short for Supervisory Control and Data Acquisition. Examples on the vendor side are Modbus, BACnet and LonWorks; formal standard examples are IEC 61131 and IEC 60870-5-101 (IEC 101). Using the established simple example of a closed loop control, we will take the concept further by looking at a water storage and distribution system. The figure below shows a simple schematic of such a system. It demonstrates the concepts of SCADA effectively. We will then use that basis to extend it further to other uses.

Figure 1

Figure 1. A simple SCADA system for water storage and distribution

The figure above illustrates a closed loop system. Actually, it is comprised of two closed loops that exchange state information between them. The central element of the system is the water tank (T). Its level is measured by sensor L1 (which could be as simple as a mechanical float attached to a potentiometer). As long as the level of the tank is within a certain range it will keep the LEVEL trace ON. This trace is provided to a device called a Programmable Logic Controller (PLC) or Remote Terminal Unit (RTU); in the diagram it is provided to PLC2. As a result, PLC2 sends a signal to a valve servo (V1) to keep it in the OPEN state. If the level were to fall below a defined value in the tank, then the PLC would turn the valve off. There may be additional ‘blow off’ valves that the PLC might invoke if the level of the tank grew too high, but this would be a precautionary emergency action. In normal working conditions this would be handled by the other closed loop. In that loop there is a flow meter (F1) that provides feedback to PLC1. As long as PLC1 is receiving a positive flow signal from the sensor, it will keep the pump (P1) running and hence feeding water into the system. If the rate on F1 falls below a certain value, then it is determined that the tank is nearing full and PLC1 will tell the pump to shut down. As an additional precaution there may be an alternate feed from sensor L1 that will flag the pump to shut down if the tank level reaches full. This is known as a second loop failsafe. As a result, we have a closed loop, self-monitoring system that in theory should run on its own without any human intervention. Such systems do. But they are usually monitored by Human Machine Interfaces (HMI). In many instances these will literally be the schematic of the system with a series of colors (for example yellow for off, orange and red for warning and alarm, green for running). In this way, an operator has visibility into the ‘state’ of the working system. HMIs can also offer human control of the system. As an example, an operator might shut off the pump and override the valve closed to drain the system for maintenance. So in that example the closed loop would be extended to include a human who could provide an ‘ad hoc’ input to the system.
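The control logic described above can be sketched in a few lines of Python. The thresholds, flow rates and tick count here are invented purely to illustrate the two loops and the second-loop failsafe.

LOW, HIGH = 20.0, 90.0          # tank level thresholds (percent), illustrative

def plc2_valve(level):
    # PLC2: keep the outlet valve open while the level is adequate
    return "OPEN" if level > LOW else "CLOSED"

def plc1_pump(flow, level):
    # PLC1: run the pump on a positive flow signal; failsafe on a full tank (sensor L1)
    if level >= HIGH:
        return "OFF"
    return "ON" if flow > 0.5 else "OFF"

level, flow = 50.0, 2.0
for tick in range(10):          # a few control-loop iterations
    pump, valve = plc1_pump(flow, level), plc2_valve(level)
    level += (2.0 if pump == "ON" else 0.0) - (1.5 if valve == "OPEN" else 0.0)
    flow = 2.0 * (HIGH - level) / HIGH          # inflow rate tapers as the tank fills
    print(f"t={tick} level={level:5.1f} pump={pump} valve={valve}")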

The utility of these protocols is obvious. They control everything from water supplies to electrical power grid components. They are networked, and need to be, due to the very large geographic areas that they are often required to cover. This is as opposed to my bedroom security system (it was never really intended for security – it was just a kick to get photos of folks who were unaware), which was a ‘discrete’ system. In such a system, the elements are hardwired and physically isolated. It is hard to get into such a room to circumvent the system; one would literally have to climb in through the window. This offers a good analogy of what SCADA-like systems are experiencing. But one also has to realize that discrete systems are very limited. As an example, it would be a big stretch to use a discrete system to manage a municipal water supply. One would argue that it would be so costly as to make no sense. So SCADA systems are a part of our lives. They can bring great benefit, but there is still the spectre of security vulnerability.

Security issues with SCADA

Given that SCADA systems are used to control facilities such as oil, power and public transportation, it is important to ensure that they are robust and have connectivity to the right control systems and staff. In other words, they must be networked. Many implementations of SCADA are L2 only, using Ethernet for transport as an example. More recently, there are TCP/IP extensions to SCADA that allow for true Internet connectivity. One would think that this is where the initial concerns for security would lie, but actually they are just a further addition to the system’s vulnerabilities. There are a number of reasons for this.

First, there was a general lack of concern for security, as many of these environments were at one time fairly discrete. As an example, a PLC is usually used in local control scenarios. A Remote Terminal Unit does just what it says: it creates a remote PLC that can be controlled over the network. While this extension of geography has obvious benefits, along with it creeps in the window of unauthorized access.

Second, there was and still is the general belief that SCADA systems are obscure and not well known. Its protocol constructs are not widely published particularly in the proprietary versions. But as is well known, ‘security by obscurity’ is a partial security concept at best and many true security specialists would say it is a flawed premise.

Third, initially these systems had no connectivity to the Internet. But this is changing. Worse yet, it does not have to be the system itself that is exposed. All an attacker needs is access to a system that can access the system. This brings about a much larger problem.

Finally, as these networks are physically secure it was assumed that some form of cyber-security was realized, but as the above reason points out this is a flawed and dangerous assumption.

Given that SCADA systems control some of our most sensitive and critical systems, it should be no surprise that there have been several attacks. One example is a SCADA control for sewer flow where a disgruntled ex-employee gained access to the system and reversed certain control rules. The end result was a series of sewage flooding events into local residential and park areas. Initially it was thought to be a system malfunction, but eventually the hacker’s access was found out and the culprit was nabbed. This can even reach international scales. As critical systems such as power grids become networked, the security concern can grow to the level of national security interests.

While these issues are not new, they are now well known. Security by Obscurity is no longer a viable option. Systems isolation is the only real answer to the problem.

 

The Bonjour Protocol

On the other side of the spectrum we have a service that is often required at public locations and that is the antithesis of the prior discussion. This is a protocol that WANTS services visibility. This protocol is known as Bonjour. Created by Apple™, it is an open system protocol that allows for services resolution. Again it is best to give a case point example. Let’s say that you are a student at a university and you want to print a document from your iPad. You simply hit the print icon and the Bonjour service will send an SRV query for @PRINTER to the Bonjour multicast address of 224.0.0.251. The receiver of the multicast group address is the Bonjour DNS resolution service, which will reply to the request with a series of local printer resources for the student to use. To go further, if the student were to look for an off-site resource such as a software upgrade or application, the Bonjour service would respond and provide a URL to an Apple download site. The diagram shows a simple Bonjour service exchange.

Figure 2

Figure 2. A Bonjour Service Exchange

Bonjour also has a way for services to ‘register’ to Bonjour as well. A good example, as shown above, is the case of iMusic. As can be seen, the player system can register to the local Bonjour service as @Musicforme. Now when a user wishes to listen, they simply query the Bonjour service for @Musicforme and the system will respond with the URL of the player system. This paradigm has obvious merits in the consumer space. But we need to realize that consumer space is rapidly spilling over into the IT environment. This is the trend that we typically hear of as ‘Bring Your Own Device’ or BYOD. The university example is easy to see, but many corporations and public service agencies are dealing with the same pressures. Additionally, some true IT level systems are now implementing the Bonjour protocol as an effective way to advertise services and/or locate and use them. As an example, some video surveillance cameras will use the Bonjour service to perform software upgrades or for discovery. Take note that Bonjour really has no conventions for security other than the published SRV. All of this has the security world in a maelstrom. In essence, we have disparate protocols, evolving out of completely different environments for totally different purposes, coming to nest in a shared environment that can be of a very critical nature. This has the makings of a Dan Brown novel!
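For the curious, browsing for local printers the way a Bonjour client does can be sketched with the third-party Python ‘zeroconf’ package (an assumption; any mDNS/DNS-SD library would do). Note that real Bonjour/mDNS queries use registered service types such as _ipp._tcp rather than the @PRINTER shorthand used above.

import time
from zeroconf import Zeroconf, ServiceBrowser

class PrinterListener:
    def add_service(self, zc, service_type, name):
        info = zc.get_service_info(service_type, name)
        if info:
            print(f"found printer: {name} at {info.parsed_addresses()[0]}:{info.port}")

    def remove_service(self, zc, service_type, name):
        print(f"printer gone: {name}")

    def update_service(self, zc, service_type, name):
        pass  # required by newer zeroconf versions

zc = Zeroconf()
browser = ServiceBrowser(zc, "_ipp._tcp.local.", PrinterListener())
time.sleep(5)       # queries go out to the mDNS group 224.0.0.251, port 5353
zc.close()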

 

 

Meanwhile, back at the train station…

Let’s now return to our Transit Authority who runs as a part of its services high speed commuter rail service. As a part of this service they offer Business Services such as Internet Access and local business office services such as printing and scanning. They also have a SCADA system to monitor and control the railways. In addition they obviously have a video surveillance system and you guessed it, those cameras use the Bonjour service for software upgrade & discovery. They also have the requirement to run Bonjour for the Business Services as well.

In legacy approaches the organization would need to either implement totally separate networks or a multi-services architecture via the use of Multi-Protocol Label Switching, or MPLS. This is an incredibly complex suite of protocols with very well known CapEx and OpEx requirements, and they are high. Running an MPLS network is probably the most challenging IT financial endeavor that an organization can take on. The figure below illustrates the complexity of the MPLS suite. Note that it also shows a comparison to Shortest Path Bridging (IEEE 802.1aq and RFC 6329) as well as the IETF drafts to extend L3 services across the Shortest Path Bridging fabric.

Figure 3

Figure 3. A comparison between MPLS and SPB

There are two major points to note. First, there is a dramatic consolidation of dependent overlay control planes into a single integrated one provided by IS-IS. Second, as a result of this consolidation, the mutual dependence of the service layers is broken into mutually independent service constructs. An underlying benefit is that services are also extremely simple to construct and provision. Another benefit is that these service constructs are correspondingly simpler from an elemental perspective. Rather than requiring a complex and coordinated set of service overlays, SPB/IS-IS provides a single integrated service construct element known as the I-Component Service ID, or I-SID.

In previous articles we have discussed how an I-SID is used to emulate end to end L2 service domains as well as true L3 IP VPN environments. Additionally, we covered how I-SID’s can be used dynamically to provide solicited demand services for IP multicast. In this article, we will be focusing on their inherent traits of services separation and control as well as how these traits can be used to enhance a given security practice.

For this particular project we developed the concept of three different network types. Each network type is used to provide for certain protocol instances that require services separation and control. They are listed as follows:

1). Layer 3 Virtual Service Networks

These IP VPN services are used to create a general services network access for general offices and internet access.

2). Local User Subnets (within the L3 VSN)

These are local L2 broadcast domains that provide for normal internet ‘guest’ access for railway passengers. These networks can also support ‘localized’ Bonjour services for the passengers but the service is limited to the station scope and not allowed to be advertised or resolved outside of that local subnet boundary.

3). Layer 2 Virtual Service Networks

These L2 domains are used at a more global level. Due to SPB’s capability to extend L2 service domains across large geographies without the need to support end to end flooding, L2 VSN’s become very useful to support extended L2 protocol environments. Here we are using dedicated L2 VSN’s to support both SCADA and Bonjour protocols. Each protocol will enjoy a private non-IP routed L2 environment that can be placed anywhere within the end to end SPB domain. As such, they can provide global L2 service separated domains simply by not assigning IP addresses to the VLAN’s. IP can still run over the environment as Bonjour requires it, but that IP network will not be visible or reachable within the IS-IS link state database (LSDB) via VRF0.

Figure 4

Figure 4. Different Virtual Service Networks to provide for separation and control.

The figure above illustrates the use of these networks in a symbolic fashion. As can be seen, there are two different L3 VSNs. The blue L3 VSN is used for internal transit authority employees and services. The red L3 VSN is used for railway passenger internet access. Note that there are two things of significance here. First, this is a one way network for these users. They are given a default network gateway to the Internet and that is it. There is no connectivity from this L3 VSN to any other network or system in the environment. Second, each local subnet also allows for local Bonjour services so that users can use their different personal device services without concern that they will go beyond the local station or interfere with any other service at that station.

There are then two L2 VSNs that are used to provide inter-station connectivity for the transit authority’s use. The green L2 VSN is used to provide for the SCADA protocol environment while the yellow L2 VSN provides for the Bonjour protocol. Note that unlike the other Bonjour L2 service domains for the passengers, this L2 domain can be distributed not only within the stations but between the stations as well. As a result, we have five different types of service domains, each one separated, scoped and controlled over a single network infrastructure. Note that in the case of a passenger at a station who is bringing up their Bonjour client, they will only see other local resources, not any of the video surveillance cameras that also use Bonjour but do so in a totally separate L2 service domain that has absolutely no connectivity to any other network or service. Note also that the station clerk has a totally separate network service environment that gives them confidential access to email, UC and other internal applications that tie back into the central data center resources. In contrast, the passengers at the station are provided Internet access only for general browsing or VPN usage. There is no viable vector for any would-be attacker in this network.

Now the transit authority enjoys the ability to deploy these service environments at will, anywhere they are required. Additionally, if requirements for new service domains come up (entry and exit systems for example), they can be easily created and distributed without a major upheaval of the existing networks that have been provisioned.

 

Seeing and Controlling are two different things…

Sometimes one service can step on another. High bandwidth, resource-intensive services such as multicast based video surveillance can tend to break latency sensitive services such as SCADA. In a different example project, these two applications were in direct conflict. The IP multicast environment was unstable, causing loss of camera feeds and recordings in the video surveillance application, while the SCADA based traffic light control systems experienced daily outages. In a traditional PIM protocol overlay we require multiple state machines that run in the CPU. Additionally, these state machines are full time, meaning that they need to consider each IP packet separately and forward accordingly. For multicast packets there is an additional state machine requirement where there may be various modes of behavior based on whether it is a source or a receiver and whether or not the tree is currently established or extended. These state machines are complex and they must operate for every multicast group being serviced.

Figure 5

Figure 5. Legacy PIM overlay

Each PIM router needs to perform this hop by hop computation, and this needs to be done by the various state machines in a coordinated fashion. In most applications this is acceptable. As an example, for IP television delivery there is a relatively high probability that someone is watching the channels being multicast (if not, they are usually promptly identified and removed. Ratings will determine the most viewed groups). In this model, if there is a change to the group membership, it is minor and at the edge. Minor being the fact that one single IP set top box has changed the channel. The point here is that this is a minor topological change to the PIM tree and might not even impact it at all. Also, the number of sources is relatively small to the community of viewers. (200-500 channels to thousands if not tens of thousands of subscribers)

The problem with video surveillance is that this model reverses many of these assumptions and this causes havoc with PIM. First, the ratio of sources to receivers is reversed.  Also, the degree of the ratio changes as well.  As an example, in a typical surveillance project of 600 cameras there could be instances as high as 1,200 sources with transient spikes that will go higher during state transitions. Additionally, video surveillance applications typically have the phenomenon of ‘sweeps’, where a given receiver that is currently viewing a large group of cameras (16 to 64) will suddenly change and request another set of groups.

At these points the amount of required state change in PIM can be significant. Further, there may be multiple instances of this occurring at the same time in the PIM domain. These instances could be humans at viewing consoles or they could be DVR type resources that automatically sweep through sets of cameras feeds on a cyclic basis. So as we can see, this can be a very heavy lift application for PIM and tests have validated this. SPB offers a far superior method for delivering IP multicast.

Now let us consider the second application, the use of SCADA to control traffic lights. Often referred to as Intelligent Traffic Systems or ITS. Like all closed loop applications, there is a fail safe instance which is the familiar red and yellow flashing lights that we see occasionally during instances of storms and other impediments to the system. This is to assure that the traffic light will never fail in a state of permanent green or permanent red. As soon as communication times out, the failsafe loop is engaged and maintained until communications is restored.

During normal working hours the traffic light is obviously controlled by some sort of algorithm. In certain high volume intersections this algorithm may be very complex and based on the hour of the day. In most other instances the algorithm is rather dynamic and based on demand. This is accomplished by placing a sensing loop at the intersection. (Older systems were weight based while newer systems are optical.) As a vehicle pulls up to the intersection its presence is registered and a ‘wait set’ period is engaged. This presumably allows enough time for passing traffic to move through the intersection. In instances of rural intersections this wait set period will be ‘fair’; each direction will have equal wait sets. In urban situations where minor roads intersect with major routes, the wait set period will be in strong favor of the major route, with a relatively large wait set period for the minor road. The point in all of this is that these loops are expected to be fairly low latency and there is not expected to be a lot of loss in the transmission channel. Consequently, SCADA tends towards very small packets that expect a very fast round trip with minimal or no loss. You can see where I am going here. The two applications do not play well together. They require separation and control.

Figure 6

Figure 6. Separation of IP multicast and SCADA traffic by the use of I-SIDs

As was covered in a previous article (circa June 2012) and also shown in the illustration above, SPB uses dynamically built I-SIDs with a value greater than 16M to establish IP multicast distribution trees. Each multicast group uses a discrete and individual I-SID to create a deterministic reverse path forwarding environment. Note also that the SCADA is delivered via a discrete L2 VSN that is not enabled for IP multicast, or any IP configuration for that matter. As a result, the SCADA elements are totally separated from any IP multicast or unicast activity. There is no way for any traffic from the global IP route or IP VPN environment to get forwarded into the SCADA L2 VSN. There is simply no IP forwarding path available. The figure above illustrates a logical view of the two services.

The end result of the conversion changed the environment drastically. Since then they have not lost a single camera or had any issues with SCADA control. This is a direct testament to the forwarding plane separation that occurs with SPB. As such, both applications can be supported with no issues or concerns that one will ‘step on’ the other. It also enhances security for the SCADA control system. As there is no IP configuration on the L2 VSN (note that IP could still ‘run’ within the L2 VSN – as is possible, for example, with the SCADA HMI control consoles), there is no viable path for spoofing or launching a DoS attack.

What about IP extensions for SCADA?

As was mentioned earlier in the article, there are methods to provide TCP/IP extensions for SCADA. Due to the critical nature of the system, however, this is seldom used because of the costs of securing the IP network from threat and risk. As with any normal IP network, protecting them to the required degree is difficult and costly, particularly since the intention of the protocol overlay is to provide for things like mobile and remote access to the system. Doing this with traditional legacy IP networking would be a big task.

With SPB, L3 VSNs can be used to establish a separated IP forwarding environment that can then be directed to appropriate secure ‘touch points’ at a predefined point in the topology of the network. Typically, this will be a Data Center or a secured DMZ adjunct to it. There, all remote access is facilitated through a well defined series of security devices: firewalls, IPS/IDSs and VPN service points. As this is the only valid ingress into the L3 Virtual Service environment, it is hence much easier and less costly to monitor and mitigate any threats to the system, with clear forensics in the aftermath. The illustration below shows this concept. The message is that while SPB is not a security technology in and of itself, it is clearly a very strong complement to those technologies. If used properly it can provide the first three of the ‘series of gates’ in the layered defense approach. The diagram below shows how this operates.

Figure 7

Figure 7. SPB and the ‘series of gates’ security concept

In a very early article on this blog I talked to the issues and paradigms of trust and assurance. (See Aspects and characteristics of Trust and its impact on Human Societal Dynamics and E-Commerce – June 2009.)
There I introduced the concept of composite identities and the fact that all identities in cyber-space are as such. This basic concept is rather obvious when it speaks to elemental constructs of device/user combinations, but it gets smeared when the concept extends to applications or services. Or it can extend further to elements such as location or the systems that a user is logged into. These are all elements of a composite instance of a user, and they are contained within a space/time context. As an example, I may allow user ‘A’ to access application ‘A’ from location ‘A’ with device ‘A’. But any other location, device or even time combination may warrant a totally different authentication and consequent access approach. This composite approach is very powerful, particularly when combined with the rather strong path control capabilities of SPB. This combination yields an ability to determine network placement based on user behavior patterns: those expected and within profile, but more importantly those that are unusual and outside the normal user’s profile. These instances require additional challenges and consequent authentications.

As noted in the figure above, the series of gates concept merges well within this construct. The first gate provides identification of a particular user/device combination. From this elemental composite, network access is provided according to a policy. From there the user is limited to the particular paths that provide access to a normal profile. As a user goes to invoke a certain secure application, the network responds with an additional challenge. This may be an additional password or perhaps a certain secure token and biometric signature to reassure identity for the added degree of trust. This is all normal. But in the normal environment the access is provided at the systems level thereby increasing the ‘smear’ of the user’s identity. A critical difference in the approach I am referring to is that the whole network placement profile of the user changes. In other words, in the previous network profile the system that provides the said application is not even available by any viable network path. It is by the renewal of challenge and additional tiers of authentication that such connectivity is granted. Note how I do not say access but connectivity. Certainly systems access controls would remain but by and large they would be the last and final gate. At the user edge, whole logical topology changes occur that place the user into a dark horse IP VPN environment where secure access to the application can be obtained.
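A purely hypothetical sketch of that idea in Python: the composite of user, device, location and time decides which network placement the session lands in, and whether a step-up challenge is demanded first. All names, placements and policy values below are invented for illustration only.

from datetime import time

def place_session(user, device, location, now):
    # invented policy: a known composite during working hours lands in the employee VSN
    composite = (user, device, location)
    if composite == ("clerk01", "corp-laptop", "station-12") and time(6) <= now <= time(22):
        return {"placement": "employee-L3-VSN", "challenge": None}
    if user == "clerk01":
        # same user, but off-profile device, place or hour: demand step-up authentication
        return {"placement": "quarantine-VSN", "challenge": "token+biometric"}
    return {"placement": "guest-internet-VSN", "challenge": None}

print(place_session("clerk01", "corp-laptop", "station-12", time(9, 30)))
print(place_session("clerk01", "own-tablet", "station-40", time(2, 15)))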

Wow! The noise is gone

In this whole model something significant occurs. Users are now in communities of interest where only certain traffic pattern profiles are expected. As a result, zero day alerts from anomaly based IPS/IDS systems become something other than white noise. They become very discrete resources with an expected monitoring profile, and any anomalies outside of that profile will flag as a true alert that should be investigated. This enables zero day threat systems to work far more optimally, as their theory of operation is to look for patterns outside of the expected behaviors that are normally seen in the network. SPB complements this by keeping communities strictly separate when required. With a smaller isolated community it is far easier to use such systems accurately. The diagram below illustrates the value of this virtualized Security Perimeter. Note how any end point is logically on the ‘outer’ network connectivity side. Even though I-SIDs traverse a common network footprint they are ‘ships in the night’ in that they never see one another or have the opportunity to inter-communicate except by formal, monitored means.

Figure 8

Figure 8. An established ‘virtual’ Security Perimeter

Firewalls are also notoriously complex when they are used for community separation or multi-tenant applications. The reason for this is that all of the separation is dependent on the security policy database (SPD) and how well it covers all given applications and port calls. If a new application is introduced and it needs to be isolated, the SPD must be modified to reflect it. If this gets missed or the settings are not correct, the application is not isolated and no longer secure. Again, SPB and dark horse networking help in controlling users’ paths and keeping communities separate. Now the firewall is white-listed, with a deny-all policy for everything else. As new applications get installed, unless they are added to the white list they will be isolated by default within the community in which they reside. There is far less manipulation of the individual SPDs and far less risk of an attack surface developing in the security perimeter due to a missed policy statement.

 

Time to move…

There is another set of traits that are very attractive about SPB, and particularly what we have done with it at Avaya in our Fabric Connect. It is something termed mutability. In the last article on E-911 evolution we talked to this a little bit. Here I would like to go into it in a little more detail. IP VPN services are nothing new. MPLS has been providing such services for years. Unlike MPLS, however, SPB is very dynamic in the way it handles new services or changes to existing services. Where the typical MPLS infrastructure might require hours or even days for the provisioning process, SPB can accomplish the same service in a matter of minutes or even seconds. This is not taking into account that MPLS also requires the manual provisioning of alternate paths. With SPB, not only are the service instances intelligently extended across the network by the shortest path, they are also provided all redundancy and resilience by virtue of the SPB fabric. If alternate routes are available they will be used automatically during times of failure. They do not have to be manually provisioned ahead of time. The fabric has the intelligence to reroute by the shortest path automatically. At Avaya, we have tested our fabric to a reliable convergence of 100ms or under, with the majority of instances falling into the 50ms level. As such, mutability becomes a trait that Avaya alone can truly claim. But in order to establish what that is, let’s realize that there are two forms.

1). Services mutability

This was covered to some degree in the previous article, but to review the salient points: it really boils down to the fact that a given L3 VSN can be extended anywhere in the SPB network in minutes. The principles pointed out in the previous article illustrate that membership to a given dark horse network can be rather dynamic and can not only be extended but retracted as required. This is something that comes as part and parcel with Avaya’s Fabric Connect. While MPLS based solutions may provide equivalent type services, none are as nimble, quick or accurate in prompt services deployment as Avaya’s Fabric Connect based on IEEE 802.1aq Shortest Path Bridging.

2). Nodal mutability

This is something very interesting and if you ever have the chance for hands-on experience, please try it. It is very, very profound. Recall from previous articles that each node holds a resident ‘link state database’ generated by IS-IS that reflects its knowledge of the fabric from its own relative perspective. This knowledge not only scopes topology but resident provisioned services as well as those of other nodes. This creates a situation of nodal mutability. Nodal mutability means that a technician out at the far edge of the network can accidentally swap the two (or more) uplink ports and the node will still join the network successfully. Alternatively, if a node were already up and running and for some reason port adjacencies needed to change, it could be accommodated very easily with only a small configuration change. (Try it in a lab. It is very cool!) Going further on this logic, the illustration below shows that a given provisioned node could be unplugged from the network and then driven hundreds of kilometers to another location.

Figure 9

Figure 9. Nodal and Services Mutability

At that location, they could plug the node back into the SPB network and the fabric will automatically register the node and all of its provisioned services. If all of these services are dark horse, then there will be authentication challenges into the various networks that the node provides as users access services. This means in essence that dark horse networks can be extremely dynamic. They can be mobile as well. This is useful in many applications where mobility is desired but the need to re-provision is frowned upon or simply impossible. Use cases such as emergency response, military operations or mobile broadcasting are just a few areas where this technology would be useful. But there are many others and the number will increase as time moves forward. There is no corresponding MPLS service that can provide for both nodal and services mutability. SPB is the only technology that allows for it via IS-IS, and Avaya’s Fabric Connect is the only solution that can provide this not only for L2 but for L3 services as well as for IP VPN and multicast.

Some other use cases…

Other areas where dark horse networks are useful are in networks that require full privacy for PCI or HIPAA compliance. L3 Virtual Service Networks are perfect for these types of applications or solution requirements. Figure 8 could easily be an illustration of a PCI compliant environment in which all subsystems are within a totally closed L3 VSN IP VPN environment. The only ingress and egress are through well defined virtual security perimeters that allow for the full monitoring of all allowed traffic. This combination yields an environment that, when properly designed, will easily pass PCI compliance scanning and analysis. In addition, these networks are not only private – they are invisible to external would-be attackers. The attack surface is reduced to the virtual security perimeter only. As such, it is practically non-existent.

In summary

While private IP VPN environments have been around for years, they are typically clumsy and difficult to provision. This is particularly true for environments where quick dynamic changes are required. As an example, a typical MPLS IP VPN provisioning instance will require approximately 200 to 250 command lines depending on the vendor and the topology. Interestingly, much of this CLI activity is not in provisioning MPLS but in provisioning other supporting protocols such as IGPs and BGP. Also, consider that all of this is for just the initial service path. Any redundant service paths must then be manually configured. Compare this with Avaya’s Fabric Connect, which can provide the same service type with as little as a dozen commands. Additionally, there is no requirement to engineer and provision redundant service paths as they are already provided by SPB’s intelligent fabric.

As a result, IP VPN’s can be provisioned in minutes and be very dynamically moved or extended according to requirements. Again, the last article on the evolution of E-911 speaks to how an IP VPN morphs over the duration of a given emergency with different agencies and individuals coming into and out of the IP VPN environment on a fairly dynamic basis based on their identity, role and group associations.

Furthermore, SPB nodes are themselves mutable. Once again, IS-IS provides for this feature. An SPB node can unplug from the network and move to the opposite end of the topology which can be 100’s or even 1000’s of kilometers away. There they can plug back in and IS-IS will communicate the nodal topology information as well as all provisioned services on the node. The SPB network will in turn extend those services out to the node thereby giving complete portability to that node as well as its resident services.

In addition, SPB can provide separation for non IP data environments as well. Protocols such as SCADA can enjoy an isolated non IP environment by the use of L2 VSN’s and further they can be isolated so that there is simply no viable path into the environment for would be hackers.

This combination of privacy and fast mutability of both services and topology lends itself to what I term a Dark Horse Network. They are dark, so that they cannot be seen or attacked, due to the lack of surface for such an endeavor. They are swift in the way they can morph by services extensions, and they are extremely mobile, providing the ability for nodes to make wholesale changes to the topology and still be able to connect to relevant provisioned services without any need to re-configure. Any other IP VPN technology would be very hard pressed to make such claims, if indeed it could make them at all! Avaya’s Fabric Connect based on IEEE 802.1aq sets the foundation for the true private cloud.

Feel free to visit my new YouTube channel! Learn how to set up and enable Avaya’s Fabric Connect technology in a few short step by step videos.

http://www.youtube.com/channel/UCn8AhOZU3ZFQI-YWwUUWSJQ

Aspects and characteristics of Trust and its impact on Human Societal Dynamics and E-Commerce

June 3, 2009

 

Introduction
While recent developments in electronic commerce have fueled a surge in interest around the subject of trust, it is an aspect of human interaction that is as old as civilization itself. Going further, it could even be said that it is one of  its foundations.

We typically think of trust as something that spans between two or more humans and provides a basis for their interactions. While this is a true characterization of trust it is not an exclusive one. Recent advances in technology within the past 20 years have served to greatly change both the scope and meaning of this paradigm. One of the primary reasons for this is that the scope and capability of interaction has increased in a like manner. Interactions occur not only between humans, but human to machine as well as machine to machine. Furthermore, these interactions can be chained by way of a conditional policy basis to allow for complex communication profiles that in some instances may not involve the direct participation of a human at all.

This paper is intended to analyze the subject of trust and its close association to other subjects such as risk, assurance and identity and the impact that it has on technology and the dynamics of human interaction. We will begin by looking at trust in the basic definition and historical (as well as pre-historical) context. This will serve to set the stage for later focus into the impact on areas of technology and advanced communication capabilities that have become prevalent in our lives. It is the hope of the author that this diatribe will allow for a better understanding of the subject from both a philosophical and practical standpoint.

 

How do I trust you?

This is the classic question, and one that is hard to quantify. Indeed, the answer may be different for different people. The reason for this is that some people are simply more 'trusting' (the cynical reader might think 'gullible') than others. There is also a degree of context, closely related to the risk assumed by the trusting party, that comes into play with every decision of trust. If we think about it, the manifestations can quickly become boggling. After all, there is a big difference between trusting your neighbor's kid to cut your grass versus trusting that same kid to baby-sit your own children. There are certain pieces of additional information that you will typically need before extending your trust into the deeper context. That information will typically (if you decide to let him or her baby-sit) provide you with the level of assurance needed to extend the trust into the new scenario.

So while the possible manifestations are quite numerous and complex, we can already see that there are some common elements present in every instance. The first is that trust is always extended based on some level of assurance. The second is that this relationship between trust and assurance is dependent upon the context of the subject matter on which trust is established. Going further, this context will always have an element of risk that is assumed by the extension of trust. This results in a threefold vector relationship that is shown in figure 1. What the diagram attempts to illustrate is that the threefold vector is universal and that the subjects of trust (the context of its extension, if you will) fall in relative positions on the trust axis.

Figure 1. The vector relationship between context, assurance & trust

As the figure above illustrates, there is a somewhat linear relationship between the three vectors. It is the subject of trust that provides for the degree of non-linearity. Some subjects are rather relative. As an example, I might not be too picky about my lawn, but others might be so sensitive as to rate the level of trust as close to the equivalent of baby-sitting their kids. Some parents may be so sensitive to the issue of baby-sitting that they will require a full background check prior to the extension of trust. In other instances, things are rather absolute. A good example of this is trusting an individual with the 'Football', the top-secret attaché case that holds the instructions, authorization and access codes for the nuclear warheads involved in the defense of the United States of America. For this subject, we are assuming that the individual is a member of the Department of Defense with relatively high rank and has passed the integrity and background checks as well as the psychological stability testing required to provide the level of assurance to extend what could be perceived as the ultimate level of trust. Also consider that there is no one person who has this type of authority; it is a network of individuals that needs to act in a synchronous fashion according to well-defined procedures. This reduces the possibility of last-minute 'rogue' behavior.

There is another thing to consider as well, and this is something known as an assurance cycle. For some extensions of trust, a one-time validation of assurance is all that is required. As an example, even the pickiest of yard owners will typically validate someone's skill just once. After that, there is the assumption that the skill level is appropriate and unlikely to change. This is often the case for baby-sitting as well. Seldom will even the most selective parents do a full background check every time the same kid is brought in to do the job. It will usually take some exception, such as a poor mowing job or a bad event during baby-sitting, to compromise that degree of trust and hence require re-validation. There are some positions, however, that are extremely sensitive, have huge potential impact and are non-reversible. A good example is the extension of trust to handle the 'Football'. In this instance, there are regular security and psychological tests as well as random spot testing and background checks to assure that the integrity of the individual, and of those who support him or her, is not in any way compromised.

So from the above we can conclude that there are four major elements to trust. First is the aspect of context: what is it that the trust is about? In this there is always an element of risk that the party who extends the trust assumes. Second is the level of assurance: what will it take to enable and establish the extension of trust? Third is the element of validation: how often will I need renewed assurance to keep the extension of trust? And finally there is the element of trust itself.
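To make these four elements concrete, here is a minimal sketch in Python. It is my own illustrative toy model rather than anything proposed in the text; the numeric scales and the threshold are assumptions chosen only to show how context, risk, assurance and a validation cycle fit together.

```python
from dataclasses import dataclass

@dataclass
class TrustExtension:
    """Toy model of the four elements discussed above (illustrative only)."""
    context: str          # what the trust is about
    risk: float           # risk assumed by the trusting party, on an assumed 0..1 scale
    assurance: float      # level of assurance obtained, on an assumed 0..1 scale
    revalidate_days: int  # the validation cycle: how often assurance is rechecked

    def granted(self) -> bool:
        # The element of trust itself: extended only when assurance covers the assumed risk.
        return self.assurance >= self.risk

# Example: mowing the lawn versus carrying the 'Football'.
mowing = TrustExtension("mow my lawn", risk=0.2, assurance=0.3, revalidate_days=365)
football = TrustExtension("carry the 'Football'", risk=0.99, assurance=0.6, revalidate_days=30)
print(mowing.granted(), football.granted())   # True False
```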

There are also modes of trust, some of which are deemed to be more solid than others. These modes come in three basic types. First there is what is termed 'initial trust'. This is the trust that you need to get out of bed in the morning to face the world. It is basically the belief that the world is not outright hostile and that, while still a jungle, you can trust in your own ability to make progress in it. A good example is that in most neighborhoods you can pass someone on the sidewalk and 'trust' that the individual will not try to attack you. Note that this is a two-way equation; the other individual has to have the same perception. This is a key ingredient and provides the bootstrap for the other two, more sophisticated forms of trust. Another commonly used term for this is trusting at 'face value'. Second is something termed 'associative trust'. This is the extension of trust to someone or something based on the reference and recommendation of another individual with whom you have already established a trusting relationship. Both initial trust and associative trust could be classified as temporary states of trust that require the third and last mode, which is 'assured trust'. This is where the initial trust is validated by actual experience or some other system of assurance. This, together with associative trust, provides a degree of historical context to the paradigm and begins to develop the concept of reputation. In essence (though perhaps not always true), if you were trustworthy in the past it is likely that you will be trustworthy in the future. As an example, if my neighbor told me that a certain kid was a great lawn mower, I am more likely to extend initial trust based on this recommendation. Once the kid performs the job well and up to expectations, the mode of trust becomes 'assured'. I have seen the job that the kid does with my own eyes (note that I have assumed some degree of risk here – he could have scalped my lawn) and I am now happy with the work. The relationship with the kid is now direct, between myself and him or her. The neighbor has faded off as the relationship has matured, although the neighbor's opinion may still carry some value; for instance, if I were told of something being damaged or stolen I might experience a compromise in the degree of assured trust that has been established between myself and the rumored individual. This begins to uncover the potentially corrosive effects of gossip and hearsay in inter-personal relationships, but it also shows the capability of social systems to create feedback loops in which trust can be built up or eroded based on an individual's behavior.

One last aspect to consider is identity. This may seem out of place in a face-to-face example. Obviously, I do not need identification to be assured that the neighborhood kid is who he says he is. I can see this with my own eyes and establish it with easy conversation. However, there is something known as abstraction that becomes prevalent in more complex examples of trust. As the assumed risk gets higher along with the increase in abstraction, the need to be certain of an individual's identity becomes a requirement. As we shall see, this need is not absent in the simpler examples of trust; rather, it is implicit in them. As human interaction becomes more indirect and the ratio of worth to risk becomes higher, assurance of an individual's identity becomes explicitly paramount.

I have this goat that I would like to trade for your cow

Since it is established that trust is a major requirement for human societies, it makes sense to look at the phenomenon in the context of human societal evolution. For this, we need to look at the historical use of trust, particularly prior to the recent era of technological innovation. This will serve two purposes. First, it will provide once again a simplified view of the paradigm; in a sense, it provides a form of reductionism, because all of the newer trappings and manifestations of trust that technology requires are removed, for the simple reason that they did not exist yet. Second, it will serve to provide a view of the phenomenon of trust in the context of both social and commercial scopes. As an additional note, the following historical analysis is decidedly 'western' in its sources and perspectives. This is not to indicate that the concept of trust or any of its resulting paradigms is solely western. The focus on western culture is done for one simple reason: covering all cultural manifestations of trust and their evolutions would be exhaustive and well beyond the intended scope of this paper. Additionally, most if not all of the foundational concepts, such as credit and currency, are, aside from cultural trappings, largely the same.

If we go back to the time of hunter-gatherers, trust was something that was both limited and narrow in scope. It was limited simply because a human's contact list numbered ten or perhaps twenty individuals: the individual's tribe. This is where literally one hundred percent of social interaction took place. Additionally, these individuals were most often direct relations, so there was still a grey area between genetic, familial interactions and interactions of a true non-familial social context. The scope was narrow simply because humans did not 'do' a lot. We pretty much spent most of our time gathering roots and tubers as well as hunting.

While things were admittedly more limited back then, it could be argued that the table stakes were much higher. A single individual who made a mistake in a large animal hunt could injure or kill himself and perhaps several other prime members of the tribe. A single individual who did not know the difference between benign and poisonous plant species could endanger the whole tribe. So while the scope was both limited and narrow, the context was everything. In the high-stakes game of Neolithic hunter-gatherer societies a single error would often spell disaster for everyone. For this reason education often extended well past adolescence and into young adulthood. Accompanying this were (and still are) complex initiation processes and ceremonies, which are basically symbols of the tribe's extension of trust to the individual as a fully functioning member of the society.

Here we still have the basic three-vector relationship of figure 1. Indeed, in order to be universal to the paradigm it needs to be so chronologically as well. There is still 1) the context of the trust – I will trust you next to me with a spear; 2) the level of assurance – I worked with your father to teach you; and 3) the resultant extension of trust – let's go hunt something that is ten times our size together. While the whole paradigm is much simpler, the stakes are very high; in some ways they are ultimate, almost equivalent to the level of trust extended in the example of the 'Football'.

With the invention of agriculture the phenomenon of trust had to change and evolve. At first, this was a simple extension of the Neolithic hunter-gatherer model. If you lived in a village on the Asian steppes at the end of the Stone Age, it is likely that you were very isolated. It was unlikely that you ever saw an individual from a neighboring village, as these other villages were often hundreds of miles distant. Consequently, the scope of trust was still limited to the tribe. While still limited, however, the scope was becoming less narrow. The reason for this is the element of possession. With the advent of agriculture and animal husbandry came the concept of possession. After all, if one and one's family spent their time and energy raising crops and herds, there would undoubtedly evolve a sense of worth and a sense of ownership of that worth. With this came the concept of trading and bartering. The introduction of this simplest form of commerce occurred simply because it allowed individuals to specialize and thereby maximize the resources available to the tribe. At first this may have been communal, but as time passed and certain trades became differentiated, a sense of value for those trades became evident. We can see this from the archeological evidence of the early Bronze Age.

Trading in this context almost always happened within the tribe. External trading between tribes did not really become mainstream until the advent of the chiefdom; there are several reasons for this, as we will later see. At this earlier stage, because of the limited scope, trust was often established on a handshake basis. If an individual wanted to trade an animal for some grain or for another animal, the individual in the tribe who specialized in that trade was approached. There were often direct personal relationships between families that went back several if not dozens of generations. Trust, you might say, was embedded.

Something interesting also happened around the same time. Gradually, it came to light that there was not only a sense of worth in what an individual owned, like a goat or a cow; it began to extend to the services that one could render. Skills such as medicine, metalworking, and yes, even religion and tribal leadership (which were often synonymous at this stage) could be classified as such. With this splintering of occupations came the abstract concept of a contract. Even though the agreement was more often than not implicit and verbal, it was typically made with witnesses, was based on familial honor, and the tribal penalties for breaking good faith were often severe.

As societies embraced agriculture there resulted, in every instance, a surge in population. This created a positive feedback loop that better enabled the tribe to survive and in turn grow further. That is… at first. It is commonly assumed that resource shortages are something new to humanity. This most definitely is not the case. Many early societies quickly outstripped their surroundings of one resource or another. Often this resource was water. It is not a coincidence that the first organized chiefdoms arose in semi-arid regions that were tipping towards further arid conditions. Whether this happened through communal agreement by all members or by force through a stratification of society (it was usually a combination of both), it is undeniable that this was a trend that occurred globally at various times in pre-history. As this happened throughout the Bronze Age, there came with it an implicit extension of trust to the leader of the tribe. It was not always given willingly, but in most instances it was absolute. With this came the evolution of the 'divine' rights of chiefs and their families and the quasi-religious merging of tribal leadership and religion that is often a signature of this stage in societal development. Even so, most chiefs did not long survive breaches of trust with the populace, at least at first. As ruling classes became more powerful, rule by force became possible and indeed was many times attempted. Many things changed at this point, as we shall see. Humankind had reached a sort of critical mass.

I want my silk and I am willing to do what it takes to get it

As human society progressed these isolated communities began to reach out and establish contact with one another. The reasons for this were varied, but there is no doubt that pre-historic trade was widespread and would even traverse continental boundaries in some instances. There is one thing that is true of primates, and humans are no exception: once different societies or cultures establish contact, ignoring one another is not a long-term option. Sooner or later they will interact. Whether this interaction is peaceful or warlike is to a great degree determined by trust. Societies that trust one another tend to establish trade and share cultural traits and ideals. Societies that do not trust one another tend to avoid contact, and when they do have contact it tends to be of a violent nature. Again, we can get into boggling possible iterations that might occur for a virtually unlimited set of reasons. In some instances there may be vast ideological differences that cause the animosity. In other instances (and it should be noted that this is by far the predominant cause) it was based on something known as circumscription. This is when one society sees another in a predatory sense. Most often the reason for predation was territory or resources, both natural and human. What is important is that this trend again was self-reinforcing. As the prevalence of aggressive societies increased, there was in turn an increased need for strong leadership and military capability within societies as a whole, either to carry out acts of circumscription or to defend against them.

At this stage of societal development we see each of the great civilizations enter the empire phase. This phase, which some would argue is not a phase at all but an integral characteristic of human culture, has dominated our history. As we shall see, however, any empire that withstood the test of time realized that in order to do so one must have willing, or at the very least submissive, subjects. These subjects must see the empire as the greater good or at least the lesser of two evils. Here we see the beginnings of the concept of a social contract known as citizenship, in which there are certain benefits, privileges and rights to being a citizen. This is something that reached an ancient epitome with the Roman Empire. The wiser emperors were very astute about this concept. Some were masters of public display and of acts of imperial benevolence done in a public fashion to assure wide-reaching knowledge of the act. Such acts were cheap in relation to the revenue and value that they served to continue securing for the empire. In addition, there was the constant presence of hostile neighbors, which the emperor did not necessarily have to manufacture, to create the additional rationale for keeping distant kingdoms within the fold. After all, if the emperor placed enough legions in a locale to defend it, they often served the dual purpose of keeping it subdued as well.

Nonetheless, the Romans were keen on extending citizenship. It was once boasted of the Romanized Britons that they were 'more Roman than the Romans themselves', and they were certainly no exception. It was very common across the empire to see a sense of membership in it. Some kingdoms were more willing subjects than others, but by and large an entity as large as the Roman Empire simply could not be ruled by force alone. Again, the wiser emperors understood this and leveraged it to the hilt. There was a sense of pride and trust in being a Roman citizen, particularly if you were a free merchant who looked at trade abroad (across the Mediterranean) as desirable.

Parallel to this is the development and maturity of two other concepts. One is the independent representation of worth. This is the development of a system of currency. This was certainly not new with the Romans but they did bring it to a level of maturity and perfection that can rival the process of today’s mints. Another thing that they did was remove any local intermediary to imperial allegiance. Roman citizens were to declare direct allegiance to the emperor, not to the local king who then claimed allegiance in turn. Each citizen was to take the oath directly. In this sense, a king was no different from his subjects. This way allegiance was not to local kings who could come and go (and be deposed at will by the emperor) but to Rome itself which stood ‘forever’ and was the greater good or greater obligation depending on your perspective. In either case, it superseded any allegiance at a local level.

With such systems in place, trade was seen to prosper within the empire. Along with this surge in trade came the relative prosperity of the provinces that participated in it. Aside from the benefits, however, there was the required abstraction of worth that came along with it. Within this more sophisticated commercial environment there were many intermediaries. With additional parties and complexities came the inevitable individuals who attempted to circumvent the system of governance. In the simple Neolithic village trade, it was very difficult if not impossible to subvert the trade. The trade was face to face, based on the trust of family to family, and the transaction was solid, not abstract; it was a real-time exchange. There simply was no opportunity for infringement on the transaction. With the introduction of sophisticated money-based commerce, this was no longer the case. There was now plenty of opportunity for enterprising but less than honest individuals to make a 'little extra' on the side within the normal flurry of business transactions. As this occurred, more formal systems of governance were created to provide the additional assurance that goods and services were rendered fairly and appropriately. Again, this was not new with the Romans, but it could be said that they brought the concept of governance and law to a level of maturity that had theretofore not been attained by any civilization (perhaps with the exception of China). Indeed, today many countries still base their legal systems on the precepts of Roman law.

If we look at all of this we can begin to see a resonant balance of concepts. Some, like the legal system, are positive and reinforcing; others, like thievery and embezzlement, are negative and corrosive. Still others, such as reputation, can be either. It is the delicate balance of these negative and positive influences that creates an ecosystem of trust, with the ultimate trust ecosystem being the very existence of civilization itself.

In the 6th century the emperor Justinian had an issue with getting access to certain eastern products. Justinian tried to find new routes for the eastern trade, which was suffering badly from the wars with the Persians. One important luxury product was silk, along with the famed purple dye used to color imperial robes, which was imported and then processed in the empire. In order to protect the manufacture of these products, Justinian granted a monopoly to the imperial factories in 541 AD. In order to bypass the Persian land route, Justinian established friendly relations with the Abyssinians, whom he wanted to act as trade mediators by transporting Indian silk to the empire; the Abyssinians, however, were unable to compete with the Persian merchants in India. Then, in the early 550s, two monks succeeded in smuggling silkworm eggs from Central Asia back to Constantinople, and silk then became an indigenous Byzantine product.

What we see here is a natural progression of steps that served to provide stronger assurance to the empire that it would get the products it valued. The first steps attempted to replace unpredictable and hostile trade paths with ones that were more friendly and stable. The final steps removed intermediaries altogether, attaining the highest level of assurance: direct control of the product.

All of this was for naught, however. Despite all these measures to protect trade, the empire suffered several major setbacks in the course of the 6th century. The first was the plague, which lasted from 541 to 543 and, by decimating the empire's population, probably created a scarcity of labor and a rise in wages. The lack of manpower also led to a significant increase in the number of "barbarians" in the Byzantine armies after the early 540s. The protracted war in Italy and the wars with the Persians themselves laid a heavy burden on the empire's resources, and Justinian was criticized for curtailing the government-run post service, which he limited to only one eastern route of military importance, the silk highway. Also under Justinian I, the army, which had once numbered 645,000 men in Roman times, shrank to 150,000 men.

What this in essence shows is that even whole civilizations can collapse under the weight of history, bad circumstance and poor decisions by the ruling party. As trust in the systems of governance waned, individuals tended to seek security at more local levels. As this happened, the implosion of the culture was a certain result. The imperial contract was broken. Feudal society became the order of the day for the next one thousand years.

Adam Smith’s hidden (but shaky) Hand – the rise of the Market

It could be said that as the Roman Empire fell there was a pulling back of trust to the more local and limited scope that was prevalent prior to its existence. It would take several hundred years before economies and systems of trust and governance extended beyond the castle walls once again. With the advent of the Renaissance and the rise of the merchant class, much of the momentum that had been lost with the fall of Rome began to be regained. Gradually, and with an accelerating pace, merchant and Guild classes began to develop. Modern nationalistic attitudes began to appear, and the concept of a 'marketplace' began to evolve where trading could occur with the assurance that transactions would happen in a lawful and orderly fashion. Once again we find the threefold vector relationship of context, assurance and trust that served to set the foundations of an independent but entirely abstract entity known as the Market. At first, these early markets were largely under the control of the trading companies. Individuals or businesses could gain a stake in the lucrative potential gains (and associated risks) of 'global' trade by investing in shares of a trading company. With this revenue, the trading company would be able to pay for the ships and crews required for the expanding trade routes. The investors made their investment based on trust in the worth of the shares that they bought. At some point in the future, if the trading expedition went well, the shares would be worth some value above what was invested.

Back at home, less adventurous individuals would focus on craft trades by gaining access to one of the many Guilds that were springing up across Europe. Again there was an element of trust here; in this instance, trust in the organization. There was trust in the fact that if one joined a Guild and went through the appropriate training and apprenticeship, one was more or less assured of getting a job upon completion.

As these social constructs gained momentum they found an eventual convergence in the industrial revolution and the rise of the modern trading marketplace. During this time a new branch of science began to develop, known as economics. One of the practitioners of this discipline, Adam Smith, noticed that there was a resonant feedback mechanism between profit and competition that seemed to keep the market balanced so that products and services were exchanged at fair rates. This he coined 'the invisible hand' of the marketplace. As the concept evolved, several practitioners began to assume that the market was predictable and could be 'trusted'. This was based on the assumption that market behavior was essentially Gaussian and that it, in combination with this 'invisible hand', would provide an overall stability to the marketplace.

As we all know now, this assumption was largely incorrect. The stock market crash and the deep depression that followed were largely fueled by an overextension in the market that was based on this false assumption of predictability. As a matter of fact, one week prior to the crash of October 1929, Irving Fisher of Yale University, perhaps the most revered US economist of the time, claimed that the American economy had reached a "permanently high plateau". As little as three years later the national income had fallen by over fifty percent. In essence, no one, not a single economist, saw it coming. This was a prime example of misplaced trust and overconfidence that had built up over the centuries since the initial days of the cognizant risk assumed by those investing in the early trade expeditions. What served to allow this? Again, it was the abstraction of worth and also of the risk assumed on that worth.

When early investors bought into a voyage, there was a direct one-to-one relationship to the success or failure of that voyage. If the ship went down, so did your profits along with the initial investment. There was very little present to abstract or protect from the risk. In the modern marketplace, however, wealth could be moved and transferred from one interest to another. This capability gave the impression of lessened risk. In reality the overall risk was spread among various interests, so it did reduce the exposure of any single investment, and this is a key point. If the whole market crashed, as it did on that fateful Monday morning, and all of your assets were in the market at that time, it did not matter how well spread out your investments were; the market crashed and so did your assets! There was no difference. In essence, the market was your ship.

What this serves to illustrate is that while abstraction allows for greater scale, volume and agility, it reduces the overall visibility of assumed risk but does not eliminate the risk itself. This is an important principle that we will revisit as we begin to look at recent trends of trust in e-commerce.

 The new commerce paradigm

When you purchase something on the web today, you very seldom if ever get a chance to interact with another human being. When you think about it, there is a great degree of abstraction in the e-commerce model that the on line purchaser simply needs to accept. This is nothing new. It has been happening gradually over the years. It was even occurring back in Justinian’s day. After all, it is highly unlikely that Justinian ever met the actual proprietors of the dyes or silks in person. He had emissaries that handled his relationships with them. Note also that in the end he chose to remove all intermediaries to the product including the proprietor.

If we think about it, currency is the first level of abstraction that allows for all the others to occur. The concept of an independent representation of worth allows for trading at a distance without moving huge hoards of product as barter or direct trade would require. One party could pay for product with currency, typically gold or silver. As time progressed, the concept of currency evolved into a 'certificate' paper form that represented an amount of gold or silver, which was then held in reserve by some organization. One of the first organizations to do this was the Knights Templar in Europe, who provided for the safe transfer of wealth to the Holy Land for would-be pilgrims. This added an additional level of abstraction, but with this new approach a business deal could happen as a totally separate occurrence from the actual movement of product or gold, and this is more often than not the case. This is one of the primary tenets of commodity trading. For many centuries, currency through banking and a postal capacity addressed the requirements of distant trade and commerce. (Remember that Justinian kept the postal service to the east.) In more recent times, we can reference the use of the Pony Express and soon after the locomotive, which allowed for the significant growth the countries of North America experienced, but the basic paradigm did not change. It was still a combination of currency and postal service. The only thing that was happening was that the information regarding commerce and the product being traded was moving faster.

All of this changed with the invention of the telegraph, and soon afterwards the telephone, and the further abstraction of worth: the 'wiring of currency'. At this point the delta in time between information and product truly diverged. It could be argued that it is easier and faster to move a letter than goods. However, in most instances, particularly with the locomotive, both moved on the same train. Telecommunications made its big impact through the ability to communicate far faster than goods could move. As a matter of fact, it allowed for the total separation of commerce information and product flow. This is the primary feature that has allowed for our modern world.

Everything is Virtual (in its own way)

The inception of the Internet could be viewed as a continuation of the telecommunications commerce paradigm. There is, however, a critical difference: a set of additional abstractions that allow true e-commerce to occur. The first is that commerce is no longer limited to physical products or in-person services. Think about it: with a telephone, even over the highest quality channel, the only thing I can do is talk to you. Now granted, there are some things of value here, perhaps even valuable enough to pay for if I happened to be a lawyer, accountant, or some other form of consultant. The list is pretty narrow though, because it has to be limited to talking. The fax machine changed this slightly, so that I could send a facsimile (hence the term 'fax') of a document and then talk to you about it over the telephone. There is more value for the service here. In the case of legal consultation, it might be a contract or agreement. In the case of accounting it might be a balance sheet or cash flow statement. In either instance the value of the service is increased because you did not have to wait two or three days for the letter or document to reach you by mail before I could call you about it. For quite some time, this was the state of the art for business communications.

With the Internet however whole processes and services can be productized in a virtual fashion and sold electronically. In essence, currency moves (virtually as well – we shall discuss this next) and nothing happens physically. No product is shipped; no person picks up a hammer or a shovel as a result. Something happens in cyberspace instead. More importantly, something happens in cyberspace that creates an eventual real world result.

There are many companies that serve as examples for this. Paychex™ provides electronic outsourcing of company payrolls. EBay™ provides an on line auctioning service where folks and companies can sell their belongings and products in a virtual garage auction type of setting. In all of this though, on line stock trading is the one with perhaps the biggest impact on the movement of wealth in today’s world. This ability has greatly improved the trader’s response time to market trends. This is accomplished not only by the use of the Internet and computing but by the removal of the intermediaries. (Sounds like Justinian doesn’t it?) While this has certainly been a boon for the typical individual many economists have indicated that the implications can be a knee-jerk economy, where herding behavior among trading communities can be greatly accelerated, sometimes to the detriment of the market.

Along with the virtualization of products and services there has been an equal and parallel trend in the virtualization of wealth. Much of our wealth today is paid out to us and then relayed to those we are indebted to without ever being realized physically. In other words whole cycles of revenue transfer happen in a totally virtual context. As an example, my mobile phone bill is automatically paid by my corporate card, and my corporate card is in turn paid electronically out of my checking account which is funded by electronic deposit by the company’s payroll service. None of the monies ever becomes physically realized. It is the transfer of the balance (in essence nothing more than a number) that moves the wealth. Indeed, at the very base reality it is the manipulation of numbers in different account records that represents the transfer of that wealth. I never touch the gold, but I realize the values of the benefits.

When we put these concepts together we arrive at the contemporary paradigm of e-commerce. Let's take the example of an individual who buys a product on line using a credit card. The e-vendor charges the account number and the individual incurs a charge on their account. They may have the card set up for automatic payment from a checking account, which in turn is funded by electronic payroll deposit from the company they work for. Everything in the end-to-end commerce flow is virtual. The only tangibles in the whole model are the hours worked by the individual and the product that (hopefully) eventually arrives at their home in good condition. This is something that most folks simply take for granted. They trust the paradigm. There are others who are more cautious, those who trust only a part or portion of the paradigm. An example would be an individual who is completely comfortable with electronic deposit from their company but prefers to write a check (which is in turn a paper abstraction of wealth that could be viewed as a precursor to the current paradigm) to pay their credit card bill. This same individual, however, might be totally amenable to purchasing a product on line from an on line vendor using that card.

Then of course, there are those who would trust no such abstractions. Indeed, there are those who insist on being paid in cash and would not relinquish that cash to any entity for holding. All of their charges and bills they incur and pay on a personal basis. One has to wonder how limiting and restrictive this approach is in today's society. Any extension out of the normal day-to-day life would require significant effort and expense, as well as risk: this individual is carrying his whole wealth on his person. He is at extreme risk on the physical side. He could be mugged and most probably harmed, perhaps killed, for the wealth he carries. So any extension of this constricted lifestyle would be more costly, even if it went as projected. There can be a cost for not trusting as well.

From this we can see a spectrum of trust, one that runs from total trust, where everything is virtual, to total mistrust, where everything is physical. We could also argue that both are extremes and that, as such, the population would fall along a Gaussian distribution, with the majority lying somewhere in the middle. At both ends of the spectrum there are extremes of risk as well. On the virtual side, all of the risks are in turn virtual. (There is, however, the real loss of wealth in cyber-crime and identity theft. Most credit companies will protect their customers from any charges incurred; this begins to touch on the concepts of insurance and the spreading of the risk factor, which we will discuss shortly.) On the physical side, all of the risks are physical, including one critical difference – the risk of physical harm. Indeed, it is most probable that this was one of the primary motivations for the abstraction (virtualization) of wealth to begin with. Recall the Templars, who founded the first embodiment of modern banking. They became powerful and wealthy on the holding and transferal of wealth for pilgrims to the Holy Land, so that the risk to the individual making the trip was reduced. In essence, the wealth was 'virtualized' during the trip; there was a degree of separation between the individual and their associated wealth. Over the course of the sojourn the individual was fed and defended (for a substantial fee), and when they arrived at the Holy Land they could cash in their deposit checks and be flush once again. The revenues were transferred by more secure military means or, more ideally, already existed in Jerusalem. Either way, the pilgrim received their gold at the end of the trip, less the substantial fee of course.

 Go ahead – everything will be alright…

If the aspect of risk is somehow primary to trust, then there is a related value in the level of assurance provided to the individual or entity that enters into the relationship. Again, these are related in a vector relationship exactly like that shown in figure 1. As the level of risk gets higher in the trust relationship, the level of assurance must in turn be sufficient to 'cover' it. There are more dimensions to consider, however. We need to consider the aspect of reward.

Reward could be considered a positive dimension of risk; the two exist in opposition. As the ratio of reward to assumed risk becomes higher, it is more likely that an individual will move forward and assume the risk. It is almost as if an individual reduces the risk factor in their own mind when it is taken in the context of reward. This is what causes individuals to do things that they would otherwise not ordinarily do, such as clicking on an icon on a questionable web page. In instances where the degree of risk is higher than the potential reward, an individual is likely to pass the opportunity by. This relationship is shown in the diagram below. Note that there are two boundary vectors in this diagram. The lower, liberal risk vector reflects a lower expected level of assurance for a given context; the higher, conservative risk vector reflects a stronger expectation of assurance for a relatively lower extension of trust. The sinusoidal line in the middle represents the decision vector of the individual or entity. It is represented as such because it could be described as a waveform that is unique to the entity. Some individuals or organizations may be fairly liberal, others may be more conservative, but each one will be sinusoidal in that the decision hinges between perceived potential risk and reward. It is also important to note that at the nexus of the graph the sinusoidal pattern is smaller, and it increases in relation to the absolute boundary vectors, which illustrate the potential range of decision.

Figure 2. The relationship of reward and risk in trust

Note that as the risk and reward grow more significant, the sinusoid grows in relation, which represents the state of 'indecision' that we typically encounter in high-stakes affairs where the risk and reward potentials are exceptionally high.

This is common sense to some degree; few of us would argue with it. However, there are a few important points to consider that are pertinent in today's e-commerce environment. First, when we say assumed risk or potential reward, we mean 'perceived' assumed risk or potential reward. What an individual perceives and what is really occurring can be two totally different things. Herein lies the root of all scamming and racketeering activities, and the addition of a cyber environment only provides another level of cover for further abstractions between perception and truth.

The second important consideration is that assurance (or insurance) can change this relationship. Both can serve to decrease the degree of risk assumed and hence push the individual in the direction of a positive decision.

As an example, neither you nor I would purchase a book from an unknown vendor on line with no validation and no privacy. The level of risk (placing your credit card number on line unprotected) versus the reward (a book – one that you must want, otherwise we wouldn't be having this thought exercise) is simply too high. However, if it is a well-known vendor and your credit card information is held in a profile that does not go on line, the level of risk is minimal and the purchase becomes a very trivial decision that is almost equivalent to standing in an actual book store. This is even more the case if you happen to have coverage on your credit card for fraudulent activity. This is illustrated by a modification of the figure, shown below. As systems of assurance are put in place they provide a positive 'pressure' on a given situation. This pressure serves to reduce the perceived (and hopefully actual) degree of risk.

Figure 3. The positive influence of increased assurance or insurance

From this we can deduce that providing increased assurance to individuals who participate in e-commerce is a good thing and will produce positive results. This is indeed the case. It also means, however, that individuals can be misled. They can be misled either by the degree of the perceived reward (think fake lotteries and sweepstakes) or by the degree of perceived assurance (anonymous SSL is the main avenue here). Many scams will try to do both. A good example is a sweepstakes email from a seemingly reputable company name bearing the happy news that you are the winner and you only need to fill in some required information on a 'secure' web site. You even get the SSL connection with the lock icon at the bottom of the browser window! So assurance is a double-edged sword. If the potential reward is big enough and the 'illusion' of assurance can be provided, then the basic ingredients for a scam are present.
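To make the interplay of reward, risk and assurance a bit more tangible, here is a small Python sketch. The decision rule and the numbers in it are purely my own illustrative assumptions, not a formal model from the discussion above; assurance simply acts as the 'positive pressure' that discounts perceived risk.

```python
def likely_to_proceed(perceived_reward: float,
                      perceived_risk: float,
                      assurance: float) -> bool:
    """Illustrative decision rule: assurance discounts the perceived risk.
    All values are on an assumed 0..1 scale."""
    residual_risk = perceived_risk * (1.0 - assurance)
    return perceived_reward > residual_risk

# Buying a book from an unknown site with no validation and no privacy...
print(likely_to_proceed(perceived_reward=0.3, perceived_risk=0.8, assurance=0.0))   # False
# ...versus a well-known vendor with a stored card profile and fraud coverage.
print(likely_to_proceed(perceived_reward=0.3, perceived_risk=0.8, assurance=0.9))   # True
```

Note that a scam works on exactly these inputs: it inflates the perceived reward or the perceived assurance, while the actual risk is unchanged.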

This can be carried further by the ingenious but nefarious use of software that can place key loggers, bots and Trojans on a user's PC as a result of merely visiting a web page. Once the code is resident, all sorts of information can be gathered from that compromised system. With this approach there is no need to dupe the user into entering anything on-line. The malignant party need only wait for the scheduled updates from its cyber-minion. All that is needed in this scenario is a moment of indiscretion on behalf of a user who is 'dazzled' momentarily by the perception of some great potential reward. The code does the rest.

So what is a user to do? It seems that, in a cyber sense, we are going back to the days immediately following the fall of the Roman Empire, or to the days of the Old West, where your very survival often depended on the whims of the environment. Interestingly, there are many analogies drawn between the Internet and the Old West. We are now at a point in evolution where the analogy to the time following the Roman Empire (known as the Dark Ages) may be more appropriate. Many of the malicious parties are no longer just college kids or folks looking for a quick buck. As systems automation has become more prevalent, many malicious activities are being directed against infrastructure. Some of these activities can even be traced back to national, religious or political interests. So things are getting into the big leagues and, like a good ball player, we need to change our mentality to play in that league.

In this model, you might view the typical enterprise as a feudal kingdom that lies behind solid defenses of rock and earth. From these ramparts the enterprise does its business through various ways of securely providing access across its defenses. As we carry the analogy further, the single Internet user is like a peasant in a mud hut outside the walls. Their defense is only as good as the probability of contact with malicious forces. They may run anti-virus software and have security check updates, but the real bottom line is that there is always a lead with malware, just as there is always a lead in weapons versus defense. If the user frequents unclean sites, then it is only a matter of time before they contract something that neither the security checks nor the anti-virus software recognizes… that is, until it is too late. So the analogy is apt. In the Dark Ages, if you were living in a mud hut you were at very similar odds. If no one came along, you were fine (the analogy here is that your software is up to date and recognizes the threat)… if not, then not, because most often your defenses were paltry in comparison to those of the people who threatened you.

 So what does all this mean?

 What we will do now is take the information regarding the subject of trust that we have gathered in our walk through history and see how it relates to these modern-day issues. Some of the results we find will be obvious; others may be startling. Some may even discredit major industry trends. In all of this it is important to keep an open mind and to remember that history often does repeat itself – it just happens in a different context.

First, let's be clear. The Internet was never like the Roman Empire, except perhaps in the earliest days of DARPA. From the outset, the analogy of the Old West or the Dark Ages was the most appropriate way to describe the environment. What I would like to do, however, is bring the analogy a level higher in scope and say that the typical enterprise is the typical empire or kingdom, and that each enterprise is responsible for its own domains and the interests that the enterprise represents. This is certainly a valid analogy in that even Rome co-existed with other empires, though not always peacefully; Persia and Carthage are two examples. So, in a similar fashion, different enterprises may be seen to interact, sometimes in a friendly way, such as a supplier relationship, and other times not, such as a competitive relationship. This however is not the point. The point is that each enterprise is responsible for securing its own domains, just as each empire was responsible for theirs. Here the analogy holds true. As an enterprise, my organization cannot be made responsible for the security of my suppliers or even my customers. It is up to them to make sure that their own house is in order. The bottom line is that some may be more diligent than others.

So what is the first thing that we can draw from this? Well, first off, empires existed by virtue of the ability to leverage wealth. They did this by maintaining well-protected trade routes to the various other empires or nations that provided or desired products for trade. We might view virtual private networking and data encryption as the modern-day equivalent of this. Business-to-business connections happen securely when they are properly administered, as their widespread use can testify. (Note, however, that recent attacks on IPSec VPN gateways have been documented, just as attacks on well-protected trade routes occurred.) Secure remote connections can happen for end users within enterprises (I am using one now) as well. All of this can occur because the enterprise, like the empire, has the ability to set the policies for its security practices.

Like well-protected trade routes to the empire, VPNs are only a part of the answer for the enterprise's defense. Each enterprise also has a well-protected border that is maintained by threat protection and security devices, just as empires maintained well-protected borders by the use of armies or legions.

In the industry today there is a major push toward an end-to-end security model. In this model, everything is authenticated and encrypted directly from the user's device to the server being accessed. This approach has its benefits, but it also has a drawback in that intermediate security devices such as threat detection systems and firewalls are blind to the traffic coming across the border. As such, encryption can provide a cover of darkness for a would-be attacker instead of providing the protection it was intended for. Parallel to this is a major thrust toward the decomposition of the security framework within the enterprise. In this paradigm, intermediate security devices are labeled as antiquated and not up to the challenge of protecting the enterprise in today's e-commerce environment. Instead, the function of security becomes increasingly resident in the server and the client in the end-to-end scenario. If we carry this over to the empire analogy, it is equivalent to leaving the borders less protected in favor of depending purely on protected trade routes. This brings to mind Justinian's reduction of the armies and the resultant reduction in control of territory that the empire experienced.
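The sketch below is a deliberately simplified illustration of that drawback: a signature-based check of the kind an intermediate device might run sees an attack in cleartext, but not once the payload is encrypted. The 'IDS' and the XOR 'cipher' are toy stand-ins of my own, not real products or protocols.

```python
# Toy illustration of why inline inspection goes blind under end-to-end encryption.
MALICIOUS_SIGNATURE = b"DROP TABLE"

def ids_inspect(packet: bytes) -> bool:
    """Signature-based check of the kind an intermediate device might perform."""
    return MALICIOUS_SIGNATURE in packet

def toy_encrypt(payload: bytes, key: int = 0x5A) -> bytes:
    # Placeholder for real TLS/IPsec; a simple XOR is enough to hide the signature.
    return bytes(b ^ key for b in payload)

attack = b"GET /item?id=1; DROP TABLE users --"
print(ids_inspect(attack))               # True: cleartext traffic can be screened in transit
print(ids_inspect(toy_encrypt(attack)))  # False: the same attack passes unseen once encrypted
```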

Perhaps a clearer analogy is the foot soldier. This is a paradigm I like to term the 'Naked Samurai'. In this analogy, the trend of security decomposition is equivalent to a samurai who sheds all armor prior to entering battle. (While this was never a practice of the samurai, it was known to happen with the Scots, which scared the devil out of the Romans – but did not do well for the attrition levels of the Scots. It should be noted that they eventually abandoned the practice and started to use armor like everyone else.) In order to survive the endeavor, the soldier must be flawless in his reactions. Each response must be perfect, because any error would be grievous. Even a minor injury would prove fatal, as it would likely lead to further errors through pain and blood loss that would eventually prove to be his demise. As a result, no sane soldier would enter the thick of battle without armor, and yes, medieval Japan had its share of armor. In many ways, this is equivalent to the current decomposition trend. In the end-to-end encryption paradigm, the first point of defense is the last point of defense. As a result, the server must be perfect in its response to any threat it experiences. As we covered earlier, it is not always possible for security code updates to catch the latest malware. In this model it is also not possible to always monitor or protect the client end system, because the client may very well visit sites that are compromised. As the client system gains access to the server, it can in turn infect that more important system. Without intermediate security, there is nothing that can be done to rectify the situation.

To carry the analogy further, this is parallel to the fall of the empire and the rise of the feudal kingdom in its place, where the feudal kingdom becomes analogous to the server. Arguably the feudal kingdom, like the server, is less able to defend itself than the empire, like the enterprise. Most certainly, any defense it does have is much more local and as a result much more easily compromised. More so, once it is compromised there is no cavalry to rescue it, because the intermediate security devices are blind to the encrypted traffic. Also consider that the compromised system is now an enemy outpost within the enterprise data center, where it can further entrench and infect other systems. This is analogous to a Dark Ages castle opening its drawbridge and filling in its moats, with all folks coming into the castle escorted by a secure squad of guards to their place of business. All of this sounds well and good, but no one did it. Why? Because such a practice would have been construed as insane.

It is clear that a good security practice involves a combination of components. It is also clear that security has strong impacts on degrees of assurance, whether it is for medieval merchants or for e-commerce enterprises. Secure borders, rock walls, earthen ramparts, armed guards and armed trade caravans: all of these were required in order to fully secure the domain of interest, which was the empire. The very same thing holds true for the enterprise. To succumb to the notion that defending the border is just too difficult is to succumb to the notion that destruction, or at the very least fragmentation, of the larger entity in question is imminent. No enterprise would accept such a notion, just as no empire would. Yet empires have fallen for these very reasons. Ominously, enterprise networks, particularly those that depend on e-commerce within their business models, could be viewed in very close analogy here.

Fortunately, there are differences: unlike empires, enterprises do not have to control all of the territory connecting their sites in a physical sense. They do, however, have to deal with secure inter-connections across vast geographic domains. As a result, enterprises require multiple layers in the security model to properly protect their resources and interests. Firewalls, VPN gateways, and threat detection and remediation, to name a few, as well as end-to-end security, are required to totally secure an enterprise. All of them provide value. The question then becomes: 'How do two paradigms like end-to-end encryption and intermediate security devices co-exist and provide value to the enterprise?' Well, the answer is rather straightforward. It is the same as that which provided the answer for the empire. It is a term known as 'Federation'.

I’ll trust you if you’ll trust me

Merriam-Webster defines a 'federation' as an encompassing political or societal entity formed by uniting smaller or more localized entities (such as a federal government or a union of organizations), or as the act of creating or becoming a federation, especially the forming of a federal union. Extending this into the area of security technology, it is interpreted as a system for common governance and implementation of consistent policy across the domains of interest. I say 'domains' in the plural because this is one of the major uses of federation: the tying together of enterprises for B2B usage. Such an approach allows trust to be extended across domain boundaries for very specific reasons, as well as the ability to limit any such trust only to those services that are made open. This is analogous to opening the drawbridge or the border to a trading party that has established friendly intentions. The figure below shows such a relationship. In the diagram we show an enterprise (Enterprise A) that has relationships with three other companies (B, C & D). One is a supplier to Enterprise A and is connected to it over a provider network. In this scenario, the two companies use an actual VPN with dedicated gateways. Both enterprises extend basic trust and each administers its own relevant firewalls and access control policies, but each will trust the credentials of the other enterprise through the use of federated digital identity.

Figure 4. An example of a Federated Business Ecosystem

In the other relationship, Enterprise A is the e-commerce vendor and has business relationships with Enterprises C & D as a supplier of products. For these relationships Enterprise A provides a secure web services portal over a provider network. In this scenario, there are no VPN gateways. Instead, Enterprise A provides directory services for its customers based on a federated B2B relationship. As a result of the federation, the enterprise trusts the credentials that Enterprise C & D users offer when they access Enterprise A's secure web portal. As they gain access to the portal they are in turn offered a certificate-based secure encrypted transport via SSL or some similar method. Once that occurs they have access to the secure portal and can do their business within the allowance of the access control policies. Note that while Enterprise A has relationships with all of these companies, there is no provision for direct connectivity between Enterprises B, C & D in the context of 'this' business ecosystem. Other contexts may allow it.

Further federation of the internal security frameworks would allow for the autonomic modification of security policies (e.g. firewalls) and access according to the higher level governance of the policy environment of the larger federation. Federation allows all of these companies to interact and execute a business ecosystem in a relatively secure fashion that does not demand undue opening of each company's security border.
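
To make the mechanics a bit more concrete, the fragment below is a minimal sketch of the kind of federated credential check described above, assuming the partners have exchanged a shared signing secret out of band. The partner names, field names and secrets are purely illustrative and not any particular product's API.

```python
import hmac
import hashlib
import json
import time

# Hypothetical shared secrets exchanged out-of-band with each federated partner.
FEDERATED_PARTNERS = {
    "enterprise-B": b"shared-secret-with-B",
    "enterprise-C": b"shared-secret-with-C",
    "enterprise-D": b"shared-secret-with-D",
}

def verify_federated_assertion(assertion_json: str, signature_hex: str) -> dict | None:
    """Accept a partner-issued identity assertion only if the issuer is in the
    federation, the signature checks out, and the assertion has not expired."""
    assertion = json.loads(assertion_json)
    issuer = assertion.get("issuer")
    secret = FEDERATED_PARTNERS.get(issuer)
    if secret is None:
        return None  # issuer is not part of this federation
    expected = hmac.new(secret, assertion_json.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature_hex):
        return None  # signature does not match; do not trust the assertion
    if assertion.get("expires", 0) < time.time():
        return None  # stale assertion; federated trust is time-bounded
    return assertion  # caller still applies its own local access-control policy
```

Enterprise A would run something like verify_federated_assertion() at its portal or gateway and, on success, still apply its own local firewall and access control policies, exactly as described above.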

Sidebar – The Neurobiology of Trust

Recent studies have shown that the phenomenon of trust is strongly related to the quantity of the hormone oxytocin in the body. A monitored test with a variant of a game of trust indicated that during periods of relatively trusting interactions the hormone was seen to markedly increase in particular portions of the brain that revolve around facial recognition and social interactions. Conversely, the hormone was seen to decrease in instances where the other player's actions elicited a feeling of mistrust. Along with this decrease in oxytocin, there are also telltale 'fight or flight' indicators such as colder hands – which reflect the surge of blood to the body core. Furrowed brows are another key indicator, along with escalated heart rate and a corresponding increase in blood pressure.

Additionally, other studies have shown that facial expressions or gestures that are meant to indicate friendly intentions, such as waving or smiling, will also cause a marked increase in the presence of the hormone.

The question remains how wholehearted trust can be generated and maintained given the, at best, indirect human interactions that are typical of e-commerce situations. These studies do indicate that there are biological reactions that can actually be measured within the human brain. This fact leads to the possibility of designing e-commerce sites where test users are monitored for the presence of oxytocin as they navigate through a prototype site. Such design approaches would allow for the redesign of e-commerce sites that are better suited to the human aspects of trust. In the future, real-time biometric sensors may be able to report some of these indicators back to the e-commerce site to provide feedback on the customer's level of comfort as they use it.

 What about the guy in the mud hut?

All of this is well and good for enterprises, but what about individual users who are not affiliated with an enterprise? Unlike the enterprise, these individuals do not have the convenience of large budgets for security. The analogy here is very close to the farmer who lived in the mud hut and traded his wares with larger kingdoms in return for the needs of life. When you think about it, the e-commerce paradigm is quite frightening for these users. They are using a network that they do not administer or control to gain access to services that they also do not control in order to purchase products. Very often they are required to put fairly sensitive data into the web interface that they are using, all with very little assurance that no foul play will occur. When put this way, it is a wonder that anyone does anything on line that has to do with credit or financing. Yet, many do. The convenience outweighs the perception of risk.

Even with this motivation however, the level of Internet sales during the Christmas holiday season has experienced a sharp decline, with many folks opting to investigate on line but actually get in the car and physically go to the store to buy the product in person. Internet sales were shown to be down forty percent during the 2006 holiday shopping season. While the numbers are not yet in for 2007, many fear that they will reflect further depressed numbers. When asked why, through the use of surveys and such, many users cited fears about identity theft and the commandeering of credit cards for illicit use, and this concern is to some degree validated. A study by the Federal Trade Commission (this study can be found at: http://www.ftc.gov/os/2007/11/SynovateFinalReportIDTheft2006.pdf) has shown that identity theft reports hover at around 4% of the surveyed population, with losses totaling 15 billion dollars in the 2006 time frame and an average cost per victim of around five hundred dollars. These statistics are intimidating. Moreover, the experience of identity theft is even more so. Most users become very leery of e-commerce of any kind once they become victims. Indeed, many psychologists are saying that the same post-traumatic stress symptoms that individuals experience after a mugging or robbery are being experienced by folks who are unfortunate enough to experience cyber fraud or identity theft. Obviously, there is no threat of physical harm, but the feelings of violation and loss of control are just as acute.

As more users undergo this type of experience, they take it into their social context. They tell friends and family of the ordeal and by word of mouth provide a dampening effect on e-commerce activity through the reduction in the perception of assurance. This is very similar to my neighbor telling me about the shoddy job that the neighborhood kid did on his or her lawn. As a result, I will tend to be more diligent and inspect the job more thoroughly when it is completed, and even perhaps pick out something I might otherwise disregard. My degree of trust has been compromised because of the reduction in assurance by my neighbor's comments. This would be even more so if it were a baby-sitter, because of the increase in the level of assumed risk. We can find a direct analogy in e-commerce that speaks to some of the reasons behind the downturn in activity.

Clearly, there needs to be some sort of governance for security within domains of public Internet access. Internet Service Providers are increasingly moving to meet this new set of requirements. Many will provide spam protection, anti-virus updates, free firewalls and other security related code and services as a bundled part of the access service. The movement to security governance in the provider space is the only way to further secure the guy in the mud hut. Many Internet purists bristle at any such proposal. I would argue that Internet purists are no more 'real world' than Adam Smith was on the economy. There is a real threat to the common Internet user, and security domains of interest (i.e. run by the companies who provide Internet access) are the only way to combat the problem. Software updates to users' PCs are only part of the answer however. Providers need to incorporate stronger security policies based on histograms of problematic sources. As users become known perpetrators of cyber-crime or even spamming activities, providers need to crack down and revoke access and, if appropriate, forward the incident to legal authorities. Believe it or not, this actually does occur. During the 2005 timeframe the FBI executed a significant string of arrests for child pornography trafficking, all with the cooperation of Internet Service Providers. Other arrests have occurred in the areas of identity theft and cyber-fraud that show that it is possible to do enforcement, which after all is a key ingredient in any system of governance.

Given all of this, there is still something more. We have thought about protection of identity and privacy. We have talked about active components that can police and provide this boundary of the security domain. We have also talked about the role that the user's machine can play in the security paradigm. There was also the discussion of the federation of these systems and methods so as to provide a coordinated system of governance for infrastructure and policy. What is missing is a key element that goes back to the days of the Templar Knights: reducing the element of reward or temptation.

What’s in your wallet?

If I have a credit card account that is in good standing, is there really any reason to put the number of the account on line when I buy something on the web? Really, think about it… do I really need to do that? Would it not be better to hash out a string that is unique to the transaction, share it with my card provider (via a dedicated secure connection between us), and then present the equivalent token to the e-commerce vendor over a different secure connection? The e-vendor (if I may use the term) would in turn present the hashed token to the card provider. The card provider would then search its record of transactions for the user and (hopefully) find that it is 'open' from a transaction standpoint. The card provider would then honor the credit and update the account. Now yes, there is the argument that someone could steal that hash, but it is limited to the value of the transaction only. It is likewise a one-time occurrence in space and time that is valid only for the transaction at hand. Given the speed of most levels of Internet access, there will potentially be only microseconds during which a would-be thief could 'steal' the transaction. Consequently, a strong level of assurance would be provided to the user that reinforces their trust.

The key concept here is that while abstraction has been the enemy of the commerce paradigm from the standpoint of this paper so far, it is also an avenue for further entrenchment of security services into the e-commerce paradigm. While abstraction from the original concrete transaction (remember the village trading example) has caused a series of potential security holes where criminal activity can occur, in a very real sense further abstraction of certain aspects can help alleviate them.

This concept of the digital wallet yields a system which simply generates credential hashes that are used in tandem with identity assertion tokens to 'point' to entities that can in turn validate the transaction. These 'pointers' are only valid for the context of the transaction, for the vendor they are intended for and for a limited duration. All of this closes the window of risk exposure considerably. The direct credit card information is never out on the wire. There is never any instance where it needs to be presented. This attains complete abstraction from the actual credit card number. This is a critical move that greatly reduces the assumed risk for the purchaser. It also significantly lessens the level of temptation for any would-be cyber-thief. The level of assurance increases, or more precisely, the level of required assurance decreases in kind (recall figure 3). Provided that there can be a solid way to identify valid e-vendors, the level of assurance with existing technologies could be enough to provide the boost in activity that e-commerce needs at this point in its growth as a market sector.
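
As a rough sketch of the one-time transaction token just described, the fragment below binds a token to a single merchant, amount and time window using a secret shared only between the buyer's wallet and the card provider. The function names, fields and the choice of HMAC are assumptions for illustration; a real payment scheme would involve far more.

```python
import hmac
import hashlib
import secrets
import time

def mint_transaction_token(card_secret: bytes, merchant_id: str, amount_cents: int,
                           ttl_seconds: int = 120) -> dict:
    """Create a single-use token bound to one merchant, one amount and a short window.
    The card number itself never appears in the token."""
    nonce = secrets.token_hex(16)          # makes the token unique per transaction
    expires = int(time.time()) + ttl_seconds
    payload = f"{merchant_id}|{amount_cents}|{nonce}|{expires}"
    digest = hmac.new(card_secret, payload.encode(), hashlib.sha256).hexdigest()
    return {"merchant": merchant_id, "amount": amount_cents,
            "nonce": nonce, "expires": expires, "token": digest}

def provider_validate(card_secret: bytes, presented: dict, seen_nonces: set) -> bool:
    """The card provider recomputes the digest, checks expiry and rejects replays."""
    if presented["expires"] < time.time():
        return False                        # window has closed
    if presented["nonce"] in seen_nonces:
        return False                        # token already spent
    payload = f"{presented['merchant']}|{presented['amount']}|{presented['nonce']}|{presented['expires']}"
    expected = hmac.new(card_secret, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, presented["token"]):
        return False
    seen_nonces.add(presented["nonce"])     # mark as consumed
    return True
```

Because the card number never appears in the token, intercepting it yields only a short-lived, single-use, single-merchant pointer – precisely the reduction in temptation argued for above.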

If such a system could be built that not only incorporated the abstraction concepts described above but also included a consortium of e-vendors and credit card providers, a cyber shopper could then look for the ‘brand’ label that provides the added level of assurance that this is a safe site that participates in the business ecosystem consortium. They will know that they can enter the site and buy something by the hash generation technique mentioned above and that they will not at any time during the course of the shopping cart experience ever be asked to put in a credit card number. But this only works if there is the assurance of participation in the system of governance and the ability to identify oneself as such.

 Just who are you?

In all of this, what is consistent? From the initial Stone Age village trade to the next generation e-commerce transaction, what shifts and what doesn't? Well, as we have seen, almost everything shifts. The concepts of representation of worth and the methods for doing so have definitely changed. The methods of advertisement and business have most definitely changed as well. In both instances, the changes have led to more abstract models of function. In turn, the aspect of identity has necessarily been abstracted to fit into this new environment. But interestingly, it is the one thing that, at the end of the day, has not changed. After all, the human that traded his cow for grain in Neolithic times could be viewed as no different from us modern humans, outside of all of the additional trappings of civilization. In all of the garbled abstraction that has been added to the commerce model there is still the humble human who is looking to buy or sell something and, of course, during the course of business make a few bucks! Even in light of complicated autonomic business processes, where the human who is buying something is not acting with another human 'at the other end of the line' but is instead working with fairly inhuman process oriented flows, there is still a group of humans who set up the automated process environment. It is also assumed that these humans did so with the intention of making a few bucks. So the fact of identity does not go away with automation. As a matter of fact, it has now become one of the most critical pieces in determining the success or failure of the e-commerce model. To be clear, while the need for identity has been consistent, what it means has had to change drastically.

If we recall the Neolithic village trading example, we were in a village in central Asia before the advent of the Bronze Age, or perhaps right at its inception. The whole population of the village was most probably around one hundred and fifty individuals. Comparing this with most isolated villages in central Asia today would give credence to such an estimate. Given these numbers, it is highly likely if not almost certain that the two individuals knew each other well. It is also highly probable that each individual's families knew each other as well. In other words, identity was part and parcel of the Neolithic trade. If someone came in from across the tundra with a cow that they wanted to trade for grain, the result would probably not be a good one for the 'would-be' trader. In real life, he would probably be killed quickly and the cow simply taken by the family that did the killing. At the very best, it would probably work out that the village would simply take the cow and leave the stranger, perhaps battered about a bit. In any case, the least probable outcome would be for everyone to sit down at the fire and draw up an equitable trade agreement for the animal. Why? The answer is simply that the stranger is not 'one of them' and because of his singularity has no leverage. He is not part of the social fabric of the village, so unless he had something really outstanding and had the ability to defend it – and there were points in pre-history where things like this did occur – he would usually be turned away or, worse, killed.

This is really no different today. We no longer kill folks who are not part of our social circle, but someone who is not part of the normal social ecosystem will usually find it harder to do business in person to person exchanges. The problem is, with e-commerce it is very hard to hold this kind of line at all. As soon as you go on line, you are dealing with folks you don't know and probably never will meet. Granted, there could be a small percentage of folks you know who own e-commerce companies, but I think that you will find the list to be quite short. The real fact of the matter is that in most instances you do not know the folks that you are doing business with. This has been cited as one of the major issues that folks have with e-commerce: there is very little that can be provided to assure users that they are talking to who they think they are talking to, and that there is no one in the middle.

Identity may be a consistent historical feature in assurance, but in the new e-commerce model the concept needs to change. Clearly, if any real capability for identity is to be brought into the e-commerce paradigm we need to consider the human in the cyber environment. First, all instances of human presence on the Internet are composite instances. The reason for this is that no human can access the Internet directly. All humans require some type of device, as well as some type of network access with that device, to get on line. The composite goes further as well: there are the capabilities of the device, the bandwidth available, the type of video or audio supported, perhaps even the location of the individual as they access the network, as well as the application they are using! All of these characteristics build up the composite entity that is a human being on line. The figure below illustrates this concept; note that there is a layered instance of the human over some type of interface into an application, which is in turn supported by an operating system for the device, and lastly the device hardware itself. All of this together adds up to the complete instance of a human presence on the network. Does this mean that a human with one device is different from that same human using another type of device? The answer is yes, particularly if there is a significant difference in device capability, especially in the area of security.

Figure 5. The Composite on-line Entity
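
As a minimal illustration of the layering shown in Figure 5, the sketch below models a composite on-line presence as the human plus the application, operating system, device and network context they arrive with; the specific attributes are assumptions chosen for illustration only.

```python
from dataclasses import dataclass, asdict
import hashlib
import json

@dataclass(frozen=True)
class CompositePresence:
    """One on-line 'instance' of a human: the person plus everything they ride in on."""
    user: str            # the human (or service account) behind the session
    application: str     # e.g. browser, thick client, surveillance console
    operating_system: str
    device: str          # hardware platform
    network: str         # access network or location context

    def fingerprint(self) -> str:
        """Two presences differ if any layer differs, even for the same human."""
        canonical = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(canonical.encode()).hexdigest()

# The same person on two different devices yields two distinct composite entities.
laptop = CompositePresence("alice", "browser", "Windows", "corporate laptop", "office LAN")
phone  = CompositePresence("alice", "browser", "Android", "personal phone", "public Wi-Fi")
assert laptop.fingerprint() != phone.fingerprint()
```

The point of the sketch is simply that identity in this model is a property of the whole stack, not of the person alone.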

Going further, we could extrapolate this out to non-human instances of presence as well. It would apply to application servers or to thinner types of devices, such as sensors of the physical environment like video surveillance cameras. In these cases, there is no human sitting at the 'other end of the line'. Instead there is just a machine. But the machine is also a complex of composite elements. It too has an application, an operating system, hardware elements and many other items that make it a server or sensor device. As the figure below shows, the same even holds true for the simple video surveillance camera: both the server and the camera have interfaces so that a user can log into that entity. It is by this logging in that an association then occurs between the entity and the human, which we must remember is in turn a composite instance on the network. So things can get fairly complex and convoluted in terms of who is who and who is running what. In order to clarify how these relationships can be embodied we will go through a couple of mundane examples of network resource usage and how the aspects of identity are inherited.

Perhaps the best and clearest example of the transference of identity by system log-on is in the case of video surveillance. The reason for this is that by logging into the system, the direct visual perception of the individual at the console is literally extended on a virtually unlimited basis. In essence, a person could be sitting in Europe watching real time video (less the latency for delivery of the data) of camera feeds in the United States or elsewhere. This relationship is shown by the diagram below. This is a rather obvious fact. However, one of the things that needs to be considered is that the system's intentions and integrity are directly associated with the whims and motives of the human being that is logged into it. In other words, there is a big difference between law enforcement personnel, illegal voyeurism and potential terrorists.

Figure 6. An example of how identity transits composite entities

The issues get more complicated with automated process flows. In reality all process flows have initial human sources. Even process flows that are completely automated and self-configured were designed by humans for a particular purpose. A good example is the recent flurry of Service Oriented Architectures (SOA) that are now the IT industry vogue. Based on web services concepts, a given process or application is packaged into a 'service' definition, which is in turn represented to the SOA framework as a 'service'. A service would typically represent some sort of application that drives a business process or a function for an overall business process. An example could be an application that performs order processing or billing within an end to end business transaction. A simple SOA process flow is shown in the figure below.

Figure 7. A Simple High Level SOA Process Flow

It illustrates a simple e-commerce order process flow. Each part of the end to end process is represented as a service within the overall process flow. Each is a web service application or a legacy application that has been adapted to a web services architecture. Each was created by a human being, or multiple human beings, for a specific purpose. Indeed, a good degree of equivalence could be drawn between the old time order clerk, who manually fulfilled the order on paper, and the application that now processes the order electronically. Just as there was the old time possibility of the clerk fudging the order and embezzling the remainder, so too there exists today the possibility of an embezzling web service that is purposely designed to accomplish that end. Perhaps more feasibly, a rogue web service designed by less than honest staff could be inserted into the process and might behave perfectly well on the front end. This is shown in the figure below.

Figure 8. An example of a 'Dark' SOA Service extension

On the back end, that same service that checks and validates credit accounts might export customer credit card numbers to a dark server somewhere on the network before they are taken off site or otherwise forwarded. This 'dark' portion of the service is not represented in its service description to the SOA environment. It is, for all intents and purposes, an invisible portion of the service due to the abstraction that SOA infrastructures provide. In essence, the only way to detect such a service is by monitoring its conversations and data exchanges directly.

The whole point of this is that systems and process automation do not, by themselves, address the issue of trust. In some respects, the issue is made more difficult by process automation. This is particularly true if systems of governance for web services within the organization are lax. There is an additional point to this however. Each of the web services is a composite entity. Each entity possesses the capability for damaging activity. How damaging depends on what the service does in the end to end business process. This means that identity is just as critical here as it is in the human interaction model. Additionally, histograms of activity for a service need to be monitored so that any unknown or undefined communications coming from it or going to it are quickly analyzed and dealt with. It must be considered that in this environment a lot of damage could happen in a very short period of time. Hence, the systems of identity and governance must in turn be automated and extremely dynamic.
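
As a sketch of the kind of automated monitoring just described, the fragment below keeps a simple baseline of which destinations each service normally talks to and flags anything outside that baseline. The data model is hypothetical; a real deployment would build its baselines from flow records or service telemetry and be far richer.

```python
from collections import defaultdict

class ServiceConversationMonitor:
    """Tracks who each service talks to and flags communications that were never
    part of its learned baseline (e.g. a 'dark' export to an unknown server)."""

    def __init__(self):
        self.baseline = defaultdict(set)   # service -> set of known destinations

    def learn(self, service: str, destination: str) -> None:
        """Called during a supervised baselining period."""
        self.baseline[service].add(destination)

    def observe(self, service: str, destination: str) -> bool:
        """Returns True if the conversation is anomalous and should be investigated."""
        return destination not in self.baseline[service]

monitor = ServiceConversationMonitor()
monitor.learn("credit-check", "billing")
monitor.learn("credit-check", "order-entry")

# A credit-check service suddenly talking to an unknown host is exactly the sort
# of behavior that the SOA service description will never reveal on its own.
assert monitor.observe("credit-check", "unknown-external-host") is True
```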

The Dark Delta – The difference between perception and reality

Trust is clearly something that is related to our perception of risk. The main problem here is that our perception may not always be totally accurate. Furthermore, it could be argued that our perception can 'never' be totally accurate. This gets into some very important aspects of the physical universe and our consciousness and awareness of it and the events within it. In essence, we never see reality as it is. We only see our representations of it within our own minds. At one time cognitive theorists proposed that our minds were in essence reflections of the events that transpire. This implies total accuracy of recall of those events. Subsequent findings have shown that our pictures of reality are in essence pictures that we generate in our heads against an inventory of symbols and images that we have learned and hold in our minds. What this means is that we do not totally recall events as they transpire but rather swap and integrate the perceptions of events with the memories and symbols that are pre-resident in our minds. It is this 'strange loop' (similar to the mathematical concepts of Gödel's theorem) that allows us to ruminate and in turn induct new symbols and perhaps even create new symbols in light of information received. If this were not the case then insight and invention would be impossible for us. We would simply be mirrors to what we see and react accordingly. This is obviously not the case. But while this innovative twist in cognition plays a critical role in what makes us human, it also introduces something that I term the 'Dark Delta'. Between the actual physical universe and our perceptions of it there is always a potential delta of information, and as you can readily see there is no way to eliminate it. We can only narrow it.

Now we could go so far as to dismiss this as philosophical conjecture. After all, for everyday occurrences this delta is very small. Generally, what we see and what we think we see align fairly well. However, let us consider how perception can be thwarted. First off, one commonly unknown fact is that because of the latency in perception there is an inherent sub-second delta between what we see and what is there. For normal speeds the delta is negligible; going at a speed of 60 miles an hour down the highway, however, our minds perceive us to be 11 feet 'behind' where we actually are. This translates to a critical subtraction of the time available for decision making. The faster you go the more the delta expands, so that if you are piloting a jet going at six hundred miles per hour, your perception is 110 feet off from reality. This is just speed and the latency of perception. Let's now add in interpretation. Going back to our Neolithic ancestors, or even to a modern human in the jungle, a tiger's stripes can readily be perceived to be part of a tall stand of grasses. Lack of a proper match between what you think is there and what is really there could get you killed. This carries forth into our modern world. Recently in New Orleans there was a woman who was approached by a well-dressed and manicured man as she exited a quick mart and got into her car. He held out a five dollar bill and said that she had dropped it in the parking lot on her way to the car. Fortunately, she had not yet put her change back into her purse and was able to quickly see that he was mistaken. She indicated so and went to close the door. The man quickly attempted to prevent her and insisted that she in turn was mistaken. After she managed to get the door closed the man began banging on the window. She quickly pulled out and away from the location. Shaken, she decided to call 911 and report the incident. As a result, she was contacted by the police and called down to the station. Puzzled at being called down for a seemingly odd but non-criminal event, she soon found out that a serial killer had been operating in the area and was somehow gaining access to women in broad daylight and in populated areas. The police were puzzled at how the killer was gaining access. This woman very narrowly missed what could have been a fatal incident. What saved her? It was information. Because she had not put the money back into her purse, she was able to use this informational context to narrow the dark delta. By this narrowing of the delta she was able to arrive at the conclusion that 'something' was not quite right.
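
For reference, the 11-foot and 110-foot figures above correspond to a perceptual latency of roughly 125 milliseconds (an implied value, not a measured one); the arithmetic is simply:

```latex
d = v \, t_p, \qquad
60\ \text{mph} = 88\ \text{ft/s} \;\Rightarrow\; d \approx 88 \times 0.125 \approx 11\ \text{ft}, \qquad
600\ \text{mph} = 880\ \text{ft/s} \;\Rightarrow\; d \approx 110\ \text{ft}
```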

So we can see that the Dark Delta is not just philosophical mumbo-jumbo. It is something that we deal with every waking moment of our lives. (One could argue that during our sleep the delta is significantly widened – perhaps even infinite.) When we move into a cyber-environment, this delta widens considerably. Importantly, it widens not only in the context of perception and interpretation, because of the implied levels of abstraction we have spoken to previously; it also widens because of speed. Not the speed at which the user is traveling, but how fast transactions can occur in relation to the awareness of the user. In short, in a cyber-environment things happen fast and we are not always totally aware of exactly 'what' occurs. As a result there is a whole underground culture and industry that capitalizes on this expanded delta, just as a whole culture and industry grew up around the various levels of abstraction that evolved prior to cyber-commerce.

We can also show that the context of the delta shifts as well. In the case of the woman in New Orleans, the delta was in the perceived 'intentions' of the man. As noted in the previous section, in cyber-space this can extend to the very 'identity' of the man. The man can not only pretend to be nice, he can also pretend to be someone that she knows and trusts. This particular expansion of the delta pushes things into a third critical dimension, the three dimensions being speed (latency), perception of intention, and perception of identity. The combination has fueled a surge of child predators who use the cyber-environment to gain the trust of, and to some degree control over, youth that they would otherwise never gain from direct personal contact.

Information and Context – The light that narrows the delta

As we pointed out above, the fortunate woman in New Orleans was saved from what could have been a fatal incident by information and context. As a result, information and context need to be considered in the overall trust model. At first glance, we could simply classify this as another issue of assurance, and indeed it could be. As we look a little closer however, we can see that the information and context more appropriately serve as a measure of the 'accuracy' of the assurance. This is a key difference. In the case of the woman, the man's appearance provided a sense of assurance that there was little risk to be assumed. This misperception, however, led to the demise of many unfortunate victims. What the added information and context did for the woman was to highlight the fact that somewhere there were inaccuracies in this perception of low risk. Note that she did not know why – but it could be argued that that was not required. The inconsistency was enough to put her on guard and in an alarmed state. You could almost say that the context and information were like a torch or a flashlight that cut through the darkness and highlighted inconsistencies. This highlighted awareness perhaps saved her life.

We can draw the same analogy in the cyber-space environment. Many representations are made in cyber-space. Some are implicit in functions such as IP addressing and name resolution; others are more explicit, such as user identification and passwords. All of them can be manipulated, spoofed and stolen. There are also potential ambiguities as to what is actually on the wire versus what is perceived to be on the wire. Examples of this are Trojan payloads and masked XML data insertions. It is the drawing out of these inconsistencies that provides us insight into potentially nefarious activities such as spoofing, insertion attacks and bot-nets. It should be noted that attacks are often caught by the symptoms of abnormality, not by the event itself. Searching for the attack instance itself, or trying to find the exact event on the wire, is like trying to find a needle in a haystack. This is part of the argument that is used to justify doing away with perimeter security. There is somehow the false impression that once you get authorized access and the appropriate health checks you are good to go for the rest of the time. There is also the false impression that you and your intentions and your machine and its intentions (for lack of a better term) are the same thing. They most certainly are not. You could be honestly accessing your systems and doing your job quite innocently while your machine is mounting attacks and/or running executables to pirate data. It is in highlighting the inconsistencies and abnormalities that we find the best clues to such nefarious activity.

It could also be argued that if you wait for the attack and recognize it at the system that is being attacked, you are too late. This provides a further argument against the total decomposition of the security perimeter down to the server itself. The ubiquitous presence of the Dark Delta further exacerbates this model. The server is, by analogy, equivalent to the woman, and the perimeter security systems to the information and context. The reason for this is that the Dark Delta applies to all entities, not just humans. By removing the perimeter security, the server is left to its own limited perceptions of what is actually going on or coming its way. Also consider that any element of time has been removed; an attack is real-time and imminent. It also needs to be established that there is a dark delta in known signatures for attack and virus recognition, so the server itself may not be able to discern a piece of malicious code or data because it has no context to reference and hence provide a match. Recall our symbol matching ability – as an example, if you've never seen a poisonous snake you are much more likely to identify it incorrectly and perhaps even be willy-nilly in the way you choose to approach it. Such a mistake obviously could be fatal to you. The security perimeter provides an additional perspective and informational context (equivalent to our internal symbol inventory) that can highlight and narrow the Dark Delta considerably. It also provides the obvious role of intermediate remediation of any events, which we typically attribute to such systems. By creating systems and architectures that can provide context and information that can be 'cross referenced' and validated, light can be shone into this Dark Delta and narrow it considerably, removing ambiguities and increasing the accuracy of the perception of 'reality' on the wire. Increasing this revealing light for users could potentially highlight inconsistencies in representation and intention by highlighting unexpected address combinations, network ingress patterns, spoofed system names and addresses, and even whole web sites.

In a very real sense, the same elements that serve to save the Neolithic hunter or the unsuspecting victim in a parking lot are the same elements that serve to protect and 'save' our information systems and infrastructures. A Neolithic hunter is saved by noticing an inconsistency in the textures, colors or shadows within the grass and moving well away beforehand. If he waits to find out whether it is a tiger, he is probably too late. Next generation security architectures also ideally aim to save the systems they protect by noticing and highlighting inconsistencies prior to finding the tiger in the grass firsthand.

 In Summary

As this paper closes on the subject of trust we find a number of parallels and traits that are characteristic of and universal to the paradigm. As the human race moves into the next generation virtual world of cyber-commerce, these parallels will be extended and retrofitted to work in this new environment the same way that they have been retrofitted to monetary commerce and market based economies.

As shown early in the paper, the paradigm of trust has been challenged time and time again by increasing abstraction in the way we humans interact. What was initially a very concrete attribute of a relationship has become increasingly abstract and disjointed, both spatially and temporally, as we move into the 21st century. As this evolution of commerce moved into a more virtual construct, we in turn developed methods of governance to provide assurance that transactions of commerce happened in a predictable fashion and with rules that ensured participants complied. In addition to this element of governance there was an equal need for enforcement, so that the rules of governance were followed and those that violated the 'contract' were dealt with appropriately. This delicate balance has for the most part been maintained to allow for the sophisticated commerce culture that we have today. If one thinks about it, this culture relies on many things that are taken for granted. Once that balance is upset, many of those things fall asunder and a society can fall into severe and potentially fatal upset. We pointed out historical instances where this has occurred and provided insight into how the seed of demise came about. It became apparent that the lack of ability to enforce the mandates of governance led to an overall reduction in the level of trust in the systems of the time. With this reduction of trust, the foundations of commerce began to implode and as a result the society as a whole reached a point of collapse.

I think that it should be apparent by now that trust is something that is inherent to the human condition. At the risk of extending the paper, it could be argued that trust is an integral ingredient of any social animal. Once an animal chooses to become social it 'gives up' certain things so that it can 'gain' others. Usually the gain outweighs the loss. A good example of this is the social evolution of wolves, who give up independence in lieu of certain other benefits such as the superior hunting capability that they are so well known for. Each wolf 'trusts' in the system, and it works. This works the same way with us, but it is made far more complex by the 'strange loop' phenomenon that was mentioned earlier. With humans, as we have seen, it is not so simple. Humans (and certain other primates) have the ability to intentionally deceive others within their social circle. This strategy has been successful over the millennia. This must be so, otherwise all people would be honest. This is obviously not the case.

This subversion of trust required systems of governance to assure proper bounds of behavior within the society and commerce system. Enforcement is therefore a key element of trust; it may be only indirectly tied to trust itself, but it is directly related to the concept of assurance. This in turn shows that while we as social animals may have a magnetic tendency towards groups, we require rules and methods of enforcement to stay together in large groups for any length of time. We can view the modern requirement for network and systems security as an evolutionary result of this 'arms race' between subversion and governance that is as old as society itself.

There are some historical lessons to learn however. The first is that while decomposition and collapsing of the security boundary may seem more cost effective and scalable, it is not a feasible approach, as it removes intermediate systems of defense that may prove to be critical during an attack. Additionally, these systems add layers to the overall defense network as well as a different perspective that the server itself could never have. Rather than decompose and collapse, it makes sense to decompose and distribute security functions without removing critical layers of defense from the infrastructure. Doing so necessarily requires that the server and application policy environment act in an orchestrated and federated fashion with the network when such coordinated services are available, and revert to a simple decomposed model when they are not. In the instances where coordination is not available, more constrained access policies may be put into place to assure that access is limited to the application called and nothing more. This approach can in turn offer the best of both worlds to the mobile user with the varying degrees of trust that are established.

We also discussed the delta that exists between perception and reality as well as how it relates to the concepts of trust and assurance. We went on to illustrate that the ratio of perceived risk to potential reward is the primary determinant in the trust decision process. It was shown that systems could be put into place to provide further assurance or insurance to the user. This in turn can push the level of perceived risk down and further encourage the user to continue with their on-line purchase. The proviso being that the user is secure in the fact that they are dealing with whom they think they are dealing.

This in turn led to the concepts of identity and the important foundational role that it plays in trust. We discussed how the aspect of identity gets fuzzy and rather complicated in the cyber environment, as well as how identity can become smeared across the network by the user logging into different systems. We also discussed the fact that with systems automation we need to consider machines and the services they render in much the same way as we consider humans. Machines and their resident services need to be challenged, authenticated and authorized just as humans are. Systems of governance also need to be put into place to provide the right monitoring capabilities to assure proper behavior within the scope of authorization that has been allowed. Enforcement capabilities also need to be available so that entities that violate the scope of authorization are dealt with appropriately.

The delta between perception and reality was also discussed in both its inherency and its impact. We termed this the ‘Dark Delta’, which in essence represents the inherent aspect of the unknowable within a moment of space and time between what an entity (human or machine) sees or otherwise experiences and what is really there. We discussed the fact that there is always a nominal delta but that in most instances this minute difference is not enough to be of any significance. In instances where the delta widens, there is usually a strong cause for concern because decisions can be made by the entity in question that it might not otherwise make. In many cases, being in a scenario where decisions are made against incorrect or incomplete information can be dangerous. As with the tiger in the grass, it could be fatal.

Clearly, work to reduce the Dark Delta is required in order to establish and maintain a trusting environment that does not have undue risk for the individual extending it. In legacy commerce environments these systems have been in place since the birth of monetary based commerce. Many of these systems have simply been transposed into the ecommerce environment with little or no modification. This failure to evolve paradigms has resulted in a significant widening of the dark delta in ecommerce. This is reflected by the recent downturn in holiday season on line shopping – with fear or concern of identity theft being the number one cited reason.

One of the final premises of this paper is that in some cases further well designed abstraction can in turn complicate things for the would-be thief. Additionally, working towards shortening the length of time and lessening the potential reward of pirating a transaction or its associated data will further reduce the window of opportunity to a level where it is no longer worth the effort to subvert. By this further abstraction, and by creating systems to reduce the dark delta within interactions (this includes all modes of interaction – person to person, person to machine and machine to machine), an environment can be reached where consumers will feel the degree of comfort that they require to move towards an e-commerce paradigm. Many would argue that the fate of the free market commerce system hangs on its success. Whether this is true or not remains to be seen. It is however certain that the aspect of trust is foundational to human societal dynamics and its most recent embodiment in the Internet and e-commerce.

 

Epilogue

In light of the economic downturn of late 2008 it seems prudent to provide an epilogue to the summary and the conclusions that this paper reached. While many of the examples and analogies used in this paper seem rather prophetic, they should not however be considered special in any way. The reason for this is the fact that the basic elements of commerce and society have not changed. They are the same today as they were two thousand years ago. Technology has not served to change any of them. More so, it has served to enhance or inhibit them, but the basic elements have remained the same. Trust in the system requires trust in its governance, which extends to its rule of law and the enforcement of it. Once these systems are eroded, serious consequences are often the result. With the recent events of impropriety and even thievery at unprecedented levels, along with the long list of bailouts for firms that have come to the point they are at by mismanagement and overextension of risk, it is little wonder that trust is in short supply from the perspective of the common man.

It is not an exaggeration to say that the very edifices and foundations of trust in our free market system have been severely shaken. Again, history has shown that at such times the collapsing system of commerce, if not corrected, can result in follow-on collapses in the trust of the systems of society. At these times, governments are often forced to implement martial law and strong centralized government to maintain order by rule of force. President Obama was quite correct when he alluded to the fact that stronger regulation and transparency are key elements in restoring faith in our systems of free commerce as well as our way of life.

As this paper has illustrated, while the basic elements of society and commerce have not changed, the dynamics are strongly affected by technology. On a closing note, history has shown that technology tends to 'grease the skids' for commerce and society. It can serve to accelerate the rebound of such systems after downturn events. The reason for this is that human societies tend to pull inward in response to downturns. After the fall of the Roman Empire, both systems of commerce and society were in ruin. The pulling in of society was severe – perhaps the most severe in the history of mankind. Society and commerce often did not go beyond the walls of the castle or fortress. The pulling in at this time was also of a very long duration – lasting hundreds of years.

Subsequent downturns have not been so severe, and in each instance technology served to allow for quicker and more consistent rebounds of the economy. The reason for this is simple…communication. Each new innovation in the movement of information has served to re-establish the critical links of human communication that are so critical for the re-establishment of trust. It is the opinion of the author that this downturn is no different. As pointed out earlier, in response to a downturn societies turn inwards in the way they operate. Commerce reverts to more local community levels. With the Internet and modern communications, 'local' no longer has to be geographically local but local in the form of context. The World Wide Web has allowed for the growth of communities of interest in which 'local' groups can interact on issues and motives of common interest. As an example, a vendor in North America can do business with a partner based out of Southeast Asia based on the fact that they were roommates in college. Now they are on opposite sides of the globe, but they can leverage the personal relationship that they have just as if it were at a local level. Recent services such as LinkedIn™ and Facebook™ and technology trends such as Cloud Computing and Service Oriented Architectures are good examples of this. On the web, local cyber communities can serve to re-establish on line commerce without requiring full-blown trust in the monolithic world of high finance. By allowing technology to enhance traditional human patterns of interaction, the pulling inwards that accompanies economic downturns can be accommodated without the severing of long distance and cross cultural ties that has typically been the result in the past. For the first time in history the term local is not limited to merely geography. This has had and will continue to have profound impact on human society, systems of commerce and the trust that these systems require in order to exist.

Game Theory dynamics and its impact on the evolution of technology solutions and architectures

June 1, 2009

Introduction

 

Recent work has been done in the study of game theory dynamics and the influence they have had over the millennia on the process of both biological and cultural evolution. The theory holds that the underlying engine which drives evolutionary frameworks of any type is a mathematical abstraction known as non-zero sum dynamics. This statement sets the foundation for this paper. It will endeavor to investigate the premise that technology is an expression of culture, and hence its forms and usages (its packaged solutions and architectures) are prone to the same dynamics as other forms of cultural evolution or even biological evolution.

What this analysis will indicate is that the future trends of technology can, to some degree, be predicted by the use of non-zero sum dynamics, much in the same fashion that future trends in culture can be predicted. The basic premise is that non-zero sum dynamics is the mathematical 'attractor' toward which all evolutionary processes are driven, always to higher orders of 'non-zero sumness' if you will. It is the position of this executive white paper that by allowing for this unavoidable dynamic, some degree of pre-emptive capability can be garnered against the common technology market demands.

This is not ‘Black Magic’ however, nor is it a ‘Silver Bullet’. Consensus on a particular direction of technology can only be gained by knowledge of the technology, its latest facets and the industry and market dynamics that surround it. Once this knowledge is gained however, insight as to how it will evolve can be extrapolated with relative accuracy by predicting its evolution against a non-zero sum dynamic. Additionally, since non-zero sum dynamics lead to and leverage on cooperative behavior, an attractor can be provided which motivates various product and development teams within the organization to work for a set of common architectural and solution goals.

 

 

What is Game Theory?

 

As its name suggests, Game Theory is the mathematical description of game flow and outcome. There are three basic cases to consider: 1) zero sum dynamics, 2) fixed sum dynamics and lastly, 3) non-zero sum dynamics. Simple examples of the first two instances can be provided by the sport of boxing. In this type of 'game' there are two opponents who face each other to 'win'. In order for one player to win, the other player has to lose. At face value, this is zero sum dynamics. Both players start with nothing, but one player ends up with the 'win'. The potential for the win existed prior to the contest however, and it is this potential which provides the zero value.

 

Most boxers however do not box for free. In most instances there is a prize for the winner and a 'payout' for the loser. At face value, this is fixed sum dynamics, of which zero sum dynamics is a part, for 'zero' is indeed a fixed sum. In more sophisticated examples, the payout and winning prize might be in ratio to the performance of each boxer on a round by round basis. This creates a scenario where even if a loser still loses, they can improve their lot and receive a larger portion of the prize by performing better during the contest. The sum is fixed however; there is only so much money allocated for the event, so that as the loser performs better the winner realizes less of the winning prize. This is the epitome of fixed sum dynamics. In essence, winning and losing become a mathematical relationship that almost, but not quite, approaches a dynamic which can encourage cooperative behavior. It does not, because the winner still wants to hold onto as large a portion of the prize as possible while the loser wants to pull a larger payout. In essence, both sides 'benefit' from the contest, albeit one at the expense of the other.

The third case is quite a bit more complex, but can readily be displayed, at least in a rudimentary and limited form, by the game of Monopoly. In this type of game there are a number of players participating in a sort of limited economic system. In instances where there are two players, the game is simply a very sophisticated fixed sum dynamic. As the number of players increases however, the dynamic gets more complicated, with the ability for alliances and coalitions to form where players 'team up' to reap benefits or to dislodge or take advantage of other players. While the maximum potential win is still fixed (this is why it is a rudimentary and limited example), there begin to appear 'emergent' benefits that would not have occurred if the alliances and coalitions had not been formed. This is the essence of non-zero sum dynamics: the ability to create emergent benefits which can either create positive sum gains or avoid negative sum losses.

There are a number of important points to consider before we move on. First, the dividing lines between these dynamics are somewhat soft. While zero sum dynamics is a form of fixed sum dynamics, non-zero sum dynamics is a function of the ratio of available resources to the number of players fending for those resources. The result is that as the ratio spreads out (expressed as available resources to the number of players), non-zero sum dynamics begin to occur. It should be noted at this point that non-zero sum dynamics do not guarantee positive sum gains. Nor do they guarantee the avoidance of negative sum losses. What they do is provide a hedge, or insurance, that increases the probability of these outcomes. Indeed, this is the basic premise of the insurance industry: the spreading of risk among participating entities.

The other perhaps more important point to consider is that in order for non-zero sum dynamics to occur there needs to be two foundational components that can not be avoided. These are the ability for the players to have a common method of communication and the ability for the players to establish trust with each other within the alliance or coalition. If either of these paradigms can not be met then non-zero sum dynamics can not (or at least is very unlikely to) occur.

 

Getting out of Jail

 

There is a textbook example for game theory that demonstrates these concepts well. It is known as 'the Prisoner's Dilemma'. In this example, there are two criminals who are arrested for a given crime. As they are brought into the police station they are of course separated and interrogated. Each prisoner is given a set of similar ultimatums. The police officers tell each suspect that if he confesses to the crime but the other suspect stays quiet, he will be let off free while the other suspect will be put in jail for ten years. Conversely, if he remains quiet and the other suspect confesses, he will go to jail for ten years and the other will be let off free. Going further, each suspect is told that if both end up confessing, each will receive three years in jail. Finally, if neither confesses, they will both end up with only six months in jail.

What results is a matrix of possibilities where the optimal scenario for both suspects is to keep quiet and take the six month sentence. This scenario, however, requires both suspects to trust that the other will keep his mouth shut. What creates the problem is that each suspect has an individually 'selfish' option that appears more optimal, but that option carries a risk and a corresponding factor of mistrust. Obviously, if they had the ability to communicate they would be able to assure one another and reinforce the joint understanding. Since they do not have the ability to communicate, and being criminals they might not be the strongest in the way of trust, it is highly likely that at least one of the suspects will confess in the hope of getting out of jail free. As we established, however, both probably will. If both confess, both will receive a three year sentence, which is a far cry from the six month slap on the wrist they would have received had they kept quiet.
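
A minimal Python sketch of this payoff matrix (written purely for illustration; the sentences are the ones given above) makes the trap explicit: whatever the other suspect does, confessing looks individually better, which is exactly why both suspects end up with three years.

# The payoff matrix described above, in years of jail time (lower is better).
# The first element of each key is suspect A's choice, the second is suspect B's.
SENTENCES = {
    ("quiet",   "quiet"):   (0.5, 0.5),   # both stay quiet: six months each
    ("quiet",   "confess"): (10,  0),     # A stays quiet, B confesses
    ("confess", "quiet"):   (0,   10),    # A confesses, B stays quiet
    ("confess", "confess"): (3,   3),     # both confess: three years each
}

def best_reply(my_options, other_choice):
    """Pick the choice that minimizes my own sentence given the other suspect's choice."""
    return min(my_options, key=lambda mine: SENTENCES[(mine, other_choice)][0])

# Whatever the other suspect does, confessing is the individually 'better' reply...
print(best_reply(("quiet", "confess"), "quiet"))    # -> confess (0 years beats 0.5)
print(best_reply(("quiet", "confess"), "confess"))  # -> confess (3 years beats 10)
# ...so both confess and serve 3 years, even though mutual silence costs only 6 months.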

What the scenario illustrates is that logic alone will not lead to the optimal scenario for both individuals. The only (or the most probable) way that the optimal scenario can occur is for there to be trust and understanding between the two suspects. Furthermore, that understanding needs to be established by communication either prior to or during the scenario. As we covered earlier, the suspects cannot communicate during the scenario, so the understanding and corresponding trust needs to be established beforehand. Without this, the scenario reverts to a fixed sum dynamic.

 

What does this have to do with evolution?

 

At this point, a reasonable question to ask is what all of this has to do with the evolution of technology solutions and architectures. The answer is: quite a lot. However, in order to validate this, we must first establish what it has to do with the general principles of the evolutionary process.

Some of the latest theories on evolution propose that much of it is driven by non-zero sum dynamics. This was made quite clear by a contest held by Robert Axelrod, in which several universities submitted code to play a modified, iterated version of the prisoner's dilemma. The goal was to find the best and most efficient algorithm for playing the modified game. In this modified game, the six month jail sentence became evolutionary success for the code 'species', while all other options became equivalent to various degrees of evolutionary failure.

A particular piece of code named 'tit for tat' won. Tit for tat is based on initial trust (i.e. it does not 'rat' on the first iteration); the algorithm then takes note of its partner's action and acts in kind on the next round of the game. So if the code is met with cheating, it will cheat in turn on the next round.

The game was run over several hundred iterations and, as the game progressed, the tit for tat 'species' became more numerous, eventually forming rudimentary coalitions that cooperated on 'implicit' trust. Importantly, the tit for tat species did this at the expense of the other species in the game. What this illustrates is that not only do non-zero sum dynamics seem to play a strong role in evolutionary dynamics, but zero and fixed sum dynamics do as well.
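
The Python sketch below is a simplified, hedged reconstruction of that kind of round robin (the payoff values and round count are conventional illustrative choices, not Axelrod's exact tournament settings). Tit for tat never 'beats' its partner in any single match, yet mutual cooperation compounds so strongly over repeated rounds that it outscores a pure defector overall.

# Illustrative payoffs: mutual cooperation pays 3 each, mutual defection 1 each,
# and a lone defector tempts 5 against a cooperator's 0.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_history, their_history):
    return "C" if not their_history else their_history[-1]   # trust first, then mirror

def always_defect(my_history, their_history):
    return "D"

def play(strat_a, strat_b, rounds=200):
    """Play two strategies against each other for a fixed number of rounds."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strat_a(hist_a, hist_b), strat_b(hist_b, hist_a)
        pa, pb = PAYOFF[(a, b)]
        hist_a.append(a)
        hist_b.append(b)
        score_a += pa
        score_b += pb
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))      # (600, 600) -- the cooperative 'coalition' outcome
print(play(tit_for_tat, always_defect))    # (199, 204) -- exploited once, then never again
print(play(always_defect, always_defect))  # (200, 200) -- mutual defection pays poorly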

While this is not proof in and of itself, there are several other studies that validate these findings. There are also several studies that allude to a transitional relationship between chaotic determinism (which is viewed as the mathematical engine by which non-animate matter gains higher orders of complexity; examples are star formation and planetary accretion) and game theory dynamics, which would seem to be a higher mathematical order in turn that allows animate matter (life) to further leverage this proto-deterministic foundation. While these tenets are far beyond the scope of this paper, they are important concepts. References are provided at the end of this paper that will allow the reader to investigate further according to their level of interest.

There are a few very important points that need to be brought out before we proceed. First, multiple iterations became a form of pseudo-communication, while history (even a short term history such as a single past iteration) can serve as a form of pseudo-trust. It is interesting to note that by modifying the tit for tat code to have a longer memory (each player kept track of its actions and then advertised them to the other player upon each iteration) the coalitions formed more quickly and grew larger, thereby allowing the tit for tat code to gain quicker dominance over the other code species. The second point to consider is that tit for tat is initially altruistic in the way it behaves and very quickly reverts to altruism when given the indication to do so. All of this points to a degree of success that can be garnered from cooperative behaviors.

The third and perhaps most important point is that direct, explicit communication and trust is not a requirement. Implicit methods will suffice. This is obviously true in that we do not 'know' everyone we do business with, and we do not communicate with or trust everyone on an explicit basis only. If that were the case we would have never moved beyond simple village economics. As an example, if I buy a car from a Japanese auto manufacturer, literally all communications and trust are implicit, with the possible exception of the local car dealer. The car could even be bought online, thereby removing all explicit trust and communication. Hence, it could be said that the whole basis for e-commerce could not occur without an implicit trust model, and all evidence validates this conclusion. This also gives us our first insight into the evolution of technology and solutions: communications and trust (the ability to establish identity and corresponding factors of trust) are of extreme importance.

 

 

 

Now let’s add a Barrel full of Monkeys

 

At this point we will begin to address the aspects of human culture and its evolution. It should be noted that many researchers feel strongly that game theory dynamics is also one of the major components responsible for the evolution of human intelligence. This is actually fairly intuitive if one considers that when groups of primates meet, they are basically faced with two choices: they either attack each other, or they choose to co-exist. In this example the basic premise of game theory dynamics is: 1) zero sum dynamics is leveraged in the attack option and, 2) non-zero sum dynamics is leveraged in the co-existence option, with the emergent benefits being an increased gene pool and better awareness of potential predators.

If we take a look at the higher apes (particularly chimpanzees) we find that the range of dynamics has increased, with the zero sum dynamic being on-going territorial war (with well documented conquering scenarios) and the non-zero sum dynamic moving on from simple coexistence to extended cooperative behavior (say, for instance, banding together to wage that territorial war against another group). From here the dynamic points to the mannerisms of human tribes: when two tribes, or even cultures for that matter, come into contact, they are left with the very same choices as our primate cousins. We either fight or we co-exist, and with humans, co-existence means trading.

As we can see, there is not a wide range of options to choose from. It is also clear that once tribes or cultures come into close contact, ignoring each other is not an option. It is important also to realize that either choice drives emergent benefits and that one choice will often drive the other. As the examples of aggression show, the best way to drive cooperative non-zero sum dynamics is for a set of groups to have a common enemy, which is in turn the fixed sum dynamic. The first and most obvious emergent benefit that derives from allegiances is larger numbers. Larger numbers also mean a greater communication network, and this network is established and reinforced by trust. As a result, these networks tend to be very good conduits for ideas and inventions. These inventions were often of a war-like nature, such as a better bow and arrow, but just as often they were of a peaceful nature, such as a more innovative farming practice.

So it is clear that non-zero sum dynamics is often found in intimate relationship with zero or fixed sum 'drivers'. It is also clear that the ability to share ideas and innovations is the baseline hallmark of culture. As a result, we can safely conclude that game theory dynamics has played a very major role in human cultural evolution. With literally the same mix of dynamics, we move from simple prehistoric economies through chiefdoms and empires to the eventual advent of the modern nation state and its corresponding international economies. Each successive era built on the same two foundations as its predecessors, namely communication and trust, or the lack thereof.

There are a few salient points that we should cover here. The first is that each phase in cultural evolution has its established 'personality' and ability to establish common channels for communication and trust. The second point is that each culture created its own system of governance to maintain these frameworks and provide assurance that individuals were not cheating the system. The third point is that cultures tend to grow and assimilate other cultures with each subsequent phase. Sometimes these assimilations were peaceful, other times not, but in most instances the assimilated cultures benefited from the dynamic. Indeed, history shows that in many instances cultures and peoples willingly surrendered their own sovereignty in exchange for increased security or enhanced prosperity. Finally, as these cultures grew they created larger and more complex non-zero sum dynamics with larger and more complex potential for emergent benefits. In each instance, new and more complex systems of governance became required because the potential for cheating became higher, more problematic and harder to identify. As an example, if a goat goes missing in an isolated village of one hundred people, it is fairly easy to find the perpetrator. Compare this with fraudulent representation in a trading contract with several middlemen along the ancient silk highway between China and Rome and we have quite a different problem. At this point we need to consider that in order to meet these newer requirements, new technologies and methods needed to be created to facilitate the preservation of communication and trust by identifying those who would try to subvert the system. Today we can extrapolate this to e-commerce and real time international trading. Indeed, this is the primary and most solid rationale for a communications company to be involved in Identity and Security Policy Management. They are the 21st century equivalent of the middlemen on the silk highway.

 

Business as usual?

 

            As shown in the previous section, the evolution and development of human technology is directly and inextricably tied to human culture and commerce. This is so much the case that human technology could even be described as an expression of human culture. Indeed, this is the position of many anthropologists and historians. Many, if not most, of the innovations throughout human history have been created or invented either to increase the potential non-zero sum dynamic (i.e. to make more profits) or to preserve its existing scope (i.e. to protect existing profits). If this is indeed the case, then by analyzing the cultural and corresponding business and commercial dynamics in terms of the requirements mentioned in the previous section, we can to some degree gain better insight as to the next directions in technological evolution. In fact, it is the position of this paper that by using this mentality, better insight will be provided with which to actually invent the technologies to meet these ongoing and constantly changing requirements.

            It is also important to consider that in larger technology companies, both external and internal dynamics need to be considered. Up until now, most of the focus has been on the external dynamics, or rather what this all means to a company and its customers working within the system of governance for e-commerce. Consider, though, that there is also a game theory dynamic that occurs within a company itself, and that by paying attention to this dynamic an increase in emergent benefits can in turn be realized.

            In order to qualify these statements we will now take a look at the different forms in which game theory is manifested and why. While this could range into a deep philosophical discussion, we will avoid that by limiting the discourse to the perspectives of a company or enterprise and its business focus, specifically communications technology. It is the position of this paper that a company can vastly optimize both its internal and external processes by tuning them to better realize the non-zero sum dynamic that has shown itself to be a fairly dependable guideline over the millennia of human history.

 

The Prisoner's Dilemma revisited

 

            If we recall the textbook game theory example covered earlier, there is an optimal non-zero sum option that can only be reached through mutual trust and communication. Without this, it is highly likely that the optimal option will not be realized; some lower fixed sum option will likely be selected instead. In the example, we used logical deduction to illustrate that the most probable scenario is that both suspects get three years in jail. Now one might argue that only one of the suspects might confess (note that this is 'cheating' against the system of governance). In that instance, one of the suspects reaches a higher optimal option, but does so at the expense of the other. Now one might ask, what is the problem with that? Clearly, one of the suspects has benefited significantly.

The answer is that yes, this is a great approach and most definitely the best option if the other entity is your competition. However, the approach is flawed and self-negating if the other entity is a partner, a colleague or another department. This is a critical differentiator that needs closer scrutiny, as it largely dictates a successful versus an unsuccessful strategy, realizing that strategies can be competitive, cooperative, external or internal.

When we look at how many telecommunications companies have operated in the past, there were two major divisions, Service Provider and Enterprise. Traditionally, these two divisions were kept largely separate in both agenda and process. As we can see from the prisoner's dilemma, however, isolation is highly unlikely to yield the optimal benefit. Each division is likely to make its own decisions based on its own requirements and motivations. While these decisions will likely benefit each respective division (as we hopefully have very competent individuals and groups making them), it is not likely that the other division will reap those benefits. Worse yet, it is possible that the decision of one division could adversely affect the other.

Recent activities in many Chief Technology Offices have focused on cross-communication and trust between these divisions, which better allows for technology reuse and more optimal benefits from research and development. Two pertinent examples are the recent activities around Identity Management and Video. In Identity Management specifically, the Enterprise IdM strategy will reuse technology out of the Service Provider 3GPP/IMS project, while the 3GPP/IMS project will benefit from the work that has come out of Enterprise in the areas of federated security policy frameworks. It should be noted that these emergent benefits are most often facilitated by establishing clear communication and trust between the divisions, and that only in this way are these types of benefits realized.

            We also need to realize that most often each division is broken into product 'silos' that are largely run as separate P&L organizations. In the past, decisions regarding product and solution directions were made within these silos with little or no communication between them. Obviously, this takes the two dimensional prisoner's dilemma into a multi-faceted 'cut throat' Monopoly game where no coalitions are allowed and every player is in it for their own interests. This is so because each silo formulates its own strategy and then goes to the investment board for funding based on its own interests. Of primary importance here is that this all occurs against a fixed amount that is defined in the budget. What results is a fixed sum dynamic. The optimal scenario cannot, or is highly unlikely to, be realized. This means that the company will not fully realize its investment. Now it could be argued that the investment board could choose which strategy receives the lion's share of the funding, and indeed it does. But this allocation is done at low modularity, somewhat like doing bonsai with a machete. What happens in this instance is that the investment board becomes the equivalent of the 'state of war' and all of the silos become the equivalent of participating nation states. In order to mitigate this, communication and trust need to be established between the different organizations, and a degree of sovereignty needs to be sacrificed in order to realize any emergent non-zero sum benefits.

            What is communication in this example? It is the sharing, and ultimately the joint creation, of strategies that cross-leverage each other. These higher level strategies are then what are presented to the investment board. What is trust in this instance? It is the ability for everyone involved in the strategy to know and trust that everyone will complete their plan of record (their 'tribal' obligation if you will) and thereby trust in the vision that the strategy promises. They will also trust that the 'system of governance' will assure that this takes place and that the appropriate allocations of development funding will be provided to them. Again, it is in the best interest of those in the system of governance to assure this; otherwise trust is lost, which is the essence of their power.

            Again, the recent activities around Identity and Policy Management are defining the strategy at the high architectural level and then creating the cross-silo communication and trust needed to establish concise plans of record with associated phases and timelines. The CTO groups should then work as the system of governance to assure that the levels of communication and trust are maintained as the overall strategy is realized. If this is done, then the mechanics of non-zero sum dynamics take hold and the emergent benefits of the strategy are realized. If the CTO groups fail in this regard, the trust will be lost and all assurance of the strategy along with it. This begs the question of whether a company can really do this. History says yes, provided that the system of governance is maintained.

 

Hey, let’s gang up on them!

 

            Up until now we have been focused on internal processes and issues. At face value, we can make the statement that from an internal perspective everyone should work in a non-zero sum dynamic, whereas from an external perspective the company (the tribe, for all intents and purposes) faces the vast fixed sum dynamic of the industry at large. In essence, all other companies have the potential of becoming enemies or, more properly put, competition (no matter how much we would like, we really cannot throw spears at them). Obviously, it is not as simple as that. The reason is that the industry can be viewed as a vast ecosystem with many niches and broad fields. This means that some companies are less of a threat than others simply because of their areas of focus.

            In these instances companies will often create partnerships. By now, it should seem intuitive what this means. Yes, it means that a non-zero sum dynamic emerges out of the fixed sum dynamic of the industry at large. Through it, both companies should realize additional emergent benefits that would not have occurred otherwise. Obviously, there needs to be a cohesive strategy for the alliance, which requires in-depth communication, and there needs to be established trust that each company will hold up its end of the bargain and will not act in a predacious manner. The problem with partnerships is that it is difficult to put in place a firm system of governance to provide assurance. Therefore, strategies with partners require more energy from a human perspective. There needs to be face time at the highest levels, on a continuing basis, to provide this assurance. It is also critical that ongoing communication is maintained between development teams in a very similar fashion to the internal processes, with joint plans of record and associated phases and timelines. There is one key difference, however, in that there is no sacrifice of sovereignty to the partnership. Here we see a different non-zero sum dynamic, or rather a non-zero sum dynamic that is based on a different system of governance: a mutual system of governance. In this system the parties involved will hold up their end of the bargain as long as everyone else does and (this is very important) the perceived emergent benefits of the strategy are realized. If either condition is not met, the system of governance fails and the partnership fades away or, worse yet, becomes competitive. That can bring its own set of problems, particularly if the partnership has been long and successful, because it means a shared installed base. But as non-zero sum logic dictates, if the system of governance is good and emergent benefits continue to be realized, then there is no reason for the partnership to fail outside of neglect.

            We have rough equivalencies in the political world, with the internal non-zero sum dynamic being something like a national government or legal system and the partnership non-zero sum dynamic being more like a coalition type of arrangement similar to the United Nations or NATO. Again, the main difference between the two is that in the first instance constituents surrender some degree of sovereignty while in the second they do not. It should be noted, though, that some degree of autonomy is always surrendered in order for any partnership to be maintained. It should also be noted, and we have already shown, that the major tenets of game theory and non-zero sum dynamics hold true in both instances.

            In summary, we can conclude that game theory dynamics adequately describes evolutionary frameworks, both biological and cultural. We can also state that practically every human thought and action is in some way associated with these dynamics. As a result, any aspect of culture, even that of directions in technology, can be extrapolated and to some degree preempted by examining it against a game theory context.

 

It’s a Jungle out there

 

            The last dynamic we will speak to is the fixed sum dynamic. Here we will show that not only is it impossible to eliminate fixed sum dynamics, but there are valid reasons why one would not want to do so. The reason is simple: fixed sum dynamics provide the background, or fuel, for non-zero sum dynamics. Put into the context of a company, it is the competition that causes it to innovate.

            History has shown that it is fixed sum dynamics that create the motivation to build non-zero sum dynamic systems. This motivation is always driven against a framework of adversity. Even in the case of peaceful innovations such as farming practices, there is the fixed sum dynamic of the adversity of nature and the fact that, if left to itself, entropy (or, more properly put, competitive elements of order) would drive the environment back into a wild state.

            History also shows that cultures tend to decline after reaching a zenith. Recent findings have shown that the reason is often a decrease, if not a complete cessation, of the non-zero sum dynamic. As the innovations cease, the society ceases to be competitive and will eventually give way to another more dominant, more innovative culture that has a stronger set of non-zero sum dynamics from which to leverage.

            In summary, it is the entity that maintains the non-zero sum dynamic that maintains dominance and perceived invulnerability. As an example, it is often said that some companies create markets. This is a very serious misconception; not even monopolies can claim that. Smart companies preempt markets, and they do it by remaining tuned to the non-zero sum dynamic whether they realize it or not. If one takes the time to analyze any dominant technology company, one will find that this is the case in every instance. There is an ability to sense market trends and dynamics (the fixed sum environment, if you will) and thereby adjust their own non-zero sum dynamics (their strategies) to better align with the optimal attractor (the market direction). The end result is emergent benefits to both the company and the market in question.

 

 

Now let’s put some gas in this sucker

 

            What this all comes down to is providing a win-win situation for those who participate in the non-zero sum dynamic. If this were not the case, then those getting the short end of the stick would remove themselves from the dynamic. If we look at business ecosystems and why they succeed, it is because they drive non-zero sum dynamics along several different vectors. If we take the example of a bank or financial institution, there is the obvious customer dynamic between a communications company and the bank. This would be termed the northbound vector. The bank also has customers of its own, however (this would be termed the southbound vector), and it realizes success not by whether or not it buys from a certain communications company. It succeeds because it is able to better offer its services and satisfy its customers' banking requirements. As long as the bank is able to do this and realize the emergent benefits of increased business or reduced cost of business, then the decision to purchase from a given company is deemed a good one. Note, though, that there are many other elements within the bank that also determine the level of success its customers experience. A good account manager would know these other elements and try as much as possible to influence them toward a positive result. All of this points to a requirement for expertise in the financial sector on the part of that account manager.

            A more complex example is that of IP based television services. In this instance we also have the northbound (the company's customer) and southbound (the customer's customers) business vectors. In addition, we also have a serious set of fixed sum dynamics in the way of competition from competing companies. A wise company will gain inroads into this market by leveraging partnerships against the common fixed sum dynamic of the competition. These partners are hoping to realize emergent benefits as well. By aligning a strategy to a non-zero sum dynamic, a successful business ecosystem can be driven where a company and its partners are better able to compete against the competition. This is a non-zero sum dynamic that is intended to negate or avoid negative sum losses, which would be the competition taking the emerging market. Further, if the right kind of innovation is provided, a partner ecosystem can build a different non-zero sum vector that in turn provides emergent benefits to the provider customers and finally to their customers, the actual subscribers. By creating solid non-zero sum dynamic vectors, a business ecosystem can flourish and grow. Furthermore, in order to continue along with a successful strategy, the dynamics of the ecosystem need to be continually tuned and optimized to market requirements and demand (which provide the role of the attractor for the system).

            If we take things down to their most basic level, people (who are the market) want five things. First, they want to be able to provide for themselves and their families; second, they want to be able to acquire things; third, they want to be provided with security to protect themselves, their loved ones and the things they acquire. Fourth, provided that the other requirements are met, they want to be entertained in their idle time. Lastly, there is the nebulous aspect of self-fulfillment and worth that every individual needs to feel. It could be argued that this last need is met by succeeding in the other four. This is simplistic in that it ignores the spiritual dimension of life. However, since we are purposely avoiding any issues of abstract philosophy we will set this aside, noting only that it can often be a very strong need or motivating driver.

By looking at these requirements, we can more easily begin to tease out the areas where technology can better address these needs. A good emergent example, driven by the strong fixed sum dynamic of the recent rise in international terrorism, is the area of civil communications infrastructures. Additional threats from the environment, such as the 2004 tsunami and Katrina, all point to the need to provide increased security for the populace.

This has created a very strong non-zero sum dynamic in the market, where enhanced communication and coordination services for emergency response teams will be a big market and a lot of dollars will be spent to meet these requirements. We need to remember that there are other dimensions of non-zero sum dynamics as well. As an example, look at the amount of negative press that the Bush administration and FEMA received over the bungling of the emergency response to New Orleans. As a result of the imbalance in the political non-zero sum dynamic, the federal agencies in question are making major enhancements to their infrastructures and communications systems. In essence, they are reacting to a fixed sum dynamic.

            Again, companies are preempting this market by aligning with some of the major integrators in the industry to put together a civil communications practice to address these new and emerging market demands. Again, we have internal as well as partner non-zero sum vectors, as well as the customer vectors, which would be the agencies themselves. These vectors define the business ecosystem. From there the business vectors end, or rather translate into the political non-zero sum vectors. In essence, if companies succeed in the market with a civil communications practice, then their partners in turn succeed. The agencies, in turn, are better able to respond to any new threats or emergencies. This in turn saves lives, and this in turn creates a positive political non-zero sum dynamic. People feel safer because they believe that they are safer, based on the improvements that they see when the next emergency situation transpires.

            I do not mean to belittle people's lives by tying them into the political dynamics, but unfortunately this is often the reality. No one wants to see human lives lost, but the reality is that this will often be tolerated or ignored if the political costs are not that high. This again puts us into a different set of game theory dynamics that is outside the scope of this paper, but one that can be readily extrapolated from the tenets established thus far.

 

 

In Summary

             

It has been the intention of this paper to show that game theory dynamics is an effective method for representing evolutionary dynamics. It has also been the intention to show that almost every aspect of human behavior is in some way associated with these dynamics. These are insights whose import cannot be overstated.

By using these concepts we have shown that we can to some degree predict the requirements of the industry by looking at the associated dynamics of society. We have also shown that technologies, no matter how high or abstract, are subject to these dynamics.

It has also been the position of this paper that these dynamics come into play within a company's business processes, its partnerships, and how it addresses its competition. Indeed, game theory dynamics is about as close to a crystal ball as you can get. No, it's not magic, but it's pretty damn close.

 

Bibliography – Books for further reading

 

Nonzero – The Logic of Human Destiny

Robert Wright, Vintage Books. ISBN 0-679-75894-1

 

In the Wake of Chaos

Stephen H. Kellert, University of Chicago Press. ISBN 0-226-42976-8

 

Deep Simplicity

John Gribbin, Random House. ISBN 1-4000-6256-X