^{1}

^{1}

There are several security metrics developed to protect the computer networks. In general, common security metrics focus on qualitative and subjective aspects of networks lacking formal statistical models. In the present study, we propose a stochastic model to quantify the risk associated with the overall network using Markovian process in conjunction with Common Vulnerability Scoring System (CVSS) framework. The model we developed uses host access graph to represent the network environment. Utilizing the developed model, one can filter the large amount of information available by making a priority list of vulnerable nodes existing in the network. Once a priority list is prepared, network administrators can make software patch decisions. Gaining in depth understanding of the risk and priority level of each host helps individuals to implement decisions like deployment of security products and to design network topologies.

Computer networks are undoubtedly vulnerable no matter what level of hard ware, software or a combination of both types of security parameters are incorporated. As long as the network servers provide services on different host servers, they depend on the server software that may have security holes which makes them susceptible to malicious attacks. To detect and/or prevent the network accessible resources from suspicious attacks, various commercial Intrusion Detection Systems (IDSs) [

In order to evaluate the security risk of a large scale enterprise, an administrator must consider not only single vulnerability exploit but also the multi-stage and the multi-host vulnerability attack used by the attackers. To incorporate this fact, an attack graph is built to find out the logical relationship between multiple exploits. However, when size and complexity of the network increases, two major problems occur. First, the attack graph grows exponentially when the size of the network and algorithm complexity increase. Secondly, comprehending the information conveyed by the graph becomes difficult. Therefore, the attack graph that addresses the issues mentioned earlier were chosen and we will explain further in the next section.

Very little has been done in scientific and research community to develop statistical model that quantify the overall network security risk. Most of the work focuses on qualitative and subjective aspect of networks without having formal statistical model. To get rid of this problem, we introduce the statistical model that uses Markov chains in conjunction with CVSS framework metrics to analyze risks associated with structures of various networks. The model can be used to identify critical nodes in the host access graph where attackers may be most likely to focus. Based on that information, a network administrator can make appropriate, prioritized decisions for system patching. Further, a flexible risk ranking technique is described, where the decisions made by an attacker can be adjusted using a bias factor. The model can be generalized for use with complicated network environments.

In recent studies, S.M. Rajasooriya [

In the present study, we are proposing a stochastic model for the security risk evaluation for the entire network based on the Exploitability sub-score and Impact sub-score. We are considering a realistic network topology having three host servers and each host consists of one vulnerability. Based on the network architecture and given firewall rules, a host access graph is constructed. From the host access graph one state transition probability matrix is computed by utilizing Exploitability sub-score and Impact sub-score. By using the Markovian random walk, we can prioritize the risk associated with each node via ranking.

Finally, summing up the risk associated with all the nodes presents in the network, we determine the overall network security risk. This quantitative value can be taken as a security metric to determine the risk of an entire network. Finally, the schematic network topology in our study represents a typical security system that is in operation. Thus, our proposed statistical model and methodology can be applied to a specific security system that is in place for a given company.

In this section, we have defined some of the basic terminology related with cyber security. We also explain the basic idea of the Markov chain process that is implemented to develop the stochastic model to achieve our objective.

A vulnerability is a flaw that exists in computer resources or control that can be exploited by one or more threats. A software vulnerability [

Attackers usually penetrate any type of computer network via a chain of exploits where each exploit in the chain creates the foundation for upcoming exploits. A combination of such exploits make the chain called attack path; a collection of such attack paths develop the attack graph. An attack graph is a succinct representation of all paths through a system that ends in a state where an intruder has successfully achieved its goal [

graph when a number of nodes and complexity of the network increase. As the scalability and complexity of the network increase exponentially, the computation cost to create the attack graph increases. As a result, it is difficult to interpret the attack graph precisely. On the other hand, most of the attack graphs are designed for a single target, and cannot be used to evaluate the overall security of the networks with several targets. To address these striking problems Anming Xie, Zhuhua Cai, Cong Tang, Jianbin Hu, and Zhong Chen [

CVSS [

A Markov chain is regarded as one of the best modeling techniques that has been used effectively in various fields such as reliability analysis [

Mathematically, a Markov chain can be defined as a discrete stochastic process [

The Markovian properties reveal the fact that the transitions between states are memoryless; transitioning to the next step depends only on the current state and not on any of the other previous states. We can correlate this property with the attacker’s behavior in a sense that an attacker needs to exploit several nodes before reaching the goal node. When the attacker starts attacking an immediate node to reach the goal node, there are many nodes available before reaching the goal node called intermediate node. When an attacker reaches any intermediate node, there is no memory of previous node. The attacker launches further attacks until the goal node is found. To advance the attack, the attacker should move from one intermediate node to another/several intermediate node/s. In the present study, we have assumed that selection of the best intermediate node depends on three parameters, namely Exploitability sub-score, Impact sub-score, and an individual skill of the attacker called Bias factor.

Without loss of generality, transition states are independent of time. Mathematically, there exists some transition probability matrix, P(x, y) such that

We can create a new set of states S ´ [n], having a different set of states associated with each timestep. In the present study, P(x,y) represents the transition probability matrix. To simulate the Markov chain, a stochastic transition probability matrix P and the initial probability distribution is required. In the present study, initial risk associated with each nodes in the host access graph is considered as initial probability distribution which will be explained further in Section 4. Once we have the stochastic matrix P and the initial risk, then utilizing the basic properties of Markovian process, we can determine the risk of the entire network.

The schematic network given below,

For simplicity limited numbers of nodes are present in our network illustration and we have developed a host access graph manually. However, as the size and complexity of the network increase, we can use any kind of attack graph generation tools [

severity scores for each vulnerability. The score is computed based on the standard expressions. The standard expression depends on several matrices that provide a quantitative score to approximate ease and impact of exploit. In the present study, we have applied both scores to determine whether it is beneficial to move from one node to another node from the attacker’s perspective. These two scores, that is, Exploitability sub-score and Impact sub-score, are combined to provide the basis of assigning the edges of attack graphs to represent the values of the probability distribution. This probability represents the possibility of a vulnerability to be exploited by an attacker. While implementing our stochastic model, the behavior of the attacker is another concern. In this study, we assume that the attacker will choose the vulnerability that maximizes the chances of succeeding in compromising the goal state. Due to any reason, if the attacker terminates attacking, then the attacker will move to the initial state. Finally, utilizing the properties of Markov chain, the risk of the individual node is computed. Nodes are prioritized based on computed risk. Then, we sum the risks of all the nodes that will give us the total security risk present in the network.

The central component of the proposed stochastic model exclusively depends on the host access graph mentioned in the previous section. Before delving into the modeling approach, let us formally explain the host access graph as shown below by

In

Once the host access graph is constructed, then our basic foundation is developed for further analysis. To make this graph more applicable and realistic, we have modified it by adding one additional dummy node to represent the attacker. The attacker starts exploiting the immediate node by gaining a high level of privileges. In reality, even if an attacker is equipped with sophisticated tools and a high level of experience, there is no guarantee that he/she will reach the goal node. This may happen due to reaching a level of difficulty or being discovered by an intrusion response team or any type of unusual circumstance. Whenever the attacker stops launching attacks at any point due to any reason, then he/she goes back to the initial state from where the attacking began. To incorporate this attack scenario, a dummy node A is introduced. For any node

In our proposed model, the attacker starts attacking the immediate node and keeps on launching attacks until it reaches the goal node. One big question that arises here is “what happens if the attacker is encountered with the multiple nodes to reach the goal node and on what basis the attacker decides to select the best node from the available alternatives.” We have assumed that the attacker’s decision solely depends on two parameters. The first parameter is Exploitability; it is all about the level of complexity involved to attack the vulnerable node. The second parameter deals with Impact factor, which means how much impact can an attacker make when a vulnerable node is exploited. CVSS provides numerical scores scale of (0,10) where 0 signifies the most secure and 10 signifies the least secure of the mentioned parameters. These two parameters are conceptually expressed by,

In Equation (1), we have coined the new term ExploitabilityBenefit. It is defined as the function of Exploitability and Impact factor. Using these values an attacker determines the level of benefit to change from one to another node. To clarify this idea, let us take any two nodes from the host access graph as shown

below, by

In

In the above Equation (2),

Each element of the adjacency matrix is computed using Equation (2).

Diagonal values of the adjacency matrix are all zero because no cost is involved to move from the current node to the node itself. Elements of the matrix A are not normalized, thus, the non normalized values are converted into probabilities using Equation (3). This equation reveals the fact that in each step the attacker goes from node j to k with probability given by

Writing Equation (3) in matrix form we have,

where, A is the weighted adjacency matrix. P is the transition matrix that provides the transition probability that the attacker moves from one state to another state and D is the diagonal matrix computed using Equation (5) below,

Finally, we have constructed the transition matrix (using Equation (4)) representing transitions probability that an attacker moves from one state to another state, that is, from state j to state k.

Consider an attacker starts attacking from the initial node to the goal node. The attacker must obtain a user level or root level of privilege on the intermediate node to advance the attack further to reach to the goal node. In reality the attacker should try to obtain the highest level of privilege. Host access graphs are created based on the philosophy of gaining high level of privilege. Nodes of the host access graph are treated as OR nodes, which can be satisfied if any of the child node is true. The risk analysis is based on the relative rank value for every node of the host access graph. R is the risk vector and its initial risk value is computed based on the number of hosts present in the host access graph. Suppose there exist N nodes in the host access graph; then simply set all the node ranks equal to 1/N. This initial risk is first injected by the starting node of an attacker. This risk value flows level by level until convergence. The complete risk ranking algorithm is described by the schematic diagram given below by

The risk value of

Suppose,

j. Equation (6) is further extended to Equation (7) as shown below. The risk values are normalized, where

The value of R in Equation (7) is recursive and must be iteratively calculated until convergence, which is expressed by Equation 8, that is,

The attacking process is based on the Markovian random walk, that is, an essential condition for the iterative computation to converge [

To validate our proposed stochastic model, we have modified the network scenario [

The firewall rules are created to filter inbound and outbound traffic. A summary of firewall rules of the network scenario are shown in

Source | Destination | Service | Action |
---|---|---|---|

All | WS | http | Allow |

All | WS | ftp | Allow |

All | FS | ftp | Allow |

WS | BEDS | oracle | Allow |

FS | BEDS | ftp | Allow |

All | All | All | Deny |

We have assumed that each of the target hosts consists of a single vulnerability. The attacker utilizes the vulnerability score to compromise the host. These are shown below by

Based on the experimental topology with its firewall rules and vulnerability associated with a respective host, we have generated a host access graph as shown below by

Note that the diagonal elements of the above weighted adjacency matrix are all zero in a sense that practically no cost is involved to move from the current node to the same node. For the sake of simplicity, we have assumed the value of

Host | Vulnerability | CVE-ID | Score | Impact Sub-score | Exploitability Sub-score |
---|---|---|---|---|---|

WS | Apache Chunked Code | CVE-2002-0392 | 7.5 | 6.4 | 10 |

FS | Wuftpd Sockprintf | CVE-2003-1327 | 9.3 | 10 | 8.6 |

BEDS | Oracle Tns listener | CVE-2012-1675 | 7.5 | 6.4 | 10 |

Node | Risk |
---|---|

M_{0} | 0.261 |

M_{1} | 0.245 |

M_{2} | 0.262 |

M_{3} | 0.231 |

adjacency matrix are calculated using Equation (2). For example, the entry of the first row and second column is (0.5 ´ 10 + 0.5 ´ 6.4) = 8.2. This is the weighted value that the attacker uses to move from the node

Once we have the weighted adjacency matrix A, then we need to convert its elements into respective probabilities; thus, it requires constructing a diagonal matrix. The entries of the main diagonal are obtained by using Equation (5) as shown below.

An element of the first row and the first column of the diagonal matrix is 1/(8.2 + 9.3) = 0.05714. The same idea is used to compute the rest of all the elements. Utilizing weighted adjacency matrix and the diagonal matrix computed as shown above, we have obtained a transition matrix P via Equation (4), that is,

Note that the extents of the first row second column is, 0.46857. It is the transition probability of the attacker moving from node

In

From the numerical value, we can conclude that node

In this study, we have developed a stochastic model for cybersecurity using host access graph to determine the overall network security risk. Our model uses Markov chains in conjunction with CVSS framework to comprehend and analyze the risk associated with the structure of the network. This developed model determines the critical nodes existing in the host access attack graph where the attacker is most likely to visit. Based on this information, a network administrator can make the appropriate decision about system patching with priorities. The risk ranking algorithm that we implemented is very flexible in a sense that we can model the attacker in terms of skills and expertise by changing the Bias factor

applicable to a specific security system that is in existence or it is required by a given enterprise.

For future study, we plan to extend the model by incorporating other factors such as zero day vulnerability, propagation distance between the nodes and topological information in order to create a more comprehensive and integrated approach to evaluate overall security risk.

Pokhrel, N.R. and Tsokos C.P. (2017) Cybersecurity: A Stochastic Predictive Model to Determine Overall Network Security Risk Using Markovian Process. Journal of Information Security, 8, 91-105. https://doi.org/10.4236/jis.2017.82007