3.1.1. Definition of the Model
Definition 1. Network graph model
The network graph model for penetration test (NMPT) is a network graph model suited to describing the penetration test process. NMPT is constructed by extending the attributes of edges and nodes of the traditional network graph model. NMPT is defined as a 4-tuple with the following elements. V is the set of nodes. E is the set of edges. The third element is the connection relation between nodes, expressed in the form of a matrix. H represents the network hierarchy, indicating the location and hierarchical relationship of nodes in the network topology.
Definition 2. Nodes in NMPT
For each node in the node set V of the network graph model, the node attributes are defined, according to the network security attributes related to the penetration test, as the 3-tuple <Vid, Vtype, Vattr>, where Vid is the identification of a node, Vtype indicates the type of the node and Vattr indicates the attributes of the node. Because modern networks are hierarchical, heterogeneous and complex, we divide Vid into two parts, mandatory information and custom information, according to the usage scenario. Therefore, the node identifier Vid is defined as: Vid = <MInfo, CInfo>.
MInfo is the mandatory information segment, including the node name, serial number, etc., as the basic information that must be provided to relevant personnel. The serial number is the globally unique identifier of the node. CInfo indicates the user-defined information segment. The user-defined information segment meets the requirements of different network scenarios and serves as a supplement for identification information.
Social engineering methods are widely used in penetration tests because they pose a high threat and offer a high yield. Therefore, we consider persons and their social relations in the penetration test process. The node type is not limited to the host node type; it also includes person nodes, so Vtype takes one of two values: the host node type or the person node type.
Because host nodes have multiple meanings and are often mobile and virtual, the host node type can be further represented as <Vmob, Vvrm>. The person node type is classified into administrators and common users based on their operation rights: {Administrator, User}.
The penetration test process involves many attributes related to network security. People, as a key factor in social networks and social engineering, should be included in the network model for a penetration test.
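The node definition above can be sketched as a small data structure. This is only an illustrative reading, not the authors' implementation: the class and field names are hypothetical, `<Vmob, Vvrm>` is assumed to be a pair of boolean flags, and the serial numbers are made-up values.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Union


class PersonType(Enum):
    """Person node types, split by operation rights."""
    ADMINISTRATOR = "Administrator"
    USER = "User"


@dataclass(frozen=True)
class HostType:
    """Host node type; assumed reading of <Vmob, Vvrm> as two flags."""
    mobile: bool = False
    virtual: bool = False


@dataclass
class NodeId:
    """Vid = <MInfo, CInfo>."""
    name: str                 # MInfo: node name
    serial: str               # MInfo: globally unique serial number
    custom: Dict[str, str] = field(default_factory=dict)  # CInfo


@dataclass
class Node:
    """A node as the 3-tuple <Vid, Vtype, Vattr>."""
    vid: NodeId
    vtype: Union[HostType, PersonType]
    vattr: object = None      # attribute record, left open in this sketch

    def is_person(self) -> bool:
        return isinstance(self.vtype, PersonType)


# Hypothetical example values.
host1 = Node(NodeId("HOST1", "h-0001", {"site": "lab"}), HostType(virtual=True))
tom = Node(NodeId("Tom", "p-0001"), PersonType.ADMINISTRATOR)
```

Keeping the custom segment as a free-form mapping reflects the idea that CInfo varies between network scenarios, while the mandatory segment stays fixed.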
a. Attributes of host-type nodes
Attributes of the host include service, operating system version, vulnerabilities, the value of the host, permission level, current running status and host assets, which are represented by the 7-tuple <Service, OSV, Vuln, Val, CPL, CRS, Prop>:
Service indicates the open service information of the host. OSV indicates the operating system version information of the host, for example Windows, Linux or CentOS. Vuln refers to the vulnerability of the host. It consists of an identifier and vulnerability information and is expressed as Vuln = <ID, Info>. ID is the identifier of the vulnerability, representing a specific host vulnerability. Info represents the vulnerability information, including vulnerability type, vulnerability description, exploitation effect, success probability and exploitation cost, and is expressed as Info = <Type, Desc, Eff, Prob, Cost>, where Type is the type of vulnerability, including remote and local vulnerabilities. Desc is a description of the vulnerability, used to expand the vulnerability explanation. Eff refers to the effect of exploiting the vulnerability, such as the host permissions obtained and the exposure of new potentially reachable nodes and related connection credentials. Prob represents the probability that exploiting the vulnerability succeeds. The CVSS vulnerability scoring standard is used as the criterion to evaluate this probability. Cost refers to the resources consumed by exploiting the vulnerability. The Cost setting is related to the difficulty of exploiting the vulnerability and the amount of resources consumed by the penetration tester.
Val indicates the value of the host. The larger the value, the more key sensitive information the host contains and the more worthwhile it is for penetration testers to attack and for defenders to defend. CPL indicates the attacker's permission level on the host, which is expressed as CPL = {NoAccess, LocalUser, Admin, System}. CRS indicates the running status of the current host, and it is expressed as CRS = {Running, Down}, where Running and Down indicate the running status and downtime status, respectively. Prop is the network asset owned by the host, which is expressed as Prop = <Data, File, Cred, Link>, where Data represents the sensitive data contained in the host, File represents the file directory contained in the host, Cred represents the connecting credentials contained in the host and Link represents the links and pathways pointed by the host to other host nodes and key directories. Combined with the above definition of host node attributes, the host node attribute information in the enterprise network identified as HOST1 is expressed in Figure 3.
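As a rough illustration of the host attribute tuple described above, the following sketch encodes it with dataclasses. The field types, the default values for CPL and CRS, the vulnerability identifiers and all numbers are assumptions made for the example; they are not taken from Figure 3.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class CPL(Enum):
    NO_ACCESS = "NoAccess"
    LOCAL_USER = "LocalUser"
    ADMIN = "Admin"
    SYSTEM = "System"


class CRS(Enum):
    RUNNING = "Running"
    DOWN = "Down"


@dataclass
class VulnInfo:
    """Info = <Type, Desc, Eff, Prob, Cost>."""
    kind: str      # Type: "remote" or "local"
    desc: str
    eff: str       # effect of a successful exploit
    prob: float    # success probability (CVSS-derived in the paper)
    cost: float


@dataclass
class Vuln:
    """Vuln = <ID, Info>."""
    ident: str
    info: VulnInfo


@dataclass
class Prop:
    """Prop = <Data, File, Cred, Link>."""
    data: List[str] = field(default_factory=list)
    file: List[str] = field(default_factory=list)
    cred: List[str] = field(default_factory=list)
    link: List[str] = field(default_factory=list)


@dataclass
class HostAttr:
    """<Service, OSV, Vuln, Val, CPL, CRS, Prop>."""
    service: List[str]
    osv: str
    vuln: List[Vuln]
    val: float
    cpl: CPL = CPL.NO_ACCESS   # assumed starting permission level
    crs: CRS = CRS.RUNNING     # assumed starting status
    prop: Prop = field(default_factory=Prop)

    def remote_vulns(self) -> List[str]:
        """IDs of remote vulnerabilities; none if the host is down."""
        if self.crs is not CRS.RUNNING:
            return []
        return [v.ident for v in self.vuln if v.info.kind == "remote"]


# Hypothetical values for illustration only.
host1_attr = HostAttr(
    service=["ssh", "http"],
    osv="Linux",
    vuln=[
        Vuln("VULN-A", VulnInfo("remote", "example remote flaw", "Admin", 0.6, 2.0)),
        Vuln("VULN-B", VulnInfo("local", "example local flaw", "System", 0.9, 1.0)),
    ],
    val=10.0,
)
```

The `remote_vulns` helper is only one possible query; it shows how Type and CRS together determine what a tester could attempt from outside the host.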
b. Attributes of person-type nodes
Vattr can be used to describe the key information of nodes of the person type in social networks and target networks, including personal information and personal assets, which is expressed as <Info, Prop>. Info is the description and expansion of the relevant information of the person node. Prop is a description of the available assets that a person holds and is expressed as Prop = <Info, Cred>. Within Prop, Info refers to the social information pointing to other person nodes. Cred indicates the connection credentials the person holds that can be used to log in to the target host. Combined with the above definition of person node attributes, the person node attribute information in the social network identified as Tom is expressed in Figure 4.
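The person attribute tuple can be sketched in the same way. The representation below is an assumption: social information is modelled as a list of other person identifiers and credentials as a mapping from host identifier to credential string. The contact name and credential value are invented for the example and do not come from Figure 4.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PersonProp:
    """Prop = <Info, Cred> for a person node."""
    info: List[str] = field(default_factory=list)      # links to other persons
    cred: Dict[str, str] = field(default_factory=dict)  # host id -> credential


@dataclass
class PersonAttr:
    """Person attributes <Info, Prop>."""
    info: str
    prop: PersonProp = field(default_factory=PersonProp)

    def login_targets(self) -> List[str]:
        """Hosts this person holds a connection credential for."""
        return sorted(self.prop.cred)


# Hypothetical values for illustration only.
tom_attr = PersonAttr(
    info="example description of Tom",
    prop=PersonProp(info=["Alice"], cred={"HOST1": "example-credential"}),
)
```

Separating the descriptive Info from the asset-level Info inside Prop mirrors the two distinct uses of the name in the definition.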
Definition 3. Edges in NMPT
For each edge in the edge set E of the network graph model, the edge attributes are defined, according to the network security attributes related to the penetration test, as the 3-tuple <Eid, Etype, Ecap>. Eid is the identifier of the edge. Etype is the type of the edge. Ecap is the capability of the edge. The identifier of an edge is globally unique in the network. Unlike the Vid of a node, the Eid of an edge is often associated with the nodes.
Because nodes are of two types, host nodes and person nodes, and the edges between them have different meanings, edges are divided into three types, expressed as Etype = {HH, PP, PH}.
HH represents the edge between hosts. PP represents the edge between persons. PH represents the edge between a person and a host.
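The three edge types follow directly from the types of the two endpoints. A minimal sketch of that mapping is given below; the function names and the `src->dst` identifier scheme are hypothetical, chosen only to show one way an Eid could be tied to its endpoint nodes.

```python
from enum import Enum


class EdgeType(Enum):
    HH = "HH"  # host to host
    PP = "PP"  # person to person
    PH = "PH"  # person to host


def edge_type(a_is_person: bool, b_is_person: bool) -> EdgeType:
    """Derive Etype from the types of the two endpoint nodes."""
    if a_is_person and b_is_person:
        return EdgeType.PP
    if not a_is_person and not b_is_person:
        return EdgeType.HH
    return EdgeType.PH


def edge_id(src: str, dst: str) -> str:
    """Hypothetical Eid scheme built from the endpoint identifiers."""
    return f"{src}->{dst}"
```

For example, `edge_type(True, False)` yields `EdgeType.PH` regardless of argument order, since a mixed pair always denotes a person-host edge.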
Definition 4. Connections in NMPT
For the network graph model, the connection relation is an important part of representing the reachability between nodes and the relationship between people. Three connection relations are defined in this paper. The first is the connection relation between host nodes. It is expressed in the form of an M × M adjacency matrix, where M is the number of host nodes. In each element of the matrix, 0 represents the unconnectedness between hosts in the network and 1 represents the connectivity between hosts in the network. P represents the firewall filtering policy included by hosts based on the network connectivity.
The second connection relation represents the social relationship between person nodes, expressed in the form of an adjacency matrix whose order is the number of person nodes. In each element of the matrix, the value 0 indicates that there is no social relationship between person nodes and 1 indicates that there is a social relationship between person nodes. Social relationships between persons are equivalent; that is, the social relationship between person A and person B is equal to the social relationship between person B and person A, so this matrix is symmetric.
The third connection relation represents the ownership relationship between persons and hosts in the form of an adjacency matrix whose dimensions are given by m, the number of host nodes, and n, the number of person nodes. In each element of the matrix, the value 0 indicates that there is no ownership relationship between the person node and the host node, and 1 indicates that there is an ownership relationship between the person node and the host node. The relationship between the social network composed of person nodes and the network structure composed of host nodes is shown in Figure 5. In Figure 5, each person node in the upper network might have the control authority or sensitive information of one or more hosts while forming a social network with each other. Host nodes in the underlying network not only constitute the network topology connection but also contain security-related attributes and assets. The dotted green line indicates the ownership relationship between the person and the host; that is, the person has the control rights of the host.
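To make the three connection relations concrete, the sketch below stores each as a nested list of 0/1 entries. Everything here is illustrative: the host and person names beyond HOST1 and Tom are invented, the person-host matrix is assumed to have one row per person and one column per host, and the firewall policy P is left out because its encoding is not specified in the text.

```python
from typing import List

Matrix = List[List[int]]

hosts = ["HOST1", "HOST2", "HOST3"]   # hypothetical host set
persons = ["Tom", "Alice"]            # "Alice" is a made-up second person

# Host-host connectivity (1 = connected, 0 = not connected).
hh: Matrix = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

# Person-person social relationship (symmetric by definition).
pp: Matrix = [
    [0, 1],
    [1, 0],
]

# Person-host ownership; assumed orientation: rows = persons, columns = hosts.
ph: Matrix = [
    [1, 0, 0],
    [0, 1, 1],
]


def is_symmetric(m: Matrix) -> bool:
    """Check that m[i][j] == m[j][i] for a square matrix."""
    n = len(m)
    return all(m[i][j] == m[j][i] for i in range(n) for j in range(n))


def owned_hosts(person: str) -> List[str]:
    """Hosts for which the person has an ownership relationship."""
    row = ph[persons.index(person)]
    return [h for h, flag in zip(hosts, row) if flag == 1]


def neighbours(host: str) -> List[str]:
    """Hosts directly connected to the given host."""
    row = hh[hosts.index(host)]
    return [h for h, flag in zip(hosts, row) if flag == 1]
```

With these sample matrices, `owned_hosts("Alice")` returns `["HOST2", "HOST3"]` and `neighbours("HOST2")` returns `["HOST1", "HOST3"]`, which is the kind of lookup a path-planning step would chain together.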
Definition 5. Hierarchy in NMPT
The network hierarchy is used to represent the network area where the host nodes reside in the network topology. It is expressed in terms of three elements. SN indicates the maximum number of subnets on the network. H indicates the maximum number of hosts in a subnet. Π is the network structure concretization of all host nodes in the network model. Each element of Π consists of three items: Vid is the unique global identifier of a host node, SN is the number of a subnet in the network hierarchy and H is the number of a host node in a subnet.
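One possible encoding of the hierarchy is sketched below, with Π stored as a list of (Vid, subnet number, host number) triples. The class and method names are hypothetical, and the choice of 1-based numbering and the uniqueness check on (subnet, host) slots are assumptions added for the example rather than rules stated in the definition.

```python
from dataclasses import dataclass
from typing import List, Tuple

Placement = Tuple[str, int, int]  # (Vid, subnet number, host number)


@dataclass
class Hierarchy:
    max_subnets: int              # SN: maximum number of subnets
    max_hosts: int                # H: maximum number of hosts per subnet
    placements: List[Placement]   # Π: position of every host node

    def validate(self) -> None:
        """Raise ValueError if a placement is out of range or duplicated."""
        seen = set()
        for vid, sn, h in self.placements:
            # Assumption: subnets and hosts are numbered from 1.
            if not 1 <= sn <= self.max_subnets:
                raise ValueError(f"{vid}: subnet {sn} out of range")
            if not 1 <= h <= self.max_hosts:
                raise ValueError(f"{vid}: host number {h} out of range")
            if (sn, h) in seen:
                raise ValueError(f"{vid}: slot ({sn}, {h}) already used")
            seen.add((sn, h))

    def subnet_of(self, vid: str) -> int:
        """Return the subnet number of the host with the given Vid."""
        for v, sn, _ in self.placements:
            if v == vid:
                return sn
        raise KeyError(vid)


# Hypothetical layout: two subnets with up to three hosts each.
layout = Hierarchy(2, 3, [("HOST1", 1, 1), ("HOST2", 1, 2), ("HOST3", 2, 1)])
layout.validate()
```

Validating against the two maxima up front keeps later lookups such as `layout.subnet_of("HOST3")` simple.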