A Comprehensive Survey of Social Based Routing for Delay Tolerant Networks

Abstract: Delay-tolerant networking (DTN) is an approach for dealing with the scarce connectivity found in sparse mobile ad hoc networks (MANETs), where routing messages is a challenging task. DTNs have proved useful in many challenging environments such as tactical networks, underwater sensor networks, wildlife monitoring, and disaster recovery. Pocket Switched Networks (PSNs) have emerged as a new application of delay-tolerant networks in which the network nodes are computing devices carried by humans. Hence, studying how humans interact in their day-to-day lives, the places they visit frequently, the people they meet frequently, and the social groups in which they regularly participate can help improve the routing process in PSNs. This type of routing, inspired by the way humans interact with each other, is referred to as social-based routing and has been a recent topic of research in the field of DTNs. This paper presents a comprehensive survey of the various social-based algorithms that have been designed for Delay Tolerant Networks.


Introduction
Delay-tolerant networking (DTN) is an approach for dealing with the scarce connectivity found in sparse mobile ad hoc networks (MANETs), where routing messages is a challenging task. Node mobility, power failure, or the low duty cycle of nodes may cause frequent disruptions in network connectivity. Complete and continuous end-to-end connectivity, which is an essential condition for routing in ad hoc networks, may therefore be unavailable. Hence, traditional ad hoc routing protocols, whether proactive or reactive, fail to work in environments with intermittent connections or frequent network partitions. Delay Tolerant Networks are designed to work in challenging environments characterized by frequent network partitions, sparse network density, long and variable delays ranging from hours up to days, and low and asymmetric data rates. Because continuous end-to-end connectivity is difficult to find, DTNs exploit the 'store, carry and forward' paradigm to route messages towards the destination [1]. One of the basic assumptions is that each node is equipped with persistent storage, which allows messages to be buffered for a long time. Each node buffers its messages and forwards them only when it comes into proximity of another node. That node in turn carries the messages it has received and continues the forwarding process with the nodes encountered in subsequent connection opportunities, until one of the nodes carrying the message meets the destination or the message is dropped from the node buffer because its TTL has expired.
The routing process in DTNs involves a decision by a node on whether or not to forward a message stored in its buffer to an encountered node. DTN routing protocols can be broadly classified into naive replication-based routing and utility-based forwarding mechanisms.
Naive replication: In the naive replication-based routing approach, the messages stored in a node's buffer are forwarded blindly to the encountered node without considering its capability to meet the destination in the near future. Common examples of routing protocols in this category are Epidemic routing and Spray and Wait. While Epidemic routing creates an unlimited number of replicas of a message, Spray and Wait creates only a predefined number of replicas of a message in the network.
Utility-based routing: In utility-based routing approaches, the forwarding decision is guided by a utility metric, which may be based on, for example, the contact history of node meetings. The utility metric qualifies the encountered node as a relay node to which message copies are forwarded. Some of the popular utility-based routing protocols are Prophet, MaxProp, Spray and Focus, and Encounter Based Routing (EBR).
A significant advancement in the category of utility-based routing approaches is the development of social-networking-based routing protocols. These protocols have gained popularity due to the ubiquitous presence of mobile devices in human societies. Pocket Switched Networks (PSNs), a specific application area of DTNs, take advantage of the social characteristics exhibited by the nodes in the network, which are devices carried by humans [3]. PSNs utilize contact opportunities occurring between humans to communicate in the absence of network infrastructure. Since the carriers of computing devices in PSNs are humans, their social properties can be extremely useful in guiding the routing process.
In this paper, we provide an overview of the various social-based algorithms that have been proposed in the literature.

Overview of Social Characteristics
Social Graph: The interactions among nodes in a social network can be described as a social graph. The vertices of the graph represent individuals, and any interaction between two individuals results in an edge between them. In a DTN, the social graph captures how nodes connect with each other. The social ties between individuals may take various forms, such as relatives, companions and so on. A given social graph can be assessed in terms of a range of social metrics such as community, centrality, and similarity [2], and these metrics may be used to design a variety of social-based routing methodologies. In DTN scenarios where nodes are devices carried by humans and exhibit social characteristics based on human interactions, each node may build its contact graph by recording its interactions with other nodes. A contact graph can be built independently for each single contact. Alternatively, it can record the encounters in a specific time period. Each edge in the contact graph may then be assigned a weight based on contact statistics such as the meeting frequency of the two end nodes of the edge or the total contact duration between them. Contact graphs generally describe the encounter history between two nodes, while the social graph describes the social interactions among a group of people [2]. In a DTN both may be used simultaneously for effective message routing. Some of the social characteristics that have been used in the development of these routing protocols are discussed below.
Community: A community [3] is a group of people who frequently encounter each other. Knowledge of the community structures in a network helps a node take better routing decisions by selecting as relays those nodes that are known to be part of the destination community. Community detection has been a widely explored problem in social network analysis (SNA), but since the network structure changes dynamically in a DTN, these algorithms need to be adapted to suit the network structure of DTNs. Communities may be user defined, nodes may be divided into communities based on their contact history, or communities may be formed based on other social characteristics of nodes such as their nationality, the language they speak, the food they eat etc.
Centrality: Centrality defines the topological significance of a vertex within a graph. In a social graph representing the network structure, the centrality of a node defines its popularity within the network. The centrality of a node can be measured in several different ways, including those defined by Newman [4][5][6].
A) Degree centrality of a node is measured in terms of the number of direct neighbors of the node. In a social graph representing the network structure, the degree of a node is the total number of links it has with other nodes. A higher degree signifies a higher popularity. Such a node can be chosen as a good relay because it has better connectivity in the network.
B) Betweenness centrality is measured as the total number of shortest paths passing through a node in the network. Such a node can act as a good bridge node that controls and provides connectivity between nodes present in different parts of the network.
C) Closeness centrality, in contrast to betweenness centrality, is inversely proportional to the average shortest-path distance from a node to the other nodes in the network.
D) Similarity centrality measures the number of common neighbors between two nodes [2]. A node can be said to be a good relay node if it has high similarity with the destination node.
Friendship: Another very popular social metric for measuring the quality of contacts between two nodes is friendship. A pair of nodes can be described as good friends if they have regular contacts or common interests [7].

Social Based Routing Protocols
This section presents a systematic review of various social based routing protocols that have been proposed in the literature for DTNs.

Label Routing
Pioneering work in the field of social-based routing was done by Hui and Crowcroft, who proposed an algorithm known as Label Routing [8]. This was the first work to use the concept of community structure in PSNs. It assumes that each node in the network belongs to a certain community denoted by a label. Since blind flooding is an unaffordable technique in PSNs, Label Routing uses the community label information to take forwarding decisions. On meeting another node, a node compares the label of the encountered node with the label of the destination node of each message stored in its buffer and decides accordingly whether or not to forward the message. An encountered node acts as a relay for a message only if it has the same label as the destination node's community label. Label Routing is based on the observation that people belonging to the same community contact each other frequently and hence can act as better relays. It has the added advantage that it requires very little information about a node: implementing it only requires tagging each mobile device with the label of its node. However, user-defined communities may not reflect the actual interactions among the nodes and hence may not be effective for data routing, especially when node density is sparse. The members of the same community may not meet for a long time, or may be far away from each other, which may also lead to failure of message delivery. Thus relying on the community label alone is a naive decision.
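The label comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Message` type, its field names and the function `messages_to_forward` are hypothetical and are not taken from [8].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    msg_id: str
    dest_label: str  # community label of the destination node (hypothetical field)


def messages_to_forward(buffer, encountered_label):
    """Return the buffered messages whose destination label matches the
    label of the encountered node; all other messages stay in the buffer."""
    return [m for m in buffer if m.dest_label == encountered_label]


buffer = [Message("m1", "lab"), Message("m2", "cs")]
selected = messages_to_forward(buffer, "cs")
```

A node whose label matches no buffered destination receives nothing, which is exactly the weakness noted above when community members rarely meet.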

SimBet Routing
In [2] Daly and Haahr proposed a routing protocol, called SimBet, that uses a combination of two centrality metrics, namely betweenness centrality and similarity centrality, to select good relay nodes for a message. Nodes with a high betweenness centrality value can act as good 'bridge' nodes by connecting nodes in different parts of the network, whereas nodes with a high similarity value have a high chance of meeting a destination node directly or through its neighbours. Computing betweenness centrality is computationally intensive, as it requires finding all the shortest paths across the network. To avoid exchanging this large amount of information across the network, SimBet calculates betweenness centrality based only on local neighbourhood information. The betweenness and similarity metrics are updated dynamically at periodic intervals.
When a node n encounters a node m, it determines its relative betweenness utility ($\mathrm{BetUtil}_n$) and similarity utility ($\mathrm{SimUtil}_n$) with respect to node m as given in equations 1 and 2:
$\mathrm{BetUtil}_n = \frac{\mathrm{Bet}_n}{\mathrm{Bet}_n + \mathrm{Bet}_m}$ (1)
$\mathrm{SimUtil}_n(d) = \frac{\mathrm{Sim}_n(d)}{\mathrm{Sim}_n(d) + \mathrm{Sim}_m(d)}$ (2)
It then calculates the composite SimBetUtil utility as a weighted sum of BetUtil and SimUtil as in equation 3:
$\mathrm{SimBetUtil}_n(d) = \alpha \, \mathrm{SimUtil}_n(d) + (1 - \alpha) \, \mathrm{BetUtil}_n$ (3)
Here $\alpha$ is an algorithmic parameter that adjusts the relative importance of the similarity utility SimUtil and the betweenness utility BetUtil. Node n forwards a message destined for d to the encountered node m if and only if $\mathrm{SimBetUtil}_m(d) > \mathrm{SimBetUtil}_n(d)$.
The combination of two social metrics makes the protocol more effective. However, the uncertainty of future encounters and the dynamic changes in the network graph can sometimes cause SimBetUtil to fail to deliver messages. To avoid the communication overhead required for calculating the social metrics, SimBet calculates them in a distributed manner, which is even more desirable in dynamically changing DTN environments. However, relying solely on local information for calculating betweenness centrality may not identify bridge nodes very accurately. One way to increase the accuracy is to use larger neighbourhood information, for example by also considering two-hop nodes, but this may in turn increase the communication overhead.
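To make the locally computed metrics concrete, the following Python sketch derives an ego-network betweenness and a common-neighbour similarity from a neighbour table and combines them into a SimBet-style utility. It is a sketch under stated assumptions rather than the authors' implementation: the adjacency dictionary `adj`, the helper names, the default `alpha = 0.5` and the normalised form x / (x + y) for the relative utilities are all choices made for this illustration.

```python
from itertools import combinations


def ego_betweenness(ego, adj):
    """Betweenness of `ego` inside its own ego network: for every pair of
    non-adjacent neighbours, add 1 / (number of two-hop paths joining them
    through members of the ego network)."""
    neighbours = adj[ego]
    members = neighbours | {ego}
    total = 0.0
    for a, b in combinations(sorted(neighbours), 2):
        if b in adj[a]:
            continue  # a and b are directly connected, ego is not needed
        shared = adj[a] & adj[b] & members  # always contains ego itself
        total += 1.0 / len(shared)
    return total


def similarity(node, dest, adj):
    """Number of neighbours that `node` and `dest` have in common."""
    return len(adj[node] & adj[dest])


def relative(own, other):
    """Pairwise normalisation own / (own + other), 0 if both are zero."""
    return own / (own + other) if (own + other) > 0 else 0.0


def simbet_util(sim_own, sim_other, bet_own, bet_other, alpha=0.5):
    """Weighted sum of relative similarity and relative betweenness."""
    return alpha * relative(sim_own, sim_other) + (1 - alpha) * relative(bet_own, bet_other)


# Small symmetric contact graph used only for illustration.
adj = {
    "e": {"a", "b", "c"},
    "a": {"e", "b"},
    "b": {"e", "a"},
    "c": {"e", "d"},
    "d": {"c"},
}
bet_e = ego_betweenness("e", adj)
sim_e_d = similarity("e", "d", adj)
```

With this normalisation the utilities of the two encountering nodes sum to one, so comparing them amounts to checking which node holds more than half of the combined utility.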

Bubble Rap Forwarding
Hui et al. [3] proposed the Bubble Rap routing protocol, which became very popular in the field of social-based routing. It also uses the concept of communities to take forwarding decisions, but here the communities are formed on the basis of node contact frequency, in contrast to the user-defined communities used in Label Routing. Further, it uses node centrality to take better forwarding decisions both within a node's community and outside it, in the entire network. The network is divided into communities using a distributed k-clique algorithm, which results in non-overlapping communities; hence each node in the network is assumed to belong to one community. A node's popularity is measured both with respect to its own community, in which case it is referred to as local centrality, and with respect to the entire network, in which case it is referred to as global centrality. Node centrality is defined in terms of the node degree centrality. Routing in Bubble Rap takes place in two phases: a bubble-up phase based on global centrality and a bubble-up phase based on local centrality.
To better understand the routing procedure followed in Bubble Rap, consider a situation where a node A carrying a message m, destined for node D belonging to the community CD, meets another node B. Three cases may arise:
Case 1: If node B belongs to the destination community CD, node A simply forwards the message m to node B and stops further forwarding by removing the message m from its buffer, in the expectation that the message has now reached the destination community and will soon meet the destination.
Case 2: If node B belongs to a community other than the destination community CD, node A forwards the message to node B if and only if its global centrality value is smaller than that of node B. In this way, a node forwards the message to more popular nodes in the network.
Case 3: If both the forwarding node A and the encountered node B belong to the destination community CD, node A forwards the message to node B if and only if its local centrality value is smaller than that of node B. In this way, a node forwards the message to more popular nodes within the local community of the destination.
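The three cases can be summarised as a single decision function. The Python sketch below is illustrative only: the `Node` fields and the function name are hypothetical, and the final branch (a carrier already inside the destination community meeting an outside node) is not covered by the cases above, so the sketch assumes the carrier keeps the message.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    community: str
    global_centrality: float
    local_centrality: float


def bubble_rap_forward(carrier, other, dest_community):
    """Return True if `carrier` should hand the message to `other`."""
    carrier_inside = carrier.community == dest_community
    other_inside = other.community == dest_community
    if other_inside and not carrier_inside:
        # Case 1: the message enters the destination community
        # (the carrier then deletes its own copy).
        return True
    if not other_inside and not carrier_inside:
        # Case 2: bubble up the global ranking.
        return other.global_centrality > carrier.global_centrality
    if other_inside and carrier_inside:
        # Case 3: bubble up the local ranking inside the community.
        return other.local_centrality > carrier.local_centrality
    # Assumption of this sketch: never move the message back out of
    # the destination community.
    return False


a = Node("A", "c1", global_centrality=5, local_centrality=1)
b = Node("B", "cd", global_centrality=1, local_centrality=1)
decision = bubble_rap_forward(a, b, "cd")
```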
Bubble Rap uses the concept of community structure to streamline the routing process and, in addition, uses the centrality of a node to speed up the delivery of a message towards its destination. However, if the destination belongs to a community in which most of the members are less popular, i.e. have a low global centrality value, then the chances of the message reaching its destination community may be very low. This may lead either to a long delay or even to failure of message delivery. The design of the Bubble Rap routing algorithm also does not take into account multiple overlapping communities, which are quite possible in real-world interactions.
Bubble Rap uses the k-clique community detection algorithm, which was designed for static binary networks. A DTN is a dynamic network, and hence its community structure keeps changing. The k-clique algorithm does not give satisfactory results for dynamic networks, and it is also unable to find the overlapping communities in the network. To solve these problems, an adaptive dynamic community detection algorithm has been proposed for finding the communities used by Bubble Rap [9]; it generates the community structure from the previous community structure and the current structure of the network. It also prevents the generation of overlapping communities.

Social Based Multicasting
The algorithms discussed so far are unicast in nature. A set of multicast algorithms using social-based forwarding in DTNs was proposed by Gao et al. in [10]. These algorithms use a metric based on contact statistics, called cumulative contact probability, instead of the centrality measures used previously in social-based routing. The contact process between nodes is modelled as a Poisson process, which can be used to compute the contact probabilities between node pairs. The cumulative contact probability of a node i may be calculated as in equation 4:
$C_i = \frac{1}{N-1} \sum_{j=1, j \neq i}^{N} \left(1 - e^{-\lambda_{i,j} T}\right)$ (4)
Here $\lambda_{i,j}$ represents the average meeting rate of a pair of nodes (i,j) in a network consisting of N nodes in total, over a total time period T.
The metric $C_i$ denotes the average probability that node i meets any other node in the network within time T. Hence this metric can be used to decide whether or not an encountered node can act as a good relay node.
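As a small worked illustration of this metric, the Python sketch below averages the per-pair contact probabilities of one node. It assumes Poisson contacts, so that the probability of at least one contact with rate λ within time T is 1 - exp(-λT); the function name and the input format (a list of the node's contact rates with the other N - 1 nodes) are choices made for this sketch.

```python
import math


def cumulative_contact_probability(rates, T):
    """Average probability that the node meets another node within time T.

    `rates` holds the Poisson contact rates of this node with each of the
    other N - 1 nodes (0.0 for nodes that are never met)."""
    if not rates:
        return 0.0
    return sum(1.0 - math.exp(-lam * T) for lam in rates) / len(rates)


# A node that meets one peer at rate ln(2) per hour and never meets the other.
c_i = cumulative_contact_probability([math.log(2), 0.0], T=1.0)
```

A node with a larger value is, on average, more likely to meet an arbitrary destination within the time budget, which is why the metric can stand in for centrality.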
Gao et al. proposed two variants of the multicast problem, one related to single-data multicast and the other related to the multicast of multiple data items [10].
In single-data multicast, it is assumed that either the source node of the message will contact the destination directly within time T or the message will be relayed to a node that can contact the destination within time T. Only those nodes are selected for message relaying whose cumulative contact probability results in a delivery ratio greater than a desired probability p. The relay selection problem is mapped to the knapsack problem and solved using dynamic programming.
In the multiple-data multicast problem, each node maintains a list of its contact probabilities with the other nodes belonging to its own community. If the destination belongs to a different community, a gateway node is selected, which may belong to multiple communities. The relay nodes are selected by the source node using the new metric, and appropriate messages are placed in each relay node. The message selection and the relay node selection are again modelled as a knapsack problem.

Homophily Based Data Diffusion
A novel homophily-based approach was proposed by Zhang et al. in [11], which showed that the order in which data is diffused to the encountered nodes plays a very important role in achieving a high delivery ratio and a low data access delay. This work proposes a data diffusion scheme that aims to lower the query delay. A major problem in DTNs is the partial transmission of data during node encounters. Partial transmissions occur when a link is broken due to network-related problems, when the receiving node does not have enough buffer space, and so on. Thus not only the selection of relay nodes but also the order in which data is transmitted is important.
Zhang et al. exploited a phenomenon observed in social networks called homophily, which states that friends share more common interests than strangers [11]. Hence they proposed sharing the most similar data items first, in order of similarity, when two friends meet each other, and the most dissimilar data items first when two strangers meet each other. They compared this data diffusion scheme with three other schemes: first, sharing the most similar data items with any encountered node irrespective of the relationship between the nodes; second, sharing the most dissimilar data items with any encountered node; and third, sharing the most dissimilar data between friend nodes and the most similar data with stranger nodes. The simulation results showed that the proposed data diffusion scheme achieved a higher diffusion speed and a lower data access delay than the other schemes, which signifies that it can diffuse more relevant data items in the network with the same limited resources. The major drawback of this approach is the detection of friendship among nodes in dynamic DTN environments.
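The ordering rule can be illustrated with a short Python sketch. How the similarity of a data item is measured is not specified here, so the sketch simply assumes that each item already carries a numeric similarity score; the function name and the tuple layout are hypothetical.

```python
def diffusion_order(items, peer_is_friend):
    """Order data items for transmission during one encounter.

    `items` is a list of (item_id, similarity_score) pairs. Friends receive
    the most similar items first, strangers the most dissimilar items first,
    so that a contact cut short still transfers the preferred items."""
    ranked = sorted(items, key=lambda pair: pair[1], reverse=peer_is_friend)
    return [item_id for item_id, _ in ranked]


items = [("a", 0.9), ("b", 0.1), ("c", 0.5)]
to_friend = diffusion_order(items, peer_is_friend=True)
to_stranger = diffusion_order(items, peer_is_friend=False)
```

Because transmissions may be partial, only the head of the returned list is guaranteed to be sent, which is why the order matters as much as the choice of relay.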

Friendship Based Routing
Bulut et al. proposed a new routing protocol called friendship-based routing, which uses a social metric based on contact statistics, called the Social Pressure Metric (SPM). This metric is then used to form friendship-based communities in the network [12]. Two nodes are considered to be good friends if their contact frequency is high and, at the same time, they contact each other at regular intervals and for long durations. The SPM between any two nodes i and j is calculated as in equation 5:
$\mathrm{SPM}_{i,j} = \frac{\int_{0}^{T} f(t)\, dt}{T}$ (5)
Here, f(t) denotes the time remaining until the first encounter of these nodes after time t, and T is the total time period. The SPM thus denotes the average meeting delay between the two nodes. The link between a pair of nodes (i,j) is assigned a weight $w_{i,j}$, which is calculated as the inverse of the SPM as in equation 6:
$w_{i,j} = \frac{1}{\mathrm{SPM}_{i,j}}$ (6)
The higher the value of $w_{i,j}$, the closer the friendship between i and j. After computing its link strength with the neighbouring nodes, a node forms its friendship community, consisting of all nodes with which its connection strength is greater than a predefined threshold. On meeting node j, a node i forwards to it a message destined for d if either j belongs to the friendship community of d or the link quality of (j,d) is higher than that of (i,d) in the current time period T.
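The integral in the SPM has a simple closed form when the contact history is given as intervals: during a contact the remaining time to the next encounter is zero, and across a gap of length g it falls linearly from g to 0, contributing g²/2. The Python sketch below uses this observation. It assumes sorted, non-overlapping contact intervals inside [0, T] and ignores the period after the last contact, for which no next encounter is known; both are assumptions of this sketch, and the function names are hypothetical.

```python
def social_pressure_metric(contacts, T):
    """SPM over [0, T] for a list of sorted, non-overlapping (start, end)
    contact intervals between one pair of nodes."""
    integral = 0.0
    previous_end = 0.0
    for start, end in contacts:
        gap = start - previous_end  # waiting time before this contact
        integral += gap * gap / 2.0  # area under the falling line f(t)
        previous_end = end
    return integral / T


def link_weight(spm):
    """Friendship link weight as the inverse of the SPM."""
    return float("inf") if spm == 0 else 1.0 / spm


spm = social_pressure_metric([(2.0, 3.0), (7.0, 8.0)], T=10.0)
weight = link_weight(spm)
```

In this example the two gaps of length 2 and 4 contribute 2 and 8, giving an SPM of 1.0. The squared gap length explains why irregular contacts are penalised: one long gap costs more than several short ones of the same total length.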
However, the calculation of the SPM requires storing the contact history between nodes in each time period, which is not realistic in a DTN environment.
Another protocol, proposed by Mei et al. in [13], performs message forwarding based on the commonality of interests between two nodes. It is observed in the real world that nodes having common interests tend to meet each other more often. This observation forms the premise of the routing protocol, which therefore selects as relay nodes those nodes that have interests similar to those of the destination node. Each node i has a k-dimensional vector $I_i$ that describes the interest profile of that node. The similarity between the interest profiles of two nodes i and j is calculated using cosine similarity. A message m is forwarded to an encountered node j only if the cosine similarity between the interest profiles of message m and node j exceeds a threshold ρ. A node does not need to store anything other than its interest profile, and the computation involved is modest.
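The interest-based forwarding test reduces to a cosine similarity and a threshold comparison, as in the following Python sketch. The function names and the example profiles are illustrative only; the threshold is passed in as `rho`.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two interest profiles of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def should_forward(message_profile, node_profile, rho):
    """Forward only if the encountered node's interests are close enough
    to the interest profile attached to the message."""
    return cosine_similarity(message_profile, node_profile) > rho


# Two-dimensional profiles; the similarity here is 1 / sqrt(2), about 0.707.
forward = should_forward([1.0, 1.0], [1.0, 0.0], rho=0.7)
```

The check is stateless: it needs only the two profiles available at the moment of the encounter, which matches the low storage requirement noted above.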
A new user-centric data dissemination scheme was proposed by Gao et al. in [14]. The authors define node centrality from a new angle. The centrality of node i is based on the expected number of nodes, encountered in the remaining TTL $T_k - t$, that are interested in the messages carried by node i. Here $T_k$ is the time to live of the message and t is the current time. Based on this centrality, the relay selection stage ensures that a new relay has a better capability of disseminating data to the interested nodes than the existing relays. Local centrality is calculated over one hop, and multi-hop centrality over multiple hops.

Sociability Based Routing
A sociability-based approach was proposed in [15] by Fabbri and Verdone. This protocol is based on the observation that certain nodes in the network are highly sociable or popular, i.e. they have frequent encounters with many nodes in the network. A sociability indicator metric is used to calculate the sociability of each node by counting the number of encounters it makes with other nodes over a time period T. This metric is thus time varying. The forwarding strategy is to forward messages only to those nodes that are highly sociable.

SOSIM
In SOSIM [16], each node in the network is represented by a vector based on its social characteristics, such as nationality, the language spoken, the places visited etc. The social similarity between nodes that have met in the past is used to guide the message forwarding process, under the assumption that people sharing common social features have a high chance of meeting each other. A node P is represented by a vector $P = (p_1, p_2, \ldots, p_m)$, where each $p_i$ represents a value for some social characteristic of the node. To effectively evaluate the social similarity metric, each $p_i$ is calculated as in equation 7:
$p_i = \frac{M_i}{M_{total}}$ (7)
where $M_i$ is the number of nodes that P has met that share the value of the i-th social characteristic with the destination node, and $M_{total}$ is the total number of nodes P has met in the observed history. Hence each $p_i$ has a value in the range 0 to 1. The social similarity metric between two nodes P and Q, S(P,Q), is then computed using one of three commonly used similarity measures: Tanimoto similarity, Euclidean similarity, and weighted Euclidean similarity.
Tanimoto similarity: The Tanimoto coefficient measures the similarity between two vectors X and Y and is calculated as in equation 8:
$T(X, Y) = \frac{X \cdot Y}{\|X\|^2 + \|Y\|^2 - X \cdot Y}$ (8)
where $X \cdot Y$ is the dot product of the two vectors.
Weighted Euclidean similarity: Here $w_i$ denotes the weight assigned to the i-th feature of the social vector $X = (x_1, x_2, \ldots, x_m)$, where each $x_i = M_i / M_{total}$ as in equation 7. Hence each $x_i$ has a value between 0 and 1.
For instance, suppose a node is described using three social features, City, Language, and Position, and the destination node D is described by the social vector (New York, English, Student). If node P has met 70% of nodes with City = New York, 80% of nodes with Language = English, and 40% of nodes with Position = Student, then node P can be described as (0.7, 0.8, 0.4). Since an ideal forwarder for node D is represented as (1, 1, 1), the Tanimoto similarity of P with this ideal forwarder is 1.9 / (1.29 + 3 - 1.9) ≈ 0.79. SOSIM uses delegation forwarding due to its ability to reduce the cost of message communication.
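The worked example can be checked with a few lines of Python. The sketch below uses the standard definition of the Tanimoto coefficient for real-valued vectors, X·Y / (|X|² + |Y|² - X·Y); the function name is illustrative. Under this definition the vector (0.7, 0.8, 0.4) scores roughly 0.79 against the ideal forwarder (1, 1, 1).

```python
def tanimoto(x, y):
    """Tanimoto coefficient of two equal-length real-valued vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y) - dot
    return dot / denom if denom > 0 else 0.0


p = (0.7, 0.8, 0.4)       # fractions of met nodes matching each feature of D
ideal = (1.0, 1.0, 1.0)   # an ideal forwarder matches D on every feature
score = tanimoto(p, ideal)  # 1.9 / (1.29 + 3 - 1.9) = 1.9 / 2.39
```

The coefficient equals 1 only when the two vectors are identical, so a node's score grows as its encounter history concentrates on nodes that resemble the destination.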
The main feature of delegation forwarding is that a node forwards a message to a newly encountered node if and only if the quality level of that node is better than that of all other nodes the current node has encountered so far. In SOSIM, each node is assigned a quality level based on the social similarity metric S(P, Q). On encountering another node, a node compares the quality values and forwards the message only if the encountered node has a higher quality value. The sender then updates its stored value to the quality value of the encountered node. Thus a node forwards the message to another node only if that node is better than all the nodes observed up to that point. One of the main overheads of this protocol is that each node must maintain the social feature vectors of all the nodes it has met in the past and exchange them at each encounter.
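The threshold behaviour of delegation forwarding can be sketched as a small Python class. This is a minimal illustration of the rule as described above, in which only the sender updates its stored value; the class and method names are hypothetical.

```python
class DelegationForwarder:
    """Tracks, for one message, the best quality value seen so far."""

    def __init__(self, own_quality):
        # The threshold starts at the carrier's own quality value.
        self.threshold = own_quality

    def on_encounter(self, other_quality):
        """Return True if the message should be forwarded to the
        encountered node, raising the threshold when it is."""
        if other_quality > self.threshold:
            self.threshold = other_quality
            return True
        return False


carrier = DelegationForwarder(own_quality=0.4)
decisions = [carrier.on_encounter(q) for q in (0.3, 0.6, 0.5, 0.9)]
```

Because the threshold only ever rises, each successive forward requires a strictly better relay, which is what keeps the number of copies low.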

SEBAR
SEBAR is another community-based social routing protocol, in which forwarding decisions are based on the concept of social energy [18]. Social energy, inspired by the laws of physics, measures the ability of a node to act as a relay for other nodes. Nodes generate social energy by means of node encounters. Nodes that frequently meet a large number of nodes tend to have higher energy; similarly, communities with a high meeting rate between their members have high social energy.
Like Bubble Rap, SEBAR uses the k-clique community detection method. The communities formed are overlapping in nature, i.e. a node may belong to more than one community.
The centrality of a node k within a particular community $C_j$ can be computed as in equation 11:
$\mathrm{Cen}_k(C_j) = \frac{\sum_{i} d_i(C_j)}{\sum_{l} \sum_{i} d_i(C_l)}$ (11)
Here $d_i(C_j)$ denotes the duration of the i-th contact of node k with any node belonging to community $C_j$.
Hence a node k's centrality within community $C_j$ is the ratio of the total contact duration of its meetings with nodes belonging to $C_j$ to the total contact duration of its meetings with nodes belonging to any community.
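The ratio described above is straightforward to compute from a per-community log of contact durations, as in the following Python sketch. The dictionary layout and the function name are assumptions made for this illustration.

```python
def community_centrality(durations_by_community, community):
    """Share of a node's total contact time spent with members of
    `community`.

    `durations_by_community` maps a community id to the list of durations
    of the node's contacts with members of that community."""
    total = sum(sum(durations) for durations in durations_by_community.values())
    if total == 0:
        return 0.0
    return sum(durations_by_community.get(community, [])) / total


log = {"c1": [10.0, 20.0], "c2": [30.0]}
centrality_c1 = community_centrality(log, "c1")
```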
A node k's social energy is a combination of the energy it generates through its own encounters, denoted E_Nk, and the energy it acquires by participating in other communities, denoted E_Ck.
E_Nk(i) denotes the energy acquired during the i-th meeting within its own community, and E_Ck is the energy gained from meetings with nodes of other communities.
The process of node encounter is analogous to a collision that generates energy, which is distributed equally between the two colliding nodes. Each node shares a portion of its energy with the communities it belongs to and keeps the remaining portion. Overall, when a node experiences frequent encounters, its social energy is high and it can act as a better relay both within its own community and in the whole network. A node belonging to multiple communities gains energy from all of them via encounters with their members. A node also loses energy if it does not meet any node for a long time. This is similar to the radiation of energy in physics, where all objects radiate energy, which results in a drop in temperature. A node must have a certain energy level to act as a relay for other nodes; hence, to remain active, a node must keep meeting new nodes, otherwise its energy level may decay to zero.

Articulation Node Based Routing
ANBR [19] is a routing protocol that takes advantage of articulation points in a graph to deliver messages to disconnected components of the graph. An articulation point is a vertex whose removal disconnects the graph. Hence, articulation nodes can be used to deliver messages outside the local neighborhood of a node. Since computing articulation points requires knowledge of the complete graph, and obtaining the complete network topology is difficult in DTN scenarios, the authors propose using a local subgraph for the calculation. Nodes exchange their neighbor lists on meeting each other, and each node uses this information to construct its local subgraph G' (V', E'), which consists of the connected nodes, their one-hop neighbors and the links between them. The traditional method based on depth first search (DFS) is then used to find the articulation points and the biconnected components of the local subgraph G'. Each node maintains a message vector, which stores all the messages that are to be forwarded; a neighbor vector, which stores the list of its direct neighbors; and an articulation vector, which stores the list of all articulation nodes the node has encountered. A node forwards a message either directly to its destination or to an encountered node that is an articulation node or a direct neighbor of some articulation node. The articulation nodes act as bridge nodes that help a message move out of its local neighborhood. ANBR also uses the Drop Oldest message dropping policy in case of buffer overflow. In a social network the links between nodes keep changing, which results in a highly dynamic topology. Finding articulation points in a dynamic graph can therefore be an expensive task.

SEDUM
SEDUM is a multi-hop, utility based routing protocol designed for social based DTN scenarios [20]. It differs from most of the earlier probabilistic routing protocols designed for DTNs, which consider only the frequency of node meetings when computing the utility function and ignore the duration of contact between nodes, even though the contact duration determines the amount of data that can be exchanged between encountering nodes. SEDUM uses both contact frequency and contact duration in the design of its utility function. A utility function that considers only the contact frequency between nodes works well in network scenarios where nodes have a low or medium mobility rate and the chance of message abortion due to link disconnection is low. However, network scenarios with high node mobility may lead to frequent message abortions because contact durations are too short. In such cases it is better to consider both contact frequency and contact duration. Ze et al. showed, both theoretically and through simulation results, that the combined use of contact frequency and contact duration in forwarding decisions significantly affects the network throughput. The direct utility between node i and node j is calculated as in equation 13.
Here $d_{(i,j)}(k)$ denotes the time duration of the k-th meeting between node i and node j.
A node may also send a message to node j via a relay node k. Hence, the indirect duration utility between node i and node j through node k can be calculated using the transitive principle, as in equation 14:
$U^{k}_{(i,j)} = U_{(i,k)} \cdot U_{(k,j)}$ (14)
The cumulative duration utility is then calculated as the maximum of the direct utility between nodes i and j and the maximum indirect utility between nodes i and j over all intermediate nodes k, as in equation 15.
The utility metric is also updated periodically, every T time units, considering both the current value and the historical value so that it properly reflects the current communication capacity, as in equation 16. Here the weight constant helps to smooth the utility update process.
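The utility computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the direct utility is assumed to be the total contact duration normalised by the window length T, the periodic update is assumed to be a weighted average with a hypothetical weight `alpha`, and all function names are illustrative.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def direct_utility(durations: List[float], window: float) -> float:
    """Assumed form: total contact duration divided by the window length T."""
    return sum(durations) / window


def cumulative_utility(direct: Dict[Pair, float], i: str, j: str) -> float:
    """Maximum of the direct utility and the best transitive utility via a relay k."""
    nodes = {n for pair in direct for n in pair}
    best_indirect = max(
        (direct.get((i, k), 0.0) * direct.get((k, j), 0.0)
         for k in nodes if k not in (i, j)),
        default=0.0,
    )
    return max(direct.get((i, j), 0.0), best_indirect)


def smooth(old: float, current: float, alpha: float = 0.5) -> float:
    """Assumed update rule: weighted mix of the historical and current utility."""
    return alpha * old + (1.0 - alpha) * current


if __name__ == "__main__":
    # Contact durations in seconds over a 600-second window (made-up values).
    direct = {
        ("a", "b"): direct_utility([30.0, 30.0], 600.0),
        ("a", "c"): direct_utility([300.0], 600.0),
        ("c", "b"): direct_utility([240.0, 240.0], 600.0),
    }
    # The relay path a -> c -> b beats the weak direct contact a -> b.
    print(cumulative_utility(direct, "a", "b"))
```

In this example the short direct contacts between a and b are outweighed by the relay path through c, which is the situation the transitive utility is meant to capture.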
SEDUM implements a multi-copy routing strategy similar to the spray and focus routing protocol [21]. The first phase, called the replicating phase, is similar to the spray phase: the source node replicates a message copy to a fixed number of encountered nodes. These nodes then enter the second phase of the routing protocol, called the forwarding phase, which is similar to the focus phase: a node carrying a message copy forwards the message to an encountered node only if that node's utility metric for the destination is higher than its own. Once the message reaches its destination, the destination spreads information about the delivered message in the network via exchange of a delivered message list. All nodes receiving this information then remove the delivered message from their buffers. This is called the clearing phase, which prevents further spread of delivered messages in the network and saves both bandwidth and buffer space.
Since the buffer space of all nodes is limited, forwarding messages to high utility nodes may cause congestion. Hence SEDUM also uses an effective buffer management policy. Both its buffer drop and buffer scheduling policies are based on message priorities and the utility values of the messages.

Location Prediction-based Forwarding for Routing using Markov Chain (LPFR-MC)
The concept of Opportunistic Networks (OppNets) can also be integrated into an IoT environment, where connections between IoT devices are exploited in an opportunistic manner. Such a paradigm is referred to as OppIoT [22].
A new opportunistic routing scheme termed "Location Prediction-based Forwarding for Routing using Markov Chain (LPFR-MC)" has been developed for IoT applications. It is composed of two phases: the first phase uses a Markov chain to predict the location/region of a node, and the second phase uses the predicted node location to compute the node's delivery probability.
Using information about a neighbouring node, the proposed model applies a Markov chain to calculate the probability of that node moving in the direction of the destination node's location or region. The transmission range of the source is divided into four regions based on the angle. In order to find the state of a neighbouring node n in the next time span, an initial probability vector [nv0, nv1, nv2, nv3] is defined. The probability values are distributed so that the destination region is assigned the maximum value, the opposite region is assigned the least value, and the regions adjacent to the destination region are assigned almost equal values. The transition probability matrix is constructed from the sequence of previous locations of n. The probability of n moving into each state is then estimated by applying the Markov chain with the final updated transition probability matrix of n. A node's message delivery probability is directly proportional to its probability of moving into the region of the destination node.
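The prediction step can be illustrated with a small Python sketch. It is a simplified reading of the scheme rather than the published algorithm: the transition matrix is assumed to be estimated from row-normalised counts of past region changes, a region with no recorded departures is assumed to get a uniform row, and the function names and the numeric values of the initial vector are hypothetical.

```python
from typing import List

NUM_REGIONS = 4  # the transmission range is divided into four regions


def transition_matrix(history: List[int], n: int = NUM_REGIONS) -> List[List[float]]:
    """Row-normalised counts of region-to-region moves in the location history."""
    counts = [[0] * n for _ in range(n)]
    for prev, nxt in zip(history, history[1:]):
        counts[prev][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: a region never left in the history gets a uniform row.
        matrix.append([c / total for c in row] if total else [1.0 / n] * n)
    return matrix


def step(vector: List[float], matrix: List[List[float]]) -> List[float]:
    """One Markov step: new_vector[j] = sum_i vector[i] * matrix[i][j]."""
    n = len(vector)
    return [sum(vector[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def delivery_score(history: List[int], initial: List[float],
                   dest_region: int, steps: int = 1) -> float:
    """Probability mass on the destination region after `steps` transitions."""
    matrix = transition_matrix(history)
    vector = list(initial)
    for _ in range(steps):
        vector = step(vector, matrix)
    return vector[dest_region]


if __name__ == "__main__":
    history = [0, 1, 0, 1, 2]          # past regions of neighbour n (made up)
    initial = [0.5, 0.2, 0.1, 0.2]     # destination region 0 gets the largest value
    print(delivery_score(history, initial, dest_region=0))
```

A neighbour with a higher score would be preferred as the next hop, since its delivery probability is taken to be proportional to this value.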

Opportunistic Network Routing based on Cosine Similarity
An OppNet combines features of the delay-tolerant network (DTN) and the mobile ad-hoc network (MANET). The deployment of the nodes is not uniform, and no complete communication path exists between sender and receiver. It is therefore challenging to choose the next hop accurately among the neighboring nodes.
Hence, combining the social characteristics of and relations among nodes with knowledge of the shortcomings of long-established OppNet routing algorithms, a cosine similarity (cosSim) based solution is proposed [23]. Cosine similarity, which is widely used to measure the similarity between texts, can serve as a measure of social relationship strength by computing the cosine similarity between the data packets of nodes. The next hop is chosen according to the degree of similarity between nodes. The routing algorithm is designed around a tree topology: the cosine similarity is calculated between the current node of the tree and each of its child nodes. The upper and lower threshold parameters (α=0.29 and β=0.81) are chosen from simulation experiments, taking the delivery ratio and routing cost as reference values. The next hop is selected based on the following condition: (SIMsj) < α and (SIMsj) > β.
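The similarity computation itself can be sketched as follows. This is a minimal illustration, assuming that the data packets of a node are summarised as term-frequency counts; the function names are hypothetical, and the sketch only ranks child nodes by similarity without modelling how the thresholds α and β gate the final choice.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_children(current: Counter,
                  children: Dict[str, Counter]) -> List[Tuple[str, float]]:
    """Child nodes of the current tree node, most similar first."""
    scored = [(name, cosine_similarity(current, vec))
              for name, vec in children.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    current = Counter({"news": 2, "sport": 1})
    children = {
        "n1": Counter({"news": 1, "sport": 2}),
        "n2": Counter({"music": 3}),
    }
    for name, sim in rank_children(current, children):
        print(name, round(sim, 2))
```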

EpSoc: Social-Based Epidemic-based Routing Protocol
The Epidemic routing protocol is one of the benchmark routing protocols for opportunistic mobile social networks (OMSNs); however, it suffers from high message overhead and a large hop count. A hybrid routing protocol called EpSoc has been proposed by Lenando et al. in [24], which uses the message spreading strategy of Epidemic routing but uses a social metric, node centrality, to control the message flooding process. It uses the node's degree centrality metric to adapt the TTL of the messages a node is holding. A node's centrality value is an indicator of its social importance in the network: a node with a high centrality value has a large number of connections with other nodes within the network and hence is capable of delivering more messages. In EpSoc, a node that meets another node with a higher centrality value forwards the messages stored in its buffer and subsequently decreases their TTL value in inverse proportion to the centrality value of the receiving node. These messages, called the socially infected message list, are moved to a blocking register and are removed from the node's buffer once their TTL expires. Adapting the message TTL releases buffer space quickly, so the node has space for storing newer messages, which increases the delivery ratio. The blocking mechanism helps reduce the network overhead by preventing messages with decreased TTL from being forwarded to previously traversed active nodes.

$TTL_{new} = \frac{TTL_{old}}{DC_k}$ (17)
Here $DC_k$ denotes the degree centrality of the receiving node k.
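The TTL adaptation and blocking behaviour can be sketched as below. This is an illustrative reading, not the EpSoc implementation: it assumes that "inverse proportion" means dividing the TTL of the sender's copy by the receiver's degree centrality, that degree centrality is a raw degree count of at least 1, and that the receiver's copy keeps the original TTL. The class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Message:
    msg_id: str
    ttl: float


@dataclass
class Node:
    name: str
    degree_centrality: float  # assumed to be a raw degree count >= 1
    buffer: Dict[str, Message] = field(default_factory=dict)
    blocked: Set[str] = field(default_factory=set)  # the "blocking register"


def on_encounter(sender: Node, receiver: Node) -> List[str]:
    """Forward unblocked messages to a more central node and shrink their TTL."""
    forwarded: List[str] = []
    if receiver.degree_centrality <= sender.degree_centrality:
        return forwarded
    for msg_id, msg in sender.buffer.items():
        if msg_id in sender.blocked or msg_id in receiver.buffer:
            continue
        # Assumption: the receiver's copy keeps the TTL seen at hand-over.
        receiver.buffer[msg_id] = Message(msg_id, msg.ttl)
        # Sender's copy: TTL reduced in inverse proportion to receiver centrality.
        msg.ttl = msg.ttl / receiver.degree_centrality
        sender.blocked.add(msg_id)
        forwarded.append(msg_id)
    return forwarded


if __name__ == "__main__":
    a = Node("a", degree_centrality=2.0)
    b = Node("b", degree_centrality=4.0)
    a.buffer["m1"] = Message("m1", ttl=100.0)
    print(on_encounter(a, b), a.buffer["m1"].ttl)
```

Expiring the shortened copies from the sender's buffer is left out of the sketch; it would be a periodic sweep over `buffer` that drops entries whose TTL has run out.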

Spray and Wait Routing
Spray and Wait is a popular DTN routing protocol in which the source node sprays a predefined number of message copies in the network. This may lead to unnecessary wastage of network resources. Guan et al. propose an enhanced version of the Spray and Wait routing protocol, called the social relationship based adaptive spray and wait routing protocol (SRAMSW) [25], that can dynamically adjust the number of copies according to network conditions and take forwarding decisions based on the social relationship with the encountered nodes. The social relationship between two nodes is measured by combining three social metrics, namely betweenness centrality, similarity centrality and the friendship value.
The similarity centrality is computed as the number of common neighbors between a node and the destination node.
The betweenness centrality is calculated as the number of shortest paths passing through a node for a given destination.
The friendship between two nodes is computed as the ratio of the number of connections between node i and node j to the total number of connections made by node i with other nodes in the network.
The weighted sum of all the above three social metrics depicts the social relationship of a node with the destination node.
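The three metrics and their combination can be sketched as follows. This is an illustration only: the weights are hypothetical, the metrics are assumed to be normalised to comparable ranges before they are summed, and the betweenness value is taken as an input rather than computed from shortest paths.

```python
from typing import Dict, Set, Tuple


def similarity(neighbors: Dict[str, Set[str]], node: str, dest: str) -> int:
    """Number of neighbours that a node has in common with the destination."""
    return len(neighbors.get(node, set()) & neighbors.get(dest, set()))


def friendship(contacts: Dict[str, Dict[str, int]], i: str, j: str) -> float:
    """Connections between i and j divided by all connections made by i."""
    total = sum(contacts.get(i, {}).values())
    return contacts.get(i, {}).get(j, 0) / total if total else 0.0


def social_relationship(betweenness: float, sim: float, friend: float,
                        weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3)) -> float:
    """Weighted sum of the three metrics (weights are an assumption)."""
    w_b, w_s, w_f = weights
    return w_b * betweenness + w_s * sim + w_f * friend


if __name__ == "__main__":
    neighbors = {"a": {"x", "y", "d"}, "d": {"x", "y", "z"}}
    contacts = {"a": {"d": 3, "x": 1}}
    sim = similarity(neighbors, "a", "d") / 3      # normalised by an assumed maximum
    fr = friendship(contacts, "a", "d")
    print(social_relationship(0.5, sim, fr, weights=(0.2, 0.3, 0.5)))
```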
SRAMSW uses a timeout retransmission mechanism. If a relay node holds a single copy of the message for longer than a period called the timeout threshold without meeting the destination, the message enters the retransmission phase. The message is then transmitted to the next encountered node based on that node's social relationship with the destination node. SRAMSW also uses an ACK mechanism to inform other nodes in the network about delivered messages, which can then be removed from their buffers. In addition, it uses a buffer management mechanism based on the amount of time a message has spent in a node's buffer and the number of times it has been sprayed.

HSBR
HSBR [26] is a hybrid algorithm that combines the routing methods of the basic MANET and DTN approaches in order to exploit the strengths of both. The authors analyze various situations, such as a MANET splitting into several networks due to mobility, and compare MANET and DTN routing, but their main contribution is the HSBR algorithm itself, together with its simulation and a comparison of the results with other traditional methods.
HSBR builds on an improved DSR protocol and the SBOR algorithm. The algorithm makes the following assumptions: a Personal Profile (PP) and a Contacts Profile (CP) exist on all devices. The personal profile is a collection of data in which every row holds an entry with its respective value. The plain text of each word in the PP and CP is converted to a hash value using the MD5 hash function. All of the nodes are placed in a particular location. Every node that receives an extended RREQ predicts its probability of reaching the receiver.
The extended RREQ, PP and CP are taken as inputs, and the designed data sets are analysed using the extended RREQ and PP. The source forwards the extended RREQ to its adjacent nodes for DTN routing. Then, as long as the message has not been lost by exceeding its time or hop limit before reaching the receiver, the following steps are carried out. The source attempts to find the receiving node using enhanced DSR (M-DSR). If the M-DSR route goes down and accessible paths have been identified, the maintenance procedure is applied so that backup paths are available. When no backup paths exist, the source uses semi-accessible paths: a new sender is selected, the remaining data is forwarded to it over a semi-accessible path, and the HSBR procedure restarts with this node acting as the source. If M-DSR finds no accessible connections, the source selects, among its adjacent nodes, the node with the highest probability of reaching the receiver, following the DTN approach. Either this node is picked as the new sender, or the data is kept in the storage of the current node until a favorable opportunity arises in the network. After a new sender is chosen, the hybrid MANET-DTN procedure is run again with that node as the source. If M-DSR does not fail at all, the transfer completes successfully through M-DSR alone.
In the evaluation, HSBR is compared with two other methods, DSR and SBOR. The first parameter is transfer success, which is classified into three categories: complete, partial and null. In terms of transfer success, HSBR is the most efficient, as the rate of complete message transfers is highest for HSBR and lowest for DSR. For partial and null message transfers, HSBR records lower rates than DSR and SBOR, respectively. The second parameter is the average delivery success: HSBR starts off well and degrades as node velocity increases, but it still performs better than SBOR and considerably better than DSR. The success percentage of HSBR likewise decreases with increasing velocity; SBOR shows the same pattern, but with much lower values at the same velocity.

ML-SOR
A multi-layer social network model is used to investigate the relationship between the social network layers with respect to node centrality, community structure, tie strength and link prediction. The aim is to analyze user behavior in multi-layer complex networks that combine both offline and online social relations. An approach called ML-SOR (Multi-layer Social Network-based Routing) [27] is proposed, which extracts social network information from this model to make routing decisions. To choose an efficient forwarding node, it computes the forwarding fitness of a node relative to an encountered node in terms of node centrality, tie strength and link prediction.

LASS
Nodes that share more common interests with the receiver and have higher local activity within the community are chosen as relays. A data forwarding scheme for MSNs, known as Local-Activity and Social-Similarity (LASS) [28], is designed around social similarity and local activity in dynamic weighted networks. The characteristics of LASS are: (1) LASS accounts for the differences in members' local activity within each community using weighted network models (local activity), instead of treating all members alike. (2) The overall activity of a node is expressed through a forwarding utility.
Finally, the social similarity between nodes is calculated as the inner product of their forwarding utilities, and the forwarding strategy follows the basic guideline of selecting the intermediate node that has greater social similarity with the receiver.
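The inner-product rule can be illustrated with a short sketch. It assumes, for illustration only, that a node's forwarding utility is a mapping from community identifier to that node's local activity in the community; the names and values are hypothetical.

```python
from typing import Dict

# Assumed representation: community id -> local activity of the node in it.
Utility = Dict[str, float]


def social_similarity(u: Utility, v: Utility) -> float:
    """Inner product of two forwarding utilities over shared communities."""
    return sum(u[c] * v[c] for c in u.keys() & v.keys())


def should_forward(carrier: Utility, encountered: Utility, receiver: Utility) -> bool:
    """Hand the message over only to a node more similar to the receiver."""
    return social_similarity(encountered, receiver) > social_similarity(carrier, receiver)


if __name__ == "__main__":
    receiver = {"c1": 0.9, "c2": 0.1}
    carrier = {"c2": 0.5}
    encountered = {"c1": 0.5, "c3": 0.4}
    print(should_forward(carrier, encountered, receiver))
```

Because the utilities are weighted by local activity, a node that is merely a member of the receiver's community scores lower than one that is highly active in it.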

EESR
Most social based routing protocols select popular nodes, i.e. nodes with a high social metric, as relays in the network. This strategy may overburden these nodes and lead to quick exhaustion of their energy, and hence to imbalanced energy consumption across the network. A novel energy efficient social based routing (EESR) protocol has been proposed in [29], which limits the number of message forwardings compared with conventional routing protocols that forward messages in an uncontrolled manner. The basic version of EESR uses a parameter called the amplification factor, denoted by amp_ratio, to take the forwarding decision. A node forwards a message to the encountered node if and only if the social metric value of the encountered node exceeds its own by a factor of amp_ratio. The amp_ratio is adapted according to the TTL value of the message. Initially amp_ratio is set to a large value, which limits the number of message forwardings, but as the message TTL decreases, amp_ratio is reduced so as to increase the probability of the message reaching the destination. The amp_ratio is adjusted as in equation 18. Here ttl denotes the current TTL value of the message, TTL0 denotes the initial TTL value, and the remaining parameter is a predefined constant. However, a node may sometimes never meet a node whose social metric is greater than amp_ratio times its own social metric, in which case the message may never reach its destination. To tackle this problem, the authors have also proposed an improved version of EESR: if a node carrying a message meets k nodes but still cannot forward the message, amp_ratio is gradually decreased so that it becomes easier to find a relay node, which increases the delivery ratio. This scheme can be integrated with any social based routing protocol that uses a social metric, such as node centrality, for taking forwarding decisions.
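The forwarding rule can be sketched as follows. The comparison against amp_ratio follows the description above, but the TTL schedule is purely an assumption: a linear interpolation between hypothetical bounds `amp_max` and `amp_min` stands in for equation 18, and the relaxation step for the improved version uses a hypothetical decrement.

```python
def should_forward(own_metric: float, peer_metric: float, amp_ratio: float) -> bool:
    """Forward only if the peer's metric exceeds amp_ratio times our own."""
    return peer_metric > amp_ratio * own_metric


def amp_ratio_for_ttl(ttl: float, ttl0: float,
                      amp_max: float = 4.0, amp_min: float = 1.0) -> float:
    """Assumed schedule: large ratio for a fresh message, small near expiry."""
    fraction = max(0.0, min(1.0, ttl / ttl0))
    return amp_min + (amp_max - amp_min) * fraction


def relax(amp_ratio: float, failed_encounters: int, k: int,
          step: float = 0.5, amp_min: float = 1.0) -> float:
    """Improved variant (assumed form): lower the ratio after k failed encounters."""
    if failed_encounters >= k:
        return max(amp_min, amp_ratio - step)
    return amp_ratio


if __name__ == "__main__":
    fresh = amp_ratio_for_ttl(100.0, 100.0)
    aged = amp_ratio_for_ttl(50.0, 100.0)
    # The same peer is rejected for a fresh message but accepted later on.
    print(should_forward(1.0, 3.0, fresh), should_forward(1.0, 3.0, aged))
```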

HiBOp
The HiBOp protocol is proposed essentially to overcome the inadequacy of MANET routing protocols, which rely on establishing a stable path between the source and destination before the forwarding process begins [30]. HiBOp, the history-based routing protocol for opportunistic networks, uses information about the user's personal data, behavior in a particular context and social acquaintances to drive the forwarding process. To implement this idea, HiBOp works in two steps: in the first step the nodes run context creation and management algorithms, and in the next step the resulting context is used to make forwarding decisions. HiBOp remembers past acquaintances, and these define both the current and the historic context of every node. HiBOp makes use of extensive information regarding behavior, system and network interfaces, which is stored in Identity Tables (ITs). During the neighbour discovery phases, the ITs are exchanged among the nodes to learn about the environment. The ITs of the current neighbours define the Current Context (CC) of the node. The basic component of HiBOp is the History Table, which stores the values of the ITs of the neighbours that the node has met in the past. The accuracy of the HiBOp depends on the historical context information of the current role of the node. An interim data structure called the Repository Table is used to dynamically update the History Table with the contents of the current context. The Continuity Probability, Redundancy and Heterogeneity are calculated for every value in the Repository Table. HiBOp controls message replication by allowing only the sender to create more than one replica of a message, which is an advantage of this protocol over other protocols. The forwarding process of HiBOp includes three phases: emission, forwarding and delivery.
The drawback of this protocol is that it has a limited number of candidate forwarder nodes. This can be overcome by modifying the operations to include nodes that are not members of the receiver's community in the forwarding process without violating the privacy guarantee. The simulation results show that context-aware forwarding based on users' social behavior is an effective approach for opportunistic networks.

Discussion and Scope for Future Work:
From the study of different social based routing protocols, the following observations have been made. Social based routing protocols mainly follow one of two philosophies. The first type mainly uses meeting statistics between nodes, such as the meeting frequency, duration of meetings and intermeeting time. These meeting statistics may be used to compute utility metrics or to form community structures within the network, which are then useful in taking forwarding decisions. However, this requires nodes to store in their buffers the contact history with the nodes they have met, and these metrics become obsolete fast if they are not updated in time. The second type of social based routing protocols utilizes the similarity of social characteristics between nodes, such as commonality in their nationality, interests or language, for taking routing decisions. A hybrid technique that combines meeting history statistics with the social interests of nodes may result in better routing decisions, as shown in the ChitChat protocol [33]. Using a single social metric in the algorithm often leads to delay in delivery because of the unpredictable network graph of a DTN. Hence, social-based algorithms that use multiple social metrics help to improve the routing performance of social-based DTN routing protocols.
Community based routing protocols are an important class of social based routing protocols. They rely on community formation, which is an expensive task and requires a lot of information exchange.
Moreover, the works that have used community detection techniques do not mention the frequency at which these community detection algorithms are run. None of the works have analyzed the overhead of the community formation process. Dynamic community detection algorithms that adapt to the changing topology of DTN scenarios could reduce this computation overhead.
Application of dynamic community detection algorithms in DTN scenarios is an important area which needs further exploration [9].
Social based routing protocols, like any other routing protocols, are susceptible to routing attacks in which malicious nodes present themselves as good relays by falsely advertising a high social metric value. A node may advertise a very high centrality value so that other nodes route their data through it, and it may later drop the data or use it for other malicious activities. There is also a need to incentivize nodes with high popularity, which are usually overburdened. Most incentive schemes require monitoring node behavior to check whether nodes perform their duty in the routing process correctly. Due to the dynamic nature of DTN topology, this may be a difficult task, since dropping is also normal behavior under buffer constraints, and restricted forwarding is also normal behavior in DTNs due to limited bandwidth and short contact durations. Although a few works [31,32] have addressed this problem, it needs to be studied further in the context of PSNs. The design of incentive mechanisms for social based routing protocols is an important area of research.

Conclusion
This paper presents a comprehensive survey of social based routing protocols, which are an important category of utility based routing for delay tolerant networks. This class of protocols makes extensive use of the concepts of social network analysis to select appropriate relays for a message. However, adapting these concepts to the dynamic environments of DTNs, which experience frequent disconnections and network partitions, is a challenging problem.