Spectral Efficient and Energy Aware Clustering in Cellular Networks

The current and envisaged increase of cellular traffic poses new challenges to mobile network operators (MNOs), who must densify their radio access networks (RANs) while keeping capital expenditure and operational expenditure low to ensure long-term sustainability. In this context, this paper analyzes optimal clustering solutions based on device-to-device communications to mitigate, partially or completely, the need for MNOs to carry out extremely dense RAN deployments. Specifically, a low-complexity algorithm that enables the creation of spectral efficient clusters among users from different cells, denoted as enhanced clustering optimization for resources' efficiency (eCORE), is presented. Due to the imbalance between uplink and downlink traffic, a complementary algorithm, known as clustering algorithm for load balancing (CaLB), is also proposed to create non-spectral efficient clusters when they result in a capacity increase. Finally, in order to alleviate the energy overconsumption suffered by cluster heads, the clustering energy efficient algorithm (CEEa) is designed to manage the tradeoff between the capacity enhancement and the early battery drain of some users. Results show that the proposed algorithms increase the network capacity and outperform existing solutions, while CEEa is able to handle the cluster heads' energy overconsumption.


Georgios Kollias, Student Member, IEEE, Ferran Adelantado, Member, IEEE, and Christos Verikoukis, Senior Member, IEEE

I. INTRODUCTION
The envisaged increase of the cellular traffic, which according to [1] is expected to reach 30.6 exabytes per month by 2020 at a compound annual growth rate (CAGR) of 53%, imposes new capacity challenges on the fifth generation (5G) cellular networks. Specifically, this increasing trend in data traffic demand will force 5G networks to meet a 1000× capacity increase, mainly based upon three pillars: the improvement of the spectral efficiency, the allocation of new spectrum bands, and the densification of the RAN.

Manuscript received October 18, 2016; revised March 29, 2017; accepted May 22, 2017. Date of publication June 16, 2017; date of current version October 13, 2017. This work was supported by the MITN Project CROSSFIRE (PITN-GA-2012-317126), in part by the Spanish Ministry of Economy and the FEDER regional development fund under SINERGIA (TEC2015-71303-R), and in part by the CellFive (TEC2014-60130-P) projects as well as the AGAUR project (2014 SGR 1551). The review of this paper was coordinated by Prof. G. Mao. (Corresponding author: Georgios Kollias.) G. Kollias is with the Iquadrat Informatica, Barcelona 08009, Spain (e-mail: gkollias@iquadrat.com).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Focusing on the densification of the RAN, the research community has proposed the dense deployment of Small Cells (SCs) as an enabler of the capacity increase required to meet the expected traffic demand. However, such densification of the RAN poses technological challenges, such as interference management [3], [4], as well as economic considerations [5].
As mobile devices are the main contributors to the traffic growth, high capacity demand is intrinsically linked to the boost in the number of mobile devices. For instance, based on [1], the number of mobile devices and connections will globally reach 11.6 billion by 2020. Therefore, the need for denser RAN deployments runs in parallel with the actual and envisaged growth of the density of mobile devices. In this context, where the densification of the network is jeopardized by high deployment costs, we propose the exploitation of the cooperation among mobile devices (through Device-to-Device communications, D2D [6]) as a cost-efficient solution to expand the RAN when and where needed. The inclusion of mobile devices as an expansion of the RAN can provide high spatial diversity and improve the spectral efficiency of the whole network. Although cooperation among Base Stations (BSs) has already been proposed as a means to increase the spectral efficiency (e.g., [7]), the cooperation among devices proposed in the sequel opens up new opportunities and challenges for dynamically adapting the network to traffic needs.
The rest of the paper is organized as follows. The State of the Art and contributions are detailed in Section II. In Section III the system is modelled as an optimization problem, and two clustering algorithms, namely enhanced Clustering Optimization for Resources' Efficiency (eCORE) and Clustering algorithm for Load Balancing (CaLB), are presented. Section IV analyses the energy consumption and proposes a Clustering Energy Efficient algorithm (CEEa) to prevent cluster heads from early battery drain. Numerical results are presented in Section V and conclusions in Section VI.

II. STATE OF THE ART AND CONTRIBUTIONS
The need to improve spectrum utilization, overall throughput and energy consumption in cellular networks has stimulated research in the D2D field over the last years. In short, D2D communications are expected to become the basis to provide direct connectivity between users (with or without the support of network infrastructure), enable devices to play the role of relay in two-hop communications, and allow the multicast of common content to a multicast group [8].

0018-9545 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
Regarding the direct connectivity between users, Feng et al. proposed in [9] a resource allocation framework to optimize the spectral efficiency of the network when a set of D2D pairs operate over the same frequency as the cellular users. In this study, however, D2D pairs are never connected to the cellular network and therefore have only two options: transmit in D2D mode or remain silent. Similarly, [10] analyses the joint power control and frequency reuse of D2D pairs in the same scenario presented in [9]. Also in line with [9] and [10], J. Huang et al. took in [11] a significant step towards more efficient D2D communications by expanding these communications from intra-cell environments to inter-cell environments. The proposal, which is based on game theory, clearly shows the potential of this inter-cell cooperation. Yet, the scenario is restricted to a use case with disjoint sets of D2D and cellular users.
The works in [12]-[17] study the performance of D2D communications in multicast groups, where all users download a common content from the BS via a cluster head user. It is shown that a better efficiency in the resources' usage can be achieved in these scenarios, although the gain is bounded by the lowest quality link between the cluster head and the rest of users of the multicast group. In detail, the authors in [12] derive expressions to select the optimal number of D2D retransmitters in a multicast group, and [13] proposes a Conventional Multicast Scheme (CMS) to decide whether a user should be served by the BS or by the cluster head.
Similarly, Meshgi et al. [14] maximize the throughput in a single cell scenario with multicast D2D groups by proposing a heuristic resource allocation algorithm that achieves near optimal performance. In [15] the authors address the multicast clustering by setting up a Primary Cluster Head (PCH) and a Secondary Cluster Head (SCH). The PCH and the SCH are selected based on their residual energy and the received Signal to Interference plus Noise Ratio (SINR). In [16] the authors analyse a set of different strategies for the establishment of multicast clusters. The work shows that D2D-based multicast clustering can increase the system capacity, although it is sensitive to parameters such as the clusters' dimension. Finally, key features required to support network controlled D2D-based multicasting are analysed in [17].
Although the works described so far address the problem of D2D clustering in cellular networks, they are constrained by two assumptions: i) only downlink traffic is considered; ii) the same content is delivered to all users in the cluster/group.
Cooperative D2D moves a step forward in [18], where the authors formulate the clustering problem as the maximization of the throughput constrained by energy efficiency. The proposed algorithm outperforms the results obtained without clustering, but it neglects two important aspects: i) the mobility, which impacts on the quality of the links and on the role played by each user; ii) the energy consumption of the relay/cluster head could be higher in idle state than in transmission state.
In contrast with the State of the Art, we propose clustering algorithms aimed at improving the resources' utilization efficiency in scenarios where uplink (UL) and downlink (DL) traffic are considered in an LTE-A FDD system. Our algorithms are based on a previous work [19], where the clustering algorithm CORE was proposed. CORE restricted the creation of spectral efficient clusters to users within the same cell, thus limiting the achieved gains in dense Heterogeneous Networks (HetNets). In order to go beyond this constraint, we propose a new algorithm, namely enhanced Clustering Optimization for Resources' Efficiency (eCORE), that extends clustering to multi-cell deployments. Specifically, eCORE is based on the cooperation among devices by leveraging the D2D communication concept, initially introduced in the framework of LTE-A to support Proximity-based Services (ProSe) for public safety [6]. In our solution, the mobile devices create spectral efficient clusters with a single cluster head (CH) characterized by good quality links with the serving BS and with the rest of cluster members. In eCORE, clusters can be created among users from different cells as long as they result in a decrease of the required resources. The cluster head is responsible for receiving and forwarding packets from/to the BS and the cluster members. As traffic is more intense in the DL and D2D communications are usually carried out over UL bands to limit the interference caused to neighbouring users [9], [11], the proposal benefits from the imbalance between UL and DL traffic intensity and from the high channel gain of D2D communications to increase the capacity of the network. Although the dynamic adaptation to the imbalance between UL and DL traffic has been addressed in [20], [21] for TDD HetNets, the problem is more challenging in FDD systems, where transferring traffic from DL to UL is more complex.
Following this rationale, it is shown that the capacity of the network can be further increased by establishing non-spectral efficient clusters that balance UL and DL traffic. This is the objective of the Clustering algorithm for Load Balancing (CaLB), the second proposed algorithm. CaLB shows that in some cases clustering can be beneficial despite increasing the number of required spectrum resources. Yet, the proposed solutions present challenges in terms of energy consumption of the cluster head, which are studied and addressed along the paper by complementing eCORE with the Clustering Energy Efficient Algorithm (CEEa). CEEa limits the cluster head energy overconsumption, thus minimizing the disincentive in the creation of clusters. Both CaLB and CEEa are designed to be executed after eCORE to improve its performance, but not to be implemented in a standalone manner.
In a nutshell, the three clustering proposals are a cost-efficient RAN densification solution based on D2D for FDD-LTE networks, and the work's contributions are the following:
1) A RAN densification solution based on D2D clustering in the framework of FDD LTE-A is presented to improve the spectral efficiency. The algorithm, which is an extension of CORE [19] and is denoted by eCORE, exploits the spatial diversity provided by the high density of users and the imbalance between UL and DL traffic. Contrary to CORE, eCORE enables the creation of inter-cell clusters.
2) A load balancing clustering algorithm, namely CaLB, is proposed to increase the capacity of the network. In contrast with eCORE, which creates spectral efficient clusters, CaLB complements eCORE by establishing non-spectral efficient clusters. The capacity gain results from the UL and DL load balancing.
3) We propose a complementary algorithm to eCORE, known as CEEa, that compensates the energy overconsumption suffered by cluster heads in eCORE. CEEa benefits from mobility and forces reclustering by limiting the time during which users play the role of cluster head to reduce the energy overconsumption.

III. CLUSTERING PROPOSAL
The proposed clustering solutions described in the sequel (eCORE, CaLB and CEEa) are all based on a set of premises: i) each cluster has a single cluster head; ii) each user/device can be directly served by a BS, play the role of cluster head, or join a cluster to be served by a BS through the corresponding cluster head, but no more than a single role can be played simultaneously; iii) intra-cluster communications are D2D transmissions carried out in the UL band to limit the interference [9], [11]. In FDD, the creation of a cluster translates into a transfer of resources' utilization from the DL band to the UL band, which is usually underutilized. For instance, the DL traffic of a clustered user is first served with DL resources (from the BS to the cluster head) and subsequently with UL resources in the D2D communication from the cluster head to the cluster member. If we assume that the channel gain from the BS to the cluster head is higher than the channel gain from the BS to the rest of clustered users, the required DL resources are reduced. Although the three algorithms share a set of premises, they differ in their objectives. In eCORE, clustering is aimed at reducing the number of required resources (Section III-E). In CaLB, the creation of a cluster must decrease the load of the DL (Section III-F). Finally, in CEEa the energy overconsumption of cluster heads must be compensated (Section IV-C).
This Section is focused on the algorithms that improve the capacity of the network, i.e., eCORE and CaLB. The Section first describes a set of use cases where clustering can be applied (Section III-A). Then, the system model is stated in Section III-B and the general expressions of the required resources in UL and DL are developed in Section III-C. Based on these expressions, the optimal clustering problem aimed at minimizing the total number of resources is formalized in Section III-D. Finally, eCORE is proposed in Section III-E as a low complexity algorithm and CaLB is introduced in Section III-F to further enhance the capacity.

A. Use Cases
The clustering proposal addresses three use cases: the service of user equipments (UEs) in coverage gaps, the enhancement of spectral efficiency and the load balancing.
1) Service in coverage gaps: ... UE4-BS2 is good enough, the clustering of UE4 (cluster head) and UE5 can guarantee the service of the latter.
2) Spectral efficiency enhancement (Fig. 1(c)): Clustering UE5 and UE3 with UE4 (the cluster head) increases spectral efficiency if: i) the quality of links UE3-UE4, UE5-UE4 and UE4-BS2 is significantly better than the quality of links UE5-BS2 and UE3-BS2; ii) downlink is highly loaded while uplink is less loaded.
3) Load balancing (Fig. 1(d)): If BS2 is highly loaded and BS1 is less loaded, the clustering of UE3 with UE2 (cluster head) can balance the load of BS2 to BS1.

B. System Model
The network is composed of a set of FDD-LTE BSs (macro eNBs and/or SCs), namely $\mathcal{B}$, covering the scenario and serving a set of users, denoted by $\mathcal{U}$. Each user $i \in \mathcal{U}$ is connected to a BS $k \in \mathcal{B}$ according to any of the existing cell association algorithms (e.g., algorithms based on Reference Signal Received Power (RSRP) with or without Cell Range Expansion). The set of users connected to BS $k$ is referred to as $\mathcal{U}_k$. As each user is served by a single BS at a time, the sets $\mathcal{U}_k$ are disjoint. Each user $i$ is characterized by a traffic profile $\pi_i$, composed of the average transmission rate in the DL, $R^d_i$, and in the UL, $R^u_i$; in general, UL and DL traffic are unbalanced.

In LTE-A the transmission rate between two nodes depends on the selected Modulation and Coding Scheme (MCS), which is determined by the maximum allowed bit error rate (BER) and the SINR. Accordingly, the number of bits transmitted by user $i$ during a subframe time $T_s = 1$ ms, defined as Transport Block Size (TBS), can be approximated by an attenuated and truncated form of the Shannon bound. Thus, the TBS of a transmission from $i$ to $j$ in the band $v$ ($v = u$ if the transmission is in the UL band and $v = d$ if it is in the DL band) is approximated as in (1), where $r$ is the attenuating factor, $W$ is the bandwidth of a Physical Resource Block (PRB) and $\gamma^v_{i,j}$ is the SINR received at $j$ when data is transmitted by $i$. If the transmitter is a UE and the receiver is a BS, $i \in \mathcal{U}$ and $j \in \mathcal{B}$; if the transmitter is a BS and the receiver is a UE, $i \in \mathcal{B}$ and $j \in \mathcal{U}$; finally, if both transmitter and receiver are users in D2D mode, $i, j \in \mathcal{U}$.
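The TBS approximation described in this subsection can be illustrated with a minimal sketch. It assumes one plausible reading of "attenuated and truncated Shannon bound": the Shannon spectral efficiency is capped and then scaled by the attenuating factor. The values of `r`, `max_se` and `min_sinr` are hypothetical placeholders, not values taken from the paper; only the 180 kHz PRB bandwidth and the 1 ms subframe are standard LTE figures.

```python
import math

PRB_BANDWIDTH_HZ = 180e3   # bandwidth W of one LTE PRB (12 subcarriers x 15 kHz)
SUBFRAME_S = 1e-3          # subframe duration T_s = 1 ms


def tbs_bits(sinr, r=0.6, max_se=4.0, min_sinr=0.0):
    """Bits carried by one PRB in one subframe (assumed form, not the paper's exact equation).

    sinr     -- linear SINR (not dB)
    r        -- attenuating factor (hypothetical value)
    max_se   -- cap on the spectral efficiency in bps/Hz (hypothetical truncation)
    min_sinr -- SINR at or below which nothing is decoded (hypothetical truncation)
    """
    if sinr <= min_sinr:
        return 0.0
    spectral_eff = min(math.log2(1.0 + sinr), max_se)
    return r * PRB_BANDWIDTH_HZ * SUBFRAME_S * spectral_eff
```

A link with better SINR carries more bits per PRB until the cap is reached, which is the property the clustering algorithms exploit when they route traffic through a well-placed cluster head.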

C. Resources Required With and Without Clustering
The spectral efficiency is measured in bps/Hz. Therefore, the enhancement of the spectral efficiency is equivalent to the minimization of the PRBs required to serve a given traffic. Based on the definitions stated above, the expected number of PRBs required in the scenario to serve all the users in the UL ($N^u$) and in the DL ($N^d$) can be expressed as in (2) and (3), where $N^d_k$ and $N^u_k$ are the expected number of PRBs per subframe required by base station $k$ (eNB or SC) in DL and UL. For simplicity, an auxiliary term $\phi^v_{i,j}$, used in the sequel, is also defined.
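As a hedged illustration of how the per-BS PRB counts can be obtained, the sketch below assumes that a user with average rate R needs R·T_s/TBS PRBs per subframe, where TBS is the number of bits one PRB carries on that link. This is an assumed reading of the text, and the dictionary keys (`bs`, `r_dl`, `r_ul`, `tbs_dl`, `tbs_ul`) are hypothetical names chosen for the example.

```python
T_S = 1e-3  # subframe duration in seconds (1 ms, as stated in the text)


def prbs_needed(rate_bps, tbs_bits):
    """Expected PRBs per subframe to carry rate_bps when one PRB carries tbs_bits."""
    return rate_bps * T_S / tbs_bits


def prbs_per_bs(users):
    """Aggregate expected (DL, UL) PRBs per base station.

    users -- list of dicts with hypothetical keys:
             bs, r_dl, r_ul (bps), tbs_dl, tbs_ul (bits per PRB)
    """
    totals = {}
    for u in users:
        dl, ul = totals.get(u["bs"], (0.0, 0.0))
        totals[u["bs"]] = (dl + prbs_needed(u["r_dl"], u["tbs_dl"]),
                           ul + prbs_needed(u["r_ul"], u["tbs_ul"]))
    return totals
```

Summing the per-BS values over all base stations gives the network-wide totals for the DL and the UL.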
Let us consider that groups of users can create clusters. Each cluster $u$ is composed of cluster member users, among which a single user plays the role of cluster head. Hereafter, the set of users in cluster $u$ will be denoted by $\mathcal{C}_u$, and the cluster head by $h_u \in \mathcal{C}_u$. The cluster head $h_u$ is responsible for receiving the DL traffic of all cluster members from the BS and forwarding it to the corresponding cluster member. Likewise, for the UL traffic, the cluster head receives the traffic from the rest of the cluster members and forwards it to the BS. We will denote the set of all the clusters in the scenario by $\mathcal{C} = \bigcup_u \mathcal{C}_u$. Intra-cluster communications are always carried out over the UL band to minimize the interference caused to the users outside the cluster. In real FDD networks, BSs are always full-duplex; conversely, user devices can be half-duplex (Half-Duplex FDD devices) or full-duplex (Full-Duplex FDD devices).¹

¹ The term full-duplex is defined as the ability of a node to transmit and receive simultaneously over UL and DL. The ability to transmit and receive simultaneously over the same band is not considered in this work.

We define the set of cluster heads as $\mathcal{H} = \{h_u\}_{\forall u}$, and the set of cluster heads connected to BS $k$ as $\mathcal{H}_k = \mathcal{H} \cap \mathcal{U}_k$. Accordingly, the expected number of PRBs required with clusters in the DL band ($\tilde{N}^d$) and in the UL band ($\tilde{N}^u$) are written as in (4) and (5), where $\tilde{N}^d_k$ and $\tilde{N}^u_k$ are the expected number of PRBs required by base station $k$ in the DL and UL; (5) includes a term for the transmissions from cluster heads to BSs. As observed in (5), intra-cluster communications do not interfere with UL communications from the cluster head to the BS (they are not simultaneous). Moreover, the number of PRBs required in the scenario is a function of the SINR, which in turn depends on the cell association algorithm. Yet, (2)-(5) are valid for a given SINR level and regardless of the cell association algorithm.
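The transfer of load from the DL band to the UL band when a user joins a cluster can be made concrete with a small sketch. It follows the description in this subsection: the member's DL traffic uses DL PRBs only on the BS-to-head hop, while the head-to-member hop, the member-to-head hop and the head-to-BS hop all use UL PRBs. The function and parameter names are hypothetical, and the PRB count per hop (rate · T_s / TBS) is an assumed reading of the text.

```python
T_S = 1e-3  # subframe duration in seconds


def prbs(rate_bps, tbs_bits):
    """PRBs per subframe needed to carry rate_bps when one PRB carries tbs_bits."""
    return rate_bps * T_S / tbs_bits


def clustering_delta(r_dl, r_ul, tbs_bs_to_user, tbs_user_to_bs,
                     tbs_bs_to_head, tbs_head_to_bs, tbs_d2d):
    """Return (delta_dl, delta_ul): increase in required PRBs if the user joins the cluster."""
    # Without clustering: the user is served directly by the BS.
    dl_before = prbs(r_dl, tbs_bs_to_user)
    ul_before = prbs(r_ul, tbs_user_to_bs)
    # With clustering: only the BS -> head hop uses the DL band.
    dl_after = prbs(r_dl, tbs_bs_to_head)
    ul_after = (prbs(r_dl, tbs_d2d)            # head -> member (D2D, UL band)
                + prbs(r_ul, tbs_d2d)          # member -> head (D2D, UL band)
                + prbs(r_ul, tbs_head_to_bs))  # head -> BS (UL band)
    return dl_after - dl_before, ul_after - ul_before
```

With good D2D and head-to-BS links, the DL saving outweighs the extra UL usage, so the sum of the two deltas is negative and the cluster is spectral efficient.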

D. Optimal Clustering for Spectral Efficiency
The aim of the clustering technique presented herein is the minimization of the spectral resources utilization, i.e., $\tilde{N} = \tilde{N}^u + \tilde{N}^d$. The minimization of the required resources is an association problem, where a user must be associated to a BS directly or through a cluster head. Let us define the association matrix $X \in \{0,1\}^{|\mathcal{U}| \times |\mathcal{B}|}$, where $|\cdot|$ is the cardinality operator of a set, and the elements of $X$ are $x_{i,k} = 1$ if user $i$ is directly served by BS $k$ and $x_{i,k} = 0$ otherwise. We define $Y \in \{0,1\}^{|\mathcal{U}| \times |\mathcal{U}|}$ as the intra-cluster association matrix, with the elements of $Y$ such that $y_{j,i} = 1$ if user $j$ is connected to a BS through user $i$ ($i$ is the cluster head) and $y_{j,i} = 0$ otherwise. Using matrices $X$ and $Y$, the total number of required resources can be expressed as in (6), and the optimization problem is formulated as in (7).

The problem in (7) is an integer (binary) linear programming (ILP) problem (7a), where UEs can be served by a BS or a cluster head (7b) and at least one UE is connected to a BS (7c). A cluster can only be created if the cluster head is directly connected to a BS (7d), since multi-hops are not allowed within the cluster. A clustered user can either be a cluster head or be associated to a cluster head (7e).
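The structural constraints on the association matrices can be checked with a short sketch. It encodes only what the prose states (each user has exactly one serving node, at least one user is directly connected to a BS, and a cluster head must itself be directly connected to a BS); the exact algebraic form of (7b)-(7e) is not reproduced here, and `is_feasible` is a hypothetical helper name.

```python
def is_feasible(X, Y):
    """Check the single-hop association structure described in the text.

    X[i][k] == 1 if user i is directly served by BS k.
    Y[j][i] == 1 if user j is served through cluster head i.
    Both are lists of lists containing 0/1 entries.
    """
    n_users = len(X)
    # Each user is served by exactly one node: one BS or one cluster head.
    for i in range(n_users):
        if sum(X[i]) + sum(Y[i]) != 1:
            return False
    # At least one user is directly connected to a BS.
    if sum(sum(row) for row in X) < 1:
        return False
    # A cluster head must be directly connected to a BS (no multi-hop).
    for j in range(n_users):
        for i in range(n_users):
            if Y[j][i] == 1 and sum(X[i]) != 1:
                return False
    return True
```

A brute-force search over all binary X and Y that pass this check is only practical for a handful of users, which motivates a low complexity heuristic.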

E. Enhanced Clustering Optimization for Resources Efficiency (eCORE)
As 0-1 ILP problems are NP-hard [22], (7) is NP-hard. Therefore, a low complexity algorithm ($O(n^3)$), namely enhanced Clustering Optimization for Resources' Efficiency (eCORE), is presented. Based on the expressions derived in Section III-C, some results can be enunciated.
Lemma 1: The number of resources required to serve a user $i \in \mathcal{U}_k$ is reduced when it joins a cluster with cluster head ... (3) and PRBs required in (4)-(5).
Lemma 2: Given two users $i \in \mathcal{U}_k$ and $j \in \mathcal{U}_q$, the clustering gain $G_{i,j}$ when $j$ is the cluster head is defined as in (8). The set of possible cluster heads of user $i$ is defined as $\mathcal{Y}_i = \{j : G_{i,j} > 0\}$. For two users $i \in \mathcal{U}_k$ and $j \in \mathcal{U}_q$, if $\mathcal{Y}_i = \{j\}$ and $\mathcal{Y}_j = \emptyset$, then $i$ and $j$ will create a cluster in which $j$ is the cluster head. Conversely, if $\mathcal{Y}_j = \emptyset$, $j \in \mathcal{Y}_i$ and $|\mathcal{Y}_i| > 1$, $i$ and $j$ will create a cluster where $j$ plays the role of cluster head if $G_{i,j} > G_{j,n} + G_{i,t}$ for $\forall n \in \mathcal{Y}_j$ and $\forall t \in \mathcal{Y}_i$.
Proof: Using Lemma 1, the clustering gain achieved by a cluster equals the aggregation of clustering gains of all cluster members. Thus, Lemma 2 can be derived from (2)-(5).
According to Lemmas 1 and 2, clustering is not limited to users within the same cell. A cluster may be created by users served by different cells (eNBs and/or SCs). The proposed eCORE, described in Algorithm 1, is based on Lemmas 1 and 2 and is aimed at creating clusters that improve the total spectral efficiency. The key parameter of the algorithm is the clustering gain ($G_{i,j}$) defined in Lemma 1. eCORE starts by computing the clustering gains for the different UEs and initializing, for each user $i$, the set $\mathcal{Y}_i$ of users $j$ that would result in a positive clustering gain, i.e., $G_{i,j} > 0$ (line 1). As mentioned, eCORE only considers single-hop intra-cluster communications to limit complexity and signalling. Accordingly, the term conflict is used in the sequel to describe situations where a user $i$ has a positive clustering gain with a user $j$ ($G_{i,j} > 0$) that, in turn, has a positive clustering gain with a third user $n$ ($G_{j,n} > 0$). In these conflicting situations, either user $j$ becomes the cluster head of user $i$ or user $n$ becomes the cluster head of user $j$, but not both. Both situations are enunciated in Lemma 2 and implemented in Algorithm 1. Initially, eCORE clusters users without conflicts (lines 3-17). In the second part, eCORE resolves the unsolved conflicts, stored in the set $\mathcal{A}$ (see Alg. 1), by selecting the option that provides the highest clustering gain (lines 19-32).
The computational complexity is reduced by dividing the problem into two steps: the first step (lines 1-17) discards infeasible clustering solutions, whereas the second step (lines 18-32) resolves conflicting cases. The first step identifies potential cluster heads by checking whether any of the associations would result in a reduction of the required resources. If not, that association is discarded (it is infeasible for a spectral efficient cluster). In practice, the identification of potential cluster heads does not require a comparison of all users, since users farther than the D2D range can be discarded at the beginning. In a nutshell, eCORE is an algorithm that checks which clusters can reduce the overall required PRBs. In this way, not only is the overall number of PRBs reduced, but the traffic imbalance is also decreased by transferring load from the DL to the UL.
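To illustrate the idea of gain-driven, single-hop clustering, the following is a simplified greedy sketch. It is not Algorithm 1 of the paper: instead of the two-step procedure with an explicit conflict set, it sorts all candidate pairs by gain and accepts a pair only if it keeps every cluster single-hop. The input format (a dictionary mapping `(member, head)` to a precomputed gain) and the name `greedy_clusters` are assumptions made for the example.

```python
def greedy_clusters(gain):
    """Greedy single-hop clustering from precomputed gains (simplified stand-in for eCORE).

    gain -- dict mapping (member, head) -> clustering gain in PRBs
    Returns a dict member -> head.
    """
    # Keep only associations with positive gain, best first.
    candidates = sorted(
        ((g, i, j) for (i, j), g in gain.items() if g > 0 and i != j),
        reverse=True,
    )
    head_of = {}   # member -> cluster head
    heads = set()  # users already acting as cluster head
    for g, i, j in candidates:
        if i in head_of or i in heads:
            continue  # i is already a member or a head: one role per user
        if j in head_of:
            continue  # j is a cluster member, so it cannot head a cluster (single hop)
        head_of[i] = j
        heads.add(j)
    return head_of
```

In the conflict case where user 1 could join user 2 while user 2 could join user 3, this sketch simply keeps whichever association has the higher gain, which mirrors the intent of selecting the option with the highest clustering gain without reproducing the exact rule of Lemma 2.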

F. Clustering Algorithm for Load Balancing (CaLB)
eCORE takes advantage of the UL and DL traffic imbalance to decrease the DL usage at the expense of an increase of the UL usage (only if the DL usage decrease is higher than the UL usage increase). This fact limits the maximum capacity. Let us define the maximum number of PRBs allocated in the DL and in the UL to BS $k$ as $N^{d,\max}_k$ and $N^{u,\max}_k$. The saturation point of the cell (when the cell capacity reaches its limit) is defined as the situation in which either the DL or the UL cannot serve more traffic. Mathematically, the saturation is met when ..., where ...

Proof: We define the number of available PRBs in the limiting band (the most loaded band) as $A = \min\{N^{d,\max}_k - N^d_k,\ N^{u,\max}_k - N^u_k\}$. Knowing that, by definition, $\Delta N^u_k > 0$ and $\Delta N^d_k < 0$, it can be found that (9) is not true. Therefore, $A = (N^{d,\max}_k - N^d_k)$ must be true (the downlink is more loaded). Rearranging (9) we obtain that ..., and therefore the number of available resources in the limiting band after clustering is ... As $\Delta N^d_k < 0$, the number of available resources after clustering is larger than $A$.

Lemma 4: Given two users $i, j \in \mathcal{U}_k$, where user $i$ is not clustered and user $j$ is a cluster head, the number of PRBs required in the DL decreases when $i$ joins the cluster headed by $j$ if $\Delta N^d_k(i, j) < 0$. If $j$ is not clustered, and given two additional users $m$ and $n$ that minimize $\Delta N^d_k(i, m)$ and $\Delta N^d_k(j, n)$, respectively, user $i$ must join the cluster headed by $j$ to maximize the reduction in the required PRBs if $x_{i,j} \leq 0$. Conversely, if $x_{i,j} > 0$, users $i$ and $m$ should create a cluster and users $j$ and $n$ should create a second cluster.
Proof: The first case is trivial, since $\Delta N^d_k(i, j)$ is, by definition, the increase in the downlink PRBs. If it is negative, the number of required PRBs decreases. If user $j$ is not a cluster head (second case), user $j$ can become the cluster head of user $i$ or the cluster member of an alternative cluster. In that case, if $n = \arg\min_q \{\Delta N^d_k(j, q)\}$ and $m = \arg\min_q \{\Delta N^d_k(i, q)\}$, the maximum overall reduction of PRBs would be $\Delta N^d_k(i, m) + \Delta N^d_k(j, n)$. Therefore, the maximum reduction of the PRBs in the downlink would result from clustering $i$ and ...

² According to (8), the gain is defined as the reduction of the required PRBs, whereas $\Delta N^u_k$ and $\Delta N^d_k$ are defined as the increase of the required PRBs.

In order to further extend the capacity provided by eCORE, CaLB is proposed, mainly based on Lemmas 3 and 4. It is aimed at improving the capacity when no additional spectral efficient clusters can be created, the DL reaches the capacity limit and the UL is still unloaded (see Alg. 2). Therefore, CaLB is always run after the execution of eCORE. The inputs of CaLB are the set of users and clusters created by eCORE and two load thresholds, $n^d_{\min}$ and $n^u_{\min}$ for the DL and UL, respectively. These thresholds are used to determine whether the DL and UL of a BS are loaded or not: if the number of available PRBs in the DL, denoted in Alg. 2 by $n^d$ (line 2), is below $n^d_{\min}$, the DL of the BS is loaded; if the number of available PRBs in the UL, denoted by $n^u$ (line 2), is higher than $n^u_{\min}$, the UL of the BS is considered unloaded. Only in this case does each BS execute CaLB and trigger the clustering procedure (line 2). The algorithm establishes the clusters that reduce the load in the DL, by joining users to existing clusters or by establishing new clusters. To do that, all possible pairs of users (defined as $Q_k$ in Alg. 2) are ordered according to the reduction that they would cause in the number of required DL PRBs if clustered (i.e., $\Delta N^d_k(i, j)$).

There are constraints in this clustering process to prevent spectral efficient clusters (established by eCORE) from being destroyed. First, the cluster head of an existing cluster can serve new users by enlarging the cluster; that is, unclustered users can join existing clusters. Second, a cluster head will not leave an existing cluster to become the cluster member of a new cluster. Finally, the clustering of a user must always result in a decrease of the DL resources; therefore, the channel gain to the BS is higher for the cluster head than for the rest of cluster members ($\phi^d_{k,h_u} < \phi^d_{k,i}$ when user $i$ joins a cluster head $h_u \in \mathcal{U}_k$). Based on these constraints and on Lemma 4, CaLB favours the clustering until the number of available PRBs in the DL is larger than $n^d_{\min}$ or the number of available PRBs in the UL reaches the minimum, $n^u_{\min}$. To sum up, CaLB resumes the clustering process initiated by eCORE. The created clusters are not spectral efficient, but they reduce the UL and DL imbalance. CaLB is appropriate when the DL is highly loaded.
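The CaLB stopping rule and ordering can be sketched for a single BS as follows. This is a simplified illustration and not Alg. 2 itself: candidate pairs are assumed to be given with their precomputed DL and UL PRB increases, the function name `calb_sketch` and its argument layout are hypothetical, and the extra guard that skips a pair when it would push the available UL PRBs below the threshold is an assumption added for the example.

```python
def calb_sketch(avail_dl, avail_ul, n_d_min, n_u_min, pairs, heads=(), members=()):
    """Greedy DL offloading for one BS (simplified stand-in for CaLB).

    pairs   -- iterable of (delta_dl, delta_ul, member, head); deltas are PRB increases
    heads   -- users that are already cluster heads (e.g., after eCORE)
    members -- users that are already cluster members
    Returns (new_links, avail_dl, avail_ul).
    """
    heads, members = set(heads), set(members)
    new_links = []
    # Most negative delta_dl first: largest DL reduction.
    for d_dl, d_ul, i, j in sorted(pairs):
        if avail_dl > n_d_min or avail_ul <= n_u_min:
            break  # DL no longer loaded, or UL has no room left
        if d_dl >= 0:
            break  # remaining pairs do not reduce the DL load
        if i in heads or i in members or j in members:
            continue  # keep existing clusters intact and single-hop
        if avail_ul - d_ul < n_u_min:
            continue  # assumed guard: do not overshoot the UL threshold
        avail_dl -= d_dl  # d_dl < 0, so available DL PRBs grow
        avail_ul -= d_ul
        members.add(i)
        heads.add(j)
        new_links.append((i, j))
    return new_links, avail_dl, avail_ul
```

For example, with 2 available DL PRBs against a threshold of 10, the loop keeps attaching users to cluster heads until the available DL PRBs exceed 10, trading UL PRBs for DL PRBs at each step.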
IV. IMPACT ON ENERGY CONSUMPTION

eCORE and CaLB rely on the set-up of cluster heads under the conditions stated in Section III. However, the role of cluster head entails energy consuming tasks, e.g., receiving and retransmitting the data of the rest of cluster members. Therefore, the role of cluster head can cause early battery drain. In this section, the expression of the energy consumption of each stakeholder is derived, and the mitigation of possible energy overconsumption of the clustering approach is studied. In the following, the energy consumption expressions are derived in Section IV-A. In Section IV-B these expressions are used to modify the optimal clustering problem defined in Section III-D and to include energy overconsumption limits. Section IV-C proposes a low complexity Clustering Energy Efficient algorithm (CEEa).

A. Energy Consumption Analysis
The energy consumption of a UE depends on two main factors: the Radio Resource Control (RRC) state of the device, which can be RRC_CONNECTED or RRC_IDLE, and the transmitted power [23]. Let us define the RRC state space as $S = \{I, C_{tx}, C_{rx}\}$, where $I$ stands for the RRC_IDLE state and the RRC_CONNECTED state has been decoupled into two states, the transmitting state $C_{tx}$ and the receiving state $C_{rx}$. We define $S_C = \{C_{rx}, C_{tx}\}$. Based on this, the energy consumed by user $i$ during a subframe time $T_s$ is given by $E_i = T_s (P^{s_i} + P_i^{tx})$, where $P^{s_i}$ is the power consumed when user $i$ is in state $s_i \in S$ and $P_i^{tx}$ is the transmitted power. The transmitted power differs in D2D mode (the intra-cluster communications) and in the communication with the BS, and for a user $i$ is described in LTE [24] by

where $M_i$ is the number of PRBs scheduled for user $i$, $P_0$ is the target received power at BS $k$, $h_{i,k}$ is the channel gain between user $i$ and BS $k$, $\xi \in [0, 1]$ is the compensating factor and $P_{d2d}$ is the transmitted power per PRB in D2D mode.

In the following, the role played by user $i$ is denoted by $\rho_i \in \{H, M, N\}$, with $\rho_i = H$ for a cluster head, $\rho_i = M$ for the rest of the cluster members and $\rho_i = N$ for the non-clustered users. Note that a user $i$ is directly connected to a BS if $\rho_i \in \{H, N\}$, and it is in D2D mode if $\rho_i = M$. Each user is characterized by its profile $\pi_i$, the role $\rho_i$ and the location (channel gains with the rest of UEs and BSs), and the expected energy consumed during a subframe is expressed as

where, by definition,

where $P_I$ is the power consumed in state $s_i = I$ and $P_C$ is the power consumed in state $s_i \in S_C$. Note that the probability of being in state $s_i$ depends on the role of the user. For instance,

Taking into account that the cluster head forwards both the UL traffic of all cluster members to the BS and the DL traffic to the cluster members (intra-cluster communications in D2D mode), the expected transmitted power of a user $i$ connected either to BS $k$ or to cluster head $h_u$ can be easily found using (12).
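As an illustration of the per-subframe energy model above, the following minimal Python sketch evaluates $E_i = T_s (P^{s_i} + P_i^{tx})$. The transmit-power expression for the BS link is an assumption: it uses the usual linear-scale form of LTE open-loop fractional power control, $M_i P_0 h_{i,k}^{-\xi}$, without a maximum-power cap, and it may differ from the exact expression in (12). All function names are hypothetical.

```python
def tx_power_bs(num_prbs, p0, channel_gain, xi):
    """Transmit power towards the BS in linear units.

    ASSUMPTION: standard open-loop fractional power control,
    M_i * P_0 * h_{i,k}^(-xi), with no maximum-power cap.
    """
    return num_prbs * p0 * channel_gain ** (-xi)


def tx_power_d2d(num_prbs, p_d2d):
    """Transmit power in D2D mode: a fixed power P_d2d per scheduled PRB."""
    return num_prbs * p_d2d


def subframe_energy(t_s, p_state, p_tx):
    """Energy over one subframe: E_i = T_s * (P^{s_i} + P_i^{tx})."""
    return t_s * (p_state + p_tx)
```

With $\xi = 0$ the path loss is not compensated at all and the power depends only on the number of PRBs; with $\xi = 1$ it is fully compensated, so users with poor channel gains transmit at proportionally higher power.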

B. Optimal Clustering With Energy Consumption Constraints
In order to limit the energy consumed by the cluster head, the problem defined in (7) must be modified to include the energy consumption constraint. If we define $w > 0$ as the maximum allowed relative increase of the expected power/energy of a cluster head, the expected power consumed by a cluster head should not exceed, by more than a fraction $w$, the power it would consume if it were not clustered:

As shown in (13)-(15), the total power depends on the probability $P\{s_i \in S_C | \rho_i\}$ and on the transmitted power. Regarding the former, when user $i$ is the cluster head, the probability can be divided into two components: the probability of $s_i \in S_C$ due to the time required to transmit/receive its own traffic to/from BS $k$ ($\theta^N_{i,k}$) and due to the time required to forward the traffic of the rest of the cluster members ($\theta^H_{i,j,k}$, for all users $j$ in the cluster).
where $x_{i,k} = 1$ if user $i$ is served by BS $k$ and $x_{i,k} = 0$ otherwise; and $y_{j,i} = 1$ when user $i$ is the cluster head of user $j$ and $y_{j,i} = 0$ otherwise (expressions for $\theta^N_{i,k}$ and $\theta^H_{i,j,k}$ are derived in [25, Appendix A]). By using (13)-(15) and (17), the components of (16) can be written as

where $\Delta P^{C}_{I} = P_C - P_I$ and $E[P_i^{tx} | \rho_i = H, j]$ is the power consumed by the cluster head attributable to the traffic of cluster member $j$, and it is defined as

Parameter $w$ must be selected to limit the energy overconsumption of cluster heads while allowing the creation of clusters. For instance, if only a 5% power increase is allowed ($w = 0.05$), cluster heads will not suffer from rapid battery drain but, in many cases, the establishment of some clusters will be compromised. Therefore, the optimization problem constrained by the energy consumption of the cluster heads results from including this constraint in (7).
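The cluster-head constraint can be sketched as a simple feasibility check. The Python snippet below is a minimal illustration, assuming that the expected power is the sum of the expected state power, $P_I + (P_C - P_I) \cdot P\{s_i \in S_C | \rho_i\}$, and the expected transmitted power, and that the constraint takes the form "power as head at most $(1 + w)$ times the power when non-clustered". Function and argument names are hypothetical; the $\theta$ values are taken as inputs rather than derived.

```python
def head_connected_probability(theta_own, theta_forward):
    """Probability of a cluster head being in a connected state.

    theta_own     -- share of time serving its own traffic (theta^N_{i,k})
    theta_forward -- iterable of shares of time spent forwarding the traffic
                     of each cluster member j (theta^H_{i,j,k})
    """
    return theta_own + sum(theta_forward)


def expected_power(p_idle, p_conn, prob_connected, expected_tx):
    """Expected power: P_I + (P_C - P_I) * Pr{connected} + E[P_tx]."""
    return p_idle + (p_conn - p_idle) * prob_connected + expected_tx


def head_constraint_ok(power_as_head, power_unclustered, w):
    """ASSUMED form of the constraint: E[P | H] <= (1 + w) * E[P | N]."""
    return power_as_head <= (1.0 + w) * power_unclustered
```

For example, with $w = 0.05$ a candidate head whose expected power would grow from 1.00 to 1.04 (arbitrary units) satisfies the check, whereas a growth to 1.10 does not, so that cluster would not be formed.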

C. Clustering Energy Efficient Algorithm (CEEa)
Due to the complexity of the optimization problem, in this section we present a low-complexity algorithm, namely CEEa, to manage the different energy consumption of each user. As the energy consumed by a cluster head is higher than the energy consumed by a non-clustered user, there is clearly a disincentive for users to become cluster heads, even when $w$ is small. In a scenario without mobility, this disincentive can hardly be addressed (it can only be limited, as proposed in Section IV-B), but the changing environment offered by mobility opens up new possibilities. In order to analyse these possibilities, in the sequel the analysis is carried out as a function of time.

Let us define the observation period $T_\varepsilon$ as the time during which the energy consumption is analysed to prevent users from energy overconsumption. For each user $i$, $T_\varepsilon$ can be divided into subperiods

Based on the definitions, the time during which each user plays a specific role is the aggregation of periods with the same $\rho_i(t)$. Thus, three sets of periods $T^H_i$, $T^M_i$ and $T^N_i$ are defined as

If we denote the power consumed by user $i$ at time $t$ with role $\rho_i(t) = m$ as $P^m_i(t)$, and the power that would have been consumed by user $i$ at time $t$ in case of not being clustered as $P^N_i(t)$, the energy consumed over a subperiod $T_{i,n} \in T^m$ with $m \in \{H, M\}$ and the energy that would have been consumed if $\rho_i(t) = N$ are given by

If the definition of energy overconsumption, $w(T_\varepsilon)$, is given by

then, as $P^H_i(t) > P^N_i(t) > P^M_i(t)$, user $i$ experiences energy overconsumption due to clustering if $w(T_\varepsilon) > 0$. Although the objective is to keep the overconsumption around 0 in the long term, $\lim_{T_\varepsilon \to \infty} w(T_\varepsilon) \approx 0$, in practice overconsumption must be limited over finite periods of time.
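As a hedged illustration of the quantities involved, one reading that is consistent with the surrounding definitions (and with the threshold check on $E^m_i(t)$ versus $\tilde{E}^N_i(t)$ used by CEEa) is a relative overconsumption measured against the non-clustered baseline. The expressions below are an assumed form, not a transcription of the original equations:

```latex
% Energy consumed in subperiod T_{i,n} with role m, and the
% non-clustered baseline over the same subperiod (assumed form).
E^{m}_{i}(T_{i,n}) = \int_{t \in T_{i,n}} P^{m}_{i}(t)\, dt ,
\qquad
\tilde{E}^{N}_{i}(T_{i,n}) = \int_{t \in T_{i,n}} P^{N}_{i}(t)\, dt

% Relative overconsumption over the observation period (assumed form).
w(T_\varepsilon) =
  \frac{\sum_{n} E^{\rho_i}_{i}(T_{i,n}) - \sum_{n} \tilde{E}^{N}_{i}(T_{i,n})}
       {\sum_{n} \tilde{E}^{N}_{i}(T_{i,n})}
```

Under this reading, subperiods spent as a non-clustered user contribute equally to both sums, subperiods spent as cluster head push $w(T_\varepsilon)$ up, and subperiods spent as cluster member pull it down.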
CEEa (see Alg. 3) limits the overconsumption of users involved in the cluster by setting a maximum overconsumption threshold, referred to as $w_{max}$, that cannot be exceeded along the observation period $T_\varepsilon$. This observation period is divided into a set of $n_\varepsilon$ subperiods of duration $t_\varepsilon$, such that $T_\varepsilon = n_\varepsilon t_\varepsilon$. Specifically, for a given set of users, CEEa creates a list of users that cannot become cluster heads due to excessive energy consumption in the past, denoted by $Z$, which is included as a constraint in eCORE. The maximum overconsumption condition, $E^m_i(t) > (1 + w_{max}) \tilde{E}^N_i(t)$, is checked at the end of each subperiod of duration $t_\varepsilon$ in two ways: first, the energy consumption condition is checked for the total time since the beginning of the observation period (line 3); secondly, the condition is checked for the subperiod (line 3). Despite experiencing total overconsumption, the user is not banned from remaining as cluster head if overconsumption is not experienced in the current subperiod (overconsumption is being compensated). If the time during which the user has had the role $\rho_i = M$ until time $t$, $\tau^M_i(t)$, is smaller than the time during which it has had $\rho_i = H$ until time $t$, $\tau^H_i(t)$, the user cannot be cluster head. This condition works proactively to cope with situations where the cluster head suffers from slight but constant overconsumption. As CEEa aims to compensate the overconsumption within $T_\varepsilon$, the threshold $w_{max}$ is reduced at every observation subperiod by a factor $\left(\frac{n_\varepsilon - 1}{n_\varepsilon}\right)$, since the higher $n_i$ is, the more difficult it is to compensate the energy consumption in the remaining $n_\varepsilon - n_i$ subperiods.
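The checks described above can be sketched in a few lines of Python. This is only an illustration, not Alg. 3 itself: the way the three conditions are combined (ban when both the total and the current-subperiod checks fail, or when the time as member is shorter than the time as head) is an assumption drawn from the prose, and all names are hypothetical.

```python
def overconsumes(energy, baseline_energy, w_max):
    """Threshold check: E > (1 + w_max) * E_tilde (non-clustered baseline)."""
    return energy > (1.0 + w_max) * baseline_energy


def banned_from_heading(total_e, total_base, sub_e, sub_base,
                        tau_member, tau_head, w_max):
    """Decide whether a user is added to the ban list Z.

    ASSUMPTION: the user is banned if it overconsumes both since the start
    of the observation period and in the current subperiod, or if it has
    spent less time as cluster member than as cluster head.
    """
    over_total = overconsumes(total_e, total_base, w_max)
    over_sub = overconsumes(sub_e, sub_base, w_max)
    if over_total and over_sub:
        return True
    return tau_member < tau_head


def shrink_threshold(w_max, n_eps):
    """Reduce w_max by the factor (n_eps - 1) / n_eps after a subperiod."""
    return w_max * (n_eps - 1) / n_eps
```

A user with total overconsumption but no overconsumption in the current subperiod is not banned by the threshold checks alone, which mirrors the compensation behaviour described in the text.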
Although there seems to be no apparent incentive for a user to become cluster head in the short term, this is not actually true. In loaded scenarios, not only cell-edge users can benefit from the proposed clustering, but also most of the users (even the cluster heads themselves, since the depletion of resources can impact the resources allocated to them). In this context, CEEa eliminates the disincentive to become cluster head. The detection of selfish users is out of the scope of CEEa, but the proposed clustering algorithm does not preclude the design and implementation of additional algorithms running on top of CEEa to prevent selfish behaviours.

A. Scenario
In this section the proposed algorithms are validated and compared with existing algorithms found in the literature and with the results when no clustering algorithms are implemented (labelled in figures as Without Clustering or w/o Clust.). A custom-made simulator implemented in C++ has been used to simulate a network, which consists of a central eNB (macro BS) and the first interfering ring of 6 eNBs, with an inter-site distance of 500 m. Under the coverage area of each eNB, 4 small cells (SCs) are randomly deployed. The minimum distance between the eNB and an SC is 125 m and the minimum inter-SC distance is 25 m [26]. All eNBs are equally loaded and simulated, but only results from the central eNB and the corresponding 4 small cells are collected. Results are averaged over 1000 iterations.

Users move at a constant speed of 3 km/h (pedestrian). The hit and bounce technique is used when users move out of the scenario under analysis [27]. 50% of the deployed users are characterized by symmetric VoIP traffic (64 kbps in DL and UL) while the rest of the users demand FTP or streaming traffic (700 kbps in the DL). The system is FDD and spectrum resource partition is considered between eNBs and SCs: eNBs and SCs operate in different bands [28]. No interference coordination techniques are considered in the simulations, and the PRBs are allocated randomly among users. Although interference coordination could lead to higher SINR levels, it has been omitted to better characterize the performance of the proposed algorithms. Users and the BS have a single antenna (SISO), and the spectral efficiency lookup table has been obtained from [29]. The rest of the parameters can be found in Table I [30].
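For readers who want to reproduce a similar setup, the scenario parameters stated above can be collected in a small configuration object. The Python sketch below is illustrative only: the class, field and function names are hypothetical and unrelated to the authors' C++ simulator, and parameters listed only in Table I are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioConfig:
    """Scenario parameters as stated in the text (names are hypothetical)."""
    num_macro_bs: int = 7                    # central eNB + first ring of 6
    inter_site_distance_m: float = 500.0
    small_cells_per_macro: int = 4
    min_macro_to_sc_distance_m: float = 125.0
    min_inter_sc_distance_m: float = 25.0
    iterations: int = 1000
    user_speed_kmh: float = 3.0              # pedestrian
    voip_user_fraction: float = 0.5
    voip_rate_kbps: float = 64.0             # symmetric, DL and UL
    data_dl_rate_kbps: float = 700.0         # FTP or streaming, DL


def offered_dl_load_kbps(cfg, num_users):
    """Aggregate DL demand of num_users users with the stated traffic mix."""
    voip_users = cfg.voip_user_fraction * num_users
    data_users = num_users - voip_users
    return voip_users * cfg.voip_rate_kbps + data_users * cfg.data_dl_rate_kbps
```

For instance, 60 users with this mix demand 30 x 64 + 30 x 700 = 22920 kbps in the DL, against only 30 x 64 = 1920 kbps of VoIP traffic in the UL, which illustrates the imbalance between the two bands.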

B. Results
The objective of the optimal clustering for spectral efficiency stated in Section III-D (problem (7)), labelled in figures as Optimal Clustering, is the minimization of the total number of PRBs required to serve the traffic (i.e., the maximization of the spectral efficiency). Similarly, the optimal clustering with energy consumption constraints, detailed in Section IV-B and labelled hereafter as Energy Constrained, aims to minimize the required PRBs while imposing energy overconsumption constraints for cluster heads. The spectral efficiency (bps/Hz) of these two solutions can be observed in Fig. 2 for 60 users, along with the results for our previous work CORE [19], and the proposed eCORE, CaLB (with ) and CEEa (with $w = 0.2$). It can be seen that Optimal Clustering increases the spectral efficiency in the DL band by clustering users and exploiting the good quality of the link between BS and cluster head. For instance, spectral efficiency in the DL band rises by 54% (from 1.26 bps/Hz to 1.95 bps/Hz) when Optimal Clustering is applied with 60 users. Although clustering solutions incur additional PRB utilization in the UL band due to intra-cluster communications, it can be observed that the total spectral efficiency (UL and DL) increases. Thus, the higher UL band utilization is more than compensated by the DL improvement. As will be seen in Fig. 3, when no clustering solution is applied, cell-edge users are not served due to low spectral efficiency.

Fig. 2 also shows the spectral efficiency of Energy Constrained when the maximum energy overconsumption of the optimal clustering is limited to 10% and to 50% ($w = 0.1$ and $w = 0.5$). As expected, the overconsumption constraint prevents clusters from being set up if they result in excessive energy overconsumption. Thus, only clusters that are simultaneously spectral efficient and keep cluster head consumption below a threshold (i.e., $w$) are set up. This is the reason why the spectral efficiency is lower as the energy constraint becomes more restrictive (lower $w$). For instance, the DL spectral efficiency is 1.27 bps/Hz when $w = 0.1$ and 1.36 bps/Hz when $w = 0.5$. Some insights can be found in Table II, where the average number of clusters and the average size of each cluster are shown for 30 and 60 users. In the Energy Constrained solution, the reduction of $w$ (lower overconsumption is allowed) has a higher impact on the number of clusters created than on the size of the cluster. That is, whereas the size of the cluster remains stable, overconsumption constraints cause a significant reduction in the average number of clusters.

Fig. 2 also includes the results for CORE, eCORE, CaLB and CEEa. eCORE achieves results very close to the optimum, with a performance less than 5% lower than Optimal Clustering. Moreover, eCORE increases the DL spectral efficiency with respect to CORE, since it enables the establishment of clusters among users from different cells.
Table II shows that the intensification in the creation of clusters promoted by eCORE results in the setup of more clusters, although with a similar size. For instance, for 60 users eCORE doubles the number of clusters with respect to CORE while the average size of each cluster is approximately the same. Something similar occurs with CaLB: the number of clusters grows more than the average size of the clusters. That is, CaLB creates new clusters rather than enlarging the clusters established by eCORE. However, CaLB enables the creation of non-spectral efficient clusters if the imbalance between UL and DL is reduced. This is the reason why, although the DL spectral efficiency in CaLB is higher than in eCORE, the opposite occurs with the total spectral efficiency (UL and DL bands). Finally, as CEEa limits the energy consumption by deterring some users from being cluster heads, the spectral efficiency is reduced with respect to eCORE and CaLB. Table II shows that the energy consumption constraints reduce the number of clusters.

Fig. 4. CDF of the energy overconsumption with 60 users.
Fig. 3 shows the DL throughput for each algorithm and includes as baseline the algorithm proposed in [18], which is labelled as CS. As CS is a scheme based on the received SNR to allow or ban cooperation, results for two minimum SNR thresholds have been simulated: 4.73 dB and 2.84 dB. Fig. 3 shows how CaLB outperforms the rest of the algorithms, reaching a 59.5% gain in the downlink throughput with respect to the case Without Clustering for 140 users. As expected, it can also be observed that eCORE outperforms CORE and, in turn, CaLB outperforms eCORE. In particular, CORE achieves a throughput 36.6% higher than Without Clustering, whereas eCORE reaches a 47.2% improvement and CaLB a 59.5% improvement. As for CEEa, the additional constraints reduce the DL throughput, but it still presents slightly better results than CORE.
Focusing on how CEEa limits the energy overconsumption of cluster heads, Fig. 4 plots the Cumulative Distribution Function (CDF) of the energy overconsumption, $w$, for eCORE, CaLB and CEEa. The overconsumption is always expressed with respect to the case where no clustering algorithms are implemented. Therefore, without any clustering, the energy overconsumption would be $w = 0\%$. As can be observed in Fig. 4, the energy underconsumption from which cluster members (except for the cluster head) benefit is similar in eCORE, CaLB and CEEa. However, CEEa limits the overconsumption of cluster heads. For instance, 99% of the users have an overconsumption $w < 20\%$ with CEEa; in turn, for eCORE 99% of users experience an overconsumption $w < 240\%$ and with CaLB the same percentage of users experience $w < 260\%$. Therefore, CEEa is able to limit the overconsumption of cluster heads.
Given the trade-off between the maximum capacity gain (CaLB) and the minimum impact on energy consumption (CEEa), Fig. 5 sheds light on the energy efficiency of eCORE, CaLB and CEEa for 60 users. Cluster heads present low energy efficiency because they forward traffic to/from cluster members. Therefore, the percentage of users with low energy efficiency grows with the number of cluster heads. This can be significant in CaLB and eCORE. Conversely, CEEa partially alleviates the high energy consumption of cluster heads but decreases the throughput. In none of the cases is the low energy efficiency of cluster heads compensated by the increased energy efficiency of the rest of the cluster members. Accordingly, clustering algorithms can improve the capacity of the network at the expense of lower energy efficiency.
In order to see how sensitive CaLB and CEEa are to their key parameters ($n^d_{min}$ and $n^u_{min}$ for CaLB and $w$ for CEEa), simulations have been run with different values. As for CaLB, differences in terms of throughput are not significant, remaining below 2% for a wide range of values of $n^d_{min}$ and $n^u_{min}$. Although the creation/enlargement of clusters will start earlier as the value of $n^d_{min}$ increases, this will not be translated into a significant increase of the throughput. Therefore, CaLB is only slightly sensitive to $n^d_{min}$ variations in terms of throughput as long as $n^d_{min} > 0$, but $n^d_{min}$ should be selected small enough to avoid the creation of additional clusters when they are not actually needed (in terms of throughput). No figure for throughput is included due to the slight differences observed.

Regarding CEEa, the key parameter is the maximum allowed energy overconsumption $w$. This parameter has a single objective that is attained in a two-fold manner: firstly, by preventing some users from becoming cluster heads (due to previous energy overconsumption), and secondly by forcing the release of the role of cluster head (if the energy overconsumption is too high). In a nutshell, the larger $w$ is, the more aggressive the clustering is, thus achieving results similar to the ones obtained with eCORE (where no energy consumption constraints are imposed). Conversely, small $w$ values impose additional constraints on the creation of clusters. This effect can be observed in Fig. 6, where the CDF of the energy efficiency is plotted for 60 users and $w \in \{0.2, 0.6, 1.5\}$. Results for eCORE have also been included for the sake of comparison. It is observed that eCORE has cluster heads with low energy efficiency and, in turn, cluster members with high energy efficiency. The higher $w$ is, the closer the results are to the ones of eCORE, since fewer constraints on energy consumption are imposed.

C. Discussion on Signalling
Signalling is an important aspect of D2D communications. 3GPP establishes control and data plane paths for D2D communications (termed Proximity Services, ProSe) in [6], and covers these aspects in more detail in [31]. The proposed algorithms are framed within the group of UE-to-Network Relay functions [31], since the cluster head acts as a relay from each of the cluster members to the network. In this context two important interfaces are defined: PC3, defined as the interface from the relay (i.e., the cluster head) to the network; and PC5, defined as the one-to-one or one-to-many interface between users (the so-called D2D communication). The proposed mechanisms implement the network-assisted D2D mode with the loosely-controlled scheme, in which the network allocates resources for the D2D communications, and the cluster head reallocates the resources within the cluster. Network-assisted loosely-controlled D2D communications require additional signalling, particularly over the PC5 interface. However, as shown in Table II, the proposed algorithms improve the throughput by creating a significant number of small clusters rather than large clusters, thus limiting the increase of signalling over the PC5 interface. Therefore, although eCORE, CaLB and CEEa require additional signalling, the small size of the clusters limits the additional signalling burden over PC5.
Nevertheless, frequent cluster head (re-)selection could incur an excessive signalling burden. There exists a trade-off between signalling and system performance. Algorithms eCORE and CaLB include neither parameters to control the number of clusters nor parameters to limit the duration of the clusters. Conversely, CEEa indirectly controls the number and size of the clusters, as well as how long they remain active or with the same cluster head, through parameters $w_{max}$ and $T_\varepsilon$.

VI. CONCLUSION
This work presents a complement/alternative to the costly densification of cellular RANs based on the creation of clusters of users, where intra-cluster communications are carried out in a D2D mode. Three clustering algorithms are presented: eCORE, CaLB and CEEa. eCORE optimizes the usage of spectral resources by establishing spectral efficient clusters. Due to the significant imbalance between uplink and downlink traffic, CaLB creates non-spectral efficient clusters that improve the capacity of the network by reducing the aforementioned imbalance. Finally, CEEa is proposed to keep track of the overconsumption of users and ban some users from becoming cluster heads. Results show that the proposed clustering solutions increase the capacity of the network. In particular, the most aggressive clustering algorithm (CaLB) outperforms the rest of the algorithms. Yet, any capacity improvement is translated into an increase of the consumed energy. In that sense, CEEa achieves a good energy consumption performance but it leads to the smallest capacity gain.
Fig. 1 sketches the initial scenario with 6 UEs served by one of the BSs (Fig. 1(a)) and the following cases: 1) Extension of the coverage (Fig. 1(b)): Assume that UE5 is in a coverage gap.If the quality of the links UE5-UE4 and

where $N^u_k$ and $N^d_k$ are the PRBs used in each band in BS $k$ without clustering. As DL is generally more loaded, when $N^d_k \gg N^u_k$ and $N^d_k \approx N^{d,max}_k$ it may be convenient to create clusters to increase the capacity even at the expense of a spectral efficiency decrease. Lemma: Given a BS $k \in B$ with an average number of required PRBs without clustering in the downlink and in the uplink equal to $N^d_k$ and $N^u_k$, respectively, the cell capacity is increased after creating the cluster $u$ (with