Planning in-building coverage for 5G: from rules of thumb to statistics and AI

Improving the ability of network planners to estimate indoor traffic demand will contribute to more efficient 5G network deployments. Accurate indoor traffic ratio estimates are especially useful to operators rolling out mmWave coverage.

Key findings
- New methods are being developed to accurately estimate the proportion of traffic in outdoor base stations that is due to indoor activity.
- Two distinct but interrelated approaches to the indoor traffic challenge are currently being explored by data scientists: one statistical and the other AI-based.
- In one dense urban high-rise area, 37 percent of macro traffic was served to indoor users during busy hours, indicating that in-building cell deployment could be increased to meet indoor traffic demand.

Traditionally, it has been assumed that 70–80 percent of mobile data traffic is generated indoors (including traffic served by in-building systems). Now, methods are being developed to accurately estimate the proportion of traffic in outdoor base stations that is due to indoor usage. The results of applying statistical approaches to three different environments in a metro area are shown in Figure 28.

In urban deployments, the majority of mobile traffic is usually indoors, which is difficult to serve from outdoor base stations due to radio signal attenuation through walls and windows. With 5G systems, this can be even more of a challenge due to the use of high-band (mmWave) frequencies.

The attenuation of radio signal power intensity, as a signal travels through the space between sender and receiver, is referred to as path loss and is the combined result of a number of factors including free-space loss, penetration losses, reflection, refraction and various other forms of fading.

5G systems can operate on a wide range of carrier frequencies, from below 1GHz in the low-band, up to 39GHz in the mmWave spectrum. Lower frequencies have good coverage characteristics, while high-band frequencies are useful for capacity, as the bandwidth available to be allocated is greater. However, signal attenuation increases with frequency.

The effect of frequency on path loss can be illustrated by comparing the signal strength between two antennas 500m apart in line-of-sight. At the extremes, a 39GHz signal has approximately 34dB more free-space path loss than an 800MHz signal, which corresponds to around 99.96 percent less received power.
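The 34dB figure follows from the free-space path loss formula, in which loss grows with 20·log10 of the carrier frequency. The following is a minimal Python sketch of that arithmetic, assuming isotropic antennas and the 500m line-of-sight distance from the example; the function and variable names are illustrative and not taken from the report.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def free_space_path_loss_db(distance_m: float, frequency_hz: float) -> float:
    """Free-space path loss in dB between two isotropic antennas."""
    wavelength_m = SPEED_OF_LIGHT_M_S / frequency_hz
    return 20.0 * math.log10(4.0 * math.pi * distance_m / wavelength_m)


# Distance and frequencies from the example in the text.
low_band_db = free_space_path_loss_db(500.0, 800e6)
mmwave_db = free_space_path_loss_db(500.0, 39e9)

# Extra loss at 39GHz relative to 800MHz, in dB.
delta_db = mmwave_db - low_band_db

# The same difference expressed as a reduction in received power.
power_reduction_percent = (1.0 - 10.0 ** (-delta_db / 10.0)) * 100.0

print(f"Extra free-space loss at 39GHz: {delta_db:.1f} dB")
print(f"Received power reduction: {power_reduction_percent:.2f} percent")
```

Because the distance is the same in both cases, it cancels out of the difference: the extra loss depends only on the ratio of the two frequencies.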

Another challenge of the higher frequency bands is the attenuation of signals penetrating buildings. In terms of signal propagation, buildings can be broadly classified into two types: modern thermally-efficient buildings with metallized glass windows, foil-backed panels for the walls, insulated cavity walls and thick reinforced concrete; and traditional buildings without any such material.

The median building loss for a thermally-efficient building is about 50 times higher than for a building made with traditional materials at 800MHz, and about 240 times higher at 39GHz [1].
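To compare these factors with the free-space figures, which are quoted in dB, they can be converted to decibels. The sketch below assumes that the factors of 50 and 240 are linear power ratios, which the text does not state explicitly; the helper name is illustrative.

```python
import math


def power_ratio_to_db(ratio: float) -> float:
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(ratio)


# Assumption: the factors quoted in the text are linear power ratios.
extra_loss_800mhz_db = power_ratio_to_db(50.0)
extra_loss_39ghz_db = power_ratio_to_db(240.0)

print(f"Extra building loss at 800MHz: {extra_loss_800mhz_db:.1f} dB")
print(f"Extra building loss at 39GHz: {extra_loss_39ghz_db:.1f} dB")
```

Under that assumption, a thermally-efficient building adds roughly 17dB of loss at 800MHz and roughly 24dB at 39GHz compared with a traditional building.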

To compensate for the loss associated with mmWave frequencies, service providers can use a range of solutions, including advanced antenna systems, beam-forming and indoor systems. Considering the high building penetration losses, high indoor traffic demand may make in-building solutions more economic. On the other hand, to properly serve the outdoor traffic, macro site densification may be needed. Having a realistic estimate of the indoor traffic ratio provides a solid ground for network investment decisions.

Figure 28: Percentage of traffic in outdoor base stations that is generated by indoor users in a specific metropolitan area

New methodologies

Two distinct but interrelated approaches to the indoor traffic challenge are currently being explored by data scientists: one statistical and the other AI-based.

Both can be applied to network data that is already available, for example from a 4G network, and used to estimate the indoor traffic ratio at cell level or in a cluster of cells. The data comes from network nodes, as well as crowdsourced data from user equipment (UE), such as smartphones.

Uplink data from performance management (PM) counters can be used. A key PM counter is the uplink path loss distribution (including free space, building penetration and other losses). Crowdsourced data is collected by third parties with the users’ permission through apps that log a range of data types. These include radio signal strength as reference signal received power (RSRP), location information and battery charging status.

The statistical approach

For the uplink path loss distribution, a sample is collected at each transmission time interval (TTI), resulting in enough samples to allow the use of Gaussian Mixture Modeling (GMM). A smartphone inside a building that is connected to an outdoor radio base station experiences higher path loss than one outdoors, because of the building penetration loss. The model works by taking all the data samples in a defined geographic area or a cell and then creating a path loss distribution for the data set. The model subsequently separates the data into user clusters by determining the best fit to a number of Gaussian distributions, each with its own statistical profile. Finally, by analyzing the data from each distribution, it can be determined which user clusters are indoors and which are outdoors, as can be seen in Figure 29. The statistical approach has the advantages of simplicity and transparency.

Figure 29: Gaussian Mixture Model analysis of path loss distribution of a cell
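The idea behind the statistical approach can be sketched in a few lines of Python. This is an illustrative sketch only, not the method used in the study: it fits a two-component, one-dimensional Gaussian mixture with a hand-written expectation-maximization loop, and it assumes that the component with the higher mean path loss corresponds to indoor users. The path loss samples are synthetic, and all names and parameter values are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_two_component_gmm(samples, iterations=100):
    """Fit a 1-D, two-component Gaussian mixture with plain EM."""
    ordered = sorted(samples)
    n = len(ordered)
    # Initialise the means at the lower and upper quartiles.
    means = [ordered[n // 4], ordered[(3 * n) // 4]]
    overall_mean = sum(ordered) / n
    overall_std = math.sqrt(sum((x - overall_mean) ** 2 for x in ordered) / n)
    stds = [overall_std, overall_std]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in samples:
            p = [weights[k] * gaussian_pdf(x, means[k], stds[k]) for k in range(2)]
            total = max(p[0] + p[1], 1e-300)
            resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate weight, mean and spread of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, samples)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, samples)) / nk
            stds[k] = max(math.sqrt(var), 1e-3)
    return weights, means, stds


# Synthetic uplink path loss samples in dB (illustrative values, not measurements).
rng = random.Random(0)
outdoor_samples = [rng.gauss(100.0, 6.0) for _ in range(600)]
indoor_samples = [rng.gauss(125.0, 6.0) for _ in range(400)]
samples = outdoor_samples + indoor_samples

weights, means, stds = fit_two_component_gmm(samples)

# Assumption: the component with the higher mean path loss is the indoor cluster.
indoor_k = 0 if means[0] > means[1] else 1
indoor_share = weights[indoor_k]

print(f"Estimated indoor share of samples: {indoor_share:.2f}")
```

In practice, a library implementation such as scikit-learn's `GaussianMixture` could replace the hand-written loop, and the number of components would need to be chosen per cell rather than fixed at two.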

The AI approach: unsupervised learning

Compared to the statistical approach, machine learning techniques allow the use of data without directly specifying their contribution to the result. Using a technique called unsupervised learning, more data sources may be added with low effort, and more subtle information in the data can be exploited without direct human interaction.

To label a mobile phone as indoor or outdoor, unsupervised learning is used on data including RSRP, battery charging status and throughput. The machine learning model splits the feature space (that is, the set of measurements that describe the data) into a number of clusters and predicts whether each cluster belongs to indoor or outdoor activity. All samples from mobile phones that fall inside an indoor cluster are labeled as indoor.
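As a rough illustration of this kind of clustering, the sketch below runs a plain two-cluster k-means on standardized features. It is not the model used in the study, whose algorithm is not named in the text. The feature values are synthetic, and the rule for deciding which cluster is indoor (the one with the lower mean RSRP) is an assumption made for this example.

```python
import math
import random


def standardize(rows):
    """Z-score each column so that no single feature dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans_two_clusters(points, iterations=50):
    """Plain Lloyd's k-means with k=2 and a deterministic initialisation."""
    # Start from the points with the weakest and strongest first feature (RSRP).
    centroids = [list(min(points, key=lambda p: p[0])),
                 list(max(points, key=lambda p: p[0]))]
    labels = None
    for _ in range(iterations):
        new_labels = [0 if math.dist(p, centroids[0]) <= math.dist(p, centroids[1]) else 1
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Synthetic samples: [RSRP in dBm, charging flag, throughput in Mbps].
# All values are hypothetical and chosen only to make the example run.
rng = random.Random(1)
rows, truth = [], []
for _ in range(150):
    rows.append([rng.gauss(-80.0, 4.0), 1.0 if rng.random() < 0.05 else 0.0,
                 rng.gauss(30.0, 5.0)])
    truth.append("outdoor")
for _ in range(150):
    rows.append([rng.gauss(-110.0, 4.0), 1.0 if rng.random() < 0.5 else 0.0,
                 rng.gauss(10.0, 4.0)])
    truth.append("indoor")

labels, centroids = kmeans_two_clusters(standardize(rows))

# Assumed heuristic: the cluster with the lower mean RSRP is treated as indoor.
indoor_cluster = 0 if centroids[0][0] < centroids[1][0] else 1
predicted = ["indoor" if lab == indoor_cluster else "outdoor" for lab in labels]

# Ground truth is known here only because the data is synthetic.
accuracy = sum(p == t for p, t in zip(predicted, truth)) / len(truth)
print(f"Agreement with synthetic ground truth: {accuracy:.2f}")
```

With real crowdsourced data there is no ground truth to compare against, so the mapping from clusters to indoor or outdoor activity would need to be validated by other means.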

Analysis and results

The methods were applied to the path loss distributions from the performance management counters of the 4G cells in the metropolitan area, collected during business hours on 21 weekdays. Data from a mobile service provider was analyzed after separating the path loss distributions for macro and small cell base stations. The statistical analysis included three different environments: urban, dense urban (high-rise) and residential.

In the urban district, the average indoor traffic for outdoor cells was about 64 percent, and the 4G small cell base stations were serving 54 percent or more of the outdoor traffic. These results suggest that the service provider could consider deploying in-building solutions where possible and then augmenting the number of outdoor small cells.

For the dense urban high-rise area, around 37 percent of the macro traffic and 40 percent of the outdoor small cell traffic are being served to indoor users. This indicates that in-building cell deployment could be increased to meet this demand, which is mostly in modern thermally-efficient buildings.

The comparatively low indoor traffic proportions in the dense urban high-rise area are understandable in the context of buildings that already have in-building coverage systems, which carry part of the indoor traffic.

By quantifying the traffic demand and coverage from both inside and outside, the additional resources that would be needed can be determined so that the least amount of radio power is sacrificed to penetration losses.

The unsupervised learning approach was applied to a larger area that included all three metro districts covered by the statistical analysis. It can handle more data efficiently and can use multiple inputs, instead of relying on a single measure as the statistical approach does. More granular results can be obtained, since the labeling is done at the device/sample level rather than on an aggregate measure. This allows not only the computation of an indoor traffic ratio but also of any indoor–outdoor ratio using the same model. The percentage of devices that are indoors and the percentage of traffic generated by indoor devices are then calculated: 61 percent of devices in the data set were detected to be indoors, and 59 percent of the traffic on outdoor base stations served those devices. These results align with and complement the results from the statistical methodology.
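Once samples carry an indoor or outdoor label, both ratios reduce to simple aggregation. The sketch below shows one way this could be done; the record layout, the majority rule for classifying a device as indoor, and the sample values are all assumptions made for illustration and are not results from the study.

```python
from collections import defaultdict


def indoor_ratios(samples):
    """Return (indoor device ratio, indoor traffic ratio).

    samples: iterable of (device_id, is_indoor, traffic_mb) tuples.
    """
    indoor_votes = defaultdict(int)
    sample_counts = defaultdict(int)
    indoor_traffic = 0.0
    total_traffic = 0.0
    for device_id, is_indoor, traffic_mb in samples:
        sample_counts[device_id] += 1
        indoor_votes[device_id] += 1 if is_indoor else 0
        total_traffic += traffic_mb
        if is_indoor:
            indoor_traffic += traffic_mb
    # Assumed rule: a device counts as indoor if most of its samples are indoor.
    indoor_devices = sum(
        1 for d in sample_counts if indoor_votes[d] * 2 > sample_counts[d]
    )
    return indoor_devices / len(sample_counts), indoor_traffic / total_traffic


# Hypothetical labeled samples: (device id, labeled indoor?, traffic in MB).
labeled_samples = [
    ("dev-a", True, 40.0), ("dev-a", True, 10.0),
    ("dev-b", True, 10.0), ("dev-b", False, 5.0), ("dev-b", True, 5.0),
    ("dev-c", False, 20.0),
    ("dev-d", False, 5.0), ("dev-d", True, 5.0), ("dev-d", False, 5.0),
    ("dev-e", True, 15.0),
]

device_ratio, traffic_ratio = indoor_ratios(labeled_samples)
print(f"Indoor device ratio: {device_ratio:.2f}")
print(f"Indoor traffic ratio: {traffic_ratio:.2f}")
```

Because the traffic ratio is computed per sample rather than per device, a device that moves between indoor and outdoor locations contributes to both sides of the split.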

Rolling out 5G networks

Designing Radio Access Networks (RAN) has grown in complexity with every generation of mobile communications. Now, as 5G is being rolled out, finding an optimum design for the service provider, one that balances quality of service with efficiency, is made more challenging by the use of high-band frequencies, which combine high capacity with increased absorption losses through obstructions. More accurate predictions make it faster to find an optimal split between outdoor and indoor radio solutions, which helps deal with the growing complexity. Ultimately these approaches will be automated, allowing for continuous monitoring of 5G RAN efficiency.