Drone cell tower inspection: How to automate site data capture using AI in the metaverse
Cell towers are the fundamental structures of cellular networks where antennas and electric communications equipment are installed. As one of the major players in the wireless telecommunication industry, Ericsson has developed various methods and practices for making the deployment and maintenance of the towers faster, cheaper, and safer.
Safety in particular has been a key topic recently, because field workers who climb tall towers to troubleshoot and collect status data risk serious injury. To mitigate these risks, Ericsson introduced the Intelligent Site Engineering (ISE) solution in 2018, which uses a camera drone to capture photos of equipment on the tower port and transmit them to visual inspection experts in a remote office. To help the inspectors perceive the spatial layout, ISE also provides point clouds of the towers.
The next challenge around the towers is asset management at scale. Because site data capture has been manual and fragmented in terms of data formats and locations, Communication Service Providers (CSPs) and Tower Companies (TowerCos) face isolated or inaccurate data, which forces workers to revisit sites and re-climb towers.
The sheer scale of the problem compounds the difficulty of site life cycle management: as of 2022, there are roughly 150,000 cell towers in the US. To tackle this challenge, in 2022 Ericsson proposed the Ericsson Site Digital Twin (ESDT), which embraces modern data sources such as drones and CAD models in combination with a Building Information Modelling (BIM) solution.
If complete information about the site topology and assets is available for a given cell tower in ESDT, it is possible to simulate any needed change and its effect before sending a truck, and thereby save the cost of possible rework, in effect “bringing the IKEA kitchen planner to the telecom industry”. To achieve this goal at scale, including all legacy sites globally, it is essential to have an automated tool that creates a granular 3D model of the site topology and an asset catalog in a format compatible with BIM/ESDT.
That is why Ericsson's Global Artificial Intelligence Accelerator (GAIA), in collaboration with the company's Market Area North America, has developed the Virtual Network Roll-Out (VNRO) tool as the next generation of site data capture, which is scalable and free from human error.
VNRO for the part-based tower model
In the field of photogrammetry, a lot of computer vision techniques have been proposed to reconstruct a 3D representation of a physical target from its photos. Throughout the long history of multiple view geometry, scientists have developed several de facto standard open-source tools (for example COLMAP) and commercial ones (such as Apple’s Object Capture API) capable of generating point clouds using algorithms including Structure-from-Motion (SfM) and Multi-View Stereo (MVS).
However, because such a point cloud is a whole-body representation of target surfaces without the internal structure, we need to detect, segment, and identify components in the point cloud to build a part-based model that is applicable to simulation and asset management. The real challenge in segmenting the point cloud into meaningful components is point cloud quality. Many factors affect that quality: bad lighting, fluctuating GPS coordinates, and wrong camera information, to name a few. Even a point cloud produced with best efforts may contain noise (“fuzziness”) and misaligned points, which make the component analysis difficult.
For tackling these issues at scale, GAIA Silicon Valley has developed the following techniques:
- Automatic evaluation of the quality of the point cloud
Garbage in, garbage out. We should detect and exclude corrupted point clouds before the VNRO pipeline starts, because any component analysis built on them is destined to fail. Detecting bad point clouds by eyeballing is not feasible when processing hundreds of thousands of site data sets, so GAIA developed a lightweight tool that measures point cloud quality by comparing the reconstructed trajectory of camera positions with the GPS-based trajectory of the drone. A significant gap between the two trajectories implies that one or more fundamental inputs are distorted: for example, a failed camera extrinsic estimate or a temporary GPS error.
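As a rough illustration of the trajectory comparison described above, here is a minimal sketch in Python. It is not GAIA's tool: it assumes both trajectories are given as N x 3 arrays with matching rows, that GPS fixes are already converted to local metric coordinates, and that the reconstructed trajectory lives in an arbitrary frame, so it is first aligned with a least-squares similarity transform (the Umeyama method). The function names and the 1 m threshold are hypothetical.

```python
import numpy as np

def align_similarity(src, dst):
    """Least-squares scale, rotation, translation mapping src onto dst (Umeyama)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    cov = xd.T @ xs / len(src)
    U, S, Vt = np.linalg.svd(cov)
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0  # avoid returning a reflection
    R = U @ D @ Vt
    var_s = (xs ** 2).sum() / len(src)
    scale = np.trace(np.diag(S) @ D) / var_s
    t = mu_d - scale * R @ mu_s
    return scale, R, t

def trajectory_rmse(recon, gps):
    """RMSE (in GPS units) between the aligned reconstruction and the GPS track."""
    scale, R, t = align_similarity(recon, gps)
    aligned = scale * (R @ recon.T).T + t
    return float(np.sqrt(((aligned - gps) ** 2).sum(axis=1).mean()))

def is_point_cloud_suspect(recon, gps, max_rmse_m=1.0):
    # max_rmse_m is a hypothetical threshold; tune it on known-good sites.
    return trajectory_rmse(recon, gps) > max_rmse_m
```

A site whose residual stays high even after the best possible alignment would be set aside before segmentation, since no rigid rescaling can explain the gap.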
- 3D instance segmentation in a point cloud
Segmenting a point cloud into component-wise point groups is not a trivial challenge, especially if the components are placed densely on the tower port. Clustering may work only if the point cloud is clean and its components are placed sparsely. Considering the average quality of available point clouds, GAIA used 2D-3D co-segmentation by creating a 3D bounding cube from multiple 2D bounding boxes that a 2D object detector generates for the initial 3D point segmentation.
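The idea of turning several 2D bounding boxes into an initial 3D point group can be sketched as follows. This is an illustrative reading of the approach, not GAIA's implementation: it assumes each view comes with a known 3 x 4 camera projection matrix and one detector box in pixels, and it keeps the 3D points whose projections fall inside the box in enough views. All function names are hypothetical.

```python
import numpy as np

def project(points, P):
    """Project N x 3 world points with a 3 x 4 camera matrix P."""
    homo = np.hstack([points, np.ones((len(points), 1))])
    uvw = homo @ P.T
    depth = uvw[:, 2]
    safe = np.where(depth > 0, depth, 1.0)  # avoid dividing by non-positive depth
    return uvw[:, :2] / safe[:, None], depth

def points_in_boxes(points, views, min_views=None):
    """views: list of (P, (xmin, ymin, xmax, ymax)) pairs, one per image.

    Returns a boolean mask of points seen inside the box in at least
    min_views views (all views by default).
    """
    votes = np.zeros(len(points), dtype=int)
    for P, (x0, y0, x1, y1) in views:
        uv, depth = project(points, P)
        inside = (
            (depth > 0)
            & (uv[:, 0] >= x0) & (uv[:, 0] <= x1)
            & (uv[:, 1] >= y0) & (uv[:, 1] <= y1)
        )
        votes += inside
    needed = len(views) if min_views is None else min_views
    return votes >= needed

def bounding_cube(points):
    """Axis-aligned 3D box (min corner, max corner) around a point group."""
    return points.min(axis=0), points.max(axis=0)
```

Requiring agreement from every view is strict; with noisy detections, a lower `min_views` trades precision for recall.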
- Component-wise mesh model generation
To be imported into structural analysis tools (such as Revit) for simulation, or into BIMs for management, each point group should be translated to a mesh model with the right model ID, which includes the necessary properties. By leveraging prior knowledge about the components of interest, we generate an approximate mesh model, either a cube (for an antenna, remote radio unit (RRU), or tower-mounted amplifier (TMA)) or a cylinder (for mounting bars), which is enough to estimate the dimensions, position, and orientation of each component. Antenna position (which sector it is in) and orientation (azimuth, mechanical tilt, and plumb), for example, are critical to site inspection because they are dominant factors in determining the radio coverage of the cell tower. If the component is identifiable by its model ID, the estimated model can be substituted with a more precise CAD model from the vendor.
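To make the cube approximation concrete, the sketch below fits an oriented box to one antenna's point group with principal component analysis and reads off azimuth and mechanical downtilt. It is one plausible way to do this, not necessarily how VNRO does it. The assumptions are loud ones: coordinates are local east-north-up in metres, the panel's longest side is roughly vertical, its thinnest side is the facing direction, and the tower centre is known so the face can be pointed outward. Function and key names are hypothetical.

```python
import numpy as np

def fit_oriented_box(points):
    """PCA box fit: returns centre, axes (rows: long, mid, short), and dims."""
    mean = points.mean(axis=0)
    centered = points - mean
    _, eigvecs = np.linalg.eigh(np.cov(centered.T))  # ascending eigenvalues
    axes = eigvecs[:, ::-1].T                        # descending variance
    coords = centered @ axes.T
    lo, hi = coords.min(axis=0), coords.max(axis=0)
    center = mean + ((lo + hi) / 2.0) @ axes
    return center, axes, hi - lo

def antenna_pose(points, tower_xy):
    """Estimate dimensions, azimuth, and downtilt of a panel antenna (ENU frame)."""
    center, axes, dims = fit_oriented_box(points)
    normal = axes[2]  # thinnest direction, assumed to be the panel's facing axis
    # PCA leaves the sign ambiguous: make the face point away from the tower.
    if normal[:2] @ (center[:2] - np.asarray(tower_xy, dtype=float)) < 0:
        normal = -normal
    azimuth = np.degrees(np.arctan2(normal[0], normal[1])) % 360.0  # clockwise from north
    downtilt = np.degrees(np.arcsin(np.clip(-normal[2], -1.0, 1.0)))
    return {
        "center": center,
        "dims_m": dims,  # length, width, depth
        "azimuth_deg": float(azimuth),
        "downtilt_deg": float(downtilt),
    }
```

On noisy point groups the extents will be inflated by outliers, so a robust variant would trim extreme points before measuring the box.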
Figure 1 below is a sample page of the analysis report which is generated by the VNRO pipeline for a US cell tower. The page shows the spatial layout of twelve antennas with their respective sector ID and orientations.
Neural Radiance Field (NeRF), a New Hope
For the last two years in the computer vision domain, a new technology called a Neural Radiance Field (NeRF) has rapidly emerged as an alternative way of creating a 3D representation using deep neural networks or equivalents.
Although NeRF was not originally designed to generate point clouds or meshes directly, we adopted several fast NeRF variants with some modifications and improved both the quality of the point clouds and the processing throughput. As a result, the processing time for each site dropped from several hours to less than one hour, with crisper point clouds and reasonable robustness to environmental conditions. A significant proportion of the sites where COLMAP failed were recovered by applying NeRF.
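The article does not say how the point clouds are obtained from the NeRF. One common approach, shown here purely as an assumption, is to sample the learned volume density on a regular grid and keep the locations where it exceeds a threshold. In this sketch `density_fn` is a hypothetical stand-in for a trained model's density query, and the resolution and threshold are placeholder values.

```python
import numpy as np

def density_to_points(density_fn, bounds_min, bounds_max, resolution=64, threshold=10.0):
    """Sample a density field on a grid and return the points above threshold.

    density_fn: callable mapping an (N, 3) array of positions to N densities
                (hypothetical stand-in for a trained NeRF's density output).
    """
    axes = [np.linspace(lo, hi, resolution) for lo, hi in zip(bounds_min, bounds_max)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)
    sigma = np.asarray(density_fn(grid))
    return grid[sigma > threshold]
```

Memory grows with the cube of `resolution`, so a real pipeline would likely query the field in chunks.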
Video 1 shows a time lapse of NeRF training progress for one of Ericsson’s cell towers. You will see the incremental enhancement of spatial resolution over time.
By leveraging this fast-evolving technology, we expect better granularity of our part-based tower models down to smaller parts such as cables or connectors, which may lead to models that are granular enough for structural analysis.
Asset catalog generation
The last challenge in asset management is identifying each device model to cross-check that all components are installed as registered. Among the various devices on the tower port, the antenna is the most important component for determining the characteristics of the cell: if we know the antenna model, we can run a more accurate radio simulation by importing its known radio properties.
However, identifying antenna models is not easy: it involves picking from thousands of models that look very similar to each other (novices find it particularly hard to tell them apart by appearance), and vendors change their lists constantly. In addition, the training images are naturally biased toward a few popular models, which creates a severe class imbalance problem for machine learning practitioners.
To tackle these issues, GAIA adopted visual search in the context of metric learning. A fine-tuned combination of a Vision Transformer as the embedder, approximate nearest neighbors (ANN) search, and sampling techniques for active learning produced a promising self-test result: our antenna model identifier achieved a 0.93 F1 score for 200 candidates. Another test, which measured the accuracy of k-nearest neighbors, revealed a stable and consistent F1 score distribution across k = 1, 2, and 3, suggesting that the approach can scale to thousands of models.
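The visual search step can be pictured as a nearest-neighbor lookup in embedding space. The sketch below is a simplified stand-in: it assumes embeddings have already been produced by some trained embedder, and it swaps the approximate nearest neighbors index for an exact brute-force cosine search, which is fine for a small gallery but not for production scale. The function name and labels are hypothetical.

```python
from collections import Counter

import numpy as np

def l2_normalize(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def knn_identify(query_emb, gallery_emb, gallery_labels, k=3):
    """Predict a model label for each query by majority vote over its k nearest
    gallery embeddings (exact cosine search, not ANN)."""
    q = l2_normalize(np.atleast_2d(np.asarray(query_emb, dtype=float)))
    g = l2_normalize(np.asarray(gallery_emb, dtype=float))
    sims = q @ g.T
    top = np.argsort(-sims, axis=1)[:, :k]  # most similar first
    preds = []
    for row in top:
        votes = Counter(gallery_labels[i] for i in row)
        # Counter keeps insertion order, so ties go to the closest neighbor.
        preds.append(votes.most_common(1)[0][0])
    return preds
```

Because new antenna models only require adding embeddings to the gallery, this style of identifier does not need retraining a classifier head every time a vendor list changes.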
GAIA Silicon Valley proposes a fully automated site data capture for creating a functional digital twin of a cell tower from drone pictures using modern computer vision technologies. The generated part-based tower model from VNRO will contribute to scaling up BIM/ESDT by accelerating the modeling of legacy sites, as well as supporting faster site inspection and maintenance of 5G and other sites in the field. This efficient site data capture will also reduce truck rolls and tower climbs, contributing to reduced carbon footprints.
More on the Ericsson Site Digital Twin
Harnessing drones and applying AI to mobile radio site management
Acknowledgements: Special thanks to the project members in GAIA Silicon Valley: Michael Siemon, Aditya Shah, Qing Wang, Wanlu Lei, and Aviv Bachan. We also appreciate cross-organizational contributions from Thalanayar Muthukumar, Fernando Martinez, Vidya Mani, Sathya Narayanan, Volodya Grancharov, Jiangning Gao, and Rerngvit Yanggratoke.