Towards cross-ecosystem interoperability with automated data model conversions
We are moving towards a fully connected world, where there is an increasing need to interact between digitalized and siloed ecosystems. For example, if there is an automotive accident, cars (belonging to the automotive ecosystem) need to be able to automatically contact health and safety officials (the public safety ecosystem).
However, different ecosystems have traditionally been developed independently since there hasn’t been the need for online interworking between them. As a result, the way data is represented by different ecosystems varies a lot, with many different data models describing the same or similar things in various ways. This makes it a costly and complex challenge to achieve smooth interoperability between devices.
At Ericsson Research we are developing solutions for some of the key challenges for interoperable IoT ecosystems, in particular for enabling interoperable data and interaction in heterogeneous environments.
In our previous blog post Data interoperability across IoT ecosystems with One Data Model (OneDM), we described the purpose of the OneDM liaison group and the Semantic Definition Format (SDF). The OneDM group was formed in 2019 to create more efficient methods for converting data models between different IoT ecosystems, enabling easier integration and interoperability (see Figure 1). In that blog post we also briefly described the tools that we have been developing for automated conversions between IPSO and SDF data models.
In this article, we delve deeper into some of the considerations when developing conversion mechanisms. Since the publication of the previous blog post, we have extended the tools in OneDM to cover more data model formats such as the Azure Digital Twins Definition Language (DTDL) and the ETSI NGSI-LD. A subset of the conversion tools has now been published as an open-source GitHub repository to support the standards development. In addition, publishing the tools as open-source software enables the research and development (R&D) community to experiment with them and understand how automated conversions can be achieved, including their limitations and challenges.
Modeling devices and systems with data and information models
Devices and other entities must be formally described before their deployment and usage can be automated. This requires that:
- the capabilities of the devices are identified, and
- a commonly understood semantic language is used to describe them, producing data models that unambiguously define the interactions with the actual devices and entities.
In the context of IoT, data models and information models provide the means to describe simple items or more complex devices consisting of multiple items. Over the years, there have been discussions about the differences between information models and data models, and the IETF created a document to clarify the distinction. It can be summarized as follows:
- An information model is a higher-level concept that describes various entities and the relationships between them. It hides the implementation and protocol details.
- A data model, in contrast, is derived from the information model and includes protocol details as well as other implementation-related information. Data models typically target implementors.
This article follows the terminology defined in the Internet Engineering Task Force (IETF) document. For data model conversions, our work focuses on class-level information and does not include instance-related details. The representation of instance-specific information is not yet defined in SDF; that work is ongoing in the IETF.
Affordances
Each device offers capabilities on its interfaces for interaction. These capabilities define the device's possible uses or make clear how it can or should be used. They are known by the generic name affordances, as defined in the SDF terminology and also used in the Web of Things. Affordances can be readable or writable values (properties), the device's capabilities to push information asynchronously to data consumers (events), and operations that can be triggered on the device (actions).
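The three affordance categories can be pictured as a small data structure. The following is a minimal sketch only: the class names, field names and the thermostat example are hypothetical illustrations and are not taken from SDF or from any of the tools discussed here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Property:
    """A value that can be read and, optionally, written."""
    name: str
    readable: bool = True
    writable: bool = False


@dataclass
class Event:
    """Information the device pushes asynchronously to data consumers."""
    name: str


@dataclass
class Action:
    """An operation that can be triggered on the device."""
    name: str


@dataclass
class DeviceModel:
    """Class-level description of a device and its affordances."""
    name: str
    properties: List[Property] = field(default_factory=list)
    events: List[Event] = field(default_factory=list)
    actions: List[Action] = field(default_factory=list)

    def affordance_names(self) -> Dict[str, List[str]]:
        """Group the affordance names by category."""
        return {
            "properties": [p.name for p in self.properties],
            "events": [e.name for e in self.events],
            "actions": [a.name for a in self.actions],
        }


# Hypothetical example device
thermostat = DeviceModel(
    name="Thermostat",
    properties=[Property("temperature"), Property("setpoint", writable=True)],
    events=[Event("overheated")],
    actions=[Action("reset")],
)
print(thermostat.affordance_names())
```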
Describing Objects and Things with data models
For a device to be deployed automatically it must have a formal machine-readable description. This formal description can be created using different data modeling languages or schemes such as the Digital Twins Definition Language (DTDL) and the OMA Lightweight M2M (LwM2M) object model used by IPSO Smart Objects. However, since the different modelling languages describe the device affordances in different ways, the resulting data models do not directly enable interoperability - even if they describe the same thing.
Figure 2 shows a snippet of a temperature sensor Data Model using the IPSO model scheme.
In IPSO, sensors are modelled as Objects, and the object capabilities are described as Resources. This temperature object example shows one resource, namely Sensor Value, describing the current temperature value. The complete object model defines additional resources, such as the measurement unit used by the device, the range of the measured values, and the type of the value provided. These are omitted from the figure for brevity. The complete object definition can be found from the LwM2M GitHub repository.
Sensors are rarely used as standalone devices. In most cases they are part of a more complex system. For example, a device measuring different weather conditions may consist of temperature, moisture and air pressure sensors. When all these individual sensors reside on a single physical device, we may want to model the whole device instead of each component separately. Thus, the individual data models representing the sensors are merged into one model, which represents the whole system.
Ecosystems
An ecosystem is a network of interacting components or players, and it can be defined in a number of different ways. It might exist within a specific industry sector, be defined by an isolated challenge or solution, or form around a single key actor that interacts with a range of partners and suppliers, all characterized by well-defined common processes or goals. Within these ecosystems, there is a reasonably high expectation of interoperability, meaning that devices can communicate and understand each other on the semantic level. That is, they do not just exchange arbitrary values but also share a common understanding of the meaning of these values and their context. Typically, these ecosystems are designed and implemented using ecosystem-specific information and data models with protocols that are able to fulfill the needs of the target system. It’s common that players within an ecosystem agree on specific details to get all components working together smoothly.
Ecosystems are often focused on different use cases. For example, one ecosystem can be designed to support standard factory floor operations with all the related equipment, while another could be designed for flight controlling systems. Until recently, there has been little need to interoperate among such systems, which has led to many independent, non-interoperable implementations.
Data model conversion
Enabling interoperability
To enable interoperability between data models written in different data modelling languages, it should be possible to convert the data models between the two language formats without losing any relevant information. Given that languages express the affordances using different vocabularies, a well-defined mapping between the affordance vocabularies is needed. In our experience, a large portion of the affordances often maps easily one-to-one.
Based on the above description, given n ecosystems, mappings from each ecosystem to every other ecosystem need to be created, resulting in O(n^2) two-way conversions. If there are ten ecosystems, 45 two-way mappings are required. Doing this manually is cumbersome, highly inefficient and error-prone. Further, this approach is not scalable since the device "properties" are not static. Whenever a data model changes, the change needs to be propagated across every mapping that involves that model and then re-implemented in the integration software. This approach is, however, the most common mechanism for connecting different ecosystems together.
To avoid the need to make direct mappings between all models in all the ecosystems, OneDM uses the Semantic Definition Format (SDF) as an intermediate model (Figure 1). In this way, only mappings from each ecosystem to and from the SDF model need to be created, reducing the number of needed conversions to n (i.e., the number of ecosystems). In case of modifications in one data model, only one conversion is affected. The challenge in using this approach is to make SDF sufficiently generic but also expressive enough to support as many ecosystems as possible.
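The difference in scaling between the two approaches can be checked with a few lines of arithmetic. This is an illustrative sketch; the function names are our own. With direct mappings every pair of ecosystems needs its own two-way mapping, which gives n(n-1)/2, while with an intermediate model each ecosystem needs only one two-way mapping.

```python
def direct_mappings(n: int) -> int:
    """Two-way mappings needed when every pair of ecosystems is mapped directly."""
    return n * (n - 1) // 2


def intermediate_mappings(n: int) -> int:
    """Two-way mappings needed when each ecosystem maps only to and from
    a shared intermediate model such as SDF."""
    return n


for n in (3, 10, 20):
    print(f"{n} ecosystems: direct={direct_mappings(n)}, "
          f"intermediate={intermediate_mappings(n)}")
```

For ten ecosystems this gives the 45 direct mappings mentioned above, compared with 10 mappings through the intermediate model.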
Mapping concepts between data models
Most of the Internet of Things (IoT) data modeling languages follow roughly the same structure, with each language defining an entity that further contains different affordances, i.e., the interaction capabilities that the entity has. Figure 3 depicts some of the concepts in Microsoft Azure DTDL and IPSO Smart Objects and their mapping with SDF. First, the entity itself is described as an “Object” in IPSO, an “Interface” in DTDL and an “sdfObject” in SDF.
Going deeper inside an entity, we have affordances defined in different ways in different languages. A DTDL Interface consists of zero or more of the following elements: Property, Telemetry, Command, and Relationship. In IPSO Smart Objects, on the other hand, the elements are described as Resources. The IPSO Resources can be Readable (R), Writable (W), or Executable (E). As we can see in Figure 3, there is a clear mapping from the readable and writable Resources in IPSO to the DTDL Properties, and from the executable Resources in IPSO to the DTDL Commands. We can therefore create an SDF model between them, using the sdfProperty and sdfAction affordances, which can be mapped to both DTDL and IPSO Smart Objects.
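The mapping described above can be written down as a small lookup. The sketch below is an illustration under our own assumptions: the function and table names are hypothetical, and only the two mappings named in this paragraph are covered.

```python
# SDF affordance -> corresponding DTDL element, as described in the text
SDF_TO_DTDL = {
    "sdfProperty": "Property",
    "sdfAction": "Command",
}


def ipso_operations_to_sdf(operations: str) -> str:
    """Map the operations of an IPSO Resource ("R", "W", "RW" or "E")
    to the SDF affordance that represents it."""
    if "E" in operations:
        return "sdfAction"
    if "R" in operations or "W" in operations:
        return "sdfProperty"
    raise ValueError(f"unknown IPSO operations: {operations!r}")


def ipso_operations_to_dtdl(operations: str) -> str:
    """Go from IPSO to DTDL through SDF as the intermediate step."""
    return SDF_TO_DTDL[ipso_operations_to_sdf(operations)]


for ops in ("R", "RW", "E"):
    print(ops, "->", ipso_operations_to_sdf(ops), "->", ipso_operations_to_dtdl(ops))
```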
Figure 4 portrays a conversion snippet from a temperature sensor’s data model, from the original IPSO model (Object 3303) to the SDF model. Further, Figure 5 shows the conversion from the SDF model to the DTDL model. This example shows how the IPSO Resources are mapped to the sdfProperties in SDF and further to the DTDL Properties.
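The two-step conversion can be sketched in code as follows. This is a heavily simplified illustration and not the implementation of the published tools: the input dictionary layout, the function names, the type tables and the "dtmi:example" identifier prefix are all assumptions made for this example, and only readable and writable Resources are handled.

```python
import json

# Assumed, simplified type correspondences between the three formats
IPSO_TO_SDF_TYPE = {"Float": "number", "Integer": "integer",
                    "String": "string", "Boolean": "boolean"}
SDF_TO_DTDL_SCHEMA = {"number": "double", "integer": "integer",
                      "string": "string", "boolean": "boolean"}


def ipso_to_sdf(ipso_object: dict) -> dict:
    """Convert a simplified IPSO object description to an SDF-style model."""
    properties = {}
    for res in ipso_object["resources"]:
        if "E" in res["operations"]:
            continue  # executable Resources would become sdfAction; omitted here
        properties[res["name"].replace(" ", "_")] = {
            "label": res["name"],
            "type": IPSO_TO_SDF_TYPE[res["type"]],
            "readable": "R" in res["operations"],
            "writable": "W" in res["operations"],
        }
    return {"sdfObject": {ipso_object["name"]: {"sdfProperty": properties}}}


def sdf_to_dtdl(sdf_model: dict, dtmi_prefix: str = "dtmi:example") -> list:
    """Convert each sdfObject to a DTDL-style Interface with Properties."""
    interfaces = []
    for obj_name, obj in sdf_model["sdfObject"].items():
        contents = [
            {"@type": "Property",
             "name": name,
             "schema": SDF_TO_DTDL_SCHEMA[prop["type"]],
             "writable": prop.get("writable", False)}
            for name, prop in obj.get("sdfProperty", {}).items()
        ]
        interfaces.append({
            "@context": "dtmi:dtdl:context;2",
            "@id": f"{dtmi_prefix}:{obj_name};1",  # hypothetical identifier
            "@type": "Interface",
            "displayName": obj_name,
            "contents": contents,
        })
    return interfaces


# Simplified stand-in for the IPSO temperature object (Object 3303)
ipso_temperature = {
    "name": "Temperature",
    "object_id": 3303,
    "resources": [
        {"name": "Sensor Value", "operations": "R", "type": "Float"},
    ],
}

sdf_model = ipso_to_sdf(ipso_temperature)
dtdl_model = sdf_to_dtdl(sdf_model)
print(json.dumps(sdf_model, indent=2))
print(json.dumps(dtdl_model, indent=2))
```

Because both functions only read and write the intermediate SDF structure, adding another format would mean adding one more pair of functions rather than touching the existing ones.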
Challenges in conversions
In our experience, most parts of the data models can be converted between formats supported by the different ecosystems. However, there are still some gaps that need to be considered. In the following, we give a couple of examples of challenges that we have faced, together with the solutions we have proposed:
Due to differing ecosystem requirements, there may be situations where the target ecosystem’s data modelling format doesn’t support some features defined in the source data model. This means that additional steps may be required to create a conversion.
For example, a device-initiated event is defined as sdfEvent in SDF and Telemetry in DTDL. This can be used to describe the mechanism to deliver e.g., information from changing sensor values asynchronously to the data consumer. However, in IPSO this is implemented using the LwM2M SEND interface without specifically defining it in the IPSO Data Model (see Figure 3).
Sometimes data models represent specific properties which do not map to other data models. One example is the modelling of complex relations between entities, which may not have a corresponding definition in other data models. Figure 3 shows the DTDL Relationship element that is used to describe arbitrary relations between different entities. However, the first version of SDF, specified in the IETF standard draft, does not have a definition for complex relations. Consequently, for this particular case, we have proposed an SDF extension called sdfRelations, which allows similar relation descriptions as in DTDL.
Every such difference in data model conversion must be handled carefully when the conversion software is designed. This might mean specifying how to create data models with certain caveats in mind so that conversion becomes possible, or even using comments to provide extra context to the conversion tool. Identifying these differences also makes it possible to influence the specifications in other ecosystems, which further facilitates interoperability in the future.
Challenge with units and their conversions
While there are multiple different measurement systems in the world, the two most commonly used are the metric (SI) units and imperial units. DTDL, for instance, primarily uses the imperial system, while SDF uses the SI-based metric units as described in the Sensor Measurement Lists (SenML) units registry. There have been many examples of how incorrect handling of unit conversions can have a catastrophic outcome. For example, in 1999, the Mars Climate Orbiter was lost due to a mix-up of two unit systems: the software that calculated the thruster impulse data provided its results in pound-force seconds (lbf·s), while the software calculating the trajectory expected input in newton-seconds (N·s). As a result, the trajectory calculation was off by a factor of 4.45, and the spacecraft was lost on its arrival at Mars.
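The factor of 4.45 follows from the definition of the pound-force: the avoirdupois pound (0.45359237 kg) multiplied by standard gravity (9.80665 m/s²). The following small sketch reproduces it; the impulse value of 100 is an arbitrary illustration, not mission data.

```python
KG_PER_POUND = 0.45359237   # definition of the avoirdupois pound
STANDARD_GRAVITY = 9.80665  # m/s^2, standard acceleration of gravity

# One pound-force expressed in newtons
NEWTONS_PER_LBF = KG_PER_POUND * STANDARD_GRAVITY


def lbf_s_to_n_s(impulse_lbf_s: float) -> float:
    """Convert an impulse from pound-force seconds to newton-seconds."""
    return impulse_lbf_s * NEWTONS_PER_LBF


impulse_lbf_s = 100.0                   # arbitrary illustrative value
correct = lbf_s_to_n_s(impulse_lbf_s)   # what the consumer should receive
misread = impulse_lbf_s                 # same number wrongly taken as N·s
print(f"error factor: {correct / misread:.2f}")
```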
For the DTDL – SDF conversion tool, we have created mappings between the DTDL units and the SI-based SenML units so that the conversion between the units can be done smoothly. Figure 6 shows how the units are converted between these different data models. If the source model uses a unit that is not one of the standard units used by SDF, the unit can either be expressed as a URI (pointing to the definition of that unit) or be replaced by a standard SDF unit, with the values translated accordingly. In Figure 6 the latter approach is used, and the information needed for the translation is stored outside of the model.
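The second approach, using a standard unit and translating the values, can be sketched with a lookup table that is kept outside the data model. The table below is an illustrative assumption covering only a handful of units; it is not the mapping used by the published tool, and the function name is hypothetical.

```python
from typing import Callable, Dict, Tuple

# DTDL unit name -> (SenML unit symbol, function translating the value)
UNIT_MAP: Dict[str, Tuple[str, Callable[[float], float]]] = {
    "degreeCelsius": ("Cel", lambda v: v),
    "kelvin": ("K", lambda v: v),
    "degreeFahrenheit": ("Cel", lambda v: (v - 32.0) * 5.0 / 9.0),
    "metre": ("m", lambda v: v),
    "foot": ("m", lambda v: v * 0.3048),
}


def to_senml(value: float, dtdl_unit: str) -> Tuple[float, str]:
    """Translate a value and its DTDL unit to a SenML unit.

    Raises ValueError for units without a mapping, so that an unknown
    unit is never silently passed through unconverted.
    """
    try:
        senml_unit, translate = UNIT_MAP[dtdl_unit]
    except KeyError:
        raise ValueError(f"no SenML mapping for DTDL unit {dtdl_unit!r}") from None
    return translate(value), senml_unit


print(to_senml(212.0, "degreeFahrenheit"))
print(to_senml(10.0, "foot"))
```

Failing loudly on an unmapped unit is a deliberate choice in this sketch, since silently forwarding a value in the wrong unit is exactly the kind of error described above.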
Furthermore, there are existing ontologies such as the OM 2 (Ontology of Units of Measure) that can be used to facilitate unit conversions. OM 2 defines an extensive set of units and formulae for conversions between them. Utilizing such ontologies when describing the units in data models makes unit translation more automatable and reliable.
Current status of implementations and future work
Tools published in the Ericsson Research GitHub
We have been implementing software tools for conversion of data models to support standardization activities. Some of these tools are now publicly accessible via the Ericsson Research public GitHub repository. The tools have been published to enable an easy and automated way to create SDF models from data models in different ecosystems and vice versa.
The tools that are currently published include:
- SDF to/from IPSO models: This tool converts IPSO models to and from SDF models. A set of SDF objects that have been created from IPSO models is available at the OneDM OMA model GitHub repository.
- SDF to/from DTDL models: This tool converts DTDL models to and from SDF models. The current implementation supports the basic operations. As mentioned earlier, there are some features that are not yet available in the SDF specification, such as arbitrary relations between the objects. Therefore, an initial design for sdfRelations has been implemented in the tool, allowing experimentation with high-fidelity conversions.
- Thing Creator: An sdfThing is a model of a complex object consisting of one or more atomic objects or even other sdfThings. To simplify the creation of such definitions, this tool combines the given SDF objects into an SDF Thing that fully describes, for example, a device that has multiple sensors and actuators installed on it. Having one model that contains everything logically organized and describes all the operations that the device can perform helps the deployment and usage of the device.
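The idea behind combining objects into a Thing can be illustrated with a short sketch. This is not the Thing Creator implementation: the function name, the simplified model layout and the weather station example are assumptions, and a real SDF document carries additional information that is omitted here.

```python
from typing import Dict, List


def create_thing(thing_name: str, sdf_models: List[dict]) -> Dict[str, dict]:
    """Combine the sdfObjects of several SDF-style models into one sdfThing."""
    combined = {}
    for model in sdf_models:
        for obj_name, obj in model.get("sdfObject", {}).items():
            if obj_name in combined:
                # Refuse to overwrite silently when two models clash
                raise ValueError(f"duplicate object name: {obj_name}")
            combined[obj_name] = obj
    return {"sdfThing": {thing_name: {"sdfObject": combined}}}


def sensor_model(name: str) -> dict:
    """Build a minimal single-property sensor model for the example."""
    return {"sdfObject": {name: {"sdfProperty": {"Sensor_Value": {"type": "number"}}}}}


# Hypothetical weather station built from three sensor models
station = create_thing(
    "WeatherStation",
    [sensor_model("Temperature"), sensor_model("Humidity"), sensor_model("Pressure")],
)
print(sorted(station["sdfThing"]["WeatherStation"]["sdfObject"]))
```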
Future direction
The work described in this blog post is currently under development. The main focus of this activity is to increase the flexibility of the SDF data model format, as well as to identify and define how to handle various corner cases. In addition, we have been experimenting with conversions using data model formats from other ecosystems, including OPC-UA and NGSI-LD to name a few. These tools and the findings from the experiments have not yet been published, so stay tuned!
Read more
Read our blog post Data interoperability across IoT ecosystems with One Data Model (OneDM)
The Ericsson Research public GitHub repository
One Data Model’s Liaison Group’s home page
One Data Model’s Liaison Group’s GitHub page
Read about our research on the future Digitalized programmable world