Data harmonization, or: What is my throughput called?


Deniz Saner | 25.05.2023 | Wiki | 3 minutes read

We are pleased to welcome you back! In our last blog post, we highlighted how the lack of a global naming scheme for control programming causes significant effort in machine data acquisition, as one has to infer the desired physical quantities from cryptic designations. This effort scales with the number of machines in heterogeneous machine parks.

In this post, we build on this topic and show how to digitize a heterogeneous plant with the ENLYZE Data Platform and establish a uniform naming scheme.

Overview Blog Series Connectivity & Machine Data:

  1. OPC UA: Blessing or Curse for Industry 4.0?

  2. Digitalization Dilemma: Working for the data or working with the data

  3. From Euromap, data blocks, and harmonized data

  4. Data harmonization, or: What is my throughput called?

  5. How edge devices don’t become a security gap

  6. No more closed systems

  7. With ENLYZE and Grafana ready for all challenges in production

  8. The key to AI in production

Lesson #4: It takes a uniform process to tame the chaos 

As mentioned in the last post, an average of three data sources per plant is integrated on the ENLYZE Data Platform, which together contain about 13,000 variables. However, only about 67 of these 13,000 variables are actually needed and recorded. The relevant 0.5% must therefore be identified from among several thousand variables.

After successfully digitizing over 100 systems, we have identified five steps that can help you systematically move from connecting your system to value creation through digitization.

Here are the five steps you should go through:

  1. Choose the application case

  2. Involve all resources

  3. Filter, visualize, and select

  4. Utilize patterns in naming when they exist

  5. Create a consistent naming scheme

🤔 Step 1: Choose the application case

The first and most important step is to choose a clear application case with as much added value as possible. Once this is set, the variables that are truly necessary for each machine emerge automatically. This prevents the analysis paralysis that commonly arises when pondering which variables might become relevant in the future.

From our experience, customers who define a clear application case and its success criteria a priori are the most successful. Often, one learns new things along the way to the goal that affect the next application case. 

Across our customer base, we observe some typical application cases, which we have listed below:

  • Downtime detection and performance tracking (1 - 3 variables)

  • Energy monitoring (1 - 15 variables)

  • Tracking raw material consumption (1 - 20 variables)

  • Process monitoring and alarms (3 - 50 variables)

  • Optimization of setting parameters (10 - 100 variables)

  • Impact and root cause analysis, anomaly detection, and process understanding (50 - 200 variables)

Depending on the application case and the available resources, additional sensors may need to be installed to measure further variables such as ambient temperature or humidity.

Production plants are often structured very differently. Nevertheless, it is safe to say that a solid data basis covering all of the above application cases typically includes data from the main PLC, from peripheral devices (dosing units, pressure stations, or quality systems), and from external energy meters.
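To keep track of these sources per plant, a simple inventory structure can help. The following Python sketch is purely illustrative (the class and field names are assumptions, not part of the ENLYZE platform); it models a plant with three data sources in line with the averages cited above:

```python
from dataclasses import dataclass, field

# Illustrative inventory of data sources for one plant; names and
# counts are made up and do not reflect any real platform schema.
@dataclass
class DataSource:
    name: str            # e.g. "Main PLC", "Dosing unit", "Energy meter"
    protocol: str        # e.g. "S7", "OPC UA", "Modbus TCP"
    variable_count: int  # variables exposed by this source

@dataclass
class Plant:
    name: str
    sources: list = field(default_factory=list)

    def total_variables(self) -> int:
        """Sum the variables across all sources of this plant."""
        return sum(s.variable_count for s in self.sources)

plant = Plant("Extrusion line 1", [
    DataSource("Main PLC", "S7", 11000),
    DataSource("Gravimetric dosing unit", "Modbus TCP", 1500),
    DataSource("Energy meter", "Modbus TCP", 500),
])
print(plant.total_variables())  # 13000
```

Even a table like this, kept per plant, makes the later variable search considerably easier to plan.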

📄 Step 2: Involve all resources 

Any kind of documentation about your data sources can be helpful when selecting the right variables, whether it was created internally or comes from the plant manufacturer. Where possible, talking directly to the plant manufacturer is often the easiest route. With older machines, however, it can be difficult to find any documents or information at all.

The plant PC or HMI computer often contains valuable information, such as project files or Excel lists with notes on variable names and their meanings. In general, the HMI is a very useful source of information. As a rule of thumb: If a value is displayed on the HMI, it can also be read. 

Furthermore, relevant information can be gathered from the variables' metadata:

  • What data type does it have (float, integer, Boolean, or string)?

  • Is there a target and/or an actual value?

  • What value range does the value fall within (a hint at possible scaling)?

Our standard approach is to photograph relevant HMI views during ongoing operations to have reference values later.
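These metadata heuristics can be sketched in code. The snippet below is a minimal illustration, not the platform's implementation; the record fields (`dtype`, `min`, `max`) are assumed names. The 0..27648 range used in the scaling hint is the nominal value range of SIMATIC S7 analog inputs:

```python
from typing import Optional

# Illustrative heuristics over variable metadata; the field names
# ("dtype", "min", "max") are assumptions, not a real schema.
def is_candidate(var: dict) -> bool:
    """Keep numeric variables; Booleans and strings mostly encode control logic."""
    return var["dtype"] in ("float", "int")

def scaling_hint(var: dict) -> Optional[str]:
    """An integer confined to 0..27648 (the nominal S7 analog input
    range) hints at a raw analog value that still needs scaling."""
    if var["dtype"] == "int" and var["min"] >= 0 and var["max"] <= 27648:
        return "possibly an unscaled analog value (0..27648)"
    return None

temp = {"name": "DB10.tempZone1", "dtype": "int", "min": 0, "max": 27648}
print(is_candidate(temp), scaling_hint(temp))
```

Comparing such hints against the reference values photographed from the HMI quickly confirms or refutes a suspected scaling.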

🕵️ Step 3: Filter, visualize, and select

From our experience, the most labor-intensive step is the actual search for the variables. Ideally, documents from step 2 are available from which the exact variable designations can be taken; then the selection is easy. Unfortunately, this is the exception rather than the rule.

When digitizing heterogeneous machine parks, there is the added complication of working across various tools and media: here a project file of an S7-300, there an OPC UA client, and alongside them an old data sheet of a gravimetric dosing unit. If you are currently maintaining a huge Excel sheet with data sources, protocols, variables and their identifiers, scaling factors, and standardized variable names, you probably feel a cold shiver running down your spine.

As we at ENLYZE have been confronted with exactly this challenge for years, a central part of our data platform is a unified, browser-available process for variable selection. This so-called ENLYZE Variable Selection offers the following advantages:

All variables in one place

Once a plant has more than one data source, searching becomes cumbersome if it cannot be done in one place. Switching between programs or computers significantly slows down the process.

The ENLYZE Variable Selection serves as a central point through which all available variables can be searched and managed.

Fine-grained filtering functions

Searching and filtering by different criteria is the main tool for understanding the patterns of the control programming. For this purpose, the variable space in the Variable Selection can be restricted via data-type and free-text filters:

The former is relevant because large parts of the control logic are represented as Booleans, which are often not relevant for recording. Excluding them from the search significantly reduces the variable space.
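The effect of combining both filters can be sketched in a few lines of Python. This is a minimal illustration under assumed data (the variable records and names are invented), not the Variable Selection's actual logic:

```python
# Minimal sketch of a data-type and free-text filter over a variable
# list; the records and names below are invented for illustration.
variables = [
    {"name": "DB20.actTemperature", "dtype": "float"},
    {"name": "DB20.setTemperature", "dtype": "float"},
    {"name": "M10.interlockOk",     "dtype": "bool"},
    {"name": "DB30.screwSpeed",     "dtype": "float"},
]

def filter_vars(vars, dtypes=None, text=""):
    """Restrict the variable space: keep only the given data types and
    names containing the (case-insensitive) search text."""
    return [
        v for v in vars
        if (dtypes is None or v["dtype"] in dtypes)
        and text.lower() in v["name"].lower()
    ]

hits = filter_vars(variables, dtypes={"float"}, text="temperature")
print([v["name"] for v in hits])
# ['DB20.actTemperature', 'DB20.setTemperature']
```

Excluding Booleans and searching for a physical quantity in one step narrows thousands of candidates down to a handful worth inspecting.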

By selectively applying