Preprint
Article

This version is not peer-reviewed.

Regression Metamodel-Based Digital Twin for an Industrial Dynamic Crossflow Filtration Process

A peer-reviewed article of this preprint also exists.

Submitted: 26 January 2024

Posted: 29 January 2024

Abstract
Dynamic crossflow filtration (DCF) is the state-of-the-art technology for solid-liquid separation from viscous and sensitive feed streams in the food and biopharma industry. Up to now, the potential of industrial processes is often not fully exploited because fixed recipes are usually applied to run the processes. Therefore, a digital twin has been developed to optimize an industrial brownfield DCF plant. The core of the digital twin is a mechanistic-empirical process model combining fundamental filtration laws with process expert knowledge. The effect of varying selected process and model parameters on plant productivity has been assessed using a model-based design-of-experiments approach, and a regression metamodel has been trained with the resulting data. A cyclic program that communicates bidirectionally with the DCF asset serves as the frame of the digital twin. It monitors the process dynamics via membrane torque and transmembrane pressure and feeds the optimum permeate flow rate setpoint back to the physical asset in near real time during process runs. The presented digital twin framework is a simple example of how an industrially established process can be controlled by a hybrid model-based algorithm. With a digital process dynamics model at hand, the presented metamodel optimization approach can be easily transferred to other (bio)chemical processes.
Subject: Engineering - Bioengineering

1. Introduction

Since the term Industry 4.0 was introduced at the Hanover Fair in 2011, the importance of the development of cyber-physical production systems (CPPS) is growing faster than ever also in the bio-manufacturing industry. CPPS are physical production plants equipped with computation processing units used to automatically control the process. The realization of a cyber-physical system for manufacturing is generally understood to require, among other things, a virtual replica of the process that bidirectionally interacts with the physical process, the collection and usage of process data and the optimization of the process with the help of intelligent algorithms [1]. The virtual part of a CPPS including the digital process model and the algorithms is called digital twin (DT) [2,3].
Concepts and requirements for digital twins in manufacturing have frequently been presented and reviewed in the literature [4,5,6,7,8,9,10,11,12,13,14,15], whereas practical use cases in which a digital twin has been developed for a real-life process are rare. Lopez and co-workers built a hybrid model for a fermentation process that is updated with data during the process [16], but they did not close the information loop from the digital part back to the physical process. A digital process replica with such a “one-way” information flow from the physical to the digital part is called a digital shadow [4,16]. Trunzer et al. used data-driven models trained with expert knowledge to identify the operational states of an industrial chemical reaction plant from the process data [17]. However, the models are not used for plant process optimization. Chew et al. used data-driven models to control a filtration process [18]. The optimization algorithm was tested on a laboratory filtration unit, and a benefit for the control of industrial filtration units is predicted.
By definition, digital twins rely on digital process models, the backbone that makes process monitoring, simulation and optimization possible [19,20,21]. In the literature, mathematical models used to describe bioprocesses are divided into mechanistic and data-driven models [19,22]. Mechanistic models are based on first principles, namely mass and energy balances, and on prior process knowledge. Data-driven models are built on process data without incorporating prior knowledge. The trend of recent years is to combine mechanistic and data-driven models into hybrid model structures in order to profit from the extrapolation capability of mechanistic models and the flexibility of data-driven models at the same time [22,23]. Mechanistic and data-driven models can be arranged in different serial or parallel configurations to form hybrid models, depending on the modeling goal [24,25]. The most commonly used model configuration in biochemical engineering is a serial structure of a data-driven model and a mechanistic model, wherein the data-driven model fills in the parts of the overall model that cannot be expressed based on first principles [24].
Regarding the model description of filtration processes, a spectrum of different approaches and use cases is presented in literature [26]. Krippl and coworkers connected a neural network with mass balances in a serial model configuration to predict the permeate flux over time in a conventional crossflow filtration unit [27]. Piron et al. applied the serial model configuration to yeast filtration [28]. Díaz et al. used a serial neural network and Darcy law configuration for flux prediction, but did not use the model for process control and optimization [29]. Chew et al. used a similar model structure for the prediction of fouling and filtration resistances of an industrial water clarification process [30]. Also here, the model was not yet further used for process optimization.
A special type of model interconnection is the concept of metamodels or surrogate models [19,22,31,32]. A surrogate model is defined as a superordinate, simplified model based on the knowledge of the original, underlying model. The metamodel reduces complexity and calculation time to enable model calls during real-time optimization of a process. Franzreb et al. used a model-based design-of-experiments (DoE) approach to make statements about the economic variables of a complex antibody production process [33]. Wang et al. trained a neural network with data generated from a mechanistic model of protein chromatography and recognized advantages of the metamodel in quickly predicting model parameters to adapt the mechanistic model to new feed streams [34]. Reports on a metamodel built for a filtration use case could not be found in the literature.
This paper focuses on the process of dynamic crossflow filtration (DCF). DCF is a high-performance filtration technique that relies on rotating discs to reduce the deposition of fouling components on the membrane surface [35,36]. Modeling of rotating membrane discs requires consideration of physical phenomena occurring only with DCF, such as cake reduction and fluid backpressure due to rotation of the membranes [37,38]. In this publication, we show how we built a mechanistic, semi-empirical, hybrid process model of membrane fouling during DCF of grape must. We present a concept for training a data-driven metamodel based on model-based DoE and response surface modeling and showcase how to apply the resulting metamodel in a straightforward way for the time-critical forecasting of optimum process productivity. The model collective is implemented in a digital twin framework that reads the torque and TMP parameters at regular time intervals from the physical DCF plant during the process run and returns the permeate flow rate setpoint calculated for optimum productivity to the process actuator. The added value of introducing the digital twin to the industrial process is examined by comparing the overall productivity between the 2022 and 2023 grape must filtration campaigns.

2. Materials and methods

In this section, the plant set-up of the industrial filtration runs and an exemplary set of process data, which is used for model construction, are presented. The structure and the elements of the digital model collective are explained, followed by a description of the superordinate structure of the digital twin framework.

2.1. Production set-up and operation

The experimental set-up for the filtration runs is shown in Figure 1. The filtration machine (F1) is a DCF type number 312/32 (Andritz Separation GmbH, Vierkirchen, Germany) equipped with four shafts and 256 ceramic membrane discs. Permeate passes through the porous membrane discs and it is discharged from the machine inside the hollow shafts. Retentate is collected inside the process chamber and can be released by opening a valve (V1). The membrane discs are mounted on top of each other on every shaft, so that they form membrane stacks. Adjacent membrane disc stacks overlap, causing increased shear rate at the overlap region, thus achieving an increased cleaning effect from solid deposits at the overlapping points. The total available filter area is 32.8 m2, the membrane diameter is 312 mm.
The feedstock considered for the filtration experiments of this research study is pressed grapes containing pulp and other solids. Thus, the valuable product of the process is the clarified sweet must, i.e. the permeate. A membrane pore size of 200 nm is used for the filtration process. The filtration runs were performed at a production facility of a wine manufacturer in Italy during the regular sweet must production campaigns from August to October 2022 and 2023.
A centrifugal pump (P1) transports the feed from a storage tank into the DCF chamber. The feedstock is periodically mixed in the storage tank; however, a fully uniform solids content in the feed during the entire filtration process cannot be ensured. The pumping rate of the feed pump is controlled by a programmable logic controller (PLC) using the signals of sensors recording transmembrane pressure (TMP) and permeate flow rate. TMP is determined as the difference between feed inlet pressure and permeate outlet pressure, i.e. TMP reduction due to membrane rotation is neglected. The filtration process starts in “flow-controlled” operation mode, that is, the pumping rate of the feed pump is controlled for a constant permeate flow rate. The initial permeate flow setpoint is predefined by the plant operator at the beginning of the process run, based on experience and depending on the estimated solids concentration in the feed and the post-cleaning permeability of the membrane discs. As soon as the TMP setpoint of 0.8 bar is exceeded once during the production run, the operation mode automatically switches to “TMP-controlled” and the feed pump rate is controlled to keep the TMP at a constant value.
The rotation speed of the membrane drive is controlled to a constant value of 340 min−1 during all experiments. The power consumption is read from the frequency converter of the motor rotating the membrane shafts to obtain data on the membrane drive torque. The membrane drive torque is strongly correlated with the retentate viscosity inside the DCF chamber and provides information about the increase of solids concentration in the retentate during a filtration run. Opening of the retentate valve for release of retentate from the DCF chamber is controlled by the membrane torque. At the beginning of the process, the DCF chamber is filled with feed suspension. During the concentration phase, the retentate valve is kept closed until the torque setpoint of 300 N m is reached. Subsequently, retentate discharge begins by gradual opening of the retentate valve until the torque does not increase any further. The retentate valve closes with a torque hysteresis of 10 %, i.e. it closes only when the membrane torque falls below 270 N m.
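The torque-based hysteresis of the retentate valve can be illustrated with a small state update. The following is a minimal sketch, not the PLC logic of the plant: the function name `update_valve_state` is hypothetical, and the gradual adjustment of the opening percentage is deliberately left out, so only the open/closed decision is shown.

```python
def update_valve_state(torque_nm, valve_open, open_at=300.0, hysteresis=0.10):
    """Return the new open/closed state of the retentate valve.

    open_at and hysteresis follow the values given in the text
    (300 N m setpoint, 10 % hysteresis, i.e. closing below 270 N m).
    The function itself is an illustrative assumption.
    """
    close_below = open_at * (1.0 - hysteresis)
    if not valve_open and torque_nm >= open_at:
        return True    # torque setpoint reached: start retentate discharge
    if valve_open and torque_nm < close_below:
        return False   # torque fell below the hysteresis band: close again
    return valve_open  # inside the band: keep the current state


if __name__ == "__main__":
    state = False
    for torque in (250.0, 300.0, 285.0, 269.0):
        state = update_valve_state(torque, state)
        print(torque, state)
```

Keeping the previous state as an explicit argument makes the hysteresis band visible: a torque of 285 N m leaves an open valve open and a closed valve closed.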
The end of a filtration run is decided by the operator based on the amount of feed suspension available. Each filtration process is followed by a cleaning procedure with variable efficiency. The permeate water flux is measured after every cleaning cycle to assess the cleaning efficiency. All sensor and actuator setpoint data are logged continuously at 5 s intervals as time series in a server database. After download in CSV format and segmentation into single filtration runs, the data were used for model and digital twin validation.

2.2. Mechanistic-empirical process dynamics model

A dynamics model that describes the physical process as closely as possible is the backbone of the digital twin. We decided to build a mainly mechanistic model complemented with semi-empirically chosen adjustment parameters in order to use existing process knowledge as a counterbalance to the limited variations in the available process data from filtration runs in industrial production.
The calculation of the process parameters TMP, permeate, retentate and feed flow rates, membrane drive torque and valve opening percentage is implemented in a time loop with a cycle time of 3.6 s. In order to replicate the solids concentration gradient along the shaft axis from the feed inlet to the retentate outlet, the DCF chamber was conceptually divided into segments and the permeate flow rate and membrane drive torque were calculated separately for each segment. The number of segments was set to $s_{\max} = 4$ after a segment parameter study and a comparison of model simulation accuracy. TMP was not calculated per segment, since the pressure drop along the shaft axis was assumed to be negligible. The sequence of calculation steps of the process dynamics model is presented in the right-hand detail box of Figure 2.
The Darcy equation was used to obtain permeate flow rates during TMP-controlled operation mode or TMP during flow-controlled mode, respectively:
$$\mathrm{TMP}(t_i) = \frac{\eta \cdot Q_{\mathrm{per}}(t_i, s) \cdot R_{\mathrm{tot}}(t_i, s)}{A_{\mathrm{segm}}} \tag{1}$$
where $\eta$ is the dynamic viscosity of water in Pa s, $Q_{\mathrm{per}}$ is the permeate flow rate in L h−1, $R_{\mathrm{tot}}$ is the total filtration resistance in m−1 and $A_{\mathrm{segm}}$ is the filtration surface area of one segment. The dependencies of the variables on the time $t_i$ and the segment number $s$ are marked. In case of flow-controlled operation mode, the calculation is not trivial, because although the total permeate flow rate is set by the operator or the digital twin, its distribution between the different segments of the DCF model is unknown at this stage of the time step. Therefore, in an intermediate calculation for each segment a reference flow rate was calculated for a reference TMP. Subsequently, the sum of these reference permeate flow rates was compared to the desired total permeate flow rate and the TMP was adjusted in order to match the flow rates. Knowing the required TMP, the real segment-wise permeate flow rates were calculated from the Darcy equation.
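The flow-controlled calculation can be illustrated as follows. This is a minimal sketch under the assumption that the segment resistances are fixed within one time step, so that the Darcy equation is linear in TMP and a single proportional rescaling of the reference TMP matches the flow setpoint; the source does not state how the adjustment is implemented. The function name, the SI units and the numerical values in the usage part are assumptions chosen for illustration.

```python
def tmp_for_flow_setpoint(q_set, r_tot, a_segm, eta, tmp_ref=1.0e4):
    """Find the TMP that delivers the total permeate flow q_set.

    q_set   total permeate flow rate setpoint in m^3/s
    r_tot   list of total filtration resistances per segment in 1/m
    a_segm  filtration area of one segment in m^2
    eta     dynamic viscosity in Pa s
    tmp_ref arbitrary reference TMP in Pa (assumed value)
    Returns (tmp in Pa, list of segment flow rates in m^3/s).
    """
    if q_set <= 0 or any(r <= 0 for r in r_tot):
        raise ValueError("flow setpoint and resistances must be positive")
    # Step 1: reference flow per segment at the reference TMP (Darcy).
    q_ref = [tmp_ref * a_segm / (eta * r) for r in r_tot]
    # Step 2: rescale the TMP so that the summed flow meets the setpoint.
    tmp = tmp_ref * q_set / sum(q_ref)
    # Step 3: real segment-wise flows at the matched TMP.
    q_seg = [tmp * a_segm / (eta * r) for r in r_tot]
    return tmp, q_seg


if __name__ == "__main__":
    q_set = 415.0e-3 / 3600.0                  # 415 L/h expressed in m^3/s
    resistances = [1e12, 2e12, 4e12, 4e12]     # illustrative values only
    tmp, q_seg = tmp_for_flow_setpoint(q_set, resistances, 32.8 / 4, 1.0e-3)
    print(tmp / 1.0e5, "bar", [q * 3.6e6 for q in q_seg], "L/h")
```

Because of the linearity, the result does not depend on the chosen reference TMP; segments with a lower resistance simply receive a larger share of the total flow.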
The resistance-in-series approach was chosen for the modeling of the filtration resistances:
$$R_{\mathrm{tot}}(t_i, s) = R_{\mathrm{m}} + R_{\mathrm{pore}}(t_i, s) + R_{\mathrm{cake}}(t_i, s) \tag{2}$$
where $R_{\mathrm{m}}$ is the intrinsic membrane resistance in m−1, $R_{\mathrm{pore}}$ is the pore resistance in m−1 and $R_{\mathrm{cake}}$ is the cake resistance in m−1. The pore resistance was defined to follow an exponential relation:
$$R_{\mathrm{pore}}(t_i, s) = R_{\mathrm{pore,ref}} \cdot \left[ \exp\left( k_{\mathrm{pore}} \cdot k_{p,i}(t_i, s) \right) - 1 \right] \tag{3}$$
where $R_{\mathrm{pore,ref}}$ is the reference pore resistance in m−1 and $k_{\mathrm{pore}}$ is a model adaption parameter in L g m−1. The kinetic parameter $k_{p,i}(t_i, s)$ is calculated as follows:
$$\frac{\mathrm{d} k_{p,i}(t_i, s)}{\mathrm{d} t_i} = c_{\mathrm{fine}} \cdot \frac{Q_{\mathrm{per}}(t_i, s)}{A_{\mathrm{segm}}} \tag{4}$$
where $c_{\mathrm{fine}}$ is the concentration of fine particles in the DCF chamber in g L−1. Fine particles are assumed to pass through the membrane, and the fine particle concentration is set to 20 % of the feed concentration. A rational equation of the Langmuir type was used to model the cake resistance, which asymptotically approaches a maximum during filtration runs:
$$R_{\mathrm{cake}}(t_i, s) = R_{\mathrm{cake,SS}}(t_i, s) \cdot \frac{k_{c,i}(t_i, s)}{k_{c,i}(t_i, s) + k_{\mathrm{cake}}} \tag{5}$$
where $R_{\mathrm{cake,SS}}(t_i, s)$ is the steady-state cake resistance in m−1 and $k_{\mathrm{cake}}$ is a model adaption parameter in g h L−1. The increase of the kinetic parameter $k_{c,i}(t_i, s)$ over time is defined as being dependent on the chamber concentration:
$$\frac{\mathrm{d} k_{c,i}(t_i, s)}{\mathrm{d} t_i} = c(t_i, s) \tag{6}$$
and the steady-state cake resistance follows the equation:
$$R_{\mathrm{cake,SS}}(t_i, s) = R_{\mathrm{cake,SS,ref}} \cdot \left( \frac{Q_{\mathrm{t,per}}(t_{i-1})}{Q_{\mathrm{t,per,ref}}} \right)^{n_Q} \cdot \exp\left( \frac{\mathrm{TMP}(t_{i-1})}{\mathrm{TMP}_{\mathrm{compress}}} \right) \cdot \left( \frac{c(t_{i-1}, s)}{c_{\mathrm{ref}}} \right)^{n_c} \tag{7}$$
where $R_{\mathrm{cake,SS,ref}}$ in m−1, $Q_{\mathrm{t,per,ref}}$ in L h−1, $\mathrm{TMP}_{\mathrm{compress}}$ in bar, $c_{\mathrm{ref}}$ in g L−1 and the dimensionless $n_Q$ and $n_c$ are model parameters. Variable values taken from the previous time step of the calculation are marked with the index $i-1$. The composition of the steady-state resistance equation results from the analysis of the 2022 campaign’s process data and is based on physical knowledge about cake formation, cake shearing and cake compressibility in filtration.
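The resistance terms can be written as small functions to make their limiting behavior visible. The sketch below is illustrative only: the function names are hypothetical, and the pore term is written so that it vanishes for a clean membrane (kinetic parameter equal to zero), which is an assumption about the form of the exponential relation.

```python
import math


def pore_resistance(k_p, r_pore_ref, k_pore):
    """Exponential pore resistance; assumed to be zero for k_p = 0."""
    return r_pore_ref * (math.exp(k_pore * k_p) - 1.0)


def cake_resistance(k_c, r_cake_ss, k_cake):
    """Langmuir-type cake resistance approaching r_cake_ss for large k_c."""
    return r_cake_ss * k_c / (k_c + k_cake)


def steady_state_cake_resistance(q_prev, tmp_prev, c_prev, r_ref, q_ref,
                                 tmp_compress, c_ref, n_q, n_c):
    """Steady-state cake resistance from values of the previous time step."""
    return (r_ref
            * (q_prev / q_ref) ** n_q
            * math.exp(tmp_prev / tmp_compress)
            * (c_prev / c_ref) ** n_c)


def total_resistance(r_m, r_pore, r_cake):
    """Resistance-in-series sum."""
    return r_m + r_pore + r_cake
```

With `k_c` equal to `k_cake` the cake resistance reaches half of its steady-state value, which gives the adaption parameter a direct interpretation as a half-saturation constant.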
Fluid mass balances were implemented to obtain the retentate flow rates per segment, in which the feed flow rate of a segment is understood as the retentate flow rate of the previous segment:
$$Q_{\mathrm{ret}}(t_i, s) = Q_{\mathrm{ret}}(t_i, s-1) - Q_{\mathrm{per}}(t_i, s) \tag{8}$$
The overall feed flow rate of the process defines the inlet of the first segment:
$$Q_{\mathrm{ret}}(t_i, 0) = Q_{\mathrm{feed}} \tag{9}$$
Chamber concentration values are calculated from segment-wise, unsteady-state species mass balances:
$$V_{\mathrm{segm}} \cdot \frac{\mathrm{d} c(t_i, s)}{\mathrm{d} t_i} = Q_{\mathrm{ret}}(t_i, s-1) \cdot c(t_i, s-1) - Q_{\mathrm{ret}}(t_i, s) \cdot c(t_i, s) \tag{10}$$
with the following boundary condition:
$$c(t_i, 0) = c_{\mathrm{feed}} \tag{11}$$
where $c$ is the coarse solids concentration in the DCF chamber in g L−1, that is, the concentration of particles and colloids which are retained by the membrane. The membrane drive torque is calculated from the solids concentration in two steps, firstly via an empirical equation relating the viscosity exponentially to the concentration:
$$\eta(t_i, s) = \eta_{\mathrm{ref}} \cdot \exp\left( k_{\eta} \cdot f_c(t_i) \cdot c(t_i) \right) \tag{12}$$
where $\eta$ is the dynamic viscosity in Pa s, $\eta_{\mathrm{ref}}$ and $k_{\eta}$ are model parameters and $f_c(t_i)$ is an empirical correction function that takes into account that the suspended solids concentration in the DCF chamber does not exactly follow Equation 10, possibly because a fraction of the solids is fixed in the filter cake or attached to the chamber walls. To account for this effect, which slows down the increase of the torque, $f_c(t_i)$ follows an exponential decay until the retentate valve is determined to open, indicating the approach of a quasi-steady state in the DCF. Secondly, the well-known first-principles equation linking viscosity and torque is applied:
$$M(t_i, s) = M_{\mathrm{seal}} + k_g \cdot \omega \cdot \eta(t_i, s) \tag{13}$$
where $M_{\mathrm{seal}}$ is the torque already caused by the seals of the rotating shafts, $\omega$ is the rotation speed of the membranes and $k_g$ is a model parameter determined by the geometry of the DCF, mainly the total filter area and the distance between the discs.
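The two-step relation between concentration and torque, and its inversion, can be sketched as follows. The inversion is the kind of calculation a torque-based soft sensor relies on; whether the digital twin inverts the relations in exactly this way is an assumption. Function names and all parameter values are hypothetical and chosen only so that the example runs.

```python
import math


def torque_from_concentration(c, f_c, eta_ref, k_eta, m_seal, k_g, omega):
    """Forward model: concentration -> viscosity -> torque."""
    eta = eta_ref * math.exp(k_eta * f_c * c)
    return m_seal + k_g * omega * eta


def concentration_from_torque(m, f_c, eta_ref, k_eta, m_seal, k_g, omega):
    """Inverse model: measured torque -> viscosity -> concentration."""
    eta = (m - m_seal) / (k_g * omega)
    if eta <= 0:
        raise ValueError("torque must exceed the seal torque")
    return math.log(eta / eta_ref) / (k_eta * f_c)


if __name__ == "__main__":
    # Hypothetical parameter set, not taken from the source.
    params = dict(eta_ref=1.0e-3, k_eta=0.02, m_seal=50.0,
                  k_g=1.0e3, omega=340.0 * 2.0 * math.pi / 60.0)
    m = torque_from_concentration(80.0, 1.0, **params)
    print(m, concentration_from_torque(m, 1.0, **params))
```

The inverse is only defined for torques above the seal torque, which is why the sketch raises an error instead of returning a meaningless concentration.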
The last calculation step is to obtain the total retentate flow rate from the valve opening:
$$Q_{\mathrm{t,ret}}(t_i) = k_{\mathrm{valve}} \cdot \left( \mathrm{TMP}(t_i) \cdot X(t_i) - \tau_y \right) \tag{14}$$
where $Q_{\mathrm{t,ret}}$ is the total retentate flow rate in L h−1, $k_{\mathrm{valve}}$ is the valve cross-section coefficient in L bar−1 h−1, $X$ the valve opening in % and $\tau_y$ the flow limit in bar. When the critical torque is reached, the degree of valve opening is gradually increased starting from a minimum opening percentage until the torque starts to descend.
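The segment-wise fluid and species balances that precede the torque and valve calculations can be illustrated with a single time step. The sketch assumes an explicit Euler discretization, which the source does not specify; the function name and the numbers in the usage part are hypothetical.

```python
def step_concentrations(c, q_per, q_feed, c_feed, v_segm, dt):
    """Advance the segment concentrations by one explicit Euler step.

    c       list of coarse solids concentrations per segment in g/L
    q_per   list of permeate flow rates per segment in L/h
    q_feed  feed flow rate entering the first segment in L/h
    c_feed  feed solids concentration in g/L
    v_segm  liquid volume of one segment in L
    dt      time step in h
    Returns (new concentrations, retentate flow leaving the last segment).
    """
    q_in, c_in = q_feed, c_feed
    c_new = []
    for c_s, q_per_s in zip(c, q_per):
        q_out = q_in - q_per_s                        # fluid balance
        dc_dt = (q_in * c_in - q_out * c_s) / v_segm  # species balance
        c_new.append(c_s + dt * dc_dt)
        q_in, c_in = q_out, c_s                       # inlet of next segment
    return c_new, q_in


if __name__ == "__main__":
    # Concentration phase with closed valve: feed equals total permeate.
    c, q_ret = step_concentrations([60.0] * 4, [100.0] * 4, 400.0, 60.0,
                                   v_segm=50.0, dt=0.001)
    print(c, q_ret)
```

With a closed retentate valve the outlet flow of the last segment is zero, so all solids fed during the step accumulate in the chamber.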
For the simulation of the model, the initial permeate flow rate setpoint, the solid feed concentration, the process duration and all model parameters have to be defined as input. After the digital process simulation, the mean productivity over a filtration run is obtained via a time averaging and space summation of the permeate flow rate:
$$P = \frac{1}{t_{\mathrm{end}}} \cdot \int_{t = t_0}^{t_{\mathrm{end}}} \sum_{s=1}^{s_{\max}} Q_{\mathrm{per}}(t_i, s) \, \mathrm{d}t_i \tag{15}$$
where $P$ is the average productivity of the production batch in L h−1, which we simply call productivity in the following, $t_{\mathrm{end}}$ the process duration in h and $s_{\max}$ the dimensionless number of segments.
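For a simulated run stored as discrete time steps, the productivity can be approximated numerically. The following sketch uses the trapezoidal rule, which is an assumption about the integration scheme, and divides by the elapsed time, which equals the process duration when the run starts at time zero. The function name is hypothetical.

```python
def mean_productivity(times_h, q_per_segments):
    """Time-averaged total permeate flow rate in L/h.

    times_h         increasing list of time points in h
    q_per_segments  one list of segment-wise permeate flow rates (L/h)
                    per time point
    """
    if len(times_h) < 2 or len(times_h) != len(q_per_segments):
        raise ValueError("need matching time and flow series of length >= 2")
    q_total = [sum(q) for q in q_per_segments]        # space summation
    integral = 0.0
    for i in range(1, len(times_h)):                  # trapezoidal rule
        dt = times_h[i] - times_h[i - 1]
        integral += 0.5 * (q_total[i] + q_total[i - 1]) * dt
    return integral / (times_h[-1] - times_h[0])      # time averaging


if __name__ == "__main__":
    # Total flow decaying linearly from 800 L/h to 400 L/h over 10 h.
    print(mean_productivity([0.0, 10.0], [[200.0] * 4, [100.0] * 4]))
```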

2.3. Regression metamodel

A series of in silico experiments was performed with the process dynamics model to obtain the data for building the data-driven metamodel. A full factorial experimental design was chosen and the initial permeate flow rate setpoint, the feed concentration and the reference steady-state cake resistance were varied within the bounds of [400, 1200] L h−1, [20, 100] g L−1 and [6 · 10^12, 2.2 · 10^13] m−1, respectively. Five levels were calculated per factor. All in silico experiments were carried out with a process duration of 10 h. Subsequently, a second-degree polynomial was fitted to the obtained values of the average productivities. Because of the strongly differing magnitudes of their numerical values, the factors had to be scaled to the interval [0, 1] before performing the fit. The calculations were done in Python using pyDOE2 and the preprocessing.PolynomialFeatures and linear_model.LinearRegression classes from the scikit-learn library.
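The design and the fitting step can be sketched in a few lines. To keep the example free of further dependencies, it builds the full factorial design with `itertools` and fits the quadratic surface with NumPy least squares instead of pyDOE2 and scikit-learn; this substitution, all function names and the placeholder productivity function are assumptions for illustration and do not reproduce the process dynamics model.

```python
import itertools

import numpy as np

# Factor bounds from the text, ordered as (c_feed, Q_per_set, R_cake_SS_ref).
BOUNDS = {"c_feed": (20.0, 100.0), "q_set": (400.0, 1200.0),
          "r_ss_ref": (6.0e12, 2.2e13)}


def full_factorial(bounds, levels=5):
    """All combinations of equally spaced factor levels (5^3 = 125 rows)."""
    axes = [np.linspace(lo, hi, levels) for lo, hi in bounds.values()]
    return np.array(list(itertools.product(*axes)))


def scale01(x, bounds):
    """Scale each factor column to the interval [0, 1]."""
    lo = np.array([b[0] for b in bounds.values()])
    hi = np.array([b[1] for b in bounds.values()])
    return (x - lo) / (hi - lo)


def quadratic_features(x):
    """Columns: 1, x_i, then x_i * x_j for i <= j (10 terms for 3 factors)."""
    n = x.shape[1]
    cols = [np.ones(len(x))] + [x[:, i] for i in range(n)]
    cols += [x[:, i] * x[:, j] for i in range(n) for j in range(i, n)]
    return np.column_stack(cols)


def fit_response_surface(x_scaled, y):
    """Least-squares estimate of the ten polynomial weights."""
    w, *_ = np.linalg.lstsq(quadratic_features(x_scaled), y, rcond=None)
    return w


def placeholder_productivity(xs):
    """Hypothetical stand-in for the in silico experiments (scaled inputs)."""
    c, q, r = xs[:, 0], xs[:, 1], xs[:, 2]
    return 400.0 + 600.0 * q - 300.0 * q**2 - 50.0 * c - 200.0 * q * r


design = scale01(full_factorial(BOUNDS), BOUNDS)
weights = fit_response_surface(design, placeholder_productivity(design))
```

With the factors ordered as feed concentration, flow rate setpoint and cake resistance, the feature columns appear in the same order as the terms of a full second-degree polynomial in these three factors, so `weights[7]` is the coefficient of the squared flow rate setpoint.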
Including the first- and second-order parameter terms, the interaction terms and the constant bias term, the resulting response surface equation approximating the achievable productivity in dependence of the process parameters feed concentration, permeate flow rate setpoint and steady-state cake resistance has the form:
$$P = w_0 + w_1 c_{\mathrm{feed}} + w_2 Q_{\mathrm{per,set}} + w_3 R_{\mathrm{cake,SS}} + w_4 c_{\mathrm{feed}}^2 + w_5 c_{\mathrm{feed}} Q_{\mathrm{per,set}} + w_6 c_{\mathrm{feed}} R_{\mathrm{cake,SS}} + w_7 Q_{\mathrm{per,set}}^2 + w_8 Q_{\mathrm{per,set}} R_{\mathrm{cake,SS}} + w_9 R_{\mathrm{cake,SS}}^2 \tag{16}$$
where $w_i$, $i \in [0, 9]$, are the parameters of the response surface, which are identified via polynomial regression.
The aim of the digital twin is to optimize the productivity of the DCF. Therefore, the extrema of the productivity predicted by the metamodel are determined by setting the first-order derivative with respect to the permeate flow rate setpoint to zero:
$$\frac{\mathrm{d}P}{\mathrm{d}Q_{\mathrm{per,set}}} = w_2 + 2 w_7 Q_{\mathrm{per,set}} + w_5 c_{\mathrm{feed}} + w_8 R_{\mathrm{cake,SS}} \overset{!}{=} 0 \tag{17}$$
The second-order derivative was checked to verify that the extremum found is a productivity maximum:
$$\frac{\mathrm{d}^2 P}{\mathrm{d}Q_{\mathrm{per,set}}^2} = 2 w_7 \overset{!}{<} 0 \tag{18}$$
Equation 17 allows for real-time calculation of the permeate flow rate setpoint reaching for the maximum productivity of the process. Since the feed concentration is included, the optimum permeate flow rate setpoint can be easily recalculated when a change in the feed composition occurs during the filtration process run. If an unmodeled physical effect in the process causes a mismatch between the real and the modeled process dynamics, the reference steady-state cake resistance parameter is used to adapt the mechanistic model during industrial DCF runs in 5 min intervals, if necessary. Since also this parameter describing physical properties of the solids in the feed is included as factor in the response surface equation, changes of the natural feed solution sweet must during or between industrial runs do not require the repetition of the DoE pre-studies.
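Solving the first-order condition for the setpoint gives the closed form $Q_{\mathrm{per,set}}^{*} = -(w_2 + w_5 c_{\mathrm{feed}} + w_8 R_{\mathrm{cake,SS}}) / (2 w_7)$, valid for $w_7 < 0$. The sketch below evaluates this expression; it assumes that all factors are expressed in the scaled units used for the fit and adds a back-transformation to L h−1. The function names are hypothetical, and how the digital twin treats optima outside the investigated range is not covered.

```python
def optimal_flow_setpoint(w, c_feed_scaled, r_cake_ss_scaled):
    """Scaled setpoint that maximizes the quadratic response surface.

    w is the sequence of ten response surface weights w[0] .. w[9].
    """
    if w[7] >= 0:
        raise ValueError("no productivity maximum: w7 must be negative")
    numerator = w[2] + w[5] * c_feed_scaled + w[8] * r_cake_ss_scaled
    return -numerator / (2.0 * w[7])


def to_physical(q_scaled, lower=400.0, upper=1200.0):
    """Map a scaled setpoint back to L/h using the design bounds."""
    return lower + q_scaled * (upper - lower)


if __name__ == "__main__":
    # Hypothetical weights for illustration only.
    w = [0.0, 0.0, 4.0, 0.0, 0.0, -1.0, 0.0, -2.0, -1.0, 0.0]
    q_opt = optimal_flow_setpoint(w, 0.5, 0.5)
    print(q_opt, to_physical(q_opt))
```

Since the expression involves only a handful of multiplications, it can be re-evaluated in every optimization cycle whenever the estimated feed concentration or the adapted cake resistance changes.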

2.4. Digital twin framework

The digital twin framework is built to use the model components and to communicate with the physical asset. The algorithm evaluates the plant status every 60 s by watching the PLC internal status variable. As soon as a process run is detected, the digital twin framework’s calculations are started. The flow sheet of the digital twin framework is shown in the left-hand detail box of Figure 2. Once at process startup, the intrinsic membrane resistance is calculated from the permeate water flux according to the Darcy equation (Equation 1). Because cake formation is already detectable during DCF chamber filling, the cake resistance parameter $k_{c,i}$ is initialized once, at the moment regular filtration starts, using the measured TMP and permeate flow rate.
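The two startup calculations can be illustrated with the Darcy equation rearranged for the resistance. This is a sketch under stated assumptions: the water viscosity of 1 mPa s, the neglect of pore resistance at the start of filtration and the inversion of the Langmuir-type cake relation are illustrative choices, and the function names and numerical values are hypothetical.

```python
def darcy_resistance(tmp_bar, q_l_per_h, area_m2=32.8, eta_pa_s=1.0e-3):
    """Total resistance in 1/m from TMP (bar) and permeate flow (L/h)."""
    tmp_pa = tmp_bar * 1.0e5                    # bar -> Pa
    q_m3_s = q_l_per_h * 1.0e-3 / 3600.0        # L/h -> m^3/s
    return tmp_pa * area_m2 / (eta_pa_s * q_m3_s)


def initial_cake_parameter(r_tot, r_m, r_cake_ss, k_cake):
    """Initial k_c from the measured resistance at the start of filtration.

    Assumes the pore resistance is still negligible, so the cake
    resistance is r_tot - r_m, and inverts R_cake = R_ss * k / (k + k_cake).
    """
    r_cake = r_tot - r_m
    if not 0.0 <= r_cake < r_cake_ss:
        raise ValueError("cake resistance outside the invertible range")
    return k_cake * r_cake / (r_cake_ss - r_cake)


if __name__ == "__main__":
    r_m = darcy_resistance(0.1, 3600.0)         # water flux after cleaning
    r_start = darcy_resistance(0.2, 600.0)      # start of regular filtration
    print(r_m, initial_cake_parameter(r_start, r_m, 1.0e13, 30.0))
```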
During the filtration process, the membrane drive torque is read every 5 min and a permeate flow rate setpoint recommendation for the optimum productivity is returned to the DCF process control system. In every optimization cycle, the current feed concentration is first estimated using the mechanistic-empirical correlation between the feed concentration and the torque from Equation 12 and Equation 13 in the sense of a soft sensor. Secondly, the function based on the first-order derivative of the regression metamodel from Equation 17 is called and the permeate flow rate recommendation is obtained for the current feed concentration and reference steady-state cake resistance. Thirdly, the mechanistic-empirical model is used to simulate the past 5 min, starting from the simulated status of the DCF obtained by the respective 5 min simulation in the previous optimization cycle of the digital twin. The simulation uses the real experimental permeate flow rate setpoint and the current feed concentration. The resulting TMP from the simulation is compared to the actual TMP in the process. If the TMP difference between simulation and experiment exceeds 0.1 bar, the reference steady-state cake resistance is adapted by 10 % for model refinement.
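The model refinement rule at the end of each optimization cycle can be sketched as a single function. The threshold of 0.1 bar and the step of 10 % are taken from the text; the direction of the adaptation (raising the resistance when the simulated TMP is too low, lowering it when too high) is an assumption, as is the function name.

```python
def adapt_reference_cake_resistance(r_ref, tmp_sim_bar, tmp_meas_bar,
                                    tol_bar=0.1, step=0.10):
    """Return the (possibly adapted) reference steady-state cake resistance."""
    diff = tmp_meas_bar - tmp_sim_bar
    if abs(diff) <= tol_bar:
        return r_ref                   # model agrees with the plant
    if diff > 0:
        return r_ref * (1.0 + step)    # simulated TMP too low (assumed rule)
    return r_ref * (1.0 - step)        # simulated TMP too high (assumed rule)


if __name__ == "__main__":
    print(adapt_reference_cake_resistance(1.0e13, 0.50, 0.70))
```

Because the reference steady-state cake resistance is also a factor of the metamodel, the adapted value can be passed directly to the next setpoint calculation.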
All program parts are written in Python and uploaded to the data server platform of the filtration plant. The platform serves as an interface between the digital twin framework and the local DCF process control system and allows reading of sensor data and writing of actuator instructions during the process. For safety reasons, changes of the permeate flow rate setpoints were manually authorized by the operator.

3. Results

The results section is divided into the presentation of the process dynamics model functions, the model validation with industrial experiment data and the results of the metamodeling. The impact of applying the digital twin’s permeate flow rate recommendations on the 2023 process campaign is evaluated in the last section.

3.1. Mechanistic-empirical process dynamics model

A mechanistic-empirical model has been created to calculate the time course of all important process variables. Exemplary results of the predicted process dynamics can be seen in Figure 3. The simulation was defined to start with a permeate flow rate setpoint of 415 L h−1, and the target permeate flow rate is realized by the model system adjusting the TMP over time. The TMP increases during the initial 2.5 h of the modeled process, indicating progressive membrane blocking. The slope of the TMP decreases during this period because the build-up of the cake filtration resistance, which tends towards a maximum, is dominant over the increase of the pore filtration resistance. At 2.5 h after the start of the simulated process, the retentate valve opens; the TMP increase therefore stops and the TMP level is held. The membrane rotation speed is defined to remain constant. At the simulated process time of 5 h, the permeate flow rate setpoint was defined to increase by 80 L h−1 to showcase the switch from flow-controlled to TMP-controlled filtration mode. Until the end of the simulated process, the TMP is held at the pre-defined setpoint of 0.8 bar and the permeate flow rate is calculated to slowly decrease because of the increasing pore filtration resistance.
Due to the constant permeate flow rate and a constant feed concentration of 60 g L−1, a linear increase of the lumped chamber concentration calculated by a simple mass balance is observed until the retentate valve opens for the first time. Thereafter, the simulated concentration decreases and increases again depending on the opening state of the valve. During the retentate discharge phase, the higher the valve opening percentage is, the more retentate volume is discharged and thus the faster the chamber concentration falls. The increase and decrease of the membrane drive torque correlate with those of the chamber concentration, following the modeled relationship between torque, viscosity and concentration.
According to the results, the mechanistic-empirical model is capable of a simplified replication of the physical effects of membrane blocking and the control behavior of the filtration plant.

3.2. Process dynamics model validation

During the process runs with the digital twin, the current TMP and torque from the experiment and the model simulation were recorded after every optimization cycle. The plots resulting from two process runs are presented in Figure 4. A visual assessment shows good agreement between simulation and experiment. Differences between simulated and real TMP are smaller than the measurement noise. The increase in TMP within the first hour of process run 2-22 and the subsequent stabilization at a constant TMP level are reproduced by the model. The initial, real TMP of run 2-22 is met by the simulation with a deviation of less than 10 %. During the concentration phase of run 2-22, the increase in torque is adequately modeled for the initial 2 h of the process. However, the subsequent gradient of the torque increase is slightly overestimated by the model, so that the setpoint torque for the valve opening is reached 1 h earlier in the simulation. In run 2-22, the valve opening percentage was precisely set, so that alternating opening and closing of the valve does not occur. This characteristic is properly simulated from the process run time of 5.5 h onwards. Overall, the torque profile during the concentration phase and the retentate discharge phase is satisfactorily modeled.
Considering the TMP course validation of run 14-23, although the initial TMP is simulated to be 0.15 bar lower than the actual experimental value, the correction function of the digital twin adjusted the simulation within the first 30 min of the run. The shift in the TMP course at 4 h results from a change in the permeate flow rate setpoint from 400 L h−1 to 500 L h−1 and is reproduced by the model simulation. The constant TMP of 0.8 bar due to the change to TMP-controlled operation mode at the time of the permeate flow rate setpoint change is simulated correctly. In the torque course of run 14-23, the increase during the concentration phase, the torque level during the retentate discharge phase and the fluctuations due to the changes of the valve opening percentage are adequately approximated by the simulation.
All validation results of 2022 and 2023 campaign’s production runs can be viewed in Figure S1 and Figure S2 in the Supplementary Information. The validated process dynamics model sets the basis for training the metamodel and for the recommendations for optimum permeate flow rate setpoints of the digital twin.

3.3. Regression metamodel

A total of 125 in silico experiments were carried out with variation of the feed concentration, the initial permeate flow rate setpoint and the reference steady-state cake resistance. Using the productivity output from the in silico experiments, a regression metamodel was created which allows inverse calculation of the optimum initial permeate flow rate setpoint in the digital twin framework. The relationship between the initial permeate flow rate setpoint and the resulting experiment’s productivity in the metamodel is shown together with the regression residuals in Figure 5. With one exception, the absolute residuals from the polynomial regression are lower than 60 L(per) h−1, indicating a satisfactory goodness of fit of the simple regression metamodel to the complex mechanistic model (Figure 5 d).
All in silico experiments with the initial permeate flow rate setpoint of 400 L h−1 result in a productivity of 400 L(per) h−1 within the accuracy of the regression model. With an initial permeate flow rate setpoint as low as 400 L h−1, the maximum TMP of 0.8 bar is not reached during the 10 h run and consequently the permeate flow rate could be kept for the complete simulation. The same holds true for the experiments having an initial permeate flow rate setpoint of 600 L h−1 and a reference steady-state cake resistance of 6.0 · 10^12 m−1. However, even in the case of this low cake resistance and the smallest feed concentration investigated of 20 g L−1, this 1:1 dependency between productivity and initial flow rate setpoint no longer holds when the initial permeate flow rate setpoint is increased to 800 L h−1. At the initial permeate flow rate of 800 L h−1, the maximum TMP of 0.8 bar is reached before the process time of 10 h is finished and the process has to switch to TMP-controlled operation mode. As a consequence, the permeate flow rate starts to decay after this time point and the (average) productivity of the run is less than the initially chosen permeate flow rate setpoint. In Figure 5 a, the effect is visible as an increasing deviation of the curves from a linear correlation of productivity with increasing initial permeate flow rate setpoint. If the assumed reference steady-state cake resistance is set to higher values of 1.4 · 10^13 m−1 or 2.2 · 10^13 m−1 (Figure 5 b and c), the time until the TMP setpoint is reached and the system switches to TMP-controlled mode with reduced productivity gets shorter. In addition, the compressibility of the filter cake starts to show a significant negative effect on the productivity at higher cake resistances.
The combination of these effects can result in a situation in which an initial flow rate setpoint that is too high yields a lower productivity than could have been achieved with a more conservative setpoint. Consequently, Figure 5 b and Figure 5 c show the existence of an optimal initial permeate flow rate setpoint.
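The loss of productivity after the mode switch can be written compactly. As a sketch, assume that the productivity P is the time-averaged permeate flow rate over a run of duration T (10 h in the in silico experiments), that Q_0 is the initial permeate flow rate setpoint, and that t_s is the time at which the maximum TMP is reached. These symbols are introduced here for illustration only and are not taken from the model equations of this work; retentate discharge and setpoint changes during the run are ignored.

```latex
P \;=\; \frac{1}{T}\int_{0}^{T} Q(t)\,\mathrm{d}t
  \;=\; \frac{1}{T}\left( Q_0\, t_s \;+\; \int_{t_s}^{T} Q(t)\,\mathrm{d}t \right),
\qquad
Q(t) \le Q_0 \ \ \text{for } t > t_s
\;\;\Longrightarrow\;\; P \le Q_0 .
```

If the maximum TMP is not reached within the run (t_s ≥ T), the second term vanishes and P = Q_0, which corresponds to the 1 : 1 dependency described above. A higher Q_0 raises the first term per unit time but shortens t_s, which is one way to see why an intermediate setpoint can maximize P.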
If the feed concentration increases, e.g. from 20 g L−1 to 40 g L−1, productivity also decreases; however, the effect is relatively minor. The reason for the decrease is that higher solids concentrations accelerate the increase in filtration resistance over the process time. Consequently, TMP increases faster, again resulting in an earlier change from flow-controlled to TMP-controlled operation mode. In most cases, the influence of the feed concentration on the optimal initial permeate flow rate setpoint is quite small; however, at high reference steady-state cake resistances, a feed concentration of 100 g L−1 shifts the optimum to lower values of the flow rate setpoint (Figure 5 c).
In summary, the variation of the optimum permeate flow rate setpoint with the model parameters reference steady-state cake resistance and feed concentration confirms the relevance of a metamodel enabling the calculation of the appropriate permeate flow rate setpoint in the digital twin framework during the filtration process.
If the properties of the feed entering the DCF vary, or if the mechanistic model is adjusted because the simulated TMP deviates significantly from the experimental data, the metamodel allows the optimal initial permeate flow rate setpoint to be recalculated in practically real time. Such frequent recalculation would not be possible if the mechanistic model were used directly for the optimization. To find the optimum of the productivity resulting from a 10 h filtration, several runs of the mechanistic model covering the full process time would be needed, resulting in a computation time clearly exceeding the 5 min intervals of the optimization loop of the digital twin.
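The following minimal sketch illustrates how a polynomial regression metamodel of this kind can be fitted and then queried for the optimum setpoint. It is not the implementation used in this work. The second-order polynomial form, the 5 × 5 × 5 factor levels, the input scaling and the function `toy_productivity` (a purely synthetic stand-in for the mechanistic model output) are all assumptions made for illustration.

```python
import numpy as np
from itertools import product


def design_matrix(q, c, r):
    """Full second-order polynomial features for the three scaled factors."""
    q, c, r = (np.asarray(v, dtype=float) for v in (q, c, r))
    return np.column_stack(
        [np.ones_like(q), q, c, r, q * c, q * r, c * r, q**2, c**2, r**2]
    )


def fit_metamodel(q, c, r, productivity):
    """Ordinary least-squares fit of the polynomial coefficients."""
    coef, *_ = np.linalg.lstsq(design_matrix(q, c, r), productivity, rcond=None)
    return coef


def optimum_setpoint(coef, c, r, q_grid):
    """Evaluate the metamodel on a setpoint grid and return the best point."""
    q_grid = np.asarray(q_grid, dtype=float)
    features = design_matrix(q_grid, np.full_like(q_grid, c), np.full_like(q_grid, r))
    predicted = features @ coef
    best = int(np.argmax(predicted))
    return q_grid[best], predicted[best]


def toy_productivity(q, c, r):
    """Hypothetical response surface, NOT data or equations from the paper."""
    return 150.0 * q - 8.0 * q**2 - 20.0 * q * r - 2.0 * c


# Assumed full-factorial design with 5 levels per factor (125 runs), scaled:
# q in 100 L/h, c in 10 g/L, r in 1e13 1/m.
levels_q = [4, 6, 8, 10, 12]
levels_c = [2, 4, 6, 8, 10]
levels_r = [0.6, 1.0, 1.4, 1.8, 2.2]
runs = np.array(list(product(levels_q, levels_c, levels_r)), dtype=float)
q, c, r = runs.T
coef = fit_metamodel(q, c, r, toy_productivity(q, c, r))

# Querying the fitted polynomial is a single matrix product per grid,
# so it can be repeated in every optimization cycle.
q_opt, p_opt = optimum_setpoint(coef, c=2.0, r=2.2, q_grid=np.linspace(4, 12, 801))
print(f"optimum setpoint ~ {100 * q_opt:.0f} L/h")
```

The inputs are scaled before fitting because raw cake resistances of the order of 10^13 m−1 would make the least-squares problem badly conditioned. A grid search is used for the inverse step because it is simple and robust for a single decision variable; a closed-form or gradient-based optimum would work equally well for a polynomial.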

3.4. Effect of the digital twin on the productivity

At the start of the 2023 production campaign, the digital twin was implemented on the server platform and put into operation. At this stage of development, the permeate flow rate recommendations from the digital twin were transferred manually to the physical process by the operator during the production runs for safety reasons. The productivities of the 2022 and 2023 campaigns’ production runs and the campaign mean productivity are presented in Figure 6.
The average productivity of the production runs in the 2022 campaign is 466 L(per) h−1 ± 88 L(per) h−1. In the 2023 campaign, the average productivity increased to 536 L(per) h−1 ± 63 L(per) h−1. This shows that, in addition to the increased productivity, the fluctuations in the productivity of individual runs were reduced by implementing the digital twin. The variations in the productivity of the individual runs during both filtration campaigns are explained by different feed characteristics of the respective batches. For instance, solids concentration, fouling particle compressibility and temperature influence the degree of membrane blocking and thus the maximum applicable permeate flow rate setpoint.
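The campaign figures quoted above can be put into relative terms with a short calculation. The sketch below assumes that the ± values are spreads of the run productivities (for example standard deviations), which the text does not state explicitly; the function names are chosen for illustration.

```python
def relative_change(old, new):
    """Relative change of `new` with respect to `old`."""
    return (new - old) / old


def relative_spread(mean, spread):
    """Spread expressed as a fraction of the mean."""
    return spread / mean


# Campaign values as reported in the text, in L(per)/h.
mean_2022, spread_2022 = 466.0, 88.0
mean_2023, spread_2023 = 536.0, 63.0

gain = relative_change(mean_2022, mean_2023)
rel_2022 = relative_spread(mean_2022, spread_2022)
rel_2023 = relative_spread(mean_2023, spread_2023)

print(f"productivity gain: {gain:.1%}")
print(f"relative spread 2022: {rel_2022:.1%}, 2023: {rel_2023:.1%}")
```

Under these assumptions, the mean productivity rose by about 15 %, while the spread relative to the mean dropped from roughly 19 % to roughly 12 %.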
The increase in productivity achieved in the campaign applying the digital twin can mainly be attributed to the fact that the digital twin suggested initial permeate flow rate setpoints that resulted in DCF operation at higher TMPs. During manual operation in the 2022 campaign, the operators strove to avoid a transition into TMP-controlled operating conditions. However, the simulation runs and the metamodel showed that the reference steady-state cake resistance of sweet must is only moderate and that a transition into TMP-controlled operation mode in the course of the filtration does not harm the productivity. These predictions were confirmed in the 2023 campaign, as can be seen in the individual TMP time courses of all runs plotted in Figure S1 in the Supplementary Information.

4. Discussion

The results are discussed with regard to the quality of the mechanistic-empirical process dynamics model and the digital twin’s ability to identify the optimum process setpoint.

4.1. Process dynamics model quality

Generally, the mechanistic-empirical process model provides a good replication of the time dynamics of the process variables TMP and membrane drive torque. However, there are effects observed during some process runs that cannot be explained with current process knowledge. For instance, the initial TMP of process run 14-23 in Figure 4 is unexpectedly high and the subsequent TMP course stays constant within 4 h despite a constant permeate flow rate and a closed retentate valve. Nonetheless, the adaptation functionality of the digital twin recognizes the TMP deviations and corrects the model simulation. The metamodel is trained using the fundamental mechanistic-empirical model, i.e. the metamodel does not take the individually observed, unmodeled effects in the production data into account. Further development of the digital twin towards the ability to learn from effects observed in current or past production runs could possibly improve the model.
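The adaptation step described above can be pictured with the following sketch. It is a hypothetical illustration and not the adaptation rule implemented in the digital twin: the choice of the reference cake resistance as the adapted parameter, the proportional update, the tolerance and the gain are all assumptions, as are the function names.

```python
def adapt_reference_resistance(r_ref, tmp_measured, tmp_simulated,
                               tolerance=0.05, gain=0.5):
    """Return an updated reference cake resistance (assumed update rule).

    The parameter is left unchanged while the relative TMP deviation stays
    within `tolerance`; otherwise it is scaled in proportion to the deviation.
    """
    if tmp_simulated <= 0:
        raise ValueError("simulated TMP must be positive")
    rel_dev = (tmp_measured - tmp_simulated) / tmp_simulated
    if abs(rel_dev) <= tolerance:
        return r_ref
    return r_ref * (1.0 + gain * rel_dev)


def optimization_cycle(r_ref, feed_conc, tmp_measured, tmp_simulated,
                       metamodel_optimum):
    """One cycle: validate the model against live TMP, then query the metamodel.

    `metamodel_optimum` is any callable (feed_conc, r_ref) -> setpoint.
    """
    r_new = adapt_reference_resistance(r_ref, tmp_measured, tmp_simulated)
    return r_new, metamodel_optimum(feed_conc, r_new)


# Usage with a placeholder metamodel (hypothetical numbers, for illustration).
placeholder = lambda conc, r: 900.0 - 2.0e-11 * r - 0.5 * conc
r_new, setpoint = optimization_cycle(1.0e13, 40.0, 0.60, 0.50, placeholder)
print(r_new, setpoint)
```

Because the adapted parameter is also an input of the metamodel, the corrected value can be passed straight into the setpoint calculation without rerunning the mechanistic model over the full process time.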
The development and iterative refinement of the process dynamics model was based on the data of the complete sweet must production campaign in 2022. However, the industrial production runs in 2022 were often carried out with similar setpoint values, so the process data contain only limited variation. We tried to compensate for the limited information available from the data by considering the expert knowledge of project partners and process data obtained with other feed solution types. Nevertheless, it would be desirable to acquire data from further industrial filtration runs with variations in, e.g., TMP maximum setpoint, permeate flow rate setpoint and membrane drive torque setpoint, and to integrate these data into further model development.

4.2. Digital twin optimization capability

The capability of the digital twin to optimize the process depends on the quality of the process dynamics model, but also on the accuracy of the trained metamodel. To generate the training data for the metamodel, in silico experiments were carried out, with the duration of all experiments fixed at 10 h. Originally, the real production runs were also planned to be carried out with a fixed duration of 10 h to ensure precise comparability. However, in practice, the duration of the industrial runs is primarily determined by the amount of feed available at the time of the filtration run and is not set by the operator in advance. Thus, the duration of the filtration runs varies considerably over a campaign. Here, the optimization potential can be further exploited by matching the duration of the industrial run and the in silico experiment duration of the metamodel. For this, the expected duration of the industrial runs needs to be known. In case of variable durations of the industrial runs, the duration can be added as an additional factor to the design space of the in silico experiments, thus extending the metamodel by one parameter.
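The cost of adding the run duration as a fourth factor can be estimated with a small sketch of the design space. The factor levels below are hypothetical: the text reports 125 in silico experiments for three factors, and a full-factorial design with five levels per factor is assumed here because it reproduces that number.

```python
from itertools import product

# Hypothetical factor levels (assumed, not taken from the paper).
flow_setpoints = [400, 600, 800, 1000, 1200]                   # L/h
feed_concentrations = [20, 40, 60, 80, 100]                    # g/L
cake_resistances = [6.0e12, 1.0e13, 1.4e13, 1.8e13, 2.2e13]    # 1/m
durations = [4, 6, 8, 10, 12]                                  # h

# Current design: every combination of the three factors.
base_design = list(product(flow_setpoints, feed_concentrations, cake_resistances))

# Extended design: the run duration becomes a fourth factor.
extended_design = list(
    product(flow_setpoints, feed_concentrations, cake_resistances, durations)
)

print(len(base_design), len(extended_design))
```

With five duration levels, the number of in silico experiments grows by a factor of five. Since these runs are computed offline, the additional effort affects only the metamodel training and not the optimization loop during production.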
More generally, the concept of the regression metamodel based on in silico process simulations allows further parameters of the underlying mechanistic-empirical process dynamics model to be integrated into the metamodel without experimental effort on site. Besides the reference filter cake resistance, further model parameters that are currently fixed could possibly be made adaptable to improve the predictive power of the process dynamics model during the industrial runs. In addition, more flexible modeled operating conditions, such as a permeate flow rate ramp, can be added to open up new possibilities for productivity optimization. Last but not least, the optimization objective could be extended from productivity alone to other objective variables, for instance product yield, energy demand of the membrane drive or membrane cleaning effort.

5. Conclusion

In this study, a digital twin for an industrial DCF process was developed and successfully applied to increase the campaign productivity. The digital twin was constructed around a process dynamics model that is based on mechanistic filtration knowledge and fine-tuned with empirical correlations derived from historical process data. A regression metamodel was interposed between the underlying process dynamics model and the digital twin framework because the metamodel approach is computationally much faster than repeated calculations of the mechanistic-empirical process dynamics model. Together with the step-by-step simulation of the process dynamics model, a setpoint recommendation is obtained within seconds. By comparing the TMP live data with the simulation result, the process dynamics model is continuously validated during the production run. If a non-replicated process effect occurs, the digital twin adjusts a model parameter already included in the metamodel. In this way, the metamodel approach holds even when the filtration properties of the feed change, and allows the robust prediction of an optimal permeate flow rate setpoint.
Given a mechanistic process model, the presented method can be used to optimize a (bio)chemical process without having to perform repeated computation-intensive calculations during the operation. The expandability of the metamodel offers the prospect of integrating further process parameters into the optimization or optimizing further target variables.

Supplementary Materials

The following supporting information can be downloaded at Preprints.org. Figure S1: Overview of TMP courses of all production runs performed during the 2022 and 2023 campaigns in comparison with the corresponding simulation results; Figure S2: Overview of torque courses of all production runs performed during the 2022 and 2023 campaigns in comparison with the corresponding simulation results.

Author Contributions

Conceptualization, M. H. and M. F.; methodology, M. H. and M. F.; software development, M. H. and M. F.; software implementation onsite, G. G. and J. R.; data acquisition, G. G. and J. R.; data curation, M. H.; writing—original draft preparation, M. H.; writing—review and editing: M. H., M. F. and G. G.; visualization, M. H.; supervision, M. F. and G. G.; project administration, M. H., M. F. and J. R.

Funding

This research is part of the project “Intensification of separation processes in the food industry through digital twins and intelligent process control” (InSeLDiP). The project was supported by funds of the Federal Ministry of Food and Agriculture (BMEL) based on a decision of the Parliament of the Federal Republic of Germany via the Federal Office for Agriculture and Food (BLE) under the innovation support program.

Institutional Review Board Statement

Not applicable

Informed Consent Statement

Not applicable

Data Availability Statement

The data are not available due to their proprietary nature with regard to the industry partner.

References

  1. Vogel-Heuser, B.; Hess, D. Guest Editorial Industry 4.0–Prerequisites and Visions. IEEE Transactions on Automation Science and Engineering 2016, 13, 411–413.
  2. Negri, E.; Fumagalli, L.; Macchi, M. A Review of the Roles of Digital Twin in CPS-based Production Systems. Procedia Manufacturing 2017, 11, 939–948.
  3. Uhlemann, T.H.J.; Lehmann, C.; Steinhilper, R. The Digital Twin: Realizing the Cyber-Physical Production System for Industry 4.0. Procedia CIRP 2017, 61, 335–340.
  4. Kritzinger, W.; Karner, M.; Traar, G.; Henjes, J.; Sihn, W. Digital Twin in manufacturing: A categorical literature review and classification. IFAC-PapersOnLine 2018, 51, 1016–1022.
  5. He, B.; Bai, K.J. Digital twin-based sustainable intelligent manufacturing: a review. Advances in Manufacturing 2020.
  6. Rasheed, A.; San, O.; Kvamsdal, T. Digital Twin: Values, Challenges and Enablers From a Modeling Perspective. IEEE Access 2020, 8, 21980–22012.
  7. Ashtari Talkhestani, B.; Jung, T.; Lindemann, B.; Sahlab, N.; Jazdi, N.; Schloegl, W.; Weyrich, M. An architecture of an Intelligent Digital Twin in a Cyber-Physical Production System. at - Automatisierungstechnik 2019, 67, 762–782.
  8. Zhang, C.; Xu, W.; Liu, J.; Liu, Z.; Zhou, Z.; Pham, D.T. A Reconfigurable Modeling Approach for Digital Twin-based Manufacturing System. Procedia CIRP 2019, 83, 118–125.
  9. Malakuti, S.; Grüner, S. Architectural aspects of digital twins in IIoT systems. In Proceedings of the 12th European Conference on Software Architecture: Companion Proceedings. ACM, September 2018.
  10. Chen, Y.; Yang, O.; Sampat, C.; Bhalode, P.; Ramachandran, R.; Ierapetritou, M. Digital Twins in Pharmaceutical and Biopharmaceutical Manufacturing: A Literature Review. Processes 2020, 8, 1088.
  11. Klein, M.; Maschler, B.; Zeller, A.; Talkhestani, B.A.; Jazdi, N.; Weyrich, M.; Rosen, R. Architektur und Technologiekomponenten eines digitalen Zwillings. In Automation 2019; VDI Verlag, 2019; pp. 89–102.
  12. Smiatek, J.; Jung, A.; Bluhmki, E. Towards a Digital Bioprocess Replica: Computational Approaches in Biopharmaceutical Development and Manufacturing. Trends in Biotechnology 2020, 38, 1141–1153.
  13. Martinez, E.M.; Ponce, P.; Macias, I.; Molina, A. Automation Pyramid as Constructor for a Complete Digital Twin, Case Study: A Didactic Manufacturing System. Sensors 2021, 21, 4656.
  14. Moyne, J.; Qamsane, Y.; Balta, E.C.; Kovalenko, I.; Faris, J.; Barton, K.; Tilbury, D.M. A Requirements Driven Digital Twin Framework: Specification and Opportunities. IEEE Access 2020, 8, 107781–107801.
  15. Durão, L.F.C.S.; Haag, S.; Anderl, R.; Schützer, K.; Zancul, E. Digital Twin Requirements in the Context of Industry 4.0. In Product Lifecycle Management to Support Industry 4.0; Springer International Publishing, 2018; pp. 204–214.
  16. Lopez, P.C.; Udugama, I.A.; Thomsen, S.T.; Roslander, C.; Junicke, H.; Mauricio-Iglesias, M.; Gernaey, K.V. Towards a digital twin: a hybrid data-driven and mechanistic digital shadow to forecast the evolution of lignocellulosic fermentation. Biofuels, bioproducts and biorefining 2020, 14, 1046–1060.
  17. Trunzer, E.; Wu, C.; Guo, K.; Vermum, C.; Vogel-Heuser, B. Data-Driven Approach to Support Experts in the Identification of Operational States in Industrial Process Plants. In Proceedings of the IECON 2018 - 44th Annual Conference of the IEEE Industrial Electronics Society. IEEE, October 2018.
  18. Chew, C.M.; Aroua, M.K.; Hussain, M.A. Advanced process control for ultrafiltration membrane water treatment system. Journal of cleaner production 2018, 179.
  19. Herwig, C.; Pörtner, R.; Möller, J., Eds. Digital Twins; Springer International Publishing.
  20. Narayanan, H.; Luna, M.F.; Stosch, M.; Bournazou, M.N.C.; Polotti, G.; Morbidelli, M.; Butté, A.; Sokolov, M. Bioprocessing in the Digital Age: The Role of Process Models. Biotechnology journal 2020, 15, 1900172.
  21. Sokolov, M.; von Stosch, M.; Narayanan, H.; Feidl, F.; Butté, A. Hybrid modeling — a key enabler towards realizing digital twins in biopharma? Current Opinion in Chemical Engineering 2021, 34, 100715.
  22. Sansana, J.; Joswiak, M.N.; Castillo, I.; Wang, Z.; Rendall, R.; Chiang, L.H.; Reis, M.S. Recent trends on hybrid modeling for Industry 4.0. Computers & Chemical Engineering 2021, 151, 107365.
  23. Solle, D.; Hitzmann, B.; Herwig, C.; Pereira Remelhe, M.; Ulonska, S.; Wuerth, L.; Prata, A.; Steckenreiter, T. Between the Poles of Data-Driven and Mechanistic Modeling for Process Operation. Chemie Ingenieur Technik 2017, 89, 542–561.
  24. von Stosch, M.; Oliveira, R.; Peres, J.; de Azevedo, S.F. Hybrid semi-parametric modeling in process systems engineering: Past, present and future. Computers & Chemical Engineering 2014, 60, 86–101.
  25. Narayanan, H.; von Stosch, M.; Feidl, F.; Sokolov, M.; Morbidelli, M.; Butté, A. Hybrid modeling for biopharmaceutical processes: advantages, opportunities, and implementation. Frontiers in Chemical Engineering 2023, 5.
  26. Rizki, Z.; Ravesloot, R.; van Beckhoven, R.; Ottens, M. Model-based optimization of multistage ultrafiltration/diafiltration for recovery of canola protein. Food and Bioproducts Processing 2023, 140, 212–229.
  27. Krippl, M.; Dürauer, A.; Duerkop, M. Hybrid modeling of cross-flow filtration: Predicting the flux evolution and duration of ultrafiltration processes. Separation and Purification Technology 2020, 248, 117064.
  28. Piron, E.; Latrille, E.; René, F. Application of artificial neural networks for crossflow microfiltration modelling: "black-box" and semi-physical approaches. Computers & Chemical Engineering 1997, 21, 1021–1030.
  29. Díaz, V.H.G.; Prado-Rubio, O.A.; Willis, M.J.; von Stosch, M. Dynamic hybrid model for ultrafiltration membrane processes. In Computer Aided Chemical Engineering; Elsevier, 2017; pp. 193–198.
  30. Chew, C.M.; Aroua, M.K.; Hussain, M.A. A practical hybrid modelling approach for the prediction of potential fouling parameters in ultrafiltration membrane water treatment plant. Journal of Industrial and Engineering Chemistry 2017, 45, 145–155.
  31. Jin, R.; Chen, W.; Simpson, T. Comparative studies of metamodelling techniques under multiple modelling criteria. Structural and multidisciplinary optimization 2001, 23, 1–13.
  32. Romijn, R.; Özkan, L.; Weiland, S.; Ludlage, J.; Marquardt, W. A grey-box modeling approach for the reduction of nonlinear systems. Journal of Process Control 2008, 18, 906–914.
  33. Franzreb, M.; Muller, E.; Vajda, J. Cost estimation for Protein A chromatography: An in silico approach to mAb purification strategy. 12.
  34. Wang, G.; Briskot, T.; Hahn, T.; Baumann, P.; Hubbuch, J. Root cause investigation of deviations in protein chromatography based on mechanistic models and artificial neural networks. Journal of Chromatography A 2017, 1515, 146–153.
  35. Bott, R.; Langeloh, T.; Ehrfeld, E. Dynamic cross flow filtration. Chemical Engineering Journal 2000, 80, 245–249.
  36. Jaffrin, M.Y. Dynamic Crossflow Filtration.
  37. Marke, H.S.; Breil, M.P.; Hansen, E.B.; Pinelo, M.; Krühne, U. Cake resistance in rotational dynamic filtration: Determination and modeling. Separation and Purification Technology 2020, 250, 117100.
  38. Marke, H.S.; Breil, M.P.; Hansen, E.B.; Pinelo, M.; Krühne, U. Investigation of the velocity factor in a rotational dynamic microfiltration system. Separation and Purification Technology 2019, 220, 69–77.
Figure 1. Piping and instrumentation diagram of the filtration process complemented by typical trends of central process variables during a filtration run. The plant set-up includes sensors for TMP, permeate flow rate, membrane drive torque and retentate valve opening. Actuators of the process control system are a feed pump, membrane drive and retentate valve.
Figure 2. Overview of the digital twin structure. Obtaining torque data from the physical DCF asset, the digital twin applies a mechanistic-empirical soft sensor to estimate the feed concentration. The regression metamodel is used for finding the optimum permeate flow rate setpoint, whereas the digital process dynamics model is called to assess the model validity in each process optimization cycle.
Figure 3. Simulation results of the digital process dynamics model for a 10 h in silico experiment. Time courses of (a) permeate flow rate, (b) TMP, (c) membrane rotation speed, (d) chamber solids concentration, (e) membrane torque, and (f) retentate valve opening are plotted. The permeate flow rate setpoint was set to increase after half of the simulated process time. The rotation speed was set constant. The modeled switch from flow-controlled to pressure-controlled operation mode and the beginning of the retentate discharge are marked.
Figure 4. Validation of modeled TMP and torque with experimental data. The experiment from the 2022 campaign (TMP in (a) and torque in (b)) was carried out with a fixed permeate flow rate setpoint of 500 L h−1. In the experiment from the 2023 campaign (TMP in (c) and torque in (d)), the permeate flow rate setpoint was increased from an initial 400 L h−1 to 500 L h−1 after 4 h of process time. The difference in the overall TMP level of the two experiments can be explained by different membrane cleaning efficiency prior to the process run.
Figure 5. Results from the metamodel training. Subplots (a)-(c) show the dependency of productivity on initial permeate flow rate setpoint. Reference steady-state cake resistance is constant in each plot, feed concentration varies within the curve families. Subplot (d) presents the residuals of the regression.
Figure 6. Productivity comparison between the production runs of the 2022 and 2023 campaigns. The digital twin was active from the beginning of the 2023 filtration runs. An increase in average productivity was achieved from 2022 to 2023.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.