Article

Assessing the Mass Transfer Coefficient in Jet Bioreactors with Classical Computer Vision Methods and Neural Networks Algorithms

by Irina Nizovtseva 1,2, Vladimir Palmin 3, Ivan Simkin 4, Ilya Starodumov 1,5, Pavel Mikushin 1,3, Alexander Nozik 3, Timur Hamitov 3,6, Sergey Ivanov 7, Sergey Vikharev 1,7, Alexei Zinovev 8, Vladislav Svitich 1, Matvey Mogilev 4, Margarita Nikishina 1, Simon Kraev 7, Stanislav Yurchenko 4, Timofey Mityashin 1, Dmitrii Chernushkin 9, Anna Kalyuzhnaya 7 and Felix Blyakhman 1,5,*
1 Laboratory of Multiphase Physical and Biological Media Modeling, Ural Federal University, 620000 Ekaterinburg, Russia
2 Otto-Schott-Institut für Materialforschung, Friedrich-Schiller University of Jena, 07743 Jena, Germany
3 Moscow Institute of Physics and Technology, 141701 Moscow, Russia
4 Soft Matter and Physics of Fluids Centre, Bauman Moscow State Technical University, 105005 Moscow, Russia
5 Department of Biomedical Physics and Engineering, Ural State Medical University, 620028 Ekaterinburg, Russia
6 The Institute for Nuclear Research of the Russian Academy of Sciences, 117312 Moscow, Russia
7 Department of High Performance Computing, ITMO University, 197101 Saint Petersburg, Russia
8 Department of Educational Programmes, Institute of Education Faculty, HSE University, 101000 Moscow, Russia
9 NPO Biosintez Ltd., 109390 Moscow, Russia
* Author to whom correspondence should be addressed.
Algorithms 2023, 16(3), 125; https://doi.org/10.3390/a16030125
Submission received: 15 January 2023 / Revised: 8 February 2023 / Accepted: 10 February 2023 / Published: 21 February 2023
(This article belongs to the Special Issue Recent Advances in Algorithms for Computer Vision Applications)

Abstract: The development of energy-efficient and high-performance bioreactors requires progress in methods for assessing the key parameters of the biosynthesis process. Despite the wide variety of approaches and methods for determining the phase contact area in gas–liquid flows, the question of obtaining an accurate quantitative estimate of it remains open. Particularly challenging are obtaining information about mass transfer coefficients instantly and developing the predictive capabilities needed for effective flow control in continuous fermentation, both on the laboratory and industrial scales. Motivated by the opportunity to apply classical and non-classical computer vision methods to high-precision video records of bubble flows obtained during experiments in the bioreactor vessel, we obtained a number of results presented in the paper. Characteristics of the bioreactor's bubble flow were first estimated by classical computer vision (CCV) methods, including an elliptic regression approach for selecting and clustering single bubble boundaries, image transformation through a set of filters, and an algorithm for separating overlapping bubbles. Applying the developed method to the entire video recording makes it possible to obtain parameter distributions and set dropout thresholds in order to improve the estimates through averaging. The developed CCV methodology was also tested and verified on a manually collected and labeled dataset. A deep neural network (NN) approach was then applied to the instance segmentation task and demonstrated certain advantages in terms of segmentation resolution, while the classical approach tends to be faster. Thus, in the current manuscript, the advantages and disadvantages of both the classical computer vision method (CCV) and the neural network (NN) approach are discussed based on the evaluation of the number of bubbles and their areas. An approach to the estimation of the mass transfer coefficient based on the obtained results is also presented.

1. Introduction

Over the last decade, a range of new methods have been shown to outperform previous state-of-the-art machine learning techniques in a number of fields, with computer vision, and in particular neural network algorithms, being among the most prominent cases [1,2,3,4,5]. The key to understanding, controlling, harnessing and ultimately gaining from natural biological processes in favor of the biotechnology industry lies in providing a controllable dynamic environment for development in vitro while being able to track cell activity in real time [6]. With recent advances in digitization and big data analytics, the majority of biotechnological research and industry centers are adopting digital tools to achieve a challenging modernization level. The biological and corresponding hydrodynamic, heat and mass transfer, and other related phenomena within bioreactor (fermenter) circuits are a key target for such digital approaches, as these processes are often complicated and quite tricky to analyze and scale [7].
The application of computer vision and deep learning (DL) methods [2], alone or in combination, to biophysical processes, including those in bioreactors, is becoming part of the mandatory toolkit. These methods are used to capture process parameters in real time and to support decision-making for controlling and optimizing the key indicators of biosynthesis productivity and safety. Successful cases appear with increasing frequency in the scientific and technical literature and reports, in which computer vision turns out to be not only one of the most effective but also one of the most affordable nondestructive practices, compared with costly invasive techniques that require additional, occasionally redundant, equipment with sets of sensors, transducers and detectors. Neural networks, in their turn, are an attractive tool owing to characteristics such as accuracy and the ability to handle very small objects.
The key process determining bioreactor efficiency is the transfer of the gas phase through the gas–liquid interface. The gas flow from gas bubbles into the liquid is known to be a function of two important parameters: the surface area of the phase interface a and the mass transfer coefficient k_L [8]. These two parameters are usually combined into the volumetric mass transfer coefficient k_L a [9,10,11]. Since gas bubble sizes are far from constant in various regions of the bioreactor, the gas flow (oxygen, and also methane for methanotrophic bacteria as in [12]) also varies. In turn, the uniformity of the microorganism growth medium is one of the key factors for a sustainable process with a high yield of products [13]. A fundamental understanding of the underlying mechanisms should include comprehension of a range of processes, namely fluid dynamics principles, mixing conditions, mass transfer estimation and control, the corresponding mathematical apparatus, etc., in order to provide the information required to design and implement a modern technology platform for reliable, safe and efficient biological processes [14,15,16]. We also note that understanding, modeling and simulation of the complex interactions between the microbiological component, fluid dynamics, and heat and mass transfer form the basis of a successful transfer from the laboratory scale to the industrial one [17,18,19]. This understanding can only be achieved if the hydrodynamic and physical mechanisms responsible for these gradients are understood in detail through experimental and numerical tools. There are various methods for measuring k_L in general [8,20,21] that allow mass transfer to be measured as a complex phenomenon; however, the measurement of individual components, namely the interfacial area a, is of particular interest, since it allows determination of the factors that specifically affect the interfacial surface, the most important component of the mass transfer process, when aiming to achieve high efficiency of technological processes based on mass transfer.
Meanwhile, the direct measurement of the interfacial surface is a well-known challenge [22]. Various bubble size measurement tools have been developed and are in use both in research and industry for determining gas bubble sizes, e.g., the Mettler Toledo ParticleView™ Measurement probe (PVM) or the Analyzer EasyViewer 100 System, as well as the Anglo Platinum Bubble Sizer (APBS) probe manufactured by Stone Three, a comparative analysis of which was performed in [23]. However, these methods and tools have limitations that significantly narrow the possibility of using them for the present tasks. Thus, the PVM has a field of view of 1300 × 890 μm and is known to be ineffective at high flow rates (1 m/s and above), while the APBS requires sampling, which significantly complicates the measurement procedure and might lead to data loss.
We also note that, in addition to pure fundamental research interest, the task of assessing the mass transfer coefficient in bioreactors using the latest modern methods is also of high practical importance, since the cost of the final products depends directly on the efficiency of bioreactor operation [24,25,26]: the cost of electricity can reach 25% of the total cost of bioprotein, while the mass transfer coefficient k_L determines the maximum achievable concentration of microorganisms, set by the limit of dissolved gases, and therefore determines the productivity of the process.

2. Materials and Methods

In the current study, (i) an experiment was designed and carried out tracing the gas–liquid bubble flow through the viewing windows (upper and lower) of the fermenter circuit by means of high-frequency video recording, (ii) a classical methodology for determining bubble flow characteristics from the obtained frames was developed and (iii) neural networks were applied and a comparative analysis of the approaches was carried out. After preliminary post-processing of the received frames, a traditional computer vision method using a number of filters was applied first: a technique was developed for determining the bubble parameters in an elliptical approximation, as well as a technique aimed at reducing the effect of bubble overlays. Lastly, neural network algorithms were applied to the problem of detection, segmentation and parameterization of the bioreactor's gas–liquid bubble flow. In this section we highlight the crucial methodological solutions applied within the current research.

2.1. Methodology for Conducting the Verification Base for the Model: High-Speed Filming on the Experimental Stand

The fermenter unit, also known as the bioreactor, is the production chain core of modern biotechnological facilities producing significant products for sustainable development by means of biosynthesis, e.g., single cell proteins [27]. Performing essentially as mass transfer apparatuses, fermenters are designed to carry out the process of microorganism cultivation [28] with the ability to work at high pressures (up to 6 atmospheres and above) [29,30]. Nowadays one can find a number of geometrical fermenter solutions both in the literature and in practice [16,31], whereas the jet stream bioreactor takes a confident lead, being recognized in terms of energy efficiency, scalability and productivity. Initially developed by a multidisciplinary group of researchers in the early 1970s, the jet stream bioreactor is the only type of fermentation apparatus with historically confirmed long-term productive operation on an industrial scale of up to 2000 m³ by the beginning of the third decade of the twenty-first century, and it might therefore be regarded as one of the most promising for industrial implementation at present. In order to conduct the current research study, an experimental stand was developed reproducing the jet stream bioreactor circuit, namely a mass-exchange, vertically oriented fermentation apparatus with a circulation system, as well as a research program for studying the gas–liquid bubble flows through high-speed video recording, serving to obtain the input research data for subsequent analysis (Figure 1).
The developed setup includes a pump, a shell-and-tube heat exchanger, systems of discharge and distribution pipelines, and an aerator installed above the vessel consisting, in its turn, of an all-welded shell and upper and lower bottoms. The experimental setup, or stand, is equipped with all the necessary fittings for connecting technological pipelines and installing the required instrumentation and sensors. The aerator is a cylindrical shell with an internal confuser-type device including a distribution device. The liquid phase is fed into the side surface of the aerator shell, while the gas phase is captured through the top aerator fitting. The design of the aerator ensures the ejection of the gas phase by the liquid flow and the recirculation of the gas phase as well as its subsequent dispersion. Shell-and-tube heat exchangers cool the gas–liquid medium (the culture liquid in industrial bioreactors) to the operating temperature; the gas–liquid medium level in the fermenter is maintained automatically at a constant value. During the experimental runs the bioreactor was launched and the process was investigated in the various modes described in Table 1. Image data acquisition was carried out using a high-speed PHANTOM V2511 camera with a resolution from 640 × 480 up to 1280 × 800 and additional external lighting: macro lenses were used, since the camera was placed close enough to the object being filmed; a WASP PLASMA studio light (2 units) was used at a distance of up to 2 m; Coherent Powerline 500 mW and Coherent StingRay lasers for various wavelengths and powers were placed close to the object being photographed. Recording was performed at frame rates from 25,000 fps to 130,000 fps depending on the specific experiment setting: a series of measurements was carried out for two viewing windows with different lighting parameters and various operating modes of the fermenter. In order to provide a qualitative analysis, the obtained video was divided into separate frames (Figure 2); each frame was first analyzed independently, and the methodology is described in detail in the following Section 2.2.

2.2. Bubble Processing Methodology by Virtue of Computer Vision Approach

As mentioned, the frames obtained as a result of video filming by a high-speed camera during the experimental runs were taken as input data for processing and analysis. In the current subsection we dwell in detail on the method of processing, extracting the individual bubble parameters and ultimately determining the total phase contact area. One common methodology for solving this class of problems is deep learning, including labeling of the training data set, model training and validation on a dataset not used during training. Meanwhile, labeling the collected set is quite a resource-intensive task, in particular since deep learning would require re-labeling and retraining for each new type of measurement (as also discussed later), so at the first stage classical computer vision methods were chosen for bubble processing. The approach can be partitioned into several steps, i.e., (i) pre-processing of the original image (Figure 2), including contrast enhancement; (ii) filters applied to highlight bubble boundaries, as well as clustering to extract individual bubbles; (iii) the overlap compensation methodology; (iv) determining the parameters of the selected bubbles by classical analysis methods.
Assuming the bubble is defined by its boundary, one should keep in mind light reflection effects, which might produce false boundaries not related to the real bubble. Therefore, before clusterization, the first step is to separate real boundaries from false ones by means of a filter or a combination of filters to be determined. A number of transformations should be applied to the image in order to make the boundary clearer. Among a number of approaches, the black-hat transform was chosen, since it highlights the edges well without producing additional noise that would negatively affect the original image (Figure 3a). Meanwhile, the contrast increase for the desired bubble boundaries leads to a corresponding (though weaker) increase for false boundaries. Applying a blur filter should thus improve the selection of the desired boundaries; a normal (Gaussian) distribution filter is chosen. After this image preprocessing, a boundary selection filter is applied, namely the Canny filter. It should be noted that, depending on the threshold parameters, this filter yields either an excessive number of false boundaries or too few real ones; therefore, the Frangi filter is also considered, as it has fewer problems with false boundaries, although the real ones remain blurry. As mentioned, this problem can be solved by the Canny filter, so the following combination of filters is finally chosen: the black-hat transform on the original image, then blurring by a Gaussian kernel, the Frangi filter and the Canny filter as the final transformation, as seen in Figure 3 and Figure 4. After extracting the bubble boundaries (Figure 5), the next step is to separate the bubbles into clusters: the DBSCAN method is chosen due to its ability to group nested circles as separate clusters. We note, however, that further tuning of its parameters could possibly reduce the number of cases in which different bubbles are combined into one cluster.
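A minimal sketch of this filter chain using OpenCV and scikit-image is given below; the kernel size, blur sigma and Canny thresholds are illustrative assumptions rather than the values used in the study, and the input file name is hypothetical.

```python
import cv2
import numpy as np
from skimage.filters import frangi

def extract_bubble_edges(gray, blackhat_size=15, blur_sigma=2.0,
                         canny_low=50, canny_high=150):
    """Edge extraction sketch: black-hat -> Gaussian blur -> Frangi -> Canny."""
    # Black-hat transform emphasizes dark boundary rings against the brighter background.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (blackhat_size, blackhat_size))
    blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, kernel)

    # Gaussian blur suppresses weak false boundaries produced by reflections.
    blurred = cv2.GaussianBlur(blackhat, (0, 0), sigmaX=blur_sigma)

    # The Frangi ridge filter highlights thin, elongated boundary structures.
    ridges = frangi(blurred.astype(float) / 255.0)
    ridges = (255 * ridges / (ridges.max() + 1e-12)).astype(np.uint8)

    # The Canny filter produces the final thin binary edge map.
    return cv2.Canny(ridges, canny_low, canny_high)

gray = cv2.imread("frame_0001.png", cv2.IMREAD_GRAYSCALE)  # hypothetical frame file
edges = extract_bubble_edges(gray)
```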
Now that the boundaries are obtained, the final preparatory step is the application of a well-studied clustering method. The complexity is determined by the superposition of bubbles on top of each other, which might lead to the allocation of two bubbles to one cluster; we consider this situation in detail in the following section. In general, when the bubbles are isolated as separate clusters, it is possible to approximate each bubble using an ellipse model (Figure 6), since the bubble objects are visually quite similar to ellipses. The approach allows, on the one hand, the target parameters to be calculated, e.g., the area, and on the other hand, the ellipse model reduces the number of model parameters and, as a result, allows more stable results to be achieved [32]. The ellipse is a second-order curve, and therefore we fit the general conic ax^2 + bxy + cy^2 + dx + ey + f = 0. A numerically stable variation of the least squares method was used, namely least squares minimization with a solution guaranteed to be an ellipse, which is also relevant for scattered or noisy data. The optimal solution is calculated directly and no iterations are required, which results in a simple, stable and reliable fitting method with an undemanding implementation. The algorithm has no computational ambiguity and its throughput exceeds 100,000 fits per second. For our case it is convenient to parameterize the ellipse in terms of the coordinates of the center, the semi-axes and the angle of inclination of the major semi-axis. Finally, it is worth noting that various parts of the ellipse contain different amounts of the necessary information. As a result, if the detected bubble boundary does not include the high-curvature parts (turns) of the ellipse, a number of different ellipses can fit the data. Such uncertainty might result in a distorted ellipse, e.g., a smaller and more elongated one. In the following section we present the results of the single bubble size and bubble intersection examination, as well as results on bubble projection area estimation and model verification on the labeled data.
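For illustration, a direct least-squares ellipse fit of this kind is available in scikit-image as EllipseModel; the sketch below, which fits a noiseless arc of a known ellipse (x_0 = 4, y_0 = −3.5, a = 7, b = 3, θ = π/4, matching the example discussed in the next paragraph), is a stand-in for the method used in the study rather than its exact implementation.

```python
import numpy as np
from skimage.measure import EllipseModel

# Synthetic arc of a known ellipse, sampled over half of the contour only,
# i.e., containing one "turn" of the ellipse.
x0, y0, a, b, theta = 4.0, -3.5, 7.0, 3.0, np.pi / 4
t = np.linspace(0.0, np.pi, 20)
x = x0 + a * np.cos(theta) * np.cos(t) - b * np.sin(theta) * np.sin(t)
y = y0 + a * np.sin(theta) * np.cos(t) + b * np.cos(theta) * np.sin(t)
points = np.column_stack([x, y])

model = EllipseModel()
if model.estimate(points):                      # direct (non-iterative) least-squares fit
    xc, yc, sa, sb, ang = model.params          # center, semi-axes, inclination angle
    # Mean point-to-ellipse distance normalized by the sum of the semi-axes,
    # a fit-quality criterion of the kind used later for overlap detection.
    quality = np.mean(np.abs(model.residuals(points))) / (sa + sb)
    print(xc, yc, sa, sb, ang, quality)
```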
Since different parts of the ellipse contain different amounts of the required information, one can obtain results with different accuracies depending on which parts the data were generated from; in particular, if the data contain points from both turns of the ellipse, the result is much closer to the true values. Thus, for the following examples (Figure 7), the true values are x_0 = 4, y_0 = −3.5, a = 7, b = 3, θ = 0.7854 (π/4): in Figure 7c the result is closer to the true values since the number of points is still 20 but all of them carry more information. When counting the number of derivative sign changes, we keep in mind that the greater the number, the more confident the result is; however, due to noise there might be additional sign changes, as seen in Figure 8. Once the edges are obtained, the next step is to cluster them into separate bubbles. Since, owing to the process and the filming specifics, bubbles might overlap in the image, a clustering method will in general tend to reduce the number of bubbles by treating a few of them as one; moreover, there might be smaller bubbles on top of bigger ones, or a pseudo-bubble, i.e., an incorrectly detected edge. A good comparison of clustering algorithms can be found in [33,34,35,36,37]. DBSCAN was selected since it can cluster nested circles as separate clusters; in addition, DBSCAN does not require the number of clusters in the data to be specified a priori, which is quite important in the case of bubbles, unlike K-means. As in the scikit-learn implementation [37], the default parameter values were used, except for the maximum distance between two samples for one to be considered adjacent to the other. This distance parameter was set to 2.8 instead of 0.5, since this value assigns fewer bubble intersections to a single cluster while still clustering a single bubble correctly; however, there is still room for parameter tuning. In order to check that the method works, we plot the fitted ellipses on the original image, as seen in Figure 9. For the simplest case the method works well (Figure 10a); however, for more complex cases, when the number of bubbles increases dramatically, the method starts to mis-detect bubbles (Figure 10c). We then consider F3 in detail, since in Figure 10 it looks the most accessible for analysis: the complete video was taken and every 30th frame was evaluated by the method. As a result, we can consider distributions over all the data, or how the characteristics change from frame to frame; we discuss this in the next section relying on Figure 11. The code of the method is published on GitHub and open to use: https://github.com/VladimirPalmin/BubblesReactor, accessed on 15 January 2023.
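A sketch of the clustering step with scikit-learn's DBSCAN, using the eps = 2.8 value mentioned above, is shown below; the synthetic ring stands in for the real edge map so that the snippet runs on its own.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# 'edges' is the binary edge map from the filter pipeline sketched above;
# a small synthetic ring is used here as a stand-in input.
edges = np.zeros((100, 100), dtype=np.uint8)
yy, xx = np.mgrid[:100, :100]
edges[np.abs(np.hypot(xx - 50, yy - 50) - 20) < 1] = 255

ys, xs = np.nonzero(edges)
edge_points = np.column_stack([xs, ys]).astype(float)

# eps = 2.8 (instead of the scikit-learn default of 0.5) keeps single bubbles intact
# while assigning fewer bubble intersections to one cluster, as discussed in the text.
labels = DBSCAN(eps=2.8).fit_predict(edge_points)

clusters = [edge_points[labels == k] for k in set(labels) if k != -1]  # -1 marks noise
# Each cluster is then fitted with an ellipse as in the previous sketch.
```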

2.3. Neural Networks Methodology for Bubble Analysis

As an alternative to the classical method, the deep learning (DL) approach was also considered: in the current subsection we discuss image processing methods for analyzing the phase contact area using neural networks. A few methods are compared for processing the results of video filming with a high-speed camera during the experimental runs (Table 1), namely the Mask R-CNN [38] and StarDist [39] models. Instance segmentation of bubbles in the frame is carried out on pre-trained weights using the mentioned machine learning models. Final arguments are given in favor of choosing the StarDist model for the further full processing cycle of bubble images.
In the subsection above, the difficulties associated with bubble segmentation in a volume were discussed; some of them have been overcome elsewhere with a Mask R-CNN trained to capture the spatiotemporal variation of the interfacial shape between the gas and liquid phases, for which an average accuracy of 98% was reported [40]. That Mask R-CNN was trained on both experimental and synthetic bubbly flow images, obtained from upward bubbly flows in an expansion pipe and from the BubGAN algorithm [41], respectively. We applied the Mask R-CNN model pre-trained on the data from [41] in order to obtain bubble image masks for our own images (Figure 12).
This pre-trained Mask R-CNN model showed an average precision of AP_50 = 0.244 on our test data. In order to improve the performance of Mask R-CNN for our task, it would be necessary to retrain the model on our own data, which requires (due to the peculiarities of the architecture) a large training data set and sufficient computing power, and is therefore also quite expensive. Other neural network methods should therefore be considered.
The second open-source method known to be applicable to bubble detection is StarDist [39]. It was originally developed as a cell/nuclei detection method for microscopy images with star-convex shape priors, which seemed relevant for our task in general. The approach has its own benefits and drawbacks. The result of StarDist mask extraction with pre-trained weights is shown in Figure 13: it misses about half of the total number of bubbles. Such a poor result on our data is explained by the fact that the model was trained specifically to detect cell nuclei. At the same time, as shown in [42], StarDist still achieves higher accuracy and requires less training data than Mask R-CNN for bubble detection. In addition, in contrast to Mask R-CNN, StarDist has only a few hyper-parameters that do not need careful tuning to achieve good results [43]; therefore, StarDist was chosen to be trained for bubble instance segmentation, as discussed further in Section 4.

3. Application of the Developed Classical Computer Vision Method to the Bubble Dynamics Problem

Once the initial processing of bubble images described in Section 2.2 has been carried out, one is faced with the need to obtain the bubble distribution, the projection area estimate and a number of other characteristics, as well as to verify the developed model on the labeled data: this and more is discussed in the current section.

3.1. Bubble Distribution Investigation

As mentioned above, the peculiarities of the bioreactor design and the hydrodynamic processes occurring in its circuit make camera focusing problematic, since the bubbles are located throughout the volume; some bubbles are closer to the camera lens and others are further away. As a result, there is a certain distance at which bubble boundaries are sharp, while all other (closer or farther) bubbles become less clear, which means that their boundaries are less defined and thus more difficult to detect. At the same time, due to the camera settings and the installation design, more distant bubbles should be larger than nearby ones, so blurred bubbles will have smaller sizes than the real ones. One can then consider the problem of determining the distance at which a bubble becomes too blurred for detection in order to estimate the bubble size correction. Adding another smoothing step might also be considered to detect less sharp bubbles, but this could negatively affect the overall assessment, so we assume that all the detected bubbles are sharp and come from the general population, in order to proceed with the determination of different bubble types.
Let us consider the distributions of the ellipse parameters (Figure 11): the coordinates of the centers, the inclination angle and the principal semi-axes. The fact that the coordinate distributions (Figure 11a,b) are not uniform, as one might expect, can be explained by the fact that various parts of the frame were illuminated differently and bubbles are better detected in the brighter parts. The angle distribution (Figure 11c) is symmetrical and most of the bubbles turned out to be horizontal. This is due to the bubble dynamics, which proceed mostly along the horizontal axis. Additionally, the rounder the bubble, the less pronounced its slope. The distribution of the minor semi-axis (Figure 11d) has two peaks. If only points on one side of the ellipse are available, the second semi-axis is poorly determined, because there is not enough information on it. This causes an uncertainty which can give rise to two peaks in the distribution. It should also be noted that we can measure only a projection of a bubble, which might also cause a second peak: if the ellipses were formed from circles by movement along the horizontal axis, the third axis should also be a minor one, and as a result these two axes can substitute for one another. The eccentricity distribution (Figure 11f) appears to be a combination of two exponential distributions, the reasons for which require additional analysis. No signs that could be associated with fuzzy bubbles were found in the distributions. This means that either there are not enough of them among the detected bubbles, or the size change caused by the camera focus does not make a significant contribution. Note that we can also use the error distribution (Figure 11g) to discard poorly detected ellipses; in fact, we can change the error metrics to improve this cut. In [44,45] it was noted earlier that the bubble size distribution is related to the intensity of the gas–liquid flow in the system: as the flow rate increases, the distribution tends to shift from a lognormal to a normal form. Despite the difference in experimental apparatus and in the methods of estimating the bubble shape, we also observe a similar relation between the fluid dynamics of the medium and the bubble size distribution.

3.2. Bubble Projection Area Estimation

The bubble projection area is calculated by the formula S = πRr, where R and r are the major and minor semi-axes obtained as a result of applying the elliptic regression described above. It should be noted that the area determination procedure might be refined further in a number of ways, e.g., by averaging areas between successive frames. Note that the bubble surface area cannot be fully estimated from a single projection; thus, the simplest assumption is to consider the bubbles as ellipsoids of revolution. The total contact area of the phases is then calculated by approximate formulas and is either a function of the projection semi-axes for ellipses or a function of the area of a pixel cluster for irregularly shaped bubbles.
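A minimal helper illustrating the per-frame summation of projected areas S = πRr is shown below; the γ calibration factor and the example values are assumptions for illustration only.

```python
import numpy as np

def total_projection_area(ellipses, gamma=1.0):
    """Total projected bubble area for one frame: sum of S = pi * R * r over all fits.

    ellipses: iterable of (xc, yc, R, r, theta) tuples from the ellipse-fitting step;
    gamma: pixel-to-physical area conversion factor (assumed known from calibration).
    """
    return gamma * sum(np.pi * R * r for _, _, R, r, _ in ellipses)

# Example with two hypothetical fits (values in pixels):
print(total_projection_area([(120, 80, 14, 9, 0.3), (300, 210, 22, 17, -0.8)]))
```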
The dynamics were investigated, namely how the number of identified bubbles and their total area change with time. In order to proceed, not a single frame but the entire clip was considered: since the video was recorded at high frequency, adjacent frames are quite similar; therefore, not all frames of the video were considered, but every 30th. As a result, the dynamics of the number of bubbles, although they have a large spread, are stationary. At the same time, the dynamics of the total area show a declining trend; however, there is a correlation with the number of bubbles. This might indicate that over time there are more small bubbles, or that the shooting conditions have developed in such a way that the smaller ones have started to be detected better (Figure 14).
The dynamics represented in Figure 14a also indicate the frames with the maximum and minimum numbers of bubbles; these frames are shown in Figure 15. Visually, the first image of Figure 15 (left) generally has more medium-sized bubbles, while the second frame (Figure 15 (right)) is darker and contains bigger bubbles. This fact might cause a difference in the estimated number of bubbles, which in its turn, together with the bubble motion (bubbles appear in front of the viewing window and disappear, collide and diverge, etc.), causes the fluctuations seen in Figure 14.

3.3. Investigation of Bubble Sizes at Intersection

As a result of detecting bubble boundary lines and clustering them into separate entities, two bubbles may end up combined into one cluster when the distance between them is negligible, which might lead to incorrect results when fitting the bubble shape with an ellipse. In order to prevent this potentially negative effect, the following procedure was tested and verified: when the defined quality criterion for fitting with an ellipse exceeds a threshold value, such a fit is considered a "bad" one and another attempt is made to fit the cluster with two ellipses (triple overlays are not considered, though the approach would be similar). In order to make a fit with two ellipses, some continuous arc in the cluster is selected at random and fitted by the first ellipse; then all the points close to the first ellipse are discarded and the remaining points are fitted with the second ellipse. The process is repeated several times (up to ten for more accuracy) so that the random arc selection does not determine the result; the double fit with the best quality criterion is then selected. The quality criterion used to determine the threshold value that distinguishes "bad" from "good" fits was the average distance from the points to the ellipse divided by the sum of the lengths of the semi-axes of the ellipse (the division eliminates the dependence of the criterion on size). In order to compare a single fit with a double fit, two metrics were used: the absolute average distance from the points to the ellipse, and the absolute average distance from points distributed equidistantly along the ellipse to the nearest point of the cluster. If a double fit showed an improvement in both metrics, then two ellipses were considered to describe the cluster better than one (Figure 16). The applicability of the method is limited by the following fact: by sifting out fits by the threshold of the average relative distance of points to the ellipse, it is possible to distinguish "good" images from "bad" ones, but the class of "bad" images itself splits into two subclasses, namely overlapping bubbles and jerky pieces of boundaries for which the number of ellipses cannot be determined; the second case is much more common. Another important point is that the definitions of the fit-quality metrics require further analysis: by fitting almost any set of points with one ellipse, we obtain metrics related in one way or another to the distance between the points and the ellipse that are worse than if we fit with two ellipses, since it is easier to approximate the shape of an object with two entities than with one. Perhaps some normalization is required, e.g., to the total perimeter of the figures, or any other option that makes the metric independent of the number of degrees of freedom of the fit. In order to assess the impact of the method on the overall performance, the F3 video fragment was analyzed, from which every 90th frame was taken (Figure 17). The average number of bubbles without double fitting was 137.34, while with the method it was 138.78; thus, the average difference does not exceed one ellipse per frame. Summing up, the more promising way to improve quality might not be further development of post-processing methods but rather improvement of the edge detection itself. This includes tuning the method's parameters, such as the Gaussian kernel of the blurring procedure, and adding (or changing) layers of filters. However, the approach still requires an independent metric to avoid the problem of over-fitting.
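A possible sketch of the double-fit procedure described above is given below; the arc fraction, the discard distance and the assumption that the cluster points are ordered along the contour are illustrative choices, not details taken from the paper.

```python
import numpy as np
from skimage.measure import EllipseModel

def relative_fit_error(model, pts):
    """Average point-to-ellipse distance divided by the sum of the semi-axes."""
    a, b = model.params[2], model.params[3]
    return np.mean(np.abs(model.residuals(pts))) / (a + b)

def try_double_fit(cluster, n_trials=10, arc_frac=0.4, discard_dist=3.0, seed=0):
    """Attempt to explain one 'bad' cluster with two ellipses (sketch).

    cluster: (N, 2) boundary points of one cluster, assumed ordered along the contour.
    """
    rng = np.random.default_rng(seed)
    n, best = len(cluster), None
    for _ in range(n_trials):
        # Pick a random continuous arc and fit the first ellipse to it.
        arc_len = max(int(arc_frac * n), 5)
        start = rng.integers(0, n)
        idx = (start + np.arange(arc_len)) % n
        first = EllipseModel()
        if not first.estimate(cluster[idx]):
            continue
        # Discard points close to the first ellipse and fit the rest with a second one.
        far = np.abs(first.residuals(cluster)) > discard_dist
        if far.sum() < 5:
            continue
        second = EllipseModel()
        if not second.estimate(cluster[far]):
            continue
        score = relative_fit_error(first, cluster[idx]) + relative_fit_error(second, cluster[far])
        if best is None or score < best[0]:
            best = (score, first.params, second.params)
    return best  # None if no valid double fit was found
```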

3.4. Developed Computer Vision Methodology Verification on Labeled Data

Now that the methodology has been developed and tested on real data, we proceed with its correctness verification. The most natural approach is a visual analysis of the method's results both on individual frames and on the cumulative distribution. Keeping in mind that the filming was carried out for various regimes, which possibly affects the density of the bubbles in the frame as well as the size of the bubbles themselves, it should be noted that the sparser the picture, the better the method works. The latter is explained by sharper bubble contours at lower bubble densities. Furthermore, the resolution of the method depends on the quality of the frame illumination: darker frames are, as expected, processed with worse results.
The labeling was done manually in the VGG Image Annotator (VIA) web application. The selected frames were uploaded into the application and the "bubble" parameter was defined in the Attributes field. Thereafter, choosing the ellipse (selecting the appropriate shape in the region shape parameter), we marked the maximum distinguishable number of bubbles visible in the image. Bubbles located partly beyond the frame were also labeled. After the labeling of the frames was completed, the files were exported for analysis. For benchmarking, 7741 bubbles were used (labeled on two dozen frames selected from the F3 video), while the developed method evaluated 318 frames and 68,429 bubbles from the same video. When comparing the dynamics of the number and area of the labeled (Figure 18) and detected bubbles, we find that the obtained numbers are within the same order of magnitude (Figure 19).
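For reference, labeled ellipses can be read back from a VIA JSON export roughly as follows; this assumes the standard VIA 2.x annotation export with ellipse shape_attributes (cx, cy, rx, ry, theta) and is a sketch rather than the exact script used in the study.

```python
import json

def load_via_ellipses(via_json_path):
    """Read manually labeled bubble ellipses from a VIA annotation JSON export (sketch)."""
    with open(via_json_path) as f:
        project = json.load(f)
    labels = {}
    for entry in project.values():
        ellipses = []
        for region in entry.get("regions", []):
            shape = region["shape_attributes"]
            if shape.get("name") != "ellipse":
                continue
            ellipses.append((shape["cx"], shape["cy"],
                             shape["rx"], shape["ry"], shape.get("theta", 0.0)))
        labels[entry["filename"]] = ellipses
    return labels

# labeled = load_via_ellipses("via_bubbles_export.json")   # hypothetical export file
```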
There is an acceptable overestimation in the labeling caused by marking clearly smeared bubbles, since it is difficult to formally define the difference between a clear and a smeared bubble for human vision. Summarizing, the method developed and described in Section 3.1, Section 3.2 and Section 3.3 above gives a more stable result, i.e., one least subject to change from frame to frame. In other words, the presented labeling can only serve as an indirect indicator of quality and not as a metric, e.g., for filter parameter tuning. At the same time, the developed method does not cope with the detection of clusters of overlapping bubbles, which may generate certain fluctuations in the number of bubbles. Considering that there are also false positive detections (a bubble is detected where there is no real one) and false negatives (not all bubbles are detected), these two components may be assumed to compensate each other when averaged. Another intriguing aspect is the comparison of the distributions of labeled and detected bubbles: as seen from the center distributions, the edges of the frame are darker, as expected, so a certain underestimation of the detection quality may take place there, while the labeling distribution is closer to uniform. The bubble area distribution is found to be exponential (see the histogram on a logarithmic scale in Figure 20). At the same time, we observe an increase in the difference between the distributions with increasing bubble area; however, the number of labeled bubbles is much smaller than the number of detected ones, and the number of bins in the histograms is different, so it cannot be concluded that the method determines bubbles of larger area with worse quality (Figure 21).

3.5. Application of the Developed CCV Method to Various Operating Modes of the Bioreactor

Finally, we verify the developed CCV method by testing it for various operating modes of the bioreactor and checking whether it captures the peculiarities of these modes, which would serve as a demonstration of the method's effectiveness. In view of the fact that the filming was carried out in several modes of the bioreactor, determined by the pump frequency, let us compare the results of the method for the video files specified in Table 1. Visually, one can assume the number of bubbles to be the largest for F1, then (decreasing) for F2, then F3 and finally (most rarefied) F4. When comparing the dynamics of registered bubbles, such a trend is indeed observed, but the number of F1 bubbles is rather comparable to that of F2. This is explained by the fact that the bubbles are not only getting bigger, but their density is also increasing; as a result, no increase in the number of detected bubbles is observed when proceeding from F2 to F1 (Figure 22a). The average number of bubbles per frame is F1: 334 ± 66, F2: 330 ± 45, F3: 204 ± 26, F4: 160 ± 39, while the median area of bubbles does not increase; in other words, taking into account the statistical spread, the average median areas turned out to be similar (Figure 22b), namely F1: 222 ± 37, F2: 238 ± 35, F3: 295 ± 68, F4: 212 ± 41.
When constructing the dependence of these characteristics on the regime, one can see that the number of bubbles changes significantly, while the median value of the area remains the same within the limits of statistical fluctuations (Figure 23).
Comparison of the logarithmic bubble area distributions for the F2 and F3 videos (Figure 24a) reveals a predominance of smaller bubbles for F3. This may be an artifact of the method, which appears unable to detect small bubbles within denser clusters of bubbles. At the same time, a comparison of F2 with F4 (Figure 24b) shows that F4 clearly corresponds to an exponential distribution with a different coefficient, i.e., a larger slope of the histogram. As a result, the bubble size distribution can be estimated in terms of the exponential distribution parameter. We emphasize that the demonstrated bubble averaging approach should improve the assessment of the current state of the bubble size distribution, which in its turn might subsequently be utilized when developing a complex system for optimal control and management of gas–liquid flows in the bioreactor circuit, applicable also to industrial fermenters. All in all, the developed method has proved able to catch the distinction between the bioreactor's modes. Meanwhile, there exists a limit to the method's resolution associated with the density of the bubble distribution over the frame. This limit is probably reached, or close to being reached, in the F1 video at 30 Hz, since this video has the highest observed density of bubbles.
Finally, realizing that during filming the camera captures only a small fraction of the entire bubble volume of the bioreactor's gas–liquid medium, and in order to check the behavior of the bubble flow characteristics for a fixed pump mode (e.g., 50 Hz as for F3), a similar survey (F5) was carried out through another viewing window located lower than the first one (Figure 1). When comparing the bubble area distributions, one can see that they largely coincide with those for the upper level, while the number of bubbles for the lower one turns out to be significantly lower (Figure 25). The smaller number of bubbles leads to an increase in statistical errors, which can result in a more significant scatter of the median area and, finally, in statistical coincidence. The average number of bubbles per frame is F3: 204 ± 26, F5: 66 ± 23, while the average median area of bubbles per frame is F3: 295 ± 68, F5: 439 ± 108. The comparison shows that for filming of the bubble flows through viewing windows at different levels of the bioreactor, the number of bubbles does change, but their distribution remains similar (Figure 26). Note that when developing a subsequent program of experiments, one should provide parallel rather than sequential filming in order to gain greater confidence in the absence of, e.g., distortion effects due to a change in the shape of the bubble distribution, as well as to increase the training base and verify the model.

4. Application of Neural Network Algorithms to the Problem of Detection, Segmentation and Parameterization of Bioreactor Bubble Flow

Now that the classical methodology has been described in detail in terms of content and application, let us focus on the alternative approach, namely adjusting and using deep learning algorithms based on the preliminary methodology described in Section 2.3.

4.1. StarDist Approach to Object Detection

The key feature of the StarDist method lies in directly predicting the mask for each object of interest as a star-convex polygon for every pixel. In order to understand the principle of the StarDist method, let us consider an image as a matrix of i columns and j rows. Every pixel (i, j) belonging to a bubble is described by two different functions: the first, r_{i,j}, describes a star-convex polygon that extends from the current pixel position to the edge of the bubble (Figure 27, r_{i,j}^k), while the second, d_{i,j}, is the normalized Euclidean distance to the nearest background pixel. In these terms, d_{i,j} is treated as the object probability (Figure 27, d_{i,j}), meaning that the closer the pixel is to the bubble center, the higher its value. In order to yield r_{i,j}^k and d_{i,j} for each pixel, a custom-designed U-Net architecture is trained using annotated data sets. As a result of applying this trained U-Net, the values of r_{i,j}^k and d_{i,j} can be predicted for new images. Non-maximum suppression (NMS) is then used to select one single polygon r_{i,j}^k for every actual bubble in the image.
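A minimal prediction sketch with the stardist package is given below, shown with publicly available pretrained weights only to illustrate the interface (the model actually used is trained in Section 4.2); the prob_thresh and nms_thresh values are illustrative.

```python
import numpy as np
from csbdeep.utils import normalize
from stardist.models import StarDist2D

# A publicly available pretrained model, used here only to illustrate the interface.
model = StarDist2D.from_pretrained("2D_versatile_fluo")

img = np.random.rand(640, 640)             # placeholder for a real grayscale frame
labels, details = model.predict_instances(
    normalize(img, 1, 99.8),                # percentile normalization of the input
    prob_thresh=0.5,                        # threshold on the object probability d_ij
    nms_thresh=0.4,                         # non-maximum suppression of candidate polygons
)
# 'labels' is a label image (one integer ID per bubble);
# details['coord'] holds the polygon vertices per detected object.
```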

4.2. StarDist Method Application for Mask Extraction of the Bubbles

Application of deep learning networks such as StarDist normally requires the user to train a model on their specific data. To train the StarDist model, the end-to-end open-source machine learning platform TensorFlow 2 was used, which allows researchers to train (and retrain) and validate deep learning networks. Our approach can thus be divided into four steps:
  • Ground-truth annotations in order to train the StarDist model. This part requires careful manual annotations of the images for subsequent analysis.
  • StarDist model training and validating.
  • Trained StarDist model application to segment the objects for further analysis.
  • Image augmentation to improve the StarDist model performance.
Training the StarDist model naturally requires a set of images and their corresponding masks. We started to train StarDist from a small set of training data in order to test and adjust the results, and then iterated. The training dataset consisted of 25 images with a resolution of 1280 × 800 pixels taken from 10 different video files. The annotation was carried out in the QuPath program [46]. The training data consist of the corresponding input images as shown in Figure 28a (i.e., original images) and fully annotated label images (Figure 28b) (i.e., each pixel is labeled with a unique object ID, or assigned 0 for background). Each image was sliced into four equal parts of 640 × 640 pixels and further augmentation was performed. Augmentation consists of four stages, namely:
  • Reflection of the image (vertical or horizontal, 50% probability).
  • Rotation of the image by 90 degrees (50% probability).
  • Change of pixel intensity: the intensity of each pixel is multiplied by a number from the interval (0.6…1.4), and a number from the interval (−7…7) is added (uniform distributions).
  • Gaussian noise overlay (mean = 0, standard deviation = 2).
Thus, we got 500 images for training the neural network. The StarDist model was trained on a GTX Titan X GPU with our own data set (Table 1).
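A training sketch with the stardist API is given below, with an augmenter following the four stages listed above; n_rays, grid and the epochs value are illustrative assumptions, and the train call is left commented since it requires the prepared image/mask lists.

```python
import numpy as np
from stardist.models import Config2D, StarDist2D

def augmenter(x, y):
    """Apply the four augmentation stages listed above to an image x and its label mask y."""
    if np.random.rand() < 0.5:                      # reflection (vertical or horizontal)
        ax = np.random.randint(2)
        x, y = np.flip(x, axis=ax), np.flip(y, axis=ax)
    if np.random.rand() < 0.5:                      # 90-degree rotation
        x, y = np.rot90(x), np.rot90(y)
    x = x * np.random.uniform(0.6, 1.4) + np.random.uniform(-7, 7)   # intensity change
    x = x + np.random.normal(0.0, 2.0, x.shape)     # Gaussian noise overlay
    return x, y

conf = Config2D(n_rays=32, grid=(2, 2), n_channel_in=1)   # illustrative configuration
model = StarDist2D(conf, name="bubbles", basedir="models")
# X_trn, Y_trn, X_val, Y_val: lists of 640 x 640 images and label masks prepared as above
# model.train(X_trn, Y_trn, validation_data=(X_val, Y_val),
#             augmenter=augmenter, epochs=100)
```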
The model's average precision on the training samples, i.e., the area under the precision-recall curve, reached AP_50 = 0.731 at the IoU = 0.5 threshold and AP_75 = 0.544 at the IoU = 0.75 threshold. The average precision on the test dataset was also calculated. The test dataset consists of four images that were not used to train the StarDist model; as a result of a similar slicing and augmentation procedure, 80 test images were obtained. The model's average precision on the test dataset reached AP_50 = 0.544 at the IoU = 0.5 threshold and AP_75 = 0.294 at the IoU = 0.75 threshold. The processing time of one frame by the trained model is 0.28 s: a worthy result that can be further improved by means of parallel processing.
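For reference, per-threshold matching statistics of this kind can be computed with stardist's matching utilities; note that this yields precision at a fixed IoU threshold, used here only as a simple proxy for the reported AP values, and the label images below are tiny synthetic placeholders.

```python
import numpy as np
from stardist.matching import matching_dataset

# Tiny synthetic ground-truth / prediction pair; in practice these are the test-set masks.
Y_true = [np.zeros((64, 64), dtype=int)]
Y_pred = [np.zeros((64, 64), dtype=int)]
Y_true[0][10:30, 10:30] = 1
Y_pred[0][12:32, 12:32] = 1

stats_50 = matching_dataset(Y_true, Y_pred, thresh=0.50, show_progress=False)
stats_75 = matching_dataset(Y_true, Y_pred, thresh=0.75, show_progress=False)
print(stats_50.precision, stats_75.precision)   # precision at IoU 0.5 and 0.75
```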
Application of the trained model to a sample frame is shown in Figure 29b: the algorithm is able to separate the overlapping bubbles. Unlike Mask R-CNN, StarDist does not use axis-aligned bounding boxes as a shape representation. Instead, this model predicts a star-convex polygon for each pixel, which allows overlapping bubbles to be demarcated. Therefore, we are spared the problem of separating the overlapping bubbles, for which purpose special clustering algorithms were implemented in Section 3.3, and we proceed to the analysis of the resulting bubble masks.

4.3. Analysis of Bubble Features Obtained from the Trained StarDist Model

Since the result of the StarDist model is a label image and/or a list of ROIs, one per bubble, we calculate the total number of bubbles per frame and obtain information about every bubble with the scikit module [37]. Bubble boundaries can be obtained from the individual bubble masks by generalizing bounding polygons containing the sets of non-zero pixels (Figure 30a). At this stage, we acquire information about the center of mass of each bubble and the coordinates of its boundary with the other phase. Figure 30b shows the calculated bubble area distribution after application of the StarDist method. Using the resulting masks and the skimage package, we obtain various characteristics of the bubbles: eccentricity, perimeter, orientation and others. Figure 31b shows the calculated eccentricities of the approximated ellipses.
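A sketch of this feature extraction with scikit-image's regionprops is shown below; the label image here is a tiny synthetic placeholder standing in for the StarDist output.

```python
import numpy as np
from skimage.measure import regionprops, find_contours

# 'labels' would be the label image returned by the trained StarDist model;
# a small synthetic placeholder is used here so the sketch runs on its own.
labels = np.zeros((64, 64), dtype=int)
labels[10:30, 20:45] = 1

props = regionprops(labels)
features = [
    {
        "area": p.area,                   # pixel area of the bubble mask
        "eccentricity": p.eccentricity,   # of the ellipse with the same second moments
        "perimeter": p.perimeter,
        "orientation": p.orientation,
        "centroid": p.centroid,           # center of mass of the mask
    }
    for p in props
]

# Boundary coordinates of one bubble (its interface with the liquid phase):
boundary = find_contours((labels == props[0].label).astype(float), 0.5)[0]
```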
Further analysis of the obtained bubble masks is presented in Section 5.

5. Results and Discussion

5.1. Developed Computer Vision Method Benefits and Drawbacks

Figure 25 shows the comparison of the number and total surface area of all detected bubbles via manual and automatic markup; while the point-to-point stability of the result is rather low (up to 50%), one needs to remember that the initial data were recorded by a high-frequency camera, and no single separate frame is important in itself. The results of separate frames can be averaged later on in order to obtain the required trend information (Figure 14 shows more frames to estimate the spread). More important is the scale difference between the manual and automatic markup: on average, the effectiveness of the automatic markup is only about 50% if the manual markup is taken as 100%. The discrepancy could be caused by a few factors:
  • Manual markup overestimates the number of bubbles, especially small ones. Furthermore, manual markup tends to detect bubbles at different depths of view, whereas automatic markup is focused on a narrower range of focus distances.
  • The automatic procedure is simply not sensitive enough, although the trend stability in Figure 14 is good enough.
  • Different parts of the frame have different lighting conditions and therefore different probabilities of detection. This assumption is partially confirmed by Figure 21a,b, showing the distributions of the x and y coordinates of automatically recognized bubbles. The distributions are expected to be uniform in the ideal case; they are semi-uniform for the labeled data but not for the automatic analysis.
The first factor (the overestimated number of bubbles in the field of view in the manual markup) does not affect the overall quality of the research, since the manual markup was used only for a preliminary check of the results. The real calibration of the automatic procedure will use reference values from other experiments, not the manual markup. The global scaling factor includes several factors that cannot be obtained from image analysis:
  • Geometry ratio between the observed window and the whole setup;
  • Non-homogeneity of bubble distribution in the setup;
  • The depth of field of the camera.
The second factor (automatic markup sensitivity) could be improved by fine-tuning the filter parameters. Those parameters must be adjusted individually for each measurement setup. As part of the future development of the project, we plan to perform automatic optimization of the filter parameters in order to improve the recognition quality; still, this is not crucial as long as the recognition parameters are stable. The third factor (non-uniform lighting of the view field) should be addressed by a better light setup during the measurements. In the present set of measurements, external light was used, which turned out to be insufficient for providing uniform lighting. Among the ways to circumvent these inconveniences at the next stage of the project, internal lights might be installed covering only a narrow slice of the bioreactor volume (which would also improve the quality of the procedure); for example, [47] obtained worthy results with a similar technique, although the initial data were acquired in a different way, and this sort of measurement procedure was limited for the present experiment due to the specific design of the apparatus.
Another point of discussion is the comparison of the direct CV methods used in the current research with the more frequently used deep learning methods. The preliminary results presented in Section 4 are promising in terms of point-to-point stability, but they require additional study on different data sets as well as a productivity analysis. Determining the number of bubbles in the frame that is sufficient (in terms of control, safety and effectiveness of the fermentation process), as well as their optimal sizes, in other words defining the value of the target parameter, is another sophisticated question requiring further research.

5.2. Results and Prospects of Neural Network Method Development

As with the classical computer vision method, the applied NN model also possesses a number of benefits and drawbacks. In the current subsection we consider the key advantages and disadvantages of bubble detection by the neural network method, as well as possible prospects for improvement and implementation in the following research stages. In contrast to CCV, neural network methods provide an automated bubble detection and mask extraction tool that works universally under various experimental conditions, whereas CCV methods require optimization of threshold parameters, which are not universal across experiments. As demonstrated in Section 4, StarDist star-convex polygons localize bubble boundaries quite accurately even under complicated conditions. The StarDist model even detects overlapping bubbles, so there is no need to use clustering algorithms to separate them, unlike for CCV methods. Furthermore, the NN method is able to detect small bubbles that contribute to the phase contact area; calling them small does not mean they can be treated as minor or negligible, as their impact on mass transfer is a rather insidious question requiring special attention. However, the DL approach in general, and the demonstrated NN method in particular, are not devoid of disadvantages, including the need to annotate data and the lack of a speed advantage compared with the CCV methods discussed above. At the same time, it is possible to speed up the NN method by optimizing the calculations: the neural network output post-processing does not use the GPU but rather all available CPU cores, so the use of more powerful multi-core CPUs and installation of multi-core support (e.g., OpenMP) would speed up the mask prediction process significantly. Among the other promising approaches to quality improvement, the selected StarDist neural network might be combined with the popular TrackMate tracking software for bubble tracking [48], which would make it possible to describe the evolution of bubble shape over time. For automatic tracking there is no need to train another neural network: the already trained StarDist model can be developed further.

5.3. Classical Computer Vision Method vs. Neural Network Approach

Comparing the results of the classical computer vision and neural network methods for the number and total area of bubbles, the NN method shows more stable results in terms of the number of bubbles (Figure 32a). The number of detected bubbles is also about 20% higher for the NN. However, the numbers are still within the 95% confidence intervals of each other; therefore, it cannot be claimed that a statistically significant improvement was achieved. The estimated areas, however, differ much more (Figure 32b): the NN method provided an estimate roughly twice as large. This is odd, since it means that the 20% of bubbles undetected by the classical method account for as much area as all the bubbles it did detect. The reason may lie in the different approaches to the area calculation: the NN method uses the points of the mask, while the CCV method calculates the elliptical area.
Thus, we also compared the methods on the videos (Figure 33a). The ratios between the number of bubbles and the pump frequency are similar for both methods. Although the NN provides relatively better results for the F3 video (50 Hz), the ratio did not change. As a result, both methods appear able to detect and distinguish the various bioreactor modes. In addition, there is room for the CCV method to obtain more stable results, since its parameters could be tuned; however, this seems unnecessary, since the CCV method can already detect the differences and tuning might result in over-fitting. A comparison of the total area estimates obtained by both methods shows that the gap between them remains (Figure 33b). However, the NN method results now have a greater variance: this might be explained by the fact that the CCV method uses the ellipse as an approximation model, which adds information to the method and might result in more stable estimates. More data with smaller steps between bioreactor modes are needed to investigate the limits of the methods' detection and sensitivity. The current limit is set by the 30 Hz mode, since it produces too high a bubble density.

5.4. Mass Transfer Coefficient Estimation Approach Based on Developed Methodology

Now that we are ready to address the crucial parameter of fermenter efficiency mentioned in Section 2, namely the calculation of the phase contact area, we can again assume for CCV that bubbles have an ellipsoidal form. Since the two-dimensional image gives only an estimate of the cross-sectional area of the ellipsoid's projection, we assume that the missing semi-axis along the line of sight is, in order of magnitude, $c = \sqrt{ab}$. Using the approximate formula [49] for the surface area of an ellipsoid, the contact area between a single gas bubble and the liquid is estimated as $S = \gamma \, 4\pi \left[ \left(a^{p} b^{p} + b^{p} c^{p} + a^{p} c^{p}\right)/3 \right]^{1/p}$, where $p \approx 1.6075$ (for this value the relative error of the formula is about one percent), $\gamma$ is the conversion factor from pixel area to physical area, and $(a, b, c)$ are the semi-axes of the ellipsoid. The total observed phase area is then the sum of the individual bubble areas; a minimal computational sketch is given below.
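A minimal sketch of this per-bubble surface estimate follows, assuming the semi-axes are already available from the ellipse fit; the calibration factor gamma and the example axis values are illustrative placeholders.

```python
import numpy as np

P = 1.6075  # exponent of the approximate ellipsoid-surface formula [49]

def bubble_surface(a_px: float, b_px: float, gamma: float = 1.0) -> float:
    """Surface of one bubble; c = sqrt(a*b) is assumed for the line-of-sight semi-axis."""
    c_px = np.sqrt(a_px * b_px)
    mean_p = (a_px**P * b_px**P + b_px**P * c_px**P + a_px**P * c_px**P) / 3.0
    return gamma * 4.0 * np.pi * mean_p ** (1.0 / P)

# Total observed phase contact area = sum over all detected bubbles (illustrative axes).
semi_axes = [(12.0, 8.0), (30.0, 22.0), (7.5, 7.0)]
total_area = sum(bubble_surface(a, b) for a, b in semi_axes)
print(total_area)
```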
For the irregular-shaped bubbles handled by the NN method we use a different formula, $S \approx \gamma V^{2/3}$, where $V \approx R \, S_{projection}$ is the estimated 3D volume of the cluster, $S_{projection}$ is the total pixel area of the cluster, and the effective radius is taken as $R = \sqrt{S_{projection}/(4\pi)}$ (a sketch of this estimate is given below). Since optical methods of bubble detection have limits on the minimum size of the objects they can resolve, evaluating the real mass transfer requires more than a simple summation of all observed phase contact areas. Indeed, a significant contribution to mass transfer may be made by submillimeter and even smaller bubbles that are not caught by a conventional camera, as discussed in Section 3 and Section 4, as well as in [45,50].
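A corresponding sketch for the mask-based estimate is given below; it follows the effective-radius definition above and treats the calibration factor gamma as a placeholder.

```python
import numpy as np

def cluster_surface(labels: np.ndarray, cluster_id: int, gamma: float = 1.0) -> float:
    """Surface estimate S ~ gamma * V**(2/3) for one irregular cluster of the instance mask."""
    s_proj = float(np.count_nonzero(labels == cluster_id))  # projected area, pixels
    r_eff = np.sqrt(s_proj / (4.0 * np.pi))                 # effective radius from the projection
    volume = r_eff * s_proj                                  # pseudo 3D volume of the cluster
    return gamma * volume ** (2.0 / 3.0)
```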
As seen in Figure 34, applying a mathematical function to random variables increases the final variance of the estimates (a toy illustration is given below). In addition, the calibration coefficients are expected to carry much more variance than the standard deviations estimated for the total area, since calibration introduces additional uncertainty. Although the methods used make it possible to determine the type of the bubble size distribution functions, the number and characteristics of bubbles invisible to the camera remain unknown. To overcome this barrier, the observed bubble population can be compared with the final phase contact area calculated by other methods.
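The variance inflation can be illustrated with a toy Monte-Carlo propagation: noisy semi-axis estimates and an uncertain pixel-to-physical calibration factor are pushed through the ellipsoidal surface formula, and the relative spread of the output exceeds that of any single input. All distributions below are illustrative assumptions, not measured values.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
a = rng.normal(12.0, 1.2, n)        # semi-major axis, px (10% relative spread)
b = rng.normal(8.0, 0.8, n)         # semi-minor axis, px (10% relative spread)
gamma = rng.normal(1.0, 0.15, n)    # calibration factor (assumed 15% relative spread)

p = 1.6075
c = np.sqrt(a * b)
surface = gamma * 4 * np.pi * ((a**p * b**p + b**p * c**p + a**p * c**p) / 3) ** (1 / p)

for name, x in [("a", a), ("b", b), ("gamma", gamma), ("surface", surface)]:
    print(f"{name:>8}: relative std {x.std() / x.mean():.1%}")
```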
Thus, for the $k_L a$ calculation, the methods presented in [8,51,52] can be used. The observed and real phase contact areas should be compared on the basis of experimental data; the coefficient $k_L$ can then be found experimentally or theoretically by analyzing the chemical composition of the liquid. Summarizing, the final (real) phase contact area is represented as an implicit function of the observed one, $a_{real} = f(a_{observed})$, so that $k_L a = k_L \, f(a_{observed})$ (a schematic sketch of this calibration step is given below). The presented approach defines the directions for further development of the experimental and analytical parts of the methodology for assessing the mass transfer coefficient for highly intensive gas–liquid flows in bioreactor loops.
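A schematic sketch of this calibration step is shown below; the paired observations, the linear form of f and the value of k_L are placeholders rather than results of this study — the reference areas would come from an independent method such as the thermometry-based one [8].

```python
import numpy as np

# Paired measurements: optically observed total area vs. reference phase contact
# area from an independent method (values are purely illustrative).
a_observed  = np.array([0.6e5, 1.0e5, 1.5e5, 1.6e5, 3.1e5])
a_reference = np.array([1.1e5, 1.9e5, 2.9e5, 3.0e5, 6.2e5])

coeffs = np.polyfit(a_observed, a_reference, deg=1)   # f approximated by a straight line

def kla(a_obs: float, k_l: float) -> float:
    """k_L a = k_L * f(a_observed), with f taken from the calibration fit above."""
    return k_l * float(np.polyval(coeffs, a_obs))

print(kla(1.2e5, k_l=4.0e-4))
```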

5.5. Methodology and Results of Approach Comparison

Since the manuscript outlines two approaches to studying the bubble flows, a natural next step is a methodology for their qualitative comparative assessment; a set of criteria was therefore developed to compare the CCV and NN methods. The set includes over a dozen criteria, such as model and sample complexity, robustness of the model to changes in sample labeling, number of processed samples, etc. Although mostly general, these criteria nevertheless allow the two models, obtained in very different ways, to be compared in terms of development speed, data preparation, processing speed and robustness. A number of quantitative assessments are given in Table 2. The results of applying the assessment methodology to both methods are as follows:
Both models were used on comparable hardware with the same amount of free RAM, namely 32 GB—however, both models did use a relatively small amount of available memory.
Both models have comparable single-frame processing speed. The NN method, based on a modified StarDist, spends from 250 ms to 650 ms per frame at a resolution of 1280 × 800, while the CCV method, based on DBSCAN, takes from 180 to 1090 ms depending on the number of bubbles to detect (a minimal illustrative sketch of this clustering-and-fitting step is given after this list).
Data labeling in both cases takes comparable time and is used primarily to build a dataset in order to assess the quality of the model according to the selected metric.
An advantage of the classical model (CCV method) is that it requires only minor training, while the deep learning model developed as a modification of StarDist (NN method) requires training for several hundred epochs on hundreds of labeled samples. It is worth noting, however, that when new data differing from the previous data are obtained, the CCV method may require manual hyperparameter tuning, whereas for the NN method (re)training occurs automatically.
On the other hand, a strength of the NN-based model is data augmentation—blurring and rotations applied to each image—which makes this model potentially more resistant to changes in filming conditions. Note, however, that a study of stability lies outside the focus of the current paper.
Summarizing, neither of the developed models has a significant advantage over the other with respect to the criteria considered significant for further practical use.
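As referenced in the timing item above, a minimal sketch of the CCV-style clustering-and-fitting step follows; it assumes the edge pixels have already been extracted by the filter chain, and the DBSCAN parameters are illustrative rather than the tuned values of this study.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from skimage.measure import EllipseModel

def fit_bubbles(edge_points: np.ndarray, eps: float = 3.0, min_samples: int = 10):
    """edge_points: (N, 2) array of (x, y) edge coordinates for one frame."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(edge_points)
    ellipses = []
    for k in set(labels) - {-1}:                      # label -1 marks noise points
        cluster = edge_points[labels == k].astype(float)
        model = EllipseModel()
        if len(cluster) >= 5 and model.estimate(cluster):
            xc, yc, a, b, theta = model.params        # centre, semi-axes, rotation
            ellipses.append((xc, yc, a, b, theta))
    return ellipses
```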

5.6. Prospects for the Development of the Methodology and Areas of Its Applicability

Considering the prospects for developing this topic, including alternative research methods, the approach for determining the key parameter of the biosynthesis process developed in the current paper can serve as the core for determining (and predicting) the required characteristic on the basis of classical computer vision methods, neural network algorithms and their possible superposition. The principal idea of handling the key parameters of the biosynthesis process through the demonstrated approach relies on the physical connectivity of a wide spectrum of fermentation process parameters in the bioreactor, whereby a combination of certain characteristics allows one to calculate parameters which cannot be observed explicitly. These characteristics may include the degree of gas substrate utilization, crude protein yield, methane, air and titrate consumption, etc. Thus, for the problem of determining the phase contact area in gas–liquid flows, a dedicated regression model can be developed, although the problem can also be formulated as classification of time series [53] if the scale of the target variable is discretized. In general, this approach corresponds to sensorless measurement, which is currently being actively developed [54], especially in the field of electrical engineering. Its advantage lies in utilizing only those process characteristics that are already measured by basic sensors, without additional investment in equipment; its disadvantages are that, firstly, a labeled dataset of sufficient size must be prepared and, secondly, developing the algorithm itself can be quite laborious, since there is no guarantee of a good connection between the measured parameters and the target value. In other words, it may turn out that solving the problem requires adding new measurements or changing the sensor parameters. It would then be necessary to conduct a joint analysis of the measurement data, extract additional features using data mining techniques [55], and possibly reduce the dimension or obtain embeddings (for neural networks). Furthermore, the deep learning method for determining the target value has to be chosen and fine-tuned while controlling possible overfitting. Good examples of this approach are known [56], where operational data from oil wells, originally collected for other process purposes, made it possible to develop an algorithm for detecting lost circulation, which is known to be slow and costly to diagnose by traditional methods. In any event, the demonstrated approach lays the foundation for developing modern methods of bioreactor bubble flow analysis, which is the core of understanding and controlling both productivity and safety, and ultimately the economic efficiency of industrial-scale biotechnological production.

6. Conclusions

Within the presented research, a comprehensive methodology for determining the components of the mass transfer coefficient evaluation was developed and verified. The datasets used for subsequent model verification were collected during experimental runs on a research setup emulating the jet bioreactor circuit: frames were taken by a high-speed camera and later used as input data for the algorithms under study. Both classical and neural-network computer vision methods were developed and applied, and the benefits and drawbacks of each approach were analyzed. Individual bubble parameters were extracted, and the total phase contact area—a crucial value for calculating the volumetric mass transfer coefficient, given its principal impact on the efficiency and performance of mass transfer apparatuses, bioreactors and fermenters—was determined. Accurate data were obtained on gas bubble sizes and their distribution, as well as on the parameters of the interfacial surface, which are necessary conditions for precise fermentation control and for the development and optimization of bioreactor control systems. The demonstrated approach is valuable not only as an innovative complex methodology among non-invasive phase contact area measurements that do not involve ultrasound or potential measurements, but also as a perspective to be applied, in its original or a hybrid version, to a wide range of high-velocity gas-saturated flows, including reliable control of industrial fermentation machines. The current research is part of a global research project devoted to studying the properties of jet bioreactors as the core of biotechnological solutions for single-cell protein production.

Author Contributions

Conceptualization, D.C. and I.N.; methodology, V.S., S.Y., A.Z. and A.N.; software, V.P., I.S. (Ivan Simkin), P.M., T.H., M.M. and S.V.; formal analysis, S.I. and S.K.; investigation, I.S. (Ilya Starodumov), M.N., T.M. and A.K.; resources, F.B. and D.C.; data curation, T.M. and P.M.; writing—original draft preparation, I.S. (Ilya Starodumov), I.N., P.M., V.P., V.S., I.S. (Ivan Simkin) and D.C.; writing—review and editing, I.S. (Ilya Starodumov), I.N., P.M., V.P. and D.C.; visualization, V.P., I.S. (Ivan Simkin) and P.M.; supervision, F.B. and D.C.; project administration, I.S. (Ilya Starodumov) and D.C.; funding acquisition, I.S. (Ilya Starodumov), F.B. and D.C. All authors have read and agreed to the published version of the manuscript.

Funding

The research funding from the Ministry of Science and Higher Education of the Russian Federation (Ural Federal University Program of Development within the Priority-2030 Program) is gratefully acknowledged.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Voulodimos, A.; Doulamis, N.; Doulamis, A.; Protopapadakis, E. Deep Learning for Computer Vision: A Brief Review. Comput. Intell. Neurosci. 2018, 2018, 7068349. [Google Scholar] [CrossRef] [PubMed]
  2. O’Mahony, N.; Campbell, S.; Carvalho, A.; Harapanahalli, S.; Velasco-Hernandez, G.; Krpalkova, L.; Riordan, D.; Walsh, J. Deep Learning vs. Traditional Computer Vision. arXiv 2019, arXiv:1910.13796. [Google Scholar]
  3. Shafizadeh, A.; Shahbeig, H.; Nadian, M.H.; Mobli, H.; Dowlati, M.; Gupta, V.K.; Peng, W.; Lam, S.S.; Tabatabaei, M.; Aghbashlo, M. Machine learning predicts and optimizes hydrothermal liquefaction of biomass. Chem. Eng. J. 2022, 445, 136579. [Google Scholar] [CrossRef]
  4. Ignacz, G.; Alqadhi, N.; Szekely, G. Explainable machine learning for unraveling solvent effects in polyimide organic solvent nanofiltration membranes. Adv. Membr. 2023, 3, 100061. [Google Scholar] [CrossRef]
  5. Mann, V.; Venkatasubramanian, V. AI-driven hypergraph network of organic chemistry: Network statistics and applications in reaction classification. React. Chem. Eng. 2023. [Google Scholar] [CrossRef]
  6. Zohar, B.; Blinder, Y.; Epshtein, M.; et al. Multi-flow channel bioreactor enables real-time monitoring of cellular dynamics in 3D engineered tissue. Commun. Biol. 2019, 2, 7068349. [Google Scholar] [CrossRef]
  7. Alavijeh, M.; Baker, I.; Lee, Y.; Gras, S. Digitally enabled approaches for the scale up of mammalian cell bioreactors. Digit. Chem. Eng. 2022, 4, 100040. [Google Scholar] [CrossRef]
  8. Starodumov, I.; Nizovtseva, I.; Lezhnin, S.; Vikharev, S.; Svitich, V.; Mikushin, P.; Alexandrov, D.; Kuznetsov, N.; Chernushkin, D. Measurement of Mass Transfer Intensity in Gas-Liquid Medium of Bioreactor Circuit Using the Thermometry Method. Fluids 2022, 7, 366. [Google Scholar] [CrossRef]
  9. Aroniada, M.; Maina, S.; Koutinas, A.; Kookos, I.K. Estimation of volumetric mass transfer coefficient (kLa)—Review of classical approaches and contribution of a novel methodology. Biochem. Eng. J. 2020, 155, 107458. [Google Scholar] [CrossRef]
  10. Ho, D.; Kim, K.; Earmme, T.; Kim, C. Enhancing gas–liquid volumetric mass transfer coefficient. J. Ind. Eng. Chem. 2020, 87, 1–17. [Google Scholar] [CrossRef]
  11. Abbasian-arani, M.; Hatamipour, M.S.; Rahimi, A. Experimental determination of gas holdup and volumetric mass transfer coefficient in a jet bubbling reactor. Chin. J. Chem. Eng. 2021, 34, 61–67. [Google Scholar] [CrossRef]
  12. Richard, H.; Irina, N.; Dmitri, C.; Kalyuzhnaya, M.G. C1-Proteins Prospect for Production of Industrial Proteins and Protein-Based Materials from Methane. In Algal Biorefineries and the Circular Bioeconomy; CRC Press: Boca Raton, FL, USA, 2022; pp. 251–276. [Google Scholar]
  13. Kalyuzhnaya, M.; Gomez, O.; Murrell, J. The Methane-Oxidizing Bacteria (Methanotrophs). In Taxonomy, Genomics and Ecophysiology of Hydrocarbon-Degrading Microbes; Springer: Cham, Switzerland, 2019; pp. 245–278. [Google Scholar] [CrossRef]
  14. León-Becerril, E.; Maya-Yescas, R. Axial Variation of Mass Transfer Volumetric Coefficients in Bubble Column Bioreactors. Chem. Prod. Process Model. 2010, 5. [Google Scholar] [CrossRef]
  15. Rahimi, M.J.; Sitaraman, H.; Humbird, D.; Stickel, J.J. Computational fluid dynamics study of full-scale aerobic bioreactors: Evaluation of gas–liquid mass transfer, oxygen uptake, and dynamic oxygen distribution. Chem. Eng. Res. Des. 2018, 139, 283–295. [Google Scholar] [CrossRef]
  16. Nizovtseva, I.G.; Starodumov, I.O.; Schelyaev, A.Y.; Aksenov, A.A.; Zhluktov, S.V.; Sazonova, M.L.; Kashinsky, O.N.; Timkin, L.S.; Gasenko, V.G.; Gorelik, R.S.; et al. Simulation of two-phase air–liquid flows in a closed bioreactor loop: Numerical modeling, experiments, and verification. Math. Methods Appl. Sci. 2022, 45, 8216–8229. [Google Scholar] [CrossRef]
  17. Charles, M. Fermentation scale-up: Problems and possibilities. Trends Biotechnol. 1985, 3, 134–139. [Google Scholar] [CrossRef]
  18. Gemello, L.; Cappello, V.; Augier, F.; Marchisio, D.; Plais, C. CFD-based scale-up of hydrodynamics and mixing in bubble columns. Chem. Eng. Res. Des. 2018, 136, 846–858. [Google Scholar] [CrossRef]
  19. Finkler, A.T.J.; de Lima Luz, L.F.; Krieger, N.; Mitchell, D.A.; Jorge, L.M. A model-based strategy for scaling-up traditional packed-bed bioreactors for solid-state fermentation based on measurement of O2 uptake rates. Biochem. Eng. J. 2021, 166, 107854. [Google Scholar] [CrossRef]
  20. Linek, V.; Sinkule, J.; Benes, P. Critical assessment of the dynamic double-response method for measuring kLa: Experimental elimination of dispersion effects. Chem. Eng. Sci. 1992, 47, 3885–3894. [Google Scholar] [CrossRef]
  21. Patel, N.; Thibault, J. Enhanced in situ dynamic method for measuring KLa in fermentation media. Biochem. Eng. J. 2009, 47, 48–54. [Google Scholar] [CrossRef]
  22. Chen, F.; Gomez, C.; Finch, J. Technical note bubble size measurement in flotation machines. Miner. Eng. 2001, 14, 427–432. [Google Scholar] [CrossRef]
  23. Wang, J.; Forbes, G.; Forbes, E. Frother Characterization Using a Novel Bubble Size Measurement Technique. Appl. Sci. 2022, 12, 750. [Google Scholar] [CrossRef]
  24. Fynn, G. Bioprotein manufacture: A critical assessment: By David H. Sharp. Pp. 140. Ellis Horwood, Chichester. 1989. £34.95. Endeavour 1990, 14, 100. [Google Scholar] [CrossRef]
  25. Sharif, M.; Zafar, M.H.; Aqib, A.I.; Saeed, M.; Farag, M.R.; Alagawany, M. Single cell protein: Sources, mechanism of production, nutritional value and its uses in aquaculture nutrition. Aquaculture 2021, 531, 735885. [Google Scholar] [CrossRef]
  26. Xu, J.; Wang, J.; Ma, C.; Wei, Z.; Zhai, Y.; Tian, N.; Zhu, Z.; Xue, M.; Li, D. Embracing a low-carbon future by the production and marketing of C1 gas protein. Biotechnol. Adv. 2023, 63, 108096. [Google Scholar] [CrossRef]
  27. Woolley, L.; Chaklader, M.R.; Pilmer, L.; Stephens, F.; Wingate, C.; Salini, M.; Partridge, G. Gas to protein: Microbial single cell protein is an alternative to fishmeal in aquaculture. Sci. Total Environ. 2023, 859, 160141. [Google Scholar] [CrossRef]
  28. Stanbury, P.F.; Whitaker, A.; Hall, S.J. Principles of Fermentation Technology; Elsevier: Amsterdam, The Netherlands, 2013. [Google Scholar]
  29. Schügerl, K. Comparison of different bioreactor performances. Bioprocess Eng. 1993, 9, 215–223. [Google Scholar] [CrossRef]
  30. Moser, A. Bioreactor Performance: Process Design Methods. In Bioprocess Technology; Springer: New York, NY, USA, 1988; pp. 307–405. [Google Scholar]
  31. Petersen, L.A.; Villadsen, J.; Jørgensen, S.B.; Gernaey, K.V. Mixing and mass transfer in a pilot scale U-loop bioreactor. Biotechnol. Bioeng. 2017, 114, 344–354. [Google Scholar] [CrossRef] [PubMed]
  32. Halíř, R.; Flusser, J. Numerically Stable Direct Least Squares Fitting of Ellipses; WSCG: Beaufort, SC, USA, 1998. [Google Scholar]
  33. Hubert, L.; Arabie, P. Comparing partitions. J. Classif. 1985, 2, 193–218. [Google Scholar] [CrossRef]
  34. Comaniciu, D.; Meer, P. Mean Shift: A Robust Approach Toward Feature Space Analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 603–619. [Google Scholar] [CrossRef]
  35. von Luxburg, U. Statistics and Computing; John Wiley & Sons Ltd.: Hoboken, NJ, USA, 2007. [Google Scholar]
  36. Shi, J.; Malik, J. Normalized Cuts and Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 888–905. [Google Scholar] [CrossRef]
  37. Scikit-learn Team. 2.3—Clustering. Available online: https://scikit-learn.org/stable/modules/clustering.html (accessed on 15 January 2023).
  38. Abdulla, W. Mask R-CNN for Object Detection and Instance Segmentation on Keras and TensorFlow. 2017. Available online: https://github.com/matterport/Mask_RCNN (accessed on 15 January 2023).
  39. Schmidt, U.; Weigert, M.; Broaddus, C.; Myers, G. Cell Detection with Star-Convex Polygons. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2018, Granada, Spain, 16–20 September 2018; Frangi, A.F., Schnabel, J.A., Davatzikos, C., Alberola-López, C., Fichtinger, G., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 265–273. [Google Scholar]
  40. Kim, Y.; Park, H. Deep learning-based automated and universal bubble detection and mask extraction in complex two-phase flows. Sci. Rep. 2021, 11, 8940. [Google Scholar] [CrossRef]
  41. Fu, Y.; Liu, Y. BubGAN: Bubble generative adversarial networks for synthesizing realistic bubbly flow images. Chem. Eng. Sci. 2019, 204, 35–47. [Google Scholar] [CrossRef]
  42. Hessenkemper, H.; Starke, S.; Atassi, Y.; Ziegenhein, T.; Lucas, D. Bubble identification from images with machine learning methods. Int. J. Multiph. Flow 2022, 155, 104169. [Google Scholar] [CrossRef]
  43. Gerber, M.; Pillay, N.; Holan, K.; Whitham, S.A.; Berger, D.K. Automated Hyper-Parameter Tuning of a Mask R-CNN for Quantifying Common Rust Severity in Maize. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 18–22 July 2021; pp. 1–7. [Google Scholar] [CrossRef]
  44. Bi, R.; Tang, J.; Wang, L.; Yang, Q.; Zuo, M.; Chen, C.; Xiang, S. Experimental Study on Bubble Size Distribution in Gas-Liquid Reversed Jet Loop Reactor. Int. J. Chem. React. Eng. 2020, 18, 20190102. [Google Scholar] [CrossRef]
  45. Wang, B.; He, R.; Chen, M.; Pi, S.; Zhang, F.; Zhou, A.; Zhang, Z. Intensification on mass transfer between gas and liquid in fine bubble jet reactor. J. Environ. Chem. Eng. 2021, 9, 104718. [Google Scholar] [CrossRef]
  46. Bankhead, P.; Loughrey, M.B.; Fernández, J.A.; Fernández, J.A.; Dombrowski, Y.; McArt, D.G.; Dunne, P.D.; McQuaid, S.; Gray, R.T.; Murray, L.J.; et al. QuPath: Open source software for digital pathology image analysis. Sci. Rep. 2017, 7, 16878. [Google Scholar] [CrossRef]
  47. Cerqueira, R.; Paladino, E.; Ynumaru, B.; Maliska, C. Image processing techniques for the measurement of two-phase bubbly pipe flows using particle image and tracking velocimetry (PIV/PTV). Chem. Eng. Sci. 2018, 189, 1–23. [Google Scholar] [CrossRef]
  48. Fazeli, E.; Roy, N.H.; Follain, G.; Laine, R.F.; von Chamier, L.; Hänninen, P.E.; Eriksson, J.E.; Tinevez, J.Y.; Jacquemet, G. Automated cell tracking using StarDist and TrackMate. F1000Research 2020, 9, 1279. [Google Scholar] [CrossRef] [PubMed]
  49. Klamkin, M. Elementary approximations to the area of n-dimensional ellipsoids. Am. Math. Mon. 1971, 78, 280–283. [Google Scholar] [CrossRef]
  50. Mandal, A.; Kundu, G.; Mukherjee, D. Gas-holdup distribution and energy dissipation in an ejector-induced downflow bubble column: The case of non-Newtonian liquid. Chem. Eng. Sci. 2004, 59, 2705–2713. [Google Scholar] [CrossRef]
  51. Rathore, A.S.; Kanwar Shekhawat, L.; Loomba, V. Computational Fluid Dynamics for Bioreactor Design; Wiley Online Library: Hoboken, NJ, USA, 2016; pp. 295–322. [Google Scholar]
  52. Maischberger, T. Optimized process and bioreactor characterization. Chem. Ing. Tech. 2019, 91, 1719–1723. [Google Scholar] [CrossRef]
  53. Bagnall, A.; Lines, J.; Bostrom, A.; Large, J.; Keogh, E. The great time series classification bake off: A review and experimental evaluation of recent algorithmic advances. Data Min. Knowl. Discov. 2017, 31, 606–660. [Google Scholar] [CrossRef] [PubMed]
  54. Wu, D.; Huang, H.; Qiu, S.; Liu, Y.; Wu, Y.; Ren, Y.; Mou, J. Application of Bayesian regularization back propagation neural network in sensorless measurement of pump operational state. Energy Rep. 2022, 8, 3041–3050. [Google Scholar] [CrossRef]
  55. Mörchen, F. Time Series Feature Extraction for Data Mining Using DWT and DFT; Philipps-Marburg University: Marburg, Germany, 2003. [Google Scholar]
  56. Khodnenko, I.; Ivanov, S.; Perets, D.; Simonov, M. Detection of lost circulation in drilling wells employing sensor data using machine learning technique. Procedia Comput. Sci. 2019, 156, 300–307. [Google Scholar] [CrossRef]
Figure 1. Schematic representation of the experimental hydrodynamic stand for the bubble flow study. Geometry and technical characteristics of the stand designed in accordance with jet bioreactor circuit.
Figure 2. Sample image of the frame from F3 video described in Table 1 under consideration within the current study. Despite the filming of bubble flows being performed by means of the high-speed and high-resolution camera as well as with additional lighting, the flow’s specificity determined the complexity of the primary task, namely, marking and model training.
Figure 3. Image transformations after application of (a) black-hat transform, (b) histogram equalization and (c) CLAHE on the sample frame from the F3 video. One can see that all the methods highlight edges in different ways. After the series of tests for all the transformations above the black hat transform was selected as the most promising and correct.
Figure 4. Examples of (a) Canny filter and (b) Frangi filter applied on a sample frame from the F3 video. One can see that the Canny filter produces more false edges, while the Frangi filter's edges are wide, which complicates the subsequent analysis. For the final methodology we used both, namely the Canny filter after the Frangi filter, to make the edges thinner, keeping in mind that this approach may result in double edges.
Figure 5. Examples of the final edge detection method on (a) F1, (b) F2, (c) F3 video sample frames. One can see that due to a high density of the image objects the method finally fails to detect bubbles correctly.
Figure 6. An ellipse parameterization approach provides an appropriate way to define bubbles. The ellipse model might also reduce the statistical error related to the unstable bubble surface.
Figure 7. Dependence of the ellipse regression on the localization of the part observed and points obtained. True values of a model ellipse are: x0 = 4; y0 = −3.5; a = 7; b = 3; θ = 0.7854 (π/4), while (a) to (c) represent fitted parameters, namely: (a) x0 = 4.06; y0 = −3.51; a = 7.07; b = 3.06; θ = 0.757; (b) x0 = 0.33; y0 = −4.89; a = 3.85; b = 0.98; θ = 1.066; (c) x0 = 4.01; y0 = −3.51; a = 7.0; b = 3.01; θ = 0.767.
Figure 8. Highlighted points of derivative sign changes of (a) the ellipse with noise and (b) the smoothed one. As a result, it might indeed be reasonable to weight estimated ellipses according to how much information they carry. However, this was not used in the current CCV method, since it would make the method more complex, and the method already shows sufficient efficiency as demonstrated in the following sections.
Figure 9. Fitted ellipses and obtained clusters of edges: as clearly seen most clusters are determined correctly with a certain number of false bubbles detected which might be caused by double edges produced by the edge detection method described above.
Figure 10. Sample images of fitted ellipses on frames from (a) F1, (b) F2, (c) F3, (d) F4 and (e) F5 videos according to Table 1. Although these are just sample frames, some patterns are already visible. The F1 video has more bubbles than the others and thus a higher bubble density, which makes detection more difficult. The F2 video shows a visibly lower bubble density than F1, and F3 a lower density than F2. The F4 video contains an additional object, namely the fluid interface, although it does not appear in all frames. Finally, the F5 video has a greater share of out-of-focus bubbles.
Figure 11. Basic distributions of ellipse parameters: (a) horizontal distribution of bubble centers; (b) vertical distribution of bubble centers; (c) distribution according to the angle of inclination of the semi-major axis; (d) minor axis distribution; (e) major axis distribution; (f) eccentricity; (g) error distribution.
Figure 12. Bubble segmentation using pre-trained Mask R-CNN.
Figure 13. Bubble segmentation using pre-trained StarDist.
Figure 14. Dynamics of changes in observed bubble quantity and their median area in the frame: (a) variability in the number of registered bubbles and (b) variation of the median bubble area in the view field (frame indexes are indicated on the x-axis). Both the number and median area might be considered as a Gaussian process as expected for a scale of one second.
Figure 15. Successful (left) and unsuccessful (right) frame samples. In the successful frame there are generally more medium-sized bubbles, while the unsuccessful frame is darker.
Figure 16. (a) Sample of successful bubble separation fitting into one cluster. (b) Sample of a bad fit of the unsuccessfully defined boundary. For both figures blue dots correspond to the data obtained, green colored lines indicate a fit by single ellipse for the entire cluster at once, red ones show the ellipse resulting from fitting a randomly selected arc, while yellow colored curves indicate the ellipse obtained as a result of fitting points located at a significant distance from the red ellipse.
Figure 17. Comparison of bubble number (every 90th frame from F3 video).
Figure 18. Labeled frame sample.
Figure 19. Number (a) and median area (b) of detected (blue) and labeled (orange) bubbles. One can see that the CCV method produces more stable results than manual labeling, since a person labeled all bubbles including out-of-focus ones. This may introduce additional factors related to the movement of more distant bubbles that the NN method did not detect. Furthermore, the lack of an obvious distinction between clear and blurred bubbles increases the labeling error.
Figure 20. Bubble area distribution on a logarithmic scale.
Figure 21. Comparison of the distributions for horizontal (a) and vertical (b) coordinates of bubble’s center for the computer vision method (blue) and labelled results (orange).
Figure 22. Rolling means of three frames of number (a), median area (b), total area (c) and mean area (d) of bubbles for 20 Hz (red), 30 Hz (blue), 40 Hz (orange) and 50 Hz (green). As seen the number (quantity) (a) and the total area (c) are highly correlated. The median (b) and mean areas (d) look alike for all modes. Therefore various distribution characteristics have different resolutions in terms of bioreactor modes.
Figure 23. Ratio between (a) median area, (b) total area, (c) number of bubbles and the bioreactor mode. The number and the total area seem to be highly correlated across the bioreactor modes; thus, they can indicate the chosen mode, unlike the median area. Furthermore, note that the area estimate has a lower variance, so it may be preferable to use it in the subsequent calibration of the results. However, more experiments are required to measure and justify the ratios more precisely.
Figure 24. Comparison of logarithmic area distributions for (a) 40 Hz vs. 50 Hz, (b) 40 Hz vs. 20 Hz, (c) 40 Hz vs. 30 Hz bioreactor modes. One can see how the slope of the distribution changes for various bioreactor modes. The parameter of the exponential distribution might be used in order to parameterize bioreactor modes. However, it requires more datasets to measure and justify the ratio.
Figure 25. Number (a) and median area (b) of bubbles from upper window (blue) and lower window (orange) videos.
Figure 26. Comparison of logarithmic distributions for the F3 vs. F5 videos (filming through the upper and lower viewing windows, respectively): the slopes of the distributions seem similar, which might indicate that the distribution properties remain the same regardless of the window level. However, it might just be a resolution limit similar to the one at 30 Hz (F1 video). Thus, similar to the data represented in Figure 23 and Figure 24, this requires additional investigation.
Figure 27. Star-convex polygons (left to right) parameterized by the radial distances r_{i,j}^k and predicted object probabilities d_{i,j}, using the U-Net architecture for density predictions.
Figure 28. Training image (a) and mask (b) samples for the StarDist method.
Figure 29. StarDist method application: sample image from training (a) and test (b) sets.
Figure 30. StarDist method application: bubble boundaries (a) and distribution of areas (b).
Figure 31. StarDist method application: bubble approximations by ellipses (a) and eccentricities of bubbles (b).
Figure 32. Comparison of the results of the classical computer vision (blue) and neural network (orange) methods for (a) number and (b) total area of bubbles. The number estimates seem close, but the NN results have lower variance. There is also a gap between the total area estimates, for which the CCV method in turn shows lower variance.
Figure 33. Ratio between the arithmetical mean of (a) number and (b) total area of bubbles per frame and bioreactor modes for the classical computer vision (blue) and neural network (orange) methods. The NN method produces results with less variance for the bubble number estimation, whereas the CCV method does the same for the total area estimation.
Figure 34. Ellipsoidal surface area of bubbles obtained by the CCV method from the elliptical approximation with the reconstructed third axis (blue) and cluster surface area obtained by the NN method (orange). One can see that the ratio is similar to that of the 2D area estimate. More tests are required; however, the first conclusion is to prefer the 2D area estimate over the 3D one, since the results need to be calibrated anyway and, for the calibration procedure, absolute values are not crucial while the ratio matters.
Table 1. Correspondence between settings and video files for analysis: the experiment was conducted and the corresponding data analyzed for the set of bioreactor modes represented.

Video File Number | Value on the Frequency Converter of the Pump, Hz | Viewing Window Level
F1 | 30 | upper
F2 | 40 | upper
F3 | 50 | upper
F4 | 20 | upper
F5 | 50 | lower
Table 2. Selected quantitative estimates of the applied methods: the metrics were considered due to their direct correspondence to the project goals. One should note that the absolute values should be calibrated; thus, the standard deviations are more representative.

Method | Estimate | 20 Hz | 30 Hz | 40 Hz | 50 Hz
CCV | Mean total area of bubbles, 10^5 p.d.u. | 0.57 ± 0.15 | 1.5 ± 0.3 | 1.59 ± 0.16 | 0.96 ± 0.16
CCV | Mean number of bubbles | 160 ± 39 | 334 ± 66 | 330 ± 45 | 204 ± 26
NN | Mean total area of bubbles, 10^5 p.d.u. | 1.48 ± 0.27 | 3.3 ± 0.3 | 3.1 ± 0.4 | 2.01 ± 0.19
NN | Mean number of bubbles | 186 ± 34 | 340 ± 29 | 326 ± 30 | 253 ± 13
