
Laboratory for Tensor Networks and Deep Learning for Applications in Data Mining

Contract number: 14.756.31.0001
Time span of the project: 2017-2021

As of 01.11.2022:

  • 25 staff members
  • 112 scientific publications
  • 2 objects of intellectual property
General information

Name of the project: Tensor networks and deep learning for intelligent data analysis


Goals and objectives

Research directions: development of tensor decomposition and neural network algorithms for data compression and machine learning, as well as the practical application of neural networks and tensor decompositions in biomedicine, telecommunications, agrochemistry, robot control, quantum computing and other domains

Project objective: Computer science and information technologies


The practical value of the study

Scientific results:

  • Our researchers have reviewed the current problems of tensorized machine learning algorithms and produced a number of new solutions for tensorization and the tensorized representation of structured data. We have shown that tensor networks make it possible to perform efficient distributed computations involving ultra-large volumes of data and parameters, thus reducing or even eliminating the effect of «the curse of dimensionality». Tensorized algorithms have been developed for a number of applied problems.
  • We have studied practical uses of low-rank tensor approximations for solving a wide range of problems of large-scale linear/multilinear dimensionality reduction and the related optimization problems on tensor manifolds, which are extremely hard to solve using classical machine learning methods.
  • We have researched methods for training generative models with approximate Bayesian inference. A noise-resistant method has been proposed for training a variational autoencoder using weighted K-sample estimates of a robust likelihood. The method replaces the target functional of the variational autoencoder, the marginal likelihood, with a noise-resistant modification. We have also proposed and studied a number of lower bounds generalizing weighted K-sample estimates for the efficient maximization of the noise-resistant likelihood.
  • A new approach has been proposed for building a set of distinct near-optimal configurations of generative neural networks. Because storing K independent models for the variational approximation is costly in terms of memory, an efficient ensembling method has been proposed that requires storing only three artificial neural networks.
  • For the first time in the literature, we have proposed a multi-modal method for assessing the quality of multi-dimensional data generated by deep generative models, including generative adversarial networks. This method uncovers important properties of neural networks and opens the way to even more efficient methods based on geometric representations. We have also proposed a new parameterless method for assessing the size of a generative model, and obtained a metric for evaluating the degree of unevenness of a generative distribution.
  • A new approach has been proposed for generating video sequences from a small number of examples using deep neural networks. A framework has been developed for the meta-learning of generative adversarial models that can produce highly realistic virtual talking heads.
  • The Laboratory has developed models based on artificial neural networks (including recurrent networks) for optimizing functionals in multi-dimensional spaces and proposed a training method that significantly reduces overfitting. Our researchers have also designed a new probabilistic model of group pruning and proven its applicability to modern deep computer vision architectures such as VGG and ResNet; numerical experiments demonstrated a high degree of compression and acceleration without loss of predictive capability. We have also proposed a new method for accelerating deep neural networks, based on low-rank approximation of models built on ordinary differential equations, that does not require retraining.
  • A database of annotated X-ray images has been created that allows testing machine learning algorithms for segmentation, noise removal, automated detection of object positions in an image and smart data classification. This database can be used for the development of automated decision-support systems in medical applications.
  • We have implemented a new layer, applicable to various convolutional neural network architectures and employing a number of computer vision methods, that significantly accelerates automated image segmentation and object detection while preserving the precision of the result. This makes the approach promising for real-time systems, including those operating on low-power mobile platforms. The proposed neural network model based on the new layer competes in quality with ENet, one of the fastest modern models.
  • New models based on deep artificial neural networks have been developed for photorealistic image synthesis, including changing the attributes of images of human faces, neural style transfer illustrated on the generation of clothing items, the semantic segmentation of clothing items, alpha-matting of hair, and the synthesis of textures and biological images.
  • Using generalized tensor decompositions, we have demonstrated the universality and the efficiency of depth in recurrent neural networks. We have also proposed a predictor modeling all pairwise interactions of multi-dimensional data by representing an exponentially large parameter tensor in the compact tensor-train (TT) format, and developed a stochastic training algorithm based on Riemannian optimization. We have proven a number of theorems on the universality of generative and recurrent neural networks; the results demonstrate the existence of neural networks approximating arbitrary manifolds but do not indicate how the required size of the neural network can be estimated.
  • We have obtained theoretical results on the local convergence of alternating-optimization algorithms for multilinear low-rank optimization, in particular for the alternating least squares algorithm, that do not depend on the representation of the low-rank tensors.
  • Our researchers have developed a new tensor-network-based method for studying hidden Markov chains and proposed an efficient iterative approach to the compression and acceleration of neural network models that achieves a significant degree of compression while preserving the predictive capability of the model.
  • A method has been proposed for building biomarkers of medicinal plants by machine learning with tensor decompositions on chromato-mass-spectrometry data for the problem of species identification. We have developed a method for constructing graph embeddings that can be used in any machine learning algorithm as a feature vector describing a graph. We have also proposed and studied non-smooth non-negative matrix factorization, demonstrating its relation to a special type of deep neural network and its efficiency in cluster analysis.
  • The Laboratory has conducted an experimental study of the loss functions of deep neural networks and demonstrated that the local optima obtained in practice are not isolated in the space of weights: they can be connected by curves along which both the loss function and the classification error remain low.
  • We have demonstrated that recovering a high-resolution image from a low-resolution one with a neural network depends on the architecture rather than on the amount of training data. We have proposed trainable latent convolutional manifolds, one per training image, each a convolutional neural network, together with a shared convolutional generator that is optimized jointly with the latent manifolds. This allows obtaining an optimal latent representation for any data set and efficiently removing blur and distortions from images.
  • A new neural network architecture has been proposed that recognizes objects in images with high precision using a lightweight convolutional network that runs on low-power devices and mobile phones and does not require a GPU.
  • A new neural network architecture has been developed for producing a dense depth map from 3D data measured with a laser range finder. The proposed model extracts local kernels from the corresponding auxiliary RGB images; the local kernels estimate the diffusion direction for each pixel, which allows information to be propagated through the architecture consistently with the whole scene and achieves precision higher than that of previously published methods.
  • We have proposed deep prior distributions, an approach to defining prior distributions for Bayesian convolutional networks that accounts for spatial correlations of the network weights. Experiments demonstrated that the proposed approach accelerates the training of architectures and improves their quality when training samples are limited.
  • Our researchers have presented a new greedy method for training deep neural networks that approximate functions. Experiments demonstrated that greedy training can be much more efficient for studying neural network architectures than standard deep training methods, and achieved the exponential decay of the error that is predicted by theory but very hard to verify in practice.
  • A new TT embedding layer has been proposed for compressing the very large lookup tables used to encode categorical features. The new TT embedding layer significantly reduces the memory footprint of neural network models.
  • A concept has been developed for determining the values of controlled parameters that increase the efficiency of seed germination in various conditions. Using this new approach, based on Bayesian methods, we have demonstrated an increase in seed germination efficiency.
  • Our Laboratory has developed an algorithm that removes various distortions (for instance, noise) from an image in several minutes on a CPU and requires no data other than the source image. Standard neural-network-based quality improvement methods, by contrast, require a large training set of low-quality/high-quality image pairs, and training their parameters takes from several hours to several days.
  • We have presented a deep-neural-network algorithm for detecting objects in images. A distinctive feature of this algorithm is its ability to work efficiently with small amounts of training data, a known weakness of deep neural network architectures.
  • A data set has been created for detecting objects in deteriorating weather. It has no counterpart in its domain: the data was collected in conditions resembling, as closely as possible, the operating conditions of self-driving cars, and it covers the broadest range of weather conditions among similar data sets.
  • We have proposed a new model, Block Hankel Tensor ARIMA (BHT-ARIMA), for forecasting multiple time series. BHT-ARIMA improves forecast precision and computation speed, especially for multiple time series, and we have empirically studied its stability with respect to different parameters. Experiments on five real-life time series data sets demonstrate that BHT-ARIMA significantly outperforms state-of-the-art methods.
  • A new approach has been proposed for building a low-rank approximation of a tensor in the tensor-train format from a subset of its elements, using a convenient heuristic based on Gaussian process regression to choose the initial approximation. The TT rank of the initial tensor is then automatically corrected by multi-dimensional cross approximation in the TT format, which determines the rank of the tensor; together these methods allow the TT ranks to be chosen automatically when building TT approximations. Experiments confirmed that the proposed initialization yields smaller recovery errors for many optimization procedures. Moreover, a modified algorithm with lower computational complexity has been proposed for integrating a utility function represented as a low-rank tensor train in stochastic optimal control problems.
  • New algorithms have been developed for TT decomposition that simultaneously update sequences of two or three tensor cores while keeping the remaining cores fixed. With progressive computation of compressed tensors and precompression, the proposed algorithms demonstrate low computational complexity. They extend naturally to TT decomposition with non-negativity constraints and to the decomposition of incomplete data, and were implemented using MATLAB TENSORBOX.
  • We have presented a new application of tensor-train decomposition for computing canonical polyadic decompositions (CPD) of higher-rank tensors. We have demonstrated that the proposed method preserves the general structure and includes an exact transformation from a TT decomposition to a CPD as well as an iterative algorithm for estimating the CPD from the TT representation. Simulations confirmed that the approach is applicable both to exact tensor rank estimation and to the efficient computation of CPDs of higher-rank tensors.
  • Our researchers have demonstrated significant compression and acceleration of the model for the DensePose estimation problem, decreasing the model size by a factor of 17 and halving the latency.
  • We have proposed a new approach to tensor completion using tensor ring decomposition in an embedded space. Simulation results demonstrate the efficiency of the proposed approach in comparison with state-of-the-art completion algorithms.
  • We have demonstrated how to efficiently use the search for a rectangular submatrix of maximum volume to reduce the dimensionality of neural network layers, and assessed the error of this approximation.
  • Our Laboratory has developed new methods for the fast randomized computation of tensor ring decompositions and proposed a new method for automatically choosing the ranks of a tensor ring.
  • A theory and various methods have been proposed for improving the stability, robustness and low-rank constraints of tensor factorizations in the CPD and tensor ring formats, applied to the recovery of gaps in data.
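
The tensor-train (TT) format underlying several of the results above stores a d-way tensor as a chain of small three-way cores, which is what sidesteps the curse of dimensionality. A minimal sketch of the classical TT-SVD compression via sequential truncated SVDs (the function name and the fixed rank cap are illustrative, not the Laboratory's implementation):

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Decompose a d-way tensor into TT cores of shape (r_{k-1}, n_k, r_k)
    by sequentially applying truncated SVDs to its unfoldings."""
    shape = tensor.shape
    cores = []
    rank = 1
    unfolding = tensor.reshape(shape[0], -1)
    for k in range(len(shape) - 1):
        u, s, vt = np.linalg.svd(unfolding, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(u[:, :r].reshape(rank, shape[k], r))
        # Carry the remaining factor forward, folding in the next mode.
        unfolding = (s[:r, None] * vt[:r]).reshape(r * shape[k + 1], -1)
        rank = r
    cores.append(unfolding.reshape(rank, shape[-1], 1))
    return cores
```

Storing the cores costs roughly d·n·r² numbers instead of n^d, so for moderate ranks the saving grows exponentially with the number of modes.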
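The weighted K-sample estimates mentioned above generalize the standard evidence lower bound: averaging K importance weights inside the logarithm gives a tighter bound on the likelihood. A toy numerically stable sketch (the function name and the toy log-weights are illustrative):

```python
import numpy as np

def k_sample_bound(log_w):
    """Importance-weighted lower bound log(mean_k exp(log_w_k)),
    computed with the log-sum-exp trick for numerical stability."""
    m = log_w.max(axis=-1, keepdims=True)
    return np.squeeze(m, -1) + np.log(np.mean(np.exp(log_w - m), axis=-1))
```

By Jensen's inequality this bound never falls below the plain average of the log-weights (the K=1 ELBO case), and it tightens as K grows.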
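Several of the compression results above replace a dense layer by low-rank factors. A minimal sketch of that idea with a plain truncated matrix SVD (names and the fixed rank are illustrative; the Laboratory's methods use tensor formats rather than a single matrix SVD):

```python
import numpy as np

def factorize_dense_layer(W, rank):
    """Approximate an m-by-n weight matrix W by two thin factors A @ B,
    cutting parameters from m*n down to rank*(m + n)."""
    u, s, vt = np.linalg.svd(W, full_matrices=False)
    A = u[:, :rank] * s[:rank]   # shape (m, rank)
    B = vt[:rank]                # shape (rank, n)
    return A, B
```

Replacing one matrix multiplication by two thin ones also reduces inference cost whenever rank·(m + n) < m·n.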
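The mode-connectivity result above states that two trained optima can be joined by a curve of low loss. Such curves are commonly parameterized as a quadratic Bezier path through one trainable midpoint; a sketch of the parameterization only (in practice the midpoint is trained so the loss stays low along the curve):

```python
import numpy as np

def bezier_point(w1, w_mid, w2, t):
    """Point on the quadratic Bezier curve from weights w1 to w2
    through the midpoint w_mid, for t in [0, 1]."""
    return (1 - t) ** 2 * w1 + 2 * t * (1 - t) * w_mid + t ** 2 * w2
```

Sampling t uniformly along such a curve also yields the cheap ensembling scheme mentioned above, since the whole family of networks is defined by just three weight vectors.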
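The memory saving of the TT embedding layer above comes from factorizing both the vocabulary size and the embedding dimension. A back-of-the-envelope sketch of the parameter counts (the particular factorizations and rank are illustrative):

```python
# A 1,000,000 x 512 embedding table, with the vocabulary factored as
# 100*100*100 and the embedding dimension as 8*8*8, stored as three
# TT cores of rank 16. Core k holds ranks[k] * n_k * m_k * ranks[k+1] numbers.
vocab_factors, dim_factors, rank = (100, 100, 100), (8, 8, 8), 16
full_params = 1_000_000 * 512

ranks = (1, rank, rank, 1)
tt_params = sum(
    ranks[k] * n * m * ranks[k + 1]
    for k, (n, m) in enumerate(zip(vocab_factors, dim_factors))
)
print(full_params, tt_params, full_params / tt_params)
```

For this configuration the TT layer stores 230,400 numbers instead of 512 million, a compression factor above 2000.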
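BHT-ARIMA above first lifts each short time series into a higher-order tensor by Hankelization. A minimal sketch of the basic building block, the Hankel matrix of a 1-D series (the window size is illustrative):

```python
import numpy as np

def hankelize(series, window):
    """Stack `window` shifted copies of a 1-D series into a Hankel
    matrix of shape (window, len(series) - window + 1)."""
    n = len(series) - window + 1
    return np.array([series[i:i + n] for i in range(window)])
```

Applying this embedding along each mode turns a short multivariate series into a block Hankel tensor, on which low-rank tensor decompositions become effective.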
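The maximum-volume submatrix search mentioned above can be illustrated with the classical square maxvol swap iteration (the tolerance and the choice of starting rows are illustrative; the Laboratory's result concerns the rectangular variant):

```python
import numpy as np

def maxvol(A, tol=1.05, max_iter=100):
    """Greedy search for r rows of a tall n-by-r matrix A whose r-by-r
    submatrix has (locally) maximal volume |det|."""
    n, r = A.shape
    rows = np.arange(r)              # start from the first r rows
    for _ in range(max_iter):
        B = A @ np.linalg.inv(A[rows])
        i, j = np.unravel_index(np.abs(B).argmax(), B.shape)
        if abs(B[i, j]) <= tol:      # no swap grows the volume by > tol
            break
        rows[j] = i                  # this swap multiplies |det| by |B[i, j]|
    return rows
```

At convergence every entry of A·(A[rows])⁻¹ is bounded by the tolerance, which is the property used when such submatrices replace full layers.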

Implemented results of research:

  • The implemented neural network layer for automated image segmentation and searching for objects in images has been incorporated into the machine learning library PyTorch.
  • We have obtained a certificate of state registration for the computer program «Polara», a program for the creation of recommendation systems and a result of the intellectual activity of the Laboratory. The program is designed for the fast and convenient creation of new recommendation algorithms and for the comprehensive analysis of the quality of their work.
  • The Laboratory has obtained the patent «A system for the compression of artificial neural networks based on the iterative application of tensor approximations». The technical result of this invention is efficient neural network compression: it decreases the size of neural networks while preserving their prediction quality.
  • New TT decomposition algorithms have been developed that simultaneously update sequences of two or three tensor cores while keeping the remaining cores fixed. The proposed algorithms were implemented using MATLAB TENSORBOX.
  • We have presented T3F, a TensorFlow-based library for TT decomposition. T3F supports GPU computation, batch processing, automatic differentiation and universal functions for Riemannian optimization that account for the underlying manifold structure, enabling efficient optimization methods. The library simplifies the implementation of machine learning based on TT decomposition and features documentation, examples and 94% test coverage.

Education and career development:

  • We have developed and launched two courses for the master's degree program: «Convex Optimization and Applications», «Tensor Decompositions and Tensor Networks in Artificial Intelligence».
  • Six courses for the master's and postgraduate degree programs have been modified: «Numerical Linear Algebra», «Machine Learning and Applications», «Deep Learning», «Bayesian Methods of Machine Learning», «Theoretical Foundations of Data Science», «Uncertainty Quantification».
  • We have prepared two volumes of the monograph on tensor methods «Tensor networks for dimensionality reduction and large scale optimization».
  • Two seminars on machine learning and tensor decomposition have been prepared: «Numerical tensor methods and machine learning», «Journal Club».
  • 5 Candidate of Science dissertations have been prepared and defended.
  • The leading scientist and members of the academic staff of the Laboratory organized the 5th International Conference on Matrix Methods in Mathematics and Applications (Skoltech campus, 19-23 August 2019). The conference featured presentations of recent developments in matrix methods, numerical linear algebra and tensor decompositions, both fundamental and applied. It was attended by leading active researchers from Russia as well as from Switzerland, Germany, Italy, the United Kingdom, the USA and other countries. Each speaker presented an original work encompassing new theories, methods and important applications; the conference gathered leading scientists and engineers who discussed their latest achievements.
  • The leading scientist and members of the academic staff of the Laboratory organized the international conference «The 1st International Workshop on Tensor Networks and Machine Learning» (Skoltech campus, 24-26 October 2018). The conference featured more than 40 presentations reflecting modern achievements in deep learning and tensor networks and was attended by more than 150 participants.

Collaborations:

  • Moscow Institute of Physics and Technology, Higher School of Economics (Russia), Max Planck Institute for Mathematics (Germany), ETH Zurich (Switzerland), Hangzhou Dianzi University, Shanghai Jiao Tong University (China PR), Tokyo University of Agriculture and Technology, RIKEN Center for Advanced Intelligence Project (Japan): joint research.
  • Huawei (China PR): research of convolutional networks.
  • LG Electronics (South Korea): research of the fast tensor contraction of networks.
Among the industrial partners of the Laboratory are major technological companies and leaders of the Russian market: Sberbank and Gazprom Neft.


Selected publications:

Ahmadi-Asl, S., Abukhovich, S., Asante-Mensah, M.G., (...), Tanaka, T., Oseledets, I. Randomized Algorithms for Computation of Tucker Decomposition and Higher Order SVD (HOSVD). 2021, IEEE Access, 9, 9350569, pp. 28684-28706.
Ulyanov, D., Vedaldi, A., Lempitsky, V. Deep Image Prior. 2020, International Journal of Computer Vision, 128(7), pp. 1867-1888.
Fokina, D., Muravleva, E., Ovchinnikov, G., Oseledets, I. Microstructure synthesis using style-based generative adversarial networks. 2020, Physical Review E, 101(4), 043308.
Zheng, W.-L., Liu, W., Lu, Y., Lu, B.-L., Cichocki, A. EmotionMeter: A Multimodal Framework for Recognizing Human Emotions. 2019, IEEE Transactions on Cybernetics, 49(3), 8283814, pp. 1110-1122.
Zhang, Y., Wang, Y., Zhou, G., (...), Wang, X., Cichocki, A. Multi-kernel extreme learning machine for EEG classification in brain-computer interfaces. 2018, Expert Systems with Applications, 96, pp. 302-310.
Garipov, T., Izmailov, P., Podoprikhin, D., Vetrov, D., Wilson, A.G. Loss surfaces, mode connectivity, and fast ensembling of DNNs. 2018, Advances in Neural Information Processing Systems, pp. 8789-8798.
Khrulkov, V., Oseledets, I. Art of Singular Vectors and Universal Adversarial Perturbations. 2018, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 8578991, pp. 8562-8570.
Che, M., Cichocki, A., Wei, Y. Neural networks for computing best rank-one approximations of tensors and its applications. 2017, Neurocomputing, 267, pp. 114-133.
Deshpande, G., Rangaprakash, D., Oeding, L., Cichocki, A., Hu, X.P. A new generation of brain-computer interfaces driven by discovery of latent EEG-fMRI linkages using tensor decomposition. 2017, Frontiers in Neuroscience, 11, 246.
Other laboratories and scientists:

  • Laboratory «Hybrid modeling and optimization methods in complex systems». Hosting organization: Siberian Federal University (SibFU), Krasnoyarsk. Field of studies: computer and information sciences. Invited researcher: Stanimirović Predrag Stefan (Serbia). Time span of the project: 2022-2024.
  • Laboratory «Research of ultra-low-latency network technologies with ultra-high density based on the extensive use of artificial intelligence for 6G networks». Hosting organization: The Bonch-Bruevich Saint Petersburg State University of Telecommunications, St. Petersburg. Field of studies: computer and information sciences. Invited researcher: Abd El-Latif Ahmed Abdelrahim (Egypt). Time span of the project: 2022-2024.
  • Laboratory for Non-linear and Microwave Photonics. Hosting organization: Ulyanovsk State University (USU), Ulyanovsk. Field of studies: computer and information sciences. Invited researcher: Taylor James Roy (United Kingdom, Ireland). Time span of the project: 2021-2023.