A Deep Learning Framework Using Vulkan

Posted on: 2023-11-18  Degree: Master  Type: Thesis
Country: China  Candidate: G H Fu  Full Text: PDF
GTID: 2558306914960399  Subject: Electronic communications and engineering
Abstract/Summary:
In recent years, with the rapid development of artificial intelligence, a wave of deep learning has swept academia and industry. The power of deep learning has left a deep impression, and more and more people are taking part. However, the strong coupling between deep learning frameworks and hardware forces practitioners to buy graphics cards from designated manufacturers, which limits the development of deep learning to some extent. Commonly used frameworks such as TensorFlow and PyTorch are tied to NVIDIA graphics cards and fall short on cross-platform support. For example, neither can train on ARM GPUs, and neither can train neural networks on Windows with AMD graphics cards. In addition, most existing deep learning frameworks compute in float32, which consumes a large amount of memory and makes it impossible to train larger models on devices with little memory.

Most popular frameworks, such as TensorFlow and PyTorch, rely on NVIDIA's CUDA (Compute Unified Device Architecture) library for acceleration; fewer support AMD's ROCm (Radeon Open Compute platform) library. The two libraries share a drawback: each supports only its own vendor's graphics cards, so neither can offer a general cross-platform computing capability to the deep learning frameworks built on top of it. Both libraries also support only float32-precision neural network training and lack dynamic low-precision floating-point training, so they cannot train large-scale networks on low-memory devices.

To address these problems, this thesis uses Vulkan as the underlying compute library and designs and implements a dynamic floating-point training system. The result is VTorch, a deep learning framework that runs across platforms and can train large-scale neural networks on small-memory devices. The framework consists of five parts: a front-end interface, RPC communication, a training kernel, GPU acceleration, and a low-precision floating-point system. It has advantages over existing frameworks such as TensorFlow and PyTorch in cross-platform support and memory usage.

For cross-platform support and acceleration, the system implements the operators of commonly used deep learning algorithms on top of Vulkan, so that neural networks can be trained on a variety of operating systems and hardware devices. The system also applies loop blocking, Winograd, and other algorithms to accelerate the most time-consuming operations in neural networks, general matrix multiplication (GEMM) and convolution, which greatly improves the speed of DNNs and CNNs. For memory usage, the system designs and implements a low-precision dynamic floating-point training system to reduce the space occupied by a neural network model. Inspired by mixed-precision training, the system can assign different floating-point precisions to different network nodes during training, which removes redundant precision and shrinks the model. Within the same memory budget, it can therefore train a larger neural network.

Finally, an iris species classification task is used to verify the cross-platform capability of the Vulkan-based deep learning framework, and a handwriting recognition task is used to verify the memory usage and model accuracy of the low-precision dynamic floating-point training system. The test results show that the framework designed in this thesis is better than the popular PyTorch in cross-platform support. Compared with a float32-precision neural network, the model produced by the low-precision dynamic floating-point training system reduces space usage by a factor of three with only a small loss of accuracy.
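
The abstract names loop blocking as one of the techniques used to accelerate GEMM. The thesis's Vulkan compute shaders are not reproduced here. As a rough CPU-side illustration only, the Python sketch below shows the general idea of loop blocking: the three matrix loops are split into tiles so that each small block of the operands is reused before moving on. The function names `matmul_naive` and `matmul_blocked` and the `tile` parameter are hypothetical and are not taken from VTorch.

```python
def matmul_naive(a, b):
    """Reference triple-loop product of two matrices stored as nested lists."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                c[i][j] += a[i][p] * b[p][j]
    return c


def matmul_blocked(a, b, tile=2):
    """Same product, computed tile by tile (loop blocking).

    `tile` is a hypothetical tuning parameter; in a real kernel it would be
    chosen so that one tile of each operand fits in fast memory.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Outer loops walk over tile origins.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                # Inner loops stay inside one tile; min() handles ragged edges.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(matmul_blocked(a, b))
```

Both functions accumulate the products for each output element in the same order, so they return identical results; only the traversal order over the output and the reuse pattern differ. On a GPU, the analogous step would typically stage each tile in on-chip memory shared by a workgroup, but how VTorch does this is not described in the abstract.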
Keywords/Search Tags: deep learning framework, compute shader, Vulkan, cross platform, dynamic float