
Generative Design Of Trusses Based On Reinforcement Learning

Posted on: 2023-01-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: R F Luo
GTID: 1522307316453994
Subject: Civil engineering
Abstract/Summary:
Reinforcement learning is an area of machine learning concerned with training computers to think, judge, reason, and make decisions like humans. In 2016, AlphaGo, a reinforcement learning algorithm, defeated a human world champion at the board game Go. The algorithm first transforms the problem of finding the optimal move into a Markov decision process (MDP), and then uses Monte Carlo tree search (MCTS) to solve the complex decision-making process effectively. Inspired by this success, the thesis observes that the generative design of trusses resembles the Go problem: both have a very large solution space and complex, diverse decision logic. A truss generation decision model is therefore proposed, which abstracts the generative design of trusses into an MDP model for reinforcement learning. In the modeling process, the key elements of the MDP (state set, action set, reward mechanism, etc.) are integrated with engineering design logic and structural engineering knowledge.

To transform truss generative design into an optimal sequential decision problem, an MDP model for truss layout design is first proposed. The model combines prior knowledge of civil engineering with engineering design logic and defines the key decision elements: the sequential action set (adding nodes, adding bars, selecting bar sections), the design state set (information related to truss nodes and bars), and the reward mechanism (comprehensive evaluation and feedback on the effect of decisions). This MDP model realizes the interaction between agent and environment, which provides the modeling basis for solving the truss generative design problem.

A heuristic search framework named AlphaTruss, an MCTS-based reinforcement learning method, is proposed for solving the above MDP of truss layout design. The framework specifies the execution sequence of the three kinds of action sets and the tree search policy (comprising four basic steps: selection, expansion, simulation, and backpropagation). Based on the trade-off between exploration and exploitation, AlphaTruss dynamically and heuristically searches the decision tree to evaluate the confidence value of different actions. Guided by a modified upper confidence bound formula (T-UCB), the agent accumulates experience through a large number of repeated MCTS searches and then selects the action with the highest confidence value in each state to solve the decision problem.

To deal with continuous decision variables in the truss design process, a two-stage truss layout design algorithm named AlphaTruss-UD is proposed based on the AlphaTruss decision model. While the decision tree is expanded, the candidate action set is generated by uniformly discretizing the continuous decision interval, so that the original mix of continuous and discrete variables is unified into discrete action sets. In the first stage, the optimal truss topology is obtained at the current discretization accuracy. In the second stage, the neighborhood is subdivided based on the topology obtained in the first stage, and the decisions for the continuous variables sampled in the previous step are refined by multiple rounds of MCTS searching.

When the design problem is extended from 2D to 3D, the size of the candidate action set produced by uniform discretization grows rapidly, which makes the design problem more difficult to solve. To this end, a new MCTS-based algorithm named AlphaTruss-KR is proposed by introducing a kernel regression strategy and a progressive widening strategy. To sample actions efficiently in the continuous decision space, the Euclidean distance is used to measure the similarity between actions, and a Gaussian kernel function is used to perform nonparametric regression on that similarity. Within this sampling strategy, a new upper confidence bound formula (KR-UCB) is proposed to evaluate the confidence value of any action in the continuous decision space, covering both 2D and 3D cases. During action sampling, the algorithm gradually adds actions with high confidence values to the candidate action set.

The effectiveness of both AlphaTruss-UD and AlphaTruss-KR in generative design problems is verified by several examples, and the data and core function code of the related algorithms are open source. The examples presented in the thesis and the corresponding design results can serve as a reference for the verification and comparison of other generative design algorithms.
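
The selection step of MCTS mentioned in the abstract balances exploration and exploitation through an upper confidence bound. The abstract does not state the modified formula (T-UCB), so the sketch below uses the standard UCB1 rule as a stand-in. The function names, the exploration constant `c`, and the action labels and statistics are assumptions for illustration only.

```python
import math

def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """Standard UCB1 score. This is NOT the thesis's T-UCB,
    whose exact form is not given in the abstract."""
    if visits == 0:
        return math.inf  # unvisited actions are always tried first
    mean = total_reward / visits  # exploitation term
    bonus = c * math.sqrt(math.log(parent_visits) / visits)  # exploration term
    return mean + bonus

def select_action(stats, c=math.sqrt(2)):
    """Pick the action with the highest UCB1 score.

    stats maps a hypothetical action label -> (total_reward, visits).
    """
    parent_visits = sum(visits for _, visits in stats.values())
    return max(
        stats,
        key=lambda a: ucb1(stats[a][0], stats[a][1], parent_visits, c),
    )

# Hypothetical statistics for the three action types named in the abstract.
stats = {"add_node": (9.0, 10), "add_bar": (3.0, 4), "select_section": (0.0, 0)}
print(select_action(stats))
```

With `c = 0` the rule reduces to picking the highest mean reward; a larger `c` favors actions that have been visited less often.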
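
The two-stage idea behind AlphaTruss-UD (a coarse uniform grid first, then repeated subdivision of the neighborhood around the best candidate) can be illustrated on a single continuous variable. This is a minimal sketch under stated assumptions: the `score` callable stands in for the MCTS-based evaluation used in the thesis, and the function names, grid size `n`, and number of refinement `rounds` are hypothetical.

```python
def uniform_candidates(low, high, n):
    """Return n evenly spaced candidate values on [low, high] (n >= 2)."""
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]

def two_stage_search(score, low, high, n=5, rounds=3):
    """Coarse uniform search followed by neighborhood subdivision.

    score is a placeholder for the evaluation of a candidate; in the
    thesis this role is played by MCTS searches, not a closed-form function.
    """
    lo, hi = low, high
    # Stage 1: best candidate at the initial discretization accuracy.
    best = max(uniform_candidates(lo, hi, n), key=score)
    # Stage 2: shrink the interval to one grid cell on each side of the
    # current best value and discretize it again.
    for _ in range(rounds):
        step = (hi - lo) / (n - 1)
        lo, hi = max(low, best - step), min(high, best + step)
        best = max(uniform_candidates(lo, hi, n), key=score)
    return best

# Toy objective with its maximum at x = 1.3 (an assumed example).
x = two_stage_search(lambda x: -(x - 1.3) ** 2, 0.0, 4.0)
print(x)
```

Each refinement round halves the grid spacing here, so the number of candidates per round stays fixed while the resolution improves. In several dimensions a single fine uniform grid grows much faster, which is the difficulty the abstract raises for the 3D case.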
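
The kernel regression and progressive widening strategies behind AlphaTruss-KR can be sketched as follows. The abstract does not give the KR-UCB formula, so the score below is one generic kernel-weighted form chosen for illustration, not the thesis's definition. The function names, the `bandwidth`, and the widening parameters `k` and `alpha` are likewise assumptions.

```python
import math

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian similarity based on the Euclidean distance between actions."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))  # squared Euclidean distance
    return math.exp(-d2 / (2.0 * bandwidth ** 2))

def kernel_estimates(action, samples, bandwidth=1.0):
    """Kernel-weighted value and effective visit count for `action`.

    samples is a list of (sampled_action, mean_value, visits) tuples.
    """
    weights = [gaussian_kernel(action, b, bandwidth) * n for b, _, n in samples]
    density = sum(weights)
    if density == 0.0:
        return 0.0, 0.0  # no sampled action is close enough to inform this one
    value = sum(w * v for w, (_, v, _) in zip(weights, samples)) / density
    return value, density

def kr_ucb_score(action, samples, c=1.0, bandwidth=1.0):
    """Illustrative kernel-regression UCB; an assumed form, not the thesis's KR-UCB."""
    value, density = kernel_estimates(action, samples, bandwidth)
    if density == 0.0:
        return math.inf  # unexplored region: maximal exploration bonus
    total = sum(kernel_estimates(b, samples, bandwidth)[1] for b, _, _ in samples)
    return value + c * math.sqrt(max(math.log(total), 0.0) / density)

def may_widen(num_children, parent_visits, k=1.0, alpha=0.5):
    """Progressive widening: allow a new action while children < k * visits**alpha."""
    return num_children < k * parent_visits ** alpha

# Two hypothetical 2D node positions with their mean rewards and visit counts.
samples = [((0.0, 0.0), 1.0, 5), ((3.0, 4.0), 0.0, 5)]
print(kernel_estimates((1.5, 2.0), samples))
print(may_widen(num_children=1, parent_visits=4))
```

Because the kernel works on coordinate tuples of any length, the same code applies to 2D and 3D actions. Progressive widening keeps the candidate set small early on and lets it grow as the parent node accumulates visits.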
Keywords/Search Tags: intelligent design method, truss layout, generative design, reinforcement learning, Markov decision process, Monte Carlo tree search, kernel regression