I am a postdoctoral researcher exploring the integration of large language models into engineering design grammars to make the design process more adaptable and efficient. This integration could significantly reduce the manual effort required for grammar generation and design synthesis.
Leveraging Design Grammars and LLMs for Automated Engineering Design
I am investigating how large language models (LLMs) can develop design grammars from high-level task specifications. Specifically, I leverage existing grammar metrics such as validity, matching, and diversity ratios to iterate on and refine the generation of graph and string grammar rules. The current approach separates grammar generation from design exploration, but future work aims to create a multi-agent LLM framework in which a grammar generation agent and a design synthesis agent collaborate to develop and apply grammars for comprehensive system design.
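A minimal sketch of this metric-driven refinement loop is given below. The `llm_propose` callable and the three metric callables are placeholders assumed for illustration, not part of any published implementation.

```python
# Minimal sketch of a metric-driven refinement loop for LLM-generated grammar
# rules. llm_propose(task_spec, feedback) and the metric callables are
# hypothetical placeholders supplied by the caller.

def refine_grammar(task_spec, llm_propose, metrics, max_iters=5,
                   targets=(0.9, 0.8, 0.5)):
    """Iterate on LLM-generated grammar rules until the validity, matching,
    and diversity ratios reach their targets or the iteration budget runs out."""
    validity, matching, diversity = metrics
    feedback, rules = "", []
    for _ in range(max_iters):
        # Ask the LLM for a rule set, conditioning on prior metric feedback.
        rules = llm_propose(task_spec, feedback)
        scores = (validity(rules), matching(rules, task_spec), diversity(rules))
        if all(s >= t for s, t in zip(scores, targets)):
            break  # grammar is ready to hand off to design exploration
        # Fold the scores back into the next prompt so the LLM can repair
        # invalid, mismatched, or redundant rules.
        feedback = ("validity={:.2f}, matching={:.2f}, diversity={:.2f}; "
                    "revise the weakest rules.".format(*scores))
    return rules
```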
Existing multi-fidelity reinforcement learning (RL) frameworks rely on hierarchical models, ignoring the heterogeneous error distributions found across the design space. We introduce ALPHA (Adaptively Learned Policy with Heterogeneous Analyses), a framework that dynamically leverages non-hierarchical, heterogeneous low-fidelity models alongside a high-fidelity model. Specifically, low-fidelity policies and their experience data are used dynamically for efficient, targeted learning, guided by their alignment with the high-fidelity policy.
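One possible reading of the alignment-guided mechanism is sketched below, assuming policies are callables mapping states to actions and that alignment is measured by action agreement on a set of probe states. This is an illustrative interpretation, not the exact ALPHA algorithm.

```python
import numpy as np

def alignment_scores(hf_policy, lf_policies, probe_states):
    """Score each low-fidelity policy by how closely its actions match the
    current high-fidelity policy on the probe states (higher = better aligned)."""
    scores = []
    for lf in lf_policies:
        diffs = [np.linalg.norm(hf_policy(s) - lf(s)) for s in probe_states]
        scores.append(1.0 / (1.0 + np.mean(diffs)))
    return np.asarray(scores)

def sample_lf_experience(buffers, scores, batch_size, rng=None):
    """Draw a batch of low-fidelity transitions, weighted by policy alignment."""
    rng = rng or np.random.default_rng()
    counts = rng.multinomial(batch_size, scores / scores.sum())
    batch = []
    for buf, n in zip(buffers, counts):
        if not buf or n == 0:
            continue
        idx = rng.choice(len(buf), size=min(n, len(buf)), replace=False)
        batch.extend(buf[i] for i in idx)
    return batch
```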
Reinforcement learning (RL) suffers from learning instability when the reward function is computed from complex engineering simulations, which diminishes its practicality for configuration design. To address this challenge, this work integrates configuration design heuristics into a deep RL framework to improve stability and converge efficiently to high-performing solutions. Specifically, we shape the reward based on symmetry, a deep-rooted heuristic that is widely applicable and frequently used in engineering design practice.
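A minimal sketch of symmetry-based reward shaping is shown below; the mirror-error measure over 2-D component positions is an illustrative choice, not necessarily the exact metric used in the work.

```python
import numpy as np

def symmetry_score(positions, axis_x=0.0):
    """Return 1.0 for a layout that is perfectly mirror-symmetric about the
    vertical line x = axis_x, decaying toward 0 as asymmetry grows."""
    pts = np.asarray(positions, dtype=float)                      # shape (n, 2)
    mirrored = pts * np.array([-1.0, 1.0]) + np.array([2 * axis_x, 0.0])
    # For each component, distance to the nearest component in the mirrored layout.
    dists = np.min(np.linalg.norm(pts[:, None, :] - mirrored[None, :, :], axis=-1), axis=1)
    return float(np.exp(-dists.mean()))

def shaped_reward(base_reward, positions, weight=0.1):
    """Add a small symmetry bonus to the simulation-based reward."""
    return base_reward + weight * symmetry_score(positions)
```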
Multi-agent reinforcement learning frameworks for job scheduling and navigation control of autonomous mobile robots are becoming increasingly common as a means of increasing productivity in manufacturing. Centralized and decentralized frameworks have emerged as the two dominant archetypes for these systems. However, the tradeoffs between these competing archetypes in terms of efficiency, stability, robustness, accuracy, generalizability, and scalability are not well understood. This work compares an exemplar decentralized RL framework against a centralized RL framework in terms of time efficiency, learning stability, and robustness to operational disruptions.
RL algorithms can autonomously learn to search a design space for high-performance solutions. However, modern engineering often entails computationally intensive simulation, which can slow design timelines. This work provides an RL framework that leverages models of varying fidelity to enable an effective solution search while reducing overall computational needs. High-quality solutions are achieved with substantial reductions in computational expense, demonstrating the framework's effectiveness for design problems where relying solely on a high-fidelity model is infeasible.
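The sketch below illustrates one simple fidelity-scheduling heuristic consistent with this idea (train on the low-fidelity model until progress plateaus, then promote to the high-fidelity model); the `agent.run_episode` interface is a hypothetical placeholder, not the framework's actual API.

```python
def has_plateaued(rewards, window=20, tol=0.01):
    """True if the mean reward over the last window has barely changed."""
    if len(rewards) < 2 * window:
        return False
    recent = sum(rewards[-window:]) / window
    prior = sum(rewards[-2 * window:-window]) / window
    return abs(recent - prior) <= tol * max(abs(prior), 1e-8)

def train_multifidelity(agent, lf_env, hf_env, total_episodes=500):
    """Run most episodes on the cheap low-fidelity model, reserving the
    expensive high-fidelity model for fine-tuning after learning plateaus."""
    env, rewards = lf_env, []
    for _ in range(total_episodes):
        rewards.append(agent.run_episode(env))   # placeholder agent interface
        if env is lf_env and has_plateaued(rewards):
            env, rewards = hf_env, []            # promote to the high-fidelity model
    return agent
```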
The design of cyber-physical systems often involves search within high-dimensional design spaces. When evaluating the performance of algorithms on such tasks, the patterns of exploration are often informative and can help support algorithm selection. However, representing these patterns in a way that is understandable to humans while preserving the nuanced search complexities of the high-dimensional space is nontrivial. This work examines approaches for visualizing the search trajectories of reinforcement learning agents, comparing and contrasting the visualizations produced by PCA, t-SNE, UMAP, TriMap, and PaCMAP.
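A comparison along these lines can be set up as sketched below, assuming the scikit-learn, umap-learn, trimap, and pacmap packages are installed; the random data is a stand-in for actual RL search trajectories.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
import umap      # umap-learn
import trimap
import pacmap

# Placeholder data: 1000 visited designs, 50 design variables per design.
X = np.random.rand(1000, 50)

# Project the same design vectors to 2-D with each method.
embeddings = {
    "PCA":    PCA(n_components=2).fit_transform(X),
    "t-SNE":  TSNE(n_components=2, perplexity=30).fit_transform(X),
    "UMAP":   umap.UMAP(n_components=2).fit_transform(X),
    "TriMap": trimap.TRIMAP().fit_transform(X),            # defaults to 2 output dims
    "PaCMAP": pacmap.PaCMAP(n_components=2).fit_transform(X),
}

for name, emb in embeddings.items():
    print(name, emb.shape)  # each (1000, 2), ready for a trajectory scatter plot
```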
Understanding constraint curricula for safe reinforcement learning
Safe reinforcement learning typically incorporates safety by imposing constraints with fixed cost limits that guide policy exploration. Motivated by the variable error tolerance humans exhibit while learning tasks, we introduce a novel curriculum over the cost limits to train agents with the PPO-Lagrangian algorithm. Empirical results using the MetaDrive driving simulator show that some cost-curriculum strategies lead to faster reward convergence while maintaining a low episodic cost. Further, we introduce additional metrics and compare the qualitative performance of different cost-curriculum strategies. Such curricula have the potential to alter policy behavior in ways suitable for self-driving scenarios.
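As an illustration, one simple cost-curriculum strategy is a linear decay of the cost limit over training; the sketch below is hypothetical, and the schedules studied in the work may differ.

```python
def cost_limit(step, total_steps, initial_limit=100.0, final_limit=25.0):
    """Interpolate the cost limit from a loose initial tolerance to the final target."""
    frac = min(step / total_steps, 1.0)
    return initial_limit + frac * (final_limit - initial_limit)

# Usage inside a PPO-Lagrangian training loop (pseudo-interface): at each update,
# the Lagrange multiplier is adjusted against the current, rather than fixed, limit.
# d = cost_limit(step, total_steps)
# lagrange_multiplier = max(0.0, lagrange_multiplier + lr_lambda * (episodic_cost - d))
```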