To excel at engineering design, generative AI must learn to innovate

From MIT News:

ChatGPT and other deep generative models are proving to be uncanny mimics. These AI supermodels can churn out poems, finish symphonies, and create new videos and images by automatically learning from millions of examples of previous works. These enormously powerful and versatile tools excel at generating new content that resembles everything they’ve seen before.

But as MIT engineers say in a new study, similarity isn’t enough if you want to truly innovate in engineering tasks.

“Deep generative models (DGMs) are very promising, but also inherently flawed,” says study author Lyle Regenwetter, a mechanical engineering graduate student at MIT. “The objective of these models is to mimic a dataset. But as engineers and designers, we often don’t want to create a design that’s already out there.”

He and his colleagues make the case that if mechanical engineers want help from AI to generate novel ideas and designs, they will have to first refocus those models beyond “statistical similarity.”

“The performance of a lot of these models is explicitly tied to how statistically similar a generated sample is to what the model has already seen,” says co-author Faez Ahmed, assistant professor of mechanical engineering at MIT. “But in design, being different could be important if you want to innovate.”

In their study, Ahmed and Regenwetter reveal the pitfalls of deep generative models when they are tasked with solving engineering design problems. In a case study of bicycle frame design, the team shows that these models end up generating new frames that mimic previous designs but falter on engineering performance and requirements.

When the researchers presented the same bicycle frame problem to DGMs that they specifically designed with engineering-focused objectives, rather than only statistical similarity, these models produced more innovative, higher-performing frames.

The team’s results show that similarity-focused AI models don’t quite translate when applied to engineering problems. But, as the researchers also highlight in their study, with some careful planning of task-appropriate metrics, AI models could be an effective design “co-pilot.”

“This is about how AI can help engineers be better and faster at creating innovative products,” Ahmed says. “To do that, we have to first understand the requirements. This is one step in that direction.”

. . . .

As Ahmed and Regenwetter write, DGMs are “powerful learners, boasting unparalleled ability” to process huge amounts of data. DGM is a broad term for any machine-learning model that is trained to learn the distribution of a dataset and then use it to generate new, statistically similar content. The enormously popular ChatGPT is one type of deep generative model known as a large language model, or LLM, which incorporates natural language processing capabilities into the model to enable the app to generate realistic imagery and speech in response to conversational queries. Other popular models for image generation include DALL-E and Stable Diffusion.
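As a rough illustration of "learn a distribution, then sample from it," the sketch below fits a single Gaussian to one made-up design parameter and draws new values from it. This is only a toy stand-in: real DGMs use deep neural networks over many dimensions, and the parameter name, the numbers, and the choice of a Gaussian are all assumptions made for this example, not anything taken from the study.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical training data: seat-tube lengths (mm) of existing frames.
training = [rng.gauss(520.0, 15.0) for _ in range(1000)]

# "Training": estimate the parameters of the data distribution.
mu = statistics.mean(training)
sigma = statistics.stdev(training)

# "Generation": draw new samples from the learned distribution.
generated = [rng.gauss(mu, sigma) for _ in range(1000)]

print(f"training mean:  {mu:.1f} mm")
print(f"generated mean: {statistics.mean(generated):.1f} mm")
```

The generated values cluster around the same mean as the training data, which is the point: a model whose only objective is to match the data distribution has no built-in reason to produce anything unlike what it has already seen.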

Because of their ability to learn from data and generate realistic samples, DGMs have been increasingly applied in multiple engineering domains. Designers have used deep generative models to draft new aircraft frames, metamaterial designs, and optimal geometries for bridges and cars. But for the most part, the models have mimicked existing designs without improving on their performance.

“Designers who are working with DGMs are sort of missing this cherry on top, which is adjusting the model’s training objective to focus on the design requirements,” Regenwetter says. “So, people end up generating designs that are very similar to the dataset.”

In the new study, he outlines the main pitfalls in applying DGMs to engineering tasks, and shows that the fundamental objective of standard DGMs does not take into account specific design requirements. To illustrate this, the team invokes a simple case of bicycle frame design and demonstrates that problems can crop up as early as the initial learning phase. As a model learns from thousands of existing bike frames of various sizes and shapes, it might consider two frames of similar dimensions to have similar performance, when in fact a small disconnect in one frame — too small to register as a significant difference in statistical similarity metrics — makes the frame much weaker than the other, visually similar frame.
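The gap between "statistically similar" and "performs similarly" can be sketched with two made-up frames. The example below is not the metric or the validity check used in the study; the parameter names, the dimensions, the relative Euclidean distance, and the `is_connected` check are all hypothetical, chosen only to show how a tiny difference in a similarity score can hide a failed requirement.

```python
import math

# Hypothetical frame parameters in millimetres; names and values are illustrative only.
FRAME_A = {"top_tube": 560.0, "seat_tube": 520.0, "down_tube": 630.0, "joint_gap": 0.0}
FRAME_B = {"top_tube": 560.0, "seat_tube": 520.0, "down_tube": 630.0, "joint_gap": 2.0}


def relative_distance(a, b):
    """Euclidean distance between two frames, divided by the overall size of frame a."""
    diff = math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))
    scale = math.sqrt(sum(v ** 2 for v in a.values()))
    return diff / scale


def is_connected(frame, tolerance_mm=0.1):
    """Toy requirement check: the tubes must actually meet at the joint."""
    return frame["joint_gap"] <= tolerance_mm


similarity_gap = relative_distance(FRAME_A, FRAME_B)
print(f"relative distance between frames: {similarity_gap:.4f}")
print("frame A connected:", is_connected(FRAME_A))
print("frame B connected:", is_connected(FRAME_B))
```

Under these assumptions the two frames differ by a fraction of a percent in parameter space, yet only one of them passes the connectivity requirement. A similarity-only objective would treat them as near-identical, while a requirement-aware check separates them immediately.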

Link to the rest at MIT News