Press "Enter" to skip to content

Dr. Shreyas Subramanian on three generative AI models that can be used for concept design

Concept design is the process of creating a preliminary design for a product or system, and it typically takes place in the early stages of a product’s development. During concept design, designers and other team members brainstorm new ideas or iterate on old concepts. Everything from the design of a new iPhone to the next version of a car goes through concept design. Although this interview focuses on the automotive industry and car design, the same ideas apply to other products and industries as well.

Car design goes through several stages: concept art in the form of hand-drawn sketches, 3D rendering and modeling, and finally clay modeling, digital sculpting and fine-tuning. BMW and Skoda, for example, have published articles describing this process.

Meanwhile, we are in the midst of a generative design revolution: car companies are beginning to redesign cars with generative design, through software that already incorporates AI models. Generative AI is a rapidly evolving technology with the potential to transform the way we design things. Although it is still in its early stages, we already see several extremely impressive examples of how it can be used to create stunningly realistic designs. In this article, we look at some ways that generative AI can be used to design cars, with the help of Dr. Shreyas Subramanian, who works in the field of generative design, optimization and AI.

Hi Dr. Subramanian, this is a very interesting topic. Before we dive in, can you tell us a little bit about yourself?

Of course! Thank you for this opportunity. My name is Shreyas Subramanian, and I currently work on Machine Learning projects at Amazon. My research areas over the past few years have been large-scale Machine Learning and Optimization. During my PhD, I applied Machine Learning techniques to optimization, using them to solve multi-objective problems that led to more efficient aircraft designs.

That sounds fascinating! How has your work changed since, and what is your focus now?

What I did during my PhD was a good foundation for my current work in applying AI/ML techniques to optimization and to solving large-scale problems. One topic I am still very passionate about is concept design, and how we can use AI to design things. Design is a very complex process, and the people who do this for a living are extremely talented. Currently, I am interested in seeing how generative design techniques can help in this process.

What is the state-of-the-art in generative design today?

Several research and production applications already use generative design techniques, and we are now at the point where AI can be applied to this problem as well. Generative design techniques are an excellent way to automate parts of this process and assist human designers in generating, pruning and improving ideas. Most designs start as a sketch, either on paper or digitally, and the best solutions are usually those that evolve from sketches into something more refined. The intersection with AI is what you may have already seen in work from leading institutions like OpenAI and StabilityAI on generative AI models. These models can take a text input and generate an image of a concept.

How exactly does that work?

Deep Learning models can take a set of inputs (here, the input is text) and be trained to generate images. “Training” is an iterative process: thousands of text-image pairs are fed to the model, the model’s output images are compared with the ground-truth images in the dataset, and the model’s parameters are adjusted so that its outputs come closer to the ground truth. The model thereby learns to produce images that look like the ones in the training dataset. From this trained model, we can then take new text inputs, or “prompts”, and generate new images that correspond to that text. I’m leaving out a lot of details here, but these concepts are fairly well described in the deep learning literature.
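To make this training loop concrete, here is a deliberately simplified PyTorch sketch. Everything in it is a placeholder of my own: random tensors stand in for text embeddings and ground-truth images, and a tiny feed-forward network stands in for the generator. A real system like Stable Diffusion uses a text encoder, a U-Net denoiser and a diffusion objective rather than a plain reconstruction loss.

# Illustrative skeleton of the iterative training process described above.
# The network, data and loss are simplified placeholders, not a real
# text-to-image architecture.
import torch
import torch.nn as nn

text_dim, image_pixels = 64, 32 * 32 * 3

model = nn.Sequential(              # placeholder "generator": text embedding -> flattened image
    nn.Linear(text_dim, 256),
    nn.ReLU(),
    nn.Linear(256, image_pixels),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

text_embeddings = torch.randn(128, text_dim)          # stand-in for encoded text prompts
ground_truth_images = torch.rand(128, image_pixels)   # stand-in for the paired images

for step in range(100):
    generated = model(text_embeddings)                # generate images from the text inputs
    loss = loss_fn(generated, ground_truth_images)    # measure how far they are from ground truth
    optimizer.zero_grad()
    loss.backward()                                   # adjust parameters to reduce the mismatch
    optimizer.step()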

Can you show me an example? How about a model that can be trained to generate car images?

Sure! In fact, there are a few models that are generic enough to be useful for a variety of use cases, including generating car design images. For instance, let’s take a look at Stable Diffusion (https://stability.ai/blog/stable-diffusion-public-release). Stable Diffusion is a versatile generative model that can generate images from either pure text inputs or text and image inputs. It uses a combination of components from other models to achieve this effect: 1) ClipText, a component that generates a text embedding from your input prompt; 2) a U-Net model with a custom scheduler, which successively generates better images by predicting noise amounts (this is the “diffusion” in Stable Diffusion); and 3) finally, an image decoder component that generates the image. If you are interested in the details, reading blog posts about Stable Diffusion is a great start. Let’s try the pure-text-input version of Stable Diffusion. When we feed Stable Diffusion the prompt “Realistic concept car”, here are some example images that are generated:

[Example images generated by Stable Diffusion for the prompt “Realistic concept car”. Image credit: Shreyas Subramanian]
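For readers who want to reproduce something like this themselves, here is a minimal sketch using the Hugging Face diffusers library. The specific checkpoint name, hardware assumption and number of images are illustrative choices of mine, not necessarily what was used for the images above.

# Minimal text-to-image sketch with the Hugging Face diffusers library.
# Assumes diffusers, transformers and torch are installed and a CUDA GPU is
# available; the checkpoint name is an assumption.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# Generate a few candidate images for the same prompt and save them.
images = pipe("Realistic concept car", num_images_per_prompt=4).images
for i, image in enumerate(images):
    image.save(f"concept_car_{i}.png")

Each run (and each image within a batch) starts from different random noise, which is why the results vary so much, as discussed next.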

Wow, those look great! It’s amazing to see prompt text being used to generate images like this. However, each of those images looks very different from the others. Is this expected?

Yes! The images generated by a model like Stable Diffusion will always be different. You could generate tens of thousands of images and each of them would be different. I don’t recommend doing that, though; you could go down a rabbit hole of image generation for days!

What if I wanted to generate something more specific than these random images?

That’s exactly what researchers are working on today – constrained image generation. You give the model some constraints and the model will try to adhere to them, and the resulting images will look like they fit within those constraints. For example, you can mention a color palette that you like and a model will generate images that adhere to the color palette. You can add features to images, or change its theme, or even ask the model to generate an image in the style of a famous artist.
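One simple way to express such constraints with the same kind of text-to-image pipeline shown earlier is directly in the prompt text, optionally together with a negative prompt. The constraint wording below is purely my own illustration, not a recipe from the interview.

# Constraints expressed in the prompt, using the same diffusers pipeline as before.
# The constraint wording and negative prompt are illustrative assumptions.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "Realistic concept car, matte black and gold colour palette, art deco styling",
    negative_prompt="blurry, low resolution, distorted",
).images[0]
image.save("constrained_concept_car.png")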

What if I don’t know what to type in to get the desired result?

Well, there’s a model for that (just like “there’s an app for that”). Typically you prompt-engineer what you want the model to generate, and it is normal to go through multiple rounds of iteration until you get the desired result. The data used to create these kinds of models contains well-written prompts, and there are text-to-text models that can be used to generate prompts based on your inputs. For example, using a model like MagicPrompt, you can generate descriptive prompts for the images you want to generate (try it out in this Hugging Face space – https://huggingface.co/Gustavosta/MagicPrompt-Stable-Diffusion).
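As a sketch of how that can look in code, the snippet below calls MagicPrompt through the Hugging Face transformers library. The generation settings (length, number of candidates, sampling) are assumptions of mine rather than anything prescribed by the model.

# Prompt expansion with MagicPrompt via the transformers text-generation pipeline.
# The model id comes from the Hugging Face link above; the generation settings
# are assumptions.
from transformers import pipeline

prompt_generator = pipeline(
    "text-generation", model="Gustavosta/MagicPrompt-Stable-Diffusion"
)
candidates = prompt_generator(
    "Realistic concept car",
    max_length=77,
    num_return_sequences=4,
    do_sample=True,
)
for candidate in candidates:
    print(candidate["generated_text"])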

For the input we used earlier, “Realistic concept car”, we get the following outputs as examples:

  • realistic concept car slick 5 0 mm, photorealistic, octane render, 8 k, unreal engine, ultradetailed, cinematic
  • realistic concept car 3 d render sci – fi car in the coronation of napoleon painting and digital billboard with point cloud in the middle, unreal engine 5, keyshot, octane, artstation trending, ultra high detail, ultra realistic, cinematic, 8 k, 1 6 k, in style of zaha hadid, in style of nanospace michael menzelincev, in style of lee souder, blade runner 2 0 4 9 colors, in plastic, dark, tilt shift, depth of field,
  • realistic concept car design : snufkin sashimi, dieselpunk, high – contrast, composition, bold natural volumetric golden rim light, runes, art deco, art nouveau, sticker illustration
  • realistic concept car 6 4 0 / 2 mile tall intricate chrome mass effect robotic electric vehicle with ( ( city skyline ) ) + mountains, foggy storm clouds, sunset, futuristic apocalyptic city architecture by rutkowski, artgerm, jeremy lipkin and michael garmash and rob rey, by tristan eaton, stanley kubrick, tom bagshaw, greg rutkowski, carne griffiths

One image generated corresponding to these inputs is:

Interesting! That image is definitely more detailed. How can you take this further? Can you add even more constraints to get something more specific?

Absolutely, here’s where things get more interesting. You can use a model like Instruct-Pix2Pix to modify an input image with instructions. Instruct-Pix2Pix is a generative model that can be used to edit images (see the original paper for more details – https://arxiv.org/abs/2211.09800). The training data for this model is itself generated by a large language model (GPT-3) and a text-to-image generation model (Stable Diffusion). The result is a conditional diffusion model capable of changing input images according to the instructions provided. Let’s say we started with the detailed concept car image above as input:


Using Instruct-Pix2Pix, here is what the output would look like with different instruction prompts:


  • Make it black
  • Paint it ferrari red
  • Make it more like a ferrari LaFerrari
  • Make it a black and yellow Lamborghini
  • Add a realistic nature background
  • Make it a pencil sketch
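In code, one of these edits might look like the following sketch using the diffusers library. The checkpoint name, inference steps and guidance settings are assumptions on my part, and the input file is simply whichever concept car render you want to edit.

# Instruction-based image editing with Instruct-Pix2Pix via diffusers.
# The checkpoint name and guidance settings are assumptions; the input image
# is the concept car render to be edited.
import torch
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

input_image = Image.open("concept_car.png").convert("RGB")
edited = pipe(
    "Paint it ferrari red",
    image=input_image,
    num_inference_steps=20,
    image_guidance_scale=1.5,
).images[0]
edited.save("concept_car_ferrari_red.png")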

Those are some very powerful examples of generative AI for design in action. Do you have any final thoughts? What are you going to work on next?

It has been a wonderful experience talking to you about generative AI for design. I hope this inspires others to use these tools for their own conceptual design work. For those interested in generative AI, I highly recommend following the work of leaders in this field like OpenAI, StabilityAI, Meta, Microsoft Research, and Amazon Science. The impact of this work goes well beyond car design, extending from game design and art to media, law, synthetic data generation, code generation and more.

As for me, I am going to continue working on large-scale Deep Learning and Optimization techniques, and help as many customers as possible with their AI use cases. This lets me learn and continuously improve my skills.

Thank you Dr. Subramanian! I definitely learned a lot from our conversation here and I am sure our readers will as well.

Follow Dr. Shreyas Subramanian’s work on his project page – https://projectshreyas.org/