Add ability to modify DALL·E 3 generated images #723

jeffpaul · 2024-02-21T22:41:59Z

Is your enhancement related to a problem? Please describe.

With the update in #717, we can now start to take advantage of some of the enhancements in the DALL·E 3 model. Let's look to extend our Generate images modal (e.g. /wp-admin/upload.php?action=classifai-generate-image) to add in a Modify image action link that further accepts textarea input which is then sent back to DALL·E 3.

When we receive the originally generated image(s) back from DALL·E 3 we'll want to ensure we capture the seed ID for each as that'll be needed for the modification step. We'll capture the desired modifications from the user via a textarea, then send that back to DALL·E 3 (e.g. "Modify image [#] with seed [seed-ID]: "), and then render the newly modified image(s) in place of the first image(s).

Designs

In this rough mockup, I'm recommending we first tweak our primary button and secondary action link such that the first two options on generated images are [Insert to Post] and then _Import to Media Library_. A third option, _Modify image_, would, when clicked, present a textarea (with "add / remove / change image by..." placeholder text to help guide the user on what to enter) and a [Generate modifications] button. Clicking that button would send the image seed ID and described modifications back to DALL·E 3 and then replace the prior image(s) with the newly modified image(s).

Describe alternatives you've considered

There's some consideration for what we do with the original image prompt and modification text (especially if there are multiple modifications made). We currently show the original prompt in the initial textarea in the Generate images modal, so perhaps we continue to append the modification text to it? For the alt text that's added to any imported/inserted generated image, we could attempt to include that same prompt+modification(s) text (or, more simply, stick with the original prompt text).
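
As a rough illustration of the "original prompt plus modification(s)" idea, here is a minimal sketch of how the combined text might be built. The function name `compose_prompt` and the sentence-joining format are assumptions made for illustration only; nothing here is existing ClassifAI code or a format required by DALL·E 3.

```python
def compose_prompt(original: str, modifications: list[str]) -> str:
    """Join the original prompt and each modification into one prompt string.

    Hypothetical helper: blank entries are skipped and the order is preserved.
    """
    parts = [p.strip() for p in [original, *modifications] if p.strip()]
    if not parts:
        return ""
    # Drop trailing periods so the joined text has no doubled punctuation.
    return ". ".join(p.rstrip(".") for p in parts) + "."


combined = compose_prompt("A red barn at sunset", ["add a horse", " remove the fence. "])
```

The same string could serve as both the text shown in the modal textarea and the alt text, which would keep the two from drifting apart after several rounds of modification.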

Code of Conduct

I agree to follow this project's Code of Conduct


faisal-alvi · 2025-01-29T13:55:30Z

@jeffpaul @dkotter 👋🏻

After reviewing the DALL·E API documentation, I’d like to share some insights and limitations regarding the requested enhancement:

No Seed ID for Generated Images
- Currently, the DALL·E API does not provide any seed ID with generated images. Even if seed IDs were available, I couldn’t find any endpoint in the documentation that allows modifications to a generated image using seed IDs.
Existing Image Modification Options
- However, the API provides a way to edit an existing image using inpainting, but it has specific requirements:
  - The image to edit must be a valid PNG file, less than 4MB, and square.
  - The mask must define the transparent areas where changes should be applied (e.g., fully transparent areas in the mask indicate the editable parts). This mask must also be a valid PNG, less than 4MB, and have the same dimensions as the image.
  - Along with the image and mask, the user can provide a new prompt specifying changes for the transparent areas.
This approach works only for partial edits (inpainting) and is relatively limited in terms of freeform modifications.
Image Variations
- Another available option is generating variations of an image. This requires the original image to be passed as input and results in new variations without user-provided prompts. This is not suitable for implementing user-driven modifications since it doesn’t allow specifying changes.
Iteration Through Prompts
- The last option is iterating through prompts. However, this isn’t a new feature; it’s essentially what users can already do by editing the prompt and clicking the existing generate button.
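
To make the inpainting requirements listed above concrete (valid PNG, less than 4MB, square, mask with the same dimensions as the image), here is a minimal sketch of a pre-upload check. It reads only the PNG signature and IHDR header, so it does not fully validate the file and does not inspect mask transparency. The names `png_dimensions` and `check_edit_inputs` are hypothetical, and treating "4MB" as 4 * 1024 * 1024 bytes is an assumption.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_BYTES = 4 * 1024 * 1024  # assumed meaning of "less than 4MB"


def png_dimensions(data: bytes) -> tuple[int, int]:
    """Return (width, height) read from the PNG IHDR chunk."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a valid PNG file")
    # Width and height are big-endian 32-bit integers at bytes 16-23.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def check_edit_inputs(image: bytes, mask: bytes) -> list[str]:
    """Collect human-readable problems; an empty list means the checks pass."""
    problems = []
    dims = {}
    for name, data in (("image", image), ("mask", mask)):
        if len(data) >= MAX_BYTES:
            problems.append(f"{name} must be smaller than 4MB")
        try:
            dims[name] = png_dimensions(data)
        except ValueError:
            problems.append(f"{name} is not a valid PNG")
    if "image" in dims and dims["image"][0] != dims["image"][1]:
        problems.append("image must be square")
    if len(dims) == 2 and dims["image"] != dims["mask"]:
        problems.append("mask must have the same dimensions as the image")
    return problems
```

Even with a check like this in place, the user would still need to supply a mask, which is the main usability cost of the inpainting route.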

Given these limitations, it seems that what the enhancement requests (e.g., modifying images directly using seed IDs or without requiring manual upload/masks) is not currently supported by the DALL·E API.

Would love to hear your thoughts or suggestions on how we could approach this, considering the current API constraints.

dkotter · 2025-01-29T17:30:39Z

In addition to the points @faisal-alvi makes, looking at the API documentation again, I think the biggest issue here is that DALL·E 3 only supports image generation, not edits or variations (these are only supported by DALL·E 2 which we no longer support as of #717). So until this is changed by OpenAI, I don't think there is any action we can take here.

That said, I do think there are additional improvements we can make to the image generation process. I've documented those in a separate issue here: #440 (comment) and assigned that to @faisal-alvi

jeffpaul added the type:enhancement label Feb 21, 2024

jeffpaul added this to the 3.1.0 milestone Feb 21, 2024

jeffpaul added this to Open Source Practice Feb 21, 2024

github-project-automation bot moved this to Incoming in Open Source Practice Feb 21, 2024

jeffpaul moved this from Incoming to To Do in Open Source Practice Feb 21, 2024

Sidsector9 self-assigned this Mar 19, 2024

Sidsector9 moved this from To Do to In Progress in Open Source Practice Mar 19, 2024

Sidsector9 removed their assignment Apr 1, 2024

Sidsector9 moved this from In Progress to To Do in Open Source Practice Apr 1, 2024

dkotter modified the milestones: 3.1.0, 3.2.0 Jul 12, 2024

dkotter modified the milestones: 3.2.0, 3.3.0 Dec 11, 2024

jeffpaul removed the type:enhancement label Jan 7, 2025

dkotter mentioned this issue Jan 29, 2025

OpenAI Image Generation Tweaks #440

Closed

1 task

dkotter modified the milestones: 3.3.0, 3.4.0 Feb 19, 2025
