Add ability to modify DALL·E 3 generated images #723
Comments
After reviewing the DALL·E API documentation, I’d like to share some insights and limitations regarding the requested enhancement:
Given these limitations, it seems that what the enhancement requests (e.g., modifying images directly using seed IDs, or without requiring a manual upload or mask) is not currently supported by the DALL·E API. I'd love to hear your thoughts or suggestions on how we could approach this, considering the current API constraints.
In addition to the points @faisal-alvi makes, looking at the API documentation again, I think the biggest issue here is that DALL·E 3 only supports image generation, not edits or variations (these are only supported by DALL·E 2, which we no longer support as of #717). So until OpenAI changes this, I don't think there is any action we can take here. That said, I do think there are additional improvements we can make to the image generation process. I've documented those in a separate issue here: #440 (comment) and assigned that to @faisal-alvi.
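
To make the constraint described in the comments concrete, here is a minimal Python sketch of a capability check. The table simply restates what the comment above says (generation for both models, edits and variations only for DALL·E 2); it is hard-coded rather than fetched from the API, and the names `SUPPORTED_OPERATIONS` and `supports` are hypothetical, not part of the plugin or of OpenAI's client.

```python
# Hypothetical capability table, restating the comment above.
# It is hard-coded for illustration and not queried from any API.
SUPPORTED_OPERATIONS = {
    "dall-e-2": {"generate", "edit", "variation"},
    "dall-e-3": {"generate"},
}


def supports(model: str, operation: str) -> bool:
    """Return True if the given model is listed as supporting the operation."""
    return operation in SUPPORTED_OPERATIONS.get(model, set())


if __name__ == "__main__":
    for op in ("generate", "edit", "variation"):
        print(f"dall-e-3 / {op}: {supports('dall-e-3', op)}")
```

A check like this could gate whether a modify action is offered at all, so the option only appears for a model that can actually edit images.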
Is your enhancement related to a problem? Please describe.
With the update in #717, we can now start to take advantage of some of the enhancements in the DALL·E 3 model. Let's look to extend our Generate images modal (e.g. /wp-admin/upload.php?action=classifai-generate-image) to add in a _Modify image_ action link that further accepts textarea input which is then sent back to DALL·E 3.
When we receive the originally generated image(s) back from DALL·E 3 we'll want to ensure we capture the seed ID for each as that'll be needed for the modification step. We'll capture the desired modifications from the user via a textarea, then send that back to DALL·E 3 (e.g. "Modify image [#] with seed [seed-ID]: "), and then render the newly modified image(s) in place of the first image(s).
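
As a rough illustration of the proposed message format, the following Python sketch assembles the "Modify image [#] with seed [seed-ID]: " prefix plus the user's textarea input. The function name `build_modification_prompt` and its parameters are hypothetical, and it only builds a string: per the comments on this issue, the DALL·E API does not currently support modifying an image by seed ID, so this is not a working integration.

```python
def build_modification_prompt(image_number: int, seed_id: str, modifications: str) -> str:
    """Build the proposed modification message for one generated image.

    Hypothetical helper: follows the format suggested in this issue,
    "Modify image [#] with seed [seed-ID]: <modifications>".
    """
    text = modifications.strip()
    if not text:
        # An empty textarea gives the model nothing to act on.
        raise ValueError("modifications must not be empty")
    return f"Modify image {image_number} with seed {seed_id}: {text}"


if __name__ == "__main__":
    print(build_modification_prompt(2, "12345", "add a red balloon"))
```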
Designs
In this rough mockup, I'm recommending we first tweak our primary button and secondary action link such that the first two options on generated images are [Insert to Post] and then _Import to Media Library_. A third option here for _Modify image_, when clicked, would present a textarea (with "add / remove / change image by..." placeholder text to help guide the user on what to enter) and a [Generate modifications] button. Clicking that button would send the image seed ID and described modifications back to DALL·E 3 and then replace the prior image(s) with the newly modified image(s).
Describe alternatives you've considered
There's some consideration for what we do with the original image prompt and modification text (especially if there are multiple modifications made). We currently show the original prompt in the initial textarea in the Generate images modal, so perhaps we continue to append the modification text to it? In terms of the alt text that's added to any imported/inserted generated image, we could attempt to include that same prompt plus modification(s) text (or, more lazily and easily, stick with the original prompt text).
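
The two alt text options above could be sketched as follows in Python. This is only an assumption about how the text might be combined: the helper name `build_alt_text`, the `include_modifications` flag, and the ". " separator are all hypothetical choices, not existing plugin behaviour.

```python
from typing import Iterable


def build_alt_text(
    original_prompt: str,
    modifications: Iterable[str],
    include_modifications: bool = True,
) -> str:
    """Combine the original prompt with any modification text for use as alt text.

    With include_modifications=False this falls back to the simpler option
    of using only the original prompt.
    """
    parts = [original_prompt.strip()]
    if include_modifications:
        # Skip blank entries so repeated empty submissions add nothing.
        parts.extend(m.strip() for m in modifications if m.strip())
    return ". ".join(parts)


if __name__ == "__main__":
    print(build_alt_text("A cat on a sofa", ["add a hat", "make it night"]))
```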