[`Research Project`] Add AnyText: Multilingual Visual Text Generation And Editing #8998

tolgacangoz · 2024-07-28T18:23:59Z

Thanks for the opportunity to fix #6407!

AnyText comprises a diffusion pipeline with two primary elements: an auxiliary latent module and a text embedding module. The former takes inputs such as the text glyph, position, and masked image and generates latent features for text generation or editing. The latter uses an OCR model to encode stroke data as embeddings, which are blended with the image caption embeddings from the tokenizer so that the generated text integrates seamlessly with the background. Training uses a text-control diffusion loss and a text perceptual loss to further improve writing accuracy.

Paper: AnyText: Multilingual Visual Text Generation And Editing
Repository: https://github.com/tyxsspa/AnyText
Hugging Face Space: modelscope/AnyText

TODOs:
⏳ AuxiliaryLatentModule
✅ AnyTextControlNetModel -> Inherited and adapted from ControlNetModel. The only difference is that it uses a Glyph Block, a Position Block, and a Fuse Block instead of the input_hint_block (controlnet_cond_embedding) of an ordinary ControlNet; that is, only ControlNetConditioningEmbedding differs. I deactivated the ControlNetConditioningEmbedding part and moved the new blocks into AuxiliaryLatentModule just to comply with the figure.
⏳ AnyTextPipeline -> Adapted from StableDiffusionControlNetPipeline.
⏳ TextEmbeddingModule -> Replaces the encode_prompt() function. I may transfer what TextEmbeddingModule does into encode_prompt().
✅ convert_anytext_to_diffusers.py
⏳ Verify outputs with the original implementation
⏳ Finish HF integration & upload converted checkpoints to HF
⏳ README.md
⬜ Make it as simple as possible, but not simpler
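
As a rough illustration of how the Glyph Block, Position Block, and Fuse Block mentioned above could fit together, here is a minimal PyTorch sketch. It is not the code in this PR: the class name `AuxiliaryLatentSketch`, the helper `downsample_stack`, the channel counts, the number of downsampling steps, and the toy input sizes are all assumptions chosen for illustration.

```python
import torch
from torch import nn


def downsample_stack(in_channels: int, hidden: int, num_down: int) -> nn.Sequential:
    """Hypothetical helper: conv stack that halves the spatial size `num_down` times."""
    layers = [nn.Conv2d(in_channels, hidden, kernel_size=3, padding=1), nn.SiLU()]
    for _ in range(num_down):
        layers += [nn.Conv2d(hidden, hidden, kernel_size=3, padding=1, stride=2), nn.SiLU()]
    return nn.Sequential(*layers)


class AuxiliaryLatentSketch(nn.Module):
    """Illustrative stand-in for the auxiliary latent module (all sizes are assumptions)."""

    def __init__(self, latent_channels: int = 4, hidden: int = 32, out_channels: int = 64, num_down: int = 3):
        super().__init__()
        # Glyph and position images are single-channel maps in this sketch.
        self.glyph_block = downsample_stack(1, hidden, num_down)
        self.position_block = downsample_stack(1, hidden, num_down)
        # The fuse block mixes glyph features, position features, and the masked-image latent.
        self.fuse_block = nn.Conv2d(2 * hidden + latent_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, glyphs: torch.Tensor, positions: torch.Tensor, masked_latent: torch.Tensor) -> torch.Tensor:
        glyph_feat = self.glyph_block(glyphs)
        pos_feat = self.position_block(positions)
        # All three tensors must share the latent's spatial size before concatenation.
        fused = torch.cat([glyph_feat, pos_feat, masked_latent], dim=1)
        return self.fuse_block(fused)


module = AuxiliaryLatentSketch()
glyphs = torch.rand(2, 1, 64, 64)        # rendered text glyphs (toy resolution)
positions = torch.rand(2, 1, 64, 64)     # text position masks (toy resolution)
masked_latent = torch.rand(2, 4, 8, 8)   # latent of the masked image (toy resolution)
features = module(glyphs, positions, masked_latent)
print(features.shape)
```

The fused feature map would then play the role that the conditioning embedding plays in an ordinary ControlNet.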

This commit adds improvements to the modify_prompt method in the AnyTextPipeline class. The method now handles special characters and replaces selected string prompts with a placeholder. Additionally, it includes a check for Chinese text and translation using the trans_pipe.
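
The behaviour described above can be sketched roughly as follows. This is a simplified, standalone illustration rather than the method in this PR: the `*` placeholder, the curly-quote normalisation, the `contains_chinese` helper, and the `translate` callable (standing in for `trans_pipe`) are assumptions.

```python
import re
from typing import Callable, List, Optional, Tuple

PLACEHOLDER = "*"  # assumed placeholder token; the actual value may differ


def contains_chinese(text: str) -> bool:
    """Hypothetical helper: True if any character is in the CJK Unified Ideographs block."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


def modify_prompt(
    prompt: str,
    translate: Optional[Callable[[str], str]] = None,
) -> Tuple[str, List[str]]:
    """Extract double-quoted strings and replace each with a placeholder."""
    # Normalise curly quotes so both quote styles are handled (an assumed
    # reading of "handles special characters").
    prompt = prompt.replace("\u201c", '"').replace("\u201d", '"')
    pattern = r'"(.*?)"'
    texts = re.findall(pattern, prompt)
    # A callable replacement avoids any escape handling of the placeholder.
    prompt = re.sub(pattern, lambda match: f" {PLACEHOLDER} ", prompt)
    # If the remaining caption contains Chinese, hand it to the translator.
    if translate is not None and contains_chinese(prompt):
        prompt = translate(prompt)
    return prompt, texts


caption, texts = modify_prompt('A sign that says "Hello" and "World"')
print(caption, texts)
```

Only the caption is translated in this sketch; the quoted strings are returned unchanged so they can be rendered as glyphs.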


HuggingFaceDocBuilderDev · 2024-11-06T00:50:52Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

github-actions · 2024-11-30T15:04:15Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.


tolgacangoz added 30 commits July 28, 2024 21:19

Add initial template

6e8088f

Second template

98c2d6e

feat: Add TextEmbeddingModule to AnyTextPipeline

867bbbf

feat: Add AuxiliaryLatentModule template to AnyTextPipeline

8818372

Merge branch 'main' into Add-AnyText

37c46d8

Add bert tokenizer from the anytext repo for now

64c63eb

Fill in the forward pass of AuxiliaryLatentModule

e9c688c

make style && make quality

42a41d0

`chore: Update bert_tokenizer.py with a TODO comment suggesting the use of the transformers library`

9d50f80

Merge branch 'main' into Add-AnyText

22aa69a

Update error handling to raise and logging

5e1e515

Add create_glyph_lines function into TextEmbeddingModule

2d10f0c

make style

bc197a9

Up

e52d8cc

Up

8c69d83

Up

4a413aa

Up

571608b

Merge branch 'main' into Add-AnyText

2b1b50d

Remove several comments

a7d025f

refactor: Remove ControlNetConditioningEmbedding and update code accordingly

d2c5a65

Merge branch 'main' into Add-AnyText

c1f538c

Up

2607b6b

Up

a9fe4a0

up

567f553

refactor: Update AnyTextPipeline to include new optional parameters

a9991d0

Merge branch 'main' into Add-AnyText

e69c51e

up

91252e0

Merge branch 'main' into Add-AnyText

e54f876

feat: Add OCR model and its components

b9164e3

tolgacangoz and others added 15 commits September 27, 2024 18:57

Merge branch 'main' into Add-AnyText

44a3a70

Add attribution and adaptation information to pipeline_anytext.py

56992d1

Update usage example

7ad6865

Will refactor controlnet_cond_embedding initialization

a5edca5

Merge branch 'main' into Add-AnyText

48e88eb

Add AnyTextControlNetConditioningEmbedding template

2f42e40

Refactor organization

670fef5

style

930c37a

Merge branch 'main' into Add-AnyText

923da7b

style

21c0c35

Move custom blocks from AuxiliaryLatentModule to `AnyTextControlNetConditioningEmbedding`

c4db96a

Merge branch 'main' into Add-AnyText

e2e7160

Merge branch 'main' into Add-AnyText

4335ebd

Follow one-file policy

6bd0b4c

style

b3f98a7

github-actions bot added the `stale` label (Issues that haven't received updates) Nov 30, 2024

tolgacangoz added 3 commits December 13, 2024 18:23

Merge branch 'main' into Add-AnyText

cccf0f4

Merge branch 'main' into Add-AnyText

b5856a6

Merge branch 'main' into Add-AnyText

b04d015

github-actions bot removed the `stale` label (Issues that haven't received updates) Jan 1, 2025

tolgacangoz and others added 8 commits January 6, 2025 09:24

Merge branch 'main' into Add-AnyText

67f8839

Merge branch 'main' into Add-AnyText

d75508e

[Docs] Update README and pipeline_anytext.py to use AnyTextControlNetModel

75a0f1f

[Docs] Update import statement for AnyTextControlNetModel in pipeline_anytext.py

d3dcf57

[Fix] Update import path for ControlNetModel, ControlNetOutput in anytext_controlnet.py

963fac0

Merge branch 'main' of github.com:huggingface/diffusers into Add-AnyText

0c94143

Refactor AnyTextControlNet to use configurable conditioning embedding channels

2b6f08b

Merge branch 'main' into Add-AnyText

971d6ad
