-
Notifications
You must be signed in to change notification settings - Fork 110
Home
ailm32442型机器人 edited this page Nov 13, 2024
·
5 revisions
- You can directly input system prompts and user prompts on the node, or use
system prompt input
anduser prompt input
to input, accepting string type inputs.system input
is generally used to mount mask nodes. Essentially, it is no different from an input box. - The large model node can also accept the output of tool nodes from the
tools
interface and accept string inputs from thefile_content
interface. These inputs will be used as the model's knowledge base, and relevant content will be searched and input into the model based on word vector similarity. - The
is_memory
of the large model node can determine whether the large model has memory. You can changeis_memory
todisable
, then run it, and the model will clear the previous conversation records. Switch back toenable
, and the model will retain your conversation records in subsequent runs. - You can view the model's response in the current round of conversation through
assistant_response
, and you can also view the history of multiple rounds of conversation throughhistory
. - Even if external parameters remain unchanged, the large model node will always run because the large model always has different answers to the same question.
- Input:
-
is_tools_in_sys_prompt
: Determines whether the information of tools will be input into the system prompt. If input into the system prompt, it can unlock tool capabilities for some models without tool capabilities. -
is_memory
: When enabled, the LLM gains memory. If not enabled, it will clear memory each time and start over. -
is_locked
: When you do not change any parameters, it directly returns the result of the last conversation, saving computing power and stabilizing the output results of the LLM. -
main_brain
: Determines whether the large model is the model interfacing with the user. When disabled, the LLM node can be used as a tool for another LLM node. -
conversation_rounds
: Determines the number of conversation rounds for the LLM. When the number of rounds is exceeded, only the most recent conversation rounds will be read. -
historical_record
: Can load previous conversation records into the LLM to continue the last chat. -
tools
: Input is the tool call interface of the LLM, and tool output is the interface for using this LLM node as a tool, generally not used. -
Imgbb api key
is optional. If you use visual functions and do not fill in this key, it will be transmitted to OpenAI in base64 encoding. If you add a key, it will generate a URL after uploading to the image bed and pass the URL to OpenAI. Not filling it will not affect usage, but it will affect the readability of the conversation record.
-
- Output:
-
assistant_response
: Text output of the LLM -
history
: Conversation records of the LLM -
Tool
: Enabled when the LLM is used as a tool for another LLM. In most cases, it can be ignored. -
Image
: Under construction, useful in the future.
-
- The LLM adapts to GPT-4's visual functions. You can input the
imgbb_api_key
into the imgbb API key. After filling it in, your image will be passed to GPT in URL format. If not filled, it will be passed in image encoding format. - The large model node can customize the model name, API_KEY, and base_url. Currently, only OpenAI type API interface calls are supported. It can be combined with One API to connect to any large model API.
- The local LLM loader node has been greatly adjusted, and you no longer need to choose the model type yourself. The
llava
loader node andGGUF
loader node have been re-added. The model type on the local LLM model chain node has been changed to three options:LLM
,VLM-GGUF
, andLLM-GGUF
, corresponding to directly loading LLM models, loading VLM models, and loading GGUF format LLM models. Support for VLM models and GGUF format LLM models has been restored. Now local calls can be compatible with more models! Example workflows:LLM_local
,llava
,GGUF
. - Fill in the model's project folder in
model_name_or_path
, compatible with all models that can be compatible with transformers. You can also fill in the repo ID on Hugging Face to directly pull the model. - The remaining parameters are consistent with the API LLM nodes.
- ckpt_path and clip_path should be filled with the absolute paths of the LLM’s GGUF file and the CLIP’s GGUF file, respectively.
-
max_ctx
is the maximum context length of the LVM model. If this length is exceeded, the model will automatically truncate. -
gpu_layers
is the number of layers of the LVM model on the GPU. -
n_threads
is the number of threads of the LVM model on the CPU.
- Same as above, but only the absolute path of the LLM’s GGUF file needs to be filled in.
- Similar to the LLM local loader, but it only supports models like llama3.2-vision. When using this node to load, you need to set the model type on the model chain to LVM (testing). This loader is still in testing and cannot adapt to many models.
- The
file_content
node can input a string, which will be used as the input of the word embedding model. The model will search this string and return the most relevant text content based on the question. -
k
is the number of paragraphs returned.chuck_size
is the size of each text block when splitting the text, with a default of 200.chuck_overlap
is the overlap size between each text block when splitting the text, with a default of 50. - Input
embedding_path
to call the word embedding model in this folder.
- The path to read the file is in comfyui_LLM_party/file, you can put the file you want to read in this path, and then fill in the file name in this node.
- You can choose absolute path input, in which case path can accept an absolute path.
- The output is a string that contains all the text information in the file.
- The adapted file formats are: ".docx", ".txt", ".pdf", ".xlsx", ".csv", ".py", ".js", ".java", ".c", ".cpp", ".html", ".css", ".sql", ".r", ".swift"
- folder_path can accept an absolute path of a folder, and this node will automatically read all the files in the folder.
- The output is a string that contains all the text information in the folder.
- The adapted file formats are: ".docx", ".txt", ".pdf", ".xlsx", ".csv", ".py", ".js", ".java", ".c", ".cpp", ".html", ".css", ".sql", ".r", ".swift"
- Can convert all web page content in a url into an md format output.
- The output is a string that contains all the text information on the web page.
- Can return all content related to the question in Wikipedia.
- Load model names from
config.ini
to facilitate model selection on the loader.
- Return the most relevant text paragraphs based on the question.
-
k
is the number of paragraphs returned.chuck_size
is the size of each text block when splitting the text, with a default of 200.chuck_overlap
is the overlap size between each text block when splitting the text, with a default of 50.
- Return the content of the Excel file row by row, facilitating row-by-row processing. Each time
comfyui
runs, it will return the next row's content. - Can be used with
comfyui
's auto-execution to iteratively batch process your work. - Input the absolute path of the Excel file you want to process in
path
. -
is_reload
determines whether to reset the returned row count.
- Segment the input content and return it segment by segment, facilitating segment-by-segment processing. Each time
comfyui
runs, it will return the next segment's content. - Can be used with
comfyui
's auto-execution to iteratively batch process your work. - Input the text content you want to process in
file_content
. -
is_reload
determines whether to reset the returned segment count.
- Return images from the folder specified in
folder_path
one by one. - Can be used with
comfyui
's auto-execution to iteratively batch process your work. -
is_reload
determines whether to reset the returned index. - Supports image formats ".png", ".jpg", ".jpeg", ".gif", ".bmp".
- Input your
google_api_key
andcse_id
to use this node to search for relevant content based onkeyword
. - Control
paper_num
to turn pages and view later search results. - In web search mode, this node will return the top 10 URLs and summaries from Google search.
- In image search mode, this node will return the top 10 image URLs from Google search.
- Input your
bing_api_key
to use this node to search for relevant content based onkeyword
. - Control
paper_num
to turn pages and view later search results. - In search mode, this node will return the top 10 URLs and summaries from Bing search.
- In image search mode, this node will return the top 10 image URLs from Bing search.
- You can use this persona node as the system_prompt_input of the LLM node, allowing the large model to have the personality of the persona.
- LLM will classify user_prompt according to the categories described on the classifier persona node.
- Can be used in conjunction with classifier functions to output different categories of text to different workflows.
- prompt is the system_prompt_input that will be input into the LLM node, which can contain some variables, such as: "You are an intelligent customer service about {app}"
- prompt_template contains the corresponding rules for the variables in the prompt, generally in json format, which can be filled in as follows: {"app":"chatgpt"}, at this time, {app} in the prompt will be automatically replaced with chatgpt.
- Can return a preset persona persona, which can be used as the system_prompt_input of the large model, allowing the large model to have the personality of the persona.
- The persona folder contains the persona of the image prompt assistant and DAN. You can add more personas to this folder for your use.
- Translates
language_A
tolanguage_B
, translating theuser_prompt
into the corresponding language and then returning the translated content. -
tone
is the tone, which can be freely specified. -
degree
is the degree of translation, ranging from 0 to 10, with the tone becoming progressively stronger.
- Can split the string processed by the LLM with a classifier persona into multiple strings, which can be used in conjunction with string logic to control the execution of the corresponding workflow.
- A model with a high level of intelligence is required to achieve stable classification output. The author of this node does not recommend using it and suggests using string logic and string extraction nodes instead.
- option contains the following options: "A contain B", "A not contain B", "A relate to B", "A not relate to B", "A equal B", "A not equal B", "A is null", "A is not null" for selection.
- When the condition is true, if will output A string, else will output an empty string, is_true will output true, is_false will output false, otherwise else will output A string, if will output an empty string, is_true will output false, is_false will output true.
- Can directly display the input string on the comfyui interface.
- Send the input string to WeCom, DingTalk, or Feishu. You need to configure the webhook addresses for Feishu, DingTalk, and WeCom in advance.
- Extract the desired content from the input string.
substring
will return the content between the first occurrence ofstart_string
andend_string
. -
remaining_string
will return the other characters after removing the content between the first occurrence ofstart_string
andend_string
. - You can reuse this node to repeatedly extract multiple substrings that meet the criteria from the string.
- Convert the input string to speech. You need to configure the OpenAI
api_key
in advance. -
voice
can control the voice tone. You can refer to the official OpenAI documentation.
- Play the input audio file. The input is the absolute path of an audio file.
- Convert the code output by the Omost model into conditioning and mask.
-
mode
includes three different fusion modes:greedy
is the greedy algorithm,fusion
first fuses conditioning and then fuses the mask by averaging according to weights before decoding, andblock
first fuses conditioning and mask by blocks before decoding. -
strength
is the weight of the mask.
- Facilitate users to output relevant code according to their needs. The output can be copied into the code generated by Omost for replacement.
- Press
press_key
to start listening to audio. Pressrelease_key
to stop listening and return the path of the recorded audio file.
- Convert the input audio file to text. You need to configure the OpenAI
api_key
in advance.
- The node will replace
old_string
ininput_string
withnew_string
and return the replaced string.
- Convert the input string to speech. Although it is a free interface, it only supports Chinese and is prone to errors, so it is not recommended.
- Fill in the URL of the website to be accessed.
- Fill in the
api_key
of the API. - Fill in the parameters of the API. You need to use the parameter dictionary function to input.
- Convert the input
key
andvalue
into a dictionary to facilitate the use of the API function. -
value
can be a string, dictionary, or list. You can construct any JSON dictionary you want by combining the list and dictionary-related nodes in the combination node.
- Convert the input text to string output.
- Connect the two
any
interfaces to any connection line in the workflow. - The node will exit the model from memory when it is executed.
1. Node Input
Input Name | Description |
---|---|
text | Text to be converted to speech |
model_path | Path to the TTS model |
save_path | Path to save the audio file |
seed | Seed for fixed voice tone |
temperature | Effect similar to LLM |
top_P | Effect similar to LLM |
top_K | Effect similar to LLM |
enableRefine | Whether to enable optimization |
oral_param | Parameter to control the degree of orality when optimization is enabled |
laugh_param | Parameter to control the degree of laughter when optimization is enabled |
break_param | Parameter to control the length of pauses when optimization is enabled |
is_enable | Whether to enable this node |
load_mode |
HF: Download model from Huggingface custom: Call model from model_path local: Directly call the model file in the current workspace path (usually the root directory of ComfyUI) |
2. Node Output
Output Name | Description |
---|---|
audio | Path to save the audio file |
- show_json_file: Read the JSON file and output it as a string
- value_by_key: Get the value corresponding to the key in the JSON file by setting the parameter key
Get the value corresponding to the key in the string by setting the parameter key (the string must be in JSON format, otherwise it cannot be parsed)
The node will split the string based on the input character sep and convert the text to a JSON formatted string
Note: There are two modes for reading history 1. auto: Automatically extract messages sent between two runs of this node. If it keeps calling in this mode and the developer backend call passes, it may be because your local time is not synchronized 2. fixed_time_diff: Extract messages within the time_diff_sec before the time of running this node
Prerequisite Work
- Create a Feishu Application
- Create a self-built application at https://open.feishu.cn/app?lang=zh-CN
- Enter the application to get app_id and app_secret
- Add application capabilities -> Enable bot application capabilities
- Permission management -> Messages and groups -> Select the required permissions to open
- Security settings -> Add the IP of the computer running ComfyUI to the whitelist
- Publish the bot to make the application effective
- Get group or user id
- First, add the created bot to the group or private chat
- Find the developer documentation for sending messages on the Feishu development platform
- Click to get the token on the right
- Select receive id type, chat_id corresponds to the group, open_id and user_id correspond to individuals, click to select members, and copy the corresponding id
- If you need the bot to send voice messages, you need to install ffmpeg on your computer
- Used to combine multiple strings into one string.
- These combination nodes can be nested.
- Used to combine multiple tool nodes into one tool node, then input to the large model.
- These combination nodes can be nested.
- Used to combine multiple parameter dictionary functions into one parameter dictionary function, then input into the API function.
- These combination nodes can be nested.
- Used to combine multiple elements into one list, then input into the parameter dictionary function.
- These combination nodes can be nested.
- Used to combine multiple lists into one list, then input into the parameter dictionary function.
- These combination nodes can be nested.
- Used for querying time and weather. The time tool node can change the default time zone for queries, and the weather tool node will also add an option to change the default region in the future (this free weather tool is limited to searching for weather in China).
- The AccuWeather tool node requires an AccuWeather API key to query global weather.
- You can use this node by entering your Google API key and CSE ID.
- This node will return the top 10 URLs and summaries from Google search. You can ask the model to paginate to see later search results.
- You can enter the URL you want to search into this node as the default URL for the model's search. This node will convert all content of this webpage into Markdown format and return it to the model.
- Since requests are not omnipotent, some URLs may not allow crawling, and this project does not provide malicious crawler code.
- You can input an embedding_path, which will call the word embedding model in this folder. The tool will split the webpage content according to chuck_size and chuck_overlap, returning only the text information related to the user's question.
- If there is no embedding_path, it will return all content of the webpage.
- The specific function is the same as the word embedding model node, but this node is a tool for LLM, which can be called when the model thinks it needs to query the knowledge base, using the file_content input into this tool.
- Allows the large model to generate Python code, run it automatically, and obtain the results of the code execution.
- Currently, only supports Python code.
- Allows the large model to do anything. The large model will download the required third-party libraries in a virtual environment and then execute the generated code.
- Please use this tool carefully, as the large model will gain the ability to control your computer to do anything!
- Fill in the URL that needs to be accessed.
- The first text input box is for entering the function of this API tool, for example: a tool for checking the weather.
- The second text input box is for entering the parameters of this API tool, for example: the parameters for the weather-checking tool are: {"city":"beijing"}
- Allows the large model to query content related to Wikipedia.
- You can input an embedding_path, which will call the word embedding model in this folder. The tool will split the Wikipedia content according to chuck_size and chuck_overlap, returning only the text information related to the user's question.
- If there is no embedding_path, it will return the first 1000 characters related to Wikipedia.
- Allows the large model to query relevant research papers on arXiv.
- The default search is for the research direction in the query.
- Use a
comfyui
workflow as a tool. - This node is under construction and may not work properly.
- Allow the large model to access GitHub repositories. You need to input the GitHub
APIkey
.
- Allow the large model to send messages to WeCom, DingTalk, and Feishu. You need to configure the webhook addresses for Feishu, DingTalk, and WeCom in advance.
- Allow the large model to access the database using keyword technology for quick RAG.
-
k
is the number of paragraphs returned.chuck_size
is the size of each text block when splitting the text, with a default of 200.chuck_overlap
is the overlap size between each text block when splitting the text, with a default of 50.
- Interact with JSON files in the
file
folder to achieve text-based interactive games.
- Allow the LLM to interact with the knowledge graph.
- The developer version can query, add, delete, and modify entities and relationships in the knowledge graph. The regular version can only query entities and relationships in the knowledge graph.
- The user version can only query entities and relationships in the knowledge graph and cannot add, delete, or modify entities and relationships in the knowledge graph.
- Allow the large model to query the current day of the week in any time zone.
- You can change the default time zone for the query. The default time zone is Asia/Shanghai.
- You can use these two nodes to define the start and end points of the workflow, placing your workflow in the workflow subfolder of this project folder.
- Click setup_streamlit_app.bat in the project folder, and in the Streamlit interface, click settings and replace it with your workflow.
- This way, you quickly build an AI application with Streamlit as the frontend.
- Both of these nodes have a dialog_id, connecting the dialog_id to make them a dialogue archive point. When you need to loop-connect two large models, although it cannot be achieved in comfyui, you can save the output of the latter model locally, and then pass it to the previous model the next time it runs. You can call comfyui with the comfyui API in other frontends, and as long as you keep calling, you can see the infinite self-dialogue of the two models.
- The start dialogue node has an interface for starting a dialogue, which can serve as a prompt given by the user at the start of a dialogue, guiding the large model to discuss within the topic given by the user.
- Allows your workflow to call other workflows!
- Add start workflow and end workflow nodes to the beginning and end of the workflow you want to embed, and save this workflow as an API in the workflow_api folder of the comfyui_LLM_party project.
- Open another workflow, use the workflow switch node in this workflow, select the workflow you want to embed, and it's done.
- The first time you use the workflow switch node, another 8189 port will be opened, please do not close this new console.
- These nodes have been modified by me so that they will not error out when input is missing, but will be bypassed instead. I think this feature is very important, but I don't know why comfyui doesn't write this feature. This feature can control which part of the workflow is executed by controlling whether there is input to the workflow.