After deploying the ui-tars-2B model locally with Ollama according to the GUI Model Deployment Guide, I ran the Python script and got garbled text like: and,: , and orc Uran is, c Australia? C : ? True ,否则.
It seems this problem also causes the zero-output problem when using ui-tars-desktop, judging from the app's log:
[2025-01-23 16:02:56.695] [info] (main) [vlmParams_images_len]: 1
[2025-01-23 16:02:56.697] [info] (main) [resizeFactor] maxPixels 1058400 currentPixels 1821369 resizeFactor 0.7623000448470658
[2025-01-23 16:02:56.846] [info] (main) [preprocessResizeImage] width: 1301 height: 813 size: 62.60KB
[2025-01-23 16:02:56.847] [info] (main) vlmBaseUrl http://localhost:11434/v1 vlmApiKey ollama
[2025-01-23 16:03:12.460] [info] (main) [vlm_invoke_time_cost]: 15612ms
[2025-01-23 16:03:12.460] [info] (main) [ui_tars_vlm_response_content] 懒陷入了, on right onceberry, ...(Omission)
[2025-01-23 16:03:12.460] [info] (main) [nl2Command] body {"prediction":" 懒陷入了, on right onceberry, ...(Omission)
[2025-01-23 16:03:12.461] [info] (main) [nl2Command] parsed []
[2025-01-23 16:03:12.461] [info] (main) [emitData] status running
[2025-01-23 16:03:12.461] [info] (main) ======data======
[] { size: { width: 1707, height: 1067 } } {
from: 'gpt',
value: '懒陷入了, on right onceberry, this you right right him right,, right,趁更何况,, right when,你耻, ...(Omission)
timing: { start: 1737619376694, end: 1737619392461, cost: 15767 },
reflections: []
}
========
[2025-01-23 16:03:12.462] [info] (main) [parsed] [] [parsed_length] 0
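(As a side note, the resize step in the log looks fine: the logged resizeFactor appears to match sqrt(maxPixels / currentPixels) for the 1707x1067 screenshot, so the garbling does not seem to come from image preprocessing. A minimal check of that assumption, reproducing the logged numbers:

import math

max_pixels = 1058400           # from the [resizeFactor] log line
current_pixels = 1707 * 1067   # = 1821369, matches currentPixels in the log
resize_factor = math.sqrt(max_pixels / current_pixels)
print(resize_factor)           # ~0.76230, matches the logged resizeFactor
# rounding here is an assumption; the app may floor instead
print(round(1707 * resize_factor), round(1067 * resize_factor))  # ~1301 813, matches preprocessResizeImage
)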
And this is the Python script I used:
import base64

from openai import OpenAI

deployment = "ollama"
instruction = "click the start menu"
screenshot_path = r"C:\Gianmeng\Code\Mass\UI-TARS\Screenshots\screenshot.jpg"

assert deployment in ["ollama", "hf"]

if deployment == "ollama":
    client = OpenAI(
        base_url="http://127.0.0.1:11434/v1/",
        api_key="ollama",  # not used
    )
    # the model name created via the ollama CLI; you can check it with: `ollama list`
    model = "ui-tars"
else:
    client = OpenAI(base_url="<endpoint url>", api_key="<huggingface access tokens>")
    model = "tgi"

prompt = "You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task. \n\n ## Output Format\n ```\n Action_Summary: ...\n Action: ...\n ```\n\n ## Action Space\n click(start_box=‘<|box_start|>(x1,y1)<|box_end|>’)\nlong_press(start_box=‘<|box_start|>(x1,y1)<|box_end|>’, time=‘’)\ntype(content=‘’)\nscroll(direction=‘down or up or right or left’)\nopen_app(app_name=‘’)\nnavigate_back()\nnavigate_home()\nWAIT()\nfinished() # Submit the task regardless of whether it succeeds or fails.\n\n ## Note\n - Use English in `Action_Summary` part.\n\n\n ## User Instruction\n"

# encode the screenshot as base64 for the image_url payload
with open(screenshot_path, "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read()).decode("utf-8")

response = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt + instruction},
                {"type": "image_url", "image_url": {"url": f"data:image/jpg;base64,{encoded_string}"}},
            ],
        },
    ],
)
print(response.choices[0].message.content)
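The endpoint itself does respond (the request above returns content, just garbled). As an extra sanity check, assuming Ollama's OpenAI-compatible model listing endpoint (/v1/models) is available in my version, I also confirmed that the model name I pass matches what `ollama list` shows:

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:11434/v1/", api_key="ollama")  # api_key is not used by Ollama
for m in client.models.list():
    print(m.id)  # the name passed as `model=` must match one of these, e.g. "ui-tars"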
(Summary in English of the above: according to the GUI Model Deployment Guide, I deployed ui-tars-2B locally with Ollama, but when running the verification in step 3, the Python script's output was all garbled. From the ui-tars-desktop log, it looks like this problem also causes ui-tars-desktop to output 0 when running.)