[BUG] Could not run DirectML models on Intel laptop #70

Open

gyagp opened this issue Jan 4, 2025 · 6 comments
Labels
🐛bug (Something isn't working) · 😎enhancement · help wanted (Extra attention is needed)

Comments


gyagp commented Jan 4, 2025

Describe the bug
My laptop doesn't have a discrete GPU, only an Intel integrated GPU (Meteor Lake). However, I have 32 GB of system memory, and the Intel iGPU can use up to half of it. The application does not allow downloading DirectML models.

To Reproduce
Steps to reproduce the behavior:

  1. Go to 'Samples'
  2. None of the DirectML-based models can be downloaded

Expected behavior
DirectML-based models should run on integrated GPUs.

Screenshots
(screenshot attached)

Please complete the following information:
Self-built from GitHub

Additional context
NA

gyagp added the 🐛bug (Something isn't working) label Jan 4, 2025

Pinguin2001 commented Jan 7, 2025

+1 from my side

This is caused by your BIOS preallocating only ~64 MB of RAM to your iGPU; all other memory is allocated dynamically.

I have an RX 6400 with 4 GB of VRAM, yet the AI Dev Gallery detects it as 3.9 GB of VRAM, so the check fails.

For now, you can patch the check yourself by returning a fake value from

`public static ulong GetVram()`

After patching this single file, Phi 3 Mini runs fine on my system.
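For anyone trying the same workaround, the patch might look roughly like the sketch below. This is an illustration only: the enclosing class name and the hard-coded constant are assumptions, and only the `GetVram()` signature comes from the thread.

```csharp
// Illustrative sketch of the workaround above (not the actual repository code).
// Only the GetVram() signature is taken from the thread; the class name and the
// hard-coded value are assumptions.
public static class DeviceUtils
{
    public static ulong GetVram()
    {
        // The real implementation queries the adapter's dedicated video memory.
        // Returning a fixed value (here 8 GB) makes the minimum-VRAM check pass
        // on machines whose dedicated VRAM is reported as too small.
        return 8UL * 1024 * 1024 * 1024;
    }
}
```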


nmetulev commented Jan 7, 2025

I agree we need to enable DML models on integrated GPUs. We won't have time to work on this during the current month but can add it next month; if anyone wants to pick it up before then, please feel free to submit a PR.

nmetulev added the help wanted (Extra attention is needed) and 😎enhancement labels Jan 7, 2025

BobLd commented Jan 7, 2025

Related to #47 I guess

Having a "download anyway" button could be an easy fix

Pinguin2001 commented

@nmetulev
What do you think of the idea of allowing shared memory? Instead of checking the maximum available dedicated video memory, we could check the maximum available shared video memory.

Unless the user has a dGPU with enough dedicated VRAM, the system has to fall back to shared memory anyway.

If this idea is fine with the team, I will create a PR.
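For illustration, the proposed check could reduce to something like the sketch below. All names here are hypothetical; the two sizes would in practice come from the GPU adapter description, which reports dedicated video memory and shared system memory separately.

```csharp
// Hypothetical sketch of the proposed check; the class, method, and parameter
// names are invented for illustration.
public static class MemoryCheck
{
    public static bool MeetsRequirement(ulong requiredBytes, ulong dedicatedVramBytes, ulong sharedVramBytes)
    {
        // Pass if either pool is large enough: a dGPU with enough dedicated VRAM,
        // or an iGPU whose shared pool (often up to half of system RAM) suffices.
        return dedicatedVramBytes >= requiredBytes || sharedVramBytes >= requiredBytes;
    }
}
```

On the Meteor Lake laptop from the original report, this would compare the model size against roughly 16 GB of shared memory instead of the ~64 MB BIOS preallocation.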


nmetulev commented Jan 8, 2025

Good point. I had decided to check only dedicated VRAM because I noticed that language models have degraded performance when they overflow to shared memory, and in some cases it would cause blue screens. This is likely a DML bug.

However, that should not be the case when using only shared VRAM on an integrated GPU, so we should be able to check either one: if there is no dGPU, check the shared VRAM.

And I like the suggestion from @BobLd of having a "download anyway" option where the user is presented with a warning and agrees to the risks.
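A rough sketch of how the combined behavior could look, under the assumption that all type and member names below are invented for illustration: prefer dedicated VRAM when a dGPU is present, fall back to shared memory on integrated GPUs, and otherwise let the user download anyway after an explicit warning.

```csharp
// Hypothetical sketch only; none of these names exist in the AI Dev Gallery code.
public enum DownloadDecision { Allowed, AllowedWithWarning }

public static class DownloadPolicy
{
    public static DownloadDecision Evaluate(ulong requiredBytes, bool hasDiscreteGpu,
                                            ulong dedicatedVramBytes, ulong sharedVramBytes)
    {
        // dGPU with enough dedicated VRAM: no risk of overflowing into shared memory.
        if (hasDiscreteGpu && dedicatedVramBytes >= requiredBytes)
            return DownloadDecision.Allowed;

        // Integrated GPU: dedicated VRAM is tiny by design, so judge by the shared pool.
        if (!hasDiscreteGpu && sharedVramBytes >= requiredBytes)
            return DownloadDecision.Allowed;

        // Neither pool is clearly large enough: surface a warning about degraded
        // performance (and possible instability) and let the user opt in.
        return DownloadDecision.AllowedWithWarning;
    }
}
```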

Thoughts?


hansmbakker commented Jan 17, 2025

> when they overflow to shared memory, and in some cases, it would cause blue screens. This is likely a DML bug.

I created microsoft/DirectML#683. Feel free to add to it.

@nmetulev Is this bug reported and tracked in Microsoft already? If not, can you please ensure that it is?

Lots of models can't be run on the GPU because of this issue (dedicated VRAM is often too small), and running on the CPU is considerably slower, so having this fixed would be really welcome.
