Keras_patch GPU instead of TPU #9

Open

vince-lynch opened this issue Nov 6, 2019 · 0 comments

Hi,
I've been using the generation GCE setup you recommended (works wonderfully with Python 2.7), but I haven't been able to get finetuning to work. I appreciate that's a totally different setup.

But on Google Compute, with 2x 16GB P100s, 8x CPU, and 30GB RAM, Nick Walton's multi-GPU pull request to salesforce/ctrl looks like it should work.
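(As a sanity check for anyone reproducing this: the stock TF 1.x device-listing call below shows what TensorFlow can actually see; both P100s should come back as /device:GPU:0 and /device:GPU:1. Nothing here is ctrl-specific.)

# Plain TF 1.x sanity check: list the devices TensorFlow can see,
# along with their reported memory limits in bytes.
from tensorflow.python.client import device_lib

for d in device_lib.list_local_devices():
    print(d.name, d.device_type, d.memory_limit)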

I'm still getting OOM issues.

They've fixed Python 3 not being supported in recent changes to ctrl, so I've tried reinstalling the whole setup with Python 3 (3.5) and 2.7. Here's the tail of the allocator log:

2019-11-06 11:33:42.236866: I tensorflow/core/common_runtime/bfc_allocator.cc:818] total_region_allocated_bytes_: 15928269056 memory_limit_: 15928269210 available bytes: 154 curr_region_allocation_bytes_: 31856538624
2019-11-06 11:33:42.236876: I tensorflow/core/common_runtime/bfc_allocator.cc:824] Stats:
Limit: 15928269210
InUse: 15928269056
MaxInUse: 15928269056
NumAllocs: 4013
MaxAllocSize: 1262254080

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104 Driver Version: 410.104 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla P100-PCIE... Off | 00000000:00:04.0 Off | 0 |
| N/A 45C P0 28W / 250W | 0MiB / 16280MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla P100-PCIE... Off | 00000000:00:05.0 Off | 0 |
| N/A 48C P0 30W / 250W | 0MiB / 16280MiB | 7% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
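
For what it's worth, the only allocator knob I know of in TF 1.x is the per-session GPU options below. Whether ctrl's Estimator-based training script actually lets you pass a session config through is an assumption on my part (this is just the standard tf.estimator.RunConfig API), but for completeness:

import tensorflow as tf

# Minimal sketch, assuming a TF 1.x Estimator path: let the BFC allocator
# grow GPU memory on demand instead of reserving the full ~16GiB up front.
# Wiring this into ctrl's training script is an assumption on my part; the
# script may build its RunConfig differently.
session_config = tf.ConfigProto(allow_soft_placement=True)
session_config.gpu_options.allow_growth = True

run_config = tf.estimator.RunConfig(
    model_dir="ctrl_finetune_dir",   # hypothetical output directory
    session_config=session_config,
)

Even with growth enabled, the allocator log above shows essentially the whole 15.9GB limit in use, so I suspect this would only change when the OOM happens, not whether it happens.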

Anyway, this is more asking for your help than anything: how come I can get generation to work on the same setup that won't allow finetuning?

Many thanks
Vince.
