
Memory Pool #270

Open · kaushikcfd wants to merge 47 commits into master
Conversation

@kaushikcfd

This PR implements a memory pool for ViennaCL's OpenCL backend.

Brief overview of the implementation details:

  • The code defining the memory pool is mostly pulled from PyOpenCL's memory pool implementation, with the appropriate license notices retained.
  • Classes that allocate memory, such as vector_base, now take an additional template parameter that determines the handle type of the allocated memory. The handle type can be either viennacl::ocl::mem_handle<cl_mem> or viennacl::ocl::pooled_clmem_handle (see the usage sketch after this list). This breaks backward compatibility, but I expect there are few cases in which user-facing code deals with memory handles directly.
  • The few changes that need to be propagated to PETSc can be seen in this commit.
  • The temporaries involved in linalg/vector_operations.hpp are now allocated through a pooled handle.
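
For illustration, here is a minimal usage sketch of the new template parameter, using the two handle type names from the list above and assuming the handle type is passed as the second template argument of viennacl::vector (as in the vector_base diff further down); the exact parameter order and defaults are determined by the PR, so treat this as a sketch only:

// Sketch only (not from the PR): assumes VIENNACL_WITH_OPENCL is defined
// and a default OpenCL context has been set up.
#include "viennacl/vector.hpp"

void example()
{
  // Plain OpenCL memory handle -- the previous behaviour.
  viennacl::vector<double, viennacl::ocl::mem_handle<cl_mem> > x(1 << 18);

  // Pooled handle: buffers are served from the memory pool and returned
  // to it on destruction instead of being released right away.
  viennacl::vector<double, viennacl::ocl::pooled_clmem_handle> y(1 << 18);
}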

These temporary allocations accounted for a substantial share of the runtime when vector operations were called from PETSc. The following table shows the timings (in ms) before and after introducing the pooled memory handle.

Operation   Before (ms)   After (ms)
VecNorm     0.294         0.036
VecMDot     0.741         0.506

Details of the test:

  • VecNorm on a vector of length 2^18.
  • VecMDot on vectors of length 2^18, involving 30 inner products.
  • Tests run on a machine with an Nvidia Titan V.

I have attached the files for the tests, along with their makefiles.
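
For orientation, the shape of the VecNorm test is roughly as below; this is a hypothetical reconstruction (error checking omitted), and the actual sources and makefiles are in the attachment:

// Hypothetical reconstruction of the VecNorm timing test; the real
// sources are in the attached timings.tar.gz.
#include <petscvec.h>

int main(int argc, char **argv)
{
  PetscInitialize(&argc, &argv, NULL, NULL);

  Vec x;
  VecCreate(PETSC_COMM_WORLD, &x);
  VecSetSizes(x, PETSC_DECIDE, 1 << 18);  // vector of length 2^18
  VecSetType(x, VECVIENNACL);             // ViennaCL-backed vector type
  VecSet(x, 1.0);

  PetscReal nrm;
  VecNorm(x, NORM_2, &nrm);  // each call allocated a temporary before this PR

  VecDestroy(&x);
  PetscFinalize();
  return 0;
}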

Attachments: timings.tar.gz

@kaushikcfd (Author)

cc @inducer

@karlrupp (Collaborator) commented Jan 2, 2019

Thanks, @kaushikcfd !

While a memory pool is one way to solve the issue you have encountered with PETSc, I'm afraid this PR is way too intrusive (yet backend-specific) for achieving the actual goal of eliminating the impact of temporaries you've seen in PETSc.

Please let me give some more thought to a more concise (yet equally powerful) fix that immediately carries over to the CUDA backend as well.

@kaushikcfd (Author)

@karlrupp: Thanks for taking a look. This can be extended to the CUDA backend as well: PyCUDA and PyOpenCL share the same memory pool implementation, so we would only need to add another memory pool allocator class, which would involve minimal changes; a rough sketch follows.
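
A minimal sketch of what such an allocator class might look like, modelled on the PyCUDA/PyOpenCL pool design in which the pool obtains its pointer type from the allocator; the class name and interface below are assumptions, not code from this PR:

// Hypothetical CUDA allocator for the pool (illustrative only).
#include <cuda_runtime.h>
#include <cstddef>
#include <new>

class cuda_allocator
{
public:
  typedef void *      pointer_type;
  typedef std::size_t size_type;

  // Called by the pool when no suitably sized free block is available.
  pointer_type allocate(size_type size) const
  {
    void *ptr = 0;
    if (cudaMalloc(&ptr, size) != cudaSuccess)
      throw std::bad_alloc();
    return ptr;
  }

  // Called by the pool when it decides to actually release a block.
  void free(pointer_type ptr) const { cudaFree(ptr); }
};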

@inducer left a comment

@karlrupp I agree that we should preserve backend independence. One thing I would like to know is what you think of memory pools in general. You mention a different way of being able to avoid expensive temporary allocation--I would be curious to hear what you have in mind.

@@ -466,8 +502,8 @@ namespace backend


/** @brief Copies data of the provided 'DataType' from 'handle_src' to 'handle_dst' and converts the data if the binary representation of 'DataType' among the memory domains differs. */
-template<typename DataType>
+template<typename DataType, typename H = viennacl::ocl::handle<cl_mem> >
 void typesafe_memory_copy(mem_handle const & handle_src, mem_handle & handle_dst)

How come this defaults to the OpenCL handle?

@@ -100,19 +100,19 @@ struct zero_vector : public scalar_vector<NumericT>
*
* @tparam NumericT The floating point type, either 'float' or 'double'
*/
-template<class NumericT, typename SizeT /* see forwards.h for default type */, typename DistanceT /* see forwards.h for default type */>
+template<class NumericT, typename OCLHandle, typename SizeT /* see forwards.h for default type */, typename DistanceT /* see forwards.h for default type */>

As in the definition, OCLHandle -> HandleImpl?

@@ -69,6 +69,8 @@
#include "viennacl/meta/enable_if.hpp"
#include "viennacl/version.hpp"

#include "CL/cl.h"

Can this be avoided?

@@ -86,6 +86,7 @@ inline memory_types default_memory_type(memory_types new_memory_type) { return d
* Instead, this class collects all the necessary conditional compilations.
*
*/
+template <typename OCLHandle>

  • Suggest renaming OCLHandle -> HandleImpl?

Maybe

template <typename OCLHandle =
#if defined(VIENNACL_WITH_OPENCL)
   stuff
#elif defined(VIENNACL_WITH_CUDA)
   stuff
#endif
>
class mem_handle_with_impl
{ ... };

typedef mem_handle_with_impl<> mem_handle;

This might also avoid having to add many of the angle brackets?
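
Concretely, the suggestion might be filled in along these lines; the default types chosen here are illustrative assumptions, not part of the PR:

// Illustrative fill-in of the sketch above; the concrete defaults and the
// CUDA handle name are hypothetical.
#if defined(VIENNACL_WITH_OPENCL)
typedef viennacl::ocl::handle<cl_mem> default_handle_impl;
#elif defined(VIENNACL_WITH_CUDA)
typedef viennacl::cuda::handle default_handle_impl;  // hypothetical type
#endif

template <typename HandleImpl = default_handle_impl>
class mem_handle_with_impl
{ /* ... */ };

// Existing code that spells plain 'mem_handle' keeps compiling unchanged,
// which is how the extra angle brackets are avoided:
typedef mem_handle_with_impl<> mem_handle;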


+private:
+  typedef uint32_t bin_nr_t;
+  typedef std::vector<cl_mem> bin_t;

The original PyOpenCL memory pool was not backend-specific--in fact, it was used identically (same header file!) in PyCUDA. How come cl_mem can't be obtained from the allocator type as in the original pool?
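
For reference, the allocator-parameterized pattern being referred to looks roughly like this (names illustrative, modelled on the PyOpenCL/PyCUDA pool); the pool derives its pointer type from the allocator, so no cl_mem is hard-coded:

// Sketch of a backend-agnostic pool: the pointer type comes from the
// allocator template parameter instead of being fixed to cl_mem.
#include <cstdint>
#include <vector>

template <typename Allocator>
class memory_pool
{
public:
  typedef typename Allocator::pointer_type pointer_type;
  typedef typename Allocator::size_type    size_type;

private:
  typedef uint32_t bin_nr_t;
  typedef std::vector<pointer_type> bin_t;  // rather than std::vector<cl_mem>
};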

@@ -0,0 +1,137 @@
+// Various odds and ends

This shouldn't be necessary. Particularly, we shouldn't add our own error classes (but use VCL's existing ones instead).
