If you already know how member pointers work and why they're useful,
skip this paragraph and continue at the workaround. If not, then please consider the
following example: we have a vector whose members are of type
SpaceMarine
. We want to write a function which changes
some property, e.g. resets the armor value. It's tempting to write
the code in Listing 1.
While this is all nice and dandy, the obvious downside to this is that we'd end up writing a new function for each attribute that we'd want to modify, e.g. the ammunition or the bolter. A more flexible approach would be the snippet from Listing 2. It decouples the operation from which member and which type exactly to work on.
class Bolter {}; class SpaceMarine { public: double armor; int ammunition; Bolter bolter; }; void set_armor( std::vector<SpaceMarine>& vec, double newArmor) { for (auto& i: vec) vec.armor = newArmor; }
template<typename MEMBER> void set( std::vector<SpaceMarine>& vec, MEMBER SpaceMarine:: *member, const MEMBER& newValue) { for (auto& i: vec) vec.*member = newValue; }
Of course you can pass such a member pointer
to a CUDA kernel, or should, at least in theory. But you're bound to
run into an error message mentioning undefined references. The bug in
nvcc isn't exactly new and has been extensively in this
post from 2014. That post also shows a workaround which suggest to
wrap the member pointer in a template parameter of a helper class But
that only works with built-in types. In our example we'd be good with
&SpaceMarine::armor
and
&SpaceMarine::ammunition
, but
&SpaceMarine::bolter
will fail. The error message is
different from the first attempt and will be along the lines of
error: SpaceMarine is not a member of Bolter
, which is
confusing because the code doesn't state that anywhere. It's
apparently a glitch of the code generator inside nvcc writes the glue
code to bridge host and device. The solution in Listing 3
adds another template parameter to the helper code.
The example above may appear contrived, but that's just because I wanted a short and simple code to illustrate the principle. In LibGeoDecomp's IO subsystem I'm using member pointers to allow users to output arbitrary members of they simulation models without having to provide getter/setter functions or adapter classes. This functionality is encapsuled by the class Selector, which in turn relies on various filters (these differ in whether they support array members or scalar members or if they can access CUDA device memory or just host memory). An example can be studied in this test.
//========== GENERIC WRAPPER CODE =============== template<typename MEMBER, typename MEMBER_POINTER> class Wrapper { public: MEMBER_POINTER val; Wrapper(MEMBER_POINTER val) : val(val) {} }; template<typename M, typename T> Wrapper<M, char T::*> wrap(M T:: *member) { return Wrapper<M, char T::*>( reinterpret_cast<char T::*>(member)); } template<typename MEMBER, typename T> __device__ MEMBER T:: *unwrap( Wrapper<MEMBER, char T::*> w) { return reinterpret_cast<MEMBER T::*>(w.val); } //========== USER CODE ========================== class Bolter {}; class SpaceMarine { public: double armor; int ammunition; Bolter bolter; }; template<typename MEMBER> __global__ void set( SpaceMarine *vec, int num, Wrapper<MEMBER, char SpaceMarine:: *> wrapper, const MEMBER& newVal) { int index = threadIdx.x + blockIdx.x * blockDim.x; if (index < num) vec[index].*unwrap(wrapper) = newVal; } int main(int argc, char **argv) { SpaceMarine *vec; set<<<1, 1>>>(vec, 0, wrap(&SpaceMarine::bolter), Bolter()); return 0; }