SG++-Doxygen-Documentation
Kernel that provides the transposed MultiEval operation \(v' := B v\) for a single OpenCL device.
#include <KernelMultTranspose.hpp>
Public Member Functions | |
double | getBuildDuration () |
KernelMultTranspose (std::shared_ptr< base::OCLDevice > device, size_t dims, std::shared_ptr< base::OCLManagerMultiPlatform > manager, json::Node &kernelConfiguration, std::shared_ptr< base::QueueLoadBalancerOpenMP > queueBalancerMultTranspose) | |
Constructs a new KernelMultTranspose object. | |
double | multTranspose (std::vector< T > &level, std::vector< T > &index, std::vector< T > &dataset, std::vector< T > &source, std::vector< T > &result, const size_t start_index_data, const size_t end_index_data) |
Perform the transposed MultiEval operator \(v' := B v\) with the device this kernel manages. | |
~KernelMultTranspose () | |
Destructor. | |
Kernel that provides the transposed MultiEval operation \(v' := B v\) for a single OpenCL device.
This class manages the OpenCL data structures required for an OpenCL kernel invocation. To that end, it makes heavy use of the OpenCL buffer abstraction. It uses a queue of work packages to determine whether the device still has work available. The device-side compute kernel code is created by a code generator.
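The work-package queue described above can be pictured with the following minimal sketch. It is not the SG++ implementation: the class `WorkPackageQueue`, its method `nextPackage`, and the helper `drainQueue` are hypothetical names introduced only for illustration. The sketch assumes that work is expressed as half-open index ranges and that a device keeps requesting packages until the queue reports that none remain.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for a queue of work packages (not an SG++ class).
// It hands out half-open index ranges [start, stop) until the range is used up.
class WorkPackageQueue {
 public:
  WorkPackageQueue(std::size_t begin, std::size_t end) : next_(begin), end_(end) {
    assert(begin <= end);
  }

  // Fetches the next package of at most packageSize indices.
  // Returns false if no work remains (or packageSize is zero).
  bool nextPackage(std::size_t packageSize, std::size_t& start, std::size_t& stop) {
    std::lock_guard<std::mutex> lock(mutex_);  // several devices may pull concurrently
    if (next_ >= end_ || packageSize == 0) {
      return false;
    }
    start = next_;
    stop = std::min(end_, next_ + packageSize);
    next_ = stop;
    return true;
  }

 private:
  std::mutex mutex_;
  std::size_t next_;
  std::size_t end_;
};

// Pulls packages until the queue is empty and returns the number of indices seen.
// A real kernel would launch device work for [start, stop) inside the loop.
inline std::size_t drainQueue(WorkPackageQueue& queue, std::size_t packageSize) {
  std::size_t processed = 0;
  std::size_t start = 0;
  std::size_t stop = 0;
  while (queue.nextPackage(packageSize, start, stop)) {
    processed += stop - start;
  }
  return processed;
}
```

Pulling work on demand in this way lets a faster device take more packages than a slower one, which is presumably why a load balancer is passed to the constructor.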
|
inline |
Constructs a new KernelMultTranspose object.
device | The OpenCL device this kernel instance manages |
dims | Dimensionality of the problem |
manager | The OpenCL manager to reduce OpenCL boilerplate |
kernelConfiguration | The configuration of this specific device |
queueBalancerMultTranspose | Load balancer from which the device queries its work |
References json::Node::getBool(), and json::Node::getUInt().
|
inline |
Destructor.
|
inline |
References dataset, python.statsfileInfo::i, sgpp::base::OCLBufferWrapperSD< T >::intializeTo(), and level.
|
inline |
Perform the transposed MultiEval operator \(v':= B v\) with the device this kernel manages.
Has additional, currently unused parameters to enable further MPI parallelization in the future.
level | Vector containing the d-dimensional levels of the grid, the order matches the index vector |
index | Vector containing the d-dimensional indices of the grid, the order matches the level vector |
dataset | Vector containing the d-dimensional data points |
source | Vector \(v\) |
result | The results vector \(v'\) of the operation |
start_index_data | Start of the range of data points to work on, currently not used |
end_index_data | End of the range of data points to work on, currently not used |
References sgpp::datadriven::StreamingOCLMultiPlatform::SourceBuilderMultTranspose< real_type >::generateSource(), sgpp::base::OCLBufferWrapperSD< T >::getBuffer(), sgpp::base::OCLBufferWrapperSD< T >::getHostPointer(), python.statsfileInfo::i, sgpp::base::OCLBufferWrapperSD< T >::readFromBuffer(), and python.leja::start.
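To make the operation \(v' := B v\) concrete, the following CPU reference is a sketch under stated assumptions, not the OpenCL kernel that this class generates. The function `referenceMultTranspose` is a hypothetical name. The sketch assumes piecewise-linear hat basis functions \(\phi_{l,i}(x) = \max(0, 1 - |2^l x - i|)\), matrix entries \(B_{j,k} = \phi_j(x_k)\) for grid point \(j\) and data point \(x_k\), plain level numbers in the level vector, and a layout with dims consecutive entries per grid point and per data point. The actual device-side data layout may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for v' := B v (not part of SG++).
// Assumptions: linear hat basis, level holds plain level numbers l (not 2^l),
// level/index store dims entries per grid point, dataset dims entries per data point.
template <typename T>
std::vector<T> referenceMultTranspose(const std::vector<T>& level,
                                      const std::vector<T>& index,
                                      const std::vector<T>& dataset,
                                      const std::vector<T>& source,
                                      std::size_t dims) {
  assert(dims > 0);
  assert(level.size() == index.size());
  assert(level.size() % dims == 0 && dataset.size() % dims == 0);
  const std::size_t gridSize = level.size() / dims;
  const std::size_t dataSize = dataset.size() / dims;
  assert(source.size() == dataSize);

  std::vector<T> result(gridSize, T(0));
  for (std::size_t j = 0; j < gridSize; ++j) {    // one result entry per grid point
    for (std::size_t k = 0; k < dataSize; ++k) {  // accumulate over all data points
      T phi = T(1);
      for (std::size_t d = 0; d < dims; ++d) {
        const T scale = std::exp2(level[j * dims + d]);  // 2^l
        const T hat = T(1) - std::abs(scale * dataset[k * dims + d] - index[j * dims + d]);
        phi *= std::max(T(0), hat);  // tensor product of 1D hat functions
      }
      result[j] += source[k] * phi;
    }
  }
  return result;
}
```

Under these assumptions the result vector has one entry per grid point, while the source vector has one entry per data point. A reference of this kind can be useful for checking device results on small inputs.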