SG++-Doxygen-Documentation
sgpp::datadriven::CSVFileSampleProvider Class Reference

CSVFileSampleProvider allows reading data in CSV format into a sgpp::datadriven::Dataset object.

#include <CSVFileSampleProvider.hpp>

Inheritance diagram for sgpp::datadriven::CSVFileSampleProvider:
sgpp::datadriven::CSVFileSampleProvider inherits sgpp::datadriven::FileSampleProvider, which inherits sgpp::datadriven::SampleProvider.

Public Member Functions

SampleProvider * clone () const override
 Clone Pattern to allow copying of derived classes.
 
 CSVFileSampleProvider (DataShufflingFunctor *shuffling=nullptr)
 Default constructor.
 
Dataset * getAllSamples () override
 Asks to return all available samples.
 
size_t getDim () const override
 Returns the maximal dimensionality of the data.
 
Dataset * getNextSamples (size_t howMany) override
 Lets the user request a certain amount of samples.
 
size_t getNumSamples () const override
 Returns the number of samples available or throws if this is not possible.
 
void readFile (const std::string &filePath, bool hasTargets, size_t readinCutoff=-1, std::vector< size_t > readinColumns=std::vector< size_t >(), std::vector< double > readinClasses=std::vector< double >()) override
 Open an existing CSV file, parse it and store its contents inside this class.
 
void readString (const std::string &input, bool hasTargets, size_t readinCutoff=-1, std::vector< size_t > readinColumns=std::vector< size_t >(), std::vector< double > readinClasses=std::vector< double >()) override
 Currently not implemented.
 
void reset () override
 Resets the state of the sample provider (e.g. to start a new epoch).
 
- Public Member Functions inherited from sgpp::datadriven::SampleProvider
SampleProvider & operator= (const SampleProvider &rhs)=default
 
SampleProvider & operator= (SampleProvider &&rhs)=default
 
 SampleProvider ()=default
 
 SampleProvider (const SampleProvider &rhs)=default
 
 SampleProvider (SampleProvider &&rhs)=default
 
virtual ~SampleProvider ()=default
 

Detailed Description

CSVFileSampleProvider allows reading data in CSV format into a sgpp::datadriven::Dataset object.

Currently the only supported input is a file containing CSV data whose first line holds the column titles; that line is skipped.
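The header-skipping behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use SG++ itself: `ParsedCsv` and `parseCsvSkippingHeader` are hypothetical names introduced only for this example, and treating the last column as the target when `hasTargets` is true is an assumption, not something this page states.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed data set; not the SG++ Dataset class.
struct ParsedCsv {
  std::vector<std::vector<double>> rows;  // feature values, one vector per sample
  std::vector<double> targets;            // filled only when hasTargets is true
};

// Parses CSV text whose first line holds column titles.
// Assumption: when hasTargets is true, the last column is the target.
ParsedCsv parseCsvSkippingHeader(const std::string& text, bool hasTargets) {
  ParsedCsv result;
  std::istringstream lines(text);
  std::string line;
  std::getline(lines, line);  // first line: column titles, skipped
  while (std::getline(lines, line)) {
    if (line.empty()) continue;  // tolerate blank lines
    std::vector<double> values;
    std::istringstream cells(line);
    std::string cell;
    while (std::getline(cells, cell, ',')) {
      values.push_back(std::stod(cell));  // throws std::invalid_argument on bad input
    }
    if (hasTargets) {
      assert(!values.empty());
      result.targets.push_back(values.back());
      values.pop_back();
    }
    result.rows.push_back(values);
  }
  return result;
}
```

For the input "x,y,label", "1.0,2.0,0", "3.5,4.5,1" (one record per line) and `hasTargets` set to true, the sketch yields two samples of dimension two plus two targets.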

Constructor & Destructor Documentation

sgpp::datadriven::CSVFileSampleProvider::CSVFileSampleProvider ( DataShufflingFunctor * shuffling = nullptr)
explicit

Default constructor.

Parameters
shuffling  functor to permute the training data indexes

Member Function Documentation

SampleProvider * sgpp::datadriven::CSVFileSampleProvider::clone ( ) const
override virtual

Clone Pattern to allow copying of derived classes.

Returns
A pointer to a new instance of sgpp::datadriven::CSVFileSampleProvider with copied state. The caller owns the new object.

Implements sgpp::datadriven::SampleProvider.
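Because clone() returns a raw pointer that the caller owns, one common way to handle the result is to wrap it in a smart pointer immediately. The sketch below shows the clone pattern in isolation; `Provider`, `CountingProvider` and `cloneOwned` are hypothetical names for illustration and are not part of SG++.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical base class mirroring the shape of a clonable interface.
class Provider {
 public:
  virtual ~Provider() = default;
  virtual Provider* clone() const = 0;  // caller owns the returned object
  virtual std::size_t position() const = 0;
};

// Hypothetical derived class whose state is a single counter.
class CountingProvider : public Provider {
 public:
  explicit CountingProvider(std::size_t start) : counter(start) {}
  // Copies the dynamic type and the current state.
  Provider* clone() const override { return new CountingProvider(*this); }
  std::size_t position() const override { return counter; }
  void advance() { ++counter; }

 private:
  std::size_t counter;
};

// Wraps the raw pointer right away so the copy is released automatically.
std::unique_ptr<Provider> cloneOwned(const Provider& source) {
  std::unique_ptr<Provider> copy(source.clone());
  assert(copy != nullptr);  // a clone() override must not return null
  return copy;
}
```

The copy keeps the state it had at the time of cloning, so advancing the original afterwards does not affect it.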

Dataset * sgpp::datadriven::CSVFileSampleProvider::getAllSamples ( )
override virtual

Asks to return all available samples.

This functionality is designed for returning all available samples from an entire file.

Returns
sgpp::datadriven::Dataset* Pointer to a new sgpp::datadriven::Dataset object. This object is owned by the caller.

Implements sgpp::datadriven::SampleProvider.

References sgpp::datadriven::Dataset::getDimension(), getNextSamples(), and sgpp::datadriven::Dataset::getNumberInstances().

size_t sgpp::datadriven::CSVFileSampleProvider::getDim ( ) const
override virtual

Returns the maximal dimensionality of the data.

Returns
dimensionality of the sgpp::datadriven::Dataset.

Implements sgpp::datadriven::SampleProvider.

References sgpp::datadriven::Dataset::getDimension().

Referenced by python.uq.dists.SGDEdist.SGDEdist::__str__(), and python.uq.dists.KDEDist.KDEDist::getBandwidths().

Dataset * sgpp::datadriven::CSVFileSampleProvider::getNextSamples ( size_t  howMany)
override virtual

Lets the user request a certain amount of samples.

This functionality is designed for streaming algorithms where data is processed in batches.

Parameters
howMany  number of samples requested. The number of samples actually provided can be smaller if there is not sufficient data.
Returns
sgpp::datadriven::Dataset* Pointer to a new sgpp::datadriven::Dataset object containing at most the requested amount of samples. This object is owned by the caller.

Implements sgpp::datadriven::SampleProvider.

References sgpp::datadriven::Dataset::getDimension().

Referenced by getAllSamples().
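The batch semantics (at most howMany samples per call, with the caller owning each result) can be sketched with a small in-memory stand-in. `VectorProvider` and `batchSizes` are hypothetical and not part of SG++; in particular, returning an empty batch once the data is exhausted is an assumption of this sketch, not documented behaviour of CSVFileSampleProvider.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical provider over an in-memory list of samples.
class VectorProvider {
 public:
  explicit VectorProvider(std::vector<double> samples) : data(std::move(samples)) {}

  // Returns a new batch with at most howMany samples; the caller owns it.
  std::vector<double>* getNextSamples(std::size_t howMany) {
    assert(cursor <= data.size());
    const std::size_t count = std::min(howMany, data.size() - cursor);
    const auto first = data.begin() + static_cast<std::ptrdiff_t>(cursor);
    auto* batch = new std::vector<double>(first, first + static_cast<std::ptrdiff_t>(count));
    cursor += count;
    return batch;
  }

  void reset() { cursor = 0; }  // start again from the first sample

 private:
  std::vector<double> data;
  std::size_t cursor = 0;
};

// Drains the provider in batches and returns the size of each batch.
std::vector<std::size_t> batchSizes(VectorProvider& provider, std::size_t batchSize) {
  std::vector<std::size_t> sizes;
  while (true) {
    // Take ownership immediately so every batch is freed.
    std::unique_ptr<std::vector<double>> batch(provider.getNextSamples(batchSize));
    if (batch->empty()) break;  // assumption: an empty batch means no data is left
    sizes.push_back(batch->size());
  }
  return sizes;
}
```

With five samples and a batch size of two, the last batch holds a single sample. Calling reset() afterwards lets the same provider be drained again, for example to start a new epoch.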

size_t sgpp::datadriven::CSVFileSampleProvider::getNumSamples ( ) const
override virtual

Returns the number of samples available or throws if this is not possible.

Returns
the number of samples available

Implements sgpp::datadriven::SampleProvider.

References sgpp::datadriven::Dataset::getDimension(), and sgpp::datadriven::Dataset::getNumberInstances().

void sgpp::datadriven::CSVFileSampleProvider::readFile ( const std::string &  filePath,
bool  hasTargets,
size_t  readinCutoff = -1,
std::vector< size_t >  readinColumns = std::vector<size_t>(),
std::vector< double >  readinClasses = std::vector<double>() 
)
override virtual

Open an existing CSV file, parse it and store its contents inside this class.

Throws if the file cannot be opened or parsed.

Parameters
filePath       path to an existing file
hasTargets     whether the file has targets (i.e. supervised learning)
readinCutoff   see FileSampleProvider.hpp
readinColumns  see FileSampleProvider.hpp
readinClasses  see FileSampleProvider.hpp

Implements sgpp::datadriven::FileSampleProvider.

References sgpp::datadriven::CSVTools::readCSVFromFile().
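The default `readinCutoff = -1` is assigned to an unsigned `size_t`, so the value the function actually receives is the largest value `size_t` can represent. The sketch below demonstrates that conversion. `kDefaultCutoff` and `rowsToRead` are hypothetical, and reading the cutoff as an upper bound on the number of rows is an assumption; FileSampleProvider.hpp holds the authoritative description.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>

// The default argument -1 converts to the largest value size_t can hold.
constexpr std::size_t kDefaultCutoff = static_cast<std::size_t>(-1);
static_assert(kDefaultCutoff == std::numeric_limits<std::size_t>::max(),
              "-1 converted to size_t is the maximum value");

// Hypothetical helper: how many rows to keep under a given cutoff.
// Assumption: the cutoff is an upper bound on the number of rows read.
std::size_t rowsToRead(std::size_t availableRows,
                       std::size_t readinCutoff = kDefaultCutoff) {
  return std::min(availableRows, readinCutoff);
}
```

Under that assumption, the default leaves the row count unchanged, while an explicit cutoff smaller than the file length truncates it.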

void sgpp::datadriven::CSVFileSampleProvider::readString ( const std::string &  input,
bool  hasTargets,
size_t  readinCutoff = -1,
std::vector< size_t >  readinColumns = std::vector<size_t>(),
std::vector< double >  readinClasses = std::vector<double>() 
)
override virtual

Currently not implemented.

Parameters
input          string containing information in CSV file format
hasTargets     whether the data has targets (i.e. supervised learning)
readinCutoff   see FileSampleProvider.hpp
readinColumns  see FileSampleProvider.hpp
readinClasses  see FileSampleProvider.hpp

Implements sgpp::datadriven::FileSampleProvider.

References sgpp::datadriven::Dataset::getData(), sgpp::datadriven::Dataset::getDimension(), sgpp::datadriven::Dataset::getNumberInstances(), sgpp::datadriven::Dataset::getTargets(), python.statsfileInfo::i, and sgpp::base::DataMatrix::setRow().

void sgpp::datadriven::CSVFileSampleProvider::reset ( )
override virtual

Resets the state of the sample provider (e.g. to start a new epoch).

Implements sgpp::datadriven::SampleProvider.


The documentation for this class was generated from the following files: