SG++-Doxygen-Documentation
sgpp::datadriven::ArffFileSampleProvider Class Reference

ArffFileSampleProvider allows reading data in ARFF format into a sgpp::datadriven::Dataset object.

#include <ArffFileSampleProvider.hpp>

Inheritance diagram for sgpp::datadriven::ArffFileSampleProvider:
sgpp::datadriven::FileSampleProvider sgpp::datadriven::SampleProvider

Public Member Functions

 ArffFileSampleProvider (DataShufflingFunctor *shuffling=nullptr)
 Default constructor.
 
SampleProvider * clone () const override
 Clone Pattern to allow copying of derived classes.
 
Dataset * getAllSamples () override
 Asks to return all available samples.
 
size_t getDim () const override
 Returns the maximal dimensionality of the data.
 
Dataset * getNextSamples (size_t howMany) override
 Lets the user request a certain amount of samples.
 
size_t getNumSamples () const override
 Returns the number of samples available or throws if not possible.
 
void readFile (const std::string &filePath, bool hasTargets, size_t readinCutoff=-1, std::vector< size_t > readinColumns=std::vector< size_t >(), std::vector< double > readinClasses=std::vector< double >()) override
 Open an existing ARFF file, parse it and store its contents inside this class.
 
void readString (const std::string &input, bool hasTargets, size_t readinCutoff=-1, std::vector< size_t > readinColumns=std::vector< size_t >(), std::vector< double > readinClasses=std::vector< double >()) override
 Parse the contents of a string in ARFF format and store them inside this class.
 
void reset () override
 Resets the state of the sample provider (e.g. to start a new epoch).
 
- Public Member Functions inherited from sgpp::datadriven::SampleProvider
SampleProvider & operator= (const SampleProvider &rhs)=default
 
SampleProvider & operator= (SampleProvider &&rhs)=default
 
 SampleProvider ()=default
 
 SampleProvider (const SampleProvider &rhs)=default
 
 SampleProvider (SampleProvider &&rhs)=default
 
virtual ~SampleProvider ()=default
 

Detailed Description

ArffFileSampleProvider allows reading data in ARFF format into a sgpp::datadriven::Dataset object.

Input can currently be either an ARFF-formatted string or a file containing ARFF data.
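To make the input format concrete, the following is a minimal sketch of ARFF-formatted text held in a string. The relation and attribute names are invented for illustration, and countArffInstances() is a hypothetical helper that is not part of SG++; it only shows where the data rows start and does not reproduce the library's parser (ARFF comments and case-insensitive keywords are ignored here).

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Illustrative ARFF content: two numeric attributes plus a numeric class column.
// The relation and attribute names are made up for this sketch.
const std::string kExampleArff =
    "@RELATION example\n"
    "@ATTRIBUTE x0 NUMERIC\n"
    "@ATTRIBUTE x1 NUMERIC\n"
    "@ATTRIBUTE class NUMERIC\n"
    "@DATA\n"
    "0.1,0.2,1\n"
    "0.3,0.4,-1\n"
    "0.5,0.6,1\n";

// Hypothetical helper (not part of SG++): counts the non-empty lines that
// follow the @DATA marker, i.e. the number of instances in the ARFF text.
std::size_t countArffInstances(const std::string& arff) {
  std::istringstream stream(arff);
  std::string line;
  bool inData = false;
  std::size_t count = 0;
  while (std::getline(stream, line)) {
    if (!inData) {
      // The header ends at the @DATA line (matched case-sensitively for brevity).
      inData = (line.rfind("@DATA", 0) == 0);
    } else if (!line.empty()) {
      ++count;
    }
  }
  return count;
}
```

With the real class, a string like kExampleArff would be handed to readString() with hasTargets set to true, assuming the last column is meant to be the target.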

Constructor & Destructor Documentation

◆ ArffFileSampleProvider()

sgpp::datadriven::ArffFileSampleProvider::ArffFileSampleProvider ( DataShufflingFunctor *  shuffling = nullptr)
explicit

Default constructor.

Parameters
shuffling  functor to permute the training data indexes

Member Function Documentation

◆ clone()

SampleProvider * sgpp::datadriven::ArffFileSampleProvider::clone ( ) const
override virtual

Clone Pattern to allow copying of derived classes.

Returns
A pointer to a new instance of sgpp::datadriven::ArffFileSampleProvider with copied state. The caller owns the new object.

Implements sgpp::datadriven::SampleProvider.
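Because the caller owns the object returned by clone(), one way to avoid leaks is to wrap the raw pointer in a std::unique_ptr right away. The sketch below illustrates the pattern with hypothetical stand-in classes (Provider, ArffLikeProvider, cloneOwned); none of them are SG++ types, and only the ownership contract is modelled.

```cpp
#include <memory>

// Hypothetical stand-ins for the SG++ classes; only the clone() contract is modelled.
class Provider {
 public:
  virtual ~Provider() = default;
  virtual Provider* clone() const = 0;  // caller owns the returned object
  virtual int state() const = 0;
};

class ArffLikeProvider : public Provider {
 public:
  explicit ArffLikeProvider(int state) : state_(state) {}
  // Copy-constructs a new instance so that the clone carries the same state.
  Provider* clone() const override { return new ArffLikeProvider(*this); }
  int state() const override { return state_; }

 private:
  int state_;
};

// Wraps the raw owning pointer immediately so that it cannot leak.
std::unique_ptr<Provider> cloneOwned(const Provider& source) {
  return std::unique_ptr<Provider>(source.clone());
}
```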

◆ getAllSamples()

Dataset * sgpp::datadriven::ArffFileSampleProvider::getAllSamples ( )
override virtual

Asks to return all available samples.

This functionality is designed for returning all available samples from an entire file.

Returns
sgpp::datadriven::Dataset* Pointer to a new sgpp::datadriven::Dataset object. This object is owned by the caller.

Implements sgpp::datadriven::SampleProvider.

References sgpp::datadriven::Dataset::getDimension(), getNextSamples(), and sgpp::datadriven::Dataset::getNumberInstances().

◆ getDim()

size_t sgpp::datadriven::ArffFileSampleProvider::getDim ( ) const
override virtual

Returns the maximal dimensionality of the data.

Returns
dimensionality of the sgpp::datadriven::Dataset.

Implements sgpp::datadriven::SampleProvider.

References sgpp::datadriven::Dataset::getDimension().

Referenced by python.uq.dists.SGDEdist.SGDEdist::__str__(), and python.uq.dists.KDEDist.KDEDist::getBandwidths().

◆ getNextSamples()

Dataset * sgpp::datadriven::ArffFileSampleProvider::getNextSamples ( size_t  howMany)
override virtual

Lets the user request a certain amount of samples.

This functionality is designed for streaming algorithms where data is processed in batches.

Parameters
howMany  the requested number of samples. The number of samples actually provided can be smaller if there is not sufficient data.
Returns
sgpp::datadriven::Dataset* Pointer to a new sgpp::datadriven::Dataset object containing at most the requested amount of samples. This object is owned by the caller.

Implements sgpp::datadriven::SampleProvider.

References sgpp::datadriven::Dataset::getDimension().

Referenced by getAllSamples().
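The batch-wise contract described for getNextSamples() can be sketched with a small mock. MockProvider and consumeInBatches() are hypothetical names, a plain std::vector stands in for sgpp::datadriven::Dataset, and treating an empty batch as the end-of-data signal is an assumption of this sketch rather than documented library behaviour.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a sample provider; a "dataset" is just a vector here.
class MockProvider {
 public:
  explicit MockProvider(std::vector<double> samples) : samples_(std::move(samples)) {}

  // Mirrors the documented contract: returns a new, caller-owned object holding
  // at most howMany samples (fewer when the data runs out).
  std::vector<double>* getNextSamples(std::size_t howMany) {
    assert(position_ <= samples_.size());
    const std::size_t n = std::min(howMany, samples_.size() - position_);
    const auto first = samples_.begin() + static_cast<std::ptrdiff_t>(position_);
    auto* batch = new std::vector<double>(first, first + static_cast<std::ptrdiff_t>(n));
    position_ += n;
    return batch;
  }

 private:
  std::vector<double> samples_;
  std::size_t position_ = 0;
};

// Drains the provider in batches and returns the size of each batch.
std::vector<std::size_t> consumeInBatches(MockProvider& provider, std::size_t batchSize) {
  std::vector<std::size_t> sizes;
  while (true) {
    // Take ownership at once, since the caller owns the returned object.
    std::unique_ptr<std::vector<double>> batch(provider.getNextSamples(batchSize));
    if (batch->empty()) break;  // assumption: an empty batch signals exhaustion
    sizes.push_back(batch->size());
  }
  return sizes;
}
```

Requesting batches of two from five samples yields batches of sizes 2, 2 and 1 in this mock, which matches the note that the last batch can be smaller than requested.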

◆ getNumSamples()

size_t sgpp::datadriven::ArffFileSampleProvider::getNumSamples ( ) const
override virtual

Returns the number of samples available or throws if not possible.

Returns
the number of samples available

Implements sgpp::datadriven::SampleProvider.

References sgpp::datadriven::Dataset::getDimension(), and sgpp::datadriven::Dataset::getNumberInstances().

◆ readFile()

void sgpp::datadriven::ArffFileSampleProvider::readFile ( const std::string &  filePath,
bool  hasTargets,
size_t  readinCutoff = -1,
std::vector< size_t >  readinColumns = std::vector<size_t>(),
std::vector< double >  readinClasses = std::vector<double>() 
)
override virtual

Open an existing ARFF file, parse it and store its contents inside this class.

Throws if the file cannot be opened or parsed.

Parameters
filePath  path to an existing file
hasTargets  whether the file has targets (i.e. supervised learning)
readinCutoff  see FileSampleProvider.hpp
readinColumns  see FileSampleProvider.hpp
readinClasses  see FileSampleProvider.hpp

Implements sgpp::datadriven::FileSampleProvider.

References sgpp::datadriven::ARFFTools::readARFFFromFile().
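A note on the default readinCutoff = -1: since the parameter is a size_t, the -1 converts to the largest representable value. The sketch below checks that conversion at compile time. The helper rowsToRead() is hypothetical, and reading the maximum value as "no cutoff" is an assumption; the actual semantics are described in FileSampleProvider.hpp.

```cpp
#include <cstddef>
#include <limits>

// Converting -1 to an unsigned type is well defined in C++: the value wraps
// modulo 2^N, which yields the largest representable size_t.
constexpr std::size_t kDefaultCutoff = static_cast<std::size_t>(-1);
static_assert(kDefaultCutoff == std::numeric_limits<std::size_t>::max(),
              "-1 converts to the maximum size_t value");

// Hypothetical helper (not part of SG++): caps the number of rows to read.
// Treating the maximum value as "no cutoff" is an assumption of this sketch.
constexpr std::size_t rowsToRead(std::size_t available,
                                 std::size_t cutoff = kDefaultCutoff) {
  return cutoff < available ? cutoff : available;
}
```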

◆ readString()

void sgpp::datadriven::ArffFileSampleProvider::readString ( const std::string &  input,
bool  hasTargets,
size_t  readinCutoff = -1,
std::vector< size_t >  readinColumns = std::vector<size_t>(),
std::vector< double >  readinClasses = std::vector<double>() 
)
override virtual

Parse the contents of a string in ARFF format and store them inside this class.

Throws if the string cannot be parsed.

Parameters
input  string containing information in ARFF file format
hasTargets  whether the input has targets (i.e. supervised learning)
readinCutoff  see FileSampleProvider.hpp
readinColumns  see FileSampleProvider.hpp
readinClasses  see FileSampleProvider.hpp

Implements sgpp::datadriven::FileSampleProvider.

References sgpp::datadriven::Dataset::getData(), sgpp::datadriven::Dataset::getDimension(), sgpp::datadriven::Dataset::getNumberInstances(), sgpp::datadriven::Dataset::getTargets(), python.statsfileInfo::i, sgpp::datadriven::ARFFTools::readARFFFromString(), and sgpp::base::DataMatrix::setRow().

◆ reset()

void sgpp::datadriven::ArffFileSampleProvider::reset ( )
override virtual

Resets the state of the sample provider (e.g. to start a new epoch).

Implements sgpp::datadriven::SampleProvider.


The documentation for this class was generated from the following files: