SG++-Doxygen-Documentation
sgpp::datadriven::FileSampleProvider Class Reference (abstract)

sgpp::datadriven::FileSampleProvider is a specialization of sgpp::datadriven::SampleProvider and provides an interface for all sample providers that get their samples from files.

#include <FileSampleProvider.hpp>

Inheritance diagram for sgpp::datadriven::FileSampleProvider:
Inherits sgpp::datadriven::SampleProvider.
Inherited by sgpp::datadriven::ArffFileSampleProvider, sgpp::datadriven::CSVFileSampleProvider, and sgpp::datadriven::FileSampleDecorator.

Public Member Functions

virtual void readFile (const std::string &filePath, bool hasTargets, size_t readinCutoff=-1, std::vector< size_t > readinColumns=std::vector< size_t >(), std::vector< double > readinClasses=std::vector< double >())=0
 Read the contents of the file at the given path.
 
virtual void readString (const std::string &input, bool hasTargets, size_t readinCutoff=-1, std::vector< size_t > readinColumns=std::vector< size_t >(), std::vector< double > readinClasses=std::vector< double >())=0
 Read the contents of a string, for example a deflated archive.
 
Public Member Functions inherited from sgpp::datadriven::SampleProvider
virtual SampleProvider * clone () const =0
 Clone pattern for polymorphic cloning (mainly interesting for copy constructors).
 
virtual Dataset * getAllSamples ()=0
 Asks to return all available samples.
 
virtual size_t getDim () const =0
 Returns the maximal dimensionality of the data.
 
virtual Dataset * getNextSamples (size_t howMany)=0
 Lets the user request a certain number of samples.
 
virtual size_t getNumSamples () const =0
 Returns the number of samples available or throws if not possible.
 
SampleProvider & operator= (const SampleProvider &rhs)=default
 
SampleProvider & operator= (SampleProvider &&rhs)=default
 
virtual void reset ()=0
 Resets the state of the sample provider.
 
 SampleProvider ()=default
 
 SampleProvider (const SampleProvider &rhs)=default
 
 SampleProvider (SampleProvider &&rhs)=default
 
virtual ~SampleProvider ()=default
 

Detailed Description

sgpp::datadriven::FileSampleProvider is a specialization of sgpp::datadriven::SampleProvider and provides an interface for all sample providers that get their samples from files.

Member Function Documentation

◆ readFile()

virtual void sgpp::datadriven::FileSampleProvider::readFile ( const std::string &  filePath,
bool  hasTargets,
size_t  readinCutoff = -1,
std::vector< size_t >  readinColumns = std::vector< size_t >(),
std::vector< double >  readinClasses = std::vector< double >() 
)
pure virtual

Read the contents of the file at the given path.

Has to throw an exception if the file cannot be opened or parsed. Results of parsing can be obtained via sgpp::datadriven::SampleProvider member functions.

Parameters
filePath  valid path to an existing file.
hasTargets  whether the file has targets (i.e. supervised learning).
readinCutoff  data line number after which to stop reading. Default: -1, which converts to the largest size_t value, so in practice no cutoff applies.
readinColumns  specifies a subset of columns (dimensions). Only these columns are read in, and their order is significant. Default: empty, which means all columns are considered.
readinClasses  specifies a subset of classes. Only data lines with one of these classes are read in. Default: empty, which means all classes are considered.

Implemented in sgpp::datadriven::FileSampleDecorator, sgpp::datadriven::ArffFileSampleProvider, and sgpp::datadriven::CSVFileSampleProvider.
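
The interplay of readinCutoff, readinColumns and readinClasses can be illustrated with a small standalone sketch. It is not SG++ code: the Row type and the applyReadinFilters function are hypothetical, and the sketch assumes that the cutoff counts data lines before class filtering and that column indices are zero-based, neither of which is confirmed by this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

// A default of -1 for a size_t parameter wraps around to the largest value.
static_assert(static_cast<std::size_t>(-1) ==
                  std::numeric_limits<std::size_t>::max(),
              "size_t(-1) is the maximum size_t value");

// Hypothetical row type for illustration: feature values plus a class label.
struct Row {
  std::vector<double> values;
  double target;
};

// Sketch of how the three read-in parameters could be applied to parsed rows.
// Assumption: the cutoff counts data lines before any class filtering.
std::vector<Row> applyReadinFilters(
    const std::vector<Row>& rows,
    std::size_t readinCutoff = static_cast<std::size_t>(-1),
    const std::vector<std::size_t>& readinColumns = {},
    const std::vector<double>& readinClasses = {}) {
  std::vector<Row> result;
  for (std::size_t line = 0; line < rows.size() && line < readinCutoff; ++line) {
    const Row& row = rows[line];
    // An empty class list means every class is accepted.
    if (!readinClasses.empty() &&
        std::find(readinClasses.begin(), readinClasses.end(), row.target) ==
            readinClasses.end()) {
      continue;
    }
    // An empty column list means all columns are kept unchanged.
    if (readinColumns.empty()) {
      result.push_back(row);
      continue;
    }
    // Columns are copied in the order requested (order sensitive).
    Row selected{{}, row.target};
    for (std::size_t col : readinColumns) {
      if (col >= row.values.size()) {
        throw std::out_of_range("column index exceeds row width");
      }
      selected.values.push_back(row.values[col]);
    }
    assert(selected.values.size() == readinColumns.size());
    result.push_back(selected);
  }
  return result;
}
```

Under these assumptions, requesting columns {2, 0} yields rows whose first value is the original third column, and leaving all three parameters at their defaults returns every row unchanged.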

◆ readString()

virtual void sgpp::datadriven::FileSampleProvider::readString ( const std::string &  input,
bool  hasTargets,
size_t  readinCutoff = -1,
std::vector< size_t >  readinColumns = std::vector< size_t >(),
std::vector< double >  readinClasses = std::vector< double >() 
)
pure virtual

Read the contents of a string, for example a deflated archive.

Has to throw an exception if the string cannot be parsed. Results of parsing can be obtained via sgpp::datadriven::SampleProvider member functions.

Parameters
input  the raw string input to parse.
hasTargets  whether the input has targets (i.e. supervised learning).
readinCutoff  data line number after which to stop reading. Default: -1, which converts to the largest size_t value, so in practice no cutoff applies.
readinColumns  specifies a subset of columns (dimensions). Only these columns are read in, and their order is significant. Default: empty, which means all columns are considered.
readinClasses  specifies a subset of classes. Only data lines with one of these classes are read in. Default: empty, which means all classes are considered.

Implemented in sgpp::datadriven::FileSampleDecorator, sgpp::datadriven::ArffFileSampleProvider, and sgpp::datadriven::CSVFileSampleProvider.
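
One way an implementation might satisfy both contracts is to let readFile load the file into memory and delegate to readString, so that parsing and error handling live in a single place. The following is a minimal sketch of that idea only. LineCountingReader is a hypothetical stand-in that does not derive from the SG++ classes, drops the optional parameters, and treats every non-empty line as one sample; the real providers may be structured differently.

```cpp
#include <cstddef>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in, NOT the SG++ class: counts non-empty lines as samples.
class LineCountingReader {
 public:
  // Parse raw text; throws if the input contains no data lines.
  void readString(const std::string& input) {
    std::istringstream stream(input);
    std::string line;
    std::size_t count = 0;
    while (std::getline(stream, line)) {
      if (!line.empty()) ++count;
    }
    if (count == 0) {
      throw std::invalid_argument("input contains no data lines");
    }
    // State is only updated after parsing succeeded.
    numSamples_ = count;
  }

  // Read the whole file into memory and delegate to readString.
  void readFile(const std::string& filePath) {
    std::ifstream file(filePath);
    if (!file.is_open()) {
      throw std::runtime_error("cannot open file: " + filePath);
    }
    std::ostringstream buffer;
    buffer << file.rdbuf();
    readString(buffer.str());
  }

  // Mirrors the idea of querying results after a successful read.
  std::size_t getNumSamples() const { return numSamples_; }

 private:
  std::size_t numSamples_ = 0;
};
```

In this sketch a failed read leaves the previously parsed state untouched, which is a design choice of the example and not something this page specifies.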


The documentation for this class was generated from the following file: