SG++-Doxygen-Documentation

Configuration structure used for all kinds of SampleProviders including default values.

#include <DataSourceConfig.hpp>

## Public Attributes

size_t batchSize = 0

size_t epochs = 1
The number of epochs to train on.

std::string filePath = ""
Valid path to a file on disk.

DataSourceFileType fileType = DataSourceFileType::NONE
Which type of input file are we dealing with? NONE for auto detection or generated artificial datasets.

bool hasTargets = true
Whether the file has targets (i.e. supervised learning).

bool isCompressed = false
The dataset is gzip compressed.

size_t numBatches = 1
How many batches should the dataset be split into for batch learning - if 1, take the entire dataset.

int64_t randomSeed = -1
Seed for the shuffling PRNG.

std::vector< double > readinClasses = std::vector<double>()
Specifies the set of classes (targets) to be read in from the data file. Any line with a class not contained in this vector is skipped. If hasTargets=false, this is ignored. If empty, all classes/targets are considered (default).

std::vector< size_t > readinColumns = std::vector<size_t>()
Specifies the set of columns (dimensions) to be read in from the data file. Indices start at 0, and order matters. Any column not contained in this vector is ignored as a dimension. If empty, all columns are read in (default).

After how many (valid) lines of the source file to stop reading.

DataSourceShufflingType shuffling = DataSourceShufflingType::sequential
The type of shuffling to be applied to the data.

double validationPortion = 0.3

## Detailed Description

Configuration structure used for all kinds of SampleProviders including default values.
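As a rough illustration of how such a configuration might be filled in, the sketch below mirrors the attributes and defaults listed on this page in a local stand-in struct. `ExampleDataSourceConfig`, `makeExampleConfig`, the single-enumerator enums, and the file path are hypothetical names introduced for this example; they are not the real SG++ header. The attribute behind the "lines to stop reading" description is left out because its declaration is not shown on this page.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for the SG++ enums. Only the enumerators that
// appear on this page are listed; the real enums may contain more.
enum class DataSourceFileType { NONE };
enum class DataSourceShufflingType { sequential };

// Local mirror of the documented attributes and their default values.
// This is an illustration, not the actual DataSourceConfig definition.
struct ExampleDataSourceConfig {
  std::size_t batchSize = 0;
  std::size_t epochs = 1;
  std::string filePath = "";
  DataSourceFileType fileType = DataSourceFileType::NONE;
  bool hasTargets = true;
  bool isCompressed = false;
  std::size_t numBatches = 1;
  std::int64_t randomSeed = -1;
  std::vector<double> readinClasses;
  std::vector<std::size_t> readinColumns;
  DataSourceShufflingType shuffling = DataSourceShufflingType::sequential;
  double validationPortion = 0.3;
};

// Start from the defaults and override only the fields that differ.
inline ExampleDataSourceConfig makeExampleConfig() {
  ExampleDataSourceConfig config;
  config.filePath = "data/train.csv.gz";  // hypothetical path
  config.isCompressed = true;             // the file is gzip compressed
  config.numBatches = 4;                  // split into four batches
  config.epochs = 2;                      // train for two epochs
  config.readinColumns = {0, 2};          // read only columns 0 and 2, in this order
  return config;
}
```

Relying on in-class default initializers keeps call sites short: only the settings that deviate from the documented defaults need to be written out.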

## Member Data Documentation

size_t epochs = 1
The number of epochs to train on.

std::string filePath = ""
Valid path to a file on disk. Empty for generated artificial datasets.

DataSourceFileType fileType = DataSourceFileType::NONE
Which type of input file are we dealing with? NONE for auto detection or generated artificial datasets.

bool hasTargets = true
Whether the file has targets (i.e. supervised learning).

size_t numBatches = 1
How many batches should the dataset be split into for batch learning - if 1, take the entire dataset.
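One way to picture the batch split is the sketch below, which computes the size of each batch for a given number of samples. It assumes contiguous, nearly equal batches with the remainder spread over the first ones; this page does not say how SG++ actually distributes samples, and `splitIntoBatchSizes` is a hypothetical helper.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns the size of each batch when numSamples samples are divided into
// numBatches batches. With numBatches == 1 the single batch is the whole
// dataset. Assumption: leftover samples go to the first batches.
std::vector<std::size_t> splitIntoBatchSizes(std::size_t numSamples,
                                             std::size_t numBatches) {
  if (numBatches == 0) {
    throw std::invalid_argument("numBatches must be at least 1");
  }
  std::vector<std::size_t> sizes(numBatches, numSamples / numBatches);
  const std::size_t remainder = numSamples % numBatches;
  for (std::size_t i = 0; i < remainder; ++i) {
    ++sizes[i];  // hand out one leftover sample per batch
  }
  return sizes;
}
```

For example, under these assumptions 10 samples with `numBatches = 3` yield batches of 4, 3, and 3 samples.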

int64_t randomSeed = -1
Seed for the shuffling PRNG.
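To show why a fixed seed matters, here is a minimal sketch of seeded index shuffling with the standard library. It assumes a shuffling mode that permutes sample indices, and it treats a negative seed (such as the default -1) as "no fixed seed"; neither behaviour is confirmed by this page. `shuffledIndices` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Returns a permutation of 0..numSamples-1.
// Assumption: a negative seed means "pick a nondeterministic seed";
// a non-negative seed makes the permutation reproducible.
std::vector<std::size_t> shuffledIndices(std::size_t numSamples,
                                         std::int64_t randomSeed) {
  std::vector<std::size_t> indices(numSamples);
  std::iota(indices.begin(), indices.end(), std::size_t{0});

  const std::uint64_t seed =
      randomSeed < 0 ? static_cast<std::uint64_t>(std::random_device{}())
                     : static_cast<std::uint64_t>(randomSeed);
  std::mt19937_64 engine(seed);
  std::shuffle(indices.begin(), indices.end(), engine);
  return indices;
}
```

Calling the function twice with the same non-negative seed produces the same order, which is what makes training runs repeatable.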

std::vector< double > readinClasses = std::vector<double>()
Specifies the set of classes (targets) to be read in from the data file. Any line with a class not contained in this vector is skipped. If hasTargets=false, this is ignored. If empty, all classes/targets are considered (default).
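The class filter described here can be sketched as a small predicate. `keepLine` is a hypothetical helper, and comparing class labels with exact floating-point equality is an assumption of this example rather than documented SG++ behaviour.

```cpp
#include <algorithm>
#include <vector>

// Decides whether a line with the given target should be kept.
// - hasTargets == false: the filter is ignored, every line is kept.
// - readinClasses empty: all classes are considered, every line is kept.
// - otherwise: keep the line only if its target is listed.
bool keepLine(bool hasTargets, const std::vector<double>& readinClasses,
              double target) {
  if (!hasTargets || readinClasses.empty()) {
    return true;
  }
  // Assumption: labels are compared with exact equality.
  return std::find(readinClasses.begin(), readinClasses.end(), target) !=
         readinClasses.end();
}
```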

std::vector< size_t > readinColumns = std::vector<size_t>()
Specifies the set of columns (dimensions) to be read in from the data file. Indices start at 0, and order matters. Any column not contained in this vector is ignored as a dimension. If empty, all columns are read in (default).
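The column selection can be illustrated in the same spirit. The sketch below assumes each parsed line is available as a vector of doubles; `selectColumns` is a hypothetical helper, not part of SG++.

```cpp
#include <cstddef>
#include <vector>

// Builds the sample from the requested columns, in the requested order.
// An empty readinColumns means "read all columns".
std::vector<double> selectColumns(const std::vector<double>& row,
                                  const std::vector<std::size_t>& readinColumns) {
  if (readinColumns.empty()) {
    return row;
  }
  std::vector<double> selected;
  selected.reserve(readinColumns.size());
  for (std::size_t column : readinColumns) {
    // at() throws std::out_of_range if the index exceeds the row length.
    selected.push_back(row.at(column));
  }
  return selected;
}
```

Because order matters, `readinColumns = {2, 0}` applied to the row `{1.0, 2.0, 3.0}` gives `{3.0, 1.0}` in this sketch, with column 1 dropped.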