Template Class ChunkDataset
Defined in File chunk.h
Inheritance Relationships
Base Type
public torch::data::datasets::StatefulDataset< ChunkDataset< ChunkReader, samplers::RandomSampler, samplers::RandomSampler >, ChunkReader::BatchType, size_t >
(Template Class StatefulDataset)
Class Documentation
-
template<typename ChunkReader, typename ChunkSampler = samplers::RandomSampler, typename ExampleSampler = samplers::RandomSampler>
class ChunkDataset : public torch::data::datasets::StatefulDataset<ChunkDataset<ChunkReader, samplers::RandomSampler, samplers::RandomSampler>, ChunkReader::BatchType, size_t>
A stateful dataset that supports hierarchical sampling and prefetching of entire chunks.
Unlike a regular dataset, a chunk dataset requires two samplers to operate and keeps an internal state. ChunkSampler selects which chunk to load next, while ExampleSampler determines the order of the Examples that are returned by each get_batch call. The hierarchical sampling approach used here is inspired by this paper: http://martin.zinkevich.org/publications/nips2010.pdf
Public Types
-
using BatchType = std::optional<typename ChunkReader::BatchType>
-
using UnwrappedBatchType = typename ChunkReader::BatchType
-
using ChunkSamplerType = ChunkSampler
-
using ExampleSamplerType = ExampleSampler
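The two-level sampling described in the class documentation can be illustrated without libtorch. The following is a minimal standalone sketch, not the library's implementation: `read_chunk`, `shuffled_indices`, and `hierarchical_order` are hypothetical names introduced here, chunks are assumed to have a fixed size, and prefetching is left out entirely. It only shows the idea that one sampler picks the chunk order while a second sampler picks the example order inside each loaded chunk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical stand-in for a chunk reader: chunk c holds the values
// c * chunk_size, ..., c * chunk_size + chunk_size - 1.
std::vector<int> read_chunk(std::size_t chunk_index, std::size_t chunk_size) {
  std::vector<int> chunk(chunk_size);
  std::iota(chunk.begin(), chunk.end(),
            static_cast<int>(chunk_index * chunk_size));
  return chunk;
}

// Returns a random permutation of 0..n-1 (plays the role of a random sampler).
std::vector<std::size_t> shuffled_indices(std::size_t n, std::mt19937& rng) {
  std::vector<std::size_t> indices(n);
  std::iota(indices.begin(), indices.end(), std::size_t{0});
  std::shuffle(indices.begin(), indices.end(), rng);
  return indices;
}

// Two-level sampling: first choose the order of the chunks, then the order
// of the examples inside each chunk.
std::vector<int> hierarchical_order(std::size_t chunk_count,
                                    std::size_t chunk_size,
                                    unsigned seed) {
  assert(chunk_size > 0);
  std::mt19937 rng(seed);
  std::vector<int> order;
  order.reserve(chunk_count * chunk_size);
  for (std::size_t c : shuffled_indices(chunk_count, rng)) {    // chunk level
    const std::vector<int> chunk = read_chunk(c, chunk_size);
    for (std::size_t e : shuffled_indices(chunk_size, rng)) {   // example level
      order.push_back(chunk[e]);
    }
  }
  return order;
}
```

In this sketch every example is visited exactly once per pass, and examples from the same chunk stay adjacent in the output, which is why only one chunk needs to be in memory at a time.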
Public Functions
-
inline ChunkDataset(ChunkReader chunk_reader, ChunkSampler chunk_sampler, ExampleSampler example_sampler, ChunkDatasetOptions options, std::function<void(UnwrappedBatchType&)> preprocessing_policy = std::function<void(UnwrappedBatchType&)>())
-
inline BatchType get_batch(size_t batch_size) override
Default get_batch method of BatchDataset.
This method returns Example batches created from the preloaded chunks. The implementation is dataset-agnostic and does not need to be overridden in different chunk datasets.
-
inline BatchType get_batch()
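Convenience overload of get_batch. The batch_size argument is not strictly necessary, since the batch size is already specified in the ChunkDatasetOptions passed to the constructor.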
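Because BatchType is a std::optional, a caller can loop until get_batch returns an empty value. The sketch below models that contract with a hypothetical `ToyStatefulDataset` written only against the standard library; it is not ChunkDataset itself, and the assumption that an empty optional signals the end of an epoch is stated here for illustration rather than taken from this page.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical toy dataset that mimics an optional-returning get_batch.
class ToyStatefulDataset {
 public:
  using UnwrappedBatchType = std::vector<int>;
  using BatchType = std::optional<UnwrappedBatchType>;

  ToyStatefulDataset(std::vector<int> examples, std::size_t default_batch_size)
      : examples_(std::move(examples)),
        default_batch_size_(default_batch_size) {}

  // Returns up to batch_size examples, or an empty optional once all
  // examples have been handed out (a zero batch size also yields empty).
  BatchType get_batch(std::size_t batch_size) {
    if (batch_size == 0 || cursor_ >= examples_.size()) {
      return std::nullopt;
    }
    const std::size_t end = std::min(cursor_ + batch_size, examples_.size());
    UnwrappedBatchType batch(
        examples_.begin() + static_cast<std::ptrdiff_t>(cursor_),
        examples_.begin() + static_cast<std::ptrdiff_t>(end));
    cursor_ = end;
    return batch;
  }

  // Overload without an argument: falls back to the configured batch size.
  BatchType get_batch() { return get_batch(default_batch_size_); }

  // Clears the internal state so that a new pass can start.
  void reset() { cursor_ = 0; }

 private:
  std::vector<int> examples_;
  std::size_t default_batch_size_;
  std::size_t cursor_ = 0;
};

// Consumes one full pass and returns how many examples were seen.
std::size_t count_examples_in_epoch(ToyStatefulDataset& dataset) {
  dataset.reset();
  std::size_t total = 0;
  while (auto batch = dataset.get_batch()) {
    total += batch->size();
  }
  return total;
}
```

The last batch of a pass may be smaller than the configured batch size, so consumer code should read the size from the batch rather than assume it.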
-
inline virtual void reset() override
Clears any internal state and starts the internal prefetching mechanism for the chunk dataset.
-
inline ChunkSamplerType &chunk_sampler()
-
inline virtual void save(serialize::OutputArchive &archive) const override
Saves the statefulDataset’s state to OutputArchive.
-
inline virtual void load(serialize::InputArchive &archive) override
Deserializes the statefulDataset’s state from the archive.