Template Class ChunkDataset
Defined in File chunk.h
Inheritance Relationships
Base Type
public torch::data::datasets::StatefulDataset< ChunkDataset< ChunkReader, samplers::RandomSampler, samplers::RandomSampler >, ChunkReader::BatchType, size_t >
(Template Class StatefulDataset)
Class Documentation
-
template<typename ChunkReader, typename ChunkSampler = samplers::RandomSampler, typename ExampleSampler = samplers::RandomSampler>
class ChunkDataset : public torch::data::datasets::StatefulDataset<ChunkDataset<ChunkReader, samplers::RandomSampler, samplers::RandomSampler>, ChunkReader::BatchType, size_t>
A stateful dataset that supports hierarchical sampling and prefetching of entire chunks.
Unlike a regular dataset, a chunk dataset requires two samplers to operate and keeps internal state. The ChunkSampler selects which chunk to load next, while the ExampleSampler determines the order of the Examples returned by each get_batch call. The hierarchical sampling approach used here is inspired by this paper: http://martin.zinkevich.org/publications/nips2010.pdf
Public Types
-
using BatchType = std::optional<typename ChunkReader::BatchType>
-
using UnwrappedBatchType = typename ChunkReader::BatchType
-
using BatchRequestType = size_t
-
using ChunkSamplerType = ChunkSampler
-
using ExampleSamplerType = ExampleSampler
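The two sampler roles aliased above can be illustrated with a minimal, standalone sketch of two-level sampling. It does not use libtorch: `Chunks` and `hierarchical_order` are hypothetical names introduced only for this illustration, and the sketch ignores prefetching and any buffering that the real class performs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical stand-in for a chunked data source: each inner vector is one chunk.
using Chunks = std::vector<std::vector<int>>;

// Two-level (hierarchical) sampling: first shuffle the order in which chunks
// are visited, then shuffle the examples inside each visited chunk.
std::vector<int> hierarchical_order(const Chunks& chunks, unsigned seed) {
  std::mt19937 rng(seed);

  // Level 1: the "chunk sampler" role decides which chunk is loaded next.
  std::vector<std::size_t> chunk_order(chunks.size());
  std::iota(chunk_order.begin(), chunk_order.end(), std::size_t{0});
  std::shuffle(chunk_order.begin(), chunk_order.end(), rng);

  std::vector<int> result;
  std::size_t total = 0;
  for (std::size_t c : chunk_order) {
    // Level 2: the "example sampler" role orders the examples of this chunk.
    std::vector<int> examples = chunks[c];
    std::shuffle(examples.begin(), examples.end(), rng);
    result.insert(result.end(), examples.begin(), examples.end());
    total += chunks[c].size();
  }

  // Every example is emitted exactly once per pass.
  assert(result.size() == total);
  return result;
}
```

Calling `hierarchical_order(chunks, seed)` yields every example once, grouped by chunk, with both the chunk order and the in-chunk order randomized. Only one chunk needs to be in memory at a time, which is the main appeal of sampling at two levels.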
Public Functions
-
inline ChunkDataset(ChunkReader chunk_reader, ChunkSampler chunk_sampler, ExampleSampler example_sampler, ChunkDatasetOptions options, std::function<void(UnwrappedBatchType&)> preprocessing_policy = std::function<void(UnwrappedBatchType&)>())
-
inline ~ChunkDataset() override
-
inline BatchType get_batch(size_t batch_size) override
Default get_batch method of BatchDataset.
This method returns Example batches created from the preloaded chunks. The implementation is dataset-agnostic and does not need to be overridden in different chunk datasets.
-
inline BatchType get_batch()
Helper method around get_batch(size_t), since batch_size is not strictly necessary.
-
inline virtual void reset() override
Clears any internal state and starts the internal prefetching mechanism for the chunk dataset.
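The reset() and get_batch() protocol, with a batch type wrapped in std::optional, can be sketched with a toy class. `ToyStatefulDataset` and `count_batches` are hypothetical names used only here, not part of libtorch. The sketch assumes that an empty optional signals the end of an epoch, and it has no prefetching at all.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical toy dataset, not the libtorch class: it only mimics the
// reset()/get_batch() protocol with an optional batch type.
class ToyStatefulDataset {
 public:
  using BatchType = std::optional<std::vector<int>>;

  explicit ToyStatefulDataset(std::vector<int> data) : data_(std::move(data)) {}

  // Clears the internal cursor so that a new epoch starts from the beginning.
  void reset() { cursor_ = 0; }

  // Returns up to batch_size examples, or an empty optional once exhausted.
  BatchType get_batch(std::size_t batch_size) {
    if (cursor_ >= data_.size()) {
      return std::nullopt;
    }
    const std::size_t end = std::min(cursor_ + batch_size, data_.size());
    std::vector<int> batch(data_.begin() + cursor_, data_.begin() + end);
    cursor_ = end;
    return batch;
  }

 private:
  std::vector<int> data_;
  std::size_t cursor_ = 0;
};

// Drains one epoch and returns the number of batches it produced.
std::size_t count_batches(ToyStatefulDataset& dataset, std::size_t batch_size) {
  assert(batch_size > 0);  // a zero batch size would never advance the cursor
  dataset.reset();
  std::size_t batches = 0;
  while (auto batch = dataset.get_batch(batch_size)) {
    ++batches;
  }
  return batches;
}
```

The loop in `count_batches` shows the consumer side: call reset() once per epoch, then keep requesting batches until the returned optional is empty.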
-
inline virtual std::optional<size_t> size() const override
size() is not used for chunk datasets.
-
inline ChunkSamplerType &chunk_sampler()
-
inline virtual void save(serialize::OutputArchive &archive) const override
Saves the StatefulDataset’s state to the OutputArchive.
-
inline virtual void load(serialize::InputArchive &archive) override
Deserializes the StatefulDataset’s state from the archive.