Turi Create  4.0
turi::general_ifstream Class Reference

#include <core/storage/fileio/general_fstream.hpp>

Public Member Functions

 general_ifstream (std::string filename)
 
 general_ifstream (std::string filename, bool gzip_compressed)
 
size_t file_size ()
 
size_t get_bytes_read ()
 
std::string filename () const
 
std::shared_ptr< std::istream > get_underlying_stream ()
 

Detailed Description

A generic input file stream interface that provides unified access to the local filesystem, HDFS, S3, and in-memory files, and can automatically perform gzip decoding.

Usage:

general_ifstream fin("file");
// after which fin behaves like a regular std::ifstream object.

file can be:

  • local filesystem
  • S3 (filename must be of the form s3://...; see below)
  • HDFS (filename must be of the form hdfs://...)
  • In memory / disk paged (filename must be of the form cache://...)
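The prefix rules in the list above can be sketched with the standard library alone. This is only an illustration of the naming convention: `PathKind`, `starts_with` and `classify_path` are hypothetical names, not part of Turi Create, and the real class may resolve paths differently.

```cpp
#include <string>

// Hypothetical helper types, not part of Turi Create.
enum class PathKind { Local, S3, Hdfs, Cache };

// True if `s` begins with `prefix`.
inline bool starts_with(const std::string& s, const std::string& prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Classifies a filename by the protocol prefixes listed above.
inline PathKind classify_path(const std::string& path) {
  if (starts_with(path, "s3://")) return PathKind::S3;
  if (starts_with(path, "hdfs://")) return PathKind::Hdfs;
  if (starts_with(path, "cache://")) return PathKind::Cache;
  return PathKind::Local;  // no recognized prefix: assume a local path
}
```

For example, `classify_path("hdfs://namenode/data.csv")` yields `PathKind::Hdfs`, while a plain `data.csv` falls through to `PathKind::Local`.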

In all filesystems, random seek is allowed.

If the file is gzip compressed, it will automatically be decoded on the fly, but random seeks will be disabled.

S3 access keys are mediated by having the filename be of the form s3://[access_key_id]:[secret_key]:[endpoint/][bucket]/[object_name]

Endpoint URLs, however, are set globally via the global variable S3_ENDPOINT.
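As a rough sketch of the S3 filename form described above, the helper below assembles a path with the optional endpoint component omitted. `make_s3_path` is a hypothetical name introduced for illustration, and the credential values in the usage note are placeholders; how the library itself parses the string is not shown here.

```cpp
#include <string>

// Hypothetical helper, not part of Turi Create: builds a filename of the form
// s3://[access_key_id]:[secret_key]:[bucket]/[object_name]
// (the optional endpoint component is left out).
inline std::string make_s3_path(const std::string& access_key_id,
                                const std::string& secret_key,
                                const std::string& bucket,
                                const std::string& object_name) {
  return "s3://" + access_key_id + ":" + secret_key + ":" + bucket + "/" +
         object_name;
}
```

With placeholder credentials, `make_s3_path("AKID", "SECRET", "my-bucket", "data/part0.csv")` produces `s3://AKID:SECRET:my-bucket/data/part0.csv`, which could then be passed to a `general_ifstream` constructor.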

Definition at line 48 of file general_fstream.hpp.

Constructor & Destructor Documentation

◆ general_ifstream() [1/2]

turi::general_ifstream::general_ifstream ( std::string  filename)

Constructs a general ifstream object that opens the specified file. The file may be on HDFS and may be gzip compressed. If the file is gzip compressed, its name must have the ".gz" suffix to be identified as such.

Throws a std::ios_base::failure exception if the stream cannot be constructed.
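The suffix rule for this constructor can be illustrated with a small standard-library check. `has_gz_suffix` is a hypothetical helper, not a Turi Create function; it only mirrors the ".gz" convention stated above and says nothing about how the library performs its own detection.

```cpp
#include <string>

// Hypothetical helper, not part of Turi Create: true if the filename ends
// with ".gz", the suffix the one-argument constructor relies on.
inline bool has_gz_suffix(const std::string& filename) {
  const std::string suffix = ".gz";
  return filename.size() >= suffix.size() &&
         filename.compare(filename.size() - suffix.size(), suffix.size(),
                          suffix) == 0;
}
```

A compressed file whose name fails this check (for example `archive.dat`) would need the two-argument constructor with `gzip_compressed` set to `true`.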

◆ general_ifstream() [2/2]

turi::general_ifstream::general_ifstream ( std::string  filename,
bool  gzip_compressed 
)

Constructs a general ifstream object that opens the specified file. The file may be on HDFS and may be gzip compressed. This overloaded constructor lets you state explicitly whether the file is gzip compressed, regardless of the filename.

Throws a std::ios_base::failure exception if the stream cannot be constructed.
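One way to handle the construction failure is to catch the exception and report it to the caller. The sketch below assumes the exception is the standard `std::ios_base::failure` and is written as a template so that it does not depend on the Turi headers; `try_open` is a hypothetical name, and `Stream` stands for any type with a `(std::string, bool)` constructor such as `turi::general_ifstream`.

```cpp
#include <ios>
#include <memory>
#include <string>

// Hypothetical helper, not part of Turi Create. Tries to construct a Stream
// from (filename, gzip_compressed). On std::ios_base::failure it stores the
// message in `error` and returns a null pointer.
template <class Stream>
std::unique_ptr<Stream> try_open(const std::string& filename,
                                 bool gzip_compressed,
                                 std::string& error) {
  try {
    return std::unique_ptr<Stream>(new Stream(filename, gzip_compressed));
  } catch (const std::ios_base::failure& e) {
    error = e.what();
    return nullptr;
  }
}
```

With the Turi header included, a call might look like `auto fin = try_open<turi::general_ifstream>("part0.dat", true, err);`, after which a null `fin` indicates that the file could not be opened.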

Member Function Documentation

◆ file_size()

size_t turi::general_ifstream::file_size ( )

Returns the file size of the opened file. Returns (size_t)(-1) if there is no file opened, or if there is an error obtaining the file size.
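Because the error value is the sentinel `(size_t)(-1)` rather than an exception, callers need to test for it before using the result. A minimal sketch, written as a template so it works with any type exposing `file_size()`; `try_get_file_size` is a hypothetical name, not a Turi Create function.

```cpp
#include <cstddef>

// Hypothetical helper, not part of Turi Create. Writes the size to `size_out`
// and returns true, or returns false if file_size() reported the
// (size_t)(-1) sentinel (no file opened, or an error obtaining the size).
template <class Stream>
bool try_get_file_size(Stream& stream, std::size_t& size_out) {
  const std::size_t size = stream.file_size();
  if (size == static_cast<std::size_t>(-1)) return false;
  size_out = size;
  return true;
}
```

Checking the sentinel explicitly avoids treating the largest `size_t` value as a real file size, for example when sizing a buffer.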

◆ filename()

std::string turi::general_ifstream::filename ( ) const

Returns the local file name used by the stream.

◆ get_bytes_read()

size_t turi::general_ifstream::get_bytes_read ( )

Returns the number of bytes read from disk so far. Due to file compression and buffering this can be very different from how many bytes were read from the stream.
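Since this count refers to bytes read from disk, one plausible use is a progress estimate against `file_size()`. The sketch below is an assumption about usage rather than documented behavior; `read_progress` is a hypothetical name, and the template accepts any type with `file_size()` and `get_bytes_read()`.

```cpp
#include <cstddef>

// Hypothetical helper, not part of Turi Create. Returns the fraction of the
// file consumed from disk, or -1.0 if the size is unknown or zero.
template <class Stream>
double read_progress(Stream& stream) {
  const std::size_t total = stream.file_size();
  if (total == static_cast<std::size_t>(-1) || total == 0) return -1.0;
  return static_cast<double>(stream.get_bytes_read()) /
         static_cast<double>(total);
}
```

For a gzip-compressed file this tracks compressed bytes consumed, so the fraction reflects progress through the file on disk, not through the decoded data.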

◆ get_underlying_stream()

std::shared_ptr<std::istream> turi::general_ifstream::get_underlying_stream ( )

Returns the underlying stream object.


The documentation for this class was generated from the following file: