turicreate.config.set_runtime_config

turicreate.config.set_runtime_config(name, value)

Configures system behavior at runtime. These configuration values are also read from environment variables at program startup if available. See turicreate.config.get_runtime_config() to get the current values for each variable.

Note that defaults may change across versions and the names of performance tuning constants may also change as improved algorithms are developed and implemented.

Parameters:
name : string

A string naming a runtime configuration variable.

value

The value to set the variable to.

Raises:
RuntimeError

If the key does not exist, or if the value cannot be changed to the requested value.
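
The error behavior above can be sketched as follows. This is a minimal illustration, not part of Turi Create: the helper name `apply_runtime_config` is hypothetical, and it takes the setter as an argument so that `turicreate.config.set_runtime_config` can be passed in. It relies only on the documented behavior that a `RuntimeError` is raised for an unknown key or a value that cannot be applied.

```python
def apply_runtime_config(settings, set_config):
    """Apply each (name, value) pair and collect the ones that fail.

    `set_config` is assumed to behave like
    turicreate.config.set_runtime_config: it accepts (name, value)
    and raises RuntimeError when the key is unknown or the value
    cannot be applied.
    Returns a dict mapping each failed name to its error message.
    """
    failures = {}
    for name, value in settings.items():
        try:
            set_config(name, value)
        except RuntimeError as exc:
            # Keep going so one bad key does not block the others.
            failures[name] = str(exc)
    return failures
```

With Turi Create installed, a call would look like `apply_runtime_config({"TURI_NUM_GPUS": 0}, turicreate.config.set_runtime_config)`; an empty result means every setting was accepted.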

Notes

The following section documents all the Turi Create environment variables that can be configured.

Basic Configuration Variables

  • TURI_NUM_GPUS: Number of GPUs to use when applicable. Set to 0 to force CPU use in all situations.
  • TURI_CACHE_FILE_LOCATIONS: The directory in which intermediate SFrames/SArrays are stored, for instance “/var/tmp”. Multiple directories can be specified, separated by colons (e.g. “/var/tmp:/tmp”), in which case intermediate SFrames will be striped across all of them (useful when the directories are on different disks). Defaults to /var/tmp if the directory exists, /tmp otherwise.
  • TURI_FILEIO_MAXIMUM_CACHE_CAPACITY: The maximum amount of memory which will be occupied by all intermediate SFrames/SArrays. Once this limit is exceeded, SFrames/SArrays will be flushed out to temporary storage (as specified by TURI_CACHE_FILE_LOCATIONS). On large systems, increasing this as well as TURI_FILEIO_MAXIMUM_CACHE_CAPACITY_PER_FILE can improve performance significantly. Defaults to 2147483648 bytes (2GB).
  • TURI_FILEIO_MAXIMUM_CACHE_CAPACITY_PER_FILE: The maximum amount of memory which will be occupied by any individual intermediate SFrame/SArray. Once this limit is exceeded, the SFrame/SArray will be flushed out to temporary storage (as specified by TURI_CACHE_FILE_LOCATIONS). On large systems, increasing this as well as TURI_FILEIO_MAXIMUM_CACHE_CAPACITY can improve performance significantly for large SFrames. Defaults to 134217728 bytes (128MB).
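
As a worked example of the values above, the sketch below builds the three cache-related settings as environment-variable strings. The helper name `cache_env` is hypothetical; the variable names, the colon separator, and the byte counts come from the list above. Since these values are read from the environment at program startup, the assumption here is that they would have to be exported before Turi Create is loaded.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

def cache_env(directories, total_bytes, per_file_bytes):
    """Return the cache settings as a dict of environment-variable strings."""
    return {
        # Multiple cache directories are joined with a colon.
        "TURI_CACHE_FILE_LOCATIONS": ":".join(directories),
        # Both capacities are expressed in bytes.
        "TURI_FILEIO_MAXIMUM_CACHE_CAPACITY": str(total_bytes),
        "TURI_FILEIO_MAXIMUM_CACHE_CAPACITY_PER_FILE": str(per_file_bytes),
    }

# Reproduces the documented defaults: 2 GB total, 128 MB per file.
env = cache_env(["/var/tmp", "/tmp"], 2 * GIB, 128 * MIB)

# To apply at startup (assumption: before Turi Create is imported):
#   import os; os.environ.update(env)
```

The same byte counts can instead be passed to `set_runtime_config` at runtime, one variable at a time.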

S3 Configuration

  • TURI_S3_ENDPOINT: The S3 Endpoint to connect to. If not specified AWS S3 is assumed.
  • TURI_S3_REGION: The S3 region to connect to. If this environment variable is not set, AWS_DEFAULT_REGION is used if available. Otherwise, the region is inferred by looking up the endpoint URL in the commonly used URL-to-region mappings in the Turi Create codebase. If no URL matches, no region information is set and AWS makes a best guess for the region.
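
The fallback order described for TURI_S3_REGION can be sketched as below. This is an illustration of the documented order only, not Turi Create's implementation: the function `resolve_s3_region`, the `url_to_region` argument, and the substring matching rule are all assumptions made for the example, and the real mapping table lives in the Turi Create codebase.

```python
def resolve_s3_region(environ, endpoint_url=None, url_to_region=None):
    """Return the region picked by the documented fallback order, or None.

    environ       -- a mapping such as os.environ
    endpoint_url  -- the S3 endpoint in use, if any
    url_to_region -- hypothetical mapping of URL fragments to region names
    """
    # 1. An explicit TURI_S3_REGION wins.
    region = environ.get("TURI_S3_REGION")
    if region:
        return region
    # 2. Otherwise fall back to AWS_DEFAULT_REGION.
    region = environ.get("AWS_DEFAULT_REGION")
    if region:
        return region
    # 3. Otherwise try to infer the region from the endpoint URL.
    #    (Substring matching is an assumption of this sketch.)
    if endpoint_url and url_to_region:
        for fragment, mapped in url_to_region.items():
            if fragment in endpoint_url:
                return mapped
    # 4. Nothing matched: no region is set and AWS makes a best guess.
    return None
```

Calling `resolve_s3_region(os.environ, endpoint_url)` would show which of the four cases applies to the current environment.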

SSL Configuration

  • TURI_FILEIO_ALTERNATIVE_SSL_CERT_FILE: The location of an SSL certificate file used to validate HTTPS / S3 connections. Defaults to the certificates from the Python certifi package.
  • TURI_FILEIO_ALTERNATIVE_SSL_CERT_DIR: The location of an SSL certificate directory used to validate HTTPS / S3 connections. Defaults to the operating system certificates.
  • TURI_FILEIO_INSECURE_SSL_CERTIFICATE_CHECKS: If set to a non-zero value, disables all SSL certificate validation. Defaults to False.

Sort Performance Configuration

  • TURI_SFRAME_SORT_PIVOT_ESTIMATION_SAMPLE_SIZE: The number of random rows to sample from the SFrame to estimate the sort pivots used to partition the sort. Defaults to 2000000.
  • TURI_SFRAME_SORT_BUFFER_SIZE: The maximum estimated amount of memory a sort is allowed to use. Increasing this increases the size of each sort partition and improves performance at the cost of higher memory consumption. Defaults to 2GB.

Join Performance Configuration

  • TURI_SFRAME_JOIN_BUFFER_NUM_CELLS: The maximum number of cells to buffer in memory. Increasing this increases the size of each join partition and improves performance at the cost of higher memory consumption. If you have very large cells (very long strings, for instance), decreasing this value will help reduce memory consumption. Defaults to 52428800.
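
To see why large cells call for a smaller value, the back-of-the-envelope sketch below multiplies the cell count by an average cell size. Both helper functions are hypothetical, and the estimate is an assumption of this example: it ignores per-cell overhead and says nothing about how Turi Create accounts for memory internally.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

def estimated_buffer_bytes(num_cells, avg_cell_bytes):
    """Rough estimate of buffered join data: cell count times average cell size."""
    return num_cells * avg_cell_bytes

def cells_for_budget(budget_bytes, avg_cell_bytes):
    """Largest cell count whose rough estimate stays within the budget."""
    return budget_bytes // avg_cell_bytes

DEFAULT_CELLS = 52428800  # documented default, equal to 50 * 1024 * 1024

# With 8-byte numeric cells, the default buffers roughly 400 MiB.
numeric_estimate = estimated_buffer_bytes(DEFAULT_CELLS, 8)

# With 1 KiB strings, the same cell count corresponds to roughly 50 GiB.
string_estimate = estimated_buffer_bytes(DEFAULT_CELLS, 1024)

# Cell count that keeps 1 KiB cells within a 4 GiB budget.
smaller_setting = cells_for_budget(4 * GIB, 1024)
```

Under these assumptions, `smaller_setting` is the kind of value one might pass for TURI_SFRAME_JOIN_BUFFER_NUM_CELLS when joining on tables with long strings.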

Groupby Aggregate Performance Configuration

  • TURI_SFRAME_GROUPBY_BUFFER_NUM_ROWS: The number of groupby keys cached in memory. Increasing this improves performance at the cost of higher memory consumption. Defaults to 1048576.

Advanced Configuration Variables

  • TURI_SFRAME_FILE_HANDLE_POOL_SIZE: The maximum number of file handles to use when reading SFrames/SArrays. Once this limit is exceeded, file handles will be recycled, reducing performance. Most SFrame/SArray operations rarely approach this limit. Large SGraphs, however, may create a large number of SFrames, in which case increasing this limit may improve performance (you may also need to increase the system file handle limit with “ulimit -n”). Defaults to 128.
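
Before raising the pool size, it can help to compare it with the per-process file handle limit that “ulimit -n” reports. The sketch below uses Python's standard `resource` module, which is available on Unix-like systems only; the helper name `file_handle_headroom` is hypothetical, and treating the remaining headroom as the relevant figure is an assumption of this example.

```python
import resource

def file_handle_headroom(pool_size):
    """Return how many file handles remain under the soft limit
    after reserving `pool_size`, or None if the limit is unlimited.

    A negative result means the pool size exceeds the soft limit.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return None
    return soft - pool_size

# Headroom left with the documented default pool size of 128.
headroom = file_handle_headroom(128)
```

A small or negative headroom suggests raising the system limit before increasing TURI_SFRAME_FILE_HANDLE_POOL_SIZE.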