turicreate.SArray.split_datetime

SArray.split_datetime(column_name_prefix='X', limit=None, timezone=False)

Splits an SArray of datetime type into multiple columns and returns a new SFrame that contains the expanded columns. By default, an SArray of datetime is split into an SFrame of 6 columns, one for each of the year, month, day, hour, minute, and second elements.

Column Naming

When splitting an SArray of datetime type, new columns are named prefix.year, prefix.month, etc. The prefix is set by the parameter “column_name_prefix” and defaults to ‘X’. If column_name_prefix is None or empty, no prefix is used.
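The naming rule above can be sketched in plain Python. The helper expanded_column_names below is hypothetical and is not part of turicreate; it only mirrors the rule as described here, and it assumes columns follow the order given in limit.

```python
# Hypothetical helper illustrating the naming rule; not a turicreate function.
DEFAULT_ELEMENTS = ['year', 'month', 'day', 'hour', 'minute', 'second']

def expanded_column_names(column_name_prefix='X', limit=None):
    """Return the column names the naming rule described above would produce."""
    # Assumption: when limit is given, its order is preserved.
    elements = list(DEFAULT_ELEMENTS) if limit is None else list(limit)
    if not column_name_prefix:
        # None or an empty string: element names are used without a prefix.
        return elements
    return [column_name_prefix + '.' + element for element in elements]

print(expanded_column_names('Y', ['year']))         # ['Y.year']
print(expanded_column_names(None, ['day', 'year'])) # ['day', 'year']
```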

Timezone Column

If the timezone parameter is True, timezone information is represented as one additional column: a float giving the offset in hours from GMT (UTC), for example 0.0 for GMT itself and 4.5 for a zone 4 hours 30 minutes ahead.
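As a rough illustration of what a float offset means, the sketch below computes the offset from UTC in hours using only the Python standard library. The function name utc_offset_hours is an assumption made for this example; it is not how turicreate computes the timezone column.

```python
from datetime import datetime, timedelta, timezone

def utc_offset_hours(dt):
    """Offset of a timezone-aware datetime from UTC, in hours, as a float."""
    offset = dt.utcoffset()
    if offset is None:
        # Naive datetimes carry no timezone information.
        return None
    return offset.total_seconds() / 3600.0

at_utc = datetime(2011, 1, 21, 7, 7, 21, tzinfo=timezone.utc)
at_plus_0430 = datetime(2010, 2, 5, 7, 8, 21,
                        tzinfo=timezone(timedelta(hours=4, minutes=30)))

print(utc_offset_hours(at_utc))        # 0.0
print(utc_offset_hours(at_plus_0430))  # 4.5
```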

Parameters:
column_name_prefix: str, optional

If provided, expanded column names start with the given prefix. Defaults to “X”.

limit: list[str], optional

Limits the set of datetime elements to expand. Possible values are ‘year’, ‘month’, ‘day’, ‘hour’, ‘minute’, ‘second’, ‘weekday’, ‘isoweekday’, ‘tmweekday’, and ‘us’. If not provided, only [‘year’, ‘month’, ‘day’, ‘hour’, ‘minute’, ‘second’] are expanded.

  • ‘year’: The year number
  • ‘month’: A value between 1 and 12 where 1 is January.
  • ‘day’: Day of the month. Begins at 1.
  • ‘hour’: Hours since midnight.
  • ‘minute’: Minutes after the hour.
  • ‘second’: Seconds after the minute.
  • ‘us’: Microseconds after the second. Between 0 and 999,999.
  • ‘weekday’: A value between 0 and 6 where 0 is Monday.
  • ‘isoweekday’: A value between 1 and 7 where 1 is Monday.
  • ‘tmweekday’: A value between 0 and 6 where 0 is Sunday.
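The three weekday conventions differ only in where the week starts and whether counting begins at 0 or 1. The sketch below reproduces them with the standard library, assuming ‘weekday’ and ‘isoweekday’ match Python's datetime.weekday() and datetime.isoweekday(), and that ‘tmweekday’ follows the C tm_wday convention (Sunday is 0, Saturday is 6). The helper weekday_elements is hypothetical and not part of turicreate.

```python
from datetime import datetime

def weekday_elements(dt):
    """Return the three weekday encodings for a datetime (illustrative only)."""
    return {
        'weekday': dt.weekday(),            # 0 = Monday ... 6 = Sunday
        'isoweekday': dt.isoweekday(),      # 1 = Monday ... 7 = Sunday
        'tmweekday': dt.isoweekday() % 7,   # 0 = Sunday ... 6 = Saturday (assumed)
    }

friday = datetime(2011, 1, 21)
sunday = datetime(2011, 1, 23)

print(weekday_elements(friday))  # {'weekday': 4, 'isoweekday': 5, 'tmweekday': 5}
print(weekday_elements(sunday))  # {'weekday': 6, 'isoweekday': 7, 'tmweekday': 0}
```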
timezone: bool, optional

Determines whether the timezone column is included in the result. Defaults to False.

Returns:
out : SFrame

A new SFrame that contains all expanded columns

Examples

To expand only the day and year elements of a datetime SArray:

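>>> from datetime import datetime
>>> from turicreate import SArray
>>> from turicreate.util.timezone import GMT
>>> sa = SArray(
...    [datetime(2011, 1, 21, 7, 7, 21, tzinfo=GMT(0)),
...     datetime(2010, 2, 5, 7, 8, 21, tzinfo=GMT(4.5))])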
>>> sa.split_datetime(column_name_prefix=None,limit=['day','year'])
   Columns:
       day   int
       year  int
   Rows: 2
   Data:
   +-------+--------+
   |  day  |  year  |
   +-------+--------+
   |   21  |  2011  |
   |   5   |  2010  |
   +-------+--------+
   [2 rows x 2 columns]

To expand only the year and timezone elements of a datetime SArray, with the timezone column represented as a float. Columns are named with the prefix ‘Y’, as in ‘Y.column_name’:

>>> sa.split_datetime(column_name_prefix="Y",limit=['year'],timezone=True)
    Columns:
        Y.year      int
        Y.timezone  float
    Rows: 2
    Data:
    +--------+------------+
    | Y.year | Y.timezone |
    +--------+------------+
    |  2011  |    0.0     |
    |  2010  |    4.5     |
    +--------+------------+
    [2 rows x 2 columns]