Data

class entities.data.Data(asset, df, date_start=None, date_end=None, trading_hours_start=datetime.time(0, 0), trading_hours_end=datetime.time(23, 59), timestep='minute', quote=None, timezone=None)

Bases: object

Input and manage Pandas dataframes for backtesting.

Parameters
  • asset (Asset Object) – Asset object to which this data is attached.

  • df (dataframe) – Pandas dataframe containing OHLCV trade data, typically loaded by the user from a CSV file. The index holds the dates and must be of pandas datetime64 type. Columns must be exactly [“open”, “high”, “low”, “close”, “volume”].

  • date_start (Datetime or None) – Starting date for this data. If not provided, the first date in the dataframe is used.

  • date_end (Datetime or None) – Ending date for this data. If not provided, the last date in the dataframe is used.

  • trading_hours_start (datetime.time or None) – Start of the trading hours. If not supplied, the default is 0000 hrs, i.e. datetime.time(0, 0).

  • trading_hours_end (datetime.time or None) – End of the trading hours. If not supplied, the default is 2359 hrs, i.e. datetime.time(23, 59).

  • timestep (str) – Either “minute” (default) or “day”

  • timezone (str or None) – If not None, the dataframe is localized to the given timezone, passed as a string. Any value supported by tz_localize can be used, e.g. “US/Eastern” or “UTC”.
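
As a minimal sketch of the requirements listed above, the following prepares a dataframe with a datetime64 index and the five lowercase OHLCV columns. The inline CSV text and its raw column names ("Date", "Open", and so on) are made up for illustration, and the final commented-out call only mirrors the class signature; it assumes an existing Asset object named asset.

```python
import io

import pandas as pd

# Hypothetical CSV contents; a real file would be read with pd.read_csv(path).
raw_csv = io.StringIO(
    "Date,Open,High,Low,Close,Volume\n"
    "2023-01-03 09:30:00,100.0,101.0,99.5,100.5,1200\n"
    "2023-01-03 09:31:00,100.5,100.9,100.1,100.7,800\n"
)

df = pd.read_csv(raw_csv, parse_dates=["Date"])
df.columns = [c.lower() for c in df.columns]         # lowercase column names
df = df.set_index("date")                            # datetime64 index
df = df[["open", "high", "low", "close", "volume"]]  # exactly the five columns

# Assumed usage, following the signature above (asset is an existing Asset object):
# data = Data(asset, df, timestep="minute")
```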

asset

Asset object to which this data is attached.

Type

Asset Object

symbol

The underlying or stock symbol as a string.

Type

str

df

Pandas dataframe containing OHLCV trade data, typically loaded by the user from a CSV file. The index holds the dates and must be of pandas datetime64 type. Columns must be exactly [“open”, “high”, “low”, “close”, “volume”].

Type

dataframe

date_start

Starting date for this data. If not provided, the first date in the dataframe is used.

Type

Datetime or None

date_end

Ending date for this data. If not provided, the last date in the dataframe is used.

Type

Datetime or None

trading_hours_start

Start of the trading hours. If not supplied, the default is 0000 hrs, i.e. datetime.time(0, 0).

Type

datetime.time or None

trading_hours_end

End of the trading hours. If not supplied, the default is 2359 hrs, i.e. datetime.time(23, 59).

Type

datetime.time or None

timestep

Either “minute” (default) or “day”

Type

str

datalines

Keys are column names such as datetime or close; values are numpy arrays.

Type

dict
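
The shape described for datalines can be illustrated with a small sketch. This is not the library's internal code: it assumes only what the description states (a dict keyed by column name with numpy arrays as values), and the sample prices are invented.

```python
import numpy as np
import pandas as pd

index = pd.date_range("2023-01-03 09:30", periods=3, freq="min")
df = pd.DataFrame(
    {"close": [100.5, 100.7, 100.6], "volume": [1200, 800, 950]},
    index=index,
)

# One numpy array per column, plus the datetime index under its own key.
datalines = {"datetime": df.index.to_numpy()}
for column in df.columns:
    datalines[column] = df[column].to_numpy()
```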

iter_index

The index holds datetimes and the values hold a running row count. Used to retrieve the current df iteration for this data at a given datetime.

Type

Pandas Series
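
A rough sketch of such a series, assuming only the layout described above (datetimes in the index, a running count in the values); the variable names and timestamps are illustrative.

```python
import pandas as pd

index = pd.date_range("2023-01-03 09:30", periods=4, freq="min")

# Datetime in the index, running row count in the values.
iter_index = pd.Series(range(len(index)), index=index)

dt = pd.Timestamp("2023-01-03 09:32")
row = int(iter_index.loc[dt])  # row position of the bar stamped 09:32
```

A lookup like this turns a datetime into a row position, which can then be used to slice numpy arrays directly.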

set_times()

Sets the start and end time for the data.

repair_times_and_fill()

After all time series have been merged, reindex the local dataframe and fill NaNs.

columns()

Adjust date and column names to lower case.

set_date_format()

Ensure the datetime is in local datetime64 format.

set_dates()

Set start and end dates.

trim_data()

Trim the dataframe to match the desired backtesting dates.

to_datalines()

Create numpy datalines from existing date index and columns.

get_iter_count()

Returns the current index number (row count) for a given date.

check_data(wrapper)

Validates whether the provided date, length, timeshift, and timestep will return data. Runs the wrapped function if data is available; returns None otherwise.

get_last_price()

Gets the last price from the current date.

_get_bars_dict()

Returns bars in the form of a dict.

get_bars()

Returns bars in the form of a dataframe.
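
The check_data(wrapper) entry above describes a guard that runs a function only when data is available and returns None otherwise. The general decorator pattern can be sketched as follows. The ToyData class and its has_data method are hypothetical stand-ins, not part of the library.

```python
import functools


def check_data(func):
    """Run func only when the (hypothetical) has_data check passes."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if not self.has_data(*args, **kwargs):
            return None  # no data for this request
        return func(self, *args, **kwargs)

    return wrapper


class ToyData:
    """Stand-in class for illustration; not the library's Data class."""

    def __init__(self, prices):
        self.prices = prices  # dict: date string -> price

    def has_data(self, dt):
        return dt in self.prices

    @check_data
    def get_last_price(self, dt):
        return self.prices[dt]


toy = ToyData({"2023-01-03": 100.5})
```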

MIN_TIMESTEP = 'minute'
TIMESTEP_MAPPING = [{'timestep': 'day', 'representations': ['1D', 'day']}, {'timestep': 'minute', 'representations': ['1M', 'minute']}]
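
One plausible use of TIMESTEP_MAPPING is to normalize a user-supplied representation to its canonical timestep. The helper below is a hypothetical sketch, not a documented method. Note that, per the mapping shown, “1M” resolves to “minute”.

```python
TIMESTEP_MAPPING = [
    {"timestep": "day", "representations": ["1D", "day"]},
    {"timestep": "minute", "representations": ["1M", "minute"]},
]


def parse_timestep(value):
    """Return the canonical timestep for a representation (hypothetical helper)."""
    for entry in TIMESTEP_MAPPING:
        if value in entry["representations"]:
            return entry["timestep"]
    raise ValueError(f"Unsupported timestep: {value!r}")
```
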
check_data()
columns(df)
get_bars(dt, length=1, timestep='minute', timeshift=0, exchange=None)

Returns the requested bars as a dataframe.

Parameters
  • dt (datetime.datetime) – The datetime for which to get the data.

  • length (int) – The number of periods of data to get.

  • timestep (str) – The frequency of the data to get. Only “minute” and “day” are supported.

  • timeshift (int) – The number of periods by which to shift the data.

Returns

Return type

pandas.DataFrame
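
How length and timeshift could interact is sketched below on a toy dataframe. This is an assumption-laden illustration, not the library's implementation: the helper name bars_before is hypothetical, and the sketch assumes the window ends just before the row at dt and that timeshift moves the window further back in time.

```python
import pandas as pd

index = pd.date_range("2023-01-03 09:30", periods=5, freq="min")
df = pd.DataFrame({"close": [100.0, 100.5, 100.7, 100.6, 101.0]}, index=index)


def bars_before(df, dt, length=1, timeshift=0):
    """Return `length` rows ending `timeshift` rows before the row at `dt`.

    Assumption: the row at `dt` itself is excluded from the window.
    """
    end = max(df.index.get_loc(dt) - timeshift, 0)
    start = max(end - length, 0)
    return df.iloc[start:end]


dt = pd.Timestamp("2023-01-03 09:33")
bars = bars_before(df, dt, length=2)
shifted = bars_before(df, dt, length=2, timeshift=1)
```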

get_iter_count(dt)
get_last_price(*args, **kwargs)
is_tradable(*args, **kwargs)
repair_times_and_fill(idx)
set_date_format(df)
set_dates(date_start, date_end)
set_times(trading_hours_start, trading_hours_end)

Set the start and end times for the data. The default is 0000 hrs to 2359 hrs.

Parameters
  • trading_hours_start (datetime.time) – The start time of the trading hours.

  • trading_hours_end (datetime.time) – The end time of the trading hours.

Returns

  • trading_hours_start (datetime.time) – The start time of the trading hours.

  • trading_hours_end (datetime.time) – The end time of the trading hours.
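
The defaulting behaviour can be sketched with the standard library alone. Both helpers below are hypothetical; only the default values (datetime.time(0, 0) and datetime.time(23, 59)) are taken from the class signature.

```python
import datetime


def resolve_trading_hours(start=None, end=None):
    """Fill in the signature defaults when a bound is not supplied."""
    if start is None:
        start = datetime.time(0, 0)
    if end is None:
        end = datetime.time(23, 59)
    return start, end


def within_hours(ts, start, end):
    """True if the timestamp's time of day lies inside the trading hours."""
    return start <= ts.time() <= end


start, end = resolve_trading_hours(datetime.time(9, 30), datetime.time(16, 0))
```

For example, a 10:00 timestamp falls inside the 09:30 to 16:00 window above, while an 08:00 timestamp does not.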

to_datalines()
trim_data(df, date_start, date_end, trading_hours_start, trading_hours_end)
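
Trimming a dataframe to a date range and to trading hours can be sketched with plain pandas. This is an illustration under assumptions, not the library's trim_data: the helper name trim and the sample data are hypothetical, and both bounds are treated as inclusive.

```python
import datetime

import pandas as pd

# Three days of hourly rows as toy data.
index = pd.date_range("2023-01-02 00:00", "2023-01-04 23:00", freq="60min")
df = pd.DataFrame({"close": range(len(index))}, index=index)


def trim(df, date_start, date_end, hours_start, hours_end):
    """Keep rows inside the date range, then inside the trading hours."""
    trimmed = df.loc[date_start:date_end]
    return trimmed.between_time(hours_start, hours_end)


trimmed = trim(
    df,
    pd.Timestamp("2023-01-03 00:00"),
    pd.Timestamp("2023-01-03 23:59"),
    datetime.time(9, 0),
    datetime.time(16, 0),
)
```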