Data#
- class entities.data.Data(asset, df, date_start=None, date_end=None, trading_hours_start=datetime.time(0, 0), trading_hours_end=datetime.time(23, 59), timestep='minute', quote=None, timezone=None)#
Bases:
object
Input and manage Pandas dataframes for backtesting.
- Parameters:
asset (Asset Object) – Asset to which this data is attached.
df (dataframe) – Pandas dataframe containing OHLCV etc. trade data. Loaded by user from csv. Index is date and must be pandas datetime64. Columns are strictly [“open”, “high”, “low”, “close”, “volume”]
quote (Asset Object) – The quote asset for this data. If not provided, then the quote asset will default to USD.
date_start (Datetime or None) – Starting date for this data, if not provided then first date in the dataframe.
date_end (Datetime or None) – Ending date for this data, if not provided then last date in the dataframe.
trading_hours_start (datetime.time or None) – Start of the trading hours. If not supplied, the default is datetime.time(0, 0), i.e. 0000 hrs.
trading_hours_end (datetime.time or None) – If not supplied, then default is 2359 hrs.
timestep (str) – Either “minute” (default) or “day”
timezone (str or None) – If not None, localize the dataframe index to the given timezone, passed as a string. The value can be any timezone supported by tz_localize, e.g. “US/Eastern”, “UTC”, etc.
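The dataframe requirements above (datetime64 index, lower-case OHLCV columns) can be met with plain pandas before the frame is handed to Data. The following is a minimal sketch under assumptions: the raw values and the "Date" column name are made up for illustration, and the final constructor call is left as a comment because it needs an Asset object created elsewhere.

```python
import pandas as pd

# Hypothetical raw data, standing in for a CSV loaded by the user.
raw = pd.DataFrame(
    {
        "Date": ["2023-01-03 09:30", "2023-01-03 09:31", "2023-01-03 09:32"],
        "Open": [100.0, 100.5, 101.0],
        "High": [100.8, 101.2, 101.5],
        "Low": [99.9, 100.4, 100.7],
        "Close": [100.5, 101.0, 101.2],
        "Volume": [1200, 900, 1500],
    }
)

# Lower-case the column names and move the dates into a datetime64 index.
df = raw.rename(columns=str.lower)
df["date"] = pd.to_datetime(df["date"])
df = df.set_index("date")[["open", "high", "low", "close", "volume"]]

# Optional: attach a timezone to the naive index with tz_localize,
# which is the kind of value the timezone parameter accepts.
df_localized = df.tz_localize("UTC")

# With an Asset object available, the frame would then be passed along
# the lines of: Data(asset, df, timestep="minute")
```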
- asset#
Asset object to which this data is attached.
- Type:
Asset Object
- symbol#
The underlying or stock symbol as a string.
- Type:
str
- df#
Pandas dataframe containing OHLCV etc. trade data. Loaded by user from csv. Index is date and must be pandas datetime64. Columns are strictly [“open”, “high”, “low”, “close”, “volume”]
- Type:
dataframe
- date_start#
Starting date for this data, if not provided then first date in the dataframe.
- Type:
Datetime or None
- date_end#
Ending date for this data, if not provided then last date in the dataframe.
- Type:
Datetime or None
- trading_hours_start#
Start of the trading hours. If not supplied, the default is datetime.time(0, 0), i.e. 0000 hrs.
- Type:
datetime.time or None
- trading_hours_end#
If not supplied, then default is 2359 hrs.
- Type:
datetime.time or None
- timestep#
Either “minute” (default) or “day”
- Type:
str
- datalines#
Keys are column names like datetime or close, values are numpy arrays.
- Type:
dict
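As an illustration of the shape described for datalines, the sketch below builds a dict of numpy arrays from a small dataframe. It is an assumption about the general idea, not the class's actual implementation; the sample values and the "datetime" key for the index are illustrative.

```python
import numpy as np
import pandas as pd

# Small illustrative frame with a datetime64 index.
df = pd.DataFrame(
    {"close": [100.5, 101.0, 101.2], "volume": [1200, 900, 1500]},
    index=pd.date_range("2023-01-03 09:30", periods=3, freq="min"),
)

# One numpy array per column, plus one for the index itself.
datalines = {"datetime": df.index.to_numpy()}
for column in df.columns:
    datalines[column] = df[column].to_numpy()
```

Keeping each column as a flat array makes positional lookups cheap compared with repeated dataframe indexing.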
- iter_index#
Datetime in the index, range count in values. Used to retrieve the current df iteration for this data and datetime.
- Type:
Pandas Series
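A Series of this shape (datetimes in the index, a running count in the values) can be sketched as follows. The lookup with asof is one possible way to map a datetime to a row position and is an assumption here, not necessarily what the class does internally.

```python
import pandas as pd

# Datetimes in the index, range count in the values.
index = pd.date_range("2023-01-03 09:30", periods=4, freq="min")
iter_index = pd.Series(range(len(index)), index=index)

# Exact lookup for a datetime that is present in the index.
position = iter_index.loc[pd.Timestamp("2023-01-03 09:32")]

# asof returns the value of the last row at or before the given datetime,
# which also handles datetimes that fall between two bars.
position_between = iter_index.asof(pd.Timestamp("2023-01-03 09:32:30"))
```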
- set_times()#
Sets the start and end time for the data.
- repair_times_and_fill()#
After all time series are merged, reindex the local dataframe and fill NaNs.
- columns()#
Adjust date and column names to lower case.
- set_date_format()#
Ensure datetime in local datetime64 format.
- set_dates()#
Set start and end dates.
- trim_data()#
Trim the dataframe to match the desired backtesting dates.
- to_datalines()#
Create numpy datalines from existing date index and columns.
- get_iter_count()#
Returns the current iteration index (row count) for a given date.
- check_data(wrapper)#
Validates that the provided date, length, timeshift, and timestep will return data. Runs the wrapped function if data is available; returns None otherwise.
- get_last_price()#
Gets the last price from the current date.
- _get_bars_dict()#
Returns bars in the form of a dict.
- get_bars()#
Returns bars in the form of a dataframe.
- MIN_TIMESTEP = 'minute'#
- TIMESTEP_MAPPING = [{'representations': ['1D', 'day'], 'timestep': 'day'}, {'representations': ['1M', 'minute'], 'timestep': 'minute'}]#
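A mapping of this form lends itself to normalizing user-supplied timestep strings. The helper below, normalize_timestep, is hypothetical and only shows how the structure might be consumed; the mapping itself is copied from the attribute above.

```python
TIMESTEP_MAPPING = [
    {"representations": ["1D", "day"], "timestep": "day"},
    {"representations": ["1M", "minute"], "timestep": "minute"},
]


def normalize_timestep(value):
    """Hypothetical helper: map any known representation to its timestep."""
    for entry in TIMESTEP_MAPPING:
        if value in entry["representations"]:
            return entry["timestep"]
    raise ValueError(f"Unsupported timestep: {value!r}")
```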
- check_data()#
- columns(df)#
- get_bars(dt, length=1, timestep='minute', timeshift=0)#
Returns a dataframe of the data.
- Parameters:
dt (datetime.datetime) – The datetime for which to get the data.
length (int) – The number of periods (bars) to return.
timestep (str) – The frequency of the bars to return. Only “minute” and “day” are supported.
timeshift (int) – The number of periods by which to shift the data.
- Return type:
pandas.DataFrame
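To make the roles of length and timeshift concrete, here is a minimal sketch under stated assumptions: bars are taken to end at the last row at or before dt, and timeshift is taken to move that window back by whole rows. The helper bars_up_to is hypothetical and the exact semantics of get_bars may differ.

```python
import pandas as pd


def bars_up_to(df, dt, length=1, timeshift=0):
    """Hypothetical helper: last `length` rows ending at or before dt."""
    # Position just past the last bar whose timestamp is <= dt.
    end = int(df.index.searchsorted(dt, side="right")) - timeshift
    if end <= 0:
        return None  # no data available for this request
    start = max(end - length, 0)
    return df.iloc[start:end]


df = pd.DataFrame(
    {"close": [1.0, 2.0, 3.0, 4.0, 5.0]},
    index=pd.date_range("2023-01-03 09:30", periods=5, freq="min"),
)

dt = pd.Timestamp("2023-01-03 09:33")
latest = bars_up_to(df, dt, length=2)
shifted = bars_up_to(df, dt, length=2, timeshift=1)
```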
- get_bars_between_dates(timestep='minute', exchange=None, start_date=None, end_date=None)#
Returns a dataframe of all the data available between the start and end dates.
- Parameters:
timestep (str) – The frequency of the bars to return. Only “minute” and “day” are supported.
exchange (str) – The exchange to get the data for.
start_date (datetime.datetime) – The start of the date range.
end_date (datetime.datetime) – The end of the date range.
- Return type:
pandas.DataFrame
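Selecting rows between two dates on a datetime-indexed dataframe can be done with label slicing in pandas. The helper select_between is hypothetical and assumes both endpoints are inclusive, which is how .loc slicing behaves; it is not a statement about how get_bars_between_dates treats its bounds.

```python
import datetime

import pandas as pd


def select_between(df, start_date=None, end_date=None):
    """Hypothetical helper: rows from start_date to end_date, inclusive."""
    # A None endpoint leaves that side of the slice open.
    return df.loc[start_date:end_date]


df = pd.DataFrame(
    {"close": [10.0, 11.0, 12.0, 13.0, 14.0]},
    index=pd.date_range("2023-01-02", periods=5, freq="D"),
)

window = select_between(
    df,
    start_date=datetime.datetime(2023, 1, 3),
    end_date=datetime.datetime(2023, 1, 5),
)
head = select_between(df, end_date=datetime.datetime(2023, 1, 3))
```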
- get_iter_count(dt)#
- get_last_price(*args, **kwargs)#
- get_quote(*args, **kwargs)#
- repair_times_and_fill(idx)#
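The reindex-and-fill step described for repair_times_and_fill can be sketched with pandas as follows. The fill strategy here (forward-fill prices, zero volume for missing bars) is an assumption chosen for illustration; the method's actual fill rules are not stated in this reference.

```python
import pandas as pd

# Frame with a gap: the 09:31 and 09:33 bars are missing.
df = pd.DataFrame(
    {"close": [100.0, 101.0], "volume": [500, 700]},
    index=pd.to_datetime(["2023-01-03 09:30", "2023-01-03 09:32"]),
)

# Target index, e.g. the union of all merged time series.
idx = pd.date_range("2023-01-03 09:30", periods=4, freq="min")

# Reindex introduces NaN rows for the missing timestamps.
filled = df.reindex(idx)

# Assumed fill rules: carry the last price forward, treat volume as zero.
filled["close"] = filled["close"].ffill()
filled["volume"] = filled["volume"].fillna(0)
```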
- set_date_format(df)#
- set_dates(date_start, date_end)#
- set_times(trading_hours_start, trading_hours_end)#
Set the start and end times for the data. The default is 0000 hrs to 2359 hrs.
- Parameters:
trading_hours_start (datetime.time) – The start time of the trading hours.
trading_hours_end (datetime.time) – The end time of the trading hours.
- Returns:
trading_hours_start (datetime.time) – The start time of the trading hours.
trading_hours_end (datetime.time) – The end time of the trading hours.
- to_datalines()#
- trim_data(df, date_start, date_end, trading_hours_start, trading_hours_end)#
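Trimming by both date range and trading hours can be expressed with two pandas operations: label slicing for the dates and between_time for the hours. The function trim below is a hypothetical stand-in that mirrors the parameter names of trim_data; it is a sketch of the idea, not the method's implementation.

```python
import datetime

import pandas as pd


def trim(df, date_start, date_end, trading_hours_start, trading_hours_end):
    """Hypothetical stand-in: keep rows inside the dates and trading hours."""
    trimmed = df.loc[date_start:date_end]
    # between_time keeps rows whose time of day falls within the window.
    return trimmed.between_time(trading_hours_start, trading_hours_end)


# Minute bars from 09:28 to 09:33.
df = pd.DataFrame(
    {"close": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]},
    index=pd.date_range("2023-01-03 09:28", periods=6, freq="min"),
)

trimmed = trim(
    df,
    datetime.datetime(2023, 1, 3, 9, 29),
    datetime.datetime(2023, 1, 3, 9, 33),
    datetime.time(9, 30),
    datetime.time(9, 32),
)
```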