Data¶
- class entities.data.Data(asset, df, date_start=None, date_end=None, trading_hours_start=datetime.time(0, 0), trading_hours_end=datetime.time(23, 59), timestep='minute', quote=None, timezone=None)¶
Bases:
object
Input and manage Pandas dataframes for backtesting.
- Parameters:
asset (Asset Object) – Asset to which this data is attached.
df (dataframe) – Pandas dataframe containing OHLCV etc. trade data, loaded by the user from a CSV file. The index holds the dates and must be pandas datetime64. Columns are strictly [“open”, “high”, “low”, “close”, “volume”].
quote (Asset Object) – The quote asset for this data. If not provided, then the quote asset will default to USD.
date_start (Datetime or None) – Starting date for this data. If not provided, the first date in the dataframe is used.
date_end (Datetime or None) – Ending date for this data. If not provided, the last date in the dataframe is used.
trading_hours_start (datetime.time or None) – Start of the trading hours. If not supplied, the default is 0000 hrs (datetime.time(0, 0)).
trading_hours_end (datetime.time or None) – End of the trading hours. If not supplied, the default is 2359 hrs (datetime.time(23, 59)).
timestep (str) – Either “minute” (default) or “day”
timezone (str or None) – If not None, localize the dataframe to the given timezone, passed as a string. Any value supported by tz_localize is accepted, e.g. “US/Eastern” or “UTC”.
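The requirements on df listed above can be met with a few lines of pandas. The following is a minimal sketch, assuming the CSV has a `datetime` column and capitalized price columns; the sample rows, the column name `datetime`, and the variable names are illustrative and not part of the original documentation. The constructor call is left as a comment because it depends on an `Asset` object from your own setup.

```python
import io

import pandas as pd

# Illustrative CSV content; real data would normally be read from a file.
csv_text = """datetime,Open,High,Low,Close,Volume
2023-01-03 09:30:00,100.0,101.0,99.5,100.5,1200
2023-01-03 09:31:00,100.5,100.9,100.1,100.7,800
2023-01-03 09:32:00,100.7,101.2,100.6,101.0,950
"""

# Parse the date column so the index ends up as pandas datetime64.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["datetime"])
df = df.set_index("datetime")

# Lower-case the column names and keep only the documented columns.
df.columns = [name.lower() for name in df.columns]
df = df[["open", "high", "low", "close", "volume"]]

# Hypothetical usage, assuming `asset` is an Asset object you created:
# data = Data(asset, df, timestep="minute")
```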
- asset¶
Asset object to which this data is attached.
- Type:
Asset Object
- symbol¶
The underlying or stock symbol as a string.
- Type:
str
- df¶
Pandas dataframe containing OHLCV etc. trade data, loaded by the user from a CSV file. The index holds the dates and must be pandas datetime64. Columns are strictly [“open”, “high”, “low”, “close”, “volume”].
- Type:
dataframe
- date_start¶
Starting date for this data. If not provided, the first date in the dataframe is used.
- Type:
Datetime or None
- date_end¶
Ending date for this data. If not provided, the last date in the dataframe is used.
- Type:
Datetime or None
- trading_hours_start¶
Start of the trading hours. If not supplied, the default is 0000 hrs (datetime.time(0, 0)).
- Type:
datetime.time or None
- trading_hours_end¶
End of the trading hours. If not supplied, the default is 2359 hrs (datetime.time(23, 59)).
- Type:
datetime.time or None
- timestep¶
Either “minute” (default) or “day”
- Type:
str
- datalines¶
Keys are column names such as datetime or close; values are numpy arrays.
- Type:
dict
- iter_index¶
Series with datetimes in the index and a running row count in the values. Used to look up the current row position in df for a given datetime.
- Type:
Pandas Series
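The shapes of datalines and iter_index described above can be illustrated with a small dataframe. This is only a sketch of the described data structures under the stated descriptions, not the class's internal code; the sample values and variable names are assumptions.

```python
import pandas as pd

# Small illustrative dataframe with a minute-level datetime index.
idx = pd.date_range("2023-01-03 09:30", periods=3, freq="min")
df = pd.DataFrame(
    {"close": [100.5, 100.7, 101.0], "volume": [1200, 800, 950]},
    index=idx,
)

# datalines-style dict: column name -> numpy array, plus the datetime index.
datalines = {"datetime": df.index.to_numpy()}
for column in df.columns:
    datalines[column] = df[column].to_numpy()

# iter_index-style Series: datetimes in the index, row count in the values.
iter_index = pd.Series(range(len(df)), index=df.index)

# Look up the row position for a datetime, then read from the numpy arrays.
row = iter_index[pd.Timestamp("2023-01-03 09:31")]
close_at_row = datalines["close"][row]
```

Keeping the columns as plain numpy arrays and resolving a datetime to an integer position first means each lookup during a backtest is an array index rather than a dataframe query.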
- set_times()¶
Sets the start and end time for the data.
- repair_times_and_fill()¶
After all time series are merged, reindex the local dataframe and fill NaNs.
- columns()¶
Adjust date and column names to lower case.
- set_date_format()¶
Ensure the datetime is in local datetime64 format.
- set_dates()¶
Set start and end dates.
- trim_data()¶
Trim the dataframe to match the desired backtesting dates.
- to_datalines()¶
Create numpy datalines from existing date index and columns.
- get_iter_count()¶
Returns the current index number (row count) for a given date.
- check_data(wrapper)¶
Validates whether the provided date, length, timeshift, and timestep will return data. Runs the wrapped function if data is available; returns None otherwise.
- get_last_price()¶
Gets the last price from the current date.
- _get_bars_dict()¶
Returns bars in the form of a dict.
- get_bars()¶
Returns bars in the form of a dataframe.
- MIN_TIMESTEP = 'minute'¶
- TIMESTEP_MAPPING = [{'representations': ['1D', 'day'], 'timestep': 'day'}, {'representations': ['1M', 'minute'], 'timestep': 'minute'}]¶
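TIMESTEP_MAPPING pairs each canonical timestep with the representations that refer to it. A minimal sketch of how such a mapping can be searched is shown here; `resolve_timestep` is a hypothetical helper written for illustration and is not a method of this class.

```python
# Copy of the documented mapping, for a self-contained example.
TIMESTEP_MAPPING = [
    {"representations": ["1D", "day"], "timestep": "day"},
    {"representations": ["1M", "minute"], "timestep": "minute"},
]


def resolve_timestep(representation):
    """Hypothetical helper: map a representation to its canonical timestep."""
    for entry in TIMESTEP_MAPPING:
        if representation in entry["representations"]:
            return entry["timestep"]
    raise ValueError(f"Unsupported timestep representation: {representation!r}")


daily = resolve_timestep("1D")
minutely = resolve_timestep("minute")
```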
- check_data()¶
- columns(df)¶
- get_bars(dt, length=1, timestep='minute', timeshift=0)¶
Returns a dataframe of the data.
- Parameters:
dt (datetime.datetime) – The datetime to get the data for.
length (int) – The number of periods of data to get.
timestep (str) – The frequency of the data to get. Only “minute” and “day” are supported.
timeshift (int) – The number of periods to shift the data.
- Return type:
pandas.DataFrame
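How length and timeshift interact is easiest to see on a toy series. The sketch below shows one plausible reading, assuming the window ends at the latest bar at or before dt and that timeshift moves the window back by that many bars; whether the real method includes the bar at dt itself is not stated here, so treat `select_bars` and its behavior as hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

# Five illustrative one-minute bars (timestamps and closing prices).
start = datetime(2023, 1, 3, 9, 30)
timestamps = [start + timedelta(minutes=i) for i in range(5)]
closes = [100.0, 100.5, 100.7, 101.0, 101.2]


def select_bars(dt, length=1, timeshift=0):
    """Hypothetical helper: the last `length` bars up to dt, shifted back."""
    # Position just past the latest bar at or before dt, moved by timeshift.
    end = bisect_right(timestamps, dt) - timeshift
    end = min(max(end, 0), len(timestamps))
    begin = max(end - length, 0)
    return list(zip(timestamps[begin:end], closes[begin:end]))


now = datetime(2023, 1, 3, 9, 33)
latest_two = select_bars(now, length=2)
shifted_two = select_bars(now, length=2, timeshift=1)
```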
- get_bars_between_dates(timestep='minute', exchange=None, start_date=None, end_date=None)¶
Returns a dataframe of all the data available between the start and end dates.
- Parameters:
timestep (str) – The frequency of the data to get. Only “minute” and “day” are supported.
exchange (str) – The exchange to get the data for.
start_date (datetime.datetime) – The start date to get the data for.
end_date (datetime.datetime) – The end date to get the data for.
- Return type:
pandas.DataFrame
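Selecting all rows between two dates can be sketched with plain pandas label slicing. This is an illustration of the idea rather than the method's implementation; it assumes both endpoints are inclusive, which is how `DataFrame.loc` slices a sorted datetime index, and the sample data is made up.

```python
import pandas as pd

# Five illustrative daily bars.
idx = pd.date_range("2023-01-03", periods=5, freq="D")
df = pd.DataFrame({"close": [10.0, 10.5, 10.2, 10.8, 11.0]}, index=idx)

start_date = pd.Timestamp("2023-01-04")
end_date = pd.Timestamp("2023-01-06")

# Label-based slicing on a sorted DatetimeIndex includes both endpoints.
subset = df.loc[start_date:end_date]
```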
- get_iter_count(dt)¶
- get_last_price(*args, **kwargs)¶
- get_quote(*args, **kwargs)¶
- repair_times_and_fill(idx)¶
- set_date_format(df)¶
- set_dates(date_start, date_end)¶
- set_times(trading_hours_start, trading_hours_end)¶
Set the start and end times for the data. The default is 0000 hrs to 2359 hrs.
- Parameters:
trading_hours_start (datetime.time) – The start time of the trading hours.
trading_hours_end (datetime.time) – The end time of the trading hours.
- Returns:
trading_hours_start (datetime.time) – The start time of the trading hours.
trading_hours_end (datetime.time) – The end time of the trading hours.
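The trading-hours window can be illustrated with the standard library alone. The sketch below falls back to the defaults shown in the class signature, datetime.time(0, 0) and datetime.time(23, 59), and checks whether a timestamp lies inside the window; both helper functions are hypothetical and only approximate what the description suggests.

```python
import datetime


def resolve_trading_hours(trading_hours_start=None, trading_hours_end=None):
    """Hypothetical helper: apply the signature defaults when a value is None."""
    if trading_hours_start is None:
        trading_hours_start = datetime.time(0, 0)
    if trading_hours_end is None:
        trading_hours_end = datetime.time(23, 59)
    return trading_hours_start, trading_hours_end


def within_trading_hours(moment, trading_hours_start, trading_hours_end):
    """Hypothetical helper: True if the time of day is inside the window."""
    return trading_hours_start <= moment.time() <= trading_hours_end


default_hours = resolve_trading_hours()
market_hours = resolve_trading_hours(datetime.time(9, 30), datetime.time(16, 0))

inside = within_trading_hours(datetime.datetime(2023, 1, 3, 9, 45), *market_hours)
outside = within_trading_hours(datetime.datetime(2023, 1, 3, 17, 0), *market_hours)
```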
- to_datalines()¶
- trim_data(df, date_start, date_end, trading_hours_start, trading_hours_end)¶