Data
- class lumibot.entities.data.Data(asset, df, date_start=None, date_end=None, trading_hours_start=datetime.time(0, 0), trading_hours_end=datetime.time(23, 59), timestep='minute', quote=None, timezone=None)
Bases:
object
Input and manage Pandas dataframes for backtesting.
- Parameters:
asset (Asset Object) – Asset to which this data is attached.
df (dataframe) – Pandas dataframe containing OHLCV etc. trade data. Loaded by user from csv. Index is date and must be pandas datetime64. Columns are strictly [“open”, “high”, “low”, “close”, “volume”]
quote (Asset Object) – The quote asset for this data. If not provided, then the quote asset will default to USD.
date_start (Datetime or None) – Starting date for this data. If not provided, the first date in the dataframe is used.
date_end (Datetime or None) – Ending date for this data. If not provided, the last date in the dataframe is used.
trading_hours_start (datetime.time or None) – If not supplied, the default is 0000 hrs, i.e. datetime.time(0, 0) as shown in the signature.
trading_hours_end (datetime.time or None) – If not supplied, then default is 2359 hrs.
timestep (str) – Either “minute” (default) or “day”
timezone (str or None) – If not None, localize the dataframe to the given timezone, passed as a string. Any value supported by tz_localize is accepted, e.g. “US/Eastern”, “UTC”, etc.
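As a minimal sketch of the input the parameters above describe, the following builds a small minute-level dataframe with a datetime64 index and the five lowercase OHLCV columns. The prices and the SPY symbol are made up for illustration. The commented-out constructor call assumes Lumibot is installed and that Asset can be imported from lumibot.entities; it is shown for orientation only.

```python
import pandas as pd

# Five one-minute bars; the index is a pandas DatetimeIndex (datetime64).
index = pd.date_range("2023-01-03 09:30", periods=5, freq="min")

# Column names are exactly the five lowercase OHLCV names the class expects.
df = pd.DataFrame(
    {
        "open": [100.0, 100.4, 100.9, 100.7, 101.1],
        "high": [100.5, 101.0, 101.2, 101.3, 101.6],
        "low": [99.8, 100.2, 100.6, 100.5, 100.9],
        "close": [100.4, 100.9, 100.7, 101.1, 101.5],
        "volume": [1200, 900, 1500, 1100, 1300],
    },
    index=index,
)

# Assumed usage with Lumibot installed (not executed here):
# from lumibot.entities import Asset
# from lumibot.entities.data import Data
# data = Data(Asset(symbol="SPY", asset_type="stock"), df, timestep="minute")
```

In practice the dataframe would usually be loaded from a CSV file rather than typed in by hand.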
- asset
Asset object to which this data is attached.
- Type:
Asset Object
- symbol
The underlying or stock symbol as a string.
- Type:
str
- df
Pandas dataframe containing OHLCV etc. trade data. Loaded by user from csv. Index is date and must be pandas datetime64. Columns are strictly [“open”, “high”, “low”, “close”, “volume”]
- Type:
dataframe
- date_start
Starting date for this data. If not provided, the first date in the dataframe is used.
- Type:
Datetime or None
- date_end
Ending date for this data. If not provided, the last date in the dataframe is used.
- Type:
Datetime or None
- trading_hours_start
If not supplied, the default is 0000 hrs, i.e. datetime.time(0, 0) as shown in the signature.
- Type:
datetime.time or None
- trading_hours_end
If not supplied, then default is 2359 hrs.
- Type:
datetime.time or None
- timestep
Either “minute” (default) or “day”
- Type:
str
- datalines
Keys are column names like datetime or close, values are numpy arrays.
- Type:
dict
- iter_index
Datetime in the index, range count in values. Used to retrieve the current df iteration for this data and datetime.
- Type:
Pandas Series
- set_times()
Sets the start and end time for the data.
- repair_times_and_fill()
After all time series are merged, reindex the local dataframe and fill NaN values.
- columns()
Adjust date and column names to lower case.
- set_date_format()
Ensure the datetime is in local datetime64 format.
- set_dates()
Set start and end dates.
- trim_data()
Trim the dataframe to match the desired backtesting dates.
- to_datalines()
Create numpy datalines from existing date index and columns.
- get_iter_count()
Returns the current index number (len) given a date.
- check_data(wrapper)
Validates whether the provided date, length, timeshift, and timestep will return data. Runs the wrapped function if data is available; returns None otherwise.
- get_last_price()
Gets the last price from the current date.
- _get_bars_dict()
Returns bars in the form of a dict.
- get_bars()
Returns bars in the form of a dataframe.
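The check_data(wrapper) entry above describes a guard pattern: validate first, then either run the wrapped method or return None. The sketch below shows that pattern on a hypothetical TinyData class, which is not the Lumibot implementation. For brevity it only checks that the requested datetime falls within the loaded range, whereas the real check is described as also considering length, timeshift, and timestep.

```python
import functools
from datetime import datetime


def check_data(func):
    """Guard: call func only when dt falls inside the loaded data range."""

    @functools.wraps(func)
    def checker(self, dt, *args, **kwargs):
        if not self.timestamps:
            return None  # no data loaded at all
        if dt < self.timestamps[0] or dt > self.timestamps[-1]:
            return None  # requested datetime is outside the data
        return func(self, dt, *args, **kwargs)

    return checker


class TinyData:
    """Hypothetical stand-in holding sorted timestamps and close prices."""

    def __init__(self, timestamps, closes):
        self.timestamps = timestamps
        self.closes = closes

    @check_data
    def get_last_price(self, dt):
        # Close of the latest bar at or before dt.
        latest = max(ts for ts in self.timestamps if ts <= dt)
        return self.closes[self.timestamps.index(latest)]


data = TinyData(
    [datetime(2023, 1, 3, 9, 30), datetime(2023, 1, 3, 9, 31)],
    [100.0, 101.0],
)
in_range = data.get_last_price(datetime(2023, 1, 3, 9, 31))
out_of_range = data.get_last_price(datetime(2022, 1, 1))
```

Returning None rather than raising lets a caller treat "no data for this asset at this time" as an ordinary case.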
- MIN_TIMESTEP = 'minute'
- TIMESTEP_MAPPING = [{'representations': ['1D', 'day'], 'timestep': 'day'}, {'representations': ['1M', 'minute'], 'timestep': 'minute'}]
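To illustrate how a mapping of this shape can be used, the sketch below resolves a user-supplied representation to its canonical timestep. The helper name normalize_timestep is hypothetical and is not part of the documented API; only the mapping contents come from the listing above. Note that in this mapping “1M” stands for one minute, not one month.

```python
# Copied from the class attribute listed above.
TIMESTEP_MAPPING = [
    {"timestep": "day", "representations": ["1D", "day"]},
    {"timestep": "minute", "representations": ["1M", "minute"]},
]


def normalize_timestep(representation):
    """Hypothetical helper: map a representation such as '1D' to 'day'."""
    for entry in TIMESTEP_MAPPING:
        if representation in entry["representations"]:
            return entry["timestep"]
    raise ValueError(f"Unsupported timestep: {representation!r}")
```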
- check_data()
- columns(df)
- get_bars(dt, length=1, timestep='minute', timeshift=0)
Returns a dataframe of the data.
- Parameters:
dt (datetime.datetime) – The datetime at which to get the data.
length (int) – The number of periods of data to get.
timestep (str) – The frequency of the data to get. Only minute and day are supported.
timeshift (int) – The number of periods by which to shift the data.
- Return type:
pandas.DataFrame
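One plausible reading of how dt, length, and timeshift interact is sketched below with plain lists instead of a dataframe. The function bar_window is hypothetical. It assumes the window ends at the latest bar at or before dt, is moved back by timeshift bars, and contains at most length bars. The source does not spell out these semantics, so treat this as an illustration rather than the library's behavior.

```python
import bisect
from datetime import datetime, timedelta


def bar_window(timestamps, dt, length=1, timeshift=0):
    """Hypothetical: positions of up to `length` bars ending `timeshift`
    bars before the latest bar at or before `dt`."""
    # Number of bars at or before dt, moved back by timeshift bars.
    end = bisect.bisect_right(timestamps, dt) - timeshift
    start = max(end - length, 0)
    return list(range(start, max(end, 0)))


start_time = datetime(2023, 1, 3, 9, 30)
timestamps = [start_time + timedelta(minutes=i) for i in range(10)]
dt = start_time + timedelta(minutes=5)

last_three = bar_window(timestamps, dt, length=3)
shifted = bar_window(timestamps, dt, length=3, timeshift=2)
```

Under these assumptions, last_three covers positions 3 to 5 and shifted covers positions 1 to 3.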
- get_bars_between_dates(timestep='minute', exchange=None, start_date=None, end_date=None)
Returns a dataframe of all the data available between the start and end dates.
- Parameters:
timestep (str) – The frequency of the data to get. Only minute and day are supported.
exchange (str) – The exchange to get the data for.
start_date (datetime.datetime) – The start date to get the data for.
end_date (datetime.datetime) – The end date to get the data for.
- Return type:
pandas.DataFrame
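The date-range selection can be sketched without pandas as a simple filter. The function bars_between is hypothetical. It assumes both bounds are inclusive and that a bound of None leaves that side open; the source does not state either point.

```python
from datetime import datetime, timedelta


def bars_between(bars, start_date=None, end_date=None):
    """Hypothetical: keep (timestamp, close) pairs within the given bounds.

    Bounds are treated as inclusive; None means no bound on that side.
    """
    return [
        (ts, close)
        for ts, close in bars
        if (start_date is None or ts >= start_date)
        and (end_date is None or ts <= end_date)
    ]


first_day = datetime(2023, 1, 2)
# Five daily bars with made-up closing prices.
bars = [(first_day + timedelta(days=i), 100.0 + i) for i in range(5)]

selected = bars_between(
    bars,
    start_date=datetime(2023, 1, 3),
    end_date=datetime(2023, 1, 5),
)
```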
- get_iter_count(dt)
- get_last_price(*args, **kwargs)
- get_quote(*args, **kwargs)
- repair_times_and_fill(idx)
- set_date_format(df)
- set_dates(date_start, date_end)
- set_times(trading_hours_start, trading_hours_end)
Set the start and end times for the data. Per the class signature, the default is 0000 hrs to 2359 hrs.
- Parameters:
trading_hours_start (datetime.time) – The start time of the trading hours.
trading_hours_end (datetime.time) – The end time of the trading hours.
- Returns:
trading_hours_start (datetime.time) – The start time of the trading hours.
trading_hours_end (datetime.time) – The end time of the trading hours.
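The effect of a trading-hours window can be sketched as a time-of-day filter. The function within_trading_hours is hypothetical, and the inclusive comparison on both ends is an assumption. Its defaults mirror the constructor signature, datetime.time(0, 0) and datetime.time(23, 59).

```python
from datetime import datetime, time, timedelta


def within_trading_hours(
    timestamps,
    trading_hours_start=time(0, 0),
    trading_hours_end=time(23, 59),
):
    """Hypothetical: keep timestamps whose time of day is inside the window."""
    return [
        ts
        for ts in timestamps
        if trading_hours_start <= ts.time() <= trading_hours_end
    ]


# Hourly timestamps from 08:00 to 17:00 on one day.
day_start = datetime(2023, 1, 3, 8, 0)
timestamps = [day_start + timedelta(hours=i) for i in range(10)]

session = within_trading_hours(timestamps, time(9, 30), time(16, 0))
```

With the default window every timestamp in this example is kept, which matches the idea that the defaults cover essentially the whole day.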
- to_datalines()
- trim_data(df, date_start, date_end, trading_hours_start, trading_hours_end)