Data
- class lumibot.entities.data.Data(asset, df, date_start=None, date_end=None, trading_hours_start=datetime.time(0, 0), trading_hours_end=datetime.time(23, 59), timestep='minute', quote=None, timezone=None)
- Bases: object
- Input and manage Pandas dataframes for backtesting.
- Parameters:
- asset (Asset Object) – Asset to which this data is attached. 
- df (dataframe) – Pandas dataframe containing OHLCV trade data, loaded by the user (for example from a CSV file). The index holds the dates and must be pandas datetime64. Columns are strictly [“open”, “high”, “low”, “close”, “volume”]. 
- quote (Asset Object) – The quote asset for this data. If not provided, then the quote asset will default to USD. 
- date_start (Datetime or None) – Starting date for this data. If not provided, the first date in the dataframe is used. 
- date_end (Datetime or None) – Ending date for this data. If not provided, the last date in the dataframe is used. 
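- trading_hours_start (datetime.time or None) – The start time of the trading hours. If not supplied, the default is 0000 hrs, i.e. datetime.time(0, 0). 
- trading_hours_end (datetime.time or None) – The end time of the trading hours. If not supplied, the default is 2359 hrs, i.e. datetime.time(23, 59). 
- timestep (str) – Either “minute” (default) or “day”. 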
- timezone (str or None) – If not None, the dataframe is localized to the given timezone, passed as a string. Any value supported by tz_localize is accepted, e.g. “US/Eastern”, “UTC”, etc. 
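
The requirements on df listed above (a datetime64 index and exactly the five lower-case OHLCV columns) can be met with a few pandas calls before the dataframe is handed to Data. The following is a minimal sketch, not part of lumibot itself; the sample rows and the capitalised source column names are made up for illustration, and in practice the frame would typically come from a CSV file.

```python
import pandas as pd

# Hypothetical sample rows standing in for a user-supplied CSV file.
raw = pd.DataFrame(
    {
        "Date": ["2023-01-03 09:30", "2023-01-03 09:31", "2023-01-03 09:32"],
        "Open": [100.0, 100.5, 100.4],
        "High": [100.6, 100.9, 100.8],
        "Low": [99.9, 100.3, 100.1],
        "Close": [100.5, 100.4, 100.7],
        "Volume": [1200, 950, 1100],
    }
)

# Lower-case the column names, then turn the date column into a datetime64 index.
df = raw.rename(columns=str.lower)
df["date"] = pd.to_datetime(df["date"])
df = df.set_index("date").sort_index()

# Keep exactly the five documented columns, in the documented order.
df = df[["open", "high", "low", "close", "volume"]]
```

A frame prepared this way matches the documented shape of the df parameter; the asset, timestep and remaining arguments are then supplied as described in the parameter list.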
 
 - asset
- Asset object to which this data is attached. - Type:
- Asset Object 
 
 - symbol
- The underlying or stock symbol as a string. - Type:
- str 
 
 - df
- Pandas dataframe containing OHLCV trade data, loaded by the user (for example from a CSV file). The index holds the dates and must be pandas datetime64. Columns are strictly [“open”, “high”, “low”, “close”, “volume”]. - Type:
- dataframe 
 
 - date_start
- Starting date for this data. If not provided, the first date in the dataframe is used. - Type:
- Datetime or None 
 
 - date_end
- Ending date for this data. If not provided, the last date in the dataframe is used. - Type:
- Datetime or None 
 
 - trading_hours_start
- The start time of the trading hours. If not supplied, the default is 0000 hrs, i.e. datetime.time(0, 0). - Type:
- datetime.time or None 
 
 - trading_hours_end
- The end time of the trading hours. If not supplied, the default is 2359 hrs, i.e. datetime.time(23, 59). - Type:
- datetime.time or None 
 
 - timestep
- Either “minute” (default) or “day” - Type:
- str 
 
 - datalines
- Keys are column names like datetime or close, values are numpy arrays. - Type:
- dict 
 
 - iter_index
- Series indexed by datetime whose values are a running row count. Used to look up the current dataframe row for this data at a given datetime. - Type:
- Pandas Series 
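
The relationship between datalines and iter_index can be illustrated without pandas or numpy. In the sketch below, plain lists stand in for the numpy arrays and a dict stands in for the Pandas Series; the three bars are invented sample data, and the real attributes may be organised differently internally.

```python
from datetime import datetime

# Hypothetical three-bar series (plain lists stand in for numpy arrays).
index = [
    datetime(2023, 1, 3, 9, 30),
    datetime(2023, 1, 3, 9, 31),
    datetime(2023, 1, 3, 9, 32),
]
closes = [100.5, 100.4, 100.7]

# One entry per column, keyed by column name, including the datetime column.
datalines = {"datetime": index, "close": closes}

# Datetime -> running row count, playing the role of iter_index.
iter_index = {dt: i for i, dt in enumerate(index)}

# Look up the row for a datetime, then read any column at that row.
row = iter_index[datetime(2023, 1, 3, 9, 31)]
close_at_row = datalines["close"][row]
```

The point of the pairing is that a datetime is resolved to a row number once, after which every column can be read by position.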
 
 - set_times()
- Sets the start and end time for the data. 
 - repair_times_and_fill()
- After all time series are merged, reindex the local dataframe and fill NaN values. 
 - columns()
- Adjust date and column names to lower case. 
 - set_date_format()
- Ensure the datetime index is in local datetime64 format. 
 - set_dates()
- Set start and end dates. 
 - trim_data()
- Trim the dataframe to match the desired backtesting dates. 
 - to_datalines()
- Create numpy datalines from existing date index and columns. 
 - get_iter_count()
- Returns the current index number (row count) for a given date. 
 - check_data(wrapper)
- Validates whether the provided date, length, timeshift, and timestep will return data. Runs the wrapped function if data is available; returns None otherwise. 
 - get_last_price()
- Gets the last price from the current date. 
 - _get_bars_dict()
- Returns bars in the form of a dict. 
 - get_bars()
- Returns bars in the form of a dataframe. 
 - MIN_TIMESTEP = 'minute'
 - TIMESTEP_MAPPING = [{'representations': ['1D', 'day'], 'timestep': 'day'}, {'representations': ['1M', 'minute'], 'timestep': 'minute'}]
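
TIMESTEP_MAPPING pairs each canonical timestep with the strings that may be used to refer to it. A lookup over that structure might look like the sketch below; resolve_timestep is a hypothetical helper written for illustration, not a lumibot function.

```python
# The mapping as documented on the class.
TIMESTEP_MAPPING = [
    {"representations": ["1D", "day"], "timestep": "day"},
    {"representations": ["1M", "minute"], "timestep": "minute"},
]


def resolve_timestep(representation):
    """Return the canonical timestep for a representation (hypothetical helper)."""
    for entry in TIMESTEP_MAPPING:
        if representation in entry["representations"]:
            return entry["timestep"]
    raise ValueError(f"Unsupported timestep: {representation!r}")


canonical = resolve_timestep("1D")
```

Under this mapping, “1D” and “day” both resolve to “day”, and “1M” and “minute” both resolve to “minute”; note that “1M” means one minute here, not one month.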
 - check_data()
 - columns(df)
 - get_bars(dt, length=1, timestep='minute', timeshift=0)
- Returns a dataframe of the data. - Parameters:
- dt (datetime.datetime) – The datetime for which to get the data. 
- length (int) – The number of periods of data to get. 
- timestep (str) – The frequency of the data to get. Only “minute” and “day” are supported. 
- timeshift (int) – The number of periods by which to shift the data. 
 
- Return type:
- pandas.DataFrame 
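
How dt, length and timeshift interact is easiest to see on a small example. The sketch below shows one plausible reading, under the assumption that the window ends at the last bar at or before dt, is moved back by timeshift bars, and spans at most length bars. window_bounds is a hypothetical helper, and the exact boundary handling in lumibot may differ.

```python
from bisect import bisect_right
from datetime import datetime


def window_bounds(index, dt, length=1, timeshift=0):
    """Return (start, end) slice bounds into a sorted list of bar datetimes.

    Assumption: the window ends at the last bar at or before dt, shifted
    back by timeshift bars, and covers at most length bars.
    """
    end = max(bisect_right(index, dt) - timeshift, 0)
    start = max(end - length, 0)
    return start, end


# Hypothetical minute bars.
index = [
    datetime(2023, 1, 3, 9, 30),
    datetime(2023, 1, 3, 9, 31),
    datetime(2023, 1, 3, 9, 32),
    datetime(2023, 1, 3, 9, 33),
]

start, end = window_bounds(index, datetime(2023, 1, 3, 9, 32), length=2)
bars = index[start:end]  # the 09:31 and 09:32 bars under this assumption
```

With timeshift=1 the same call would select the 09:30 and 09:31 bars instead, and a dt earlier than the first bar yields an empty window.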
 
 - get_bars_between_dates(timestep='minute', exchange=None, start_date=None, end_date=None)
- Returns a dataframe of all the data available between the start and end dates. - Parameters:
- timestep (str) – The frequency of the data to get. Only “minute” and “day” are supported. 
- exchange (str) – The exchange to get the data for. 
- start_date (datetime.datetime) – The start date to get the data for. 
- end_date (datetime.datetime) – The end date to get the data for. 
 
- Return type:
- pandas.DataFrame 
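
Selecting by date range amounts to a filter on the datetime index. The sketch below works on plain (datetime, value) pairs rather than a dataframe and assumes that both bounds are inclusive and that None leaves a side open; rows_between is a hypothetical helper and these choices are assumptions, not confirmed lumibot behaviour.

```python
from datetime import datetime


def rows_between(rows, start_date=None, end_date=None):
    """Keep (datetime, value) rows with start_date <= datetime <= end_date.

    Assumption: bounds are inclusive, and None leaves that side open.
    """
    return [
        (dt, value)
        for dt, value in rows
        if (start_date is None or dt >= start_date)
        and (end_date is None or dt <= end_date)
    ]


# Hypothetical daily closes.
rows = [
    (datetime(2023, 1, 3), 100.5),
    (datetime(2023, 1, 4), 101.2),
    (datetime(2023, 1, 5), 100.9),
]

selected = rows_between(rows, start_date=datetime(2023, 1, 4))
```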
 
 - get_iter_count(dt)
 - get_last_price(*args, **kwargs)
 - get_quote(*args, **kwargs)
 - repair_times_and_fill(idx)
 - set_date_format(df)
 - set_dates(date_start, date_end)
 - set_times(trading_hours_start, trading_hours_end)
- Set the start and end times for the data. The default is 0000 hrs to 2359 hrs. - Parameters:
- trading_hours_start (datetime.time) – The start time of the trading hours. 
- trading_hours_end (datetime.time) – The end time of the trading hours. 
 
- Returns:
- trading_hours_start (datetime.time) – The start time of the trading hours. 
- trading_hours_end (datetime.time) – The end time of the trading hours. 
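
The effect of the trading-hours pair can be sketched as a time-of-day filter over timestamps. The example below uses only the standard library and assumes both ends of the window are inclusive; within_trading_hours is a hypothetical helper for illustration, and its defaults mirror the defaults in the class signature.

```python
from datetime import datetime, time


def within_trading_hours(
    stamps,
    trading_hours_start=time(0, 0),
    trading_hours_end=time(23, 59),
):
    """Keep timestamps whose time of day lies inside the trading hours.

    Assumption: both bounds are inclusive.
    """
    return [
        stamp
        for stamp in stamps
        if trading_hours_start <= stamp.time() <= trading_hours_end
    ]


# Hypothetical timestamps around a 09:30 to 16:00 session.
stamps = [
    datetime(2023, 1, 3, 9, 0),
    datetime(2023, 1, 3, 9, 30),
    datetime(2023, 1, 3, 16, 0),
    datetime(2023, 1, 3, 16, 1),
]

session = within_trading_hours(stamps, time(9, 30), time(16, 0))
```

With the default arguments every timestamp in this sample is kept, since the default window covers the day from 0000 hrs to 2359 hrs.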
 
 
 - to_datalines()
 - trim_data(df, date_start, date_end, trading_hours_start, trading_hours_end)