DataBento Backtesting
*********************

DataBento is a premium financial data provider that offers high-quality, clean market data for backtesting. Lumibot integrates with DataBento to provide reliable historical data for stocks, futures, options, and other instruments.

Overview
========

DataBento provides:

- **High-quality historical data** with minimal gaps or errors
- **Multiple timeframes** from tick-level to daily data
- **Extensive instrument coverage** including stocks, futures, and options
- **Clean data processing** with corporate action adjustments
- **API-based access** for automated data retrieval

Setting Up DataBento
====================

1. **Get DataBento API Key**
   
   Visit `DataBento <https://databento.com>`_ to sign up and get your API key.

2. **Install Dependencies**
   
   DataBento support is included with Lumibot, but you may need to install additional dependencies:

   .. code-block:: bash

       pip install databento

3. **Configure API Key**
   
   Set your DataBento API key in your environment or strategy:

   .. code-block:: python

       import os
       os.environ['DATABENTO_API_KEY'] = 'your_api_key_here'

   Or create a ``.env`` file:

   .. code-block:: bash

       DATABENTO_API_KEY=your_api_key_here

Basic Usage
===========

Here's how to use DataBento for backtesting:

.. code-block:: python

    from lumibot.strategies import Strategy
    from lumibot.entities import Asset
    from lumibot.backtesting import DataBentoDataBacktesting

    class MyStrategy(Strategy):
        def initialize(self):
            # Use continuous futures for clean backtesting
            self.asset = Asset("MES", asset_type=Asset.AssetType.CONT_FUTURE)
        
        def on_trading_iteration(self):
            # Get historical data
            bars = self.get_historical_prices(self.asset, 20, "minute")
            if bars and not bars.df.empty:
                # Your strategy logic here
                pass

    # Run backtest with DataBento
    if __name__ == "__main__":
        results = MyStrategy.backtest(
            DataBentoDataBacktesting,
            benchmark_asset=Asset("SPY", Asset.AssetType.STOCK)
        )

Supported Assets
================

DataBento supports a wide range of instruments:

**Stocks**

.. code-block:: python

    # Major stocks
    aapl = Asset("AAPL", asset_type=Asset.AssetType.STOCK)
    msft = Asset("MSFT", asset_type=Asset.AssetType.STOCK)
    googl = Asset("GOOGL", asset_type=Asset.AssetType.STOCK)

**Futures**

.. code-block:: python

    # Equity index futures (continuous)
    es = Asset("ES", asset_type=Asset.AssetType.CONT_FUTURE)  # S&P 500
    nq = Asset("NQ", asset_type=Asset.AssetType.CONT_FUTURE)  # NASDAQ 100
    rty = Asset("RTY", asset_type=Asset.AssetType.CONT_FUTURE)  # Russell 2000
    
    # Micro futures
    mes = Asset("MES", asset_type=Asset.AssetType.CONT_FUTURE)  # Micro S&P 500
    mnq = Asset("MNQ", asset_type=Asset.AssetType.CONT_FUTURE)  # Micro NASDAQ 100
    m2k = Asset("M2K", asset_type=Asset.AssetType.CONT_FUTURE)  # Micro Russell 2000
    
    # Commodity futures
    cl = Asset("CL", asset_type=Asset.AssetType.CONT_FUTURE)   # Crude Oil
    gc = Asset("GC", asset_type=Asset.AssetType.CONT_FUTURE)   # Gold
    ng = Asset("NG", asset_type=Asset.AssetType.CONT_FUTURE)   # Natural Gas

Futures-Specific Features
--------------------------

When backtesting futures with DataBento, Lumibot provides several specialized features:

**Automatic Multiplier Detection:**

Futures contract multipliers are automatically fetched from DataBento's definition schema:

.. code-block:: python

    # MES multiplier is automatically detected as 5
    mes = Asset("MES", asset_type=Asset.AssetType.CONT_FUTURE)

    # When you trade MES:
    # - 1 point move = $5 P&L per contract
    # - 10 contracts at +2 points = +$100 total P&L

Lumibot fetches contract specifications from DataBento including:

- Contract multiplier (e.g., 5 for MES, 50 for ES)
- Tick size and value
- Contract unit of measure
- Settlement type

This information is cached to avoid repeated API calls.

**Mark-to-Market Accounting:**

DataBento backtests use mark-to-market accounting that matches real futures trading:

.. code-block:: python

    # Example: Trading 1 MES contract
    # Starting capital: $100,000

    # BUY 1 MES @ $5,000
    # - Initial margin deducted: ~$1,300
    # - Cash: $98,700

    # Price moves to $5,010 (up 10 points)
    # - Mark-to-market: +10 points × $5 = +$50
    # - Cash: $98,750 (includes unrealized P&L)

    # SELL 1 MES @ $5,010
    # - Margin released: +$1,300
    # - Final P&L already in cash
    # - Cash: $100,050

Key accounting features:

1. **Entry**: Initial margin is deducted from cash (not full notional value)
2. **During Trade**: Cash is updated every iteration with unrealized P&L changes
3. **Exit**: Margin is released and final P&L settlement applied

This ensures:

- Cash always shows available buying power
- Portfolio value = Cash (includes all unrealized P&L)
- Leverage tracking is accurate
- Results match real broker accounting

For more details on futures accounting, see the :doc:`futures` documentation.

**Symbol Resolution:**

DataBento automatically handles symbol resolution for continuous futures:

.. code-block:: python

    # You specify the root symbol
    mes = Asset("MES", asset_type=Asset.AssetType.CONT_FUTURE)

    # DataBento resolves to actual contracts:
    # - For Jan 2024: MESH4 (March 2024 expiry)
    # - For Apr 2024: MESM4 (June 2024 expiry)
    # - Seamless rollover handling

This makes backtesting across multiple years seamless without managing contract expirations.

**Options** (when supported)

.. code-block:: python

    from datetime import date
    
    # Stock options
    aapl_call = Asset(
        symbol="AAPL",
        asset_type=Asset.AssetType.OPTION,
        expiration=date(2025, 12, 19),
        strike=150,
        right="CALL"
    )

Time Frames
===========

DataBento supports multiple timeframes:

.. code-block:: python

    class DataStrategy(Strategy):
        def on_trading_iteration(self):
            # Different timeframes
            minute_data = self.get_historical_prices(self.asset, 100, "minute")
            hour_data = self.get_historical_prices(self.asset, 24, "hour") 
            daily_data = self.get_historical_prices(self.asset, 30, "day")
            
            # Use the data for analysis
            if minute_data and not minute_data.df.empty:
                # High-frequency analysis
                latest_price = minute_data.df['close'].iloc[-1]

Advanced Configuration
========================

You can configure DataBento backtesting with additional parameters:

.. code-block:: python

    from datetime import datetime
    from lumibot.backtesting import DataBentoDataBacktesting

    # Custom backtest configuration
    backtest_start = datetime(2024, 1, 1)
    backtest_end = datetime(2024, 12, 31)

    results = MyStrategy.backtest(
        DataBentoDataBacktesting,
        start=backtest_start,
        end=backtest_end,
        benchmark_asset=Asset("SPY", Asset.AssetType.STOCK),
        show_plot=True,
        show_tearsheet=True,
        save_tearsheet=True
    )

Data Quality Features
========================

DataBento provides several data quality features:

**Corporate Actions**
- Automatic dividend adjustments
- Stock split adjustments
- Merger and acquisition handling

**Data Cleaning**
- Outlier detection and removal
- Gap filling for missing data
- Timestamp normalization

**Market Hours**
- Proper market hour filtering
- Pre-market and after-hours data
- Holiday schedule handling

Caching
=======

Lumibot automatically caches DataBento data to improve performance:

.. code-block:: python

    # Data is automatically cached locally
    # Subsequent requests for the same data will be faster
    bars = self.get_historical_prices(asset, 100, "minute")

Cache files are stored in the Lumibot cache directory and are automatically managed.

Best Practices
==============

1. **Use Continuous Futures**
   
   For futures backtesting, always use continuous contracts for seamless data across expiration rollovers.

2. **Batch Data Requests**
   
   Request larger chunks of data rather than making many small requests.

3. **Monitor API Limits**
   
   DataBento has API rate limits. Avoid excessive requests in short time periods.

4. **Cache Management**
   
   Let Lumibot handle caching automatically. Clear cache only when needed.

5. **Data Validation**
   
   Always check that data is available before using it in your strategy.

Example: Multi-Asset Strategy
==============================

Here's a complete example using multiple assets with DataBento:

.. code-block:: python

    from lumibot.strategies import Strategy
    from lumibot.entities import Asset, Order
    from lumibot.backtesting import DataBentoDataBacktesting
    import pandas as pd

    class MultiAssetStrategy(Strategy):
        def initialize(self):
            # Portfolio of futures contracts
            self.assets = [
                Asset("MES", asset_type=Asset.AssetType.CONT_FUTURE),  # Micro S&P 500
                Asset("MNQ", asset_type=Asset.AssetType.CONT_FUTURE),  # Micro NASDAQ 100
                Asset("M2K", asset_type=Asset.AssetType.CONT_FUTURE),  # Micro Russell 2000
            ]
            self.lookback_period = 20
            
        def on_trading_iteration(self):
            for asset in self.assets:
                # Get data for each asset
                bars = self.get_historical_prices(asset, self.lookback_period, "day")
                
                if bars and len(bars.df) >= self.lookback_period:
                    # Calculate momentum
                    returns = bars.df['close'].pct_change().dropna()
                    momentum = returns.tail(5).mean()  # 5-day average return
                    
                    position = self.get_position(asset)
                    
                    # Long momentum strategy
                    if momentum > 0.001:  # Positive momentum threshold
                        if position is None or position.quantity <= 0:
                            order = self.create_order(asset, 1, "buy")
                            self.submit_order(order)
                    
                    # Short momentum strategy  
                    elif momentum < -0.001:  # Negative momentum threshold
                        if position is None or position.quantity >= 0:
                            if position and position.quantity > 0:
                                # Close long first
                                close_order = self.create_order(asset, position.quantity, "sell")
                                self.submit_order(close_order)
                            # Then go short
                            order = self.create_order(asset, 1, "sell")
                            self.submit_order(order)

    if __name__ == "__main__":
        results = MultiAssetStrategy.backtest(
            DataBentoDataBacktesting,
            benchmark_asset=Asset("SPY", Asset.AssetType.STOCK)
        )

Error Handling
==============

Handle common DataBento issues gracefully:

.. code-block:: python

    class RobustStrategy(Strategy):
        def on_trading_iteration(self):
            try:
                bars = self.get_historical_prices(self.asset, 20, "minute")
                
                if bars is None or bars.df.empty:
                    self.log_message("No data available", color="yellow")
                    return
                
                # Your strategy logic here
                
            except Exception as e:
                self.log_message(f"Data error: {e}", color="red")
                return

Performance Optimization
===========================

Tips for optimizing DataBento performance:

1. **Minimize Data Requests**
   
   Request data once and reuse it within the same iteration.

2. **Use Appropriate Timeframes**
   
   Don't request minute data if you only need daily signals.

3. **Leverage Caching**
   
   Repeated backtests will be faster due to automatic caching.

4. **Batch Processing**
   
   Process multiple assets efficiently in loops.

Troubleshooting
==================

**Common Issues:**

1. **"No DataBento API key found"**
   
   - Set the ``DATABENTO_API_KEY`` environment variable
   - Check your .env file configuration

2. **"Rate limit exceeded"**
   
   - Reduce the frequency of data requests
   - Use longer timeframes when possible
   - Add delays between requests if needed

3. **"No data available for symbol"**
   
   - Verify the symbol is correct
   - Check if DataBento supports the instrument
   - Ensure the date range is valid

4. **"Connection timeout"**
   
   - Check your internet connection
   - Verify DataBento service status
   - Retry the request

Cost Considerations
=====================

DataBento is a premium service with costs based on:

- **Data volume** (number of symbols and timeframes)
- **Historical depth** (how far back you request data)
- **API usage** (number of requests)

For cost-effective backtesting:

- Use continuous futures instead of multiple expiry contracts
- Request appropriate timeframes (don't use minute data for daily strategies)
- Leverage caching to avoid repeated requests
- Focus on the symbols you actually need

DataBento provides excellent value for professional strategy development due to its data quality and reliability.