Forecasting (Beta)

Overview

Welcome to the GroveStreams Forecasting Help Page! This guide provides detailed information on how to use GroveStreams forecasting and correlation detection tools to analyze and predict trends in your time-series data. By integrating with the Darts library, GroveStreams offers a robust suite of forecasting models accessible through an intuitive Model Forecasting Builder Wizard and a Correlation Wizard.

Why Integrate with Darts?

GroveStreams leverages Darts, a powerful Python library, to provide advanced forecasting capabilities. This integration enables users to utilize state-of-the-art models without requiring deep programming knowledge, making forecasting accessible and efficient.

Supported Forecasting Models

GroveStreams supports the following Darts models (Model details below):

TFTModel (Temporal Fusion Transformer - Deep Learning)
NBEATSModel (Deep Learning)
ARIMA
Prophet
TCNModel (Temporal Convolutional Network - Deep Learning)
TransformerModel (Deep Learning)
ExponentialSmoothing
RNNModel (Recurrent Neural Network - Deep Learning)

Forecasting Model Builder

GroveStreams has a model builder, along with wizards, to assist with training, forecasting and analyzing results. The simplest way to create a model is to use the Model Forecasting Wizard.

Creating a Forecast Model

Select the Tools tab, right-click Forecasting Models (or a model folder), and choose New - Model from Forecasting Wizard.

The first step of the wizard is to select the stream to use for training.

The wizard will ask you questions to help determine which Darts model should be used for forecasting. It will then prompt for more information. Follow the wizard steps and then click Finish. The model will be trained and a forecast stream will be created within the same component as the target stream.

The new model will appear under Tools - Forecast Models. Right-click it and choose Edit to fine-tune the new model. The Forecast Model Builder will appear.

The Forecast Model Builder

The wizard is a great way to get started if you're new to model building. Experienced users can skip the wizard and work directly with the Model Builder.

Forecasting requires 'training' or 'fitting' a model on existing stream data. A trained model is then used to 'forecast' or 'predict' future stream values. A trained model can optionally be saved within a GroveStreams component file stream and used to forecast many other streams based on their historical values and other influencers.

The model builder has three 'execute' options:

Train and Forecast: Trains a model and then uses the model, in memory, to forecast stream values. The result is a stream similar to the target training stream with "- Forecast" appended to the name.
Train Only: Only trains a model. An existing Darts model, stored within a file stream, can be used to extend training (for Darts models that support it). The resulting model can be saved within a file stream and used for future training or for predicting.
Forecast Only: Uses an existing Darts model, stored within a file stream, to forecast results.

Supported Stream Value Types:
All of the listed forecasting models require streams (both target series and any covariates) to be numerical values, such as integers or floats. They do not support text strings directly in the time-series data. If categorical data is involved (e.g., as static covariates in some deep learning models), it must be encoded numerically beforehand.
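
As an illustration of the numeric encoding mentioned above, here is a minimal Python sketch that maps category labels to integer codes. The function name and the sample labels are hypothetical and are not part of GroveStreams or Darts.

```python
def encode_categories(labels):
    """Map each distinct label to an integer code, in order of first appearance."""
    codes = {}
    encoded = []
    for label in labels:
        if label not in codes:
            codes[label] = len(codes)
        encoded.append(codes[label])
    return encoded, codes


# Hypothetical static covariate values, one per target stream.
regions = ["north", "south", "north", "east"]
encoded, mapping = encode_categories(regions)
print(encoded)  # [0, 1, 0, 2]
print(mapping)  # {'north': 0, 'south': 1, 'east': 2}
```

Integer codes imply an ordering that the categories may not have, so one-hot encoding is a common alternative when that matters.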

Configure General Settings:

Name: Enter a unique name for your model.
Description: Add an optional description.
Forecast Model: Select a model from the dropdown (e.g., TFTModel, ARIMA).
Time Alignment Cycle: Choose an interval cycle for data alignment (required for most models except Prophet).
Maximum Allowed Runtime: Set a timeout in seconds (0 for no limit). Depending on the training inputs, a run can take hours; a run that exceeds this limit is cancelled.
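
To show what aligning data to an interval cycle means, the sketch below groups irregularly timed samples into fixed-width intervals and averages each group. It is only an illustration: the function name is hypothetical, and averaging is an assumption, since GroveStreams may align values differently.

```python
from collections import defaultdict


def align_to_cycle(samples, interval_seconds):
    """Average (epoch_seconds, value) samples into fixed-width intervals.

    Returns a sorted list of (interval_start, mean_value) pairs.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        start = timestamp - (timestamp % interval_seconds)
        buckets[start].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Irregular samples aligned to a 60-second cycle.
samples = [(0, 10.0), (20, 14.0), (65, 20.0), (130, 30.0), (170, 34.0)]
aligned = align_to_cycle(samples, 60)
print(aligned)  # [(0, 12.0), (60, 20.0), (120, 32.0)]
```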

Train (Fit) Tab:

Training Range: Specify the number of intervals to train from and the start date (e.g., end of target stream or a fixed date).
Hyperparameters: Adjust model-specific settings in the Hyperparameter Grid (e.g., n_epochs, input_chunk_length). See the Darts website for more information on each Model's hyperparameters.
Target Streams: Select the primary time series to forecast.
Past Covariates: Add historical data streams that influence the target.
Future Covariates: Include known future data to enhance predictions. A good example is a holiday indicator stream which can influence other streams such as energy consumption or store sales.
Static Covariates: Add fixed attributes (e.g., location) for context. Each static covariate must be associated with a target stream. Choose the target stream and then enter the static covariate value.
Input/Output Darts Model: Optionally load an existing Darts model (for further training) or specify where to save the trained model. A saved model consists of two files: one for model weights (.pt file) and one checkpoint file (.ckpt). Models are stored as component file streams. Enter the ID of the component that holds the file streams. The file stream IDs must be trainedModelPt and trainedModelCkpt.

Forecast (Predict) Tab:

Target Stream Intervals: Set the historical data range for prediction.
Forecast Horizon: Define how many intervals to forecast.
Forecast From: Choose the starting point (e.g., training end date or specific date).
Trained Darts Model: Load the trained model for prediction.
Target Validation Stream: Optionally add validation streams to evaluate forecast accuracy.
Result Stream Location: Decide where to store forecast results.

Stream Gap Filling:

Some models cannot handle gaps in stream data. Configure how to handle gaps in stream data (e.g., fill with a value, previous value, next value, average, min, max, or spline interpolation).
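
The following Python sketch illustrates three of the gap-filling strategies named above (fixed value, previous value, and average). Gaps are represented as None. The function names are hypothetical, and the exact behavior of the GroveStreams options may differ.

```python
def fill_previous(values):
    """Replace each None with the most recent known value (leading gaps stay None)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def fill_value(values, constant):
    """Replace each None with a fixed constant."""
    return [constant if v is None else v for v in values]


def fill_average(values):
    """Replace each None with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


series = [1.0, None, None, 4.0, None, 7.0]
print(fill_previous(series))    # [1.0, 1.0, 1.0, 4.0, 4.0, 7.0]
print(fill_value(series, 0.0))  # [1.0, 0.0, 0.0, 4.0, 0.0, 7.0]
print(fill_average(series))     # [1.0, 4.0, 4.0, 4.0, 4.0, 7.0]
```

The choice matters: filling with zero can look like a real drop in a consumption stream, while previous-value filling keeps the level flat across the gap.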

Scheduling and Emailing:

A Schedule can be created to periodically run any of the three 'execute' options above so that models can continuously learn and forecast as data arrives.
An email can be configured to email results when an 'Execute' option has finished.

Fine-tuning and Experimenting

Over time you'll want to experiment with many models and compare results to get the best forecast. The best way to do this is to copy, tweak, and run each model while creating a folder structure for different sets of runs. Right-clicking a model and choosing Copy prompts you for the number of copies to create. Everything is copied except results and file streams holding Darts models.

Additional Tips and Best Practices

Data Preparation:
- Ensure continuous data by using Stream Gap Filling options (e.g., NONE, VALUE, SPLINE).
- Most models (except Prophet) require a forecast cycle for alignment.
Model Selection:
- Use ARIMA or ExponentialSmoothing for simple, univariate forecasts.
- Choose TFTModel or TransformerModel for complex, multi-variable scenarios.
Target Streams:
- Some models only support one target stream whereas others support many.
- One or more Stream Groups can optionally be selected as targets. Stream groups are very useful during production as new streams will automatically be picked up during runs.
Covariates:
- Every covariate will be associated with a target stream. It will only be associated with streams that match the Target Stream ID if one was entered.
- If the target stream's component has a stream matching the ID of the selected covariate stream, it will be used instead of the explicitly selected covariate stream. For example, assume we selected a Temperature future covariate stream. If the target stream's component has a Temperature stream (with the same ID), then that stream will be associated with its sibling target stream. If the component does not have a Temperature stream, then the explicitly selected Temperature stream (probably shared by many components) will be associated with all target streams in the model.
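
The covariate matching rule described above can be sketched as a simple lookup with a fallback. This is an illustration only: the function, the dictionaries standing in for components, and the stream names are all hypothetical.

```python
def resolve_covariate(target_component, covariate_stream_id, explicit_covariate):
    """Pick the covariate stream to pair with a target stream.

    target_component: dict mapping stream IDs to streams in the target's component.
    explicit_covariate: the stream selected in the model, used as the fallback.
    """
    sibling = target_component.get(covariate_stream_id)
    return sibling if sibling is not None else explicit_covariate


shared_temperature = "shared/temperature"
store_a = {"sales": "storeA/sales", "temperature": "storeA/temperature"}
store_b = {"sales": "storeB/sales"}  # no temperature stream of its own

print(resolve_covariate(store_a, "temperature", shared_temperature))  # storeA/temperature
print(resolve_covariate(store_b, "temperature", shared_temperature))  # shared/temperature
```
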
Hyperparameter Tuning:
- Start with default values provided with new models.
- Adjust parameters like input_chunk_length or n_epochs based on data size and complexity.
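
To build intuition for input_chunk_length and output_chunk_length, the sketch below splits a series into (input, output) training pairs with a sliding window. It shows the general idea behind chunk-based training and is not the Darts implementation. The function name is hypothetical.

```python
def make_windows(series, input_chunk_length, output_chunk_length):
    """Split a series into (input, output) training pairs using a sliding window."""
    window = input_chunk_length + output_chunk_length
    pairs = []
    for start in range(len(series) - window + 1):
        inputs = series[start:start + input_chunk_length]
        outputs = series[start + input_chunk_length:start + window]
        pairs.append((inputs, outputs))
    return pairs


series = list(range(10))  # 10 intervals of data
pairs = make_windows(series, input_chunk_length=4, output_chunk_length=2)
print(len(pairs))  # 5
print(pairs[0])    # ([0, 1, 2, 3], [4, 5])
print(pairs[-1])   # ([4, 5, 6, 7], [8, 9])
```

Under this scheme, a series shorter than input_chunk_length + output_chunk_length yields no training pairs at all, which is one reason to size these parameters against the amount of data available.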

Validation:

When performing a forecast, you can optionally provide validation streams to compute accuracy metrics. Metrics are calculated per stream and then averaged across streams:

Metric	Calculation	Interpretation
MAE (Mean Absolute Error)	Average of the absolute differences between predicted and actual values.	Lower values indicate better accuracy. It is straightforward to interpret but weights all errors equally, so it does not emphasize large errors.
MSE (Mean Squared Error)	Average of the squared differences between predicted and actual values.	Lower values are better. It penalizes large errors more than MAE.
RMSE (Root Mean Squared Error)	Square root of the MSE, providing error in the same units as the data.	Lower values are better. It is useful for understanding error magnitude in the context of the data.
MAPE (Mean Absolute Percentage Error)	Average of the absolute percentage differences between predicted and actual values.	Lower percentages are better. It is useful for comparing accuracy across different scales.
SMAPE (Symmetric Mean Absolute Percentage Error)	A modified version of MAPE that is symmetric and handles zero values better.	Lower percentages are better. It is less sensitive to outliers and zero values.
R² (R-squared)	Proportion of variance in the dependent variable that is predictable from the independent variable(s).	Values closer to 1 indicate a better fit. It shows how well the model explains the variability of the data.
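
The metrics in the table can be written out with their standard textbook definitions, as in the Python sketch below. The sample values are made up, and the exact formulas GroveStreams uses (for example, the SMAPE variant) are an assumption here.

```python
import math


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Percentage error; undefined when any actual value is zero."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def smape(actual, predicted):
    """Symmetric variant: 200 * |a - p| / (|a| + |p|), averaged over all points."""
    terms = (abs(a - p) / (abs(a) + abs(p)) for a, p in zip(actual, predicted))
    return 200 * sum(terms) / len(actual)


def r2(actual, predicted):
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
print(mae(actual, predicted))           # 10.0
print(mse(actual, predicted))           # 100.0
print(rmse(actual, predicted))          # 10.0
print(round(r2(actual, predicted), 3))  # 0.992
```

Every error here is 10 units, so MAE and RMSE agree. MAPE weighs the same 10-unit miss more heavily on the smaller actual values.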

Runs
- Train Only: Trains a model using target streams and optional covariates. The trained model can be saved to a file stream for later use if the model supports native saving. The component entered for saved models must have two file streams with the IDs trainedModelPt and trainedModelCkpt.
- Forecast Only: Loads an existing model from a file stream and uses it to forecast future values based on provided target streams and covariates. Requires the ID of a component that includes the trained model file streams trainedModelPt and trainedModelCkpt.
- Train and Forecast: Trains a model (optionally loading an existing model from a file stream) and immediately generates forecasts for a specified horizon (n) using the trained model.
Fine-Tuning and Versioning
- Fine-tuning a forecasting model often requires multiple experimental runs with varying configurations, and GroveStreams supports this through its copying feature. To create a duplicate of a model or folder, right-click it and select Copy. This creates an independent version that you can modify without affecting the original. Components (with their streams) can be copied in the same way, enabling experimentation with both models and their underlying stream data. Copying lets you test new ideas, track progress, and improve performance systematically. Here’s how it works in practice:
  - Test Different Parameters: After copying a model, you can adjust its parameters—such as the forecasting horizon, learning rate, or input weights—and run it to compare results against the original. For example, if you’re working on a temperature forecasting model, you could duplicate it and tweak the prediction window (e.g., short-term vs. long-term forecasts) to evaluate which performs better.
  - Experiment with Data Preprocessing: Copying components lets you refine your data without altering the source. You might apply techniques like smoothing, normalization, or outlier removal by creating an expression derived stream and assess how these changes impact the model’s accuracy or reliability.
  - Try Alternative Approaches: Because GroveStreams supports multiple model types, you can use a copied model to test a different approach (for example, switching from ARIMA to NBEATSModel) while keeping the original intact for comparison.
  - Track Versions: Each copy serves as a distinct version of your model or data pipeline. This built-in versioning helps you document your experimentation process, revert to earlier setups if needed, and identify the most effective configuration over time.
- This copying technique offers several key benefits:
  - Non-Destructive Editing: Your original models and data remain unchanged, preserving a reliable baseline.
  - Iterative Improvement: You can refine your forecasts step-by-step through successive copies and adjustments.
  - Easy Comparison: Running multiple versions side-by-side lets you directly compare their outputs and select the best performer.
  For example, imagine you have a sales forecasting model. You could copy it, adjust the seasonal weighting in the duplicate, and test it against historical data. At the same time, you might copy the associated time series stream, apply a new filtering method, and feed it into the modified model. By comparing the results of these variations with the original, you can fine-tune both the model and its data for optimal accuracy—all without risking your initial setup.
Monitoring Runs
- Retrain models periodically with new data to maintain performance.
- Models are run asynchronously within the process queue. Large runs will be detected and run as a GroveStreams Job.
- Running models can be cancelled from within the process queue or Jobs monitoring windows.
- Check the Model Results tab for runtime, cost (free for now), and errors. Today, training models is free, but that may change if model usage becomes large. We will then charge by model runtime CPU usage.
- The last run's error will be reported within the model builder on the Last Error tab at the bottom of the builder.
- Check System notifications for errors and Job notifications (if it ran as a Job).

The Correlation Wizard

The Correlation Wizard helps identify relationships between streams. Open the wizard by choosing Admin - Stream Correlation Wizard:

Select Streams: Choose multiple time series to analyze.
Analyze Relationships: Compute correlation coefficients to measure strength and direction.
Visualize: View results as a heatmap to understand interactions.

This tool is invaluable for selecting covariates, identifying leading indicators, or exploring data relationships.
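
As an illustration of what a correlation coefficient measures, the sketch below computes pairwise Pearson coefficients for a few made-up streams. The documentation does not say which coefficient the wizard uses, so Pearson is an assumption, and the function and stream names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(streams):
    """Return a nested dict of pairwise coefficients, the data behind a heatmap."""
    names = list(streams)
    return {a: {b: pearson(streams[a], streams[b]) for b in names} for a in names}


streams = {
    "temperature": [10.0, 15.0, 20.0, 25.0, 30.0],
    "cooling_kwh": [1.0, 2.0, 3.0, 4.0, 5.0],
    "heating_kwh": [5.0, 4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(streams)
print(round(matrix["temperature"]["cooling_kwh"], 3))  # 1.0
print(round(matrix["temperature"]["heating_kwh"], 3))  # -1.0
```

A coefficient near 1 or -1 suggests a stream may be a useful covariate, and the sign gives the direction of the relationship.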

Model Descriptions and Use Cases

Below are detailed descriptions of each supported model, including their strengths, best use cases, key features, and configurable hyperparameters. Much more information can be found on the Darts website.

TFTModel (Temporal Fusion Transformer)

Description: A cutting-edge model combining transformers and LSTMs to handle complex temporal patterns and multi-horizon forecasting.
Best Used For: Long-term forecasts requiring multiple time steps with rich covariate data (e.g., predicting sales with weather and promotional data).
Key Features:
- Supports multiple target streams (multivariate forecasting).
- Handles static, past, and future covariates.
- Can forecast multiple target streams simultaneously.
- Requires a forecast cycle for time alignment.
Hyperparameters:
- input_chunk_length (int, default: 50): Length of historical data used as input.
- output_chunk_length (int, default: 24): Number of future intervals to predict.
- n_epochs (int, default: 10): Number of training iterations.
- hidden_size (int, default: 16): Size of hidden layers.
- lstm_layers (int, default: 1): Number of LSTM layers.
- num_attention_heads (int, default: 1): Number of attention heads.
- dropout (float, default: 0.1): Dropout rate to prevent overfitting.
- By default, add_relative_index=True is enabled to improve temporal feature encoding unless overridden by add_encoders in the hyperparameters.

NBEATSModel

Description: A deep learning model using stacked fully connected layers to decompose time series into trends and seasonality.
Best Used For: Univariate forecasting with strong trend and seasonal components (e.g., monthly sales data).
Key Features:
- Supports multiple target streams.
- Supports past covariates.
- Requires a forecast cycle.
Hyperparameters:
- input_chunk_length (int, default: 50): Length of input sequences.
- output_chunk_length (int, default: 24): Length of forecast horizon.
- n_epochs (int, default: 10): Training iterations.
- num_stacks (int, default: 2): Number of stacks in the architecture.
- num_blocks (int, default: 1): Blocks per stack.
- num_layers (int, default: 4): Layers per block.
- layer_widths (int, default: 512): Width of hidden layers.

ARIMA

Description: A traditional statistical model (AutoRegressive Integrated Moving Average) for analyzing and forecasting time series data.
Best Used For: Short-term forecasting of single time series with clear trends and seasonality (e.g., daily temperature readings).
Key Features:
- Does not support multiple target streams.
- Supports future covariates.
- Requires a forecast cycle.
Hyperparameters:
- p (int, default: 1): Order of autoregressive terms.
- d (int, default: 1): Order of differencing to make the series stationary.
- q (int, default: 1): Order of moving average terms.
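
The d hyperparameter is the number of times the series is differenced before the autoregressive and moving-average terms are fitted. The sketch below shows plain differencing only, not ARIMA itself. The function name is hypothetical.

```python
def difference(series, d=1):
    """Apply first-order differencing d times."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


linear = [3, 5, 7, 9, 11]      # linear trend
quadratic = [1, 4, 9, 16, 25]  # quadratic trend
print(difference(linear, d=1))     # [2, 2, 2, 2]
print(difference(quadratic, d=1))  # [3, 5, 7, 9]
print(difference(quadratic, d=2))  # [2, 2, 2]
```

One round of differencing removes a linear trend, while a quadratic trend needs two rounds. Each round also shortens the series by one value.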

Prophet

Description: A model developed by Facebook for forecasting time series with strong seasonal patterns and robustness to missing data.
Best Used For: Time series with daily, weekly, or yearly seasonality, especially with gaps or outliers (e.g., website traffic).
Key Features:
- Does not support multiple target streams.
- Supports future covariates.
- No forecast cycle required.
Hyperparameters:
- n_changepoints (int, default: 25): Number of potential trend changepoints.
- yearly_seasonality (boolean, default: true): Include yearly seasonality.
- weekly_seasonality (boolean, default: true): Include weekly seasonality.
- daily_seasonality (boolean, default: true): Include daily seasonality.
- growth (string, default: "linear"): Growth type ("linear" or "logistic").

TCNModel (Temporal Convolutional Network)

Description: A neural network using convolutional layers to capture long-range temporal dependencies.
Best Used For: Forecasting with long-term dependencies (e.g., energy consumption influenced by historical patterns).
Key Features:
- Supports multiple target streams.
- Handles past covariates.
- Can forecast multiple targets simultaneously.
- Requires a forecast cycle.
Hyperparameters:
- input_chunk_length (int, default: 50): Length of input sequences.
- output_chunk_length (int, default: 24): Forecast horizon length.
- n_epochs (int, default: 10): Training iterations.
- kernel_size (int, default: 3): Size of convolutional kernel.
- num_filters (int, default: 4): Number of filters in convolutional layers.
- dilation_base (int, default: 2): Base for dilation in convolutions.

TransformerModel

Description: Utilizes the transformer architecture for sequence-to-sequence forecasting tasks.
Best Used For: Complex forecasting with intricate patterns and dependencies (e.g., multi-variable financial data).
Key Features:
- Supports multiple target streams.
- Handles past covariates.
- Can forecast multiple targets simultaneously.
- Requires a forecast cycle.
Hyperparameters:
- input_chunk_length (int, default: 50): Length of input sequences.
- output_chunk_length (int, default: 24): Forecast horizon length.
- n_epochs (int, default: 10): Training iterations.
- d_model (int, default: 64): Dimension of the model.
- nhead (int, default: 4): Number of attention heads.
- num_encoder_layers (int, default: 3): Number of encoder layers.
- num_decoder_layers (int, default: 3): Number of decoder layers.

ExponentialSmoothing

Description: A statistical method applying weighted averages to past data, emphasizing recent observations.
Best Used For: Simple forecasts with clear trends and seasonality where efficiency is key (e.g., inventory levels).
Key Features:
- Does not support multiple target streams or covariates.
- Requires a forecast cycle.
Hyperparameters:
- seasonal_periods (int, default: 12): Length of the seasonal cycle.
- trend (string, default: "add"): Trend type ("add", "mul", or "none").
- seasonal (string, default: "add"): Seasonal type ("add", "mul", or "none").
- The trend and seasonal options are mapped to Darts' ModelMode and SeasonalityMode: add → Additive, mul → Multiplicative, none → None.
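
To illustrate the "weighted averages emphasizing recent observations" idea, here is a minimal sketch of simple exponential smoothing with no trend or seasonal component. The smoothing factor alpha is an illustrative parameter, not one of the hyperparameters listed above, and the function name is hypothetical.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    level_t = alpha * y_t + (1 - alpha) * level_{t-1}, with level_0 = y_0.
    """
    level = series[0]
    levels = [level]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    return levels


levels = simple_exponential_smoothing([10.0, 20.0, 30.0], alpha=0.5)
print(levels)  # [10.0, 15.0, 22.5]
```

In this basic form the forecast for every future interval is the last level. The trend and seasonal settings add further components on top of it.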

RNNModel (Recurrent Neural Network)

Description: A neural network designed for sequential data, capturing temporal dependencies with recurrent connections.
Best Used For: Sequential data with dependencies (e.g., stock prices or weather patterns).
Key Features:
- Supports multiple target streams.
- Handles future covariates.
- Can forecast multiple targets simultaneously.
- Requires a forecast cycle.
Hyperparameters:
- input_chunk_length (int, default: 50): Length of input sequences.
- output_chunk_length (int, default: 24): Forecast horizon length.
- n_epochs (int, default: 10): Training iterations.
- hidden_dim (int, default: 25): Size of hidden layers.
- n_rnn_layers (int, default: 1): Number of RNN layers.
- model (string, default: "LSTM"): RNN type ("RNN", "LSTM", or "GRU").

By following these guidelines and utilizing GroveStreams’ tools, you can create effective forecasting models to support data-driven decisions.