Modify Optimization Range Dialog
The Modify Optimization Range Dialog allows you to modify the specific dates and distribution of data used for optimizing a field or training a prediction.
Note: Fields that are currently being used by the Solution Service cannot be modified.
There are no definitive rules for how much data should be used for training or optimizing fields. Including more data is useful for improving results since it provides additional examples for training. However, older financial data may have been produced under different market conditions than more recent data, causing the training to focus on outdated or overly generalized principles.
How the Optimization Range is Used for a Prediction
The date range associated with a prediction serves several purposes. Part of the data is used for training the neural network model. Part is used as a validation step to make sure the training is improving the results. The remaining part is left out of the training process and used only for testing.
The same date ranges are used for the optimization of the prediction settings and the postprocessing of predicted signals. By default, the training data is used to determine the fitness of each test. The cross validation data is used to determine which generation has provided the best general results. The testing data is typically not used during optimization.
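The way the training and cross validation data work together during optimization can be sketched roughly as follows. This is only an illustration and not the product's actual algorithm: the function name `select_best_generation`, the `patience` parameter, and the convention that a higher fitness is better are all assumptions made for the example. The stopping rule mirrors the one described for the Cross Validation setting later in this topic.

```python
def select_best_generation(generations, patience=3):
    """Pick the generation with the best cross validation fitness.

    `generations` is an iterable of (settings, training_fitness, cv_fitness)
    tuples, in the order they were produced. Higher fitness is assumed to be
    better. `patience` (hypothetical name) is the number of consecutive
    generations without a cross validation improvement that are tolerated
    before stopping.
    """
    best_settings = None
    best_cv = float("-inf")
    stale = 0  # generations since the last cross validation improvement

    for settings, _training_fitness, cv_fitness in generations:
        # Training fitness drives the search itself; only the cross
        # validation fitness decides which generation is kept.
        if cv_fitness > best_cv:
            best_settings, best_cv, stale = settings, cv_fitness, 0
        else:
            stale += 1
            if stale >= patience:
                break  # no general improvement for `patience` generations

    return best_settings, best_cv


# Training fitness keeps rising, but cross validation peaks at "g2".
history = [
    ("g1", 0.50, 0.40),
    ("g2", 0.60, 0.45),
    ("g3", 0.70, 0.44),
    ("g4", 0.80, 0.43),
    ("g5", 0.90, 0.42),
]
best, score = select_best_generation(history, patience=3)
```

In this example the training fitness improves with every generation, yet the settings from "g2" are kept because later generations only fit the training data more closely without generalizing better.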
How the Optimization Range is Used for a Signal Optimization
The date range associated with the optimization of committees and entry/exit systems is used in the same way as it is for predictions. By default, the training data is used to determine the fitness of each test. The cross validation data is used to determine which generation has provided the best general results. The testing data is typically not used during optimization.
For help with predictions, see Predicting and Modeling Financial Data.
For help with optimizing fields, see Optimizing Signals and Predictions.
Modifying the Date Range
The following settings are available for specifying the start and end of the date range.
Start of Range / End of Range
These values specify the start and end of the data range. Use the date controls to increase or decrease these values. As these values are updated, the Overall Size is updated automatically.
For help using the date controls, see the help for Date Selection Controls.
Note: For fields being defined for an entire group or fields that use external data as inputs, the data range is limited to the common date range for all of the data being used. For example, if the group for which you are defining a field has a data series with only one year of data, the field will only be able to use one year of data, even if the other data series in the group all have more data.
You can increase a limited data range by either importing more data into the shorter data series, or removing the data series from the field. You can quickly determine the first date available in your data series by adding the first date as a display field in the Portfolio View.
For help with display fields, see Using Display Fields.
Overall Size
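The common date range limitation described in the note above can be illustrated with a short sketch. It assumes each data series is summarized by its first and last available dates; the function names and the sample series are hypothetical and are not part of the product.

```python
from datetime import date


def common_date_range(series_ranges):
    """Return the (start, end) shared by every series, or None if no overlap.

    `series_ranges` maps a series name to its (first_date, last_date).
    """
    if not series_ranges:
        raise ValueError("at least one data series is required")
    # The shared range starts at the latest first date and ends at the
    # earliest last date.
    start = max(first for first, _last in series_ranges.values())
    end = min(last for _first, last in series_ranges.values())
    return (start, end) if start <= end else None


def limiting_series(series_ranges):
    """Name of the series whose first date is latest (it shortens the range)."""
    return max(series_ranges, key=lambda name: series_ranges[name][0])


# Hypothetical group: two long series and one with a single year of data.
group = {
    "AAA": (date(2015, 1, 2), date(2024, 12, 31)),
    "BBB": (date(2012, 6, 1), date(2024, 12, 31)),
    "CCC": (date(2024, 1, 2), date(2024, 12, 31)),
}
shared = common_date_range(group)
shortest = limiting_series(group)
```

Here the group is limited to the range covered by "CCC", so importing more data for that series or removing it from the field would widen the usable range.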
This value specifies the number of samples, days, or other length of data in the selected range. If this value is changed, the range will be updated to match the new size. If the End of Range is set to the last date, the Start of Range will be adjusted; otherwise, the End of Range will be adjusted.
Note: Size in samples is not available when creating group fields since the number of samples may be different for the various data series in the group.
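The rule for which end of the range moves when the Overall Size changes can be sketched as below. This is an illustration under stated assumptions: the size is measured in calendar days, both the start and end dates count toward the size, and the function name `resize_range` is hypothetical. How the dialog handles sizes in samples, or a new end date that would pass the last available date, is not covered here.

```python
from datetime import date, timedelta


def resize_range(start, end, last_available, new_size_days):
    """Return the (start, end) range after changing the size in days.

    If the end of the range sits on the last available date, the start is
    moved; otherwise the end is moved. Both endpoints are counted.
    """
    if new_size_days < 1:
        raise ValueError("size must be at least one day")
    span = timedelta(days=new_size_days - 1)  # inclusive of both endpoints
    if end == last_available:
        return end - span, end  # keep the end pinned, adjust the start
    return start, start + span  # keep the start pinned, adjust the end


last = date(2024, 12, 31)

# End of Range is the last date, so the Start of Range moves.
pinned_end = resize_range(date(2020, 1, 1), last, last, 365)

# End of Range is earlier than the last date, so the End of Range moves.
pinned_start = resize_range(date(2020, 1, 1), date(2020, 6, 30), last, 10)
```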
Modifying the Date Range Distribution
The following settings are available for assigning the distribution of the date range. See the text above for more information on how these ranges are used.
Training Set
This value indicates the percentage of the date range to use for the actual training or optimization process. For prediction training, this data is used to determine the error for updating the neural network weights. For optimization, this data is used to determine the fitness of the current settings.
Note: This value equals the data remaining after the cross validation and testing ranges have been removed.
Cross Validation
This setting indicates the percentage of the date range to use for verifying that continued iterations are producing better general results. If the error or fitness for this range stops improving, the training or optimization stops after a given number of attempts, and the settings that yielded the best results for the cross validation set are used.
Accuracy Testing
This setting indicates the percentage of the date range to set aside so that it is not used during training or optimization. Because this data is never used to improve the results, you can use it to verify that the new values will work with new data.
Ordering
This setting indicates the order in which the date range is divided into training, cross validation, and testing sets.
Note: The testing set is typically kept last so that it is closest to any new data which will occur.
If an Overall Size is available, an Approximate Size will be displayed for each of these percentages to show the approximate number of bars that will be in each range.
Note: The actual number of bars in each range may differ, depending on whether any bars need to be excluded due to null values, conditional inputs, or other criteria.
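The distribution settings above can be illustrated with a small sketch that turns the percentages and the ordering into approximate bar ranges. It is not the dialog's actual calculation: the function name `split_range`, the set names, and the choice to round the cross validation and testing sizes down are assumptions. As described for the Training Set, training receives whatever remains after the other two sets are removed, and excluded bars (null values, conditional inputs) are ignored.

```python
SET_NAMES = ("training", "cross_validation", "testing")


def split_range(total_bars, cv_pct, test_pct, ordering=SET_NAMES):
    """Map each set name to a half-open (first_bar, end_bar) index range."""
    if total_bars < 0:
        raise ValueError("total_bars must not be negative")
    if cv_pct < 0 or test_pct < 0 or cv_pct + test_pct > 100:
        raise ValueError("percentages must be non-negative and sum to <= 100")
    if sorted(ordering) != sorted(SET_NAMES):
        raise ValueError("ordering must list each set exactly once")

    sizes = {
        # Rounded down (an assumption) so the two sets never exceed the total.
        "cross_validation": int(total_bars * cv_pct / 100),
        "testing": int(total_bars * test_pct / 100),
    }
    # Training is the remainder after the other two sets are removed.
    sizes["training"] = total_bars - sizes["cross_validation"] - sizes["testing"]

    ranges = {}
    offset = 0
    for name in ordering:  # lay the sets out in the requested order
        ranges[name] = (offset, offset + sizes[name])
        offset += sizes[name]
    return ranges


# 1000 bars, 20% cross validation, 10% testing, testing kept last.
default_split = split_range(1000, cv_pct=20, test_pct=10)

# Same percentages with cross validation placed first.
cv_first = split_range(
    1000, 20, 10, ordering=("cross_validation", "training", "testing")
)
```

With the default ordering, the testing range ends at the most recent bar, which matches the note about keeping the testing set closest to new data.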
What Do I Do Next?
When you are done modifying the optimization range, press the OK button. If you would prefer to exit this dialog without making modifications, press the Cancel button.
How Did I Get Here?
The Modify Optimization Range Dialog is displayed for predictions when you press the Training Range… button on the Predict a Value Wizard: Select Options page or the Modify Field Dialog: Training Settings page.
The Modify Optimization Range Dialog is displayed for optimizable fields when you press the Adjust Range… button on the Signal Optimization Settings Dialog.