Questions about Predictions
Questions about Defining Predictions
Why does my group-level prediction have a limited date range available?
Predictions that are defined for all of the data series in a group are restricted to using only the dates that are available in all of the data series in that group.
This can be a problem if a group contains data series that have less data available than others. A quick way to view the start and end dates of all of the data series in a group is to add the Start Date and the End Date as display fields in the portfolio view. These values can be sorted to quickly identify individual stocks that may be limiting the available range.
For help with display fields, see Using Display Fields.
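The restriction described above amounts to taking the overlap of every series' date range. The following is a minimal sketch of that idea and not TradingSolutions code: the function names, symbols, and dates are hypothetical, and it assumes each series is described only by a start date and an end date.

```python
from datetime import date

def common_date_range(series_ranges):
    """Return the (start, end) range shared by every series, or None if there is no overlap."""
    latest_start = max(start for start, _ in series_ranges.values())
    earliest_end = min(end for _, end in series_ranges.values())
    if latest_start > earliest_end:
        return None
    return latest_start, earliest_end

def limiting_series(series_ranges):
    """Return the names of the series with the latest start and the earliest end."""
    by_start = max(series_ranges, key=lambda name: series_ranges[name][0])
    by_end = min(series_ranges, key=lambda name: series_ranges[name][1])
    return by_start, by_end

# Hypothetical group: the symbols and dates are made up for illustration.
group = {
    "AAA": (date(1995, 1, 3), date(2004, 6, 30)),
    "BBB": (date(2001, 5, 14), date(2004, 6, 30)),
    "CCC": (date(1990, 1, 2), date(2003, 12, 31)),
}
shared = common_date_range(group)
limits = limiting_series(group)
```

In this made-up group, the shared range runs from the start of BBB to the end of CCC, which is the same conclusion that sorting the Start Date and End Date display fields would lead to.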
How do I set my prediction date range to allow for a "production set"?
A "production set" is a secondary testing set which can be used for additional out-of-sample analysis of a prediction. By default, TradingSolutions includes the data up to the end of the currently known values in its predictions and sets aside the last portion of this for normal out-of-sample testing.
To use two ranges of data for out-of-sample testing, any of the following options can be used:
· Analyze the Testing Set as two ranges.
While only one portion of the optimization range is set aside for out-of-sample testing, you can perform signal analysis on any date range. For example, you could limit your initial analysis to the first part of the testing set and reserve the remainder as a production set.
· Manually adjust the Optimization Range.
The optimization range associated with a prediction or optimization defaults to use the most recent data. However, this range can be adjusted from the Modify Optimization Range Dialog to exclude recent data so that it can be used as a production set.
· Delay importing recent data.
Since the optimization range associated with a prediction is limited to data available in your portfolio, you can exclude recent data from the optimization range by delaying importing it until you are ready to use it for testing. You can modify the date range being imported from data services on the Import Data Wizard: Select Data to Download page.
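Whichever option is used, the effect is the same: the most recent bars are held back as a production set, and the normal testing set is taken from the end of what remains. The sketch below illustrates that split only. The function name and the bar counts are hypothetical, and the actual proportions used by TradingSolutions are not stated here.

```python
def split_bars(bars, testing_bars, production_bars):
    """Split bars into (training and cross validation, testing, production)."""
    if testing_bars < 0 or production_bars < 0:
        raise ValueError("bar counts must be non-negative")
    if testing_bars + production_bars > len(bars):
        raise ValueError("not enough bars for the requested sets")
    production_start = len(bars) - production_bars
    testing_start = production_start - testing_bars
    return (
        bars[:testing_start],                  # used to fit the model
        bars[testing_start:production_start],  # normal out-of-sample testing
        bars[production_start:],               # held back as the production set
    )

# Bars are numbered 1..100 here as a stand-in for dates.
bars = list(range(1, 101))
fit, testing, production = split_bars(bars, testing_bars=20, production_bars=10)
```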
Questions about Training
Why did my prediction stop training?
The training phase will stop either when the maximum number of epochs has been reached or when the error on the cross validation set does not improve after a given number of epochs. It can also be stopped manually by pressing the Stop Training button on the Status Dialog during the training phase.
Typically, this question is asked when training stops before the status reaches 100%. This occurs when the error on the cross validation set fails to improve, and then increases, over the given number of epochs. This is normally an indication that the model is beginning to over-specialize on the training set, making it less effective at generalizing to values outside of the training set. The number of epochs to wait for an improvement can be set on the Modify Training Settings Dialog.
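The stopping behavior can be sketched as a generic early-stopping loop. This is an illustration under stated assumptions, not the TradingSolutions implementation: the function name, the `patience` parameter, and the simulated error values are hypothetical, and the exact rule for counting epochs without improvement is assumed.

```python
def run_training(cv_errors, max_epochs, patience):
    """Return (last epoch run, best epoch, reason training stopped).

    cv_errors holds a simulated cross validation error for each epoch.
    Assumption: only epochs whose error is above the best seen so far
    count toward the patience limit; an unchanged error does not.
    """
    best_error = float("inf")
    best_epoch = 0
    worse_epochs = 0
    last_epoch = 0
    for epoch, error in enumerate(cv_errors[:max_epochs], start=1):
        last_epoch = epoch
        if error < best_error:
            best_error, best_epoch, worse_epochs = error, epoch, 0
        elif error > best_error:
            worse_epochs += 1
            if worse_epochs >= patience:
                return last_epoch, best_epoch, "no improvement"
    return last_epoch, best_epoch, "maximum epochs"

# Simulated errors: improving for four epochs, then rising.
errors = [0.50, 0.40, 0.35, 0.33, 0.34, 0.36, 0.39, 0.45]
result = run_training(errors, max_epochs=100, patience=3)
```

With these simulated values, training stops at epoch 7 even though up to 100 epochs were allowed, and the best cross validation error was seen at epoch 4.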
Why did my prediction remain at 0.0000% improvement, but not stop training?
The improvement displayed on the Status Dialog is the percent change in the cross validation error over each epoch. Since this value is rounded to four decimal places, subtle improvement may still be taking place.
In addition, training is only stopped when the error on the cross validation set increases. If the error remains unchanged, training will continue since the training is not over-specializing on the data in the training set and further training may produce improvements.
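The rounding effect is easy to reproduce with made-up numbers. In this sketch the formula for percent improvement is an assumption (the relative decrease in error from one epoch to the next), and both error values are hypothetical.

```python
def percent_improvement(previous_error, current_error):
    """Relative decrease in error between two epochs, as a percentage (assumed formula)."""
    return (previous_error - current_error) / previous_error * 100.0

previous_error = 0.0123456
current_error = previous_error - 4e-9  # a very small but real improvement

improvement = percent_improvement(previous_error, current_error)
shown = f"{improvement:.4f}%"  # formatted to four decimal places
```

Here the error is still falling, but the value formatted to four decimal places reads 0.0000%.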
Why do the training passes have different learning curves for the same data?
Each training pass is started with a random set of weights in the neural network. These weights are the values that are adjusted during the training of the network. The learning curve is a measure of the error produced by the current set of weights in the network. Since each pass begins with a random set of weights, the learning curves will appear different. This includes not only the starting position but also the overall shape, since each pass works to reduce the error from different initial conditions.
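A toy example shows the effect of random starting weights. This sketch uses a single-weight linear model and plain gradient descent as a stand-in for a neural network, so it only demonstrates that different starting weights produce different curves on identical data. The function name, data, seeds, and learning rate are all hypothetical.

```python
import random

def learning_curve(seed, epochs=20, rate=0.05):
    """Train one weight on fixed data and return the mean squared error per epoch."""
    rng = random.Random(seed)
    weight = rng.uniform(-1.0, 1.0)  # random starting weight for this pass
    data = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]  # same data for every pass
    curve = []
    for _ in range(epochs):
        error = sum((weight * x - y) ** 2 for x, y in data) / len(data)
        curve.append(error)
        gradient = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= rate * gradient
    return curve

curve_a = learning_curve(seed=1)
curve_b = learning_curve(seed=2)
```

Both passes reduce the error on the same data, but the two curves start at different values and never coincide.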
How can I view the learning curve after the training phase has completed?
The learning curve is available on the Learning Curve sub-page of the Modify Field Dialog: Prediction Analysis page.
How can I change, retrain, or optimize a prediction without losing my current training weights?
If you would like to improve an existing model but would also like to keep the current training weights in case you are not able to improve upon them, there are several ways to accomplish this.
The best way to do this is to make a copy of the prediction you would like to modify and then make your changes to the copy. To make a copy of a prediction, create a new prediction field and select Copy Another Prediction on the Predict a Value Wizard: Select Desired Outputs page. This will duplicate the settings associated with the original prediction, but will not copy the weights. Any changes or new training should therefore take place on the new copy, leaving the original prediction and its weights untouched. However, note that if you have other fields that use the results of the original prediction, they will continue to use that field unless they are modified to use the copy.
Another way to save the current training weights of a prediction is to create a trading solution based on the prediction. When you create a trading solution, you are given the option to save the weights associated with the predicted fields in the solution. Once the trading solution is created, you can modify the original prediction directly. The original prediction can be restored by applying the trading solution. However, note that when you apply a trading solution, it will create a new field with the original weights. It will not overwrite the original prediction field.
For help with trading solutions, see Working with Complete Trading Solutions.
How can I view the actual values of the input and desired outputs being used for training the neural models?
The preprocessed values of the inputs and desired outputs for a prediction can be exported from TradingSolutions from the Modify Field Dialog: Training Settings page. The Export Training Data for use in NeuroSolutions… button can be used to create files containing the preprocessed values of the training, cross validation, and testing sets in comma-separated text files.
Questions about Genetic Optimization
Why is the optimization phase not evaluating all of the chromosomes (tests) in a generation?
When the genetic algorithm used in the genetic optimization phase produces a new generation, it is possible that chromosomes will be produced that are the same as those in the previous generation. If a chromosome has been evaluated in a previous generation, it is not re-evaluated.
The likelihood of duplicate chromosomes being produced can be reduced by increasing the mutation rate on the Modify Genetic Algorithm Settings Dialog. This will make it more likely that random elements will be introduced into the population. However, this can also make it more difficult to evolve better chromosomes from previous best chromosomes.
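Skipping chromosomes that were already evaluated is a form of caching. The sketch below shows the general idea only. The function names, the tuple representation of a chromosome, and the toy cost function are hypothetical and are not taken from TradingSolutions.

```python
def evaluate_generation(generation, cost_fn, cache):
    """Return (costs, number of chromosomes actually evaluated in this generation)."""
    evaluated_now = 0
    costs = []
    for chromosome in generation:
        key = tuple(chromosome)
        if key not in cache:
            # First time this chromosome has been seen: evaluate and remember it.
            cache[key] = cost_fn(chromosome)
            evaluated_now += 1
        costs.append(cache[key])
    return costs, evaluated_now

def toy_cost(chromosome):
    """Hypothetical cost: lower when more genes are switched on."""
    return 1.0 / (1 + sum(chromosome))

cache = {}
generation_1 = [(1, 0, 1), (0, 1, 1), (1, 1, 0)]
generation_2 = [(1, 0, 1), (1, 1, 1), (0, 1, 1)]  # two repeats from generation 1

_, evaluated_1 = evaluate_generation(generation_1, toy_cost, cache)
_, evaluated_2 = evaluate_generation(generation_2, toy_cost, cache)
```

In this example all three chromosomes are evaluated in the first generation, but only one in the second, because the other two were already in the cache.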
How can I resume genetic optimization after I use the Stop Genetic button?
If you select Re-optimize neural network inputs and settings on save from the Modify Field Dialog: Training Settings page and restart the training, the current model will be used to generate the initial generation of the genetic optimization. This is similar to resuming the genetic optimization, except that the remainder of the last genetic population will not be recreated.
Note: If you press the Resume Training button on the Modify Field Dialog: Overview page or the Training Analysis: Overview sub-page, only the best training from the genetic optimization will be resumed.
How can I view the genetic optimization graph after the optimization phase has completed?
The genetic optimization graph and statistics are available on the Training Optimization sub-page of the Modify Field Dialog: Prediction Analysis page.
Why does the graph of the genetic optimization worst cost contain spikes with a value of 1?
When the genetic algorithm used in the genetic optimization phase produces a chromosome that is invalid, it technically has an infinite cost. This cost is reduced to 1 for the purposes of evaluation and graphing. This typically appears when genetic optimization of prediction inputs is being used and an offspring is produced which has no inputs enabled.
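The capping rule can be sketched in a few lines. The function names and the representation of a chromosome as a list of enabled inputs are hypothetical. Only the behavior of treating an invalid chromosome as infinitely costly and reporting it as 1 comes from the description above.

```python
import math

def chromosome_cost(input_mask, model_cost):
    """Return the cost of a chromosome; one with no inputs enabled is invalid."""
    if not any(input_mask):
        return math.inf  # invalid: there is nothing to train on
    return model_cost

def cost_for_graph(cost):
    """Reduce an infinite cost to 1 so that it can be evaluated and graphed."""
    return 1.0 if math.isinf(cost) else cost

valid = cost_for_graph(chromosome_cost([True, False, True], model_cost=0.042))
invalid = cost_for_graph(chromosome_cost([False, False, False], model_cost=0.042))
```

A generation that contains the invalid chromosome would show a worst cost of 1, which appears as a spike on the graph.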
Questions about Prediction Analysis
Why is the word "failed" next to the name of my prediction in the selection trees?
The word "failed" appears next to prediction fields when the field cannot be trained due to an error. There are several reasons that predictions can fail. For more detailed information on why this prediction failed, see the Modify Field Dialog: Prediction Analysis page.
Questions about Predicted Values
Why did my prediction not produce any values?
There are several things you will want to examine to determine why a prediction did not produce any values.
First, make sure that the prediction has been trained for this data series. This can be checked quickly from the Modify Data Series Dialog: Data Fields page. If the prediction has the word "failed" next to the name of the field, the field could not be trained due to an error. There are several reasons that predictions can fail. For more detailed information on why this prediction failed, see the Modify Field Dialog: Prediction Analysis page.
If the prediction has the word "deferred" next to the name of the field, either the training or the prediction was deferred or cancelled. To resume the processing of deferred fields, press the Calculate Pending Fields button at the bottom of the Modify Data Series Dialog: Data Fields page. You can also resume deferred processing by selecting Resume Deferred Processing… from the Tools menu or by selecting Resume Deferred Processing from the toolbar.
If no messages are listed next to the field name, the next thing to check is the current inputs to the prediction. Use the Modify Field Dialog: Prediction Inputs page to view which fields are required as inputs to the prediction. After verifying that these are the desired inputs, display these fields in the spreadsheet view to ensure that they have values. If any of the input fields to a prediction do not have values, the prediction will not have a value.
Why are there empty cells at the start of my predicted fields?
Empty cells represent null values. Null values are typically produced when there is not sufficient data to calculate a value. This can be due to one of the inputs to a calculation or prediction being null, or a calculation or prediction requiring a previous value that is not yet available. Predictions require all of their inputs to be non-null to produce a value. If you are using the change in a value as an input, it is important to note that this requires the current value and the previous value to be calculated.
Note: This restriction only applies to the inputs to the neural network. The memory included in some neural network topologies will produce values as long as all of the current inputs are available. However, it is important to be aware that random values may be used in the memory at the beginning of the data.
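The way null values accumulate at the start of the data can be sketched with two made-up inputs. The function names, the price values, and the choice of a three-bar moving average are hypothetical. `None` stands in for an empty cell.

```python
def change(values):
    """Bar-to-bar change; null when the current or previous value is missing."""
    if not values:
        return []
    out = [None]  # the first bar has no previous value
    for prev, curr in zip(values, values[1:]):
        out.append(None if prev is None or curr is None else curr - prev)
    return out

def moving_average(values, period):
    """Simple moving average; null until a full window of values is available."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - period + 1): i + 1]
        if len(window) < period or any(v is None for v in window):
            out.append(None)
        else:
            out.append(sum(window) / period)
    return out

close = [10, 11, 12, 16, 14, 18]
input_1 = change(close)                     # null on the first bar
input_2 = change(moving_average(close, 3))  # null on the first three bars

# A prediction can only produce a value where every input is non-null.
ready = [a is not None and b is not None for a, b in zip(input_1, input_2)]
first_predicted_bar = ready.index(True)
```

Here the second input needs three bars for the average plus one more for the change, so the first bar that can have a predicted value is the fourth one (index 3).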
Why does my predicted value appear to lag the actual price by a few bars?
When predicting a desired output with change or percent change preprocessing, the neural network may find that the least error it can produce with the given inputs is to predict no change in price. This causes the predicted value to appear to lag the actual price by the number of bars associated with the prediction since it is predicting that no change will occur from the previous value. This is typically a sign that the neural network does not have enough information to produce a good model. Better results may be achieved with additional relevant inputs.
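The lag can be reproduced directly: a prediction of "no change" for a value some number of bars ahead is just the last known price carried forward by that many bars. The function name and prices below are hypothetical.

```python
def no_change_prediction(prices, bars_ahead):
    """Predicted price for each bar when the model always predicts zero change.

    The prediction for a bar equals the price from bars_ahead bars earlier,
    so the first bars_ahead entries have no value (None).
    """
    return ([None] * bars_ahead + list(prices))[:len(prices)]

prices = [100, 102, 105, 103, 108]
predicted = no_change_prediction(prices, bars_ahead=1)
```

Plotted against the actual price, `predicted` is the same series shifted one bar to the right, which is the lag described above.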