The CREATE MINING MODEL statement creates a new mining model based on the column definition list. Each column definition can include content flags, which give the mining algorithm additional information about the content of the training data or the model. Flags within a flag type group are mutually exclusive: a column can use no more than one flag from each group. Flags must also be placed in their correct order. The following table lists the flag type groups and the correct order for the content flags.
Flag type | Flag name | Description |
---|---|---|
Distribution | NORMAL | The values of the column appear in a normal distribution. |
Distribution | LOG NORMAL | The values of the column appear in a log normal distribution. |
Distribution | UNIFORM | The values of the column appear in a uniform distribution. |
Content Type | KEY | The column is discrete and is a key. Key columns do not have any other flags, except in the case of a nested table with no attribute columns. |
Content Type | CONTINUOUS | The column contains values in a continuous range, such as Age or Salary. |
Content Type | DISCRETE | The column contains a discrete set of values, such as Gender. |
Content Type | DISCRETIZED() | The column contains a continuous set of values that should be converted to buckets. |
Content Type | ORDERED | The column contains a discrete set of values that are ordered, such as Salary Level. |
Content Type | CYCLICAL | The column contains an ordered discrete set of values that are cyclical, such as Day of Week or Month. |
Content Type | SEQUENCE TIME | The column contains time measurement units. |
Modeling | MODEL_EXISTENCE_ONLY | The column should be modeled as having two states, missing and nonmissing, regardless of the values in the column. This is particularly useful for columns in a nested table, where values are sparse across cases. |
Modeling | NOT NULL | The column cannot accept NULL values. |
Special Property | PROBABILITY | The value in this column is the probability (0-1) of the associated value. |
Special Property | VARIANCE | The value in this column is the variance of the associated value. |
Special Property | STD | The value in this column is the standard deviation of the associated value. |
Special Property | PROBABILITY VARIANCE | The value in this column is the variance of the probability of the associated value. |
Special Property | PROBABILITY STD | The value in this column is the standard deviation of the probability of the associated value. |
Special Property | SUPPORT | The value in this column is the weight (case replication factor) of the associated value. |
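
To illustrate how content flags fit into a column definition list, here is a minimal sketch. The model name, column names, and data types are hypothetical. The `USING Microsoft_Clustering` clause is an assumption about the provider; substitute any mining algorithm your provider exposes. Provider support for individual flags may vary.

```sql
-- Hypothetical model: every name here is illustrative only.
CREATE MINING MODEL [Customer Profile]
(
    [Customer ID] LONG   KEY,                -- Content Type: KEY, no other flags
    [Gender]      TEXT   DISCRETE,           -- Content Type: DISCRETE
    [Income]      DOUBLE NORMAL CONTINUOUS,  -- Distribution flag first, then Content Type
    [Age]         DOUBLE DISCRETIZED()       -- continuous values converted to buckets
)
USING Microsoft_Clustering  -- assumed algorithm name; replace as needed
```

Each column uses at most one flag from any group. On the `[Income]` column, the Distribution flag precedes the Content Type flag, following the order of the groups in the table.
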
Column relations are described in one of the following ways.
<Column relation> clause | Description |
---|---|
OF | This form is restricted to use for columns with Special Property content flags, for example, ProbGender Double PROBABILITY OF Gender. |
RELATED TO | This form indicates a value hierarchy. The target of a RELATED TO column can be a key column in a nested table, a discretely valued column on the case row, or another column with a RELATED TO clause (indicating a deeper hierarchy). |
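
The two column relation forms might be combined as in the following sketch. It reuses the `ProbGender` example from the table; the model name, the nested table, its columns, and the `USING` target are hypothetical. Whether a given provider accepts Special Property columns is also an assumption here.

```sql
-- Hypothetical model: names and the algorithm are illustrative only.
CREATE MINING MODEL [Purchase Profile]
(
    [Customer ID] LONG   KEY,
    [Gender]      TEXT   DISCRETE,
    -- OF: ties a Special Property column to the column it qualifies
    [ProbGender]  DOUBLE PROBABILITY OF [Gender],
    [Product Purchases] TABLE
    (
        [Product Name] TEXT KEY,
        -- RELATED TO: Product Type sits above the nested-table key in a value hierarchy
        [Product Type] TEXT DISCRETE RELATED TO [Product Name]
    )
)
USING Microsoft_Clustering  -- assumed algorithm name; replace as needed
```

Here `[ProbGender]` holds the probability of the value in `[Gender]`, and `[Product Type]` targets the key column of the nested table, which is one of the permitted RELATED TO targets.
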
The following flags are used to describe how a prediction column functions.
<Prediction flag> clause | Description |
---|---|
PREDICT | This column can be predicted by the model and it can be supplied in input cases to predict the value of other predictable columns. |
PREDICT_ONLY | This column can be predicted by the model, but its values cannot be used in input cases to predict the value of other predictable columns. |
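
The difference between the two prediction flags can be sketched as follows. The model and column names are hypothetical, and `Microsoft_Decision_Trees` is an assumed algorithm name; use whichever predictive algorithm your provider offers.

```sql
-- Hypothetical model: names and the algorithm are illustrative only.
CREATE MINING MODEL [Age Prediction]
(
    [Customer ID] LONG   KEY,
    [Income]      DOUBLE CONTINUOUS,                 -- input only (no prediction flag)
    [Gender]      TEXT   DISCRETE PREDICT,           -- predictable, and usable as input
    [Age]         DOUBLE DISCRETIZED() PREDICT_ONLY  -- predictable, never used as input
)
USING Microsoft_Decision_Trees  -- assumed algorithm name; replace as needed
```

In this sketch, `[Gender]` can be supplied in an input case to help predict `[Age]`, but `[Age]` values are not used to predict `[Gender]`.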