What is sequence modeling?
When the output of one model is fed as part of the input to another model for prediction, we call it a sequence model. The concept arises when there are two or more parameters that you want to predict together. The idea also implies something about the causal relationship between those parameters: which parameter is predicted first, and the order of the rest, has to be decided in advance. The outputs of the sequential model should reflect how the operations or the process actually happen in real time. When we speak about causation we mostly think about the manipulated and the controlled variables, not the responding variables, in the “line of action”. But what if several parameters could be treated as responding variables in the process, and you don’t know which one to predict and use as the final metric? In that case we can use a sequence model: predict one of the responding variables first as a function of the independent parameters, then use that prediction as an input to the next model. The architecture of such a model can be as simple as a linear chain or as complex as a network of blocks.
When to use sequence modeling?
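A minimal sketch of the simplest case, a linear chain of two models, may make the idea concrete. Everything here is an assumption for illustration: the data is made up, the names x, y1 and y2 are hypothetical, and a one-input least-squares line stands in for whatever model you would actually train.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single input."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, y_bar - a * x_bar

def predict(model, x):
    a, b = model
    return a * x + b

# Hypothetical data: x is the independent parameter,
# y1 and y2 are two responding variables.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y1 = [2.0 * v + 1.0 for v in x]
y2 = [0.5 * v - 1.0 for v in y1]

# Model 1: independent parameter -> first responding variable.
model_1 = fit_line(x, y1)

# Model 2 is trained on the *predictions* of model 1, not on measured y1,
# so it sees the same kind of input it will receive at prediction time.
y1_hat = [predict(model_1, v) for v in x]
model_2 = fit_line(y1_hat, y2)

def predict_chain(x_new):
    """Run the chain: x -> y1 (model 1) -> y2 (model 2)."""
    y1_new = predict(model_1, x_new)
    y2_new = predict(model_2, y1_new)
    return y1_new, y2_new

y1_new, y2_new = predict_chain(6.0)
print(y1_new, y2_new)
```

The order of the chain (y1 first, then y2) is the pre-determined part: swapping it would encode a different assumption about how the process works.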
This approach can be applied to any process where one outcome can be transformed and consumed as an input by another model. It does not necessarily mean that, in the real scenario, the outcome of the first model is responsible for the outcome of the second; because this is a data-driven approach, the volume and variability of the data play a vital role. A good example is connected processes in downstream operations, where a pump is connected to a distillation column, followed by a re-boiler and a condenser in parallel, and so on. The output of the pump will clearly affect the distillation performance and the successive operations, so it makes complete sense to implement a sequence model here: the performance of the entire process line can be gauged in a single thread of analytics with multiple knots, each knot being a model. The first model is for the pump and predicts its output parameter, which becomes the input to the next operation, the distillation column. From there, two separate models run in parallel, one each for the re-boiler and the condenser. The entire non-linear modeling plane can be visualized in a single dashboard, where each box represents a unit operation or process.
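The pump, column, re-boiler and condenser line described above can be sketched as a small network of blocks. This is only an illustration: the four model functions are hypothetical placeholders with made-up coefficients standing in for trained models, and each block is assumed to take a single upstream value.

```python
# Placeholder "models", one per unit operation. In practice each would be
# a trained model; the formulas below are invented for illustration only.
def pump_model(speed):
    return 2.0 * speed            # pump speed -> discharge flow

def column_model(flow):
    return flow + 10.0            # feed flow -> column output parameter

def reboiler_model(column_out):
    return 0.5 * column_out       # column output -> re-boiler output

def condenser_model(column_out):
    return column_out - 5.0       # column output -> condenser output

# Each block names its upstream block (None for the first block) and its model.
blocks = {
    "pump": (None, pump_model),
    "column": ("pump", column_model),
    "reboiler": ("column", reboiler_model),   # runs in parallel with condenser
    "condenser": ("column", condenser_model),
}

def run_line(blocks, line_input):
    """Evaluate every block, feeding each one the output of its upstream block."""
    outputs = {}

    def evaluate(name):
        if name not in outputs:
            upstream, model = blocks[name]
            x = line_input if upstream is None else evaluate(upstream)
            outputs[name] = model(x)
        return outputs[name]

    for name in blocks:
        evaluate(name)
    return outputs

results = run_line(blocks, 50.0)
for name, value in results.items():
    print(name, value)
```

The dictionary returned by run_line holds one value per block, which is the kind of per-unit result a dashboard box could display.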
Many times your responding variables are not directly associated with each other, but the nature of the data over time still gives you a reason to use this concept. Yes, it happens! The behavior or trend of the data can also be used to identify the output parameters and arrange them sequentially in the models. One such use case is a badly performing controller, meaning that you are mostly off-spec and running at sub-optimal operations, and you always see a large difference between the set value and the actual indicated value. In such cases you can create an ML/AI model (the first model) that uses the independent process parameters to predict the actual value, and then feed this predicted actual value into a second model that predicts the set value as its output. Essentially, by mapping the non-linear behavior of the process and the controller, it prescribes to the engineer what value to set next.
Written by,
ParthPrasoon Sinha
Sr. Data Scientist
Tridiagonal Solutions