fbpx
Wikipedia

Stochastic programming

For the context of control theory, see Stochastic control.

In the field of mathematical optimization, stochastic programming is a framework for modeling optimization problems that involve uncertainty. A stochastic program is an optimization problem in which some or all problem parameters are uncertain, but follow known probability distributions. This framework contrasts with deterministic optimization, in which all problem parameters are assumed to be known exactly. The goal of stochastic programming is to find a decision which both optimizes some criteria chosen by the decision maker, and appropriately accounts for the uncertainty of the problem parameters. Because many real-world decisions involve uncertainty, stochastic programming has found applications in a broad range of areas ranging from finance to transportation to energy optimization.

Contents

The basic idea of two-stage stochastic programming is that (optimal) decisions should be based on data available at the time the decisions are made and cannot depend on future observations. The two-stage formulation is widely used in stochastic programming. The general formulation of a two-stage stochastic programming problem is given by:

min x X { g ( x ) = f ( x ) + E ξ [ Q ( x , ξ ) ] } {\displaystyle \min _{x\in X}\{g(x)=f(x)+E_{\xi }[Q(x,\xi )]\}}
where Q ( x , ξ ) {\displaystyle Q(x,\xi )} is the optimal value of the second-stage problem
min y { q ( y , ξ ) | T ( ξ ) x + W ( ξ ) y = h ( ξ ) } . {\displaystyle \min _{y}\{q(y,\xi )\,|\,T(\xi )x+W(\xi )y=h(\xi )\}.}

The classical two-stage linear stochastic programming problems can be formulated as

min x R n g ( x ) = c T x + E ξ [ Q ( x , ξ ) ] subject to A x = b x 0 {\displaystyle {\begin{array}{llr}\min \limits _{x\in \mathbb {R} ^{n}}&g(x)=c^{T}x+E_{\xi }[Q(x,\xi )]&\\{\text{subject to}}&Ax=b&\\&x\geq 0&\end{array}}}

where Q ( x , ξ ) {\displaystyle Q(x,\xi )} is the optimal value of the second-stage problem

min y R m q ( ξ ) T y subject to T ( ξ ) x + W ( ξ ) y = h ( ξ ) y 0 {\displaystyle {\begin{array}{llr}\min \limits _{y\in \mathbb {R} ^{m}}&q(\xi )^{T}y&\\{\text{subject to}}&T(\xi )x+W(\xi )y=h(\xi )&\\&y\geq 0&\end{array}}}

In such formulation x R n {\displaystyle x\in \mathbb {R} ^{n}} is the first-stage decision variable vector, y R m {\displaystyle y\in \mathbb {R} ^{m}} is the second-stage decision variable vector, and ξ ( q , T , W , h ) {\displaystyle \xi (q,T,W,h)} contains the data of the second-stage problem. In this formulation, at the first stage we have to make a "here-and-now" decision x {\displaystyle x} before the realization of the uncertain data ξ {\displaystyle \xi } , viewed as a random vector, is known. At the second stage, after a realization of ξ {\displaystyle \xi } becomes available, we optimize our behavior by solving an appropriate optimization problem.

At the first stage we optimize (minimize in the above formulation) the cost c T x {\displaystyle c^{T}x} of the first-stage decision plus the expected cost of the (optimal) second-stage decision. We can view the second-stage problem simply as an optimization problem which describes our supposedly optimal behavior when the uncertain data is revealed, or we can consider its solution as a recourse action where the term W y {\displaystyle Wy} compensates for a possible inconsistency of the system T x h {\displaystyle Tx\leq h} and q T y {\displaystyle q^{T}y} is the cost of this recourse action.

The considered two-stage problem is linear because the objective functions and the constraints are linear. Conceptually this is not essential and one can consider more general two-stage stochastic programs. For example, if the first-stage problem is integer, one could add integrality constraints to the first-stage problem so that the feasible set is discrete. Non-linear objectives and constraints could also be incorporated if needed.

Distributional assumption

The formulation of the above two-stage problem assumes that the second-stage data ξ {\displaystyle \xi } can be modeled as a random vector with a known probability distribution (not just uncertain). This would be justified in many situations. For example, ξ {\displaystyle \xi } could be information derived from historical data and the distribution does not significantly change over the considered period of time. In such situations one may reliably estimate the required probability distribution and the optimization on average could be justified by the law of large numbers. Another example is that ξ {\displaystyle \xi } could be realizations of a simulation model whose outputs are stochastic. The empirical distribution of the sample could be used as an approximation to the true but unknown output distribution.

Discretization

To solve the two-stage stochastic problem numerically, one often needs to assume that the random vector ξ {\displaystyle \xi } has a finite number of possible realizations, called scenarios, say ξ 1 , , ξ K {\displaystyle \xi _{1},\dots ,\xi _{K}} , with respective probability masses p 1 , , p K {\displaystyle p_{1},\dots ,p_{K}} . Then the expectation in the first-stage problem's objective function can be written as the summation:

E [ Q ( x , ξ ) ] = k = 1 K p k Q ( x , ξ k ) {\displaystyle E[Q(x,\xi )]=\sum \limits _{k=1}^{K}p_{k}Q(x,\xi _{k})}
and, moreover, the two-stage problem can be formulated as one large linear programming problem (this is called the deterministic equivalent of the original problem, see section § Deterministic equivalent of a stochastic problem).

When ξ {\displaystyle \xi } has an infinite (or very large) number of possible realizations the standard approach is then to represent this distribution by scenarios. This approach raises three questions, namely:

  1. How to construct scenarios, see § Scenario construction;
  2. How to solve the deterministic equivalent. Optimizers such as CPLEX, GLPK and Gurobi can solve large linear/nonlinear problems. The NEOS Server, hosted at the University of Wisconsin, Madison, allows free access to many modern solvers. The structure of a deterministic equivalent is particularly amenable to apply decomposition methods, such as Benders' decomposition or scenario decomposition;
  3. How to measure quality of the obtained solution with respect to the "true" optimum.

These questions are not independent. For example, the number of scenarios constructed will affect both the tractability of the deterministic equivalent and the quality of the obtained solutions.

A stochastic linear program is a specific instance of the classical two-stage stochastic program. A stochastic LP is built from a collection of multi-period linear programs (LPs), each having the same structure but somewhat different data. The k t h {\displaystyle k^{th}} two-period LP, representing the k t h {\displaystyle k^{th}} scenario, may be regarded as having the following form:

Minimize f T x + g T y + h k T z k subject to T x + U y = r V k y + W k z k = s k x , y , z k 0 {\displaystyle {\begin{array}{lccccccc}{\text{Minimize}}&f^{T}x&+&g^{T}y&+&h_{k}^{T}z_{k}&&\\{\text{subject to}}&Tx&+&Uy&&&=&r\\&&&V_{k}y&+&W_{k}z_{k}&=&s_{k}\\&x&,&y&,&z_{k}&\geq &0\end{array}}}

The vectors x {\displaystyle x} and y {\displaystyle y} contain the first-period variables, whose values must be chosen immediately. The vector z k {\displaystyle z_{k}} contains all of the variables for subsequent periods. The constraints T x + U y = r {\displaystyle Tx+Uy=r} involve only first-period variables and are the same in every scenario. The other constraints involve variables of later periods and differ in some respects from scenario to scenario, reflecting uncertainty about the future.

Note that solving the k t h {\displaystyle k^{th}} two-period LP is equivalent to assuming the k t h {\displaystyle k^{th}} scenario in the second period with no uncertainty. In order to incorporate uncertainties in the second stage, one should assign probabilities to different scenarios and solve the corresponding deterministic equivalent.

Deterministic equivalent of a stochastic problem

With a finite number of scenarios, two-stage stochastic linear programs can be modelled as large linear programming problems. This formulation is often called the deterministic equivalent linear program, or abbreviated to deterministic equivalent. (Strictly speaking a deterministic equivalent is any mathematical program that can be used to compute the optimal first-stage decision, so these will exist for continuous probability distributions as well, when one can represent the second-stage cost in some closed form.) For example, to form the deterministic equivalent to the above stochastic linear program, we assign a probability p k {\displaystyle p_{k}} to each scenario k = 1 , , K {\displaystyle k=1,\dots ,K} . Then we can minimize the expected value of the objective, subject to the constraints from all scenarios:

Minimize f x + g y + p 1 h 1 z 1 + p 2 h 2 T z 2 + + p K h K z K subject to T x + U y = r V 1 y + W 1 z 1 = s 1 V 2 y + W 2 z 2 = s 2 V K y + W K z K = s K x , y , z 1 , z 2 , , z K 0 {\displaystyle {\begin{array}{lccccccccccccc}{\text{Minimize}}&f^{\top }x&+&g^{\top }y&+&p_{1}h_{1}^{\top }z_{1}&+&p_{2}h_{2}^{T}z_{2}&+&\cdots &+&p_{K}h_{K}^{\top }z_{K}&&\\{\text{subject to}}&Tx&+&Uy&&&&&&&&&=&r\\&&&V_{1}y&+&W_{1}z_{1}&&&&&&&=&s_{1}\\&&&V_{2}y&&&+&W_{2}z_{2}&&&&&=&s_{2}\\&&&\vdots &&&&&&\ddots &&&&\vdots \\&&&V_{K}y&&&&&&&+&W_{K}z_{K}&=&s_{K}\\&x&,&y&,&z_{1}&,&z_{2}&,&\ldots &,&z_{K}&\geq &0\\\end{array}}}

We have a different vector z k {\displaystyle z_{k}} of later-period variables for each scenario k {\displaystyle k} . The first-period variables x {\displaystyle x} and y {\displaystyle y} are the same in every scenario, however, because we must make a decision for the first period before we know which scenario will be realized. As a result, the constraints involving just x {\displaystyle x} and y {\displaystyle y} need only be specified once, while the remaining constraints must be given separately for each scenario.

In practice it might be possible to construct scenarios by eliciting experts' opinions on the future. The number of constructed scenarios should be relatively modest so that the obtained deterministic equivalent can be solved with reasonable computational effort. It is often claimed that a solution that is optimal using only a few scenarios provides more adaptable plans than one that assumes a single scenario only. In some cases such a claim could be verified by a simulation. In theory some measures of guarantee that an obtained solution solves the original problem with reasonable accuracy. Typically in applications only the first stage optimal solution x {\displaystyle x^{*}} has a practical value since almost always a "true" realization of the random data will be different from the set of constructed (generated) scenarios.

Suppose ξ {\displaystyle \xi } contains d {\displaystyle d} independent random components, each of which has three possible realizations (for example, future realizations of each random parameters are classified as low, medium and high), then the total number of scenarios is K = 3 d {\displaystyle K=3^{d}} . Such exponential growth of the number of scenarios makes model development using expert opinion very difficult even for reasonable size d {\displaystyle d} . The situation becomes even worse if some random components of ξ {\displaystyle \xi } have continuous distributions.

Monte Carlo sampling and Sample Average Approximation (SAA) Method

A common approach to reduce the scenario set to a manageable size is by using Monte Carlo simulation. Suppose the total number of scenarios is very large or even infinite. Suppose further that we can generate a sample ξ 1 , ξ 2 , , ξ N {\displaystyle \xi ^{1},\xi ^{2},\dots ,\xi ^{N}} of N {\displaystyle N} replications of the random vector ξ {\displaystyle \xi } . Usually the sample is assumed to be independent and identically distributed (i.i.d sample). Given a sample, the expectation function q ( x ) = E [ Q ( x , ξ ) ] {\displaystyle q(x)=E[Q(x,\xi )]} is approximated by the sample average

q ^ N ( x ) = 1 N j = 1 N Q ( x , ξ j ) {\displaystyle {\hat {q}}_{N}(x)={\frac {1}{N}}\sum _{j=1}^{N}Q(x,\xi ^{j})}

and consequently the first-stage problem is given by

g ^ N ( x ) = min x R n c T x + 1 N j = 1 N Q ( x , ξ j ) subject to A x = b x 0 {\displaystyle {\begin{array}{rlrrr}{\hat {g}}_{N}(x)=&\min \limits _{x\in \mathbb {R} ^{n}}&c^{T}x+{\frac {1}{N}}\sum _{j=1}^{N}Q(x,\xi ^{j})&\\&{\text{subject to}}&Ax&=&b\\&&x&\geq &0\end{array}}}

This formulation is known as the Sample Average Approximation method. The SAA problem is a function of the considered sample and in that sense is random. For a given sample ξ 1 , ξ 2 , , ξ N {\displaystyle \xi ^{1},\xi ^{2},\dots ,\xi ^{N}} the SAA problem is of the same form as a two-stage stochastic linear programming problem with the scenarios ξ j {\displaystyle \xi ^{j}} ., j = 1 , , N {\displaystyle j=1,\dots ,N} , each taken with the same probability p j = 1 N {\displaystyle p_{j}={\frac {1}{N}}} .

Consider the following stochastic programming problem

min x X { g ( x ) = f ( x ) + E [ Q ( x , ξ ) ] } {\displaystyle \min \limits _{x\in X}\{g(x)=f(x)+E[Q(x,\xi )]\}}

Here X {\displaystyle X} is a nonempty closed subset of R n {\displaystyle \mathbb {R} ^{n}} , ξ {\displaystyle \xi } is a random vector whose probability distribution P {\displaystyle P} is supported on a set Ξ R d {\displaystyle \Xi \subset \mathbb {R} ^{d}} , and Q : X × Ξ R {\displaystyle Q:X\times \Xi \rightarrow \mathbb {R} } . In the framework of two-stage stochastic programming, Q ( x , ξ ) {\displaystyle Q(x,\xi )} is given by the optimal value of the corresponding second-stage problem.

Assume that g ( x ) {\displaystyle g(x)} is well defined and finite valued for all x X {\displaystyle x\in X} . This implies that for every x X {\displaystyle x\in X} the value Q ( x , ξ ) {\displaystyle Q(x,\xi )} is finite almost surely.

Suppose that we have a sample ξ 1 , , ξ N {\displaystyle \xi ^{1},\dots ,\xi ^{N}} of N {\displaystyle N} realizations of the random vector ξ {\displaystyle \xi } . This random sample can be viewed as historical data of N {\displaystyle N} observations of ξ {\displaystyle \xi } , or it can be generated by Monte Carlo sampling techniques. Then we can formulate a corresponding sample average approximation

min x X { g ^ N ( x ) = f ( x ) + 1 N j = 1 N Q ( x , ξ j ) } {\displaystyle \min \limits _{x\in X}\{{\hat {g}}_{N}(x)=f(x)+{\frac {1}{N}}\sum _{j=1}^{N}Q(x,\xi ^{j})\}}

By the Law of Large Numbers we have that, under some regularity conditions 1 N j = 1 N Q ( x , ξ j ) {\displaystyle {\frac {1}{N}}\sum _{j=1}^{N}Q(x,\xi ^{j})} converges pointwise with probability 1 to E [ Q ( x , ξ ) ] {\displaystyle E[Q(x,\xi )]} as N {\displaystyle N\rightarrow \infty } . Moreover, under mild additional conditions the convergence is uniform. We also have E [ g ^ N ( x ) ] = g ( x ) {\displaystyle E[{\hat {g}}_{N}(x)]=g(x)} , i.e., g ^ N ( x ) {\displaystyle {\hat {g}}_{N}(x)} is an unbiased estimator of g ( x ) {\displaystyle g(x)} . Therefore, it is natural to expect that the optimal value and optimal solutions of the SAA problem converge to their counterparts of the true problem as N {\displaystyle N\rightarrow \infty } .

Consistency of SAA estimators

Suppose the feasible set X {\displaystyle X} of the SAA problem is fixed, i.e., it is independent of the sample. Let ϑ {\displaystyle \vartheta ^{*}} and S {\displaystyle S^{*}} be the optimal value and the set of optimal solutions, respectively, of the true problem and let ϑ ^ N {\displaystyle {\hat {\vartheta }}_{N}} and S ^ N {\displaystyle {\hat {S}}_{N}} be the optimal value and the set of optimal solutions, respectively, of the SAA problem.

  1. Let g : X R {\displaystyle g:X\rightarrow \mathbb {R} } and g ^ N : X R {\displaystyle {\hat {g}}_{N}:X\rightarrow \mathbb {R} } be a sequence of (deterministic) real valued functions. The following two properties are equivalent:
    • for any x ¯ X {\displaystyle {\overline {x}}\in X} and any sequence { x N } X {\displaystyle \{x_{N}\}\subset X} converging to x ¯ {\displaystyle {\overline {x}}} it follows that g ^ N ( x N ) {\displaystyle {\hat {g}}_{N}(x_{N})} converges to g ( x ¯ ) {\displaystyle g({\overline {x}})}
    • the function g ( ) {\displaystyle g(\cdot )} is continuous on X {\displaystyle X} and g ^ N ( ) {\displaystyle {\hat {g}}_{N}(\cdot )} converges to g ( ) {\displaystyle g(\cdot )} uniformly on any compact subset of X {\displaystyle X}
  2. If the objective of the SAA problem g ^ N ( x ) {\displaystyle {\hat {g}}_{N}(x)} converges to the true problem's objective g ( x ) {\displaystyle g(x)} with probability 1, as N {\displaystyle N\rightarrow \infty } , uniformly on the feasible set X {\displaystyle X} . Then ϑ ^ N {\displaystyle {\hat {\vartheta }}_{N}} converges to ϑ {\displaystyle \vartheta ^{*}} with probability 1 as N {\displaystyle N\rightarrow \infty } .
  3. Suppose that there exists a compact set C R n {\displaystyle C\subset \mathbb {R} ^{n}} such that
    • the set S {\displaystyle S} of optimal solutions of the true problem is nonempty and is contained in C {\displaystyle C}
    • the function g ( x ) {\displaystyle g(x)} is finite valued and continuous on C {\displaystyle C}
    • the sequence of functions g ^ N ( x ) {\displaystyle {\hat {g}}_{N}(x)} converges to g ( x ) {\displaystyle g(x)} with probability 1, as N {\displaystyle N\rightarrow \infty } , uniformly in x C {\displaystyle x\in C}
    • for N {\displaystyle N} large enough the set S ^ N {\displaystyle {\hat {S}}_{N}} is nonempty and S ^ N C {\displaystyle {\hat {S}}_{N}\subset C} with probability 1
then ϑ ^ N ϑ {\displaystyle {\hat {\vartheta }}_{N}\rightarrow \vartheta ^{*}} and D ( S , S ^ N ) 0 {\displaystyle \mathbb {D} (S^{*},{\hat {S}}_{N})\rightarrow 0} with probability 1 as N {\displaystyle N\rightarrow \infty } . Note that D ( A , B ) {\displaystyle \mathbb {D} (A,B)} denotes the deviation of set A {\displaystyle A} from set B {\displaystyle B} , defined as
D ( A , B ) := sup x A { inf x B x x } {\displaystyle \mathbb {D} (A,B):=\sup _{x\in A}\{\inf _{x'\in B}\|x-x'\|\}}

In some situations the feasible set X {\displaystyle X} of the SAA problem is estimated, then the corresponding SAA problem takes the form

min x X N g ^ N ( x ) {\displaystyle \min _{x\in X_{N}}{\hat {g}}_{N}(x)}

where X N {\displaystyle X_{N}} is a subset of R n {\displaystyle \mathbb {R} ^{n}} depending on the sample and therefore is random. Nevertheless, consistency results for SAA estimators can still be derived under some additional assumptions:

  1. Suppose that there exists a compact set C R n {\displaystyle C\subset \mathbb {R} ^{n}} such that
    • the set S {\displaystyle S} of optimal solutions of the true problem is nonempty and is contained in C {\displaystyle C}
    • the function g ( x ) {\displaystyle g(x)} is finite valued and continuous on C {\displaystyle C}
    • the sequence of functions g ^ N ( x ) {\displaystyle {\hat {g}}_{N}(x)} converges to g ( x ) {\displaystyle g(x)} with probability 1, as N {\displaystyle N\rightarrow \infty } , uniformly in x C {\displaystyle x\in C}
    • for N {\displaystyle N} large enough the set S ^ N {\displaystyle {\hat {S}}_{N}} is nonempty and S ^ N C {\displaystyle {\hat {S}}_{N}\subset C} with probability 1
    • if x N X N {\displaystyle x_{N}\in X_{N}} and x N {\displaystyle x_{N}} converges with probability 1 to a point x {\displaystyle x} , then x X {\displaystyle x\in X}
    • for some point x S {\displaystyle x\in S^{*}} there exists a sequence x N X N {\displaystyle x_{N}\in X_{N}} such that x N x {\displaystyle x_{N}\rightarrow x} with probability 1.
then ϑ ^ N ϑ {\displaystyle {\hat {\vartheta }}_{N}\rightarrow \vartheta ^{*}} and D ( S , S ^ N ) 0 {\displaystyle \mathbb {D} (S^{*},{\hat {S}}_{N})\rightarrow 0} with probability 1 as N {\displaystyle N\rightarrow \infty } .

Asymptotics of the SAA optimal value

Suppose the sample ξ 1 , , ξ N {\displaystyle \xi ^{1},\dots ,\xi ^{N}} is i.i.d. and fix a point x X {\displaystyle x\in X} . Then the sample average estimator g ^ N ( x ) {\displaystyle {\hat {g}}_{N}(x)} , of g ( x ) {\displaystyle g(x)} , is unbiased and has variance 1 N σ 2 ( x ) {\displaystyle {\frac {1}{N}}\sigma ^{2}(x)} , where σ 2 ( x ) := V a r [ Q ( x , ξ ) ] {\displaystyle \sigma ^{2}(x):=Var[Q(x,\xi )]} is supposed to be finite. Moreover, by the central limit theorem we have that

N [ g ^ N g ( x ) ] D Y x {\displaystyle {\sqrt {N}}[{\hat {g}}_{N}-g(x)]{\xrightarrow {\mathcal {D}}}Y_{x}}

where D {\displaystyle {\xrightarrow {\mathcal {D}}}} denotes convergence in distribution and Y x {\displaystyle Y_{x}} has a normal distribution with mean 0 {\displaystyle 0} and variance σ 2 ( x ) {\displaystyle \sigma ^{2}(x)} , written as N ( 0 , σ 2 ( x ) ) {\displaystyle {\mathcal {N}}(0,\sigma ^{2}(x))} .

In other words, g ^ N ( x ) {\displaystyle {\hat {g}}_{N}(x)} has asymptotically normal distribution, i.e., for large N {\displaystyle N} , g ^ N ( x ) {\displaystyle {\hat {g}}_{N}(x)} has approximately normal distribution with mean g ( x ) {\displaystyle g(x)} and variance 1 N σ 2 ( x ) {\displaystyle {\frac {1}{N}}\sigma ^{2}(x)} . This leads to the following (approximate) 100 ( 1 α ) {\displaystyle 100(1-\alpha )} % confidence interval for f ( x ) {\displaystyle f(x)} :

[ g ^ N ( x ) z α / 2 σ ^ ( x ) N , g ^ N ( x ) + z α / 2 σ ^ ( x ) N ] {\displaystyle \left[{\hat {g}}_{N}(x)-z_{\alpha /2}{\frac {{\hat {\sigma }}(x)}{\sqrt {N}}},{\hat {g}}_{N}(x)+z_{\alpha /2}{\frac {{\hat {\sigma }}(x)}{\sqrt {N}}}\right]}

where z α / 2 := Φ 1 ( 1 α / 2 ) {\displaystyle z_{\alpha /2}:=\Phi ^{-1}(1-\alpha /2)} (here Φ ( ) {\displaystyle \Phi (\cdot )} denotes the cdf of the standard normal distribution) and

σ ^ 2 ( x ) := 1 N 1 j = 1 N [ Q ( x , ξ j ) 1 N j = 1 N Q ( x , ξ j ) ] 2 {\displaystyle {\hat {\sigma }}^{2}(x):={\frac {1}{N-1}}\sum _{j=1}^{N}\left[Q(x,\xi ^{j})-{\frac {1}{N}}\sum _{j=1}^{N}Q(x,\xi ^{j})\right]^{2}}

is the sample variance estimate of σ 2 ( x ) {\displaystyle \sigma ^{2}(x)} . That is, the error of estimation of g ( x ) {\displaystyle g(x)} is (stochastically) of order O ( N ) {\displaystyle O({\sqrt {N}})} .

Biological applications

Stochastic dynamic programming is frequently used to model animal behaviour in such fields as behavioural ecology. Empirical tests of models of optimal foraging, life-history transitions such as fledging in birds and egg laying in parasitoid wasps have shown the value of this modelling technique in explaining the evolution of behavioural decision making. These models are typically many-staged, rather than two-staged.

Economic applications

Stochastic dynamic programming is a useful tool in understanding decision making under uncertainty. The accumulation of capital stock under uncertainty is one example; often it is used by resource economists to analyze bioeconomic problems where the uncertainty enters in such as weather, etc.

Example: multistage portfolio optimization

The following is an example from finance of multi-stage stochastic programming. Suppose that at time t = 0 {\displaystyle t=0} we have initial capital W 0 {\displaystyle W_{0}} to invest in n {\displaystyle n} assets. Suppose further that we are allowed to rebalance our portfolio at times t = 1 , , T 1 {\displaystyle t=1,\dots ,T-1} but without injecting additional cash into it. At each period t {\displaystyle t} we make a decision about redistributing the current wealth W t {\displaystyle W_{t}} among the n {\displaystyle n} assets. Let x 0 = ( x 10 , , x n 0 ) {\displaystyle x_{0}=(x_{10},\dots ,x_{n0})} be the initial amounts invested in the n assets. We require that each x i 0 {\displaystyle x_{i0}} is nonnegative and that the balance equation i = 1 n x i 0 = W 0 {\displaystyle \sum _{i=1}^{n}x_{i0}=W_{0}} should hold.

Consider the total returns ξ t = ( ξ 1 t , , ξ n t ) {\displaystyle \xi _{t}=(\xi _{1t},\dots ,\xi _{nt})} for each period t = 1 , , T {\displaystyle t=1,\dots ,T} . This forms a vector-valued random process ξ 1 , , ξ T {\displaystyle \xi _{1},\dots ,\xi _{T}} . At time period t = 1 {\displaystyle t=1} , we can rebalance the portfolio by specifying the amounts x 1 = ( x 11 , , x n 1 ) {\displaystyle x_{1}=(x_{11},\dots ,x_{n1})} invested in the respective assets. At that time the returns in the first period have been realized so it is reasonable to use this information in the rebalancing decision. Thus, the second-stage decisions, at time t = 1 {\displaystyle t=1} , are actually functions of realization of the random vector ξ 1 {\displaystyle \xi _{1}} , i.e., x 1 = x 1 ( ξ 1 ) {\displaystyle x_{1}=x_{1}(\xi _{1})} . Similarly, at time t {\displaystyle t} the decision x t = ( x 1 t , , x n t ) {\displaystyle x_{t}=(x_{1t},\dots ,x_{nt})} is a function x t = x t ( ξ [ t ] ) {\displaystyle x_{t}=x_{t}(\xi _{[t]})} of the available information given by ξ [ t ] = ( ξ 1 , , ξ t ) {\displaystyle \xi _{[t]}=(\xi _{1},\dots ,\xi _{t})} the history of the random process up to time t {\displaystyle t} . A sequence of functions x t = x t ( ξ [ t ] ) {\displaystyle x_{t}=x_{t}(\xi _{[t]})} , t = 0 , , T 1 {\displaystyle t=0,\dots ,T-1} , with x 0 {\displaystyle x_{0}} being constant, defines an implementable policy of the decision process. It is said that such a policy is feasible if it satisfies the model constraints with probability 1, i.e., the nonnegativity constraints x i t ( ξ [ t ] ) 0 {\displaystyle x_{it}(\xi _{[t]})\geq 0} , i = 1 , , n {\displaystyle i=1,\dots ,n} , t = 0 , , T 1 {\displaystyle t=0,\dots ,T-1} , and the balance of wealth constraints,

i = 1 n x i t ( ξ [ t ] ) = W t , {\displaystyle \sum _{i=1}^{n}x_{it}(\xi _{[t]})=W_{t},}

where in period t = 1 , , T {\displaystyle t=1,\dots ,T} the wealth W t {\displaystyle W_{t}} is given by

W t = i = 1 n ξ i t x i , t 1 ( ξ [ t 1 ] ) , {\displaystyle W_{t}=\sum _{i=1}^{n}\xi _{it}x_{i,t-1}(\xi _{[t-1]}),}

which depends on the realization of the random process and the decisions up to time t {\displaystyle t} .

Suppose the objective is to maximize the expected utility of this wealth at the last period, that is, to consider the problem

max E [ U ( W T ) ] . {\displaystyle \max E[U(W_{T})].}

This is a multistage stochastic programming problem, where stages are numbered from t = 0 {\displaystyle t=0} to t = T 1 {\displaystyle t=T-1} . Optimization is performed over all implementable and feasible policies. To complete the problem description one also needs to define the probability distribution of the random process ξ 1 , , ξ T {\displaystyle \xi _{1},\dots ,\xi _{T}} . This can be done in various ways. For example, one can construct a particular scenario tree defining time evolution of the process. If at every stage the random return of each asset is allowed to have two continuations, independent of other assets, then the total number of scenarios is 2 n T . {\displaystyle 2^{nT}.}

In order to write dynamic programming equations, consider the above multistage problem backward in time. At the last stage t = T 1 {\displaystyle t=T-1} , a realization ξ [ T 1 ] = ( ξ 1 , , ξ T 1 ) {\displaystyle \xi _{[T-1]}=(\xi _{1},\dots ,\xi _{T-1})} of the random process is known and x T 2 {\displaystyle x_{T-2}} has been chosen. Therefore, one needs to solve the following problem

max x T 1 E [ U ( W T ) | ξ [ T 1 ] ] subject to W T = i = 1 n ξ i T x i , T 1 i = 1 n x i , T 1 = W T 1 x T 1 0 {\displaystyle {\begin{array}{lrclr}\max \limits _{x_{T-1}}&E[U(W_{T})|\xi _{[T-1]}]&\\{\text{subject to}}&W_{T}&=&\sum _{i=1}^{n}\xi _{iT}x_{i,T-1}\\&\sum _{i=1}^{n}x_{i,T-1}&=&W_{T-1}\\&x_{T-1}&\geq &0\end{array}}}

where E [ U ( W T ) | ξ [ T 1 ] ] {\displaystyle E[U(W_{T})|\xi _{[T-1]}]} denotes the conditional expectation of U ( W T ) {\displaystyle U(W_{T})} given ξ [ T 1 ] {\displaystyle \xi _{[T-1]}} . The optimal value of the above problem depends on W T 1 {\displaystyle W_{T-1}} and ξ [ T 1 ] {\displaystyle \xi _{[T-1]}} and is denoted Q T 1 ( W T 1 , ξ [ T 1 ] ) {\displaystyle Q_{T-1}(W_{T-1},\xi _{[T-1]})} .

Similarly, at stages t = T 2 , , 1 {\displaystyle t=T-2,\dots ,1} , one should solve the problem

max x t E [ Q t + 1 ( W t + 1 , ξ [ t + 1 ] ) | ξ [ t ] ] subject to W t + 1 = i = 1 n ξ i , t + 1 x i , t i = 1 n x i , t = W t x t 0 {\displaystyle {\begin{array}{lrclr}\max \limits _{x_{t}}&E[Q_{t+1}(W_{t+1},\xi _{[t+1]})|\xi _{[t]}]&\\{\text{subject to}}&W_{t+1}&=&\sum _{i=1}^{n}\xi _{i,t+1}x_{i,t}\\&\sum _{i=1}^{n}x_{i,t}&=&W_{t}\\&x_{t}&\geq &0\end{array}}}

whose optimal value is denoted by Q t ( W t , ξ [ t ] ) {\displaystyle Q_{t}(W_{t},\xi _{[t]})} . Finally, at stage t = 0 {\displaystyle t=0} , one solves the problem

max x 0 E [ Q 1 ( W 1 , ξ [ 1 ] ) ] subject to W 1 = i = 1 n ξ i , 1 x i 0 i = 1 n x i 0 = W 0 x 0 0 {\displaystyle {\begin{array}{lrclr}\max \limits _{x_{0}}&E[Q_{1}(W_{1},\xi _{[1]})]&\\{\text{subject to}}&W_{1}&=&\sum _{i=1}^{n}\xi _{i,1}x_{i0}\\&\sum _{i=1}^{n}x_{i0}&=&W_{0}\\&x_{0}&\geq &0\end{array}}}

Stagewise independent random process

For a general distribution of the process ξ t {\displaystyle \xi _{t}} , it may be hard to solve these dynamic programming equations. The situation simplifies dramatically if the process ξ t {\displaystyle \xi _{t}} is stagewise independent, i.e., ξ t {\displaystyle \xi _{t}} is (stochastically) independent of ξ 1 , , ξ t 1 {\displaystyle \xi _{1},\dots ,\xi _{t-1}} for t = 2 , , T {\displaystyle t=2,\dots ,T} . In this case, the corresponding conditional expectations become unconditional expectations, and the function Q t ( W t ) {\displaystyle Q_{t}(W_{t})} , t = 1 , , T 1 {\displaystyle t=1,\dots ,T-1} does not depend on ξ [ t ] {\displaystyle \xi _{[t]}} . That is, Q T 1 ( W T 1 ) {\displaystyle Q_{T-1}(W_{T-1})} is the optimal value of the problem

max x T 1 E [ U ( W T ) ] subject to W T = i = 1 n ξ i T x i , T 1 i = 1 n x i , T 1 = W T 1 x T 1 0 {\displaystyle {\begin{array}{lrclr}\max \limits _{x_{T-1}}&E[U(W_{T})]&\\{\text{subject to}}&W_{T}&=&\sum _{i=1}^{n}\xi _{iT}x_{i,T-1}\\&\sum _{i=1}^{n}x_{i,T-1}&=&W_{T-1}\\&x_{T-1}&\geq &0\end{array}}}

and Q t ( W t ) {\displaystyle Q_{t}(W_{t})} is the optimal value of

max x t E [ Q t + 1 ( W t + 1 ) ] subject to W t + 1 = i = 1 n ξ i , t + 1 x i , t i = 1 n x i , t = W t x t 0 {\displaystyle {\begin{array}{lrclr}\max \limits _{x_{t}}&E[Q_{t+1}(W_{t+1})]&\\{\text{subject to}}&W_{t+1}&=&\sum _{i=1}^{n}\xi _{i,t+1}x_{i,t}\\&\sum _{i=1}^{n}x_{i,t}&=&W_{t}\\&x_{t}&\geq &0\end{array}}}

for t = T 2 , , 1 {\displaystyle t=T-2,\dots ,1} .

Modelling languages

All discrete stochastic programming problems can be represented with any algebraic modeling language, manually implementing explicit or implicit non-anticipativity to make sure the resulting model respects the structure of the information made available at each stage. An instance of an SP problem generated by a general modelling language tends to grow quite large (linearly in the number of scenarios), and its matrix loses the structure that is intrinsic to this class of problems, which could otherwise be exploited at solution time by specific decomposition algorithms. Extensions to modelling languages specifically designed for SP are starting to appear, see:

  • AIMMS - supports the definition of SP problems
  • EMP SP (Extended Mathematical Programming for Stochastic Programming) - a module of GAMS created to facilitate stochastic programming (includes keywords for parametric distributions, chance constraints and risk measures such as Value at risk and Expected shortfall).
  • SAMPL - a set of extensions to AMPL specifically designed to express stochastic programs (includes syntax for chance constraints, integrated chance constraints and Robust Optimization problems)

They both can generate SMPS instance level format, which conveys in a non-redundant form the structure of the problem to the solver.

  1. Shapiro, Alexander; Dentcheva, Darinka; Ruszczyński, Andrzej (2009). Lectures on stochastic programming: Modeling and theory(PDF). MPS/SIAM Series on Optimization. 9. Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM). Mathematical Programming Society (MPS). pp. xvi+436. ISBN 978-0-89871-687-0. MR 2562798.
  2. Birge, John R.; Louveaux, François (2011). Introduction to Stochastic Programming. Springer Series in Operations Research and Financial Engineering. doi:10.1007/978-1-4614-0237-4. ISBN 978-1-4614-0236-7. ISSN 1431-8598.
  3. Stein W. Wallace and William T. Ziemba (eds.). Applications of Stochastic Programming. MPS-SIAM Book Series on Optimization 5, 2005.
  4. Applications of stochastic programming are described at the following website, Stochastic Programming Community.
  5. Shapiro, Alexander; Philpott, Andy. A tutorial on Stochastic Programming(PDF).
  6. "NEOS Server for Optimization".
  7. Ruszczyński, Andrzej; Shapiro, Alexander (2003). Stochastic Programming. Handbooks in Operations Research and Management Science. 10. Philadelphia: Elsevier. p. 700. ISBN 978-0444508546.
  8. Mangel, M. & Clark, C. W. 1988. Dynamic modeling in behavioral ecology. Princeton University Press ISBN 0-691-08506-4
  9. Houston, A. I & McNamara, J. M. 1999. Models of adaptive behaviour: an approach based on state. Cambridge University Press ISBN 0-521-65539-0
  10. Howitt, R., Msangi, S., Reynaud, A and K. Knapp. 2002. "Using Polynomial Approximations to Solve Stochastic Dynamic Programming Problems: or A "Betty Crocker " Approach to SDP." University of California, Davis, Department of Agricultural and Resource Economics Working Paper.

Stochastic programming
Stochastic programming Language Watch Edit For the context of control theory see Stochastic control In the field of mathematical optimization stochastic programming is a framework for modeling optimization problems that involve uncertainty A stochastic program is an optimization problem in which some or all problem parameters are uncertain but follow known probability distributions 1 2 This framework contrasts with deterministic optimization in which all problem parameters are assumed to be known exactly The goal of stochastic programming is to find a decision which both optimizes some criteria chosen by the decision maker and appropriately accounts for the uncertainty of the problem parameters Because many real world decisions involve uncertainty stochastic programming has found applications in a broad range of areas ranging from finance to transportation to energy optimization 3 4 Contents 1 Two stage problems 1 1 Distributional assumption 1 2 Discretization 2 Stochastic linear programming 2 1 Deterministic equivalent of a stochastic problem 3 Scenario construction 3 1 Monte Carlo sampling and Sample Average Approximation SAA Method 4 Statistical inference 4 1 Consistency of SAA estimators 4 2 Asymptotics of the SAA optimal value 5 Applications and Examples 5 1 Biological applications 5 2 Economic applications 5 3 Example multistage portfolio optimization 5 3 1 Stagewise independent random process 6 Software tools 6 1 Modelling languages 7 See also 8 References 9 Further reading 10 External linksTwo stage problems EditThe basic idea of two stage stochastic programming is that optimal decisions should be based on data available at the time the decisions are made and cannot depend on future observations The two stage formulation is widely used in stochastic programming The general formulation of a two stage stochastic programming problem is given by min x X g x f x E 3 Q x 3 displaystyle min x in X g x f x E xi Q x xi where Q x 3 displaystyle Q x xi is the optimal value of the second stage problem min y q y 3 T 3 x W 3 y h 3 displaystyle min y q y xi T xi x W xi y h xi The classical two stage linear stochastic programming problems can be formulated asmin x R n g x c T x E 3 Q x 3 subject to A x b x 0 displaystyle begin array llr min limits x in mathbb R n amp g x c T x E xi Q x xi amp text subject to amp Ax b amp amp x geq 0 amp end array where Q x 3 displaystyle Q x xi is the optimal value of the second stage problemmin y R m q 3 T y subject to T 3 x W 3 y h 3 y 0 displaystyle begin array llr min limits y in mathbb R m amp q xi T y amp text subject to amp T xi x W xi y h xi amp amp y geq 0 amp end array In such formulation x R n displaystyle x in mathbb R n is the first stage decision variable vector y R m displaystyle y in mathbb R m is the second stage decision variable vector and 3 q T W h displaystyle xi q T W h contains the data of the second stage problem In this formulation at the first stage we have to make a here and now decision x displaystyle x before the realization of the uncertain data 3 displaystyle xi viewed as a random vector is known At the second stage after a realization of 3 displaystyle xi becomes available we optimize our behavior by solving an appropriate optimization problem At the first stage we optimize minimize in the above formulation the cost c T x displaystyle c T x of the first stage decision plus the expected cost of the optimal second stage decision We can view the second stage problem simply as an optimization problem which describes our supposedly optimal behavior when the uncertain data is revealed or we can consider its solution as a recourse action where the term W y displaystyle Wy compensates for a possible inconsistency of the system T x h displaystyle Tx leq h and q T y displaystyle q T y is the cost of this recourse action The considered two stage problem is linear because the objective functions and the constraints are linear Conceptually this is not essential and one can consider more general two stage stochastic programs For example if the first stage problem is integer one could add integrality constraints to the first stage problem so that the feasible set is discrete Non linear objectives and constraints could also be incorporated if needed 5 Distributional assumption Edit The formulation of the above two stage problem assumes that the second stage data 3 displaystyle xi can be modeled as a random vector with a known probability distribution not just uncertain This would be justified in many situations For example 3 displaystyle xi could be information derived from historical data and the distribution does not significantly change over the considered period of time In such situations one may reliably estimate the required probability distribution and the optimization on average could be justified by the law of large numbers Another example is that 3 displaystyle xi could be realizations of a simulation model whose outputs are stochastic The empirical distribution of the sample could be used as an approximation to the true but unknown output distribution Discretization Edit To solve the two stage stochastic problem numerically one often needs to assume that the random vector 3 displaystyle xi has a finite number of possible realizations called scenarios say 3 1 3 K displaystyle xi 1 dots xi K with respective probability masses p 1 p K displaystyle p 1 dots p K Then the expectation in the first stage problem s objective function can be written as the summation E Q x 3 k 1 K p k Q x 3 k displaystyle E Q x xi sum limits k 1 K p k Q x xi k and moreover the two stage problem can be formulated as one large linear programming problem this is called the deterministic equivalent of the original problem see section Deterministic equivalent of a stochastic problem When 3 displaystyle xi has an infinite or very large number of possible realizations the standard approach is then to represent this distribution by scenarios This approach raises three questions namely How to construct scenarios see Scenario construction How to solve the deterministic equivalent Optimizers such as CPLEX GLPK and Gurobi can solve large linear nonlinear problems The NEOS Server 6 hosted at the University of Wisconsin Madison allows free access to many modern solvers The structure of a deterministic equivalent is particularly amenable to apply decomposition methods 7 such as Benders decomposition or scenario decomposition How to measure quality of the obtained solution with respect to the true optimum These questions are not independent For example the number of scenarios constructed will affect both the tractability of the deterministic equivalent and the quality of the obtained solutions Stochastic linear programming EditA stochastic linear program is a specific instance of the classical two stage stochastic program A stochastic LP is built from a collection of multi period linear programs LPs each having the same structure but somewhat different data The k t h displaystyle k th two period LP representing the k t h displaystyle k th scenario may be regarded as having the following form Minimize f T x g T y h k T z k subject to T x U y r V k y W k z k s k x y z k 0 displaystyle begin array lccccccc text Minimize amp f T x amp amp g T y amp amp h k T z k amp amp text subject to amp Tx amp amp Uy amp amp amp amp r amp amp amp V k y amp amp W k z k amp amp s k amp x amp amp y amp amp z k amp geq amp 0 end array The vectors x displaystyle x and y displaystyle y contain the first period variables whose values must be chosen immediately The vector z k displaystyle z k contains all of the variables for subsequent periods The constraints T x U y r displaystyle Tx Uy r involve only first period variables and are the same in every scenario The other constraints involve variables of later periods and differ in some respects from scenario to scenario reflecting uncertainty about the future Note that solving the k t h displaystyle k th two period LP is equivalent to assuming the k t h displaystyle k th scenario in the second period with no uncertainty In order to incorporate uncertainties in the second stage one should assign probabilities to different scenarios and solve the corresponding deterministic equivalent Deterministic equivalent of a stochastic problem Edit With a finite number of scenarios two stage stochastic linear programs can be modelled as large linear programming problems This formulation is often called the deterministic equivalent linear program or abbreviated to deterministic equivalent Strictly speaking a deterministic equivalent is any mathematical program that can be used to compute the optimal first stage decision so these will exist for continuous probability distributions as well when one can represent the second stage cost in some closed form For example to form the deterministic equivalent to the above stochastic linear program we assign a probability p k displaystyle p k to each scenario k 1 K displaystyle k 1 dots K Then we can minimize the expected value of the objective subject to the constraints from all scenarios Minimize f x g y p 1 h 1 z 1 p 2 h 2 T z 2 p K h K z K subject to T x U y r V 1 y W 1 z 1 s 1 V 2 y W 2 z 2 s 2 V K y W K z K s K x y z 1 z 2 z K 0 displaystyle begin array lccccccccccccc text Minimize amp f top x amp amp g top y amp amp p 1 h 1 top z 1 amp amp p 2 h 2 T z 2 amp amp cdots amp amp p K h K top z K amp amp text subject to amp Tx amp amp Uy amp amp amp amp amp amp amp amp amp amp r amp amp amp V 1 y amp amp W 1 z 1 amp amp amp amp amp amp amp amp s 1 amp amp amp V 2 y amp amp amp amp W 2 z 2 amp amp amp amp amp amp s 2 amp amp amp vdots amp amp amp amp amp amp ddots amp amp amp amp vdots amp amp amp V K y amp amp amp amp amp amp amp amp W K z K amp amp s K amp x amp amp y amp amp z 1 amp amp z 2 amp amp ldots amp amp z K amp geq amp 0 end array We have a different vector z k displaystyle z k of later period variables for each scenario k displaystyle k The first period variables x displaystyle x and y displaystyle y are the same in every scenario however because we must make a decision for the first period before we know which scenario will be realized As a result the constraints involving just x displaystyle x and y displaystyle y need only be specified once while the remaining constraints must be given separately for each scenario Scenario construction EditIn practice it might be possible to construct scenarios by eliciting experts opinions on the future The number of constructed scenarios should be relatively modest so that the obtained deterministic equivalent can be solved with reasonable computational effort It is often claimed that a solution that is optimal using only a few scenarios provides more adaptable plans than one that assumes a single scenario only In some cases such a claim could be verified by a simulation In theory some measures of guarantee that an obtained solution solves the original problem with reasonable accuracy Typically in applications only the first stage optimal solution x displaystyle x has a practical value since almost always a true realization of the random data will be different from the set of constructed generated scenarios Suppose 3 displaystyle xi contains d displaystyle d independent random components each of which has three possible realizations for example future realizations of each random parameters are classified as low medium and high then the total number of scenarios is K 3 d displaystyle K 3 d Such exponential growth of the number of scenarios makes model development using expert opinion very difficult even for reasonable size d displaystyle d The situation becomes even worse if some random components of 3 displaystyle xi have continuous distributions Monte Carlo sampling and Sample Average Approximation SAA Method Edit A common approach to reduce the scenario set to a manageable size is by using Monte Carlo simulation Suppose the total number of scenarios is very large or even infinite Suppose further that we can generate a sample 3 1 3 2 3 N displaystyle xi 1 xi 2 dots xi N of N displaystyle N replications of the random vector 3 displaystyle xi Usually the sample is assumed to be independent and identically distributed i i d sample Given a sample the expectation function q x E Q x 3 displaystyle q x E Q x xi is approximated by the sample average q N x 1 N j 1 N Q x 3 j displaystyle hat q N x frac 1 N sum j 1 N Q x xi j and consequently the first stage problem is given by g N x min x R n c T x 1 N j 1 N Q x 3 j subject to A x b x 0 displaystyle begin array rlrrr hat g N x amp min limits x in mathbb R n amp c T x frac 1 N sum j 1 N Q x xi j amp amp text subject to amp Ax amp amp b amp amp x amp geq amp 0 end array This formulation is known as the Sample Average Approximation method The SAA problem is a function of the considered sample and in that sense is random For a given sample 3 1 3 2 3 N displaystyle xi 1 xi 2 dots xi N the SAA problem is of the same form as a two stage stochastic linear programming problem with the scenarios 3 j displaystyle xi j j 1 N displaystyle j 1 dots N each taken with the same probability p j 1 N displaystyle p j frac 1 N Statistical inference EditConsider the following stochastic programming problem min x X g x f x E Q x 3 displaystyle min limits x in X g x f x E Q x xi Here X displaystyle X is a nonempty closed subset of R n displaystyle mathbb R n 3 displaystyle xi is a random vector whose probability distribution P displaystyle P is supported on a set 3 R d displaystyle Xi subset mathbb R d and Q X 3 R displaystyle Q X times Xi rightarrow mathbb R In the framework of two stage stochastic programming Q x 3 displaystyle Q x xi is given by the optimal value of the corresponding second stage problem Assume that g x displaystyle g x is well defined and finite valued for all x X displaystyle x in X This implies that for every x X displaystyle x in X the value Q x 3 displaystyle Q x xi is finite almost surely Suppose that we have a sample 3 1 3 N displaystyle xi 1 dots xi N of N displaystyle N realizations of the random vector 3 displaystyle xi This random sample can be viewed as historical data of N displaystyle N observations of 3 displaystyle xi or it can be generated by Monte Carlo sampling techniques Then we can formulate a corresponding sample average approximation min x X g N x f x 1 N j 1 N Q x 3 j displaystyle min limits x in X hat g N x f x frac 1 N sum j 1 N Q x xi j By the Law of Large Numbers we have that under some regularity conditions 1 N j 1 N Q x 3 j displaystyle frac 1 N sum j 1 N Q x xi j converges pointwise with probability 1 to E Q x 3 displaystyle E Q x xi as N displaystyle N rightarrow infty Moreover under mild additional conditions the convergence is uniform We also have E g N x g x displaystyle E hat g N x g x i e g N x displaystyle hat g N x is an unbiased estimator of g x displaystyle g x Therefore it is natural to expect that the optimal value and optimal solutions of the SAA problem converge to their counterparts of the true problem as N displaystyle N rightarrow infty Consistency of SAA estimators Edit Suppose the feasible set X displaystyle X of the SAA problem is fixed i e it is independent of the sample Let ϑ displaystyle vartheta and S displaystyle S be the optimal value and the set of optimal solutions respectively of the true problem and let ϑ N displaystyle hat vartheta N and S N displaystyle hat S N be the optimal value and the set of optimal solutions respectively of the SAA problem Let g X R displaystyle g X rightarrow mathbb R and g N X R displaystyle hat g N X rightarrow mathbb R be a sequence of deterministic real valued functions The following two properties are equivalent for any x X displaystyle overline x in X and any sequence x N X displaystyle x N subset X converging to x displaystyle overline x it follows that g N x N displaystyle hat g N x N converges to g x displaystyle g overline x the function g displaystyle g cdot is continuous on X displaystyle X and g N displaystyle hat g N cdot converges to g displaystyle g cdot uniformly on any compact subset of X displaystyle X If the objective of the SAA problem g N x displaystyle hat g N x converges to the true problem s objective g x displaystyle g x with probability 1 as N displaystyle N rightarrow infty uniformly on the feasible set X displaystyle X Then ϑ N displaystyle hat vartheta N converges to ϑ displaystyle vartheta with probability 1 as N displaystyle N rightarrow infty Suppose that there exists a compact set C R n displaystyle C subset mathbb R n such that the set S displaystyle S of optimal solutions of the true problem is nonempty and is contained in C displaystyle C the function g x displaystyle g x is finite valued and continuous on C displaystyle C the sequence of functions g N x displaystyle hat g N x converges to g x displaystyle g x with probability 1 as N displaystyle N rightarrow infty uniformly in x C displaystyle x in C for N displaystyle N large enough the set S N displaystyle hat S N is nonempty and S N C displaystyle hat S N subset C with probability 1then ϑ N ϑ displaystyle hat vartheta N rightarrow vartheta and D S S N 0 displaystyle mathbb D S hat S N rightarrow 0 with probability 1 as N displaystyle N rightarrow infty Note that D A B displaystyle mathbb D A B denotes the deviation of set A displaystyle A from set B displaystyle B defined as dd D A B sup x A inf x B x x displaystyle mathbb D A B sup x in A inf x in B x x In some situations the feasible set X displaystyle X of the SAA problem is estimated then the corresponding SAA problem takes the form min x X N g N x displaystyle min x in X N hat g N x where X N displaystyle X N is a subset of R n displaystyle mathbb R n depending on the sample and therefore is random Nevertheless consistency results for SAA estimators can still be derived under some additional assumptions Suppose that there exists a compact set C R n displaystyle C subset mathbb R n such that the set S displaystyle S of optimal solutions of the true problem is nonempty and is contained in C displaystyle C the function g x displaystyle g x is finite valued and continuous on C displaystyle C the sequence of functions g N x displaystyle hat g N x converges to g x displaystyle g x with probability 1 as N displaystyle N rightarrow infty uniformly in x C displaystyle x in C for N displaystyle N large enough the set S N displaystyle hat S N is nonempty and S N C displaystyle hat S N subset C with probability 1 if x N X N displaystyle x N in X N and x N displaystyle x N converges with probability 1 to a point x displaystyle x then x X displaystyle x in X for some point x S displaystyle x in S there exists a sequence x N X N displaystyle x N in X N such that x N x displaystyle x N rightarrow x with probability 1 then ϑ N ϑ displaystyle hat vartheta N rightarrow vartheta and D S S N 0 displaystyle mathbb D S hat S N rightarrow 0 with probability 1 as N displaystyle N rightarrow infty dd Asymptotics of the SAA optimal value Edit Suppose the sample 3 1 3 N displaystyle xi 1 dots xi N is i i d and fix a point x X displaystyle x in X Then the sample average estimator g N x displaystyle hat g N x of g x displaystyle g x is unbiased and has variance 1 N s 2 x displaystyle frac 1 N sigma 2 x where s 2 x V a r Q x 3 displaystyle sigma 2 x Var Q x xi is supposed to be finite Moreover by the central limit theorem we have that N g N g x D Y x displaystyle sqrt N hat g N g x xrightarrow mathcal D Y x where D displaystyle xrightarrow mathcal D denotes convergence in distribution and Y x displaystyle Y x has a normal distribution with mean 0 displaystyle 0 and variance s 2 x displaystyle sigma 2 x written as N 0 s 2 x displaystyle mathcal N 0 sigma 2 x In other words g N x displaystyle hat g N x has asymptotically normal distribution i e for large N displaystyle N g N x displaystyle hat g N x has approximately normal distribution with mean g x displaystyle g x and variance 1 N s 2 x displaystyle frac 1 N sigma 2 x This leads to the following approximate 100 1 a displaystyle 100 1 alpha confidence interval for f x displaystyle f x g N x z a 2 s x N g N x z a 2 s x N displaystyle left hat g N x z alpha 2 frac hat sigma x sqrt N hat g N x z alpha 2 frac hat sigma x sqrt N right where z a 2 F 1 1 a 2 displaystyle z alpha 2 Phi 1 1 alpha 2 here F displaystyle Phi cdot denotes the cdf of the standard normal distribution and s 2 x 1 N 1 j 1 N Q x 3 j 1 N j 1 N Q x 3 j 2 displaystyle hat sigma 2 x frac 1 N 1 sum j 1 N left Q x xi j frac 1 N sum j 1 N Q x xi j right 2 is the sample variance estimate of s 2 x displaystyle sigma 2 x That is the error of estimation of g x displaystyle g x is stochastically of order O N displaystyle O sqrt N Applications and Examples EditBiological applications Edit Stochastic dynamic programming is frequently used to model animal behaviour in such fields as behavioural ecology 8 9 Empirical tests of models of optimal foraging life history transitions such as fledging in birds and egg laying in parasitoid wasps have shown the value of this modelling technique in explaining the evolution of behavioural decision making These models are typically many staged rather than two staged Economic applications Edit Stochastic dynamic programming is a useful tool in understanding decision making under uncertainty The accumulation of capital stock under uncertainty is one example often it is used by resource economists to analyze bioeconomic problems 10 where the uncertainty enters in such as weather etc Example multistage portfolio optimization Edit Main article Intertemporal portfolio choice See also Merton s portfolio problem The following is an example from finance of multi stage stochastic programming Suppose that at time t 0 displaystyle t 0 we have initial capital W 0 displaystyle W 0 to invest in n displaystyle n assets Suppose further that we are allowed to rebalance our portfolio at times t 1 T 1 displaystyle t 1 dots T 1 but without injecting additional cash into it At each period t displaystyle t we make a decision about redistributing the current wealth W t displaystyle W t among the n displaystyle n assets Let x 0 x 10 x n 0 displaystyle x 0 x 10 dots x n0 be the initial amounts invested in the n assets We require that each x i 0 displaystyle x i0 is nonnegative and that the balance equation i 1 n x i 0 W 0 displaystyle sum i 1 n x i0 W 0 should hold Consider the total returns 3 t 3 1 t 3 n t displaystyle xi t xi 1t dots xi nt for each period t 1 T displaystyle t 1 dots T This forms a vector valued random process 3 1 3 T displaystyle xi 1 dots xi T At time period t 1 displaystyle t 1 we can rebalance the portfolio by specifying the amounts x 1 x 11 x n 1 displaystyle x 1 x 11 dots x n1 invested in the respective assets At that time the returns in the first period have been realized so it is reasonable to use this information in the rebalancing decision Thus the second stage decisions at time t 1 displaystyle t 1 are actually functions of realization of the random vector 3 1 displaystyle xi 1 i e x 1 x 1 3 1 displaystyle x 1 x 1 xi 1 Similarly at time t displaystyle t the decision x t x 1 t x n t displaystyle x t x 1t dots x nt is a function x t x t 3 t displaystyle x t x t xi t of the available information given by 3 t 3 1 3 t displaystyle xi t xi 1 dots xi t the history of the random process up to time t displaystyle t A sequence of functions x t x t 3 t displaystyle x t x t xi t t 0 T 1 displaystyle t 0 dots T 1 with x 0 displaystyle x 0 being constant defines an implementable policy of the decision process It is said that such a policy is feasible if it satisfies the model constraints with probability 1 i e the nonnegativity constraints x i t 3 t 0 displaystyle x it xi t geq 0 i 1 n displaystyle i 1 dots n t 0 T 1 displaystyle t 0 dots T 1 and the balance of wealth constraints i 1 n x i t 3 t W t displaystyle sum i 1 n x it xi t W t where in period t 1 T displaystyle t 1 dots T the wealth W t displaystyle W t is given by W t i 1 n 3 i t x i t 1 3 t 1 displaystyle W t sum i 1 n xi it x i t 1 xi t 1 which depends on the realization of the random process and the decisions up to time t displaystyle t Suppose the objective is to maximize the expected utility of this wealth at the last period that is to consider the problem max E U W T displaystyle max E U W T This is a multistage stochastic programming problem where stages are numbered from t 0 displaystyle t 0 to t T 1 displaystyle t T 1 Optimization is performed over all implementable and feasible policies To complete the problem description one also needs to define the probability distribution of the random process 3 1 3 T displaystyle xi 1 dots xi T This can be done in various ways For example one can construct a particular scenario tree defining time evolution of the process If at every stage the random return of each asset is allowed to have two continuations independent of other assets then the total number of scenarios is 2 n T displaystyle 2 nT In order to write dynamic programming equations consider the above multistage problem backward in time At the last stage t T 1 displaystyle t T 1 a realization 3 T 1 3 1 3 T 1 displaystyle xi T 1 xi 1 dots xi T 1 of the random process is known and x T 2 displaystyle x T 2 has been chosen Therefore one needs to solve the following problem max x T 1 E U W T 3 T 1 subject to W T i 1 n 3 i T x i T 1 i 1 n x i T 1 W T 1 x T 1 0 displaystyle begin array lrclr max limits x T 1 amp E U W T xi T 1 amp text subject to amp W T amp amp sum i 1 n xi iT x i T 1 amp sum i 1 n x i T 1 amp amp W T 1 amp x T 1 amp geq amp 0 end array where E U W T 3 T 1 displaystyle E U W T xi T 1 denotes the conditional expectation of U W T displaystyle U W T given 3 T 1 displaystyle xi T 1 The optimal value of the above problem depends on W T 1 displaystyle W T 1 and 3 T 1 displaystyle xi T 1 and is denoted Q T 1 W T 1 3 T 1 displaystyle Q T 1 W T 1 xi T 1 Similarly at stages t T 2 1 displaystyle t T 2 dots 1 one should solve the problem max x t E Q t 1 W t 1 3 t 1 3 t subject to W t 1 i 1 n 3 i t 1 x i t i 1 n x i t W t x t 0 displaystyle begin array lrclr max limits x t amp E Q t 1 W t 1 xi t 1 xi t amp text subject to amp W t 1 amp amp sum i 1 n xi i t 1 x i t amp sum i 1 n x i t amp amp W t amp x t amp geq amp 0 end array whose optimal value is denoted by Q t W t 3 t displaystyle Q t W t xi t Finally at stage t 0 displaystyle t 0 one solves the problem max x 0 E Q 1 W 1 3 1 subject to W 1 i 1 n 3 i 1 x i 0 i 1 n x i 0 W 0 x 0 0 displaystyle begin array lrclr max limits x 0 amp E Q 1 W 1 xi 1 amp text subject to amp W 1 amp amp sum i 1 n xi i 1 x i0 amp sum i 1 n x i0 amp amp W 0 amp x 0 amp geq amp 0 end array Stagewise independent random process Edit For a general distribution of the process 3 t displaystyle xi t it may be hard to solve these dynamic programming equations The situation simplifies dramatically if the process 3 t displaystyle xi t is stagewise independent i e 3 t displaystyle xi t is stochastically independent of 3 1 3 t 1 displaystyle xi 1 dots xi t 1 for t 2 T displaystyle t 2 dots T In this case the corresponding conditional expectations become unconditional expectations and the function Q t W t displaystyle Q t W t t 1 T 1 displaystyle t 1 dots T 1 does not depend on 3 t displaystyle xi t That is Q T 1 W T 1 displaystyle Q T 1 W T 1 is the optimal value of the problem max x T 1 E U W T subject to W T i 1 n 3 i T x i T 1 i 1 n x i T 1 W T 1 x T 1 0 displaystyle begin array lrclr max limits x T 1 amp E U W T amp text subject to amp W T amp amp sum i 1 n xi iT x i T 1 amp sum i 1 n x i T 1 amp amp W T 1 amp x T 1 amp geq amp 0 end array and Q t W t displaystyle Q t W t is the optimal value of max x t E Q t 1 W t 1 subject to W t 1 i 1 n 3 i t 1 x i t i 1 n x i t W t x t 0 displaystyle begin array lrclr max limits x t amp E Q t 1 W t 1 amp text subject to amp W t 1 amp amp sum i 1 n xi i t 1 x i t amp sum i 1 n x i t amp amp W t amp x t amp geq amp 0 end array for t T 2 1 displaystyle t T 2 dots 1 Software tools EditModelling languages Edit All discrete stochastic programming problems can be represented with any algebraic modeling language manually implementing explicit or implicit non anticipativity to make sure the resulting model respects the structure of the information made available at each stage An instance of an SP problem generated by a general modelling language tends to grow quite large linearly in the number of scenarios and its matrix loses the structure that is intrinsic to this class of problems which could otherwise be exploited at solution time by specific decomposition algorithms Extensions to modelling languages specifically designed for SP are starting to appear see AIMMS supports the definition of SP problems EMP SP Extended Mathematical Programming for Stochastic Programming a module of GAMS created to facilitate stochastic programming includes keywords for parametric distributions chance constraints and risk measures such as Value at risk and Expected shortfall SAMPL a set of extensions to AMPL specifically designed to express stochastic programs includes syntax for chance constraints integrated chance constraints and Robust Optimization problems They both can generate SMPS instance level format which conveys in a non redundant form the structure of the problem to the solver See also EditCorrelation gap EMP for Stochastic Programming Entropic value at risk FortSP SAMPL algebraic modeling language Scenario optimization Stochastic optimization Chance constrained portfolio selectionReferences Edit Shapiro Alexander Dentcheva Darinka Ruszczynski Andrzej 2009 Lectures on stochastic programming Modeling and theory PDF MPS SIAM Series on Optimization 9 Philadelphia PA Society for Industrial and Applied Mathematics SIAM Mathematical Programming Society MPS pp xvi 436 ISBN 978 0 89871 687 0 MR 2562798 Birge John R Louveaux Francois 2011 Introduction to Stochastic Programming Springer Series in Operations Research and Financial Engineering doi 10 1007 978 1 4614 0237 4 ISBN 978 1 4614 0236 7 ISSN 1431 8598 Stein W Wallace and William T Ziemba eds Applications of Stochastic Programming MPS SIAM Book Series on Optimization 5 2005 Applications of stochastic programming are described at the following website Stochastic Programming Community Shapiro Alexander Philpott Andy A tutorial on Stochastic Programming PDF NEOS Server for Optimization Ruszczynski Andrzej Shapiro Alexander 2003 Stochastic Programming Handbooks in Operations Research and Management Science 10 Philadelphia Elsevier p 700 ISBN 978 0444508546 Mangel M amp Clark C W 1988 Dynamic modeling in behavioral ecology Princeton University Press ISBN 0 691 08506 4 Houston A I amp McNamara J M 1999 Models of adaptive behaviour an approach based on state Cambridge University Press ISBN 0 521 65539 0 Howitt R Msangi S Reynaud A and K Knapp 2002 Using Polynomial Approximations to Solve Stochastic Dynamic Programming Problems or A Betty Crocker Approach to SDP University of California Davis Department of Agricultural and Resource Economics Working Paper Further reading EditJohn R Birge and Francois V Louveaux Introduction to Stochastic Programming Springer Verlag New York 1997 Kall Peter Wallace Stein W 1994 Stochastic programming Wiley Interscience Series in Systems and Optimization Chichester John Wiley amp Sons Ltd pp xii 307 ISBN 0 471 95158 7 MR 1315300 G Ch Pflug Optimization of Stochastic Models The Interface between Simulation and Optimization Kluwer Dordrecht 1996 Andras Prekopa Stochastic Programming Kluwer Academic Publishers Dordrecht 1995 Andrzej Ruszczynski and Alexander Shapiro eds 2003 Stochastic Programming Handbooks in Operations Research and Management Science Vol 10 Elsevier Shapiro Alexander Dentcheva Darinka Ruszczynski Andrzej 2009 Lectures on stochastic programming Modeling and theory PDF MPS SIAM Series on Optimization 9 Philadelphia PA Society for Industrial and Applied Mathematics SIAM Mathematical Programming Society MPS pp xvi 436 ISBN 978 0 89871 687 0 MR 2562798 Stein W Wallace and William T Ziemba eds 2005 Applications of Stochastic Programming MPS SIAM Book Series on Optimization 5 King Alan J Wallace Stein W 2012 Modeling with Stochastic Programming Springer Series in Operations Research and Financial Engineering New York Springer ISBN 978 0 387 87816 4 External links EditStochastic Programming Community Home Page Retrieved from https en wikipedia org w index php title Stochastic programming amp oldid 1030772636, wikipedia, wiki, book,

books

, library,

article

, read, download, free, free download, mp3, video, mp4, 3gp, jpg, jpeg, gif, png, picture, music, song, movie, book, game, games.