Karush-Kuhn-Tucker Optimality Necessary Conditions

Consider the problem: $\min f\left ( x \right )$ such that $x \in X$, where X is an open set in $\mathbb{R}^n$, and $g_i\left ( x \right )\leq 0, i=1, 2,...,m$.

Let $S=\left \{ x \in X:g_i\left ( x \right )\leq 0, \forall i \right \}$.

Let $\hat{x} \in S$, let $f$ and $g_i, i \in I$ be differentiable at $\hat{x}$, and let $g_i, i \in J$ be continuous at $\hat{x}$. Furthermore, suppose $\bigtriangledown g_i\left ( \hat{x} \right ), i \in I$ are linearly independent. If $\hat{x}$ solves the above problem locally, then there exist $u_i, i \in I$ such that

$$\bigtriangledown f\left ( \hat{x}\right )+\displaystyle\sum\limits_{i\in I} u_i \bigtriangledown g_i\left ( \hat{x} \right )=0, \:\:u_i \geq 0, i \in I$$

If $g_i, i \in J$ are also differentiable at $\hat{x}$, then

$$\bigtriangledown f\left ( \hat{x}\right )+\displaystyle\sum\limits_{i= 1}^m u_i \bigtriangledown g_i\left ( \hat{x} \right )=0$$

$$u_ig_i\left ( \hat{x} \right )=0, \forall i=1,2,...,m$$

$$u_i \geq 0, \forall i=1,2,...,m$$

Example

Consider the following problem: $\min f\left ( x_1,x_2 \right )=\left ( x_1-3\right )^2+\left ( x_2-2\right )^2$

such that $x_{1}^{2}+x_{2}^{2}\leq 5$, $x_1+2x_2 \leq 4$, $x_1, x_2 \geq 0$, and $\hat{x}=\left ( 2,1 \right )$.

Let $g_1\left ( x_1, x_2 \right )=x_{1}^{2}+x_{2}^{2}-5$, $g_2\left ( x_1, x_2 \right )=x_1+2x_2-4$, $g_3\left ( x_1, x_2 \right )=-x_1$ and $g_4\left ( x_1,x_2 \right )=-x_2$.

Thus the above constraints can be written as:

$g_1 \left ( x_1,x_2 \right )\leq 0, g_2 \left ( x_1,x_2 \right ) \leq 0, g_3 \left ( x_1,x_2 \right )\leq 0$ and $g_4 \left ( x_1,x_2 \right ) \leq 0$

At $\hat{x}=\left ( 2,1 \right )$ only $g_1$ and $g_2$ are active; thus $I=\left \{ 1,2 \right \}$ and therefore $u_3=0,\:\: u_4=0$.

$\bigtriangledown f \left ( \hat{x} \right )=\left ( -2,-2 \right ), \bigtriangledown g_1 \left ( \hat{x} \right )= \left ( 4,2 \right )$ and $\bigtriangledown g_2\left ( \hat{x} \right ) =\left ( 1,2 \right )$

Putting these values in the first of the Karush-Kuhn-Tucker conditions, we get $u_1=\frac{1}{3}$ and $u_2=\frac{2}{3}$.

Thus the Karush-Kuhn-Tucker conditions are satisfied.
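The stationarity and complementary slackness conditions of the example above can be verified numerically. The following is a minimal sketch using NumPy; the variable names are my own, not from the text:

```python
import numpy as np

# KKT example: minimize (x1-3)^2 + (x2-2)^2 subject to
# x1^2 + x2^2 <= 5, x1 + 2*x2 <= 4, x1 >= 0, x2 >= 0,
# candidate point x_hat = (2, 1), multipliers u1 = 1/3, u2 = 2/3.
x_hat = np.array([2.0, 1.0])
u1, u2 = 1.0 / 3.0, 2.0 / 3.0

grad_f = np.array([2 * (x_hat[0] - 3), 2 * (x_hat[1] - 2)])   # (-2, -2)
grad_g1 = np.array([2 * x_hat[0], 2 * x_hat[1]])              # (4, 2)
grad_g2 = np.array([1.0, 2.0])                                # (1, 2)

# Stationarity: grad f + u1*grad g1 + u2*grad g2 should be (numerically) zero.
residual = grad_f + u1 * grad_g1 + u2 * grad_g2

# Complementary slackness: the active constraints g1, g2 are tight at x_hat.
g1 = x_hat[0] ** 2 + x_hat[1] ** 2 - 5
g2 = x_hat[0] + 2 * x_hat[1] - 4
print(residual, g1, g2)
```

Running this shows the stationarity residual is zero up to floating point and both active constraints evaluate exactly to zero.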
Convex Optimization – Programming Problem

There are four types of convex programming problems:

Type 1 − $\min f\left ( x \right )$, where $x \in S$, S is a non-empty convex set in $\mathbb{R}^n$ and $f\left ( x \right )$ is a convex function.

Type 2 − $\min f\left ( x \right ), x \in \mathbb{R}^n$, subject to

$g_i\left ( x \right ) \leq 0, 1 \leq i \leq m_1$, where $g_i\left ( x \right )$ is a convex function,

$g_i\left ( x \right ) \geq 0, m_1+1 \leq i \leq m_2$, where $g_i\left ( x \right )$ is a concave function,

$g_i\left ( x \right ) = 0, m_2+1 \leq i \leq m$, where $g_i\left ( x \right )$ is a linear function,

and $f\left ( x \right )$ is a convex function.

Type 3 − $\max f\left ( x \right )$, where $x \in S$, S is a non-empty convex set in $\mathbb{R}^n$ and $f\left ( x \right )$ is a concave function.

Type 4 − $\max f\left ( x \right )$, where $x \in \mathbb{R}^n$, subject to

$g_i\left ( x \right ) \leq 0, 1 \leq i \leq m_1$, where $g_i\left ( x \right )$ is a convex function,

$g_i\left ( x \right ) \geq 0, m_1+1 \leq i \leq m_2$, where $g_i\left ( x \right )$ is a concave function,

$g_i\left ( x \right ) = 0, m_2+1 \leq i \leq m$, where $g_i\left ( x \right )$ is a linear function,

and $f\left ( x \right )$ is a concave function.

Cone of feasible directions

Let S be a non-empty set in $\mathbb{R}^n$ and let $\hat{x} \in Closure\left ( S \right )$, then the cone of feasible directions of S at $\hat{x}$, denoted by D, is defined as

$$D=\left \{ d:d\neq 0,\hat{x}+\lambda d \in S, \forall \lambda \in \left ( 0, \delta \right ), \delta > 0 \right \}$$

Each non-zero vector $d \in D$ is called a feasible direction.

For a given function $f:\mathbb{R}^n \rightarrow \mathbb{R}$, the cone of improving directions at $\hat{x}$ is denoted by F and is given by

$$F=\left \{ d:f\left ( \hat{x}+\lambda d \right )< f\left ( \hat{x} \right ),\forall \lambda \in \left ( 0,\delta \right ), \delta >0 \right \}$$

Each direction $d \in F$ is called an improving direction or descent direction of f at $\hat{x}$.

Theorem

Necessary Condition

Consider the problem $\min f\left ( x \right )$ such that $x \in S$, where S is a non-empty set in $\mathbb{R}^n$.
Suppose f is differentiable at a point $\hat{x} \in S$. If $\hat{x}$ is a local optimal solution, then $F_0 \cap D= \phi$, where $F_0=\left \{ d:\bigtriangledown f\left ( \hat{x} \right )^T d< 0 \right \}$ and D is the cone of feasible directions.

Sufficient Condition

If $F_0 \cap D= \phi$, f is a pseudoconvex function at $\hat{x}$, and there exists a neighbourhood of $\hat{x},N_\varepsilon \left ( \hat{x} \right ), \varepsilon > 0$ such that $d=x-\hat{x}\in D$ for any $x \in S \cap N_\varepsilon \left ( \hat{x} \right )$, then $\hat{x}$ is a local optimal solution.

Proof

Necessary Condition

Let $F_0 \cap D\neq \phi$, i.e., there exists a $d \in F_0 \cap D$ such that $d \in F_0$ and $d\in D$.

Since $d \in D$, there exists $\delta_1 >0$ such that $\hat{x}+\lambda d \in S, \lambda \in \left ( 0,\delta_1 \right )$.

Since $d \in F_0$, $\bigtriangledown f \left ( \hat{x}\right )^T d< 0$.

Thus, there exists $\delta_2>0$ such that $f\left ( \hat{x}+\lambda d\right )< f\left ( \hat{x} \right ), \lambda \in \left ( 0,\delta_2 \right )$.

Let $\delta=\min \left \{\delta_1,\delta_2 \right \}$.

Then $\hat{x}+\lambda d \in S$ and $f\left (\hat{x}+\lambda d \right )< f\left ( \hat{x} \right ), \forall \lambda \in \left ( 0,\delta \right )$.

But $\hat{x}$ is a local optimal solution, which is a contradiction. Thus $F_0\cap D=\phi$.

Sufficient Condition

Let $F_0 \cap D= \phi$ and let f be a pseudoconvex function.

Let there exist a neighbourhood of $\hat{x}, N_{\varepsilon}\left ( \hat{x} \right )$ such that $d=x-\hat{x}\in D, \forall x \in S \cap N_\varepsilon\left ( \hat{x} \right )$.

Suppose $\hat{x}$ is not a local optimal solution. Thus, there exists $\bar{x} \in S \cap N_\varepsilon \left ( \hat{x} \right )$ such that $f \left ( \bar{x} \right )< f\left ( \hat{x} \right )$.

By the assumption on $S \cap N_\varepsilon \left ( \hat{x} \right ), d=\left ( \bar{x}-\hat{x} \right )\in D$.

By pseudoconvexity of f,

$$f\left ( \hat{x} \right )>f\left ( \bar{x} \right )\Rightarrow \bigtriangledown f\left ( \hat{x} \right )^T\left ( \bar{x}-\hat{x} \right )< 0$$

$\Rightarrow \bigtriangledown f\left ( \hat{x} \right) ^T d< 0$

$\Rightarrow d \in F_0$

Hence $F_0\cap D \neq \phi$, which is a contradiction. Hence, $\hat{x}$ is a local optimal solution.
Consider the following problem: $\min f\left ( x\right )$ where $x \in X$ such that $g_i\left ( x\right ) \leq 0, i=1,2,...,m$, where $f:X \rightarrow \mathbb{R}, g_i:X \rightarrow \mathbb{R}$ and X is an open set in $\mathbb{R}^n$.

Let $S=\left \{x:g_i\left ( x\right )\leq 0,\forall i \right \}$.

Let $\hat{x} \in S$ and let $M=\left \{1,2,...,m \right \}$.

Let $I=\left \{i:g_i\left ( \hat{x}\right )=0, i \in M\right \}$, where I is called the index set of all active constraints at $\hat{x}$.

Let $J=\left \{i:g_i\left ( \hat{x}\right )< 0, i \in M\right \}$.

Let $F_0=\left \{ d \in \mathbb{R}^n:\bigtriangledown f\left ( \hat{x} \right )^T d< 0 \right \}$.

Let $G_0=\left \{ d \in \mathbb{R}^n:\bigtriangledown g_i\left ( \hat{x} \right )^T d< 0, \forall i \in I \right \}$.

Lemma

If $S=\left \{ x \in X:g_i\left ( x\right ) \leq 0, \forall i \in I\right \}$, X is a non-empty open set in $\mathbb{R}^n$, $\hat{x}\in S$, $g_i, i \in I$ are differentiable at $\hat{x}$ and $g_i, i \in J$ are continuous at $\hat{x}$, then $G_0 \subseteq D$.

Proof

Let $d \in G_0$.

Since $\hat{x} \in X$ and X is an open set, there exists $\delta_1 >0$ such that $\hat{x}+\lambda d \in X$ for $\lambda \in \left ( 0, \delta_1\right )$.

Also, since $g_i\left ( \hat{x}\right )< 0$ and $g_i, i \in J$ are continuous at $\hat{x}$, there exists $\delta_2>0$ such that $g_i\left ( \hat{x}+\lambda d\right )< 0, \lambda \in \left ( 0, \delta_2\right ), i \in J$.

Since $d \in G_0$, $\bigtriangledown g_i\left ( \hat{x}\right )^T d< 0, i \in I$; thus there exists $\delta_3>0$ such that $g_i\left ( \hat{x}+\lambda d\right )< g_i\left ( \hat{x}\right )=0, \lambda \in \left ( 0, \delta_3\right ), i \in I$.

Let $\delta=\min\left \{ \delta_1, \delta_2, \delta_3 \right \}$; therefore, $\hat{x}+\lambda d \in X$ and $g_i\left ( \hat{x}+\lambda d\right )< 0, \forall i, \forall \lambda \in \left ( 0, \delta \right )$

$\Rightarrow \hat{x}+\lambda d \in S$

$\Rightarrow d \in D$

$\Rightarrow G_0 \subseteq D$

Hence Proved.

Theorem

Necessary Condition

Let $f$ and $g_i, i \in I$ be differentiable at $\hat{x} \in S$ and let $g_j, j \in J$ be continuous at $\hat{x}$. If $\hat{x}$ is a local optimal solution of the above problem, then $F_0 \cap G_0 = \phi$.
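To make the index sets and cones above concrete, here is a small sketch (with helper names of my own choosing) that reuses the constraints of the KKT example from this guide at the point $\hat{x}=(2,1)$:

```python
import numpy as np

# Constraints g1(x)=x1^2+x2^2-5, g2(x)=x1+2*x2-4, g3=-x1, g4=-x2,
# objective f(x)=(x1-3)^2+(x2-2)^2, evaluated at x_hat = (2, 1).
x_hat = np.array([2.0, 1.0])

g_vals = [x_hat[0] ** 2 + x_hat[1] ** 2 - 5,   # g1
          x_hat[0] + 2 * x_hat[1] - 4,         # g2
          -x_hat[0],                           # g3
          -x_hat[1]]                           # g4

I = [i for i, g in enumerate(g_vals, start=1) if g == 0]   # active constraints
J = [i for i, g in enumerate(g_vals, start=1) if g < 0]    # inactive constraints
print(I, J)   # [1, 2] [3, 4]

grad_f = np.array([2 * (x_hat[0] - 3), 2 * (x_hat[1] - 2)])
grad_g = {1: np.array([2 * x_hat[0], 2 * x_hat[1]]), 2: np.array([1.0, 2.0])}

def in_F0(d):
    # improving direction: grad f(x_hat)^T d < 0
    return grad_f @ d < 0

def in_G0(d):
    # strictly feasible direction: grad g_i(x_hat)^T d < 0 for all active i
    return all(grad_g[i] @ d < 0 for i in I)

d = np.array([-1.0, -1.0])
print(in_F0(d), in_G0(d))   # False True — d is feasible but not improving
```

This is consistent with the necessary condition $F_0 \cap G_0 = \phi$ at an optimal point.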
Convex Optimization – Quick Guide

Convex Optimization – Introduction

This course is useful for students who want to solve the non-linear optimization problems that arise in various engineering and scientific applications. It starts with the basic theory of linear programming and introduces the concepts of convex sets and functions and related terminology to explain the various theorems required to solve non-linear programming problems. It also introduces various algorithms used to solve such problems. These types of problems arise in many applications, including machine learning and optimization problems in electrical engineering. It requires students to have prior knowledge of high school maths and calculus.

In this course, students will learn to solve optimization problems like $\min f\left ( x \right )$ subject to some constraints. These problems are easily solvable if the function $f\left ( x \right )$ is linear and the constraints are linear; this is called a linear programming problem (LPP). But if the constraints are non-linear, the problem is difficult to solve. One way is to plot the functions on a graph and analyse the optimum, but we can't plot a function beyond three dimensions. Hence there come the techniques of non-linear programming, or convex programming, to solve such problems. In this tutorial, we will focus on learning such techniques and, in the end, a few algorithms to solve such problems. First we will introduce the notion of convex sets, which is the base of convex programming problems. Then, with the introduction of convex functions, we will present some important theorems to solve these problems and some algorithms based on these theorems.
Terminologies

The space $\mathbb{R}^n$ − It is the set of n-dimensional vectors with real components, defined as follows: $\mathbb{R}^n=\left \{ \left ( x_1,x_2,...,x_n \right )^{\tau }:x_1,x_2,...,x_n \in \mathbb{R} \right \}$

The space $\mathbb{R}^{m\times n}$ − It is the set of all real-valued matrices of order $m\times n$.

Convex Optimization – Linear Programming

Methodology

Linear Programming, also called Linear Optimization, is a technique used to solve mathematical problems in which the relationships are linear in nature. The basic nature of Linear Programming is to maximize or minimize an objective function subject to some constraints. The objective function is a linear function obtained from the mathematical model of the problem. The constraints are the conditions imposed on the model and are also linear.

From the given question, find the objective function.

Find the constraints.

Draw the constraints on a graph.

Find the feasible region, which is formed by the intersection of all the constraints.

Find the vertices of the feasible region.

Find the value of the objective function at these vertices.

The vertex which either maximizes or minimizes the objective function (according to the question) is the answer.

Examples

Example 1 − Maximize $5x+3y$ subject to $x+y\leq 2$, $3x+y\leq 3$, $x\geq 0$ and $y\geq 0$

Solution − The first step is to find the feasible region on a graph.

Clearly from the graph, the vertices of the feasible region are $\left ( 0, 0 \right ),\left ( 0, 2 \right ),\left ( 1, 0 \right ),\left ( \frac{1}{2}, \frac{3}{2} \right )$

Let $f\left ( x, y \right )=5x+3y$

Putting these values in the objective function, we get −

$f\left ( 0, 0 \right )=0$

$f\left ( 0, 2 \right )=6$

$f\left ( 1, 0 \right )=5$

$f\left ( \frac{1}{2}, \frac{3}{2} \right )=7$

Therefore, the function is maximized at $\left ( \frac{1}{2}, \frac{3}{2} \right )$

Example 2 − A watch company produces a digital and a mechanical watch.
Long-term projections indicate an expected demand of at least 100 digital and 80 mechanical watches each day. Because of limitations on production capacity, no more than 200 digital and 170 mechanical watches can be made daily. To satisfy a shipping contract, a total of at least 200 watches must be shipped each day.

If each digital watch sold results in a \$2 loss, but each mechanical watch produces a \$5 profit, how many of each type should be made daily to maximize net profits?

Solution −

Let $x$ be the number of digital watches produced and $y$ be the number of mechanical watches produced.

According to the question, at least 100 digital watches are to be made daily and a maximum of 200 digital watches can be made.

$\Rightarrow 100 \leq x\leq 200$

Similarly, at least 80 mechanical watches are to be made daily and a maximum of 170 mechanical watches can be made.

$\Rightarrow 80 \leq y\leq 170$

Since at least 200 watches are to be shipped each day,

$\Rightarrow x +y\geq 200$

Since each digital watch sold results in a \$2 loss, but each mechanical watch produces a \$5 profit, total profit can be calculated as

$Profit =-2x + 5y$

And we have to maximize the profit. Therefore, the question can be formulated as −

Maximize $-2x + 5y$ subject to

$100 \leq x\leq 200$

$80 \leq y\leq 170$

$x+y\geq 200$

Plotting the above constraints on a graph, the vertices of the feasible region are $\left ( 100, 170\right ),\left ( 200, 170\right ),\left ( 200, 80\right ),\left ( 120, 80\right )$ and $\left ( 100, 100\right )$

The maximum value of the objective function is obtained at $\left ( 100, 170\right )$. Thus, to maximize the net profits, 100 units of digital watches and 170 units of mechanical watches should be produced.

Convex Optimization – Norm

A norm is a function that assigns a non-negative length to a vector, and a strictly positive length to any non-zero vector.
A norm is a function $f:\mathbb{R}^n\rightarrow \mathbb{R}$, usually written $\left \| x \right \|$. The basic characteristics of a norm are −

Let $x, y \in \mathbb{R}^n$, then

$\left \| x \right \|\geq 0$

$\left \| x \right \|= 0 \Leftrightarrow x= 0$

$\left \|\alpha x \right \|=\left | \alpha \right |\left \| x \right \|$, where $\alpha$ is a scalar

$\left \| x+y \right \|\leq \left \| x \right \|+\left \| y \right \|$ (triangle inequality)
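The four axioms can be illustrated numerically with the Euclidean norm. This is a hedged sketch, not part of the original text; NumPy's `numpy.linalg.norm` is used:

```python
import numpy as np

# Checking the norm axioms for the Euclidean norm on two sample vectors.
x = np.array([3.0, -4.0])
y = np.array([1.0, 2.0])
alpha = -2.5

nonneg = np.linalg.norm(x) >= 0                       # ||x|| >= 0
zero_iff = np.linalg.norm(np.zeros(2)) == 0.0         # ||0|| = 0
homogeneous = np.isclose(np.linalg.norm(alpha * x),
                         abs(alpha) * np.linalg.norm(x))   # ||a*x|| = |a|*||x||
triangle = np.linalg.norm(x + y) <= (np.linalg.norm(x)
                                     + np.linalg.norm(y))  # triangle inequality
print(nonneg, zero_iff, homogeneous, triangle)   # True True True True
```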
Strongly Quasiconvex Function

Let $f:S\rightarrow \mathbb{R}$ and S be a non-empty convex set in $\mathbb{R}^n$, then f is a strongly quasiconvex function if for any $x_1,x_2 \in S$ with $x_1 \neq x_2$, we have $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )< \max \left \{ f\left ( x_1 \right ),f\left ( x_2 \right ) \right \}, \forall \lambda \in \left ( 0,1 \right )$

Theorem

A quasiconvex function $f:S\rightarrow \mathbb{R}$ on a non-empty convex set S in $\mathbb{R}^n$ is a strongly quasiconvex function if it is not constant on any line segment joining points of S.

Proof

Let f be a quasiconvex function which is not constant on any line segment joining points of S. Suppose f is not a strongly quasiconvex function.

There exist $x_1,x_2 \in S$ with $x_1 \neq x_2$ such that

$$f\left ( z \right )\geq \max\left \{ f\left ( x_1 \right ), f\left ( x_2 \right ) \right \}, z= \lambda x_1+\left ( 1-\lambda \right )x_2, \lambda \in \left ( 0,1 \right )$$

$\Rightarrow f\left ( x_1 \right )\leq f\left ( z \right )$ and $f\left ( x_2 \right )\leq f\left ( z \right )$

Since f is not constant on $\left [ x_1,z \right ]$ and $\left [z,x_2 \right ]$, there exist $u \in \left [ x_1,z \right ]$ and $v \in \left [ z,x_2 \right ]$ with $f\left ( u \right )\neq f\left ( z \right )$ and $f\left ( v \right )\neq f\left ( z \right )$

$$\Rightarrow u= \mu_1x_1+\left ( 1-\mu_1\right )z,v=\mu_2z+\left ( 1- \mu_2\right )x_2$$

Since f is quasiconvex,

$$\Rightarrow f\left ( u \right )\leq \max\left \{ f\left ( x_1 \right ),f \left ( z \right ) \right \}=f\left ( z \right )\:\: and \:\:f \left ( v \right ) \leq \max \left \{ f\left ( z \right ),f\left ( x_2 \right ) \right \}=f\left ( z \right )$$

$$\Rightarrow f\left ( u \right )< f\left ( z \right ) \:\: and \:\: f\left ( v \right )< f\left ( z \right )$$

$$\Rightarrow \max \left \{ f\left ( u \right ),f\left ( v \right ) \right \}< f\left ( z \right )$$

But $z \in \left [ u,v \right ]$, so quasiconvexity requires $f\left ( z \right )\leq \max \left \{ f\left ( u \right ),f\left ( v \right ) \right \}$, which is a contradiction. Hence f is a strongly quasiconvex function.

Theorem

Let $f:S\rightarrow \mathbb{R}$ be a strongly quasiconvex function and S be a non-empty convex set in $\mathbb{R}^n$. Consider the problem $\min f\left ( x \right ), x \in S$.
If $\hat{x}$ is a local optimal solution, then $\hat{x}$ is the unique global optimal solution.

Proof

Since a strongly quasiconvex function is also a strictly quasiconvex function, a local optimal solution is a global optimal solution.

Uniqueness − Let f attain its global optimal solution at two points $u,v \in S$, $u \neq v$

$$\Rightarrow f\left ( u \right ) \leq f\left ( x \right ), \forall x \in S \:\: and \:\: f\left ( v \right ) \leq f\left ( x \right ), \forall x \in S$$

In particular, $f\left ( u \right )\leq f\left ( v \right )$ and $f\left ( v \right )\leq f\left ( u\right )\Rightarrow f\left ( u \right )=f\left ( v\right )$

By strong quasiconvexity,

$$f\left ( \lambda u+\left ( 1-\lambda\right )v\right )< \max \left \{ f\left ( u\right ),f\left ( v\right ) \right \}=f\left ( u\right ), \lambda \in \left ( 0,1\right )$$

which contradicts the global optimality of u. Hence there exists only one global optimal solution.

Remarks

A strongly quasiconvex function is also a strictly quasiconvex function.

A strictly convex function may or may not be strongly quasiconvex.

A differentiable strictly convex function is strongly quasiconvex.
Convex Optimization – Differentiable Function

Let S be a non-empty open set in $\mathbb{R}^n$, then $f:S\rightarrow \mathbb{R}$ is said to be differentiable at $\hat{x} \in S$ if there exist a vector $\bigtriangledown f\left ( \hat{x} \right )$, called the gradient vector, and a function $\alpha :\mathbb{R}^n\rightarrow \mathbb{R}$ such that

$$f\left ( x \right )=f\left ( \hat{x} \right )+\bigtriangledown f\left ( \hat{x} \right )^T\left ( x-\hat{x} \right )+\left \| x-\hat{x} \right \|\alpha \left ( \hat{x}, x-\hat{x} \right ), \forall x \in S$$

where $\alpha \left (\hat{x}, x-\hat{x} \right )\rightarrow 0$ as $x \rightarrow \hat{x}$ and

$$\bigtriangledown f\left ( \hat{x} \right )=\left [ \frac{\partial f}{\partial x_1}\frac{\partial f}{\partial x_2}...\frac{\partial f}{\partial x_n} \right ]_{x=\hat{x}}^{T}$$

Theorem

Let S be a non-empty, open convex set in $\mathbb{R}^n$ and let $f:S\rightarrow \mathbb{R}$ be differentiable on S. Then, f is convex if and only if for $x_1,x_2 \in S$, $\bigtriangledown f\left ( x_2 \right )^T \left ( x_1-x_2 \right ) \leq f\left ( x_1 \right )-f\left ( x_2 \right )$

Proof

Let f be a convex function,
i.e., for $x_1,x_2 \in S, \lambda \in \left ( 0, 1 \right )$,

$f\left [ \lambda x_1+\left ( 1-\lambda \right )x_2 \right ]\leq \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right )$

$\Rightarrow f\left [ \lambda x_1+\left ( 1-\lambda \right )x_2 \right ]\leq \lambda \left ( f\left ( x_1 \right )-f\left ( x_2 \right ) \right )+f\left ( x_2 \right )$

$\Rightarrow \lambda \left ( f\left ( x_1 \right )-f\left ( x_2 \right ) \right )\geq f\left ( x_2+\lambda \left ( x_1-x_2 \right ) \right )-f\left ( x_2 \right )$

By differentiability of f at $x_2$,

$\Rightarrow \lambda \left ( f\left ( x_1 \right )-f\left ( x_2 \right ) \right )\geq \lambda \bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right )+\lambda \left \| x_1-x_2 \right \|\alpha \left ( x_2,\lambda \left (x_1 - x_2 \right ) \right )$

where $\alpha\left ( x_2, \lambda\left (x_1 - x_2 \right ) \right )\rightarrow 0$ as $\lambda \rightarrow 0$.

Dividing by $\lambda$ on both sides and letting $\lambda \rightarrow 0$, we get −

$f\left ( x_1 \right )-f\left ( x_2 \right ) \geq \bigtriangledown f\left ( x_2 \right )^T \left ( x_1-x_2 \right )$

Converse

Let for $x_1,x_2 \in S$, $\bigtriangledown f\left ( x_2 \right )^T \left ( x_1-x_2 \right ) \leq f\left ( x_1 \right )-f \left ( x_2 \right )$

To show that f is convex.
Since S is convex, $x_3=\lambda x_1+\left (1-\lambda \right )x_2 \in S, \lambda \in \left ( 0, 1 \right )$

Since $x_1, x_3 \in S$,

$f\left ( x_1 \right )-f \left ( x_3 \right ) \geq \bigtriangledown f\left ( x_3 \right )^T \left ( x_1 -x_3\right )$

$\Rightarrow f\left ( x_1 \right )-f \left ( x_3 \right )\geq \bigtriangledown f\left ( x_3 \right )^T \left ( x_1 - \lambda x_1-\left (1-\lambda \right )x_2\right )$

$\Rightarrow f\left ( x_1 \right )-f \left ( x_3 \right )\geq \left ( 1- \lambda\right )\bigtriangledown f\left ( x_3 \right )^T \left ( x_1 - x_2\right )$

Since $x_2, x_3 \in S$,

$f\left ( x_2 \right )-f\left ( x_3 \right )\geq \bigtriangledown f\left ( x_3 \right )^T\left ( x_2-x_3 \right )$

$\Rightarrow f\left ( x_2 \right )-f\left ( x_3 \right )\geq \bigtriangledown f\left ( x_3 \right )^T\left ( x_2-\lambda x_1-\left ( 1-\lambda \right )x_2 \right )$

$\Rightarrow f\left ( x_2 \right )-f\left ( x_3 \right )\geq \left ( -\lambda \right )\bigtriangledown f\left ( x_3 \right )^T\left ( x_1-x_2 \right )$

Multiplying the first inequality by $\lambda$, the second by $\left ( 1-\lambda \right )$, and adding, we get −

$\lambda \left ( f\left ( x_1 \right )-f\left ( x_3 \right ) \right )+\left ( 1- \lambda \right )\left ( f\left ( x_2 \right )-f\left ( x_3 \right ) \right )\geq 0$

$\Rightarrow f\left ( x_3\right )\leq \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right )$

Theorem

Let S be a non-empty open convex set in $\mathbb{R}^n$ and let $f:S \rightarrow \mathbb{R}$ be differentiable on S, then f is convex on S if and only if for any $x_1,x_2 \in S$, $\left ( \bigtriangledown f \left ( x_2 \right )-\bigtriangledown f \left ( x_1 \right ) \right )^T \left ( x_2-x_1 \right ) \geq 0$

Proof

Let f be a convex function, then using the previous theorem −

$\bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right )\leq f\left ( x_1 \right )-f\left ( x_2 \right )$ and

$\bigtriangledown f\left ( x_1 \right )^T\left ( x_2-x_1 \right )\leq f\left ( x_2 \right )-f\left ( x_1 \right )$

Adding the above two inequalities, we get −

$\bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right )+\bigtriangledown f\left ( x_1 \right )^T\left ( x_2-x_1 \right )\leq 0$

$\Rightarrow \left ( \bigtriangledown f\left ( x_2 \right )-\bigtriangledown f\left ( x_1 \right ) \right )^T\left ( x_1-x_2 \right )\leq 0$

$\Rightarrow \left ( \bigtriangledown f\left ( x_2 \right )-\bigtriangledown f\left ( x_1 \right ) \right )^T\left ( x_2-x_1 \right )\geq 0$

Converse

Let for any $x_1,x_2 \in S$, $\left (\bigtriangledown f \left ( x_2\right )- \bigtriangledown f \left ( x_1\right )\right )^T \left ( x_2-x_1\right )\geq 0$

To show that f is convex.

Let $x_1,x_2 \in S$. By the mean value theorem,

$f\left ( x_1\right )-f\left ( x_2\right )=\bigtriangledown f\left ( x\right )^T\left ( x_1-x_2\right )$, where $x= \lambda x_1+\left ( 1-\lambda\right )x_2$ for some $\lambda \in \left ( 0,1\right )$; $x \in S$ because S is a convex set.

For $x$ and $x_1$, the hypothesis gives −

$\left ( \bigtriangledown f\left ( x \right )-\bigtriangledown f\left ( x_1 \right ) \right )^T\left ( x-x_1 \right )\geq 0$

$\Rightarrow \left ( \bigtriangledown f\left ( x \right )-\bigtriangledown f\left ( x_1 \right ) \right )^T\left ( \lambda x_1+\left ( 1-\lambda \right )x_2-x_1 \right )\geq 0$

$\Rightarrow \left ( \bigtriangledown f\left ( x \right )- \bigtriangledown f\left ( x_1 \right )\right )^T\left ( 1- \lambda \right )\left ( x_2-x_1 \right )\geq 0$

$\Rightarrow \bigtriangledown f\left ( x \right )^T\left ( x_2-x_1 \right )\geq \bigtriangledown f\left ( x_1 \right )^T\left ( x_2-x_1 \right )$

Combining the above, $f\left ( x_2\right )-f\left ( x_1\right )=\bigtriangledown f\left ( x\right )^T\left ( x_2-x_1\right )\geq \bigtriangledown f\left ( x_1\right )^T\left ( x_2-x_1\right )$; hence, by the previous theorem, f is convex.
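Both characterizations from this chapter can be spot-checked numerically on a sample convex function. The following is a sketch with an example function of my own choosing:

```python
import numpy as np

# Spot-check for the convex function f(x) = x1^2 + 2*x2^2:
# (1) gradient inequality: f(x1) - f(x2) >= grad f(x2)^T (x1 - x2)
# (2) gradient monotonicity: (grad f(x2) - grad f(x1))^T (x2 - x1) >= 0

def f(x):
    return x[0] ** 2 + 2 * x[1] ** 2

def grad_f(x):
    return np.array([2 * x[0], 4 * x[1]])

rng = np.random.default_rng(0)
gradient_ineq_ok = True
monotonicity_ok = True
for _ in range(500):
    a, b = rng.normal(size=2), rng.normal(size=2)
    gradient_ineq_ok &= f(a) - f(b) >= grad_f(b) @ (a - b) - 1e-9
    monotonicity_ok &= (grad_f(b) - grad_f(a)) @ (b - a) >= -1e-9
print(gradient_ineq_ok, monotonicity_ok)   # True True
```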
Differentiable Quasiconvex Function

Theorem

Let S be a non-empty convex set in $\mathbb{R}^n$ and $f:S \rightarrow \mathbb{R}$ be differentiable on S, then f is quasiconvex if and only if for any $x_1,x_2 \in S$ with $f\left ( x_1 \right )\leq f\left ( x_2 \right )$, we have $\bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right )\leq 0$

Proof

Let f be a quasiconvex function.

Let $x_1,x_2 \in S$ such that $f\left ( x_1 \right ) \leq f\left ( x_2 \right )$

By differentiability of f at $x_2$, for $\lambda \in \left ( 0, 1 \right )$,

$f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )=f\left ( x_2+\lambda \left (x_1-x_2 \right ) \right )=f\left ( x_2 \right )+\lambda \bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right )+\lambda \left \| x_1-x_2 \right \|\alpha \left ( x_2,\lambda \left ( x_1-x_2 \right ) \right )$

$\Rightarrow f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )-f\left ( x_2 \right )=\lambda \bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right )+\lambda \left \| x_1-x_2 \right \|\alpha \left ( x_2, \lambda\left ( x_1-x_2 \right )\right )$

But since f is quasiconvex, $f \left ( \lambda x_1+ \left ( 1- \lambda \right )x_2 \right )\leq f \left (x_2 \right )$, so

$\lambda \bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right )+\lambda \left \| x_1-x_2 \right \|\alpha \left ( x_2,\lambda \left ( x_1-x_2 \right ) \right )\leq 0$

But $\alpha \left ( x_2,\lambda \left ( x_1-x_2 \right )\right )\rightarrow 0$ as $\lambda \rightarrow 0$

Therefore, dividing by $\lambda$ and letting $\lambda \rightarrow 0$, $\bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right ) \leq 0$

Converse

Let for $x_1,x_2 \in S$ with $f\left ( x_1 \right )\leq f\left ( x_2 \right )$, $\bigtriangledown f\left ( x_2 \right )^T \left ( x_1-x_2 \right ) \leq 0$

To show that f is quasiconvex, i.e., $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\leq f\left ( x_2 \right )$

Proof by contradiction

Suppose there exists an $x_3= \lambda x_1+\left ( 1-\lambda \right )x_2, \lambda \in \left ( 0,1 \right )$ such that $f\left ( x_2 \right )< f\left ( x_3 \right )$

For $x_2$ and $x_3$, $\bigtriangledown f\left ( x_3 \right )^T \left ( x_2-x_3 \right ) \leq 0$. Since $x_2-x_3=-\lambda \left ( x_1-x_2 \right )$,

$\Rightarrow -\lambda \bigtriangledown f\left ( x_3 \right )^T\left ( x_1-x_2 \right )\leq 0$

$\Rightarrow \bigtriangledown f\left ( x_3 \right )^T \left ( x_1-x_2 \right )\geq 0$

For $x_1$ and $x_3$ (note $f\left ( x_1 \right )\leq f\left ( x_2 \right )< f\left ( x_3 \right )$), $\bigtriangledown f\left ( x_3 \right )^T \left ( x_1-x_3 \right ) \leq 0$. Since $x_1-x_3=\left ( 1-\lambda \right )\left ( x_1-x_2 \right )$,

$\Rightarrow \left ( 1- \lambda \right )\bigtriangledown f\left ( x_3 \right )^T\left ( x_1-x_2 \right )\leq 0$

$\Rightarrow \bigtriangledown f\left ( x_3 \right )^T \left ( x_1-x_2 \right )\leq 0$

Thus, from the above inequalities, $\bigtriangledown f\left ( x_3 \right )^T \left ( x_1-x_2 \right )=0$

Define $U=\left \{ x:f\left ( x \right )\leq f\left ( x_2 \right ),x=\mu x_2+\left ( 1-\mu \right )x_3, \mu \in \left ( 0,1 \right ) \right \}$

Thus we can find $x_0 \in U$ such that $x_0 = \mu_0 x_2+\left ( 1- \mu_0 \right )x_3$ for some $\mu_0 \in \left ( 0,1 \right )$ which is nearest to $x_3$, and $\hat{x} \in \left ( x_0,x_3 \right )$ such that by the mean value theorem,

$$f\left ( x_3\right )-f\left ( x_0\right )= \bigtriangledown f\left ( \hat{x}\right )^T\left ( x_3-x_0 \right )$$

Since $x_3-x_0=\mu_0\left ( x_3-x_2 \right )=\mu_0 \lambda \left ( x_1-x_2 \right )$,

$$\Rightarrow f\left ( x_3 \right )=f\left ( x_0 \right )+\mu_0 \lambda \bigtriangledown f\left ( \hat{x}\right )^T \left ( x_1-x_2 \right )$$

Since $x_0 \in U$, $f\left ( x_0 \right )\leq f\left ( x_2 \right )< f\left ( x_3 \right )$, and by repeating the starting procedure at $\hat{x}$, $\bigtriangledown f \left ( \hat{x}\right )^T \left ( x_1-x_2\right )=0$

Thus, combining the above equations, we get:

$$f\left ( x_3\right )=f\left ( x_0 \right ) \leq f\left ( x_2\right )$$

$$\Rightarrow f\left ( x_3\right )\leq f\left ( x_2\right )$$

which is a contradiction.

Examples

Example 1 − $f\left ( x\right )=x^3$

Let $f \left ( x_1\right )\leq f\left ( x_2\right )$

$\Rightarrow x_{1}^{3}\leq x_{2}^{3}\Rightarrow x_1\leq x_2$

$\bigtriangledown f\left ( x_2 \right )\left ( x_1-x_2 \right )=3x_{2}^{2}\left ( x_1-x_2 \right )\leq 0$

Thus, $f\left ( x\right )$ is quasiconvex.
Example 2 − $f\left ( x\right )=x_{1}^{3}+x_{2}^{3}$

Let $\hat{x_1}=\left ( 2, -2\right )$ and $\hat{x_2}=\left ( 1, 0\right )$

Thus, $f\left ( \hat{x_1}\right )=0,f\left ( \hat{x_2}\right )=1 \Rightarrow f\left ( \hat{x_1}\right )< f\left ( \hat{x_2}\right )$

But $\bigtriangledown f \left ( \hat{x_2}\right )^T \left ( \hat{x_1}- \hat{x_2}\right )= \left ( 3, 0\right )^T \left ( 1, -2\right )=3 >0$

Hence $f\left ( x\right )$ is not quasiconvex.
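Example 2 can be reproduced in a few lines of code; a small sketch (helper names are my own):

```python
import numpy as np

# f(x) = x1^3 + x2^3 with x_hat1 = (2, -2) and x_hat2 = (1, 0):
# the first-order quasiconvexity test fails.

def f(x):
    return x[0] ** 3 + x[1] ** 3

def grad_f(x):
    return np.array([3 * x[0] ** 2, 3 * x[1] ** 2])

x1 = np.array([2.0, -2.0])
x2 = np.array([1.0, 0.0])

print(f(x1), f(x2))             # 0.0 1.0 — so f(x1) <= f(x2)
print(grad_f(x2) @ (x1 - x2))   # 3.0 > 0, violating grad f(x2)^T (x1-x2) <= 0
```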
Strictly Quasiconvex Function

Let $f:S\rightarrow \mathbb{R}$ and S be a non-empty convex set in $\mathbb{R}^n$, then f is said to be a strictly quasiconvex function if for each $x_1,x_2 \in S$ with $f\left ( x_1 \right ) \neq f\left ( x_2 \right )$, we have $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )< \max \left \{ f\left ( x_1 \right ),f\left ( x_2 \right ) \right \}, \forall \lambda \in \left ( 0,1 \right )$

Remarks

Every strictly convex function is strictly quasiconvex.

A strictly quasiconvex function does not imply quasiconvexity.

A strictly quasiconvex function may not be strongly quasiconvex.

A pseudoconvex function is a strictly quasiconvex function.

Theorem

Let $f:S\rightarrow \mathbb{R}$ be a strictly quasiconvex function and S be a non-empty convex set in $\mathbb{R}^n$. Consider the problem: $\min f\left ( x \right ), x \in S$. If $\hat{x}$ is a local optimal solution, then $\hat{x}$ is a global optimal solution.

Proof

Suppose there exists $\bar{x} \in S$ such that $f\left ( \bar{x}\right )< f \left ( \hat{x}\right )$

Since $\bar{x},\hat{x} \in S$ and S is a convex set,

$$\lambda \bar{x}+\left ( 1-\lambda \right )\hat{x}\in S, \forall \lambda \in \left ( 0,1 \right )$$

Since $\hat{x}$ is a local minima, $f\left ( \hat{x} \right ) \leq f\left ( \lambda \bar{x}+\left ( 1-\lambda \right )\hat{x} \right ), \forall \lambda \in \left ( 0,\delta \right )$

But since f is strictly quasiconvex,

$$f\left ( \lambda \bar{x}+\left ( 1-\lambda \right )\hat{x} \right )< \max \left \{ f\left ( \bar{x} \right ),f\left ( \hat{x} \right ) \right \}=f\left ( \hat{x} \right ), \forall \lambda \in \left ( 0,1 \right )$$

Hence, it is a contradiction, and $\hat{x}$ is a global optimal solution.

Strictly Quasiconcave Function

Let $f:S\rightarrow \mathbb{R}$ and S be a non-empty convex set in $\mathbb{R}^n$, then f is said to be a strictly quasiconcave function if for each $x_1,x_2 \in S$ with $f\left (x_1\right )\neq f\left (x_2\right )$, we have

$$f\left (\lambda x_1+\left (1-\lambda\right )x_2\right )> \min \left \{ f \left (x_1\right ),f\left (x_2\right )\right \}$$
Examples

$f\left (x\right )=x^2-2$

It is a strictly quasiconvex function because if we take any two points $x_1,x_2$ in the domain that satisfy the constraints in the definition, $f\left (\lambda x_1+\left (1- \lambda\right )x_2\right )< \max \left \{ f \left (x_1\right ),f\left (x_2\right )\right \}$

$f\left (x\right )=-x^2$

It is not a strictly quasiconvex function because if we take $x_1=1$ and $x_2=-1$ and $\lambda=0.5$, then $f\left (x_1\right )=-1=f\left (x_2\right )$ but $f\left (\lambda x_1+\left (1- \lambda\right )x_2\right )=0$. Therefore it does not satisfy the conditions stated in the definition. But it is a quasiconcave function, because if we take any two points in the domain that satisfy the constraints in the definition, $f\left ( \lambda x_1+\left (1-\lambda\right )x_2\right )\geq \min \left \{ f \left (x_1\right ),f\left (x_2\right )\right \}$, as the function is increasing on the negative x-axis and decreasing on the positive x-axis.
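The second example can be checked directly in code. A minimal sketch (the function names are my own): it shows that the midpoint value of $f\left (x\right )=-x^2$ exceeds both endpoint values, which is the behaviour the definitions rule out.

```python
# f(x) = -x**2 at x1 = 1, x2 = -1, lambda = 0.5: the convex combination
# gives a larger value than both endpoints.

def f(x):
    return -x ** 2

x1, x2, lam = 1.0, -1.0, 0.5
midpoint = lam * x1 + (1 - lam) * x2          # 0.0

print(f(x1), f(x2))                           # -1.0 -1.0
print(f(midpoint) < max(f(x1), f(x2)))        # False — midpoint value is larger
```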
Convex and Concave Function

Let $f:S \rightarrow \mathbb{R}$, where S is a non-empty convex set in $\mathbb{R}^n$, then $f\left ( x \right )$ is said to be convex on S if $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\leq \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right ), \forall \lambda \in \left ( 0,1 \right )$.

On the other hand, let $f:S\rightarrow \mathbb{R}$, where S is a non-empty convex set in $\mathbb{R}^n$, then $f\left ( x \right )$ is said to be concave on S if $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\geq \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right ), \forall \lambda \in \left ( 0, 1 \right )$.

Let $f:S \rightarrow \mathbb{R}$, where S is a non-empty convex set in $\mathbb{R}^n$, then $f\left ( x\right )$ is said to be strictly convex on S if $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )< \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right ), \forall \lambda \in \left ( 0, 1 \right )$ and $x_1 \neq x_2$.

Let $f:S \rightarrow \mathbb{R}$, where S is a non-empty convex set in $\mathbb{R}^n$, then $f\left ( x\right )$ is said to be strictly concave on S if $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )> \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right ), \forall \lambda \in \left ( 0, 1 \right )$ and $x_1 \neq x_2$.

Examples

A linear function is both convex and concave.

$f\left ( x \right )=\left | x \right |$ is a convex function.

$f\left ( x \right )= \frac{1}{x}$ is a convex function on $\left \{ x:x>0 \right \}$.

Theorem

Let $f_1,f_2,...,f_k:\mathbb{R}^n \rightarrow \mathbb{R}$ be convex functions. Consider the function $f\left ( x \right )=\displaystyle\sum\limits_{j=1}^k \alpha_jf_j\left ( x \right )$, where $\alpha_j>0,j=1, 2, ...,k$, then $f\left ( x \right )$ is a convex function.

Proof

Since $f_1,f_2,...,f_k$ are convex functions,

$f_j\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\leq \lambda f_j\left ( x_1 \right )+\left ( 1-\lambda \right )f_j\left ( x_2 \right ), \forall \lambda \in \left ( 0, 1 \right )$ and $j=1, 2,...,k$

Consider the function $f\left ( x \right )$.
Therefore,

$f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )=\displaystyle\sum\limits_{j=1}^k \alpha_jf_j\left ( \lambda x_1 +\left ( 1-\lambda \right )x_2 \right )\leq \displaystyle\sum\limits_{j=1}^k\alpha_j\left [ \lambda f_j\left ( x_1 \right )+\left ( 1-\lambda \right )f_j\left ( x_2 \right ) \right ]$

$\Rightarrow f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\leq \lambda \left ( \displaystyle\sum\limits_{j=1}^k \alpha _jf_j\left ( x_1 \right ) \right )+\left ( 1-\lambda \right )\left ( \displaystyle\sum\limits_{j=1}^k \alpha _jf_j\left ( x_2 \right ) \right )$

$\Rightarrow f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\leq \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right )$

Hence, $f\left ( x\right )$ is a convex function.

Theorem

Let $f\left ( x\right )$ be a convex function on a convex set $S\subset \mathbb{R}^n$, then a local minima of $f\left ( x\right )$ on S is a global minima.

Proof

Let $\hat{x}$ be a local minima for $f\left ( x \right )$ and suppose $\hat{x}$ is not a global minima.

Therefore, there exists $\bar{x} \in S$ such that $f\left ( \bar{x} \right )< f\left ( \hat{x} \right )$

Since $\hat{x}$ is a local minima, there exists a neighbourhood $N_\varepsilon \left ( \hat{x} \right )$ such that $f\left ( \hat{x} \right )\leq f\left ( x \right ), \forall x \in N_\varepsilon \left ( \hat{x} \right )\cap S$

But $f\left ( x \right )$ is a convex function on S; therefore, for $\lambda \in \left ( 0, 1 \right )$, we have

$f\left ( \lambda \hat{x}+\left ( 1-\lambda \right )\bar{x}\right )\leq \lambda f\left ( \hat{x} \right )+\left ( 1-\lambda \right )f\left ( \bar{x} \right )$

$\Rightarrow f\left ( \lambda \hat{x}+\left ( 1-\lambda \right )\bar{x}\right )< \lambda f\left ( \hat{x} \right )+\left ( 1-\lambda \right )f\left ( \hat{x} \right )$

$\Rightarrow f\left ( \lambda \hat{x}+\left ( 1-\lambda \right )\bar{x}\right )< f\left ( \hat{x} \right )$

But for $\lambda$ sufficiently close to 1, $\lambda \hat{x}+\left ( 1-\lambda \right )\bar{x} \in N_\varepsilon \left ( \hat{x} \right )\cap S$ and $f\left ( \lambda \hat{x}+\left ( 1-\lambda \right )\bar{x} \right )< f\left ( \hat{x} \right )$, which is a contradiction.

Hence, $\hat{x}$ is a global minima.
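The first theorem above (a positive combination of convex functions is convex) can be spot-checked numerically. A sketch with an example combination of my own choosing:

```python
import numpy as np

# f(x) = 2*|x| + 3*x**2 is a positive combination of the convex functions
# |x| and x**2; check the convexity inequality on random samples.

def f(x):
    return 2 * abs(x) + 3 * x ** 2

rng = np.random.default_rng(1)
ok = True
for _ in range(1000):
    x1, x2 = rng.normal(size=2)
    lam = rng.uniform()
    lhs = f(lam * x1 + (1 - lam) * x2)
    rhs = lam * f(x1) + (1 - lam) * f(x2)
    ok = ok and lhs <= rhs + 1e-12
print(ok)   # True
```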
Epigraph Let S be a non-empty subset of $mathbb{R}^n$ and let $f:S rightarrow mathbb{R}$; then the epigraph of f, denoted by epi(f) or $E_f$, is a subset of $mathbb{R}^{n+1}$ defined by $E_f=left { left ( x,alpha right ):x in mathbb{R}^n, alpha in mathbb{R}, fleft ( x right )leq alpha right }$ Hypograph Let S be a non-empty subset of $mathbb{R}^n$ and let $f:S rightarrow mathbb{R}$; then the hypograph of f, denoted by hyp(f) or $H_f$, is defined by $H_f=left { left ( x, alpha right ):x in mathbb{R}^n, alpha in mathbb{R}, fleft ( x right )geq alpha right }$ Theorem Let S be a non-empty convex set in $mathbb{R}^n$ and let $f:S rightarrow mathbb{R}$; then f is convex if and only if its epigraph $E_f$ is a convex set. Proof Let f be a convex function. To show $E_f$ is a convex set. Let $left ( x_1, alpha_1 right ),left ( x_2, alpha_2 right ) in E_f,lambda inleft ( 0, 1 right )$ To show $lambda left ( x_1,alpha_1 right )+left ( 1-lambda right )left ( x_2, alpha_2 right ) in E_f$, i.e., $left [ lambda x_1+left ( 1-lambda right )x_2, lambda alpha_1+left ( 1-lambda right )alpha_2 right ]in E_f$ By definition, $fleft ( x_1 right )leq alpha _1, fleft ( x_2right )leq alpha _2$ Therefore, $fleft (lambda x_1+left ( 1-lambda right )x_2 right )leq lambda fleft ( x_1 right )+left ( 1-lambda right )f left ( x_2 right )$ $Rightarrow fleft ( lambda x_1+left ( 1-lambda right )x_2 right )leq lambda alpha_1+left ( 1-lambda right )alpha_2$ Hence $left ( lambda x_1+left ( 1-lambda right )x_2, lambda alpha_1+left ( 1-lambda right )alpha_2 right ) in E_f$, so $E_f$ is a convex set. Converse Let $E_f$ be a convex set. To show f is convex, i.e., to show that if $x_1, x_2 in S,lambda in left ( 0, 1right )$, then $fleft ( lambda x_1+left ( 1-lambda right )x_2 right )leq lambda fleft ( x_1 right )+left ( 1-lambda right )fleft ( x_2 right )$ Let $x_1,x_2 in S, lambda in left ( 0, 1 right ),fleft ( x_1 right ), fleft ( x_2 right ) in mathbb{R}$ Since $left ( x_1, fleft ( x_1 right ) right ),left ( x_2, fleft ( x_2 right ) right ) in E_f$ and $E_f$ is a convex set, $left ( lambda x_1+left ( 1-lambda right )x_2, lambda fleft ( x_1 right )+left ( 1-lambda right )fleft ( x_2 right ) right )in E_f$ Therefore, $fleft ( lambda x_1+left ( 1-lambda right )x_2 right )leq lambda fleft ( x_1 right )+left ( 1-lambda right )fleft ( x_2 right )$, and hence f is convex.
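The epigraph characterization above can be illustrated numerically: the short sketch below (an illustrative example, with $fleft ( x right )=x^2$ chosen as the convex function, not taken from the text) samples pairs of points in epi(f) and checks that every convex combination of them stays in epi(f).

```python
import numpy as np

def f(x):
    return x**2   # a convex function, so its epigraph should be a convex set

rng = np.random.default_rng(1)
for _ in range(1000):
    x1, x2 = rng.uniform(-3, 3, size=2)
    # (x, alpha) lies in epi(f) whenever f(x) <= alpha
    a1 = f(x1) + rng.uniform(0, 2)
    a2 = f(x2) + rng.uniform(0, 2)
    lam = rng.uniform(0, 1)
    x = lam * x1 + (1 - lam) * x2
    a = lam * a1 + (1 - lam) * a2
    # the convex combination of the two epigraph points must remain in epi(f)
    assert f(x) <= a + 1e-12
print("sampled convex combinations of epigraph points stay in the epigraph")
```

By contrast, repeating the experiment with a non-convex function such as $-x^2$ would quickly produce a combination that falls outside the epigraph.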
Sufficient & Necessary Conditions for Global Optima Theorem Let f be a twice differentiable function. If $bar{x}$ is a local minimum, then $bigtriangledown fleft ( bar{x} right )=0$ and the Hessian matrix $Hleft ( bar{x} right )$ is positive semidefinite. Proof Let $d in mathbb{R}^n$. Since f is twice differentiable at $bar{x}$, $fleft ( bar{x} +lambda dright )=fleft ( bar{x} right )+lambda bigtriangledown fleft ( bar{x} right )^T d+frac{1}{2}lambda^2d^THleft ( bar{x} right )d+lambda^2left | d right |^2beta left ( bar{x}, lambda d right )$ where $betaleft ( bar{x}, lambda d right )rightarrow 0$ as $lambda rightarrow 0$. Since $bar{x}$ is a local minimum, $bigtriangledown fleft ( bar{x} right )=0$ $Rightarrow fleft ( bar{x} +lambda d right )-fleft ( bar{x} right )=frac{1}{2}lambda ^2d^THleft ( bar{x} right )d+lambda^2left | d right |^2beta left ( bar{x}, lambda d right )$ Since $bar{x}$ is a local minimum, there exists a $delta > 0$ such that $fleft ( bar{x} right )leq fleft ( bar{x}+lambda d right ), forall lambda in left ( 0,delta right )$. Dividing by $lambda^2$ and letting $lambda rightarrow 0$ gives $d^THleft ( bar{x} right )d geq 0$; since d was arbitrary, $Hleft ( bar{x} right )$ is positive semidefinite. Theorem Let $f:S rightarrow mathbb{R}$ where $S subset mathbb{R}^n$ be twice differentiable over S. If $bigtriangledown fleft ( bar{x}right )=0$ and $Hleft ( x right )$ is positive semi-definite for all $x in S$, then $bar{x}$ is a global optimal solution. Proof Since $Hleft ( x right )$ is positive semi-definite for all $x in S$, f is a convex function over S. Since f is differentiable and convex at $bar{x}$, $bigtriangledown fleft ( bar{x} right )^T left ( x-bar{x} right ) leq fleft (xright )-fleft (bar{x}right ),forall x in S$ Since $bigtriangledown fleft ( bar{x} right )=0, fleft ( x right )geq fleft ( bar{x} right )$ Hence, $bar{x}$ is a global optimum. Theorem Consider the problem $min :fleft ( x right )$, $x in S$, where $f:S rightarrow mathbb{R}$ is a convex function and S is a non-empty convex subset of $mathbb{R}^n$. Suppose $bar{x} in S$ is a local optimal solution to the problem. Then: $bar{x}$ is a global optimal solution.
If either $bar{x}$ is a strict local minimum or f is a strictly convex function, then $bar{x}$ is the unique global optimal solution and is also a strong local minimum. Proof Let $hat{x}$ be another global optimal solution to the problem such that $hat{x} neq bar{x}$ and $fleft ( bar{x} right )=fleft ( hat{x} right )$ Since $hat{x},bar{x} in S$ and S is convex, then $frac{hat{x}+bar{x}}{2} in S$, and f is strictly convex $Rightarrow fleft ( frac{hat{x}+bar{x}}{2} right )< frac{1}{2}fleft ( hat{x} right )+frac{1}{2}fleft ( bar{x} right )=fleft ( bar{x} right )$ This is a contradiction. Hence, $bar{x}$ is the unique global optimal solution. Corollary Let $f:S subset mathbb{R}^n rightarrow mathbb{R}$ be a differentiable convex function where $phi neq Ssubset mathbb{R}^n$ is a convex set. Consider the problem $min fleft (xright ),x in S$; then $bar{x}$ is an optimal solution if and only if $bigtriangledown fleft (bar{x}right )^Tleft (x-bar{x}right ) geq 0,forall x in S.$ Proof Let $bar{x}$ be an optimal solution, i.e., $fleft (bar{x}right )leq fleft (xright ),forall x in S$ $Rightarrow fleft (xright )-fleft (bar{x}right )geq 0$ $fleft (xright )=fleft (bar{x}right )+bigtriangledown fleft (bar{x}right )^Tleft (x-bar{x}right )+left | x-bar{x} right |alpha left ( bar{x},x-bar{x} right )$ where $alpha left ( bar{x},x-bar{x} right )rightarrow 0$ as $x rightarrow bar{x}$ $Rightarrow fleft (xright )-fleft (bar{x}right )=bigtriangledown fleft (bar{x}right )^Tleft (x-bar{x}right )+left | x-bar{x} right |alpha left ( bar{x},x-bar{x} right )geq 0$, and letting $x rightarrow bar{x}$ along the segment joining x and $bar{x}$ gives $bigtriangledown fleft (bar{x}right )^Tleft (x-bar{x}right )geq 0$ Corollary Let f be a differentiable convex function at $bar{x}$; then $bar{x}$ is a global minimum iff $bigtriangledown fleft (bar{x}right )=0$ Examples $fleft (xright )=left (x^2-1right )^{3}, x in mathbb{R}$. $bigtriangledown fleft (xright )=0 Rightarrow x= -1,0,1$. $bigtriangledown^2fleft (pm 1 right )=0, bigtriangledown^2 fleft (0 right )=6>0$. $fleft (pm 1 right )=0,fleft (0 right )=-1$ Since $x^2-1geq -1$ for all x and the cube is increasing, $fleft (x right ) geq -1=fleft (0 right )Rightarrow fleft (0 right ) leq f left (x right) forall x in mathbb{R}$, so $x=0$ is the global minimum. $fleft (x right )=xlog x$ defined on $S=left { x in mathbb{R}: x> 0 right }$.
${f}'left ( x right )=1+log x$ ${f}''left ( x right )=frac{1}{x}>0$ Thus, this function is strictly convex. Similarly, $f left (x right )=e^{x},x in mathbb{R}$ is strictly convex.
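The two worked examples can be checked numerically. The sketch below (a minimal illustration using only the Python standard library; the helper name `check_strict_convexity` is ours, not from the text) samples distinct point pairs and verifies the strict convexity inequality for $x log x$ on $x>0$ and for $e^x$ on $mathbb{R}$.

```python
import math
import random

def check_strict_convexity(f, sample, trials=1000):
    """Sample pairs x1 != x2 and verify the strict inequality
    f(lam*x1 + (1-lam)*x2) < lam*f(x1) + (1-lam)*f(x2)."""
    for _ in range(trials):
        x1, x2 = sample(), sample()
        if abs(x1 - x2) < 1e-6:   # skip nearly equal pairs
            continue
        lam = random.uniform(0.1, 0.9)
        lhs = f(lam * x1 + (1 - lam) * x2)
        rhs = lam * f(x1) + (1 - lam) * f(x2)
        assert lhs < rhs, "strict convexity inequality violated"
    return True

random.seed(0)
# x*log(x) on x > 0: f''(x) = 1/x > 0, so strictly convex
assert check_strict_convexity(lambda x: x * math.log(x),
                              lambda: random.uniform(0.1, 10))
# e^x on R: f''(x) = e^x > 0, so strictly convex
assert check_strict_convexity(math.exp,
                              lambda: random.uniform(-3, 3))
print("both functions satisfy the strict convexity inequality on sampled pairs")
```

This mirrors the second-derivative argument in the text: a positive second derivative everywhere on an interval implies strict convexity there.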
Algorithms for Convex Problem Method of Steepest Descent This method is also called the Gradient method or Cauchy's method. It involves the following terminology − $$x_{k+1}=x_k+alpha_kd_k$$ $d_k= – bigtriangledown fleft ( x_k right )$ or $ d_k= -frac{bigtriangledown fleft ( x_k right )}{left | bigtriangledown fleft ( x_k right ) right |}$ Let $phi left (alpha right )=fleft ( x_k +alpha d_kright )$ By differentiating $phi$ and equating it to zero, we can get $alpha$. The algorithm goes as follows − Initialize $x_0$,$varepsilon_1$,$varepsilon_2$ and set $k=0$. Set $d_k=-bigtriangledown fleft ( x_k right )$ or $d_k=-frac{bigtriangledown fleft (x_k right )}{left |bigtriangledown fleft (x_k right ) right |}$. Find $alpha_k$ such that it minimizes $phileft ( alpha right )=fleft ( x_k+alpha d_k right )$. Set $x_{k+1}=x_k+alpha_kd_k$. If $left | x_{k+1}-x_k right |< varepsilon_1$ or $left | bigtriangledown fleft ( x_{k+1} right ) right |< varepsilon_2$, stop; otherwise set $k=k+1$ and repeat from the second step. The optimal solution is $hat{x}=x_{k+1}$. Newton Method The Newton Method works on the following principle − $fleft ( x right )approx yleft ( x right )=fleft ( x_k right )+left ( x-x_k right )^T bigtriangledown fleft ( x_k right )+frac{1}{2}left ( x-x_k right )^T Hleft ( x_k right )left ( x-x_k right )$ $bigtriangledown yleft ( x right )=bigtriangledown fleft ( x_k right )+Hleft ( x_k right )left ( x-x_k right )$ At $x_{k+1}, bigtriangledown yleft ( x_{k+1} right )=bigtriangledown fleft ( x_k right )+Hleft ( x_k right )left ( x_{k+1}-x_k right )$ For $x_{k+1}$ to be an optimal solution, $bigtriangledown yleft ( x_{k+1} right )=0$ Thus, $x_{k+1}=x_k-Hleft ( x_k right )^{-1} bigtriangledown fleft ( x_k right )$ Here $Hleft ( x_k right )$ should be non-singular. The algorithm goes as follows − Step 1 − Initialize $x_0,varepsilon$ and set $k=0$. Step 2 − Find $Hleft ( x_k right )$ and $bigtriangledown fleft ( x_k right )$. Step 3 − Solve the linear system $Hleft ( x_k right )hleft ( x_k right )=bigtriangledown fleft ( x_k right )$ for $hleft ( x_k right )$. Step 4 − Set $x_{k+1}=x_k-hleft ( x_k right )$.
Step 5 − If $left | x_{k+1}-x_k right |< varepsilon$, go to Step 6; otherwise set $k=k+1$ and go to Step 2. Step 6 − The optimal solution is $hat{x}=x_{k+1}$. Conjugate Gradient Method This method is used for solving problems of the following type − $min fleft ( x right )=frac{1}{2}x^T Qx-b^Tx$ where Q is a positive definite $n times n$ matrix and b is a constant vector. Given $x_0,varepsilon,$ compute $g_0=Qx_0-b$ Set $d_0=-g_0$ For $k=0,1,2,…,$ Set $alpha_k=frac{g_{k}^{T}g_k}{d_{k}^{T}Q d_k}$ Compute $x_{k+1}=x_k+alpha_kd_k$ Set $g_{k+1}=g_k+alpha_kQd_k$ Compute $beta_k=frac{g_{k+1}^{T}g_{k+1}}{g_{k}^{T}g_k}$ Set $d_{k+1}=-g_{k+1}+beta_kd_k$ Repeat until $left | g_{k+1} right |< varepsilon$.
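The conjugate gradient steps above can be sketched as follows — a minimal NumPy implementation on a small positive definite quadratic (the matrix Q and vector b are illustrative choices), checked against the exact minimizer $x^*=Q^{-1}b$, which satisfies $bigtriangledown fleft ( x right )=Qx-b=0$. For this quadratic the Hessian is Q everywhere, so a single Newton step from any starting point also lands on $x^*$, and that is checked as well.

```python
import numpy as np

def conjugate_gradient(Q, b, x0, eps=1e-10, max_iter=100):
    """Minimize f(x) = 0.5*x^T Q x - b^T x for symmetric positive definite Q."""
    x = x0.astype(float)
    g = Q @ x - b          # g_0 = Q x_0 - b (the gradient at x_0)
    d = -g                 # d_0 = -g_0
    for _ in range(max_iter):
        if np.linalg.norm(g) < eps:
            break
        alpha = (g @ g) / (d @ Q @ d)     # alpha_k = g_k^T g_k / d_k^T Q d_k
        x = x + alpha * d                 # x_{k+1} = x_k + alpha_k d_k
        g_new = g + alpha * (Q @ d)       # g_{k+1} = g_k + alpha_k Q d_k
        beta = (g_new @ g_new) / (g @ g)  # beta_k = g_{k+1}^T g_{k+1} / g_k^T g_k
        d = -g_new + beta * d             # d_{k+1} = -g_{k+1} + beta_k d_k
        g = g_new
    return x

Q = np.array([[4.0, 1.0], [1.0, 3.0]])    # symmetric positive definite
b = np.array([1.0, 2.0])
x_cg = conjugate_gradient(Q, b, np.zeros(2))
x_star = np.linalg.solve(Q, b)            # exact minimizer: Qx - b = 0

# One Newton step from any x0 also reaches x*, since H(x) = Q for a quadratic
x0 = np.array([10.0, -7.0])
x_newton = x0 - np.linalg.solve(Q, Q @ x0 - b)

assert np.allclose(x_cg, x_star)
assert np.allclose(x_newton, x_star)
print("CG and one Newton step both reach x* =", x_star)
```

In exact arithmetic, conjugate gradient on an n-dimensional quadratic terminates in at most n iterations — here n = 2 — which is why the loop above finishes well within the iteration cap.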