Method of continuity for nonlinear equation solving

| categories: nonlinear algebra | View Comments



Method of continuity for nonlinear equation solving

John Kitchin

Adapted from Perry's Chemical Engineers Handbook, 6th edition 2-63.

function main

We seek the solution to the following nonlinear equations:

$2 + x + y - x^2 + 8 x y + y^3 = 0$

$1 + 2x - 3y + x^2 + xy - y e^x = 0$

f = @(x,y) 2 + x + y - x^2 + 8*x*y + y^3;
g = @(x,y) 1 + 2*x - 3*y + x^2 + x*y - y*exp(x);

In principle this is easy, we simply need some initial guesses. The challenge here is what would you guess? The equations are implicit, so it is not easy to graph them, but lets give it a shot, starting on the x range -5 to 5. The idea is set a value for x, and then solve for y in each equation

x = linspace(-5,5,500);
% at x = 0, we have 2 + y + y^3 = 0
fy = arrayfun(@(x) fzero(@(y) f(x,y), 0.1),x);
gy = arrayfun(@(x) fzero(@(y) g(x,y), 0),x);

graphical solution

you can see there is a solution near x = -1, y = 0, because both functions equal zero there. We can even use that guess with fsolve. It is disappointly easy! But, keep in mind that in 3 or more dimensions, you cannot perform this visualization, and another method could be required.

f1 = @(X) [f(X(1),X(2)); g(X(1),X(2))];
fsolve(f1,[-1 0])
Equation solved at initial point.

fsolve completed because the vector of function values at the initial point
is near zero as measured by the default value of the function tolerance, and
the problem appears regular as measured by the gradient.

ans =

    -1     0

We explore a method that bypasses this problem today. The principle is to introduce a new variable, $\lambda$, which will vary from 0 to 1. at $\lambda=0$ we will have a simpler equation, preferrably a linear one, which can be solved, which can be analytically solved. At $\lambda=1$, we have the original equations. Then, we create a system of differential equations that start at this solution, and integrate from $\lambda=0$ to $\lambda=1$, to recover the final solution.

We rewrite the equations as:

$f(x,y) = (2 + x + y) + \lambda(- x^2 + 8 x y + y^3) = 0$

$g(x,y) = (1 + 2x - 3y) + \lambda(x^2 + xy - y e^x) = 0$

Now, at $\lambda=0$ we have the simple linear equations:

$x + y = -2$

$2x + 3y = -1$

x0 = [[1 1]; [2 -3]]\[-2; -1]
x0 =


Construct the system of ODEs

We form the system of ODEs by differentiating the new equations with respect to $\lambda$. Why do we do that? The solution, (x,y) will be a function of $\lambda$. From calculus, you can show that:

$\frac{\partial f}{\partial x}\frac{\partial x}{\partial \lambda}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial \lambda}=-\frac{\partial f}{\partial \lambda}$

$\frac{\partial g}{\partial x}\frac{\partial x}{\partial \lambda}+\frac{\partial g}{\partial y}\frac{\partial y}{\partial \lambda}=-\frac{\partial g}{\partial \lambda}$

Now, solve this for $\frac{\partial x}{\partial \lambda}$ and $\frac{\partial y}{\partial \lambda}$. You can use Cramer's rule to solve for these to yield: $\begin{array}{c} \frac{\partial x}{\partial \lambda} = \frac{\partial f/\partial y \partial g/\partial \lambda - \partial f/\partial \lambda  \partial g/\partial y}{\partial f/\partial x \partial g/\partial y - \partial f/\partial y  \partial g/\partial x }\\ \frac{\partial y}{\partial \lambda} = \frac{\partial f/\partial \lambda \partial g/\partial x - \partial f/\partial x  \partial g/\partial \lambda}{\partial f/\partial x \partial g/\partial y - \partial f/\partial y  \partial g/\partial x } \end{array}$

For this set of equations: $\begin{array}{l} \partial f/\partial x = 1 - 2\lambda x + 8\lambda y \\ \partial f/\partial y = 1 + 8 \lambda x + 3 \lambda y^2 \\ \partial g/\partial x = 2 + 2 \lambda x + \lambda y - \lambda y e^x\\ \partial g/\partial y = -3 + \lambda x - \lambda e^x \end{array}$

Now, we simply set up those two differential equations on $frac{\partial x}{\partial \lambda}$ and $frac{\partial y}{\partial \lambda}$, with the initial conditions at $\lambda = 0$ which is the solution of the simpler linear equations, and integrate to $\lambda = 1$, which is the final solution of the original equations!

    function dXdlambda = ode(lambda, X)
        x = X(1);
        y = X(2);

        pfpx = 1 - 2*lambda*x + 8*lambda*y;
        pfpy = 1 + 8*lambda*x + 3*lambda*y^2;
        pfplambda = -x^2+8*x*y+y^3;

        pgpx = 2 + 2*lambda*x + lambda*y - lambda*y*exp(x);
        pgpy = -3 + lambda*x - lambda*exp(x);
        pgplambda = x^2 + x*y -y*exp(x);

        dxdlambda = (pfpy*pgplambda - pfplambda*pgpy)/(pfpx*pgpy-pfpy*pgpx);
        dydlambda = (pfplambda*pgpx-pfpx*pgplambda)/(pfpx*pgpy-pfpy*pgpx);

        dXdlambda = [dxdlambda; dydlambda];

[lambda X] = ode45(@ode,[0 1],x0);

the solution

The last X value corresponds to $\lambda = 1$, which is the solution to the original equations.

Xsol = X(end,:);
x = Xsol(1)
y = Xsol(2)
x =


y =


evaluate solution

You can see the solution is somewhat approximate; the true solution is x = -1, y = 0. The approximation could be improved by lowering the tolerance on the ODE solver.

ans =


ans =



This is a fair amount of work to get a solution! The idea is to solve a simple problem, and then gradually turn on the hard part by the lambda parameter. What happens if there are multiple solutions? The answer you finally get will depend on your $\lambda=0$ starting point, so it is possible to miss solutions this way. For problems with lots of variables, this would be a good approach if you can identify the easy problem.


% categories: Nonlinear algebra
blog comments powered by Disqus