Draft:Newton's method for systems of nonlinear equations

Submitted 17 days ago by Netshine2. Last edited 2 days ago by Netshine2.

When several nonlinear equations must be solved for several unknowns, Newton's method for systems of equations can be used to solve them simultaneously for the solution vector.^[1]^[2]^[3] The process closely follows Newton's method for one variable, except that the single nonlinear equation is replaced with a system of nonlinear equations, the derivative is replaced with a Jacobian matrix of partial derivatives, the division by the derivative is replaced with the solution of a linear system, and the scalar subtraction is replaced with a vector subtraction. When the system contains only one equation, the method reduces to Newton's method for a single nonlinear equation.

Procedure

For a single equation $f(x)=y$ , Newton's method repeats the following iteration until successive iterations no longer change $x$ significantly,

$x_{k+1}=x_{k}-{\frac {f(x_{k})-y}{f'(x_{k})}}$

For a system of equations $F(X)=Y$ , the same iteration applies, with the vector $X$ in place of the scalar $x$ ,

$X_{k+1}=X_{k}-C(X_{k})$

where $C(X_{k})$ is the solution vector of the linear system $J(X_{k})C(X_{k})=F(X_{k})-Y$ ,

$J(X_{k})$ is the Jacobian matrix of $F$ evaluated at $X_{k}$ , and

$Y$ is a vector of known values.

If the $Y$ vector is set to all zeros, the defining equations may be rewritten in the commonly found form below.

$X_{k+1}=X_{k}-J(X_{k})^{-1}F(X_{k})$

or

$J(X_{k})(X_{k+1}-X_{k})=-F(X_{k})$
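
As a rough illustration of the procedure, the following Python sketch implements the iteration $X_{k+1}=X_{k}-C(X_{k})$ using only the standard library. The names `solve_linear` and `newton_system`, the default tolerance, the iteration cap, and the small test problem at the end are assumptions made for this sketch and do not come from the cited sources. The stopping test uses the sum of absolute residuals.

```python
def solve_linear(a, b):
    """Solve the linear system a * c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ZeroDivisionError("Jacobian is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    c = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(m[row][k] * c[k] for k in range(row + 1, n))
        c[row] = (m[row][n] - s) / m[row][row]
    return c


def newton_system(f, jac, x0, y, tol=1e-10, max_iter=50):
    """Repeat X <- X - C, with J(X) C = F(X) - Y, until sum |F(X) - Y| < tol.

    Returns the final X and the number of updates that were applied.
    """
    x = list(x0)
    for iteration in range(max_iter + 1):
        residual = [fi - yi for fi, yi in zip(f(x), y)]
        if sum(abs(r) for r in residual) < tol:
            return x, iteration
        if iteration < max_iter:
            c = solve_linear(jac(x), residual)
            x = [xi - ci for xi, ci in zip(x, c)]
    raise RuntimeError("no convergence within max_iter iterations")


# Hypothetical test problem: x1^2 + x2^2 = 4 and x1 - x2 = 0.
root, iterations = newton_system(
    lambda x: [x[0] ** 2 + x[1] ** 2, x[0] - x[1]],
    lambda x: [[2 * x[0], 2 * x[1]], [1.0, -1.0]],
    x0=[1.0, 2.0],
    y=[4.0, 0.0],
)
print(root, iterations)
```

The linear system is solved directly instead of forming $J(X_{k})^{-1}$ , which matches the second of the two equivalent forms above.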

Simple example

As an example, consider solving the following pair of equations for the vector $(x_{1},x_{2})$ , given the vector of known values $(2,3)$ .

${\begin{aligned}5x_{1}^{2}+x_{1}x_{2}^{2}+\sin ^{2}(2x_{2})&=2\\e^{2x_{1}-x_{2}}+4x_{2}&=3\end{aligned}}$

The function vector $F(X_{k})$ and the Jacobian matrix $J(X_{k})$ for iteration $k$ , and the vector of known values $Y$ , are defined below, where $f_{1}$ and $f_{2}$ denote the first and second components of $F$ .

${\begin{aligned}&F(X_{k})={\begin{bmatrix}5x_{1}^{2}+x_{1}x_{2}^{2}+\sin ^{2}(2x_{2})\\e^{2x_{1}-x_{2}}+4x_{2}\end{bmatrix}}_{k}\\&J(X_{k})={\begin{bmatrix}{\frac {\partial f_{1}}{\partial x_{1}}}&{\frac {\partial f_{1}}{\partial x_{2}}}\\{\frac {\partial f_{2}}{\partial x_{1}}}&{\frac {\partial f_{2}}{\partial x_{2}}}\end{bmatrix}}_{k}={\begin{bmatrix}10x_{1}+x_{2}^{2}&2x_{1}x_{2}+4\sin(2x_{2})\cos(2x_{2})\\2e^{2x_{1}-x_{2}}&-e^{2x_{1}-x_{2}}+4\end{bmatrix}}_{k}\\&Y={\begin{bmatrix}2\\3\end{bmatrix}}\end{aligned}}$

Note that $F(X_{k})$ could have been rewritten to absorb $Y$ , eliminating $Y$ from the equations. The linear system to solve at each iteration is

${\begin{bmatrix}10x_{1}+x_{2}^{2}&2x_{1}x_{2}+4\sin(2x_{2})\cos(2x_{2})\\2e^{2x_{1}-x_{2}}&-e^{2x_{1}-x_{2}}+4\end{bmatrix}}_{k}{\begin{bmatrix}c_{1}\\c_{2}\end{bmatrix}}_{k+1}={\begin{bmatrix}5x_{1}^{2}+x_{1}x_{2}^{2}+\sin ^{2}(2x_{2})-2\\e^{2x_{1}-x_{2}}+4x_{2}-3\end{bmatrix}}_{k}$

and

$X_{k+1}=X_{k}-C_{k+1}$

The iterations should be repeated until $\sum _{i=1}^{2}|f_{i}(X_{k})-y_{i}|<E$ , where $f_{i}$ is the $i$ -th equation, $y_{i}$ is its known value, and $E$ is an error tolerance small enough to meet application requirements.

If the starting vector $X_{0}$ is chosen to be ${\begin{bmatrix}1&1\end{bmatrix}}$ , that is, $x_{1}=1{\text{ and }}x_{2}=1$ , and $E$ is chosen to be $10^{-3}$ , then the example converges after four iterations to a value of $X_{4}=(0.567297,-0.309442)$ .
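
The example can be sketched in a few lines of Python. The function names and the use of Cramer's rule for the 2×2 linear solve are choices made for this illustration, not part of the cited sources. The sketch tests the tolerance before each linear solve; where that test sits in the loop can change the reported iteration count by one.

```python
import math

Y = (2.0, 3.0)  # vector of known values


def f(x1, x2):
    """Left-hand sides of the two example equations."""
    return (5 * x1 ** 2 + x1 * x2 ** 2 + math.sin(2 * x2) ** 2,
            math.exp(2 * x1 - x2) + 4 * x2)


def jacobian(x1, x2):
    """Analytic Jacobian of f, row by row."""
    e = math.exp(2 * x1 - x2)
    return ((10 * x1 + x2 ** 2, 2 * x1 * x2 + 4 * math.sin(2 * x2) * math.cos(2 * x2)),
            (2 * e, -e + 4))


def solve_example(x1, x2, tol=1e-3, max_iter=20):
    """Return the list of iterates X_0, X_1, ... up to the stopping point."""
    history = [(x1, x2)]
    for _ in range(max_iter):
        f1, f2 = f(x1, x2)
        r1, r2 = f1 - Y[0], f2 - Y[1]
        if abs(r1) + abs(r2) < tol:
            break
        (a, b), (c, d) = jacobian(x1, x2)
        det = a * d - b * c
        # Cramer's rule for the 2x2 system J C = F - Y.
        c1 = (r1 * d - b * r2) / det
        c2 = (a * r2 - c * r1) / det
        x1, x2 = x1 - c1, x2 - c2
        history.append((x1, x2))
    return history


for k, x in enumerate(solve_example(1.0, 1.0, tol=1e-10)):
    print(k, x)
```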

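Iterations

The table below lists the values computed at each iteration. Within each iteration, $J$ and $C$ are computed from the previous $X$ , and $X$ and $F(X)$ are the updated values.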

Iteration Convergence Sequence
Iteration	Variable	Variable Contents
0	X	${\begin{bmatrix}1&1\end{bmatrix}}$
	F(X)	${\begin{bmatrix}6.82682&6.71828\end{bmatrix}}$

1	J	${\begin{bmatrix}11&0.486395\\5.43656&1.28172\end{bmatrix}}$
	C	${\begin{bmatrix}0.382211&1.27982\end{bmatrix}}$
	X	${\begin{bmatrix}0.617789&-0.279818\end{bmatrix}}$
	F(X)	${\begin{bmatrix}2.23852&3.43195\end{bmatrix}}$

2	J	${\begin{bmatrix}6.25618&-2.1453\\9.10244&-0.551218\end{bmatrix}}$
	C	${\begin{bmatrix}0.0494549&0.0330411\end{bmatrix}}$
	X	${\begin{bmatrix}0.568334&-0.312859\end{bmatrix}}$
	F(X)	${\begin{bmatrix}2.01366&3.00966\end{bmatrix}}$

3	J	${\begin{bmatrix}5.78122&-2.25449\\8.52219&-0.261095\end{bmatrix}}$
	C	${\begin{bmatrix}0.00102862&-0.00342339\end{bmatrix}}$
	X	${\begin{bmatrix}0.567305&-0.309435\end{bmatrix}}$
	F(X)	${\begin{bmatrix}2.00003&3.00006\end{bmatrix}}$

4	J	${\begin{bmatrix}5.7688&-2.24118\\8.47561&-0.237805\end{bmatrix}}$
	C	${\begin{bmatrix}7.73132\times 10^{-6}&6.93265\times 10^{-6}\end{bmatrix}}$
	X	${\begin{bmatrix}0.567297&-0.309442\end{bmatrix}}$
	F(X)	${\begin{bmatrix}2&3\end{bmatrix}}$
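
As a check on the first row group of the table, the linear system for iteration 1 can be solved by hand with Cramer's rule. The values below are rounded, and the determinant is an intermediate quantity that does not appear in the table.

${\begin{bmatrix}11&0.486395\\5.43656&1.28172\end{bmatrix}}{\begin{bmatrix}c_{1}\\c_{2}\end{bmatrix}}={\begin{bmatrix}6.82682-2\\6.71828-3\end{bmatrix}}={\begin{bmatrix}4.82682\\3.71828\end{bmatrix}}$

$\det J=(11)(1.28172)-(0.486395)(5.43656)\approx 11.4546$

$c_{1}={\frac {(4.82682)(1.28172)-(0.486395)(3.71828)}{11.4546}}\approx 0.38221,\qquad c_{2}={\frac {(11)(3.71828)-(5.43656)(4.82682)}{11.4546}}\approx 1.27982$

$X_{1}=X_{0}-C_{1}\approx (1-0.38221,\ 1-1.27982)=(0.61779,\ -0.27982)$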

Practical considerations

Newton's method for systems of equations, especially large systems, can be less robust than Newton's method for single equations. Care should be taken to ensure that a solution is found within a reasonable number of iterations.

Singular matrices

The linear system that must be solved at each iteration may have a singular Jacobian matrix, in which case no usable correction vector exists. This can happen because the set of equations has no solution, or because of a poorly chosen starting vector $X_{0}$ . For example, had the initial $X_{0}$ in the example above been chosen to be (0,0), the first row of the Jacobian would have been all zeros, the matrix would have been singular, and the iteration would have failed. Care should be taken to choose a starting $X_{0}$ that is known to produce a non-singular Jacobian in the first iteration.
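
One simple way to screen a candidate starting point is to evaluate the determinant of the Jacobian before iterating. The sketch below does this for the 2×2 example; the function names and the threshold value are assumptions for illustration, and for larger systems a pivot check inside the linear solver would be the more usual test.

```python
import math


def example_jacobian(x1, x2):
    """Jacobian of the example system at (x1, x2)."""
    e = math.exp(2 * x1 - x2)
    return ((10 * x1 + x2 ** 2, 2 * x1 * x2 + 4 * math.sin(2 * x2) * math.cos(2 * x2)),
            (2 * e, 4 - e))


def determinant(j):
    """Determinant of a 2x2 matrix given as two rows."""
    (a, b), (c, d) = j
    return a * d - b * c


def is_usable_start(x1, x2, threshold=1e-12):
    """True when the Jacobian at the starting point is safely non-singular."""
    return abs(determinant(example_jacobian(x1, x2))) > threshold


print(is_usable_start(0.0, 0.0))  # first Jacobian row is all zeros at the origin
print(is_usable_start(1.0, 1.0))
```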

Asymptotic divergence to infinity

Even when a system of equations is known to have a solution, the iterations may diverge toward infinity. This is more likely to happen with a large number of equations. The condition can be mitigated or prevented by selecting a good starting point for $X_{0}$ . In addition, the following steps may further mitigate divergence.

Limit the entries of the linear solution, the $C$ vector, to a small range that is still large enough to converge in a reasonable number of iterations.
Limit the $X$ vector at each iteration to known limits for each entry $x_{i}$ .
Scale down the $C$ vector entries slightly to slow down the convergence. Slow convergences are less likely to diverge.
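
The three safeguards listed above can be combined in a single update step, sketched below for a two-equation system. The helper names, the damping factor of 0.8, the step limit of 0.5, and the decoupled test problem are all assumptions chosen for illustration; suitable values depend on the application.

```python
def limit_step(c, damping=0.8, max_step=0.5):
    """Scale the correction vector, then clip each entry to [-max_step, max_step]."""
    return [max(-max_step, min(max_step, damping * ci)) for ci in c]


def clamp(x, lower, upper):
    """Keep every entry of X inside its known limits."""
    return [max(lo, min(hi, xi)) for xi, lo, hi in zip(x, lower, upper)]


def guarded_newton_2d(f, jac, x0, y, lower, upper, tol=1e-8, max_iter=200):
    """Newton iteration for two equations with damped, clipped, bounded steps."""
    x = clamp(x0, lower, upper)
    for iteration in range(max_iter):
        r1, r2 = (fi - yi for fi, yi in zip(f(x), y))
        if abs(r1) + abs(r2) < tol:
            return x, iteration
        (a, b), (c, d) = jac(x)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        # Full Newton correction from Cramer's rule, then the safeguards.
        full_step = [(r1 * d - b * r2) / det, (a * r2 - c * r1) / det]
        step = limit_step(full_step)
        x = clamp([xi - si for xi, si in zip(x, step)], lower, upper)
    raise RuntimeError("no convergence within max_iter iterations")


# Hypothetical decoupled test problem: x1^2 = 4 and x2 = 1, started at (3, 0).
solution, count = guarded_newton_2d(
    lambda v: [v[0] * v[0], v[1]],
    lambda v: [[2 * v[0], 0.0], [0.0, 1.0]],
    [3.0, 0.0], [4.0, 1.0], [-5.0, -5.0], [5.0, 5.0],
)
print(solution, count)
```

Damping trades speed for caution: with a fixed factor below 1, each step near the solution removes only a roughly constant fraction of the remaining error, so more iterations are needed than with full Newton steps.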

Slow convergence

In time-sensitive applications, convergence speed is important because slow convergence can have detrimental effects on the application being supported. Convergence time can be minimized through the following steps.

Good selection of the initial $X_{0}$ starting point is very important in minimizing the number of iterations required for an acceptable convergence error. In the example above, had the initial $X_{0}$ been chosen to be (2,2) instead of (1,1), seven iterations would have been required instead of four.
Limit the $X$ vector at each iteration to known limits for each entry $x_{i}$ . The $X$ vector does not have to diverge to slow things down. If no limit has been placed on the vector, or the limit is too large, the iterations may spend too much time recomputing large values instead of converging.
Scale down the $C$ vector entries slightly. Smaller steps may help prevent the iterations from jumping around and taking too long to converge.
Select a convergence error tolerance as large as possible that still meets the application requirements. For example, had an error tolerance of $10^{-15}$ been chosen for the example above, six iterations would have been required, as opposed to the four needed to converge to an error of $10^{-3}$ . The additional two iterations may be acceptable for high-precision applications, but would be wasted in applications that need only low precision.
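
The effect of the error tolerance on the iteration count can be explored with a small counting routine for the example system. The sketch below is an assumption-laden illustration: the function names are invented here, the tolerance is tested on the sum of absolute residuals before each update, and the counts it reports depend on that choice.

```python
import math

Y = (2.0, 3.0)  # vector of known values for the example


def residual(x1, x2):
    """F(X) - Y for the example system."""
    return (5 * x1 ** 2 + x1 * x2 ** 2 + math.sin(2 * x2) ** 2 - Y[0],
            math.exp(2 * x1 - x2) + 4 * x2 - Y[1])


def newton_step(x1, x2):
    """One undamped Newton update using the analytic Jacobian."""
    r1, r2 = residual(x1, x2)
    e = math.exp(2 * x1 - x2)
    a, b = 10 * x1 + x2 ** 2, 2 * x1 * x2 + 4 * math.sin(2 * x2) * math.cos(2 * x2)
    c, d = 2 * e, 4 - e
    det = a * d - b * c
    return x1 - (r1 * d - b * r2) / det, x2 - (a * r2 - c * r1) / det


def iterations_needed(x1, x2, tol, max_iter=50):
    """Count the updates applied before sum |F(X) - Y| drops below tol."""
    for count in range(max_iter):
        r1, r2 = residual(x1, x2)
        if abs(r1) + abs(r2) < tol:
            return count
        x1, x2 = newton_step(x1, x2)
    raise RuntimeError("no convergence within max_iter iterations")


for tolerance in (1e-3, 1e-6, 1e-12):
    print(tolerance, iterations_needed(1.0, 1.0, tolerance))
```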

Ensure a solution exists

It is much easier to determine whether a solution exists for a single equation. For example, $X^{2}=4$ has an obvious solution (2), while $X^{2}=-4$ obviously has no solution in the set of real numbers. With sets of equations, especially large sets, it is far more difficult to determine whether a solution exists. If a solution does not exist, the iterations will certainly fail, but if a solution does exist, the iterations may still fail. Upfront work may be required to determine whether a solution exists before drawing conclusions from a failed iteration.
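
The two single-equation cases above can be tried with a scalar Newton loop that reports whether it converged. This is a minimal sketch; the function name, tolerance, and iteration cap are assumptions, and a non-converged result alone does not prove that no solution exists.

```python
def newton_scalar(f, dfdx, x, y, tol=1e-10, max_iter=50):
    """Solve f(x) = y by Newton's method; return (x, converged)."""
    for _ in range(max_iter):
        r = f(x) - y
        if abs(r) < tol:
            return x, True
        slope = dfdx(x)
        if slope == 0.0:
            break  # the Newton update is undefined at a zero derivative
        x -= r / slope
    return x, False


def square(x):
    return x * x


def square_slope(x):
    return 2 * x


print(newton_scalar(square, square_slope, 1.0, 4.0))   # a real solution exists
print(newton_scalar(square, square_slope, 1.0, -4.0))  # no real solution exists
```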

Multiple solutions

It is easier to determine that multiple solutions exist for a single equation. The equation $X^{2}=4$ from the preceding section, for example, has the solutions $X=2{\text{ and }}X=-2$ . For sets of equations, especially large sets, this may not be so obvious, and even when it is, it may be more difficult to ensure that convergence takes place at the desired solution. Care should be taken to start with an $X_{0}$ as near as possible to the desired solution, and to place limits on the individual $X$ entries that move the iterations away from undesirable solutions and toward the desired solution(s). The example used above has many solutions.
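
The dependence on the starting point can be seen with a small system that has exactly two solutions. The system below ( $x_{1}^{2}+x_{2}^{2}=4$ with $x_{1}-x_{2}=0$ ) is a hypothetical illustration chosen because its solutions, $(\sqrt{2},\sqrt{2})$ and $(-\sqrt{2},-\sqrt{2})$ , are known in closed form; it is not the example used earlier in the article.

```python
def newton_circle_line(x1, x2, tol=1e-10, max_iter=50):
    """Newton's method for x1^2 + x2^2 = 4 and x1 - x2 = 0."""
    for _ in range(max_iter):
        r1, r2 = x1 * x1 + x2 * x2 - 4.0, x1 - x2
        if abs(r1) + abs(r2) < tol:
            return x1, x2
        # Jacobian rows: [2*x1, 2*x2] and [1, -1].
        a, b, c, d = 2 * x1, 2 * x2, 1.0, -1.0
        det = a * d - b * c
        x1, x2 = x1 - (r1 * d - b * r2) / det, x2 - (a * r2 - c * r1) / det
    raise RuntimeError("no convergence within max_iter iterations")


# Two starting points on opposite sides of the origin reach different solutions.
print(newton_circle_line(1.0, 2.0))
print(newton_circle_line(-1.0, -2.0))
```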

Digital versus continuous derivatives

Derivatives should be calculated from continuous (analytic) functions whenever possible to maximize accuracy and minimize convergence problems. If the function is unknown and continuous derivatives cannot be calculated, digital (finite-difference) derivatives may be used, but care should be taken to maximize accuracy. Use two samples close together, if possible. If that is not possible, such as with a fixed string of data, cubic interpolation is preferred because it retains a defined first derivative. If accuracy is not an issue, linear interpolation may be used, keeping in mind that the first derivative is not defined at the data points, so the next or prior linear segment must be used to estimate the derivative.
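
When the function can be sampled freely, a Jacobian can be approximated from two samples close together in each variable. The sketch below uses central differences and compares the result with the analytic Jacobian of the example system. The function names and the step size `h` are assumptions for illustration; a suitable step size depends on the scale of the variables.

```python
import math


def example_f(x):
    """Left-hand sides of the example equations."""
    x1, x2 = x
    return [5 * x1 ** 2 + x1 * x2 ** 2 + math.sin(2 * x2) ** 2,
            math.exp(2 * x1 - x2) + 4 * x2]


def analytic_jacobian(x):
    """Jacobian from the continuous derivatives."""
    x1, x2 = x
    e = math.exp(2 * x1 - x2)
    return [[10 * x1 + x2 ** 2, 2 * x1 * x2 + 4 * math.sin(2 * x2) * math.cos(2 * x2)],
            [2 * e, 4 - e]]


def numerical_jacobian(func, x, h=1e-6):
    """Central-difference estimate: one column per perturbed variable."""
    n = len(x)
    rows = len(func(x))
    jac = [[0.0] * n for _ in range(rows)]
    for j in range(n):
        forward, backward = list(x), list(x)
        forward[j] += h
        backward[j] -= h
        f_plus, f_minus = func(forward), func(backward)
        for i in range(rows):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return jac


point = [1.0, 1.0]
print(analytic_jacobian(point))
print(numerical_jacobian(example_f, point))
```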

Applications

Shaping the frequency response in filter design, such as constricting a Chebyshev passband ripple to a percentage of the passband.^[4]

References

^ Burden, Richard L.; Faires, J. Douglas; Reynolds, Albert C. (1981). Numerical Analysis (2nd ed.). Boston, MA, United States: Prindle, Weber & Schmidt. pp. 448–452. ISBN 0-87150-314-X.
^ Evans, Gwynne A. (1995). Practical Numerical Analysis. Chichester, West Sussex, England: John Wiley & Sons, Ltd. pp. 30–33. ISBN 0471955353.
^ Demidovich, Boris Pavlovich; Maron, Isaak Abramovich (1981). Computational Mathematics (Third printing ed.). Moscow: MIR Publishers. pp. 460–478. ISBN 9780828507042.
^ Pelz, Dieter (2005). "Microwave Lowpass Filters with a Constricted Equi-Ripple Passband" (PDF). Applied Microwave & Wireless. 13 (7): 28–34.
