Talk:Taylor's theorem/Archive 2


Limitations of Taylor's theorem

Taylor's theorem cannot be applied to a function at a point where the function is not continuous. For example, we cannot apply Taylor's theorem to the step function f(x) = 1 for x ≥ 0, f(x) = −1 for x < 0 at x = 0. — Preceding unsigned comment added by 132.235.44.240 (talk · contribs) 17:06, 11 October 2010
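
For illustration, a minimal numerical sketch of that failure (a Python sketch, with the step function copied from the comment above): not even the first derivative exists at 0, so no Taylor polynomial of any order exists there.

    # The step function is not once differentiable at 0: the symmetric
    # difference quotient at 0 diverges instead of converging.
    def f(x):
        return 1.0 if x >= 0 else -1.0

    for h in (1e-1, 1e-2, 1e-3):
        print(h, (f(h) - f(-h)) / (2 * h))  # equals 1/h, blowing up as h -> 0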

G-Taylor?

(The following discussion has been copied by me from Wikipedia talk:WikiProject Mathematics. Paul August 20:16, 8 December 2010 (UTC))

An IP has repeatedly added a mention of a theorem called "G-Taylor", an apparent generalization of Taylor's theorem, to Taylor's theorem (e.g. [1]). I removed it once as, at first glance, it seemed probably not significant enough (yet?) to be included in that article. Anyone else have an opinion on this? Paul August 18:11, 6 December 2010 (UTC)

The article is published in a vanity press, so it's not appropriate for Wikipedia, even if we ignore the (obvious) WP:COI issues. CRGreathouse (t | c) 18:48, 6 December 2010 (UTC)
Even if it were published in a reliable journal, Taylor's theorem is at the level that it is well covered by textbooks. A Google scholar search for "Taylor's theorem" finds about 8670 matching articles, and even searching for those words in the article titles returns over 100 hits. So obviously there's so much material available that we can't cover every detail of what's known about Taylor's theorem here, only the significant aspects of the subject, and I would want to see some evidence of significance of this result (e.g. high citation counts or coverage in a textbook) rather than mere publication before we include it. Since it's not even reliably published yet, obviously it falls far below that standard. —David Eppstein (talk) 01:19, 9 December 2010 (UTC)

CRGreathouse removed the mention of "G-Taylor"; another IP (or the same editor using a different IP) added back the mention; I removed it again (asking for a discussion on the talk page), and the original IP has restored it again (with no discussion). Do other editors wish to weigh in? Paul August 18:08, 7 December 2010 (UTC)

I've filed an RPP, asking for temporary semi-protection, citing an edit war. ([2]) Fly by Night (talk) 18:31, 7 December 2010 (UTC)

The page has been protected for a week. The protecting admin said that one week would give enough time for the problem to be resolved on the article's talk page. I suggest we carry on this discussion over on the article's talk page. That way, we'll have something in writing, attached to the article itself, in terms of consensus. Fly by Night (talk) 13:52, 8 December 2010 (UTC)

Fine. I will copy the above discussion to Talk:Taylor's theorem. Paul August 20:16, 8 December 2010 (UTC)

(End of copied text)


In addition to myself and CRGreathouse, Algebraist (as well as several IPs: 24.6.3.41, 71.230.169.109, 131.171.70.159, 76.94.218.197 -- the last with the comment "2010 article is not notable") has removed the mention of this theorem from the article. My thinking was that, leaving aside the status of the publisher of the cited article as a "vanity press", on the face of it a 2010 publication is probably too new to be notable enough to warrant mention here. Paul August 20:16, 8 December 2010 (UTC)

I don't think I agree with that reasoning; there are 2010 papers that I would add to articles. But the apparent COI is troubling (look at the other contributions) and the journal is not only non-peer-reviewed but actually a vanity press. That's at least three Wikipedia guidelines violated right there, yes? CRGreathouse (t | c) 18:06, 9 December 2010 (UTC)
COI is simply a red flag, much like the fact that the article was only recently published; neither would rule out the result being significant enough to warrant a mention, but both raise suspicions: the one about the objectivity of the editor's judgement, the other about the lack of time for the result to achieve sufficient notability. I know nothing about the publisher -- from where are you getting that it is a "vanity press"? If so, that would make the source unreliable and preclude inclusion. In any case, David's point above about the lack of "citation counts or coverage in a textbook" is telling. Paul August 18:52, 9 December 2010 (UTC)
They're a print on demand publisher, a more respectable term for a vanity press. Also, the "fast review process" in the publisher's description of their journals is a big red flag for "not seriously peer reviewed". —David Eppstein (talk) 19:04, 9 December 2010 (UTC)
Thanks David. Paul August 20:05, 9 December 2010 (UTC)
Precisely. I would add that it appears that their "fast review process" is, in fact, no review at all, peer or otherwise -- but I suppose that isn't central to my point. CRGreathouse (t | c) 20:01, 9 December 2010 (UTC)
By the way -- I have nothing against POD services, I just don't want to confuse them with peer-reviewed journals. CRGreathouse (t | c) 20:02, 9 December 2010 (UTC)
One can follow the link which is still in the article to a statement of the theorem. It is merely Taylor's theorem applied to the composition of f with the inverse of g between the points g(a) and g(b). To say that Taylor's theorem and the Mean Value Theorem are special cases is misleading, as one most likely needs to use these theorems, or analogous results, to prove the "g-Taylor" theorem. Without any evidence that someone finds the identity useful, or that the result is somehow important, I'd say that this particular identity does not merit any mention in the article. 70.22.103.189 (talk) 13:11, 17 December 2010 (UTC)
I agree that "G-Taylor" is out of place here. It is a simple consequence which might be useful in some contexts, but putting it on-par with a such a central principle in real analysis shows lack of perspective, and more importantly, only confuses the readers. Lapasotka (talk) 09:30, 6 April 2011 (UTC)

Assumptions and refinements

It is not mentioned anywhere what is really needed for Taylor's theorem (whatever it is) to be true. In this form the article is mostly about ways of writing out the remainder under additional regularity assumptions on the given function. How about starting with the simplest version, which really is only an iterated version of the definition of the derivative?

Theorem: If the function f is k ≥ 1 times differentiable at the point a, then one has the Taylor expansion

    f(x) = f(a) + f'(a)(x − a) + f''(a)/2! · (x − a)² + ... + f^(k)(a)/k! · (x − a)^k + h_k(x)(x − a)^k,

where h_k is some (not necessarily continuous) function with h_k(x) → 0 as x → a.
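
As a numerical sanity check of the statement, here is a minimal sketch with f = exp, a = 0 and k = 3 (function, point and order chosen purely for illustration):

    import math

    # h_k(x) = (f(x) - T_k(x)) / (x - a)^k should tend to 0 as x -> a.
    def h_k(x, k=3):
        taylor = sum(x**n / math.factorial(n) for n in range(k + 1))
        return (math.exp(x) - taylor) / x**k

    for x in (1e-1, 1e-2, 1e-3):
        print(x, h_k(x))  # decays roughly like x / (k+1)!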

After this one can make further assumptions (such as f is k+1 times continuously differentiable etc.) and explain the different ways of expressing the remainder. Lapasotka (talk) 01:58, 4 April 2011 (UTC)

Hear, hear! This article has long been in the lamentable state of being called a theorem without stating any clear theorem. I've always thought it should be called "Taylor's formula", because that is the only thing everybody seems to agree about what it is. But the version you mention is certainly the one that should be called Taylor's theorem (although I'm somewhat doubtful whether Taylor ever stated anything like this). Marc van Leeuwen (talk) 07:43, 4 April 2011 (UTC)

Article too long and repetitive

Does anyone else think that this article is too long and repetitive? I am about to delete the "Motivation" section, which does not serve any purpose in this form. Also the proofs are too long and should be cut down to maybe one-fifth. My general impression is that there is too much focus on Taylor series, which have their own article, and too little focus on other aspects of Taylor polynomials. For instance, nothing is mentioned about the concept of "k-th order differentiability" of multivariate functions and the interchangeability of mixed partial derivatives. Lapasotka (talk) 21:23, 6 April 2011 (UTC)

I don't think it is too long, but it is too repetitive.
I like the idea of a motivation section first, but as it stands it doesn't seem to accomplish much. I see no reason to include the same equations twice, and I don't think that starting with an example Taylor expansion (of exp(x)) helps much, if at all. A short paragraph of words (without equations) motivating Taylor's theorem would do more. Something along the lines of: Taylor series are a very useful way to approximate functions. Taylor's theorem shows when the Taylor series can be used and how to estimate the error, etc. It is useful in X field of mathematics because of Y... Some of the last paragraph of the lead may fit better here. I know I am destroying this; I am not a mathematician. As a physicist this is the type of thing I would like to know: what does Taylor's theorem do to help us understand and use Taylor series, and why do mathematicians care about Taylor's theorem? (Feynman said that you don't know physics unless you can explain it to a 'barmaid'. Perhaps we can say that a mathematician doesn't know math unless she can explain it to a physicist. At times I certainly feel like a 'barmaid' reading the math articles.)
A suggestion about the proofs is that you isolate them and put them in an automatically hidden proof box (that you click to expand). There is a template around here somewhere for it. I am not a big fan of such boxes, but someone went through a fair amount of trouble to type in the proofs, and there will definitely be people who will want and/or benefit from having the proofs available if and when they want them.
I suspect that you are correct that it has too much focus on Taylor series and duplicates too much of that article. If you get rid of that material, though, I would appreciate it if you make tighter links between the two articles. It is too easy to confuse one with the other. So if you don't make it obvious enough (in all of the appropriate places) that there is another article called Taylor series which covers X, Y, and Z, and that these will only be summarized here, then I guarantee that all your hard work cleaning up the article will soon be undone by someone who wants to add X, Y, and Z, and probably badly.
""k-th order differentiability" of multivariate functions and interchangeability of mixed partial derivatives" seems like something that should be discussed in detail in another article like multivariable calculus or partial derivatives. A summary and a link to the appropriate article would be greatly appreciated.
Finally, anything that you can do to either write out the math shorthand in English words or eliminate it when not needed would be greatly appreciated. I think I know some of it, such as f: R -> C (is that a function of a real number that returns a complex number, or is it the other way around, or something else?) and 'a member of', but I have a hard time remembering the different symbols for the different types of 'numbers'. I know that R is real and R^n is a real space in n dimensions (or close enough, I hope), but the others...
Sorry for the length of this response, but I hope that it helps. TStein (talk) 22:37, 6 April 2011 (UTC)

Thank you for the very constructive response! It is especially useful to hear opinions of people working in slightly different fields. Could you come up with a good short example of Taylor's theorem (perhaps only in words) from intermediate-level physics? That would make the motivation section more substantial. Another great topic for the motivation section would be the curvature of plane curves. There are also some examples from numerics, but they are probably too technical to be appreciated by the general audience possibly visiting this page. BTW, what I referred to with the interchangeability of mixed partial derivatives is the following. If you look at Taylor's theorem (without estimates) carefully, you might appreciate it as an iterated version (in the case n = m = 1) of the following general definition.

The derivative of a mapping F: R^n → R^m at a ∈ R^n is a linear mapping (an m×n matrix) A: R^n → R^m, if there exists a function h: R^n → R^m such that

    F(x) = F(a) + A(x − a) + h(x)|x − a|, where h(x) → 0 as x → a.
In this sense Taylor's theorem is a truly fundamental result which explains what it means for a function to be "k times differentiable". (In jargon, it characterizes the existence of higher order derivatives.) Of course these kinds of statements are shot down within milliseconds as POV inside the article, but at least the connection is too intimate (and perhaps not obvious enough) to be left out of the discussion. As for the mixed partial derivatives, they can always be interchanged if the function is k times differentiable in this iterated sense, because then they correspond to the xy- and yx-terms in the Taylor polynomial, which are the same by the second part of Taylor's theorem. Lapasotka (talk) 09:46, 7 April 2011 (UTC)
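
A minimal numerical sketch of the definition quoted above; the mapping F, the point a and the approach direction below are chosen purely for illustration:

    import numpy as np

    # F: R^2 -> R^2, F(x, y) = (x*y, x + y^2); A is its Jacobian at a = (1, 1).
    def F(v):
        x, y = v
        return np.array([x * y, x + y**2])

    a = np.array([1.0, 1.0])
    A = np.array([[1.0, 1.0],
                  [1.0, 2.0]])

    for t in (1e-1, 1e-2, 1e-3):
        x = a + t * np.array([1.0, -2.0])
        h = (F(x) - F(a) - A @ (x - a)) / np.linalg.norm(x - a)
        print(t, h)  # h(x) -> 0 as x -> a, as the definition requires
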

I cannot think of an example where Taylor's theorem is used in intermediate physics, except implicitly where the Taylor series is used, and even then it is usually the first term that doesn't cancel out. The Taylor expansion is used extensively in geometrical optics via the small angle approximation that sin(theta) = theta and cos(theta) = 1. The cubic term in the sine is used for aberrations. Outside of optics the most frequent use is to show that two expressions are equivalent for small deviations from a number (usually zero). An example of this is showing that the relativistic expression for the kinetic energy agrees with the classical form of the kinetic energy for speeds that are 'small' compared to c. Another example, from electrostatics, is that for small enough lengths the electric field of a line charge reduces to that of a point charge. Another example of an expansion is using the generating function of the Legendre polynomials to expand the electric potential of an arbitrarily shaped charge distribution in a multipole expansion.
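
A minimal sketch of the kinetic-energy example just mentioned (units and speeds chosen purely for illustration): the ratio of the relativistic kinetic energy to its second-order Taylor term (1/2)mv² tends to 1 as v/c → 0.

    # Relativistic kinetic energy m*c^2*(gamma - 1) versus (1/2)*m*v^2.
    m, c = 1.0, 299792458.0

    for v in (0.3 * c, 0.03 * c, 0.003 * c):
        gamma = 1.0 / (1.0 - (v / c) ** 2) ** 0.5
        print(v / c, m * c**2 * (gamma - 1.0) / (0.5 * m * v**2))  # -> 1
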
I don't think that any of these examples are directly related to Taylor's theorem. A physicist coming to this page will either do so by accident (not knowing that Taylor series exist) or because they are interested in the finer details of why, how, and what are the limitations of the Taylor series.
I am sorry, but I was unable to strictly follow the mathematical argument in your response; I think that is because mathematicians and physicists speak different mathematical dialects. I know enough that I think I see what you are trying to say, and with a little time I could translate. I think the idea that the Taylor series can be used to define when something is differentiable is quite interesting. Although that leads to the question of whether expanding the function in other basis sets (like the orthogonal Legendre polynomials over the range -1 to 1, or sines and cosines like the Fourier expansion) also leads to similar yet different definitions of differentiability. I know the Fourier series in particular can 'handle' a finite number of discontinuities. TStein (talk) 17:55, 7 April 2011 (UTC)

Kinetic energy in special relativity is a good example from the physics side, and maybe the physical pendulum is another good one, since it is not too Star Trek. To be precise about the "characterization of higher differentiability" in the single variable case, what I mean is

Theorem. A function f: R → R is k times differentiable at a ∈ R if and only if there exists a polynomial p of degree at most k such that

    f(x) = p(x) + h(x)(x − a)^k, where h(x) → 0 as x → a,

and then p is exactly the k-th order Taylor polynomial of f at a.

This is almost but not quite the traditional Taylor's theorem. It is interesting that you mentioned the Fourier transform in this context. You might want to take a look at Bessel potential spaces in Sobolev spaces. Intuitively, they characterize differentiability of functions by the decay rate of the Fourier coefficients as the frequency tends to infinity. Lapasotka (talk) 19:10, 7 April 2011 (UTC)

I'd just like to put in my point of view, which is that there is absolutely no relation between Taylor's theorem and Taylor series, and suggesting so is comforting the reader in a very easily acquired misunderstanding. Taylor's theorem is about quantifying to which extent the polynomial with n-th order contact to a function at a approximates that function near a; it trivially reduces to a statement about the possible growth near a of a function with n-th order contact to the zero function. The terms of the Taylor expansion are just smoke to distract from the essential point, the remainder term. Taylor series on the other hand are a way to organise the derivatives of an infinitely differentiable function at a into a (formal) power series. No regularity requirements on the function at all will cause the Taylor series to converge, or, if it does converge, to converge to the function (outside the point a); that is just plain false. The only thing one can say is that if any power series converges to the function (i.e., it is analytic, which is not a regularity condition for a function of a real variable), it must be its Taylor series; this is a consequence of a trivial computation of derivatives of convergent power series at a, not of Taylor's theorem. In particular Taylor's theorem cannot be construed to imply the quality of "approximation by Taylor series" outside a, except for analytic functions where that is a tautology (and the "approximation" actually an equality). That important examples of functions turn out to be analytic is somewhat of a mystery (given their sparseness among C-infinity functions), for which more or less philosophical explanations can be given, but this is outside the scope of Taylor's theorem, in any form. I've said this strongly to make the point clear; don't try to make this article say something it should not. This being said, I am quite sympathetic to the improvements that are being made to this article. Marc van Leeuwen (talk) 08:33, 8 April 2011 (UTC)

Thank you Marc van Leeuwen! In my second last edit I rewrote the section "Relationship to analyticity" before reading this comment. (I suppose we were typing at the same time.) I tried to drive your point home, though less eloquently :) Talking about "the Mystery", I suppose the reason is that "most important examples of functions" arise as solutions of differential equations with polynomial coefficients, and hence are analytic. Lapasotka (talk) 10:07, 8 April 2011 (UTC)

Proofs

I added a proof for the actual Taylor's theorem without any assumptions on differentiability outside the point a. It is kind of brute force, but the older and nicer proofs had another serious flaw -- they didn't prove the theorem. I am not very experienced with Wikipedia's markup language, so the current version has some typographical errors in addition to the mathematical ones. I will try to leave this page alone for a while and let other people do their share, but I will read the talk page. Lapasotka (talk) 21:35, 8 April 2011 (UTC)

Mean value form

There have been a large number of recent edits by HasnaaNJITWILL. Though clearly in good faith, they do not seem to improve the article very much as is. I don't think mentioning the mean value theorem in the lead as is done now, even with a citation from a book, is understandable to readers. Also, the section 1.2 "Another expression..." would seem more in place as a remark under "mean-value forms of the remainder" above. It bothers me that it refers to the n=0 form of Taylor's theorem, since there is no such case (it starts at once differentiable). Maybe the mean-value forms are valid for k=0, but the text does not say so. For now I won't undo the sequence of edits (which would be the easiest way to tidy up, but I don't want to undo all the first edits of the new User:HasnaaNJITWILL); still, these issues should be addressed rapidly. Marc van Leeuwen (talk) 07:31, 11 April 2011 (UTC)

I strongly agree. I sent a message about the Show preview button on his user talk page and noticed that there were no welcoming messages at that point, so after your message I added the standard welcoming message with helpful links. I have put considerable effort into cleaning up this page recently, but I will follow your example and let the edits by User:HasnaaNJITWILL stay for a little while. I also have another argument against forcing the POV that "Taylor's theorem can be regarded as the generalization of the mean value theorem to higher order derivatives." Namely, while this is true in the one-dimensional case, the whole mean value theorem is blatantly false if the function is either multivariate or vector-valued. (Use your geometric intuition -- this is not a "technical issue".) Saying such things in calculus books is bad pedagogy, since many calculus students also study and possibly apply multivariate calculus in their future work. Taking it to the extreme, these kinds of misconceptions might even cost lives if they stay lurking in the backs of the heads of, say, engineers or nuclear physicists. Lapasotka (talk) 08:17, 11 April 2011 (UTC)

Multivariate case and organization of proofs

User talk:Sławomir Biały, there were a couple of reasons why I changed the proof for the multivariate case. One point is that the one I gave had a clear reference, but the more important one is that the old proof (which you restored) kind of pushed the crucial step under the rug by waving hands with multinomial coefficients, which, by the way, should be binomial coefficients. Also, the indexing of k in the statement of the theorem was meant to be the same as in the single variable case. Note that with k-th order differentiability at a point, the remainder really has the same power of (x−a) as the highest order term in the Taylor polynomial. This holds in both the single variable case and the multivariate case. I think it is better to use a notation consistent with the versions of the theorem with integral etc. expressions for the remainder. As a final point, the organization of the proofs was not very satisfying before you moved all of them to the end, which was an improvement. However, I think the best thing to do is the following.

  • Add a decent motivation section (not like the one that was there a week ago) and iterate integration by parts with the fundamental theorem of calculus there to derive the correct form of the Taylor polynomial. This already proves Taylor's theorem with the remainder in integral form. As far as I can see, all the other proofs need an "educated guess" and an induction argument.
  • Prove the full multivariate Taylor's theorem in detail in a subsection of the multivariate section.
  • Remove the full proof of the single variable Taylor's theorem, since it is quite technical, and give only the mean value forms of the remainder in the single variable section, assuming the main theorem, which is proved later in full generality. This would also be good for pointing out that the mean value versions are intrinsically single variable results, and do not really generalize.

I am really glad to see you taking interest in this page and I am eager to hear your comments. Let's finally fix this mess of an article. Lapasotka (talk) 19:38, 11 April 2011 (UTC)

Our objective isn't really to give complete proofs of every result (Wikipedia:WikiProject Mathematics/Proofs). I think that it is better to give the proofs that have an important highlight for understanding the subject: thus the Cauchy mean value theorem is important for one proof, integration by parts is important for another, restricting to a line is important for yet another one. These are somehow "iconic" proofs in the subject, and so they are important for a complete encyclopedia article. Having a proof of the general version of Taylor's theorem currently stated in the article seems to me to be much less so. I would be happy to see it removed from the article, but I don't care very strongly about it.
As for the specific point about the proof in several variables, it's true there is some hand-waving in using the multinomial coefficients, but generally we prefer short, possibly incomplete, proofs that convey the main idea, rather than insisting on proofs that hammer out every detail at the expense of concealing the main ideas. The main idea here is a simple one: apply the one variable version of Taylor's theorem on line segments, and I think that needs to be emphasized above other considerations. The details then follow by a routine turn of the crank. As I see it, that's the essence of the proof. Without a doubt, there are also sources for this proof (for instance, Hoermander, The analysis of linear partial differential operators, Vol. I, p. 12–13). A similar technique will also give the mean value form of the remainder. This seems to be emphasized by more elementary sources (e.g., Apostol).
A decent motivation section is needed. What I think would also be helpful is a good example. I have taken some steps to change the emphasis of the existing example from estimating the value of e to approximating the function on a whole interval to a desired accuracy, since the statement of the theorem (and the estimates of the remainder) are more directly relevant to that case. Sławomir Biały (talk) 20:33, 11 April 2011 (UTC)

The new proof from the pseudodiff book I happened to have at hand also reeked of line segments from miles away, but never mind. Somehow I feel that the proof of the general multivariate case might serve many purposes, though. For example, it would illustrate the differentiability of partial derivatives at a point and the interchangeability of mixed derivatives, which is quite central to the understanding of the theorem itself.

What I meant by "mean value versions being one-dimensional results" is that all there is is already there in one dimension. They generalize to multivariate functions only by pulling back over paths. (Perhaps this technique should be mentioned explicitly?) Even then one evaluates the directed derivative along the path, which is somehow a bit of a turn-off. For the vector valued case (to be added some day) the mean value versions actually fail to be true, since the mean values for different components can occur at different points. The integral version is one-dimensional in a less severe sense, since it at least holds for the vector valued case. The general Taylor's theorem, however, has considerably more content in several variables. Lapasotka (talk) 21:45, 11 April 2011 (UTC)

I agree that your proof ultimately relied on the same technique, but approached the matter from the other side as it were, which would make it much harder I think for someone not already familiar with the technique to guess where this function had come from. To an uninitiated reader, it will seem to have been pulled out of a hat. Sławomir Biały (talk) 22:04, 11 April 2011 (UTC)

Estimates of the remainder

I'm puzzled why the section "Estimates of the remainder" was removed. Originally, some of it was absorbed into the "Relation with analytic functions" section. But it seems to me quite important for numerical applications to have at least some discussion of how the remainder is estimated. This can be found, for instance, in just about any calculus textbook that discusses Taylor's theorem. I'm in the process of rewriting this, but I'd like to understand the reasons for its removal as well. Sławomir Biały (talk) 21:10, 11 April 2011 (UTC)

I've put it back in, with a few changes. Obviously from a theoretical point of view, not much is gained by being able to obtain concrete numerical estimates of the remainder, and so from this perspective it is redundant with the various explicit forms of the remainder. But from a practical point of view, this can be useful sometimes, so it belongs there. Rather than removing it, it might be better to edit it until it is more satisfactory. Sławomir Biały (talk) 21:32, 11 April 2011 (UTC)
Well, this estimate is a kind of trivial consequence of Lagrange's form of the remainder, and I was about to include it there. I have a problem with the nomenclature, though. Isn't the actual Cauchy's estimate stronger than this one and applicable only to analytic functions? Lapasotka (talk) 21:45, 11 April 2011 (UTC)
Your last edit on the uniform estimate was a great improvement. Thanks. Lapasotka (talk) 21:55, 11 April 2011 (UTC)
(ec) I left the dubious moniker out of the revised version. Yeah, it's a trivial consequence of the Lagrange form, but one that is used so often in basic applications that I think it needs to get special emphasis. Sławomir Biały (talk) 21:59, 11 April 2011 (UTC)

Also why was the asymptotic notation removed? That seemed to me to be a worthwhile addition. Sławomir Biały (talk) 21:14, 11 April 2011 (UTC)

The big-O notation turned the estimate it tried to re-express into something weaker. Namely, the original statement is a uniform estimate in a neighborhood of a, and almost exactly the (possibly misnamed?) "Cauchy's estimate", but one power weaker, since one less derivative was assumed. The little-o notation was in principle fine, but I don't see the point of writing it out, because the line above it was precisely its definition. Those who know little-o see it immediately, and those who don't lose their focus. Lapasotka (talk) 21:45, 11 April 2011 (UTC)
Allow me to interject a demand for an explanation. It was me who introduced the big-O estimate while also increasing the exponent to k+1, and putting in the preceding little-o for comparison. The main point is that the big-O estimate (or its equivalent) with exponent k is silly, because it is immediately implied by the little-o estimate (little-o always implies big-O, and yes, both involve a uniform estimate in an unspecified neighbourhood). So it seemed to me the whole point of refining the estimate is pushing little-o for k to big-O for k+1. Now I see the exponent is brought back to k for the latter estimate. To me this makes the "considerably stronger" estimate trivially implied by the estimate given before. It really looks extremely silly to me right now. What am I missing? Marc van Leeuwen (talk) 05:27, 12 April 2011 (UTC)
The big-O notation is a kind of uniform estimate, but in an unspecified neighborhood. In this context it reads

    R_k(x) = O(|x − a|^k) as x → a.

This is a true statement, but weaker than the one it was supposed to re-express, since the introduction of big-O loses the information on which neighborhood of a the estimate is valid. The actual statement is the following, for comparison:

    there exists C > 0 such that |R_k(x)| ≤ C|x − a|^k for all x with |x − a| < r.

The little-o in this context reads as

    R_k(x) = o(|x − a|^k) as x → a, that is, R_k(x)/(x − a)^k → 0 as x → a,

which corresponds exactly to the limiting behavior of the remainder in the "crude version" of Taylor's theorem. Perhaps the little-o should be restored, as I said above. I hope the situation doesn't look silly after this explanation. Lapasotka (talk) 06:07, 12 April 2011 (UTC)
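
To make the distinction concrete, a minimal sketch with f = sin, a = 0 and k = 3 (all chosen purely for illustration):

    import math

    # R_3(x) = sin(x) - (x - x^3/6); little-o says |R_3(x)|/|x|^3 -> 0 as
    # x -> 0, while the uniform statement exhibits one constant C valid on
    # a whole interval, here (0, 1].
    def ratio(x):
        return abs(math.sin(x) - (x - x**3 / 6)) / abs(x) ** 3

    print(ratio(1e-1), ratio(1e-3))                    # tends to 0: little-o
    print(max(ratio(i / 100) for i in range(1, 101)))  # a concrete C on (0, 1]
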
I remain confused. There may be some difficulty for k=1, but that should not be the case to concentrate on, so assume k≥2. Now in order to be k times differentiable at a, our function needs to be k−1 times differentiable on an open interval containing a, since otherwise there are insufficient values available to even define a k-th derivative at a. So in particular our function and the remainder term are continuous on that open neighbourhood of a, and so is hk(x)=Rk(x)/(xa)k (the only potential discontinuity is at a, but that is controlled by the "crude version" of the Taylor's theorem). Since continuous implies bounded on any closed subinterval of that open interval, one gets "there exists such that, for all x in this closed interval, " for free. So the "considerable stronger estimate" is in fact weaker. The "open to closed-subinterval" consideration is not vain, since C-infinity on the open interval , yet no constant C can exist (for any estimate at all) for the same interval (which shows the current statement in the article is actually false; your above formulation is even worse, since you say "for any r, but the function need not even be defined for large r). I think you should think about this. The above reasoning mainly serves to show that there is no point trying to push an asymptotic estimate to a concrete interval, since the "improved" form can either be obtained by increasing the constant under compactness consideration, or is simply out of reach (for open intervals). For me the general estimates should be asymptotic ones, and then the concrete forms (mean-value, integral) can of course remain as they are. Marc van Leeuwen (talk) 07:20, 12 April 2011 (UTC)
There were some edits (not by myself) which might have obscured the original meaning. I rewrote the paragraph, re-inserted the little-o and big-O, and added a note that an estimate with a "better exponent" is coming up. Nevertheless, there is a difference between the current one and the one in the section "Estimates for the remainder", which was added by User:Sławomir Biały. The first one says that
    if f is k times continuously differentiable on the closed interval [a−r, a+r], then |R_k(x)| ≤ C|x−a|^k,
and the second one says that
    if f is k+1 times continuously differentiable on the closed interval [a−r, a+r], then |R_k(x)| ≤ C|x−a|^(k+1).
These estimates have different applications. The second one is of course more sophisticated, but sometimes one actually wants an estimate with the highest possible power of |x−a|. My response on the talk page assumed some benevolent interpretation and left implicit the restriction on r which makes f k times continuously differentiable on the closed interval [a−r, a+r]. Please read the current version and let me know if you still find it questionable. I am sorry about removing the asymptotic notation. At the moment it seemed like a good idea (see the discussion above). Lapasotka (talk) 09:43, 12 April 2011 (UTC)
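
For comparison, a sketch of the second estimate with f = sin, a = 0, k = 3, r = 1 (again purely illustrative): since |f''''| ≤ 1 on [−1, 1], Lagrange's form gives |R_3(x)| ≤ |x|^4/4! there.

    import math

    def R3(x):
        return math.sin(x) - (x - x**3 / 6)

    # |R_3(x)| * 4! / |x|^4 should never exceed 1 on [-1, 1] \ {0}.
    worst = max(abs(R3(i / 100)) * 24 / abs(i / 100) ** 4
                for i in range(-100, 101) if i != 0)
    print(worst)  # about 0.2 here, safely within the bound C = 1/4!
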
Now I'm even more at a loss as to where this is going. I'll not try to argue, but just mention two reasons why I find the current "considerable improvement" (which is the same as the "first one" just cited) silly (to say it politely): (1) the hypothesis "k times continuously differentiable" is much too strong; just continuity suffices to get from the little-o estimate to the existence of C; (2) the conclusion remains valid if one adds a random multiple of (x−a)^k to the remainder term, which trivially satisfies the same estimate; for instance one could make the Taylor polynomial one term shorter, incorporating its final term into the error term instead. It is beyond me what could be the use of such an estimate for R_k, or what you mean by "sometimes one actually wants to estimate the remainder of the highest possible order". Order k+1 is higher than order k, so that is an argument in favour of the "second one", it seems to me. Couldn't we just drop the vague (and unsourced) stuff about considerable improvement and go to the precise formulae right away? Marc van Leeuwen (talk) 14:38, 12 April 2011 (UTC)
Something is definitely fishy, since the function h is automatically continuous (in a neighborhood) for k>1, possibly after modifying the value at a itself. The thread following this one is related. I won't have time to sort this out until much later today, though. Sławomir Biały (talk) 15:02, 12 April 2011 (UTC)
I see. I was thinking primarily of the little-o, since this is a helpful mnemonic to someone with some familiarity with asymptotic notation, but who may not immediately make the connection. This is helpful to someone knowledgeable giving the article a quick scan, in my opinion. Perhaps the little-o should be restored and the big-O left out? Sławomir Biały (talk) 21:59, 11 April 2011 (UTC)
That is a good idea. I am currently writing a proof of the general multivariate Taylor's theorem. (Too bad I don't know where to find one right now.) It should replace the single variable version and be placed in a "hidden box" or whatever the nice gadget for ugly content is called. BTW, do you have any good ideas for the complex analysis section? (I think Taylor's theorem is useless in complex analysis, which is full of way stronger tools, which reduces my interest in elaborating on it.) How about vector-valued Taylor polynomials? I think the fact that the mean value forms of the remainder fail there should be mentioned somewhere. Lapasotka (talk) 22:18, 11 April 2011 (UTC)
I would strongly advise against putting a proof of the general multivariate theorem in. It seems to me that the article already pushes our guidelines on proofs too far. An encyclopedia article is not supposed to replace a textbook or monograph on a subject. However, having a reference to a general proof would be helpful I think. Sławomir Biały (talk) 22:43, 11 April 2011 (UTC)

On statement(s) of the theorem

Good job finding the reference! Too bad it doesn't mention the converse, which is easier, but very enlightening. Do you think we could state the converse in a "matter of fact" kind of way right after the theorem without running into trouble with OR? I am pretty sure there is a reference for a version with the converse. Is the converse mentioned in your reference at all? If it is, maybe we can even add it to the statement of the theorem itself and make a more precise citation? Lapasotka (talk) 10:47, 12 April 2011 (UTC)

The converse wouldn't be true without added conditions on h, specifically that h needs to be k-1 times differentiable in a punctured neighborhood of a. I'd like to see a reference for that though. Sławomir Biały (talk) 10:52, 12 April 2011 (UTC)

Here is the proof for the "converse" from the proof you swithced to the l'Hopital -based one:

  • To prove the converse, assume that f has an expansion of the form

    f(x) = P_{k+1}(x) + h_{k+1}(x)(x − a)^{k+1}

for some (k+1)-th order polynomial P_{k+1} and some function h_{k+1} which tends to zero as x tends to a. Using the (k−1)-th order Taylor expansion of f we see that

    f(x) = P_{k−1}(x) + h_k(x)(x − a)^k

for some function h_k. Using the limits of h_k and h_{k+1} as x approaches a we see that

    (h_k(x) − h_k(a))/(x − a) has a limit as x → a,

so h_k is differentiable at a. Now we see from the (k−1)-th order Taylor expansion that f is also k+1 times differentiable at a.

Do you find a mistake here? Lapasotka (talk) 11:34, 12 April 2011 (UTC)

There are obvious counterexamples. Just take h to be some discontinuous function in a neighborhood of a that is o(1) as x tends to a. Sławomir Biały (talk) 12:07, 12 April 2011 (UTC)

So h→0 as x→a. Discontinuity is not a problem as long as one has h(a)=0 (in which case h is continuous at the point a). I hoped this assumption was obvious enough not to need stating explicitly. Note also that the claim is about differentiability at a point, not in some neighborhood around it. Do you have a more substantial counterexample, or can you find an error in the proof? Of course a reference would be best. Lapasotka (talk) 12:57, 12 April 2011 (UTC)

There are lots of functions which are differentiable at a point and not continuous in any of its neighborhoods. For example,

    h(x) = x² for rational x, h(x) = 0 for irrational x,

at the point x = 0. Lapasotka (talk) 13:04, 12 April 2011 (UTC)
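
Assuming this reconstructed example, differentiability at 0 is a one-line check:

    \[
      \left|\frac{h(x) - h(0)}{x - 0}\right| \;\le\; \frac{x^2}{|x|} \;=\; |x| \;\longrightarrow\; 0 \qquad (x \to 0),
    \]

so h'(0) = 0, while h is discontinuous at every x ≠ 0.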

Too bad there is no mention of the differential expansion of a function in Wikipedia. At least in my old calculus books (unfortunately not written in English) there was always a theorem of the form

A function f: R → R has a derivative at a ∈ R if and only if it has a differential expansion at a, that is, there exist a function h: R → R and a constant c such that

    f(x) = f(a) + c(x − a) + h(x)(x − a), where h(x) → 0 as x → a,

and if this is the case, then c = f'(a).

This should be easy enough for anyone editing this page to verify, and already contains the bit you are worried about. Taylor's theorem says the same thing for higher order derivatives. This is also good material for the Motivation section. Lapasotka (talk) 13:27, 12 April 2011 (UTC)
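
Indeed, the verification is a one-line rearrangement of the expansion above: for x ≠ a,

    \[
      h(x) \;=\; \frac{f(x) - f(a)}{x - a} - c,
    \]

so h(x) → 0 as x → a precisely when the difference quotient tends to c, i.e. when f'(a) exists and equals c.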

There are no functions that are discontinuous in every neighborhood of a point, but twice differentiable at the point. The definition of higher differentiability at a point requires existence of lower-order derivatives in a neighborhood of the point. Sławomir Biały (talk) 14:19, 12 April 2011 (UTC)

Yes, you are right, and the missing part of the argument was to show that the lower order derivatives of f exist near a. The proof only shows that f is "kind of k times differentiable" at a, in the same way that

    g(x) = x³ for rational x, g(x) = 0 for irrational x

is "kind of twice differentiable" at 0. In my head this g was "differentiable enough" at 0. This leads to a dilemma: if h from the example above is regarded as once differentiable at 0, why shouldn't g be regarded as twice differentiable at 0? Of course this discrepancy arises from the iterated definition of higher order differentiability. Perhaps there is a canonical way (other than the Taylor expansion) of defining higher order differentiability differently, so that g becomes as twice differentiable as h is once differentiable at 0. My guess is that it can be achieved using a single limit with higher order differences. Does anyone know about such a calculus and about any references? If there are any, this matter should probably be mentioned in some subsection. Otherwise, for the purposes of this Wikipedia page, the case is closed for my part and we can just drop the converse statement. Lapasotka (talk) 20:37, 12 April 2011 (UTC)

To expand Sławomir Biały's comment: for the sake of differential calculus, we only define the derivative of a function at interior points of the domain (in fact, one can also consider a function defined on a closed interval and define the derivative at an endpoint; but that is in fact a right or left derivative).
As a consequence, when we say that a function f, defined on some open interval I, admits the k-th derivative at a point x_0 ∈ I, we are also implicitly saying that it has all derivatives of order less than k in a neighbourhood of x_0. This is because f^(k)(x_0) is by definition the derivative at x_0 of the function f^(k−1), so that the latter needs to be defined in a nbd of x_0.
If f admits the k-th derivative at x_0, then it has a polynomial expansion of order k at x_0, meaning that f(x_0 + h) = P(h) + o(|h|^k) as h → 0, where P is a polynomial of degree at most k. For k=1 this is equivalent to the definition of derivability; but it is not as soon as k>1. For instance the function equal to x^8 at rational x and to 0 at irrational x has a polynomial expansion of order 7 at zero, though it is discontinuous at any x ≠ 0, and only differentiable once at x_0 = 0, as in your example.
That said, a nice converse of Taylor's theorem does exist in the setting of k-times continuously differentiable functions; it is indeed a characterization of the class C^k. Precisely: a function f: I → R (I an open interval) is of class C^k if and only if it has at any point x a polynomial expansion of order k varying continuously with respect to the point x, meaning that (for all x ∈ I and all h such that x + h ∈ I)

    f(x + h) = c_0(x) + c_1(x)h + ... + c_k(x)h^k + R(x, h)h^k,

where each coefficient c_j is a continuous function on I, and R is a continuous function of the pair (x, h) such that R(x, 0) = 0 for all x. This is due to Marcinkiewicz and Zygmund (1936). The proof is not difficult and quite elementary; we may include a sketch of it if you like. It generalizes to functions of several variables, and to mappings between Banach spaces too.--pma 21:59, 12 April 2011 (UTC)
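
For the order-7 expansion claimed in that example (with the reconstruction above), the estimate is immediate:

    \[
      |f(x) - 0| \;\le\; |x|^8 \;=\; |x|\cdot|x|^7 \;=\; o(|x|^7) \qquad (x \to 0),
    \]

so the zero polynomial serves as a polynomial expansion of order 7 at 0, although f is discontinuous at every x ≠ 0.
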
This reminds me. A good account of some of the issues to do with Taylor polynomials (I believe including the Marcinkiewicz–Zygmund theorem) can be found in Stromberg's "Introduction to classical real analysis", if memory serves. I don't have a copy, but I will see if I can get it out of the library in the next few days. Sławomir Biały (talk) 00:45, 13 April 2011 (UTC)
Just a side remark. Doing polynomial expansions and the "kind of higher order differentiability at a point" reminds me of Don Knuth's advocacy of introducing calculus without doing limits, using big-O estimates instead. It is described in a "Teach Calculus with Big O" commentary that appeared in the Notices of the AMS, and an extended version can be found on his website as preprint Q171. (Somewhat unfortunately it is just a plain TeX source file, so you need to do it yourself to get nice output.) It is nice reading, though I think the proposal has not been followed up much. Marc van Leeuwen (talk) 05:41, 13 April 2011 (UTC)

I support the proposal of User:PMajer to do the converse in C^k. This should be separate from the single variable section, which, I suppose, is intended to be very elementary. I think it would be appropriate to do it for mappings R^n → R^m. A section on R^n → R^m would also be a nice target link for differential geometry oriented articles. Lapasotka (talk) 08:10, 13 April 2011 (UTC)

The currently stated form of Taylor's theorem in C^k(I) is grossly erroneous. For instance, for I = R, f(x) = |x| and a = 1, it states that f is in C^k(I) for any k, because the function h_k that is zero for non-negative arguments and equal to 2x/(1−x) for negative x is continuous everywhere. The main point that is missed is that it does not suffice to consider a single Taylor polynomial: one needs a family of Taylor polynomials, one at each point of the interval, of which each of the coefficients should vary continuously with the base point, as correctly formulated by pma above. I am unsure though whether this is the proper thing to state at that point of the article, since it involves rather more complicated notions than either the previous or the following statements. So I will remove the erroneous statement shortly, unless the situation is repaired in a satisfactory manner. I would also like to say to Lapasotka, with all due respect for your efforts to improve this article, that it might be time to stop adding things before having studied these matters in detail in the literature, and making sure you understand the various issues at stake. Maybe you should just leave the article for some time, and allow editing by people with a different perspective. Marc van Leeuwen (talk) 09:50, 14 April 2011 (UTC)
I removed the statement. The result is not even true locally in a neighborhood of a, since the function

    h_k(x) = R_k(x)/(x − a)^k

is automatically continuous if f is k times differentiable. Moreover, h_k will automatically be (k−1) times differentiable in a punctured neighborhood of a. Sławomir Biały (talk) 12:08, 14 April 2011 (UTC)
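
Both claims can be read off from the formula (a quick sketch using the h_k just defined): for x ≠ a near a,

    \[
      h_k(x) \;=\; \frac{f(x) - T_k(x)}{(x - a)^k}
    \]

is a quotient of (k−1)-times differentiable functions with nonvanishing denominator, since f itself must be (k−1) times differentiable near a; continuity at a itself follows from the crude form of the theorem, which gives h_k(x) → 0.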

That was sheer sloppiness on my part. I was going to write down the correct formulation by pma. Do you have any objections against including the correct version? Lapasotka (talk) 13:27, 14 April 2011 (UTC)

I think it's not really suitable for this point in the article. If it belongs here at all, it would be in a section at the bottom on generalizations and extensions. Sławomir Biały (talk) 13:48, 14 April 2011 (UTC)

Motivation

That is fine. How about changing the pictures in the motivation to show several approximations for sin(x)? It would be more vivid and the article already has so many instances of the exponential function that someone might wonder whether Taylor's theorem is good for anything else. Lapasotka (talk) 15:22, 14 April 2011 (UTC)

Sounds good. Sławomir Biały (talk) 15:26, 14 April 2011 (UTC)

What was your rationale for discarding the formal explanation of how one arrives at Taylor expansions from the differential expansion? I included it in the motivation to make Taylor's theorem look like a very natural result, one which can easily be derived by hand with a routine computation. The current form kind of hides this very simple mechanism, as I see it, for no particular reason. Of course this is also a matter of taste. Lapasotka (talk) 16:33, 14 April 2011 (UTC)

I don't think the motivation section should emphasize the method of proof. That's going to be a turn-off to most readers who just need the theorem for an application. Sławomir Biały (talk) 12:25, 15 April 2011 (UTC)

I think the latest edit by User:Marc van Leeuwen should be shortened a little bit. Now there is almost more demotivation than motivation. Or should we change the section name ;) ? Lapasotka (talk) 12:19, 15 April 2011 (UTC)

It's probably important to emphasize the limitations of Taylor's theorem somewhere, but doing so before the theorem is even stated seems premature. Sławomir Biały (talk) 12:25, 15 April 2011 (UTC)
I disagree (obviously). Though it is maybe a bit on the long side, I think it is crucially important to understand what Taylor's theorem sets out to do. Somewhere we all want to believe that Taylor polynomials approximate the function value outside the point a (I'm sure this is what Taylor wanted to believe) and only after having killed this hope as an impossible mirage can one start to appreciate what Taylor's theorem does have to say. Marc van Leeuwen (talk) 12:44, 15 April 2011 (UTC)
It's a little strange to have a section asking the reader to note something said and not said in a theorem that hasn't even been stated yet. I can only imagine what a reader who has never encountered Taylor's theorem would make of such a paragraph. Furthermore, I think it is easier to understand what this is saying if it is informed by examples: say 1/(1+x^2) and e^{-1/x^2} for the two scenarios described in the paragraph. But there isn't enough space there to cover any examples in detail. I think it's better pedagogy to state the theorem, and then explain properly what the limitations are. Sławomir Biały (talk) 12:53, 15 April 2011 (UTC)
Well, the motivation section is (and was) already saying things about Taylor's theorem even though it hasn't been stated yet. Like "Taylor's theorem ensures that the quadratic approximation is... a better approximation than the linear approximation". It also says "Similarly, we get still better approximations to f if we use polynomials of higher degree"; although not literally attributed to Taylor's theorem, that is clearly the suggestion. And since that says "better approximation" twice, which in a very natural sense of that term is not true, there is some reason to tone this down a bit. But the main point of the paragraph I added is not that Taylor's theorem does not say "better approximation", but that it cannot say such a thing. That makes sense even if no formulation has been given yet, and alerts the reader to read closely what the theorem does say. Marc van Leeuwen (talk) 15:34, 15 April 2011 (UTC)
Well, it may be appropriate to adjust the wording of the motivation, then. The intuition is that we expect higher order polynomials to give "better" approximations. That's kinda the point of a "motivation" section: to clarify the intuition behind the theorem. Taylor's theorem gives the precise sense in which that is true. It's not surprising that the naive intuition misses some details, but I don't see the value in rubbing this in before the theorem is actually presented. Sławomir Biały (talk) 16:21, 15 April 2011 (UTC)

Relationship to analyticity

I combined the stub section on complex analysis with the subsection "Relationship to analyticity" into a new section. I started writing something on the role of Taylor's theorem in complex analysis (which I believe is minimal), but it needs some more work. I will get back to it after some real life duties. I know there are too many details at this point, but they will be cut down. Lapasotka (talk) 21:08, 14 April 2011 (UTC)

Looks good. Sławomir Biały (talk) 22:58, 14 April 2011 (UTC)

Now the section is at least in some shape. There is one screenful of "theory" which does not exist in Wikipedia in such a brief form. Should there be a mention of the behavior of the Taylor polynomials of the "innocent looking real analytic function"

    f(x) = 1/(1 + x²)

at a = 0 and r → 1, and how it can be understood using complex analysis? Lapasotka (talk) 12:27, 15 April 2011 (UTC)
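
A minimal sketch of the behavior being asked about (degrees and sample points chosen purely for illustration): the partial sums of 1 − x² + x⁴ − ⋯ improve for |x| < 1 but diverge for |x| > 1, which complex analysis explains by the poles of 1/(1 + z²) at z = ±i limiting the radius of convergence to 1.

    # Taylor polynomials of f(x) = 1/(1 + x^2) at a = 0.
    def taylor(x, k):
        return sum((-1) ** n * x ** (2 * n) for n in range(k // 2 + 1))

    for x in (0.5, 0.9, 1.1):
        exact = 1.0 / (1.0 + x * x)
        print(x, [abs(taylor(x, k) - exact) for k in (4, 12, 40)])
    # errors shrink inside |x| < 1 (slowly near 1) but grow without bound
    # for |x| > 1, even though f itself is perfectly smooth on all of R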

The article should discuss how the Taylor approximation (or the error estimates) can get worse in higher order, maybe with the 1/(1+x^2) example compared and contrasted with a function like e^(-1/x^2). I think van Leeuwen's last paragraph could be spun out into a separate section dealing with these cases, maybe right before the analyticity section. What other counterintuitive limitations are there with Taylor's theorem? Sławomir Biały (talk) 12:35, 15 April 2011 (UTC)

Proof of the integral form

It looks like the proof of integral form of the remainder got lost somewhere. I think it had been moved to an earlier version of the "motivation" section. I've put the proof back in the article for now, but we can discuss removing it if that was the actual intention. Sławomir Biały (talk) 12:01, 22 April 2011 (UTC)

I removed the subsection when I wrote the new Motivation section and added the crucial bit of the proof over there. There are quite a few proofs on this page right now and the whole may need some re-thinking. Perhaps collecting the one-variable proofs together in a shorter form and moving them into the last subsection of "Taylor's theorem in one real variable" would be a good idea. They are considerably more elementary than the rest of the article. I also think that a short derivation of the integral form of the remainder should appear in the Motivation instead of the special case k=2, but I didn't want to react to your edit without further thought. It has some elements which I like a lot, but I don't believe that the k=2 case has any qualities that make it deserve to be singled out. Lapasotka (talk) 20:52, 22 April 2011 (UTC)
I think it's essential that the motivation section be proof-free. Our target audience should not be presumed to be interested in proofs, which is also why I think the proofs should be towards the end of the article. The primary application of Taylor's theorem, outside the rarefied world of pure mathematics, is to approximate actual functions and have some reasonable control over the error. The k=2 case is important because it is easiest to explain in that case why, by matching second derivatives, one intuitively expects a "better" approximation. So the reason for emphasizing this case in a "motivation" section seems clear. Most readers can be expected to have a reasonable grasp of the intuitive meaning of the second derivative, although it's less likely that they will have a great deal of competence with integration by parts. The earlier parts of the article should mostly emphasize the theorem and its applications. I agree that the proofs should be shortened and simplified as much as possible. Ideally, we should just be able to report the essential idea of each of the proofs without giving details. Sławomir Biały (talk) 22:14, 22 April 2011 (UTC)

Where is the actual statement of the theorem??

I was skimming through the article hoping for a precise formal statement of the theorem (with either the Lagrange or Cauchy remainder). I saw great writing introducing what the theorem is about. But where is the actual statement? What are the remainders? For example: "for an n-times differentiable function f, f = T + R, where T is the Taylor series up to the n-th term and R = f^(n)(c)·x^n/n!". The reason I post this is that I think a straightforward statement with explicit formulae is what people coming to this page need the most. I understand that Taylor wasn't responsible for the discovery of the remainder term. But I argue that for students in math, the theorem is far from complete without the remainder term. Experts, please help! 冷雾 (talk) 06:24, 7 May 2011 (UTC)

The section "Statement of the theorem" and the one immediately following it "Explicit formulae for the remainder" seem to be what you're after. Sławomir Biały (talk) 10:37, 7 May 2011 (UTC)