Wikipedia talk:Close paraphrasing/Archive 3

From Wikipedia, the free encyclopedia

Improving the example analysis

The analysis given in the "Example" section is unhelpful, IMO, as it reinforces the idea that avoiding close paraphrasing is just about rewording individual sentences. Close paraphrasing isn't just about individual sentences, it's about reproducing a source substantially, but with different words. Instead of analyzing the paragraph sentence by sentence, I think it would be more useful to treat it as a whole and point out the facts that:

  1. It is relying on a single source rather than pulling from multiple sources.
  2. It is reproducing the source in detail rather than summarizing and presenting only the most salient facts.
  3. The structure of the paragraph largely mirrors the presentation of information in the source.
  4. The structure and wording of several sentences are similar to those in the source.

Thoughts? Kaldari (talk) 18:57, 21 December 2012 (UTC)

Ok, those points are all valid. Without replacing the example, we can extend it. One key concept is: rewrite vs summarize. A summary should be shorter, which means, necessarily, that some fact or verbosity has to be excised. -Lexein (talk) 19:30, 21 December 2012 (UTC)
  • An article should present the facts, and cite sources so the reader can verify those facts. That is different from reproducing in different words what the sources say. This essay should make that clearer. But as long as the article sticks to the facts and avoids copying any creative expression, there will normally be no problem. To User:Kaldari's points:
  1. Multiple sources are preferred but not required. With a notable topic they should normally be available. This essay should make that point in the "concepts" section. But I fully agree that the example would be a lot more useful if it showed an attempt using two or three sources. Not sure if I want to volunteer to make that change...
  2. I think I am with User:Lexein on summarizing. If all the information in the source is relevant it should all be reproduced. I sometimes start short bios of people where an excellent source is a short article in an 1836 collection of biographies. There is nothing that should be dropped.
  3. The structure of an article may be natural: "She was born ... high school ... university ... degree ... joined company ... ran for election ... appointed minister ... accused ... trial ... executed." I don't see any reason to swap a sequence like this to try to avoid similarity to the source.
I also dislike the point-by-point analysis, but more because it seems to say "this does not disguise copying" where it should be saying "this copies creative expression". Aymatth2 (talk) 03:30, 31 December 2012 (UTC)

Proposal: Moral rights definition

This is to propose adding a section under "Concepts" after "Creative expression" but before "Wikipedia's guidelines" called "Moral rights".

The "moral rights" of an author are independent of copyright ownership. They include the author's right to control first publication of a work; the author's right to be attributed or to remain anonymous; the author's right for the work to be published without distortion or mutilation. As with copyright, moral rights apply to creative expression but not to mere facts. Editors must respect moral rights to ensure that Wikipedia content can be reused as widely as possible.

Wikipedia editors must not use unpublished work in any way. With published work, editors should attribute each source to the author where the publication names the author, and attribute the source to the publication if it does not name the author. It is sometimes relevant for an article to include a short quotation such as a significant statement made by the subject of the article or a notable comment about the subject. In these cases a verbatim quotation should be given rather than a paraphrase. Quotations should be used very sparingly and must be clearly identified and formatted as defined in MOS:QUOTE.

I think more is needed on the subject of quotations, basically saying they are not to be used just as a way to avoid the effort of extracting and stating the facts, but that belongs in a later section. Comments or suggestions on this proposed addition? Aymatth2 (talk) 02:39, 31 December 2012 (UTC)

  • Since there were no objections, I have made the change. Aymatth2 (talk) 02:19, 9 January 2013 (UTC)

Proposal: Substantial similarity definition

This is to propose modifying the section "Copyright law" to reflect the importance of moral rights, while moving the discussion of substantial similarity to an expanded stand-alone section. The modified section of "Copyright law" would read:

Wikipedia's primary concern is with the legal constraints imposed by copyright law. Close paraphrasing of the creative expression in a non-free copyrighted source is likely to be an infringement of the copyright of the source. In many countries close paraphrasing may be also seen as mutilation or distortion of an author's work, infringing on their moral rights.

The new section of "Substantial similarity", to be placed after the section on "Moral rights", would read:

Depending on the context and extent of the paraphrasing, limited close paraphrase may be permitted under the doctrine of fair use. An author may think they are being original when they write "Charles de Gaulle was a towering statesman", not realizing that many other authors have independently come up with these identical words. Copying or close paraphrasing may thus be accidental. Two reasonable people may disagree on whether a paraphrase of one sentence copies creative expression or only reproduces facts, which are not protected. There is thus an element of subjectivity. The concept of "substantial similarity" is used to weed out these trivial instances.

Paraphrasing rises to the level of copyright infringement when there is substantial similarity between an article and a source. This may exist when the creative expression in an important passage of the source has been closely paraphrased, even if it is a small portion of the source, or when paraphrasing is looser but covers a larger part of the source. A close paraphrase of one sentence from a book may be of low concern while a close paraphrase of one paragraph of a two-paragraph article would be considered a serious violation. Editors must therefore take particular care when writing an article, or a section of an article, that has much the same scope as a single source. The editor must be extra careful in these cases to extract the facts alone and present the facts in plain language, without carrying forward anything that could be considered "creative expression".

Comments or suggestions on this proposed change?

  • Given lack of objection, I have made that change. I also dropped down the heading level of the following section "When is close paraphrase permitted?" since this seems more prescription than concept. Aymatth2 (talk) 01:29, 18 January 2013 (UTC)

Proposal: Quotations

The section on "Brief indirect quotation of non-free text" reads as follows:

(ORIG) If a non-free copyrighted source is being used, it is recommended to use original language and direct quotations, to clearly separate source material from original material. This is in keeping with non-free content policy and guideline. However, brief instances of indirect quotation may be acceptable without quotation marks with in-text attribution. If the text is markedly creative or if content to be duplicated is extensive, direct quotation should be used instead. Extensive instances of indirect quotation are not generally acceptable; even if content is attributed, it can still create copyright problems if the taking is too substantial. To avoid this risk, Wikipedia keeps this—like other non-free content—minimal.

This is to propose clarifying and expanding the section to read as follows:

(ALT1) Limited direct or indirect quotation from a non-free copyrighted source may be considered "fair use" in some circumstances, as discussed in Wikipedia's non-free content policy and guideline. Direct or indirect quotation must have in-text attribution. With direct quotation, editors must clearly distinguish the quoted material from the original text of the article following the guidelines for quotations. Brief instances of indirect quotation may be acceptable without quotation marks, although direct quotation should be used instead if the text is markedly creative or if the duplicated content is more than a few words. Extensive use of direct or indirect quotation from non-free sources is not acceptable. Even if content is attributed, it can still create copyright problems if the taking is too substantial. To avoid this risk, Wikipedia keeps this—like other non-free content—minimal.

Quotation taken from non-free sources, whether direct or indirect, should generally be restricted to statements made by a person discussed in the article, or significant opinions about the subject of the article. Quotation is not an acceptable alternative to extracting facts and presenting them in plain language. Thus:

  • Right: Churchill said, "I have nothing to offer but blood, toil, tears and sweat."[1]
  • Right: The New York Times reviewer found the film pretentious and boring.[2]
  • Wrong: According to Bulgarian Butterflies, "the patient observer may be fortunate enough to glimpse this rare moth flitting along the mossy banks of a woodland stream."[3]

Comments or suggestions? Aymatth2 (talk) 12:41, 19 January 2013 (UTC)

As per the norm you have done a great job in the wording - let alone the courtesy of again informing us of your intent. Moxy (talk) 08:09, 20 January 2013 (UTC)
Very nicely done. As Moxy said. Truthkeeper (talk) 14:09, 20 January 2013 (UTC)
Presented here without comment, Moxy's edit summary: (→‎Proposal: Quotations: Worst Idea I have ever seen here on Wikipedia let alone the internet - you should be blocked). --Lexein (talk) 16:55, 20 January 2013 (UTC)
  • That certainly got my attention when I saw it on my watchlist. :~) Aymatth2 (talk) 17:11, 20 January 2013 (UTC)
:-) every once in a while we must have some fun. Again great job. Moxy (talk) 19:38, 20 January 2013 (UTC)

I like much of it, but I'm afraid it may be a major issue that WP:NFC does not discuss indirect quotations. :) WP:NFC says only this, "Articles and other Wikipedia pages may, in accordance with the guideline, use brief verbatim textual excerpts from copyrighted media, properly attributed or cited to its original source or author, and specifically indicated as direct quotations via quotation marks, <blockquote>, or a similar method." and "Brief quotations of copyrighted text may be used to illustrate a point, establish context, or attribute a point of view or idea. In all cases, a citation is required. Copyrighted text that is used verbatim must be attributed with quotation marks or other standard notation, such as block quotes. Any alterations must be clearly marked, i.e. [brackets] for added text, an ellipsis (...) for removed text, and emphasis noted after the quotation as "(emphasis added)" or "(emphasis in the original)". Extensive quotation of copyrighted text is prohibited. Please see both WP:QUOTE for use and formatting issues in using quotations, and WP:MOSQUOTE for style guidelines related to quoting." That would seem, to me, to make this sentence a problem: "Limited direct or indirect quotation from a non-free copyrighted source may be considered "fair use" in some circumstances, as discussed in Wikipedia's non-free content policy and guideline." WP:C says, "Wikipedia articles may also include quotations, images, or other media under the U.S. Copyright law "fair use" doctrine in accordance with our guidelines for non-free content. In Wikipedia, such "fair use" material should be identified as from an external source by an appropriate method (on the image description page, or history page, as appropriate; quotations should be denoted with quotation marks or block quotation in accordance with Wikipedia's manual of style)." Currently, there is no copyright-related policy that supports the use of indirect quotation. 
--Moonriddengirl (talk) 20:26, 21 January 2013 (UTC)

  • I was a bit unsure about indirect quotation, but carried it forward from the current version of this essay, which has this section called "Brief indirect quotation of non-free text" discussing indirect quotation and saying it is sometimes allowed. That has been there for quite a while. The essay should be corrected. I would prefer to keep the section (a direct quote is an extremely close paraphrase), rename it "Quotation" and word it as follows:

(ALT2) Limited direct quotation from a non-free copyrighted source may be considered "fair use" in some circumstances, as discussed in Wikipedia's non-free content policy and guideline. Direct quotation must have in-text attribution. The article must clearly distinguish the quoted material from the original text of the article following the guidelines for quotations. Indirect quotation of non-free content is not allowed. Extensive use of quotation from non-free sources is not acceptable. Even if content is attributed, it can still create copyright problems if the taking is too substantial. To avoid this risk, Wikipedia keeps this—like other non-free content—minimal.

Quotation taken from non-free sources should generally be restricted to statements made by a person discussed in the article, brief illustrative excerpts from a work discussed in the article or significant opinions about the subject of the article. Quotation is not an acceptable alternative to extracting facts and presenting them in plain language. Thus:

  • Right: Churchill said, "I have nothing to offer but blood, toil, tears and sweat."[1]
  • Right: The New York Times reviewer found the film "pretentious and boring".[2]
  • Wrong: According to Bulgarian Butterflies, "the patient observer may be fortunate enough to glimpse this rare moth flitting along the mossy banks of a woodland stream."[3]

Comments? Aymatth2 (talk) 22:37, 21 January 2013 (UTC)

Well, here's my problem - even though they're not mentioned in NFC or C, I myself think that limited indirect quotations are fine. I don't think the section in the essay (I can't remember precisely when it was added, but I believe it was probably added by SlimVirgin, although I wouldn't be surprised if I had a hand in its development) is a problem, as it doesn't imply that the NFC and C support indirect quotation and it clearly sets out best practice: "it is recommended to use original language and direct quotations, to clearly separate source material from original material." I'm not comfortable with suggesting that indirect quotations are covered in C or NFC, but I am worried that it is going too far to explicitly forbid them here - I think that would be a matter for a policy discussion. (The only issue I have with them is that some people will inevitably try to run with it well beyond reason - "I said 'According to X' in the first sentence. What do you mean I can't copy the next two paragraphs?") For that reason, I prefer what we have on indirect quotations already.
In terms of the restrictions on the nature of quotations, I'm not sure that really belongs in this essay, as it doesn't really seem to relate to the topic at hand. A little bit on how to use the text seems important, since knowing when to incorporate quotation and how is an important part of paraphrase, but to me that seems more generally related to quotations, probably better suited to something like WP:MOSQUOTE. --Moonriddengirl (talk) 12:23, 22 January 2013 (UTC)
Moonriddengirl is quite right- the section goes much too far – far beyond the court cases in copyright law dealing with not-for-profit publications like Wikipedia. That makes it a matter of unsourced original research and legal opinions that are entirely made up and not rooted in either secondary sources or court decisions. Rjensen (talk) 13:17, 22 January 2013 (UTC)
  • A high-impact essay like this should never suggest that looser rules than policy are acceptable, whatever WP:IAR says, but can recommend tighter rules. WP:NFCC is explicit: Brief verbatim excerpts are allowed only when specifically indicated as direct quotations. Other non-free content (presumably media other than text) is allowed only when the 10 criteria are met. Indirect quotation may read better, particularly when there is a quotation within the quotation, or the amount quoted is very short. But if WP:NFCC, a legal policy, says the only way to copy non-free content verbatim is as a direct quotation, that is the end of it. Possibly this is a moral rights concern rather than a copyright concern. Anyway, this essay has to reflect policy.
The points in ALT2 (above) that go beyond policy, but that I think are relevant in this essay, are 1) in-text attribution is needed (this is in the WP:CITE guideline); 2) an article should "generally" only use quotation to illustrate what the article subject said, or what opinions people expressed about it; and 3) quotation is not a way to avoid the effort of putting the facts into plain language. I do not see the last two going into WP:MOSQUOTE, which discusses the format of quotes rather than the principles of when they should or should not be used. Point 3 is in my view very important, since quotes are often abused in this way. Between them, points 2 and 3 will eliminate a lot of argument. They will not cover every case, but will cover most of them. This is the natural place to give that advice. Aymatth2 (talk) 14:18, 22 January 2013 (UTC)
  • I am opposed to this essay recommending tighter rules than policy with regards to indirect quotation. Policy does not say that they are forbidden; it simply doesn't mention them at all. If the essay follows suit and does not mention them at all, that's fine with me, but it should not overstate the case. I don't believe that (2) belongs in this essay at all, as we are concerned here with how best to put content into our own words (or not), and not with what material should be quoted. That doesn't relate to close paraphrasing. I agree with you on point 3. --Moonriddengirl (talk) 11:25, 23 January 2013 (UTC)
  • Hi MRG, I've been following this and have a question. I liked the first draft above, marked "Orig", though I did slightly raise my eyebrows at the phrase about indirect quotations. I'm not sure why I think so, but somewhere along the line picked up that indirect quotations are allowed - but now can't find it anywhere, making me think it's been removed or that I was mistaken. I think the question is whether or not we should allow indirect quotations. If yes, then this is probably the place to mention it, if not in text then in a see also. If not, then again, I think it's okay to mention here that indirect quotations aren't allowed (and maybe that has to be decided elsewhere). The problem as I see it is that people read this page and don't come away with enough information. At any rate, I'm very happy to see these discussions. Truthkeeper (talk) 13:24, 23 January 2013 (UTC)
Hi. :) I've watched the discussions and evolution of WP:C and WP:NFC for a number of years, and I don't think indirect quotations have ever been explicitly mentioned in either of those policies (though I'm hardly infallible, I feel pretty comfortable with that). I don't keep an eye on the guidelines related to quotations, etc., so I really don't know what's been said there. My concern is that when it boils down to it, this is only an essay, and while I think it's a valuable supplement to policy, I don't think it should be a policy leader, per se. I'm fine with it mentioning what practice generally is and what's recommended, but nervous about either explicitly permitting or denying something without a document with more gravitas backing it. It's true that policy doesn't mention it but equally true that people do it. I think that should probably be resolved elsewhere before finding its way here. That said, if we do want to go into it more than already is, maybe the thing to do is frankly disclose that. --Moonriddengirl (talk) 13:47, 23 January 2013 (UTC)
It's here: WP:When to cite. Which, as it happens, is also only an essay. I don't know where that leaves us. Maybe this essay can link to that? Truthkeeper (talk) 14:02, 23 January 2013 (UTC)

As an editor, I would prefer to be able to use indirect quotation, and always thought it was allowed. But when I read and re-read WP:NFCC, it is quite explicit:

Articles and other Wikipedia pages may, in accordance with the guideline, use brief verbatim textual excerpts from copyrighted media, properly attributed or cited to its original source or author, and specifically indicated as direct quotations

We could argue that WP:NFCC does not explicitly say indirect quotes cannot be used, but I think that is grasping at straws. The legal policy says verbatim textual excerpts must be shown as direct quotations. I can sort of see that may be the safest approach, and can live with it. I agree that this essay should not attempt to define policy, but it should be consistent with policy, and can certainly give advice and opinions on good and bad practice. It is widely referenced, and the clearer it is, the less time will be wasted on arguments. I suppose indirect quotation could maybe, perhaps, possibly, sometimes be seen as a form of acceptable very close paraphrasing though. Below is an attempt at skirting the issue and toning down the advice:

(ALT4) Limited quotation from non-free copyrighted sources is allowed, as discussed in Wikipedia's non-free content policy and guideline. Quotations must have in-text attribution and must be cited to their original source or author (see WP:When to cite). With direct quotation, editors must clearly distinguish the quoted material from the original text of the article following the guidelines for quotations. Extensive use of quotation from non-free sources is generally not acceptable. Even if content is attributed, it can still create copyright problems if the taking is too substantial. To avoid this risk, Wikipedia keeps this—like other non-free content—minimal.

Quotation from non-free sources may be appropriate when the exact words in the source are relevant to the article, not just the facts or ideas given by the source. Examples may include statements made by a person discussed in the article, brief excerpts from a book described in the article, or significant opinions about the subject of the article. Quotation should not, however, be treated as an alternative to extracting facts and presenting them in plain language. Thus:

  • Right: Churchill said, "I have nothing to offer but blood, toil, tears and sweat."[1]
  • Right: The New York Times reviewer found the film "pretentious and boring".[2]
  • Wrong: According to Bulgarian Butterflies, "the patient observer may be fortunate enough to glimpse this rare moth flitting along the mossy banks of a woodland stream."[3]

Comments? Aymatth2 (talk) 14:47, 23 January 2013 (UTC)

I am much more comfortable with that, Aymatth. :) It works for me. --Moonriddengirl (talk) 14:58, 23 January 2013 (UTC)
  • Let's give it a week and see what other views are given. No hurry. (The reason I could see for WP:NFCC not allowing indirect quotation is the boundary between article and source is indistinct. "The reviewer found the film pretentious and boring" could be derived from "I find the film pretentious and boring" or perhaps from "This director has made yet another pretentious and boring film" or could be a dodgy paraphrase of "This is a pompous, overblown and tedious movie.") Aymatth2 (talk) 15:59, 23 January 2013 (UTC)
  • I like this revision as well. Also wanted to say, I'm quite impressed, Aymatth, at the way you're doing this. Truthkeeper (talk) 01:00, 24 January 2013 (UTC)
I know it was recently added to a guideline (WP:CITE was it?) but policy doesn't specify that "Quotations must have in-text attribution"... (emphasis added). Perhaps just "should" as opposed to "must"?—Machine Elf 1735 04:53, 24 January 2013 (UTC)
  • WP:INTEXT, part of the WP:CITE guideline, says:

    In-text attribution is the attribution inside a sentence of material to its source, in addition to an inline citation after the sentence. In-text attribution should be used with direct speech (a source's words between quotation marks); indirect speech (a source's words, modified [see the article], without quotation marks); and close paraphrasing. It can also be used when loosely summarizing a source's position in your own words. It avoids inadvertent plagiarism, and helps the reader see where a position is coming from. An inline citation should follow the attribution, usually at the end of the sentence or paragraph in question.

I don't know how long that has been around, and it is a guideline rather than policy, but it makes sense. It is natural to identify the speaker or writer when quoting them. But I will change "must" to "should" throughout, since an essay cannot lay down the law but can only strongly suggest. Aymatth2 (talk) 13:39, 24 January 2013 (UTC)
Don't know if it's still relevant but it's simply wrong to say "a direct quote is an extremely close paraphrase"... it's neither a "paraphrase" nor is it merely "close".—Machine Elf 1735 05:04, 24 January 2013 (UTC)
  • I am not suggesting that goes into the essay. The point I was clumsily trying to make is that direct and indirect quotation, close and loose(?) paraphrasing are all related to the same legal concept of "copying", and that is really what this essay is about. Aymatth2 (talk) 13:39, 24 January 2013 (UTC)

Just to make a note here since the question was asked at NFC - NFC has very little to do with text content (we only mention it because it gets asked enough). NFC is about making sure images and other media are used in accordance with the Foundation's requirements on non-free media which are purposely stricter than fair use would likely allow for. --MASEM (t) 17:08, 25 January 2013 (UTC)

  • It seems such a basic question: is indirect quotation allowed or not? Still, if there is no clear ruling, ALT4 (with "must" replaced by "should") is presumably safe advice. This essay is not the place to call the shot if the policy and guidelines are silent on the subject, as Moonriddengirl has pointed out. Aymatth2 (talk) 21:04, 25 January 2013 (UTC)
  • I have gone ahead and made the change: ALT4 with "must" replaced by "should". Aymatth2 (talk) 15:32, 31 January 2013 (UTC)

Proposal: When there are a limited number of ways to say the same thing

This is to propose to expand and clarify the section on "When there are a limited number of ways to say the same thing". The current version of this section reads:

(ORIG) Close paraphrasing is also permitted when there are only a limited number of ways to say the same thing. In general, sentences like "Dr. John Smith earned his medical degree at State University" can be rephrased "John Smith earned his M.D. at State University" without copyright problems. Note, however, that closely paraphrasing extensively from a non-free source may be a copyright problem, even if it is difficult to find different means of expression. The more extensively we rely on this exception, the more likely we are to run afoul of compilation protection.[1]

This is to propose clarifying and expanding the section to read as follows:

(ALT1) Close paraphrasing is also permitted when there are only a limited number of ways to say the same thing. This may be the case when there is no reasonable way to avoid using a technical term, and may also be the case with simple statements of fact.

It is acceptable to use a technical term such as "The War of the Spanish Succession" or "Relational Database Management System (RDBMS)" when the term is almost always used by sources that discuss the subject, and when such sources rarely use any other term. In this case, the technical term is considered to be "merged" with the idea expressed. There is no reasonable alternative way of expressing the idea, and since ideas are not subject to copyright the term is also not protected. However, if different sources use different terms for the concept, it is best for the article to use a different term from the source.

An example of closely paraphrased simple statements of fact is given by a biography that relies on two sources for the basic outline. The sources and the article start with:

  • Source1: John Smith was born in Hartford, Connecticut on February 2nd 1949... He attended State University, obtaining an M.D. in 1973.
  • Source2: John Smith was born on 2 February 1949 in Hartford... He graduated with a medical degree from State University in 1973.
  • Article: John Smith was born on 2 February 1949 in Hartford, Connecticut... He studied medicine at State University, and earned an M.D. in 1973.

In this example, the wording of the article is very close to that of both sources. However, the article gives only a plain statement of facts without copying any element of creativity from the sources, so should be acceptable. Note, however, that closely paraphrasing extensively from a non-free source may be a copyright problem, even if it is difficult to find different means of expression. The more extensively we rely on this exception, the more likely we are to run afoul of compilation protection.[1]

Comments? — Preceding unsigned comment added by Aymatth2 (talk • contribs) 19:18, 16 February 2013

Guardedly, I like it. :) I like that you put two sources to show that both use the same sequence, which helps to underline that the order in this case is itself uncreative. My guardedness is to do with the fact that this is where I most often see people pushing boundaries - using one source and following lock-step on the structure of it. If we see issues down the road, we may need to clarify some of the issues currently touched on in the footnote in the text. --Moonriddengirl (talk) 20:29, 16 February 2013 (UTC)
I am guarded about it too - and I just wrote it! I think the "technical term" idea was missing, needed, and the definition does not leave much wriggle-room. But the "mere facts" part is clearly sensitive and I am not sure the wording locks it down enough to avoid abuse. We have to face the issue squarely, which is why I used longer examples. Perhaps this section should again say what would be considered "creative expression." I do not want to rush this change through. This to me is the core section in the whole essay because, as you say, this is where editors may use "mere facts" as an excuse for blatant copying. Whatever it says has to be completely clear and unambiguous. See below for an attempt to spell out reasons why copying the mere facts is acceptable in the example: Aymatth2 (talk) 14:04, 17 February 2013 (UTC)

(ALT2) Close paraphrasing is also permitted when there are only a limited number of ways to say the same thing. This may be the case when there is no reasonable way to avoid using a technical term, and may also be the case with simple statements of fact.

It is acceptable to use a technical term such as "The War of the Spanish Succession" or "Relational Database Management System (RDBMS)" when the term is almost always used by sources that discuss the subject, and when such sources rarely use any other term. In this case, the technical term is considered to be "merged" with the idea expressed. There is no reasonable alternative way of expressing the idea, and since ideas are not subject to copyright the term is also not protected. However, if different sources use different terms for the concept, it is best for the article to use a different term from the source.

An example of closely paraphrased simple statements of fact is given by a biography that relies on two sources for the basic outline. The sources and the article start with:

  • Source1: John Smith was born in Hartford, Connecticut on February 2nd 1949... He attended State University, obtaining an M.D. in 1973.
  • Source2: John Smith was born on 2 February 1949 in Hartford... He graduated with a medical degree from State University in 1973.
  • Article: John Smith was born on 2 February 1949 in Hartford, Connecticut... He studied medicine at State University, and earned an M.D. in 1973.

In this example, the wording of the article is very close to that of both sources. However, the article merely presents standard facts for a topic like this in standard sequence. The article does not copy any creative words or phrases, similes or metaphors, and makes an effort at paraphrasing in the second sentence. Just two short sentences are close to the sources. For these reasons the close paraphrasing should be acceptable. Note, however, that closely paraphrasing extensively from a non-free source may be a copyright problem, even if it is difficult to find different means of expression. The more extensively we rely on this exception, the more likely we are to run afoul of compilation protection.[1]

Comments? Aymatth2 (talk) 14:04, 17 February 2013 (UTC)

Thumbs up from me. I think it could help avoid some of the common confusion. --Moonriddengirl (talk) 14:13, 17 February 2013 (UTC)
  • I have made this change (ALT2), since there were no objections. Aymatth2 (talk) 23:44, 23 February 2013 (UTC)

Accusation of copyright infringement

Hi. Could as many editors here as possible who are well-versed in copyright infringement offer their views in this discussion on the Bill Biggart talk page, in which an editor has accused another editor of copyright infringement? Your viewpoints would be well-appreciated to defuse this disagreement. Thank you. Nightscream (talk) 23:41, 2 March 2013 (UTC)

Responding to the meta questions there. --Moonriddengirl (talk) 00:01, 3 March 2013 (UTC)

Boxing

Was the intention to do a close of discussion box? It's not clear what the archiver is going to do with the div... --Lexein (talk) 15:37, 20 March 2013 (UTC)

  • That is a bit weird. Mizrabot did the boxing. I must have caused the problem by putting headings inside a box inside a section, which got Mizrabot all confused. Probably a </div> got archived. I have tried to sort it out by reformatting. :~) Aymatth2 (talk) 18:03, 20 March 2013 (UTC)

Large-scale constructs

See the discussion at Wikipedia talk:Plagiarism#Large-scale constructs.

Aymatth2 (talk) 20:07, 24 June 2013 (UTC)

Proposed sequence change

I propose to combine the sections "When is close paraphrase permitted?" and "Why is it a problem?" into a section called "Concepts", with the content from "Why is it a problem?" first, and to move them ahead of the "Example" section. Two reasons:

  • The flow seems more natural: why it is a problem - when it is allowed - example of close paraphrasing - how to avoid such problems - how to spot and fix them
  • The more general "Concepts" title allows for addition of sections that discuss concepts such as creative expression, moral rights, structure/sequence/organization, substantiality and so on. It gives a framework for expansion.

The proposed "Concepts" section, at this point strictly a rearrangement of the current text, would be:

(see Wikipedia talk:Close paraphrasing/Archive 2#Concepts for remainder of this section - formatting got the archive bot confused)

Aymatth2 (talk) 20:07, 24 June 2013 (UTC)

Proposal to break out "Moral rights".

I propose to move the existing "Moral rights" section to a separate page (specifically to Wikipedia:Moral rights, which currently redirects to this section). The reason for this request is that the concept of moral rights applies to more than just paraphrasing of text. It also applies to the presentation of artistic works generally. If this move is agreed to, I will then propose language relating to Wikipedia's use and reference to other content to which moral rights applies. Cheers! bd2412 T 17:28, 26 September 2013 (UTC)

Seeing no objection, I am having at it. Cheers! bd2412 T 19:57, 8 October 2013 (UTC)

Close paraphrasing to avoid accusations of biased summaries

I admit that I have closely paraphrased sources in highly controversial articles to avoid conflicts about biased one-sided summaries. I still do not know how to avoid this, except by close paraphrasing. The article in question (Sathya Sai Baba) went to arbitration twice. Maybe this can be treated in the essay. Andries (talk)

To avoid this, you include a fair use quotation of the actual source in the text of the article, or replace it with a public domain source. Perhaps you can write up a compelling summary of a conflict for the essay. --Hroðulf (or Hrothulf) (Talk) 10:56, 25 February 2014 (UTC)

Paraphrasing that matches Google Translate

I'm reading through Museum of Zoology of the University of São Paulo, which I think was mostly written by a native speaker of Portuguese. The main source is in Portuguese. Since I don't speak Portuguese, I'm using Google Translate to check what the source says, and in a couple of cases I see some strong similarities between the Google translation and the text in the article. However, I've no reason to believe the editor used GT; it's just as likely that the text was translated ad hoc by the editor, and they happened to pick similar vocabulary. I'm inclined to let this go, but I wondered if anyone else had dealt with a similar issue. Mike Christie (talk - contribs - library) 16:27, 13 April 2014 (UTC)

Copyright and "specific terms"

Regarding this text:

".....acceptable to use a technical term such as "The War of the Spanish Succession" or "Relational Database Management System (RDBMS)" when the term is almost always used by sources that discuss the subject, and when such sources rarely use any other term. In this case, the technical term is considered to be "merged" with the idea expressed. There is no reasonable alternative way of expressing the idea, and since ideas are not subject to copyright the term is also not protected.....

That's a correct statement of the law, but it seems a bit misleading. It seems to imply that the reason the term is not copyright protected is that "ideas are not subject to copyright", and that therefore a term that is not "merged" with the idea might somehow be protected by copyright.

It is certainly correct to say that "ideas are not subject to copyright."

However, I would think that a specific term -- such as "The War of the Spanish Succession" -- is generally not protected by copyright at all. A one-word term, or even a six-word term, is not generally considered enough of a creative work to be covered by copyright. (Similarly, the title of a work generally is not protected by copyright, which is why you see many songs with the same title.)

Thoughts, anyone? Famspear (talk) 04:31, 15 July 2014 (UTC)

I agree with Famspear. Terms commonly used, say in history books, are not covered by copyright. Names cannot be copyrighted either. Rjensen (talk) 04:42, 15 July 2014 (UTC)

German Translation

It would be fine to read this in German. Thanks, --Markus (talk) 06:50, 14 September 2014 (UTC)

Beechwood (Vanderlip mansion) - resolving Close paraphrasing concerns

Hello,

I'm working on a Wikipedia:Good article review of Beechwood (Vanderlip mansion) and the only outstanding issue is GA page - close paraphrasing.

It would be great to have input to determine:

  1. If the verbiage solves the close paraphrasing concern for the content from the New York Times article from this version comparison.
  2. Your thoughts regarding next steps for reviewing the book sources.

If someone could help out, it would be much appreciated!--CaroleHenson (talk) 02:56, 28 September 2014 (UTC)

Is close paraphrasing acceptable?

Opinions are needed on the following matter: Wikipedia:Village pump (policy)/Archive 116#Is close paraphrasing acceptable?. A WP:Permalink to that discussion is here. Flyer22 (talk) 19:23, 1 October 2014 (UTC)

Creativity

I have removed this as this is not in line with copyright practice or policy. This essay already quotes Feist v. Rural, which notes that "To qualify for copyright protection, a work must be original to the author.... Original, as the term is used in copyright, means only that the work was independently created by the author (as opposed to copied from other works), and that it possesses at least some minimal degree of creativity.... To be sure, the requisite level of creativity is extremely low; even a slight amount will suffice. The vast majority of works make the grade quite easily, as they possess some creative spark, "no matter how crude, humble or obvious" it might be.... Originality does not signify novelty; a work may be original even though it closely resembles other works, so long as the similarity is fortuitous, not the result of copying." (See [1], citations omitted.) There is no requirement to show that you are using distinctive language for the very first time to establish copyright. Most non-fiction easily qualifies for copyright protection due to the extremely low requirement for creativity, even if they closely resemble other works. While creativity is required, there is no burden to "demonstrate" creativity. --Moonriddengirl (talk) 23:47, 1 October 2014 (UTC)

Then let's drop all that nonsense about creativity from the text. Rjensen (talk) 00:59, 2 October 2014 (UTC)
It's not nonsense - it's just nuanced. :) Unfortunately, so is the copyright law that we're trying to respect. I don't particularly care about the Belloc stuff myself. I have no issue with its removal. --Moonriddengirl (talk) 01:19, 2 October 2014 (UTC)
agreed--my point is that it's totally irrelevant to the minimal definition of creativity....likewise the warning that editors will get in trouble if they use "fancy" language is false & has nothing to do with the topic here. Rjensen (talk) 01:21, 2 October 2014 (UTC)
Yes, I agree. And it's beyond the scope of the essay to concern itself with the writing style in articles. All that it should care about is whether the fancy language is too closely paraphrasing somebody else. --Moonriddengirl (talk) 01:27, 2 October 2014 (UTC)

Some general copy-edits

In this series of edits, I have made some changes to this essay that are unrelated to the current discussion that I hope will be uncontroversial. Major points of alteration include:

  • Softening the language on moral rights. Rjensen is quite right that we have no official policy on this, and we should not imply that we do. I think it's kind of only tangentially related to the topic of this essay, but can see the point of including it under the right to attribution.
  • I softened the language that said "When using a close paraphrase legitimately, citing a source is in most cases required and always highly recommended." This is not true. When we closely paraphrase from material that is unusable as a source on Wikipedia, we cannot cite it. (Of course, we should only do this when the material otherwise meets our policies and guidelines, but it happens more often than you'd think. For instance, I frequently see close paraphrases of other Wikipedia articles. It requires attribution to meet license terms, but not citation.)
  • The "substantial similarity" section has been overhauled. It was written somewhat backwards, I believe. Fair use is an affirmative defense - while the amount and substantiality test does apply to it, substantial similarity is a different concept; it is established first before the fair use doctrine is considered. See [2], for instance.
  • I replaced what I feel was some rather vague language about the merger doctrine with just some straightforward and cited material from the copyright office itself. If we think elaboration of the concept is appropriate, I don't have any objection, but I wouldn't want to overwhelm the usefulness of the document with such elaboration.

Again, I think these will be uncontroversial. I'm happy to discuss it if I am wrong. My focus on this document has typically always been to make sure that the advice it gives does not contradict copyright policy (to help avoid trouble if it does), and it's been a while since I've done a general read-through. --Moonriddengirl (talk) 11:08, 2 October 2014 (UTC)

  • I am not comfortable with the change on the merger doctrine. Names, titles, or short phrases cannot be copyrighted as stand-alone works. I cannot copyright "John Smith", "Jurassic Park" or "It's finger lickin' good!", although I could try to register them as trademarks. However, an article that used many of the same short phrases as a source might be seen as substantially similar to the source, unless those were standard jargon phrases common to any description of the subject. I see this as a different concept. Aymatth2 (talk) 20:07, 12 June 2015 (UTC)
  • Aymatth2: "Copyright law does not protect names, titles, or short phrases or expressions" is taken direct from the US copyright office publication to which it was sourced; as US copyright law governs Wikipedia, I really don't understand any objection to it. :) I believe your change adds some confusion - "Relational Database Management System (RDBMS)" is not a technical term; it is a name. We can copy all the names we want, unless other factors of creativity are involved. For instance, we can copy a Filmography from IMDb with impunity regardless of the common use of titles with our source; we simply can't copy somebody's "Best of films" list. --Moonriddengirl (talk) 13:32, 13 June 2015 (UTC)
  • The US Copyright Office circular 34 says that "To be protected by copyright, a work must contain a certain minimum amount of authorship in the form of original literary, musical, pictorial, or graphic expression. Names, titles, and other short phrases do not meet these requirements." In other words, they are too short to be copyrighted as stand-alone works. However, a book can be copyrighted even though it may be seen as a large collection of short phrases. It exceeds the "minimum amount of authorship". Substantial copying of short phrases from the book would be a copyright violation.
The merger doctrine, under which a phrase is not considered original expression if it is essentially the only way to identify the concept, allows the use of names, titles and short jargon phrases. We do not have to use circumlocutions to avoid using the name of a film director or the title of a film, and we can freely use technical terms like "inguinal hernia", "unconformably overlies" or "irrational number" because those are standard jargon. But we should not give the impression that an author can freely copy numerous short phrases from the source without violating copyright. Aymatth2 (talk) 14:03, 13 June 2015 (UTC)
"Relational Database Management System (RDBMS)" is a technical term for a type of database management system that implements the relational data model. There are many different RDBMSs, with names such as MySQL, Oracle, SQL Server and DB2. Aymatth2 (talk) 14:33, 13 June 2015 (UTC)
I stand corrected on the name of the server. However, it depends on what the short phrases are. Sometimes we can freely copy numerous short phrases from a source. You can produce a discography of every album released by an artist even if you are copying it from your source. I can't copy the definitions from a dictionary of idioms, but they don't own the idioms they are defining; unless creativity has gone into the selection of the idioms themselves ("the best idioms"), the idioms themselves are free for reproduction. I appreciate your caution here, but at the same time I think removing the sourced sentence makes the sentence in the paragraph above referring to titles misleading. You don't have to avoid using titles. So, I've removed that bit and added some clarification below - I don't think we serve the readers of the essay by not giving them that fact, although I of course agree that context is important. In my time on Wikipedia, I have dealt far more frequently with people who think they can copy creative content because, as they see it, it isn't creative than the other way around - but the other way around does happen. --Moonriddengirl (talk) 13:52, 14 June 2015 (UTC)
  • The statement that "Copyright law does not protect names, titles, or short phrases or expressions" is dangerously misleading, since it ignores the context. The Copyright Office is telling people not to try to register copyright in a given name, title or short phrase, because these are too short to be works covered by copyright. "The Copyright Office cannot register claims to exclusive rights in brief combinations of words ... To be protected by copyright, a work must contain a certain minimum amount of authorship." Try next door, at the Trademark Office.
However, an author can claim copyright protection of a longer work that contains names, titles or other short phrases. Short phrases in that longer work are copyright protected if they are creative expression. Editors can copy names, titles and jargon phrases that are covered by the merger doctrine, but should not copy short phrases with creative expression. These short phrases will be copyright protected. To say that short phrases are not protected, only the way they are arranged, is simply incorrect. Aymatth2 (talk) 15:03, 14 June 2015 (UTC)
Obviously, I don't believe that it is out of context, including because it links to the original and people can read it for themselves. More content is available in Circular 1, however, which does more modestly list, among works generally ineligible for federal copyright protection, "titles, names, short phrases, and slogans; familiar symbols or designs; mere variations of typographic ornamentation, lettering, or coloring; mere listings of ingredients or contents". If you feel that more context will be helpful, that's one thing, but removing it seems to me likely more misleading than including it. Personally, I would be happy to eliminate the "short phrases" since that typically leads to more trouble than anything else, but it's part of the quote and I'm not comfortable cherry-picking. Neither am I comfortable with a document that may lead people to believe that they cannot copy over a discography or include a list of a biographical subject's job titles because their source does. --Moonriddengirl (talk) 15:33, 14 June 2015 (UTC)
  • It is important that we mention the merger doctrine. Editors writing on technical subjects should not feel they have to paraphrase standard jargon, which could lead to ridiculous results. Obviously the merger doctrine applies to names and titles. The title "Star Wars" is merged with the concept of the film by that name, and is the only way to identify the film, so copying the title from the source is fine. Ditto with names of people, companies etc. I will tweak the essay to make that clear.
A short phrase is not eligible for copyright protection as a work, in the sense that anyone using that short phrase in any circumstances is violating the author's copyright. This is clearly the sense of the circulars. But if editors copy short phrases containing creative expression from a longer copyrighted work, they may well create substantial similarity, discussed elsewhere in the essay. The wording seems to say editors may copy short phrases. It is unlikely that any reader would follow the citation to see the context of what the Copyright Office actually said.
I feel strongly that "As the United States Copyright Office explains, "Copyright law does not protect names, titles, or short phrases or expressions."[4]" should be dropped. This statement opens a can of worms and can only cause problems. The sentence that follows, with the footnote, should be moved up to the section on "Creative expression," tweaked to replace "titles, idioms, jargon or even people's names" by "facts". Aymatth2 (talk) 17:02, 14 June 2015 (UTC)

I've restored the long-standing footnote pending consensus for its removal. If we want to modify content for clarity, that's fine, but I strongly do not believe that this important information should be removed in the meantime. Copyright on short phrases is highly nuanced - as per [3]. And if you replace "idioms" with "facts" you've changed the meaning of the sentence. Idioms may or may not be facts, but they are common property. --Moonriddengirl (talk) 17:18, 14 June 2015 (UTC)

Thanks for clarifying that you moved the footnote; I'm afraid I found your edit summary confusing and somewhat alarming. :) --Moonriddengirl (talk) 17:25, 14 June 2015 (UTC)
  • I have moved the statement that short phrases etc. cannot be copyrighted up to the section on "Substantial similarity", where it belongs if it belongs at all. It does not belong in "When there are a limited number of ways to say the same thing", since the statement is independent of how easy it would be to paraphrase. I have also expanded to provide more context. I intensely dislike leaving this statement in the essay, but at least this way it is followed by the relevant caution. Aymatth2 (talk) 21:20, 14 June 2015 (UTC)

Creative expression

I have restored the Belloc examples, which seem to have been removed with little or no discussion. To simply say "the test of creativity is minimal" gives no guidance to editors. The most common forms of creativity in non-fiction are fanciful words, figures of speech, metaphors etc. Examples help, particularly one that shows that a violation of creative expression may use completely different words from the original. I propose to also add a paragraph on translation from foreign languages:

A literal translation from a foreign language is a form of paraphrase, since all the words or phrases have been replaced with equivalent English-language words or phrases. This may or may not be acceptable, depending on whether any creative expression – anything other than simple statements of fact – has been taken from the foreign language source. For example, consider two literal translations from the Turkish language:

  1. "Seen through smog, the sun appears red"
  2. "The sun looms through the haze like a red omen"

The first is a simple statement of fact and should be acceptable. The second carries over the figurative expressions "looms through" and "like a red omen", so presumably is not acceptable despite using completely different words from the original.

Comments? Aymatth2 (talk) 01:28, 20 June 2015 (UTC)

  • Since there is no objection, I will make the change. I think it is important to give examples, but any ideas on better examples would be welcome. Aymatth2 (talk) 00:06, 29 June 2015 (UTC)
    • Apologies for not seeing the note. I'm not comfortable with the example given for several reasons and have removed that bit. The first sentence contains fact, but has creativity in formulation. The facts can be expressed in many ways. One could say, "The sun appears red through smog" or "Smog changes the light filtering from the sun so that it appears red" or "Smog adds a red appearance to the sun", for instance. The spark of creativity may be minimal, but it exists. A literal translation of a single sentence is not likely to be much of a problem, but the more you translate and the more closely you translate, the more likely you are to create a copyright problem. I think more nuance is required there to avoid misleading readers of this essay. --Moonriddengirl (talk) 13:23, 29 June 2015 (UTC)
      • I agree, the example is not really addressed to the problem at hand. bd2412 T 13:38, 29 June 2015 (UTC)
        • Usually a "literal" translation is not word-by-word but includes changes to sentence structure and sequence. The French tend to put adjectives after nouns rather than before (an omen red), and the Germans to put qualifying clauses at the front, and the main verb last (Through smog seen the sun red appears). I agree there maybe should be an added caution. But examples are also really helpful. How about the following?

Translation from a foreign language is a form of paraphrase, since all the words or phrases have been replaced with equivalent English-language words or phrases. This may or may not be acceptable, depending on whether any creative expression – anything other than simple statements of fact – has been taken from the foreign language source. For example, consider two translations from the Turkish language:

  1. "Istanbul is a large city"
  2. "The sun looms through the haze like a red omen"

The first is a simple statement of fact and should be acceptable. The second carries over the figurative expressions "looms through" and "like a red omen", so presumably is not acceptable despite using completely different words from the original. But even if you only carry across statements of fact, the more you translate and the more closely you translate, the more likely you are to create a copyright problem.

@Moonriddengirl: @BD2412: Comments? Aymatth2 (talk) 15:46, 29 June 2015 (UTC)
I would be comfortable with that, Aymatth2. --Moonriddengirl (talk) 00:37, 30 June 2015 (UTC)
  • Since nobody else chimed in, I have made the above change. Aymatth2 (talk) 18:11, 7 July 2015 (UTC)

Hilaire Belloc example

But use of the phrases "indolent expression" and "undulating throat" would violate copyright. - it most certainly would not. To quote from the Wikilegal "the amount and substantiality of the portion used in relation to the copyrighted work as a whole"

Moreover the phrases have both been used before: E.G.

I love thy mellow note,
Pealing, so beautifully, from the spray
Whereon thou sitt'st with undulating throat,
Chaunting thy matins to the dawning day.

(1824) All the best: Rich Farmbrough, 19:08, 5 November 2015 (UTC).

  • The phrase "indolent expression and undulating throat", which defines the similarity between the two types of beast, may be seen as the essence of this short work. An article using the terms could be said to have "appropriated almost verbatim the most creative and original aspects" of the work. See Wainwright Securities Inc v. Wall Street Transcript Corporation (18). We should warn editors to avoid copying any fanciful figures of speech. A lawyer friend of mine says you never know what a judge will decide. Best to err on the safe side. This is meant to be an encyclopedia. Boring. As for the thrush poem, I am generally opposed to burning books, but in this case ... Aymatth2 (talk) 23:11, 5 November 2015 (UTC)
  • "You never know what a judge will decide" is a very different animal from "would violate copyright." --Moonriddengirl (talk) 12:44, 6 November 2015 (UTC)
  • Maybe it should say "could violate copyright" then. It is exactly the sort of similarity of phrasing that typically gets jumped on at DYK, with good reason. A short phrase cannot in itself be copyrighted, but using the same uncommon phrase in the same context (e.g. description of a llama) suggests copying. The judge then decides whether there is "substantial" copying of the creative part. She will not say "anything up to 23% the same" is acceptable: the Wainwright v. Transcript case hinged on similar wording of a small but central part of the work copied. There is no way to predict the decision, but to reduce risk it is best to not copy any fanciful wording. Aymatth2 (talk) 14:18, 6 November 2015 (UTC)
  • Original: And second, he says that likely to aid comparisons this year was the surprisingly limited extent to which Fiber Divisions losses shrank last year.
  • Paraphrase: The second development likely to aid comparisons this year was the surprisingly limited extent to which the Fiber Division's losses shrank last year.
This is a run of 19 words, and only one of the taken sections. It is also notable that the judgement is somewhat flawed, describing the appellant's acts as "unprincipled chiselling" which casts doubt on the judge's unbiased interpretation of the law. Some of the obiter from that judgement are very unfortunate.
All the best: Rich Farmbrough, 21:00, 7 November 2015 (UTC).
First a red herring: There's a huge difference between "indolent expression and undulating throat" and "indolent expression" and "undulating throat" - also the context is important.
However, there is little doubt that quoting five words from a poem, even a short poem, would not constitute copyright infringement. It is established law that quotation for commentary is permitted; though there is uncertainty over amount and proportion, there is no reason to adopt extreme measures. A good explanation of the de facto situation can be seen at When Quoting Verse, One Must Be Terse by David Orr. Notably Orr cites the "three or four line standard" as "playing it safe". The "two word" or "less than five word" standards don't even get a look in.
All the best: Rich Farmbrough, 21:00, 7 November 2015 (UTC).
  • Maybe the judge made an error in Wainwright v. Transcript, but it is a sample judgement. Other judges may make the same error. With Salinger v. Random House, Inc. an example of close paraphrasing was
  • Original: "He looks to me like a guy who makes his wife keep a scrapbook for him"
  • Paraphrase: "[Salinger] had fingered [Wilkie] as the sort of fellow who makes his wife keep an album of press clippings."
The wording is different, but the creative concept – more than the facts – is carried across. Perhaps the Belloc example is not great. The idea is that the verse is being used as a (dubious) authority on llamas, not as a poem with an excerpt quoted for the purpose of critical commentary. The principles that the essay should convey, with examples, are:
  • Avoid reproducing fanciful wording, because that may be interpreted as violating copyright
  • Even if you change the wording, do not copy fanciful concepts, ditto.
Are there better examples that can illustrate the same principles? We are trying to warn editors to stay well on the safe side, even if they could get away with more. Better examples? Aymatth2 (talk) 00:44, 8 November 2015 (UTC)
Our article on the case states "However, the essay illustrates that a judge may be tempted to use copyright law to support an objective other than simply protecting commercial rights." - which is precisely my point above.
Importantly, though, that case is about unpublished works ("the scope of fair use is narrower with respect to unpublished works") and is about rights to expressive content not about literal reproduction.
The scope of copying again is not a dozen words but "often more than ten lines of one letter had been copied in this way".
The extensive paraphrasing was deemed (probably quite rightly) to impact on Salinger's financial interest in his unpublished works
Moreover, in 1992 the Copyright Act was amended as a result of the Salinger case to include a sentence at the end of §107 saying that the fact that a work is unpublished "shall not itself bar a finding of fair use if such finding is made upon consideration" of all four fair-use factors.
Now as to how we should advise our editors, it's a complex field and no-one is prepared to give hard and fast guidelines - quite rightly. I would suggest that we give some examples of guidance - and link to them.
Certainly we should also not shy away from our strengths: in the case of commentary on poetry, for example, the use is transformative, and it may be legitimate to quote the entire work, especially if it is short. (It is certainly accepted as legitimate to quote the title, even when that is longer than the poem.) Conversely, if we take a large amount of text from a reference work for the same purposes, it is not transformative. Fortunately there is little or no expressive content in, for example, reference biographies. In those cases we needn't worry ourselves over-much with whether "close paraphrasing" has occurred (especially if we adhere to NPOV).
As for examples:
  • In general I think the examples in Salinger v Random House make a better illustration than the Belloc one. (It would be better still to have examples that were not from an unpublished materials case - and preferably post 1992.) It should be made clear that on the one hand it took many such examples to constitute a copyright infringement, but on the other it was in the context of a book, not an article.
All the best: Rich Farmbrough, 17:03, 8 November 2015 (UTC).

Some stuff that is more relevant to direct quotes

Here are some quantitative guidelines I have seen:

  1. Anything 10 words or less is, almost regardless, going to be fine.
  2. 3 or 4 lines
  3. Up to a quarter of a short poem, 5% of a long one
  4. 200-300 words from a book

I prefer, though, the guidance from The Poetry Foundation (and The Program on Information Justice and Intellectual Property and The Center for Social Media):

The principles are all subject to a "rule of proportionality." The fair use rights of poets, teachers, scholars, and others extend to the portions of copyrighted works that they need to accomplish their goals. Thus, while in some cases fair use may extend to an entire work, in others relatively brief portions may constitute "too much." Importantly, there are no numerical rules of thumb that can be relied upon in determining whether a use is fair. (Code of Best Practices in Fair Use for Poetry)

- A document written in response to poets' "general sense that their ability to do their work with confidence was often impeded by institutional regulations based on very straitened interpretations of copyright."

All the best: Rich Farmbrough, 17:03, 8 November 2015 (UTC).

  • I got involved in a discussion a while ago over an article I started on El emigrante, a very short story. I nominated it for DYK and it went on the front page, then got slammed for reproducing the story in its entirety. All four words. The text of the work was deleted, but I could not resist starting an article on ¿Olvida usted algo? – ¡Ojalá!, a work of installation art with a four-word title. Later some bold editor put the text back into the El emigrante article. The first three words appear on public signs all over the place. The story is probably too short to be covered by copyright, and unlikely to go to court. But you never know. Aymatth2 (talk) 20:39, 8 November 2015 (UTC)
  • Numbers do not really work. Best to keep quotes short, and use them only when the precise wording is relevant. A quote of a public statement by a dead politician is safer than a quote from a work of fiction by a living author. Work that has not been published can legally be quoted to a limited degree in the right circumstances, but this essay just says "don't do it". With a typical article that draws facts from a source it depends on whether the judge thinks there is "substantial similarity". The shorter the source, the longer the amount copied and the closer the wording, the more likely. There is also the idea of the "essence" of the work, the core part, having much more weight. Plain and simple statements of fact are always safest. Aymatth2 (talk) 20:39, 8 November 2015 (UTC)
    • The type of material under consideration is important too. One cannot discuss poetry without quotation. "Once, as the snow of the year was beginning to fall" is very original and there is no reason to paraphrase. That is why the Belloc example is poor.
    • For other material the exact wording is sometimes important. "Never in the field of human conflict was so much owed by so many to so few"
    • In other cases still, we want to show that a source supports our statements, in footnotes. This is a widely used technique in academia. And the extent of the quotes often reaches a paragraph of 200 or more words.
    All the best: Rich Farmbrough, 14:20, 11 December 2015 (UTC).
  • The section on Wikipedia:Close paraphrasing#Quotation of non-free text could be tweaked to say that a discussion of a poem may include short quotations, since the exact words are relevant. Agreed that the Belloc example is not great: the fact that it is a poem obscures the idea that it is hypothetically being used as a source of facts. What we need is an example that illustrates a) directly copying a creative choice of words and b) copying a creative figure of speech using different words. Academics can perhaps get away with reproducing excerpts from their sources. It is safer for us to just identify the source, with a url if there is one. The reader has to assume that the cited source supports the statement. Allowing quotation of non-free content to show it supports the statements in an article opens the door to articles that are just cut-and-paste from non-free sources. Not worth the risk. Aymatth2 (talk) 14:55, 11 December 2015 (UTC)

"with or without quotation marks"

Saw this thread come across AN and was struck by what the user quotes from this page: "Limited close paraphrasing is appropriate within reason, as is quoting (with or without quotation marks)". This seems misleading. Quoting without quotation marks is almost always a bad idea. The only two exceptions I can think of off-hand are:

  1. if you're quoting something like, say, a statistic or otherwise a combination of words that does not have a "minimal degree of creativity" (or however we'd like to describe what qualifies for copyright in the first place -- though you'd still want to attribute where you got it, of course);
  2. block quotes

There may be another -- even something obvious I'm overlooking -- but I think it's highly likely this could be misunderstood as "quotation marks are optional", which they are not. I'm going to go ahead and boldly remove that parenthetical. — Rhododendrites talk \\ 13:27, 1 October 2016 (UTC)

  • I support that change to the lead. The section on Quotation of non-free text says, "With direct quotation, editors should clearly distinguish the quoted material from the original text of the article following the guidelines for quotations." There is no need to repeat those guidelines here. Aymatth2 (talk) 14:29, 1 October 2016 (UTC)
  • A case that probably does not need to be spelled out here is quoting someone who spoke in a foreign language. Putting the translation in quotation marks would be misleading, since those are not the exact words, but it may be important to be as close as possible to what they said. Thus, Marilene Ramos said BR-319 could not be treated as a conventional paved road, with no controls. This is a very close paraphrase, a literal translation from the Portuguese original, but quotation marks would be misleading. I think. Aymatth2 (talk) 00:17, 8 October 2016 (UTC)

Compilation protection

"Note, however, that closely paraphrasing extensively from a non-free source may be a copyright problem, even if it is difficult to find different means of expression. The more extensively we rely on this exception, the more likely we are to run afoul of compilation protection." - Can someone please add a citation for this? From what I have seen, the compilation issue is mostly for compilations that are trying to protect some kind of copyright, like databases. It could include, for example, creating an article "American literature" that copies an anthology of selected works that are in the public domain. From what I have been able to find out, it does not really have anything to do with close paraphrasing. This policy could really use some clarification and improvement. Seraphim System (talk) 20:19, 1 January 2018 (UTC)

Guideline?

I really think it's overdue for this to be brought up to guideline standards. I just found out that this is still an essay and I am very surprised. I always thought it was a policy, based on how often it is cited in very serious discussions as something sanctionable. I don't know if there's interest, but I think many editors have found it extremely frustrating to be facing sanctions and then be referred to an incomplete essay that hasn't been vetted by the community. I have a law background, so if there is interest from others I think we should try to get it done, since this essay is highly cited in the context of sanctions discussions. Seraphim System (talk) 23:35, 1 January 2018 (UTC)