Feature Proposal: Add controls and documentation to reduce infringing topics.

Motivation

It is possible for users to post infringing material into topics. Some of this is best handled through "user education" but there are some things that Foswiki can do to make it more difficult.

In addition, some extensions have a default configuration that can cause infringement.

Description and Documentation

Provide uniform filtering and control of markup that displays content from external sites. This includes:

  • INCLUDE URLs
  • inline URLs. (This could block a lot of spam if Foswiki refused to render http links.)
  • inline img src links
  • IMAGE macros for external images
  • redirect URLs
  • Interaction with Interwiki links

Include configuration settings:
  • HREF whitelist / blacklist: Defines rules for rendering <a href ... links
  • IMG whitelist / blacklist: Defines rules that permit inclusion of external images into <img src= tags, or in the ImagePlugin cached / non-thumbnail images.
  • REDIRECT whitelist / blacklist: Rules for redirect
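
A minimal sketch of how such whitelist / blacklist rules could be evaluated. It is written in Python for brevity rather than as Foswiki code, and the rule format, the RULES table, and the is_allowed helper are all assumptions; the proposal does not define them. The blacklist wins over the whitelist, and an empty whitelist denies all external hosts (opt-in).

```python
import re
from urllib.parse import urlsplit

# Hypothetical rule sets: regexes matched against the host name.
RULES = {
    "HREF": {"whitelist": [r"(^|\.)foswiki\.org$"], "blacklist": []},
    "IMG": {"whitelist": [r"(^|\.)foswiki\.org$"], "blacklist": [r"^ads\."]},
    "REDIRECT": {"whitelist": [], "blacklist": []},
}

def is_allowed(kind, url, rules=RULES):
    """Return True if url may be rendered for the given kind of markup."""
    host = (urlsplit(url).hostname or "").lower()
    if not host:
        return True  # relative or local link, not an external site
    rule = rules[kind]
    # A blacklist match always wins.
    if any(re.search(p, host) for p in rule["blacklist"]):
        return False
    # Opt-in: the host must match at least one whitelist entry.
    return any(re.search(p, host) for p in rule["whitelist"])
```

With these example rules, an image on foswiki.org would be rendered, while an image on ads.foswiki.org or on any other external host would not.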

INCLUDE of URLs needs some further thought. We need a reliable mechanism to ensure that an include cannot recursively render the same page, even indirectly. Include of URLs is very dangerous and can DoS the server when enabled.
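
One possible shape for such a guard, sketched in Python: track the chain of URLs currently being expanded and refuse both repeats and overly deep chains. The {{url}} marker, the fetch callback, and the names expand and IncludeLoopError are illustrative assumptions, not Foswiki APIs.

```python
import re

# Toy include marker, assumed for illustration only: {{url}}
INCLUDE_RE = re.compile(r"\{\{(\S+?)\}\}")

class IncludeLoopError(Exception):
    """Raised when an include chain loops or grows too deep."""

def expand(url, fetch, active=(), max_depth=5):
    """Expand includes depth-first; fetch(url) returns the page text."""
    if url in active:
        raise IncludeLoopError("include loop: " + " -> ".join(active + (url,)))
    if len(active) >= max_depth:
        raise IncludeLoopError("include depth limit reached at " + url)
    chain = active + (url,)
    # Replace each marker with the expansion of the URL it names.
    return INCLUDE_RE.sub(
        lambda m: expand(m.group(1), fetch, chain, max_depth), fetch(url)
    )
```

Note that this only catches loops inside a single rendering pass. If the include is fetched over HTTP from the same wiki, each fetch is a new request that knows nothing about the chain, so the chain (or a depth counter) would have to travel with the request, which is a further assumption this sketch does not cover.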

Related proposals: ConfigurableURLIncludes

Examples

Impact

%WHATDOESITAFFECT%

Implementation

GuardAgainstCopyrightInfringement

This covers the Foswiki core as well as plugins such as ImagePlugin that allow images to be embedded in the wiki by rendering img-src markup. A copyright violation occurs both when img-src deep-links to third-party material and when external material is mirrored and hosted locally. The same holds for iframing or INCLUDE-ing.

On the other hand, Foswiki, when used in the open, should provide the means to prevent others from deep-linking content that is owned by Foswiki authors and covered by their copyrights. Foswiki should be able to return an HTTP 403 Forbidden status code if an image is referenced from an external site.
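
A rough sketch of the referrer check this implies, in Python. The function name hotlink_status and its parameters are hypothetical. In practice this kind of check is often done in the web server configuration rather than in application code.

```python
from urllib.parse import urlsplit

def hotlink_status(referer, own_hosts, allow_empty=True):
    """Return the HTTP status for an image request: 200 or 403.

    referer is the value of the Referer request header (or None),
    own_hosts is the set of host names that may embed our images.
    """
    if not referer:
        # Many browsers and proxies omit the header; policy decision.
        return 200 if allow_empty else 403
    host = (urlsplit(referer).hostname or "").lower()
    return 200 if host in own_hosts else 403
```

The Referer header can be missing or forged, so this discourages casual deep linking rather than preventing it outright.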

Discussion below from another topic:

Removed the ImagePlugin task from the list of pending security work as it is clearly not security related. Mirroring external images is required to produce thumbnails. Put in other words: there's no way not to download an external image yet still want to compute a thumbnail for it.

-- MichaelDaum - 11 May 2014

I guess you are right, it isn't really a security issue in the traditional sense. However, I don't agree that it isn't an issue that should be addressed. I was finding our website hosting spam / drug ad images. Someone inserts a link to it, and we helpfully mirror it locally and create thumbnails. It took me quite a while to track down that they were being uploaded and attached by a plugin: ImagePlugin. It's not just a thumbnail; the entire image is mirrored, without any regard to content or copyright. It exposes the site to possible takedown requests, and in some cases IP blocking or even loss of domain. I consider wholesale copyright violation a significant security risk. Creating thumbnails might be considered fair use. Mirroring the full resolution images is not.

It's fine that this plugin does this, but it certainly should NOT be the default behavior. The admin should need to decide that mirroring is acceptable for their individual circumstances, not just have it "happen" when the plugin is installed. I also think it's safer if external mirroring were an "opt-in" list instead of an exclusion list.

-- GeorgeClark - 11 May 2014

I have to agree with George here; the potential for a copyright violation takedown request, or worse, is significant. Whether they are thumbnails or not, copying and hosting copyright images is unacceptable.

-- CrawfordCurrie - 12 May 2014

So what do you think the following code should do:
%IMAGE{"http://link-to-copyright-protected-image.png" size="200"}%

For now, it downloads the image, generates a thumbnail and then inserts an HTML img tag.

Compare this to the way Foswiki deals with any link to an external image on its own.

The result is the same for both in that they display an HTML img tag on the page.

Well there is a difference:

<img src="%ATTACHURLPATH%/copyright-protected-image.png" width="200" height="200" />

versus

<img src="http://link-to-copyright-protected-image.png"  />

Aren't they both forbidden?

My point here is: it is not ImagePlugin or Foswiki that is violating copyrights: it is the user embedding copyright-protected material on his web page.

The means he is using for these actions don't matter.

-- MichaelDaum - 12 May 2014

There is a subtle difference. The two rendered pages may appear the same, but the latter (which I think is called "deep linking") doesn't actually host the content on the Foswiki server. A "copyright enforcement bot" looking for illegal signatures fetches the image from the authorized source. I guess court rulings on this deep link practice have been mixed.

The %IMAGE tag however is clearly a copyright violation. The offending image is now hosted on the foswiki server. Nothing fuzzy about it.

-- GeorgeClark - 12 May 2014

Read here and here about embedding images from 3rd party sites. There's no legal difference despite common sense considerations about the physical location of image material. What counts is that the image appears to be part of the web page it is visible on, no matter where this image is technically hosted.

Which means: we even have to disable generating img-src code in the core for 3rd party image links. Or better, integrate some of ImagePlugin's features into the core and extend them.

These are:

  1. {AutoAttachExternalImages} attach downloaded images yes/no ... Problem: the image is always downloaded onto the Foswiki server no matter what the user specifies here; if enabled, the corresponding META data is added to the topic; if disabled, the image hangs around in the /pub/ area as is; if the image isn't allowed to be downloaded, rendering thumbnails is not possible and thus should be disabled (not sure if that's what the user expects).
  2. {RenderExternalImageLinks} render img-src links yes/no ... Problem: Foswiki still generates an img-src link no matter what the user specifies here ... if enabled, ImagePlugin does; if disabled, the Foswiki core does.
  3. {Exclude} regex of external links to exclude from being processed ... Problem: there should be an additional {Include} parameter to fine-tune which sites are okay to embed images from.
  4. ImagePlugin does not download, process, or render any img-src when the link is surrounded by <noautolink> markup ... Problem: the Foswiki core still generates an img-src link, so the user has no way to avoid violating copyrights on a per-URL basis inside the wiki text on his own.

My proposal: add (2-4) to the core. Improve ImagePlugin according to (3).

-- MichaelDaum - 12 May 2014

We could probably use a more unified treatment of:
  • INCLUDE URLs
  • inline URLs. (This could block a lot of spam if Foswiki refused to render http links.)
  • inline img src links
  • IMAGE macros for external images
  • redirect URLs

Are there any others?

Regarding IMAGE, rendering only a low-resolution, size-restricted image is probably okay... how else does Google image search get away with it? That could be done by pulling the image into /tmp and discarding it after generating the thumbnail. Maybe recording enough info so that a refresh could get a 304 (Not Modified) if it hasn't changed.
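
The refresh idea could be sketched with HTTP conditional requests, where the origin server answers 304 Not Modified when the recorded validator still matches. The record layout and both helper names below are assumptions made for illustration (Python); nothing here is existing ImagePlugin behaviour.

```python
def conditional_headers(record):
    """Build request headers from what was recorded at the last fetch."""
    headers = {}
    if record.get("etag"):
        headers["If-None-Match"] = record["etag"]
    if record.get("last_modified"):
        headers["If-Modified-Since"] = record["last_modified"]
    return headers

def after_refresh(status, record, response_headers):
    """Return (regenerate_thumbnail, updated_record) for a refresh."""
    if status == 200:
        # Image changed (or no validators): rebuild and re-record.
        return True, {
            "etag": response_headers.get("ETag"),
            "last_modified": response_headers.get("Last-Modified"),
        }
    # 304 Not Modified, or an error: keep the existing thumbnail.
    return False, record
```

Only the validators and the thumbnail are kept between refreshes; the full-size download would live in a temporary file and be deleted once the thumbnail exists.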

-- GeorgeClark - 12 May 2014

This is what the courts say about Google Image Search: No copyright infringement by Google through display of thumbnails.

There are two contradictory behaviors of image owners on the web:

  1. authors add meta data to images for a search engine to index them accordingly
  2. copyright holders try to prohibit images from being altered or displayed without express permission; computing a thumbnail is a form of editing images and thus forbidden as well

The courts decided by weighing (1) more heavily. However, the reasoning followed yet a different trajectory: there are technical means to prevent deep image linking, i.e. by analyzing the referrer address and returning an HTTP 403 error code. As the author didn't take steps in this direction, she lost the case against Google.

... besides, any different verdict would have been a neck-breaker for Google and any other search engine, all of them making money with others' content. Even more so YouTube or Facebook. Facebook is a good example of iframing and deep linking images en masse on everybody's timeline. The big sites somehow are allowed to display others' content for the purpose of making it public. Small sites or even personal wikis are forbidden to deep link images or even host them for the purpose of thumbnailing them. Who said things are fair.

Coming back to Foswiki: all we can do is (a) give users the means to protect their own images (b) let them decide on their own whether or not they want to violate copyrights, at least not putting them in a bad situation without their knowledge. ImagePlugin comes pretty close to it, not so the core.

I've rewritten the chapter about ImagePlugin to be more generic and to cover the case of Foswiki being on either end of a potential copyright infringement.

-- MichaelDaum - 12 May 2014

After a bit of research, I can see how the likes of Google have resolved this problem in the USA, through use of the DMCA (Digital Millennium Copyright Act) Safe Harbor http://www.chillingeffects.org/dmca512/faq.cgi

While this is pretty much reflected in law for all WIPO signatories, the actual takedown process is different for every country. (F***ing politicians/lawyers).

Google has pretty complete terms of service (as you'd expect) that include a process for takedown that is consistent with the DMCA. My reading is that anything that Google links or caches is subject to this process. If a copyright owner feels a link or cache violates their copyright, they can issue a takedown notice. https://support.google.com/legal/troubleshooter/1114905?hl=en

What has not been resolved in law (and probably never will be) is the deep linking question. This is untested in the courts of many countries. Best efforts and US precedent are likely to apply, however, in any determination. Our best bet is to comply with the DMCA Safe Harbor and acknowledge and support takedown notices (whether we ever get any or not).

This requires us to filter outbound URLs. The URL to be filtered has to be specified by the takedown notice. Since the frequency of such notices is likely to be very low, I think all we need to do is to publish a page in our terms of service that states we will take down any content at the copyright owner's request.
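
As a sketch of what filtering outbound URLs against takedown notices might look like (Python; the normalize and filter_outbound helpers and the idea of a plain list of noticed URLs are assumptions):

```python
from urllib.parse import urlsplit, urlunsplit

def normalize(url):
    """Lower-case scheme and host and drop the fragment for comparison."""
    parts = urlsplit(url.strip())
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path or "/", parts.query, ""))

def filter_outbound(urls, takedown):
    """Return the URLs that are not named in any takedown notice."""
    blocked = {normalize(u) for u in takedown}
    return [u for u in urls if normalize(u) not in blocked]
```

Since the list is expected to stay very short, an exact-match set like this is probably enough; pattern matching could be added if notices ever name whole sites.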

-- CrawfordCurrie - 13 May 2014

Regarding takedown notices: in Germany, there is no such process. The usual way in which unscrupulous rights holders go about things is (keeping in mind I'm not actually a lawyer, so don't take this as gospel) to have a lawyer send a formal letter called "Abmahnung". It basically contains a form for you to sign that has you promise you won't ever infringe on the rights holder's IP again (in turn the rights holder basically promises to not sue you), and a sizable bill from the lawyer for taking the five minutes to send that thing. I hear it's a huge pain dealing with these, especially if you're an individual without the resources to get into a legal fight. Consequently there's a whole industry around Abmahnungen – allegedly some lawyers make their money doing nothing but that.

So, relying exclusively on the protective power of takedown notices may not be everyone's cup of tea. I believe that whether there's any such thing as Safe Harbor is still up for legal debate in Germany. This means that further work on preventing accidental hosting of external images will probably be appreciated.

-- JanKrueger - 13 May 2014

Indeed. However my reading (which may well be ill informed) is that the concern lies in the hosting country, in this case the Netherlands, where takedown notices are enshrined in a code of practice (though not in law AFAICT). Yes, further work on hosting external links/images is a good idea; but the least we can do is respond reasonably to takedowns.

http://www.plagiarismtoday.com/2006/05/15/us-vs-europe-notice-and-takedown/ says: Article 14 of the [European Directive on Electronic Commerce] is the one that deals with Web hosts directly. The five paragraph, three-item article simply states that hosts are not liable for copyright infringement on their servers if they are not aware of the infringement and “upon obtaining such knowledge or awareness acts expeditiously to remove or to disable access to the information.”

-- CrawfordCurrie - 13 May 2014
 
Topic revision: r3 - 04 Sep 2017, RandyKramer
The copyright of the content on this website is held by the contributing authors, except where stated elsewhere.