Feature Proposal: Explicitly control the storage location of temporary files used by Foswiki

Motivation

Tasks.Item10408 has exposed an issue that arises when multiple Foswiki installations are hosted on the same server. Temporary files with the same name but different owners collided in /tmp, causing failures.

It is not possible to resolve this using the $ENV{TMPDIR} setting because that setting is ignored by File::Spec and File::Temp when taint checking is enabled.

See also Tasks.Item9233 for Windows temporary file issues.

Description and Documentation

Foswiki has a deprecated, hidden temporary file location, $Foswiki::cfg{TempfileDir}, which is documented as retained for possible use by plugins.

This proposal is to:

  • Define {TempfileDir} in Foswiki.spec as an expert parameter. Default $Foswiki::cfg{WorkingDir}/tmp (no change from current default)
  • Update any modules using File::Temp and File::Spec to use a configurable directory.
  • Use a separate temporary root for each use of temporary files. These would be implemented as constants in their respective modules rather than by adding more config variables. For example, {TempfileDir}/sessions for CGI session files created by LoginManager, {TempfileDir}/meta for files created during attachment handling, etc.
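
As a rough illustration of the third point, the sketch below keeps the per-use subdirectory names as module constants and passes the resulting directory explicitly to File::Temp. The names META_SUBDIR, SESSION_SUBDIR and temp_root() are hypothetical, and a scratch directory stands in for $Foswiki::cfg{TempfileDir}; none of this is existing Foswiki code.

```perl
use strict;
use warnings;
use File::Spec;
use File::Path qw(make_path);
use File::Temp qw(tempfile tempdir);

# Assumption: a scratch directory stands in for $Foswiki::cfg{TempfileDir}.
my $tempfile_dir = tempdir( CLEANUP => 1 );

# Hypothetical per-use roots, kept as constants rather than config variables.
use constant SESSION_SUBDIR => 'sessions';
use constant META_SUBDIR    => 'meta';

# Build (and create if needed) the temporary root for one kind of use.
sub temp_root {
    my ( $base, $subdir ) = @_;
    my $dir = File::Spec->catdir( $base, $subdir );
    make_path($dir) unless -d $dir;
    return $dir;
}

# Attachment handling would ask for its own root and pass it as DIR,
# so File::Temp never has to guess a directory from the environment.
my $meta_root = temp_root( $tempfile_dir, META_SUBDIR );
my ( $fh, $name ) =
  tempfile( 'attachXXXXXXXXXX', DIR => $meta_root, UNLINK => 1 );
print {$fh} "payload\n";
close $fh;
```

Because the directory is always passed in explicitly, the taint-mode behaviour of File::Spec described under Motivation does not come into play in this sketch.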

A search for references to File::Spec and File::Temp suggests that the following modules use temporary files:

  • Foswiki::Store::VC::Handler::mkTmpFilename (This does not appear to be referenced anywhere; is it a dead function? The only places it is referenced are the RCS unit tests and the GitPlugin store.)
  • Foswiki::Sandbox::sysCommand() Cache to capture STDERR
  • Foswiki::Plugins::EmptyPlugin Contains examples of using File::Temp
  • Foswiki::Meta::attach() Temporary storage during attachment processing
  • Foswiki::Configure::Package Stores extension files fetched from the repository
  • Foswiki::Configure::Util Storage for expanded archive files.
  • Foswiki::Cache (temporary file storage is already explicitly set in the configuration. $Foswiki::cfg{Cache}{RootDir} = '$Foswiki::cfg{WorkingDir}/tmp/cache'; )

Examples

Impact

WhatDoesItAffect: %WHATDOESITAFFECT%

Implementation

-- Contributors: GeorgeClark - 25 Feb 2011

Discussion

I'm curious about when to use File::Spec or just blah/blah to maintain portability... for example, $Foswiki::cfg{Cache}{RootDir} = '$Foswiki::cfg{WorkingDir}/tmp/cache'; could be rewritten as File::Spec->catdir($Foswiki::cfg{WorkingDir}, 'tmp', 'cache')
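
A minimal sketch of the File::Spec form, for comparison. The $working_dir value is a made-up placeholder for $Foswiki::cfg{WorkingDir}, not a real default.

```perl
use strict;
use warnings;
use File::Spec;

# Assumption: placeholder standing in for $Foswiki::cfg{WorkingDir}.
my $working_dir = File::Spec->catdir( 'var', 'foswiki', 'working' );

# catdir joins the components with whatever separator the platform uses,
# instead of hard-coding '/'.
my $cache_root = File::Spec->catdir( $working_dir, 'tmp', 'cache' );
print "$cache_root\n";
```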

-- PaulHarvey - 26 Feb 2011

we should always use File::Spec - that way, the code is more likely to work with fewer modifications on platforms we're not using.. like, mmm, someone had tmwiki running on VMS once, and another on some s/360 etc - and who knows what will happen in future...

OK, additionally, if you can figure out how to get rcs to use the configured dir rather than /tmp, quite a few admins will thank you, as that has made them cry a few times.

-- SvenDowideit - 26 Feb 2011

I'd forgotten about rcs, but according to some man pages I've found searching around:

Temporary files are created in the directory containing the working file, and also in the temporary directory (see TMPDIR under ENVIRONMENT). ...

TMPDIR: Name of the temporary directory. If not set, the environment variables TMP and TEMP are inspected instead and the first value found is taken; if none of them are set, a host-dependent default is used, typically /tmp.

Does this not work? I pulled down the rcs source to take a look. Could we set TMPDIR prior to invoking RCS commands and make sure the sandbox passes it through into the environment?

       if (!s
                &&  !(s = cgetenv("TMPDIR"))    /* Unix tradition */
                &&  !(s = cgetenv("TMP"))       /* DOS tradition */
                &&  !(s = cgetenv("TEMP"))      /* another DOS tradition */
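
One way this might look from the Perl side is sketched below: TMPDIR is set with local just for the duration of the external command, so the rest of the request sees the original environment. The helper name run_with_tmpdir() is hypothetical, a scratch directory stands in for the configured location, and a child perl process stands in for an rcs command; whether Foswiki::Sandbox would pass the variable through unchanged is not verified here.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: a scratch directory stands in for the configured temp location.
my $request_tmp = tempdir( CLEANUP => 1 );

# Hypothetical helper: run a command with TMPDIR pointing at $tmpdir.
sub run_with_tmpdir {
    my ( $tmpdir, @cmd ) = @_;

    # 'local' restores the previous value (or absence) when the sub returns.
    local $ENV{TMPDIR} = $tmpdir;

    # List-form pipe open avoids the shell; the child inherits %ENV.
    open( my $pipe, '-|', @cmd ) or die "cannot run $cmd[0]: $!";
    my $out = do { local $/; <$pipe> };
    close $pipe;
    return $out;
}

# A child perl process reports the TMPDIR it was given.
my $seen = run_with_tmpdir( $request_tmp, $^X, '-e', 'print $ENV{TMPDIR}' );
```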

-- GeorgeClark - 26 Feb 2011

/tmp (and by implication File::Spec->tmpdir()) is usually defined as being "a scratch area which you can use to hold files and directories for short periods of time" and "cleared whenever the system is 'booted up' and by the system administrator when the directory gets full". Most of us regard /tmp as a relatively small, server-specific, transitional partition that can be cleared as and when we feel like it. Because /tmp is local, we tend to regard it as "fast".

Does the way we use working/tmp (ignoring configure) correspond to this view?
  • Originally working/tmp started out as a home for session files (which is how it got the name tmp, because these files were moved there from /tmp). Session files are not true temp files, because they (can) persist well beyond the end of process activation / request handling.
    • Killing session files arbitrarily would (1) force users to log in again and (2) cause loss of session variables.
  • It also serves as the home of the ip2sid map - used on very, very few installs, I suspect, but a persistent file and definitely not /tmp material.
    • Killing ip2sid arbitrarily would require connected users to log in again.
  • Next came passthru files which were closer to "true" tmp files, in that they have a strictly defined life cycle.
    • Killing passthru files could break requests, especially authentication.
So what we have is close to /tmp but not quite the same; it's not really a scratch area, it's a managed storage area.

So, what about other uses of /tmp? George captured them:
  1. Foswiki::Store::VC::Handler::mkTmpFilename is used on Windows only, IIRC, for very short-lived files created during checkin
  2. Foswiki::Sandbox::sysCommand() Cache to capture STDERR
  3. Foswiki::Plugins::EmptyPlugin Contains examples of using File::Temp
  4. Foswiki::Meta::attach() Temporary storage during attachment processing
  5. Foswiki::Cache (temporary file storage is already explicitly set in the configuration. $Foswiki::cfg{Cache}{RootDir} = '$Foswiki::cfg{WorkingDir}/tmp/cache'; )
1, 2 and 4 are temp files held only for the duration of a single request - true temp files that can be purged almost as soon as they are closed. Arbitrary deletion isn't going to do them any favours. But they are all server-local and need to be fast (which is why they were left in /tmp). 3 and 5 I'm not so sure about, but in general:
  • Foswiki uses /tmp as fast, local, request-specific store. Files created there are not expected to live beyond the end of a request, and are specific to a single request.
  • working/tmp, on the other hand, is for longer-lived, Foswiki-managed files that are expected to persist over many requests.
Can these two file types coexist in a single directory? I'm not so sure. If we need to provide a cushion for /tmp then I'd prefer (subject to someone persuading me otherwise) to add working/request_tmp.

-- CrawfordCurrie - 26 Feb 2011

working/request_tmp sounds fine. I wonder whether renaming working/tmp to session_tmp, at least for new installations, might make sense. I guess I'm guilty of not reading the README in working/tmp, but I had just "assumed" that anything named tmp would be for any temporary file use.

Okay, how about separating temporary files into explicit "life of session" working/session_tmp and "life of request" working/request_tmp directories, and defining them with two expert configuration parameters, {sessionTmp} and {requestTmp}? This way the two classes of transient storage are documented in the configuration, and can be modified to accommodate requirements on shared hosting or other unusual installations.
  • In Foswiki.pm,
    • if sessionTmp is undefined, default to working/tmp. Upgraded sites then would not have any loss of session data.
    • if requestTmp is undefined, guess per the current rules, such as using File::Spec->tmpdir().
  • In configure
    • If sessionTmp is undefined, the checker determines whether working/tmp exists and contains anything other than the README. If yes, set it to working/tmp; otherwise create the working/session_tmp directory and use that for session files.
    • If requestTmp is undefined, the checker can guess using the File::Spec->tmpdir() setting, or as appropriate for the platform. This way there is no significant change for simple installations. Sites with multiple Foswiki installations under different users, or with other unusual requirements, can set the expert parameter.
The modules identified above would then use the configured sessionTmp or requestTmp explicitly in all temp file requests (and set it into the environment for rcs). Document the use of requestTmp in EmptyPlugin.pm.
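
The Foswiki.pm fallback rules above could be sketched roughly as follows. This is only an illustration under assumptions: resolve_tmp_dirs() is a hypothetical helper, and a plain hash stands in for %Foswiki::cfg with the proposed (not yet existing) sessionTmp and requestTmp keys.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: apply the proposed defaults to a config hash.
sub resolve_tmp_dirs {
    my ($cfg) = @_;

    # Undefined sessionTmp falls back to working/tmp, so upgraded sites
    # keep finding their existing session files.
    my $session = $cfg->{sessionTmp}
      // File::Spec->catdir( $cfg->{WorkingDir}, 'tmp' );

    # Undefined requestTmp falls back to the platform guess.
    my $request = $cfg->{requestTmp} // File::Spec->tmpdir();

    return ( $session, $request );
}

# Assumption: made-up paths, with only requestTmp set by the admin.
my %cfg = (
    WorkingDir => File::Spec->catdir( 'foswiki', 'working' ),
    requestTmp => File::Spec->catdir( 'foswiki', 'working', 'request_tmp' ),
);
my ( $session_tmp, $request_tmp ) = resolve_tmp_dirs( \%cfg );
print "session: $session_tmp\nrequest: $request_tmp\n";
```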

-- GeorgeClark - 27 Feb 2011

The title of this topic implies that explicit control of the temp directory is the best solution. Is there anything inherently wrong with using /tmp if the file name is sufficiently unique to avoid conflicts between installs? One solution is to add something like the current time to the name, as I suggested in Tasks.Item10408. Is there some other benefit that would require a more substantial fix?

-- RaymondLutz - 10 Mar 2011

The cleanup scripts might well be simplified by putting the various types of temp files in separate directories.

Temporary files are a tricky business - intuitive approaches such as using PIDs or adding times have multiple dangerous failure modes. Use File::Temp; use file handles (and never temp file names). See http://perldoc.perl.org/File/Temp.html (read the whole thing, especially the warnings), the Security::Temporary Files section of the camel book, and open( .., '+>', undef) for some basic information.
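
A short sketch of the two handle-based approaches mentioned above, using only core Perl and File::Temp. The variable names are illustrative.

```perl
use strict;
use warnings;
use File::Temp ();

# Anonymous temporary file: opened read/write with no visible name,
# so there is no file name for another process to guess or race on.
open( my $anon, '+>', undef )
  or die "cannot create anonymous temp file: $!";
print {$anon} "scratch data\n";
seek( $anon, 0, 0 );    # rewind before reading back
my $line = <$anon>;

# File::Temp object: behaves as a file handle, picks a unique name itself,
# and removes the file when the object goes out of scope.
my $tmp = File::Temp->new();
print {$tmp} "more scratch data\n";
$tmp->flush;
my $name = $tmp->filename;    # available if a name really is needed
```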

Please don't re-invent solutions for uniqueness - you'll have a painful experience, open security holes, rediscover portability issues - and consume time that can be applied to more productive uses.

-- TimotheLitt - 20 Apr 2011

This advice is good; however, there are applications where there is no way to pass through a file handle. I don't see that we have any choice but to pass through a filename, for example when capturing the STDERR / STDOUT from a Sandbox script. The file name is opened in a different thread. It's also not good to leave the file handle open for multiple writers, so we have to close it. If you have suggestions for a portable solution here, they would be appreciated.
    use File::Temp qw(tempfile);

    # Note:  Use of the file handle $fh returned here would be safer than
    # using the file name. But it is less portable, so filename will have to do.
    my ( $fh, $stderrCache ) = tempfile(
        "STDERR.$$.XXXXXXXXXX",
        DIR    => "$Foswiki::cfg{WorkingDir}/tmp",
        UNLINK => 0
    );
    close $fh;

This is the use case that triggered this work. Also, in reply to another comment above: /tmp is a good location, except on Windows and possibly some other platforms. There have been cases where the temporary files all end up in the C:/ root, or, when that location is not writable, Foswiki crashes.

-- GeorgeClark - 20 Apr 2011
Topic revision: 23 Feb 2012, GeorgeClark
 