fork/exec & close file descriptors

Skip Montanaro
Due to presumed bugs in an underlying library over which I have no control,
I'm considering a restart in the wee hours of the morning. The basic
fork/exec dance is not a problem, but how do I discover all the open file
descriptors in the new child process to make sure they get closed? Do I
simply start at fd 3 and call os.close() on everything up to some largish
fd number? Some of these file descriptors will have been opened by stuff
well below the Python level, so I don't know them a priori.

Thx,

Skip

Chris Angelico
On Tue, May 19, 2015 at 10:59 PM, Skip Montanaro
<skip.montanaro at gmail.com> wrote:
> Due to presumed bugs in an underlying library over which I have no control,
> I'm considering a restart in the wee hours of the morning. The basic
> fork/exec dance is not a problem, but how do I discover all the open file
> descriptors in the new child process to make sure they get closed? Do I
> simply start at fd 3 and call os.close() on everything up to some largish fd
> number? Some of these file descriptors will have been opened by stuff well
> below the Python level, so I don't know them a priori.

What Python version are you targeting? Are you aware of PEP 446?

https://www.python.org/dev/peps/pep-0446/

tl;dr: As of Python 3.4, file descriptors created by Python are
non-inheritable by default, so they are closed on exec automatically -
and atomically, where possible.

I'm not sure if there's a 2.7 backport available, though even if there
is, you'll need to manually set your files non-inheritable; AIUI the
default in 2.7 can't be changed for backward compatibility reasons
(the change in 3.4 *will* break code that was relying on automatic
inheritability of FDs - they now have to be explicitly tagged).
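
As a rough illustration of the PEP 446 behaviour described above, the sketch below assumes Python 3.4 or later and a POSIX system. The variable names are made up for the example; os.get_inheritable() and os.set_inheritable() are the calls the PEP added.

```python
import os

# Create a pipe; under PEP 446 (Python 3.4+) both ends start non-inheritable.
read_fd, write_fd = os.pipe()
default_flags = (os.get_inheritable(read_fd), os.get_inheritable(write_fd))

# Explicitly tag the read end as inheritable so it would survive an exec.
os.set_inheritable(read_fd, True)
read_end_inheritable = os.get_inheritable(read_fd)

os.close(read_fd)
os.close(write_fd)
```

This only covers descriptors created through Python; descriptors opened by C/C++ libraries keep whatever flags those libraries set.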

ChrisA



Skip Montanaro
On Tue, May 19, 2015 at 8:33 AM, Chris Angelico <rosuav at gmail.com> wrote:

> What Python version are you targeting? Are you aware of PEP 446?


Yeah, I'm still on 2.7, and am aware of PEP 446. Note that many of the file
descriptors will not have been created by my Python code. They will have
been created by underlying C/C++ libraries, so I can't guarantee which
flags were set on file open.

I'm going to continue to pursue solutions which won't require a restart for
now, but would like to have a sane restart option in my back pocket should
it become necessary.

Thx,

Skip

Chris Angelico
On Tue, May 19, 2015 at 11:44 PM, Skip Montanaro
<skip.montanaro at gmail.com> wrote:

>
> On Tue, May 19, 2015 at 8:33 AM, Chris Angelico <rosuav at gmail.com> wrote:
>>
>> What Python version are you targeting? Are you aware of PEP 446?
>
>
> Yeah, I'm still on 2.7, and am aware of PEP 446. Note that many of the file
> descriptors will not have been created by my Python code. They will have
> been created by underlying C/C++ libraries, so I can't guarantee which flags
> were set on file open.
>
> I'm going to continue to pursue solutions which won't require a restart for
> now, but would like to have a sane restart option in my back pocket should
> it become necessary.

Fair enough. What you MAY be able to do is preempt it by going through
your FDs and setting them all CLOEXEC, but it won't make a lot of
difference compared to just going through them all and closing them
between fork and exec.

On Linux (and possibly some other Unixes), /proc/self/fd may be of
use. Enumerating files in that should tell you about your open files.
How useful that is I don't know, though.
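
A minimal sketch of the "set them all CLOEXEC" idea, assuming a system that exposes open descriptors under /proc/self/fd (or a similar directory passed as fd_dir). The function name and its parameters are hypothetical, chosen for illustration only.

```python
import errno
import fcntl
import os


def set_cloexec_on_open_fds(keep=(0, 1, 2), fd_dir="/proc/self/fd"):
    """Set FD_CLOEXEC on every open descriptor not listed in `keep`.

    Returns the list of descriptors that were flagged.
    """
    flagged = []
    for name in os.listdir(fd_dir):
        fd = int(name)
        if fd in keep:
            continue
        try:
            flags = fcntl.fcntl(fd, fcntl.F_GETFD)
            fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
        except OSError as exc:
            # The descriptor os.listdir() used is listed too, but it is
            # already closed by the time we look at it.
            if exc.errno != errno.EBADF:
                raise
        else:
            flagged.append(fd)
    return flagged
```

Descriptors opened after this call are not covered, so it would have to run just before the exec.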

ChrisA



Skip Montanaro
On Tue, May 19, 2015 at 8:54 AM, Chris Angelico <rosuav at gmail.com> wrote:

> On Linux (and possibly some other Unixes), /proc/self/fd may be of
> use.
>

Good point. Yes, /proc/PID/fd appears to contain all the entries for open
file descriptors (I am on Linux).

Skip

Jon Ribbens-5
In reply to this post by Chris Angelico
On 2015-05-19, Skip Montanaro <skip.montanaro at gmail.com> wrote:
> Yeah, I'm still on 2.7, and am aware of PEP 446. Note that many of the file
> descriptors will not have been created by my Python code. They will have
> been created by underlying C/C++ libraries, so I can't guarantee which
> flags were set on file open.

There is no portable way to do this; the problem is Unix, not Python.
The code below is a reasonable stab at it, but there is no 100%
guaranteed method. The code is untested, but you get the idea.


  import errno
  import os


  def closeall(min=0, max=4096, keep=frozenset()):
      """Close all open file descriptors except for the given exceptions.

      Any file descriptors below `min`, or in the set `keep`, will not be
      closed. Any file descriptors at or above `max` *might* not be closed.
      """
      # First try /proc/$$/fd
      try:
          names = os.listdir("/proc/%d/fd" % os.getpid())
      except OSError as exc:
          if exc.errno != errno.ENOENT:
              raise
          # If /proc was not available, fall back to trying a lot of
          # descriptors.
          candidates = range(min, max)
      else:
          candidates = [int(name) for name in names if name.isdigit()]
      for fd in candidates:
          if fd >= min and fd not in keep:
              try:
                  os.close(fd)
              except OSError as exc:
                  # EBADF: not open (e.g. the fd os.listdir() itself used).
                  if exc.errno != errno.EBADF:
                      raise



Chris Angelico
In reply to this post by Skip Montanaro
On Wed, May 20, 2015 at 12:31 AM, Skip Montanaro
<skip.montanaro at gmail.com> wrote:
> On Tue, May 19, 2015 at 8:54 AM, Chris Angelico <rosuav at gmail.com> wrote:
>>
>> On Linux (and possibly some other Unixes), /proc/self/fd may be of
>> use.
>
>
> Good point. Yes, /proc/PID/fd appears to contain all the entries for open
> file descriptors (I am on Linux).

Yes, and /proc/self is usually a magic symlink to /proc/<your_pid> so
you can just look at /proc/self/fd instead of explicitly calling up
your own PID.

ChrisA



Ethan Furman-2
In reply to this post by Skip Montanaro
On 05/19/2015 05:59 AM, Skip Montanaro wrote:

> Due to presumed bugs in an underlying library over which I have no control, I'm considering a restart in the wee hours of the morning. The basic fork/exec dance is not a problem, but how do I discover
> all the open file descriptors in the new child process to make sure they get closed? Do I simply start at fd 3 and call os.close() on everything up to some largish fd number? Some of these file
> descriptors will have been opened by stuff well below the Python level, so I don't know them a priori.

Pandaemonium [1] (and I believe Ben Finney's daemon [2]) use something akin to the following:

import errno
import os
import resource

def close_open_files(exclude):
    # Try every possible descriptor up to the hard RLIMIT_NOFILE limit.
    max_files = resource.getrlimit(resource.RLIMIT_NOFILE)[1]
    keep = set()
    for file in exclude:
        if isinstance(file, int):
            keep.add(file)
        elif hasattr(file, 'fileno'):
            keep.add(file.fileno())
        else:
            raise ValueError(
                    'files to not close should be either a file descriptor, '
                    'or a file-type object, not %r (%s)' % (type(file), file))
    for fd in range(max_files, -1, -1):
        if fd in keep:
            continue
        try:
            os.close(fd)
        except OSError as exc:
            # EBADF just means this descriptor was not open.
            if exc.errno == errno.EBADF:
                continue
            raise

So, yeah, basically a brute-force method.

--
~Ethan~


[1] https://pypi.python.org/pypi/pandaemonium
[2] https://pypi.python.org/pypi/python-daemon



Gregory Ewing
In reply to this post by Chris Angelico
> On Tue, May 19, 2015 at 8:54 AM, Chris Angelico <rosuav at gmail.com
> <mailto:rosuav at gmail.com>> wrote:
>
>     On Linux (and possibly some other Unixes), /proc/self/fd may be of
>     use.

On MacOSX, /dev/fd seems to be the equivalent of this.

--
Greg



Ian Kelly-2
On Tue, May 19, 2015 at 7:10 PM, Gregory Ewing
<greg.ewing at canterbury.ac.nz> wrote:
>> On Tue, May 19, 2015 at 8:54 AM, Chris Angelico <rosuav at gmail.com
>> <mailto:rosuav at gmail.com>> wrote:
>>
>>     On Linux (and possibly some other Unixes), /proc/self/fd may be of
>>     use.
>
>
> On MacOSX, /dev/fd seems to be the equivalent of this.

Not a perfect equivalent. On Linux, ls -lF /proc/self/fd shows the
contents as symlinks, which is handy since you can just read the links
to see what they're pointing to. On OSX, ls -lF /dev/fd shows three
ttys and two directories.

Though I also note that on my Ubuntu Trusty system, /dev/fd is itself
a symlink to /proc/self/fd.
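
The two directories mentioned above could be combined along these lines. This is only a sketch: list_open_fds and describe_fd are hypothetical helper names, and describe_fd assumes the Linux behaviour where the entries are readable symlinks.

```python
import os


def list_open_fds():
    """Return a sorted list of this process's open file descriptors.

    Tries /proc/self/fd first (Linux), then /dev/fd (OS X and others).
    """
    for fd_dir in ("/proc/self/fd", "/dev/fd"):
        try:
            names = os.listdir(fd_dir)
        except OSError:
            continue
        fds = []
        for name in names:
            try:
                fd = int(name)
                os.fstat(fd)  # raises OSError if fd is not actually open
            except (ValueError, OSError):
                continue
            fds.append(fd)
        return sorted(fds)
    raise RuntimeError("no fd directory available on this platform")


def describe_fd(fd):
    """On Linux, return what `fd` points to by reading its /proc symlink."""
    return os.readlink("/proc/self/fd/%d" % fd)
```

The os.fstat() check filters out the descriptor that os.listdir() itself opened and closed while producing the listing.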



Skip Montanaro
In reply to this post by Skip Montanaro
Reviving (and concluding) a thread I started a couple weeks ago, I asked:

> The basic fork/exec dance is not a problem, but how do I discover
> all the open file descriptors in the new child process to make sure
> they get closed? Do I simply start at fd 3 and call os.close() on
> everything up to some largish fd number?

I wanted this again today (for different reasons than before).
Googling for "python close all file descriptors" returned the os
module docs as the first hit, and lo and behold, what do I see
documented? os.closerange (new in 2.6):

    os.closerange(fd_low, fd_high)
    Close all file descriptors from fd_low (inclusive) to fd_high
    (exclusive), ignoring errors.

Guido's time machine strikes again...

Skip


Marko Rauhamaa
In reply to this post by Skip Montanaro
Skip Montanaro <skip.montanaro at gmail.com>:

>     os.closerange(fd_low, fd_high)
>     Close all file descriptors from fd_low (inclusive) to fd_high
>     (exclusive), ignoring errors.
>
> Guido's time machine strikes again...

The only problem is that you don't know how high you need to go in
general.


Marko


Skip Montanaro
On Tue, Jun 2, 2015 at 10:28 AM, Marko Rauhamaa <marko at pacujo.net> wrote:
>
> The only problem is that you don't know how high you need to go in
> general.

Sure, but I didn't know anyway, so no matter what upper bound I choose
(or what function I choose/implement), it's just going to be a guess.
os.closerange just codifies the straightforward procedure.
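
One way to make that guess, sketched under the assumption that the soft RLIMIT_NOFILE limit is a usable bound. The helper names and the 4096 fallback are invented for the example; descriptors opened before the limit was lowered could still sit above it.

```python
import os
import resource


def guess_fd_upper_bound(fallback=4096):
    """Guess an exclusive upper bound for open descriptor numbers.

    Uses the soft RLIMIT_NOFILE limit; `fallback` is an arbitrary value
    used when that limit is reported as unlimited.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return fallback
    return soft


def close_inherited_fds(first=3, upper=None):
    """Close every descriptor in [first, upper), ignoring errors."""
    if upper is None:
        upper = guess_fd_upper_bound()
    os.closerange(first, upper)
```

This is meant to be called in the child, between fork and exec; calling it in a process that still needs its descriptors would close them too.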

Skip


Alain Ketterlin-2
In reply to this post by Skip Montanaro
Skip Montanaro <skip.montanaro at gmail.com> writes:

> Reviving (and concluding) a thread I started a couple weeks ago, I asked:
>
>> The basic fork/exec dance is not a problem, but how do I discover
>> all the open file descriptors in the new child process to make sure
>> they get closed? Do I simply start at fd 3 and call os.close() on
>> everything up to some largish fd number?
>
> I wanted this again today (for different reasons than before).
> Googling for "python close all file descriptors" returned the os
> module docs as the first hit, and lo and behold, what do I see
> documented? os.closerange (new in 2.6):
>
>     os.closerange(fd_low, fd_high)
>     Close all file descriptors from fd_low (inclusive) to fd_high
>     (exclusive), ignoring errors.

The close(2) manpage has the following warning on my Linux system:

| Not checking the return value of close() is a common but nevertheless
| serious programming error. It is quite possible that errors on a
| previous write(2) operation are first reported at the final close(). Not
| checking the return value when closing the file may lead to silent loss
| of data. This can especially be observed with NFS and with disk quota.
|

(I haven't followed the thread, but if your problem is to make sure fds
are closed on exec, you may be better off using the... close-on-exec
flag. Or simply do the bookkeeping.)

-- Alain.


Marko Rauhamaa
In reply to this post by Marko Rauhamaa
Skip Montanaro <skip.montanaro at gmail.com>:

> On Tue, Jun 2, 2015 at 10:28 AM, Marko Rauhamaa <marko at pacujo.net> wrote:
>>
>> The only problem is that you don't know how high you need to go in
>> general.
>
> Sure, but I didn't know anyway, so no matter what upper bound I choose
> (or what function I choose/implement), it's just going to be a guess.
> os.closerange just codifies the straightforward procedure.

Under Linux, the cleanest way seems to be going through the files under
/proc/self/fd:

    import errno
    import os

    def close_fds(leave_open=(0, 1, 2)):
        for fdn in os.listdir('/proc/self/fd'):
            fd = int(fdn)
            if fd not in leave_open:
                try:
                    os.close(fd)
                except OSError as exc:  # listdir()'s own fd is already closed
                    if exc.errno != errno.EBADF:
                        raise

No need for an upper bound.


Marko


Jon Ribbens-5
On 2015-06-02, Marko Rauhamaa <marko at pacujo.net> wrote:

> Skip Montanaro <skip.montanaro at gmail.com>:
>
>> On Tue, Jun 2, 2015 at 10:28 AM, Marko Rauhamaa <marko at pacujo.net> wrote:
>>>
>>> The only problem is that you don't know how high you need to go in
>>> general.
>>
>> Sure, but I didn't know anyway, so no matter what upper bound I choose
>> (or what function I choose/implement), it's just going to be a guess.
>> os.closerange just codifies the straightforward procedure.
>
> Under linux, the cleanest way seems to be going through the files under
> /proc/self/fd:
>
>     def close_fds(leave_open=[0, 1, 2]):
>         fds = os.listdir(b'/proc/self/fd')
>         for fdn in fds:
>             fd = int(fdn)
>             if fd not in leave_open:
>                 os.close(fd)
>
> No need for a upper bound.

Or use the more generic code that I already posted in this thread:

  def closeall(min=0, max=4096, keep=()):
      """Close all open file descriptors except for the given exceptions.

      Any file descriptors below `min`, or in the set `keep`, will not be
      closed. Any file descriptors at or above `max` *might* not be closed.
      """
      # First try /proc/self/fd
      try:
          names = os.listdir("/proc/self/fd")
      except OSError as exc:
          if exc.errno != errno.ENOENT:
              raise
          # If /proc was not available, fall back to trying a lot of
          # descriptors.
          candidates = range(min, max)
      else:
          candidates = [int(name) for name in names if name.isdigit()]
      for fd in candidates:
          if fd >= min and fd not in keep:
              try:
                  os.close(fd)
              except OSError as exc:
                  # EBADF: not open (e.g. the fd os.listdir() itself used).
                  if exc.errno != errno.EBADF:
                      raise

This function could use os.closerange(), but if the documentation is
correct and it ignores *all* errors and not just EBADF, then it
sounds like os.closerange() should not in fact ever be used for any
purpose.


Marko Rauhamaa
In reply to this post by Alain Ketterlin-2
Alain Ketterlin <alain at universite-de-strasbourg.fr.invalid>:

> The close(2) manpage has the following warning on my Linux system:
>
> | Not checking the return value of close() is a common but
> | nevertheless serious programming error. It is quite possible that
> | errors on a previous write(2) operation are first reported at the
> | final close(). Not checking the return value when closing the file
> | may lead to silent loss of data. This can especially be observed
> | with NFS and with disk quota.
> |
>
> (I haven't followed the thread, but if your problem is to make sure
> fds are closed on exec, you may be better off using the...
> close-on-exec flag. Or simply do the bookkeeping.)

The quoted man page passage is a bit untenable.

First, if close() fails, what's a poor program to do? Try again? How
do you get rid of an obnoxious file descriptor? How would close-on-exec
help? Would exec*() fail?

What if an implicit close() fails on _exit(), will _exit() fail then?
(The man page doesn't allow it.)

The need to close all open file descriptors comes between fork() and
exec*(). The kernel (module) does not see the close() system call unless
the reference count drops to zero. Normally, those function calls
between fork() and exec*() are therefore no-ops.

However, there's no guarantee of that. So the parent process might get
to call close() before the child that is about to call exec*(). Then,
the parent would not get the error that the man page talks about.
Instead, the error goes to the child, which has no reasonable way of
dealing with the situation.

I think having NFS et al postpone their I/O errors till close() is
shifting the blame to the victim.


Marko


Alain Ketterlin-2
Marko Rauhamaa <marko at pacujo.net> writes:

> Alain Ketterlin <alain at universite-de-strasbourg.fr.invalid>:
>
>> The close(2) manpage has the following warning on my Linux system:
>>
>> | Not checking the return value of close() is a common but
>> | nevertheless serious programming error. It is quite possible that
>> | errors on a previous write(2) operation are first reported at the
>> | final close(). Not checking the return value when closing the file
>> | may lead to silent loss of data. This can especially be observed
>> | with NFS and with disk quota.
>> |
>>
>> (I haven't followed the thread, but if your problem is to make sure
>> fds are closed on exec, you may be better off using the...
>> close-on-exec flag. Or simply do the bookkeeping.)
>
> The quoted man page passage is a bit untenable.
>
> First, if close() fails, what's a poor program to do?

Warn the user? Not assume everything went well?  It all depends on the
application, and what the file descriptor represents.

> Try again?

Could be a good idea on NFS or other kind of mounts.

> How do you get rid of an obnoxious file descriptor?

You don't, you check everything before closing the file, with fsync()
for example.
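
A small sketch of that "check before closing" approach, assuming a regular file written through raw descriptors. write_durably is a hypothetical helper name; the idea is that os.fsync() should report deferred write errors while the descriptor is still open.

```python
import os


def write_durably(path, data):
    """Write `data` (bytes) to `path` and fsync it before closing.

    os.fsync() forces pending writes out, so a deferred write error should
    be raised there rather than first appearing at close().
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)
    finally:
        # os.close() raises OSError on failure, so its result is not ignored.
        os.close(fd)
```

What to do when fsync() or close() does raise remains application-specific.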

I've no idea what the OP's program was doing, so I'm not going to split
hairs. I can't imagine why one would like to mass-close an arbitrary set
of file descriptors, and I think APIs like os.closerange() are toxic and
an appeal to sloppy programming.

-- Alain.


Marko Rauhamaa
Alain Ketterlin <alain at universite-de-strasbourg.fr.invalid>:

> Marko Rauhamaa <marko at pacujo.net> writes:
>> First, if close() fails, what's a poor program to do?
>
> Warn the user? Not assume everything went well? It all depends on the
> application, and what the file descriptor represents.

The problem here is in the system call contract, which is broken.
There's no fix. The man page admonition is just hand-waving without
constructive advice.

>> Try again?
> Could be a good idea on NFS or other kind of mounts.

Maybe close() will fail for ever.

> I can't imagine why one would like to mass-close an arbitrary set of
> file descriptors,

That's standard practice before exec'ing a file. Failure to do that can
seriously hurt the parent process. For example, the parent (or child)
will never read an EOF from a file descriptor if its duplicate is open in
an unwitting child process. Also, the number of open files in the system
may grow beyond all limits or simply waste kernel resources.
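
The EOF point can be seen with a plain pipe. The sketch below shows the mirror-image case, where the parent is the reader and has to drop its own duplicate of the write end. The function name is invented for the example.

```python
import os


def read_all_from_child_pipe(payload=b"done"):
    """Fork a writer child and read its output until EOF.

    The parent must close its own copy of the write end; otherwise read()
    would never see EOF, because one writer would still be open.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: keep only the write end, send the payload, and exit.
        try:
            os.close(read_fd)
            os.write(write_fd, payload)
        finally:
            os._exit(0)
    # Parent: drop the duplicate write end so EOF can be delivered.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:  # an empty read means EOF
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)
```

If the os.close(write_fd) line in the parent were removed, the read loop would block forever once the child has exited.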

Close-on-exec is nice, maybe. However, you don't have control over all
file descriptors. Loggers, high-level library calls and others open
files without the application programmer knowing about them or having
direct control over them.

> and I think APIs like os.closerange() are toxic and an appeal to
> sloppy programming.

And you recommend what instead?


Marko


Chris Angelico
In reply to this post by Alain Ketterlin-2
On Wed, Jun 3, 2015 at 7:06 AM, Alain Ketterlin
<alain at universite-de-strasbourg.fr.invalid> wrote:
> I've no idea what the OP's program was doing, so I'm not going to split
> hairs. I can't imagine why one would like to mass-close an arbitrary set
> of file descriptors, and I think APIs like os.closerange() are toxic and
> an appeal to sloppy programming.

When you fork, you get a duplicate referent to every open file in both
parent and child. Closing them all in the child is very common, as it
allows the parent to continue owning those file descriptors (so that
when you close it in the parent, the resource is really closed). One
notable example is with listening sockets; bind/listen in the parent,
then fork (maybe to handle a client), then terminate the parent
process. You now cannot restart the parent without aborting the child,
as the child now owns that listening socket (even if it never wants to
use it). There are some specific ways around this, but not on all OSes
(e.g. Linux only added support for SO_REUSEPORT in kernel 3.9), and the best
way has always been to make sure the children don't hang onto the
listening socket. (There are other good reasons for doing this, too.)
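
A rough sketch of the listening-socket case, assuming a fork-per-client server. serve_one_client_in_child is a hypothetical name and the "hello" reply is just a placeholder for real client handling.

```python
import os
import socket


def serve_one_client_in_child(listener):
    """Accept one connection and hand it to a forked child.

    The child closes its inherited copy of the listening socket right away,
    so only the parent keeps the port bound.
    """
    conn, _addr = listener.accept()
    pid = os.fork()
    if pid == 0:
        try:
            listener.close()  # the child never needs the listening socket
            conn.sendall(b"hello\n")
            conn.close()
        finally:
            os._exit(0)
    conn.close()  # the parent does not talk to the client itself
    return pid
```

With the listener closed in the child, terminating and restarting the parent should not be blocked by a long-running child that still holds the socket.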

ChrisA
