should we close in-memory file-like objects (StringIO, BytesIO, etc.)?

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

should we close in-memory file-like objects (StringIO, BytesIO, etc.)?

Tim Graham-2
Andriy proposed a patch to close objects like StringIO and BytesIO in our test suite [1]. I am not sure how much benefit this gives (frees "a few bytes of RAM" according to [2]) and it seems to add a lot of verbosity. Looking at Python's own test suite, it doesn't appear these objects are closed there (I spot checked BytesIO and StringIO). Any thoughts on this?

[1] https://github.com/django/django/pull/4938
[2] https://github.com/django/django/pull/4248

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/7579ee1c-ecc6-48d1-9002-733810e64754%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: should we close in-memory file-like objects (StringIO, BytesIO, etc.)?

Andriy Sokolovskiy (coldmind)-2
My vision of that was to have clean code (even if these are few bytes).
Python allows to have memory leaks like this, without throwing warnings,
but if we know that we can close it, I think we should do that.

We're usually writing new tests by copying logic from sibling test (if
new test need to do similar thing).
If that test have unclosed stream or file, new test likely also will not
close streams and files, so this old test is a "bad example",
which causing more unclean code.

So, it question about code style than about bytes savings.

On 7/3/15 16:59, Tim Graham wrote:

> Andriy proposed a patch to close objects like StringIO and BytesIO in
> our test suite [1]. I am not sure how much benefit this gives (frees "a
> few bytes of RAM" according to [2]) and it seems to add a lot of
> verbosity. Looking at Python's own test suite, it doesn't appear these
> objects are closed there (I spot checked BytesIO and StringIO). Any
> thoughts on this?
>
> [1] https://github.com/django/django/pull/4938
> [2] https://github.com/django/django/pull/4248
>
> --
> You received this message because you are subscribed to the Google
> Groups "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to [hidden email]
> <mailto:[hidden email]>.
> To post to this group, send email to [hidden email]
> <mailto:[hidden email]>.
> Visit this group at http://groups.google.com/group/django-developers.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/7579ee1c-ecc6-48d1-9002-733810e64754%40googlegroups.com
> <https://groups.google.com/d/msgid/django-developers/7579ee1c-ecc6-48d1-9002-733810e64754%40googlegroups.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/559697A3.2010002%40asokolovskiy.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: should we close in-memory file-like objects (StringIO, BytesIO, etc.)?

Łukasz Rekucki
Hello ,

If you do mean CPython test suite then it's probably not the best
place to look at because the reference counting *is there* and those
object will be closed at the end of test method. Django doesn't have
this comfort when run with PyPy.

That said, I wouldn't recommend using contextlib.closing() all over
the place. StringIO seems to implement the context manager protocol
just fine[1]. And in Python 3.4+ so does urlopen().

[1]: http://bugs.python.org/issue4039


On 3 July 2015 at 16:09, Andriy Sokolovskiy <[hidden email]> wrote:

> My vision of that was to have clean code (even if these are few bytes).
> Python allows to have memory leaks like this, without throwing warnings,
> but if we know that we can close it, I think we should do that.
>
> We're usually writing new tests by copying logic from sibling test (if
> new test need to do similar thing).
> If that test have unclosed stream or file, new test likely also will not
> close streams and files, so this old test is a "bad example",
> which causing more unclean code.
>
> So, it question about code style than about bytes savings.
>
> On 7/3/15 16:59, Tim Graham wrote:
>> Andriy proposed a patch to close objects like StringIO and BytesIO in
>> our test suite [1]. I am not sure how much benefit this gives (frees "a
>> few bytes of RAM" according to [2]) and it seems to add a lot of
>> verbosity. Looking at Python's own test suite, it doesn't appear these
>> objects are closed there (I spot checked BytesIO and StringIO). Any
>> thoughts on this?
>>
>> [1] https://github.com/django/django/pull/4938
>> [2] https://github.com/django/django/pull/4248
>>
>> --
>> You received this message because you are subscribed to the Google
>> Groups "Django developers (Contributions to Django itself)" group.
>> To unsubscribe from this group and stop receiving emails from it, send
>> an email to [hidden email]
>> <mailto:[hidden email]>.
>> To post to this group, send email to [hidden email]
>> <mailto:[hidden email]>.
>> Visit this group at http://groups.google.com/group/django-developers.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/django-developers/7579ee1c-ecc6-48d1-9002-733810e64754%40googlegroups.com
>> <https://groups.google.com/d/msgid/django-developers/7579ee1c-ecc6-48d1-9002-733810e64754%40googlegroups.com?utm_medium=email&utm_source=footer>.
>> For more options, visit https://groups.google.com/d/optout.
>
> --
> You received this message because you are subscribed to the Google Groups "Django developers  (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
> To post to this group, send email to [hidden email].
> Visit this group at http://groups.google.com/group/django-developers.
> To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/559697A3.2010002%40asokolovskiy.com.
> For more options, visit https://groups.google.com/d/optout.



--
Łukasz Rekucki

--
You received this message because you are subscribed to the Google Groups "Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CAEZs-ELtiFRdGEyeXcde8L315tXO-BkcGLPCEiTPFmNMS1BM9g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: should we close in-memory file-like objects (StringIO, BytesIO, etc.)?

Berker Peksağ
In reply to this post by Tim Graham-2
On Fri, Jul 3, 2015 at 4:59 PM, Tim Graham <[hidden email]> wrote:
> Andriy proposed a patch to close objects like StringIO and BytesIO in our
> test suite [1]. I am not sure how much benefit this gives (frees "a few
> bytes of RAM" according to [2]) and it seems to add a lot of verbosity.
> Looking at Python's own test suite, it doesn't appear these objects are
> closed there (I spot checked BytesIO and StringIO). Any thoughts on this?

I agree with you on the StringIO and BytesIO cases, but I agree with
Andriy on the other usages (e.g. ZipFile, GzipFile). Many contributors
follow the existing practices to write new tests, so it would be nice
to use best practices if they don't produce too much noise.

--Berker

--
You received this message because you are subscribed to the Google Groups "Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CAF4280KRmDBLm%2BKMywDHsvajOMNq0th%2BhHrt77LV%2BQStNRx4zQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: should we close in-memory file-like objects (StringIO, BytesIO, etc.)?

Aymeric Augustin
2015-07-03 17:10 GMT+02:00 Berker Peksağ <[hidden email]>:
I agree with you on the StringIO and BytesIO cases, but I agree with
Andriy on the other usages (e.g. ZipFile, GzipFile). Many contributors
follow the existing practices to write new tests, so it would be nice
to use best practices if they don't produce too much noise.

StringIO and BytesIO are different from files and sockets. The former only use
RAM while the latter use file descriptors, which are in much scarcer supply
(perhaps 1024 per process).

Leaking file descriptors can cause the process to crash because it exceed the
systems limit, so it's really bad.

Leaking StringIO or BytesIO doesn't change anything. They'll be closed when
the garbage collector releases them.

That's why I don't believe the analogy holds.

--
Aymeric.

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CANE-7mXog8YE8LMSxwokaMddjDjcK949wUNzEd_NnLtcknWE_A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: should we close in-memory file-like objects (StringIO, BytesIO, etc.)?

Ned Batchelder
On 7/3/15 1:55 PM, Aymeric Augustin wrote:
2015-07-03 17:10 GMT+02:00 Berker Peksağ <[hidden email]>:
I agree with you on the StringIO and BytesIO cases, but I agree with
Andriy on the other usages (e.g. ZipFile, GzipFile). Many contributors
follow the existing practices to write new tests, so it would be nice
to use best practices if they don't produce too much noise.

StringIO and BytesIO are different from files and sockets. The former only use
RAM while the latter use file descriptors, which are in much scarcer supply
(perhaps 1024 per process).

Leaking file descriptors can cause the process to crash because it exceed the
systems limit, so it's really bad.

Leaking StringIO or BytesIO doesn't change anything. They'll be closed when
the garbage collector releases them.

That's why I don't believe the analogy holds.

Many of the StringIO and BytesIO changes in the pull requests won't change anything.  The with-statement extends to the end of the test method, which is when the object would be reclaimed anyway:
@@ -31,10 +32,10 @@ def write(self, data):
             self.bytes_sent += len(data)
 
         # XXX check Content-Length and truncate if too many bytes written?
-        data = BytesIO(data)
-        for chunk in iter(lambda: data.read(MAX_SOCKET_CHUNK_SIZE), b''):
-            self._write(chunk)
-            self._flush()
+        with contextlib.closing(BytesIO(data)) as data:
+            for chunk in iter(lambda: data.read(MAX_SOCKET_CHUNK_SIZE), b''):
+                self._write(chunk)
+                self._flush()
 
     def error_output(self, environ, start_response):
         super(ServerHandler, self).error_output(environ, start_response)

I don't think it makes sense to blindly use contextlib.closing on anything that acts like a file, even when we know it is a harmless StringIO.  Better to have developers think about what is happening, and use the appropriate construct.

When I see " contextlib.closing(BytesIO(data))", I think, this person is following a pattern, and doesn't understand.

--Ned.

--
Aymeric.
--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CANE-7mXog8YE8LMSxwokaMddjDjcK949wUNzEd_NnLtcknWE_A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/5597D718.9090701%40nedbatchelder.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: should we close in-memory file-like objects (StringIO, BytesIO, etc.)?

Tim Graham-2
Thanks for the info and feedback. We'll omit these changes for StringIO and Bytes IO then.

On Saturday, July 4, 2015 at 8:52:52 AM UTC-4, Ned Batchelder wrote:
On 7/3/15 1:55 PM, Aymeric Augustin wrote:
2015-07-03 17:10 GMT+02:00 Berker Peksağ <<a href="javascript:" target="_blank" gdf-obfuscated-mailto="y6EcHRaYmX8J" rel="nofollow" onmousedown="this.href='javascript:';return true;" onclick="this.href='javascript:';return true;">berker...@...>:
I agree with you on the StringIO and BytesIO cases, but I agree with
Andriy on the other usages (e.g. ZipFile, GzipFile). Many contributors
follow the existing practices to write new tests, so it would be nice
to use best practices if they don't produce too much noise.

StringIO and BytesIO are different from files and sockets. The former only use
RAM while the latter use file descriptors, which are in much scarcer supply
(perhaps 1024 per process).

Leaking file descriptors can cause the process to crash because it exceed the
systems limit, so it's really bad.

Leaking StringIO or BytesIO doesn't change anything. They'll be closed when
the garbage collector releases them.

That's why I don't believe the analogy holds.

Many of the StringIO and BytesIO changes in the pull requests won't change anything.  The with-statement extends to the end of the test method, which is when the object would be reclaimed anyway:
@@ -31,10 +32,10 @@ def write(self, data):
             self.bytes_sent += len(data)
 
         # XXX check Content-Length and truncate if too many bytes written?
-        data = BytesIO(data)
-        for chunk in iter(lambda: data.read(MAX_SOCKET_CHUNK_SIZE), b''):
-            self._write(chunk)
-            self._flush()
+        with contextlib.closing(BytesIO(data)) as data:
+            for chunk in iter(lambda: data.read(MAX_SOCKET_CHUNK_SIZE), b''):
+                self._write(chunk)
+                self._flush()
 
     def error_output(self, environ, start_response):
         super(ServerHandler, self).error_output(environ, start_response)

I don't think it makes sense to blindly use contextlib.closing on anything that acts like a file, even when we know it is a harmless StringIO.  Better to have developers think about what is happening, and use the appropriate construct.

When I see " contextlib.closing(BytesIO(data))", I think, this person is following a pattern, and doesn't understand.

--Ned.

--
Aymeric.
--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to <a href="javascript:" target="_blank" gdf-obfuscated-mailto="y6EcHRaYmX8J" rel="nofollow" onmousedown="this.href='javascript:';return true;" onclick="this.href='javascript:';return true;">django-develop...@googlegroups.com.
To post to this group, send email to <a href="javascript:" target="_blank" gdf-obfuscated-mailto="y6EcHRaYmX8J" rel="nofollow" onmousedown="this.href='javascript:';return true;" onclick="this.href='javascript:';return true;">django-d...@googlegroups.com.
Visit this group at <a href="http://groups.google.com/group/django-developers" target="_blank" rel="nofollow" onmousedown="this.href='http://groups.google.com/group/django-developers';return true;" onclick="this.href='http://groups.google.com/group/django-developers';return true;">http://groups.google.com/group/django-developers.
To view this discussion on the web visit <a href="https://groups.google.com/d/msgid/django-developers/CANE-7mXog8YE8LMSxwokaMddjDjcK949wUNzEd_NnLtcknWE_A%40mail.gmail.com?utm_medium=email&amp;utm_source=footer" target="_blank" rel="nofollow" onmousedown="this.href='https://groups.google.com/d/msgid/django-developers/CANE-7mXog8YE8LMSxwokaMddjDjcK949wUNzEd_NnLtcknWE_A%40mail.gmail.com?utm_medium\75email\46utm_source\75footer';return true;" onclick="this.href='https://groups.google.com/d/msgid/django-developers/CANE-7mXog8YE8LMSxwokaMddjDjcK949wUNzEd_NnLtcknWE_A%40mail.gmail.com?utm_medium\75email\46utm_source\75footer';return true;">https://groups.google.com/d/msgid/django-developers/CANE-7mXog8YE8LMSxwokaMddjDjcK949wUNzEd_NnLtcknWE_A%40mail.gmail.com.
For more options, visit <a href="https://groups.google.com/d/optout" target="_blank" rel="nofollow" onmousedown="this.href='https://groups.google.com/d/optout';return true;" onclick="this.href='https://groups.google.com/d/optout';return true;">https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/4290eed6-ea4c-4516-a21e-5f4991ded232%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.