Why does the first loop go wrong with Python3

Cecil Westerhof
I have the following code:
    from __future__     import division, print_function

    import subprocess

    p = subprocess.Popen('ls -l', shell = True, stdout = subprocess.PIPE)
    for line in iter(p.stdout.readline, ''):
        print(line.rstrip().decode('utf-8'))

    p = subprocess.Popen('ls -l', shell = True, stdout = subprocess.PIPE)
    for line in p.stdout.readlines():
        print(line.rstrip().decode('utf-8'))

This works in Python 2. (Both loops give the same output.)

But when I execute this in Python 3, the first loop gets stuck: it
continually prints an empty string. The second loop runs correctly in
Python 3.

In the current case it is not a problem for me, but when the output
becomes big, the second solution will need more memory. How can I get
the first version working in Python 3?

--
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof


Oscar Benjamin-2
On 19 May 2015 at 13:24, Cecil Westerhof <Cecil at decebal.nl> wrote:

> I have the following code:
>     from __future__     import division, print_function
>
>     import subprocess
>
>     p = subprocess.Popen('ls -l', shell = True, stdout = subprocess.PIPE)
>     for line in iter(p.stdout.readline, ''):
>         print(line.rstrip().decode('utf-8'))
>
>     p = subprocess.Popen('ls -l', shell = True, stdout = subprocess.PIPE)
>     for line in p.stdout.readlines():
>         print(line.rstrip().decode('utf-8'))
>
> This works in Python 2. (Both loops give the same output.)
>
> But when I execute this in Python 3, the first loop gets stuck: it
> continually prints an empty string. The second loop runs correctly in
> Python 3.
>
> In the current case it is not a problem for me, but when the output
> becomes big, the second solution will need more memory. How can I get
> the first version working in Python 3?

The problem is that Python 3 carefully distinguishes between the bytes
that come from reading the stdout of a process and the text that must
be decoded from those bytes. You're using iter(callable, sentinel) with
a sentinel value of ''. However, in Python 3 p.stdout.readline()
returns b'' at end of file, which never compares equal to '', so the
loop never stops.

Consider:
$ python3
Python 3.2.3 (default, Feb 27 2014, 21:31:18)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> '' == b''
False

If you change it from '' to b'' it will work.
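
As a minimal sketch of that fix: the child command below is an assumption, a portable stand-in for 'ls -l' so the result is predictable, and the list named lines is only there for illustration.

```python
import subprocess
import sys

# Assumption: a small Python child process stands in for 'ls -l'.
cmd = [sys.executable, '-c', 'print("first"); print("second")']

p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
lines = []
# readline() returns b'' at end of file, so b'' is the sentinel
# that makes iter() stop.
for line in iter(p.stdout.readline, b''):
    lines.append(line.rstrip().decode('utf-8'))
p.stdout.close()
p.wait()

print(lines)
```

The b'' literal is also accepted by Python 2.6 and 2.7, where it is simply a str, so the same loop should run on both versions.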

However, the normal way to do this is to iterate over stdout directly:

    p = subprocess.Popen('ls -l', shell = True, stdout = subprocess.PIPE)
    for line in p.stdout:
        print(line.rstrip().decode('utf-8'))
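
If the decode() call is only there to get text, one possible variation (not something from the original post) is to let subprocess do the decoding by passing universal_newlines=True. Note the assumption: the pipe is then decoded with the locale's preferred encoding, not necessarily UTF-8. The child command is again a portable stand-in for 'ls -l'.

```python
import subprocess
import sys

# Assumption: a small Python child process stands in for 'ls -l'.
cmd = [sys.executable, '-c', 'print("first"); print("second")']

# universal_newlines=True makes p.stdout a text stream, so each line
# is already a str and the sentinel for iter() would be '' again.
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, universal_newlines=True)
text_lines = [line.rstrip() for line in p.stdout]
p.stdout.close()
p.wait()

print(text_lines)
```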


--
Oscar


Thomas Rachel-3
On 19 May 2015 at 15:16, Oscar Benjamin wrote:

> However the normal way to do this is to iterate over stdout directly:

That depends. There may be differences when it comes to buffering, etc.


Thomas


Cecil Westerhof
On Tuesday 19 May 2015 15:16 CEST, Oscar Benjamin wrote:

> On 19 May 2015 at 13:24, Cecil Westerhof <Cecil at decebal.nl> wrote:
>> I have the following code:
>>     from __future__     import division, print_function
>>
>>     import subprocess
>>
>>     p = subprocess.Popen('ls -l', shell = True, stdout = subprocess.PIPE)
>>     for line in iter(p.stdout.readline, ''):
>>         print(line.rstrip().decode('utf-8'))
>>
>>     p = subprocess.Popen('ls -l', shell = True, stdout = subprocess.PIPE)
>>     for line in p.stdout.readlines():
>>         print(line.rstrip().decode('utf-8'))
>>
>> This works in Python 2. (Both loops give the same output.)
>>
>> But when I execute this in Python 3, the first loop gets stuck: it
>> continually prints an empty string. The second loop runs correctly
>> in Python 3.
>>
>> In the current case it is not a problem for me, but when the output
>> becomes big, the second solution will need more memory. How can I
>> get the first version working in Python 3?
>
> The problem is that Python 3 carefully distinguishes between the
> bytes that come from reading the stdout of a process and the text
> that must be decoded from those bytes. You're using
> iter(callable, sentinel) with a sentinel value of ''. However, in
> Python 3 p.stdout.readline() returns b'' at end of file, which never
> compares equal to '', so the loop never stops.
>
> Consider:
> $ python3
> Python 3.2.3 (default, Feb 27 2014, 21:31:18)
> [GCC 4.6.3] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> '' == b''
> False
>
> If you change it from '' to b'' it will work.
>
> However, the normal way to do this is to iterate over stdout
> directly:
>
>     p = subprocess.Popen('ls -l', shell = True, stdout = subprocess.PIPE)
>     for line in p.stdout:
>         print(line.rstrip().decode('utf-8'))

Works like a charm.

I looked at the documentation. Is it necessary to do a:
    p.wait()
afterwards?

--
Cecil Westerhof


Ian Kelly-2
On Tue, May 19, 2015 at 8:44 AM, Cecil Westerhof <Cecil at decebal.nl> wrote:
> I looked at the documentation. Is it necessary to do a:
>     p.wait()
> afterwards?

It's good practice to clean up zombie processes by waiting on them,
but they will also get cleaned up when your script exits.
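
One way to make the wait hard to forget, sketched here under the assumption that the script only needs to run on Python 3.2 or later, is to use Popen as a context manager: leaving the with block closes the pipes and waits for the child. The child command is a portable stand-in for 'ls -l'.

```python
import subprocess
import sys

# Assumption: a small Python child process stands in for 'ls -l'.
cmd = [sys.executable, '-c', 'print("hello")']

# On leaving the with block, Popen closes its pipes and waits for
# the child, so no zombie is left behind.
with subprocess.Popen(cmd, stdout=subprocess.PIPE) as p:
    output = [line.rstrip().decode('utf-8') for line in p.stdout]

# After the wait, the exit status is available on the object.
returncode = p.returncode
print(output, returncode)
```

On Python 2 the same effect needs an explicit p.wait() after the loop.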


Cecil Westerhof
On Tuesday 19 May 2015 17:49 CEST, Ian Kelly wrote:

> On Tue, May 19, 2015 at 8:44 AM, Cecil Westerhof <Cecil at decebal.nl> wrote:
>> I looked at the documentation. Is it necessary to do a:
>>     p.wait()
>> afterwards?
>
> It's good practice to clean up zombie processes by waiting on them,
> but they will also get cleaned up when your script exits.

You are right. I played a little with ipython3, which made finding
things out a lot easier. ;-)

In my case it is a script that terminates very soon after it is
finished with p, but it is certainly good practice to do it myself.

I always called free() in my C programming days. I was always told it
was not necessary, but I found it better to do it anyway.


By the way, what also works is:
    p = None

But it was just a try in ipython3. I would never do this in real code.
I was just curious if this would be handled correctly and it is. :-)

--
Cecil Westerhof


Chris Angelico
On Wed, May 20, 2015 at 2:39 AM, Cecil Westerhof <Cecil at decebal.nl> wrote:
> By the way, what also works is:
>     p = None
>
> But it was just a try in ipython3. I would never do this in real code.
> I was just curious if this would be handled correctly and it is. :-)

That _may_ work, but it depends on there not being any other
references to it, and also depends on it being garbage-collected
promptly. Neither is guaranteed. Explicitly wait()ing on it is a
guarantee.

Simply dropping the object is a good way to "probably dispose" of
something that you don't care about. For instance, you asynchronously
invoke VLC to play some audio alert, and the subprocess might finish
before you're done with the current function, or might finish after.
You don't really care about its actual termination, and certainly
don't want to delay anything waiting for it, but you do want to clean
up resources at some point. Dropping the object (keeping no references
to it, returning from the function you called it from, unsetting any
references you have, whatever makes sense) will normally mean that it
gets garbage collected and cleaned up _at some point_, without really
guaranteeing exactly when; for instance, if you have an alert like
this once per hour and watch 'top' for the number of zombies, you'll
probably see some now and then, but they'll never get to apocalyptic
numbers.
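
If you would rather not leave even that to the garbage collector, a middle road is to keep the objects and poll() them now and then, which never blocks. This is only a sketch: the names running, start_alert and reap_finished are made up for illustration, and a trivial Python child stands in for the audio player.

```python
import subprocess
import sys
import time

running = []  # hypothetical bookkeeping list of live Popen objects


def start_alert():
    # Assumption: a do-nothing Python child stands in for the player.
    p = subprocess.Popen([sys.executable, '-c', 'pass'])
    running.append(p)


def reap_finished():
    # poll() returns None while the child is still running and its
    # exit code once it has finished; a finished child is reaped by
    # the call, so it no longer lingers as a zombie.
    running[:] = [p for p in running if p.poll() is None]


start_alert()
start_alert()

# Stand-in for the program's main loop: reap occasionally, never block
# on any single child. The deadline only keeps this demo bounded.
deadline = time.time() + 30
while running and time.time() < deadline:
    reap_finished()
    time.sleep(0.05)

print(len(running))
```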

ChrisA