I considered using the .NET functions for I/O and byte arrays and
decided against it. The reason: the module I am talking about is used
for everyday programming, often in a quick-and-dirty manner.
The data consist of 1-byte characters and are mostly text, but with
binary sections mixed in. The module's users are used to treating
characters as having no special meaning. Umlauts, encodings, and the
like do not need to be considered; they are interpreted elsewhere, and
it is only expected that these characters are *preserved*, i.e. simply
not changed.
On the other hand, we need string functions like slicing, find, replace,
startswith and the like for interpreting the relevant parts. Byte
arrays do not fit well there, and even worse would be an environment of
mixed bytes and unicode with frequent conversions. Unwanted encoding and
decoding problems would probably show up. It would be somewhat confusing
and it would require too technical a view in a context focused on
At the moment, I can only hope that no information is lost during the
implicit conversions of file I/O. That is, I hope these conversions
simply add a 0-byte to every byte read in to make it unicode, and simply
discard that 0-byte when writing to a file.
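
The assumption stated above can be checked in isolation. The following is only a sketch, written for a current CPython interpreter and not tested under IronPython: mapping each byte to the code point with the same value is what the latin-1 codec does, and in UTF-16 that amounts to pairing every byte with a 0-byte. The helper names are hypothetical.

```python
def bytes_to_text(raw):
    """Map each byte to the Unicode code point with the same value."""
    return raw.decode('latin-1')


def text_to_bytes(text):
    """Inverse mapping; raises UnicodeEncodeError for code points above 255."""
    return text.encode('latin-1')


# Every possible byte value, so binary sections are covered too.
all_bytes = bytes(bytearray(range(256)))
text = bytes_to_text(all_bytes)

# The round trip gives back exactly the bytes we started with.
roundtrip_ok = text_to_bytes(text) == all_bytes

# Each character has the same numeric value as the byte it came from.
same_values = [ord(c) for c in text] == list(range(256))

# In UTF-16 (little endian) every second byte of this text is a 0-byte.
utf16 = text.encode('utf-16-le')
high_bytes_are_zero = utf16[1::2] == b'\x00' * 256
low_bytes_are_original = utf16[0::2] == all_bytes
```

Whether the implicit file conversions in a given runtime behave this way still has to be verified there; the sketch only shows that such a mapping loses nothing.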
On Wed, Jun 1, 2011 at 2:50 PM, Peter Schwalm <[hidden email]> wrote:
> In the moment, I can only hope that no information is lost during the
> implicit conversions of file i-o. That means I hope these conversion simply
> add a 0-byte to every byte read in to make it unicode and only discards the
> 0-byte when writing to a file.
I'm 99% sure that's how it works.
One other option that I forgot about is the 2.7 io module:
import io
f = io.open('data.txt', 'rb')  # binary mode: no decoding on read
b = f.read()
f.close()
Passing the 'b' flag to io.open makes read() return bytes instead of
strings, which should give you what you want.
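
For what it's worth, here is a sketch of how a binary-mode read can still be combined with the string operations mentioned earlier in the thread. It assumes CPython's `bytes` type, which offers startswith, find, replace and slicing; this was not verified under IronPython. The sample data and the temporary file are made up for illustration.

```python
import io
import os
import tempfile

# Hypothetical sample: a text header with a Latin-1 umlaut,
# a short binary section, and a text trailer.
sample = b'HEADER:\xfcber\n\x00\x01\xff\nEND\n'

path = os.path.join(tempfile.mkdtemp(), 'data.txt')
with io.open(path, 'wb') as f:
    f.write(sample)

# Read the file back as raw bytes; nothing is decoded.
with io.open(path, 'rb') as f:
    b = f.read()

# The usual string-style operations, with bytes arguments.
is_header = b.startswith(b'HEADER:')
end_pos = b.find(b'END')
first_line = b[:b.find(b'\n')]
patched = b.replace(b'END', b'EOF')
```

The umlaut and the binary section pass through untouched, since no encoding or decoding step is involved at any point.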