encoder performance

skunkwerk
I'm using the pyamf.amf3.encode function on a rather large list of lists (tens of thousands of lists, each with 3 numbers in it).
It takes over a minute to encode on my machine, which is fairly fast.
  1. Is there any way to optimize this?
  2. If I wrote an encoder in C++, would it be much faster?
I was looking at the API, specifically the source of the encoder's writeList function. It didn't seem complicated, though I'm not sure how it works.
I don't feel it should take that long if all it needs to do is iterate through the lists and append the values to a string, with certain AMF-specific array delimiters separating the values.

thanks again


_______________________________________________
PyAMF users mailing list - [hidden email]
http://lists.pyamf.org/mailman/listinfo/users

Re: encoder performance

Arnar Birgisson
On Tue, Jun 24, 2008 at 05:40, skunkwerk <[hidden email]> wrote:
> is there any way to optimize this?

Probably :) I think Nick works on optimizations regularly, but I
believe the code is already rather fast. How big is the data in
kilobytes?

> If I wrote an encoder in C++ would it be much faster?

Yes, probably. That's the motivation behind cPyAMF, http://pyamf.org/ticket/225.

> I don't feel it should take that long, if all it needs to do is iterate
> through the lists and append the values to a string with certain
> AMF-specific array delimiters separating values.

It definitely needs to do a little more than that. For one thing, AMF
uses object references (and lists are objects in the AMF sense) so
that the same object is never encoded twice. Since you are writing a
large number of lists, this reference management can take some toll.
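
To make the reference-management cost concrete, here is a toy sketch of the general idea: an encoder remembers every object it has written and emits a short back-reference when it meets the same object again. The names ReferenceTable and encode_rows are made up for illustration, and this is not a description of how PyAMF itself is written. The point is only that every list written costs at least one lookup and one insert in such a table.

```python
# Illustrative only: a toy reference table, not PyAMF's actual code.

class ReferenceTable:
    """Remembers objects already written so repeats can be sent as an index."""

    def __init__(self):
        self._index_by_id = {}   # id(obj) -> position in the reference table
        self._objects = []       # keeps objects alive so id() values stay unique

    def lookup_or_add(self, obj):
        """Return the existing reference index, or None after registering obj."""
        key = id(obj)
        if key in self._index_by_id:
            return self._index_by_id[key]
        self._index_by_id[key] = len(self._objects)
        self._objects.append(obj)
        return None


def encode_rows(rows):
    """Produce a ('ref', n) or ('list', values) token for each row."""
    table = ReferenceTable()
    tokens = []
    for row in rows:
        ref = table.lookup_or_add(row)
        if ref is not None:
            tokens.append(("ref", ref))           # same object seen before
        else:
            tokens.append(("list", tuple(row)))   # first time: write it in full
    return tokens


shared = [1, 2, 3]
tokens = encode_rows([shared, [4, 5, 6], shared])
print(tokens)
```

Note that the table is keyed on object identity, not equality: two separate lists with the same values are both written in full. With tens of thousands of distinct small lists, none of them is ever reused, so the bookkeeping is pure overhead for that kind of data.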

As a workaround, until cPyAMF, since I gather your lists are fairly
structured, you could try custom-encoding them to a ByteArray, which
can even be zlib-compressed as it goes over the wire.

Arnar

Re: encoder performance

Thijs Triemstra
In reply to this post by skunkwerk
Hi,

akaihola did some profiling tests a while ago [1] (around v0.2), and with that data [2] it was pretty easy to find and eliminate the performance bottlenecks. You could give those tools a try and open a ticket if there are any obvious spots in the code that need improvement. Also, have you tried other versions of PyAMF and compared their performance? As Arnar said, refer to ticket 225 [3] for cPyAMF ideas and progress.
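
For anyone who wants to try profiling without the tools linked above, a minimal sketch using only the standard library's cProfile and pstats modules might look like this. The encode function here is a hypothetical stand-in so the example runs on its own; in practice you would pass in the real call you want to measure (for example the pyamf.amf3.encode call from the original post).

```python
import cProfile
import io
import pstats


def encode(data):
    """Hypothetical stand-in for the real encoder call being measured."""
    return "\n".join(" ".join(str(n) for n in row) for row in data)


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report of hot spots)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(10)  # top 10 by cumulative time
    return result, out.getvalue()


data = [[i, i + 0.5, i * 2] for i in range(20000)]
encoded, report = profile_call(encode, data)
print(report)
```

The functions near the top of the report are where the time goes, which is the sort of detail that makes a performance ticket actionable.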

Cheers,

Thijs



Re: encoder performance

skunkwerk
In reply to this post by Arnar Birgisson
thanks Arnar,
    The data is 500 KB when compressed with zlib, but Flash can unzip and parse it within a second. How is a ByteArray different from a regular AMF3-encoded object? I looked at the ByteArray sample, but wasn't sure how to create a ByteArray from an object. Do I use the __init__ function?

cheers


Re: encoder performance

Arnar Birgisson
Hey there.

ByteArray is basically a type in AMF for raw byte data. What I meant
was that you'd encode the data yourself to such a buffer (see below)
and send that instead of letting AMF encode the list of lists. AMF
lists are very generic; if your data has a simple structure, you may be
able to encode it more efficiently yourself.

On the Python side, you create a ByteArray with:

from pyamf.amf3 import ByteArray

buf = ByteArray()
buf.write(yourdata) # you can call this repeatedly (buf is file-like)
buf.compress()      # to force compression

Then return the buf instance from your gateway method. It will be
compressed and sent, and will appear on the Flash/Flex side as an
instance of the ByteArray AS3 class.

hth,
Arnar


Re: encoder performance

skunkwerk
thanks Arnar,
    I was looking at this page, which claims some pretty remarkable compression using the ByteArray (it compresses to less than one percent of the original):
http://onrails.org/articles/2007/11/27/flash-utils-bytearray-compressing-4-1mb-to-20k
Can anyone confirm this? I've used Python's zlib on text files and have been getting outputs around 15% of the original size, but nowhere near 1%.

When I call buf.write(), do I pass in a string?
If so, what might be the best way to encode a list as a string? I've been using newline characters and spaces to delimit things. Is there a simpler way?
Secondly, I'm hesitant to use my own encoding scheme because then I'd have to spend time in Flash decoding it, whereas its built-in array type is very fast even for a large amount of data. Any thoughts?

thanks,
imran


Re: encoder performance

Arnar Birgisson
Hey there,

On Fri, Jun 27, 2008 at 05:09, skunkwerk <[hidden email]> wrote:
>     i was looking at this page, which claims some pretty remarkable
> compression using the bytearray (compresses to less than one percent of the
> original):
> http://onrails.org/articles/2007/11/27/flash-utils-bytearray-compressing-4-1mb-to-20k
> can anyone confirm this?  i've used python's zlib on text files and have
> been getting outputs 15% of the original, but nowhere near 1%.

ByteArray uses zlib I think. However, compression ratio depends mainly
on the data. If you have a lot of repetitions (try writing the word
"hello" a few thousand times in a text file) zlib might very well be
able to compress it to 1%.
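
The dependence on the data is easy to check with Python's zlib module. The sketch below compares a highly repetitive payload with pseudo-random bytes; the payload sizes and the seed are arbitrary choices for the example, not figures from the article linked above.

```python
import random
import zlib


def compression_ratio(payload):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(payload)) / len(payload)


repetitive = b"hello" * 20000              # 100,000 bytes of one repeated word

rng = random.Random(0)                     # fixed seed so runs are repeatable
noisy = bytes(rng.getrandbits(8) for _ in range(100000))

repetitive_ratio = compression_ratio(repetitive)
noisy_ratio = compression_ratio(noisy)
print("repetitive: %.4f" % repetitive_ratio)
print("random:     %.4f" % noisy_ratio)
```

The repetitive payload shrinks to well under 1% of its size, while the random one barely shrinks at all. Ordinary text files sit somewhere in between, which fits the 15% figure mentioned above.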

> when i call buf.write() do I pass in a string?

Yes

> if so, what might be the best way to encode a list as a string?  i've been
> using newline characters and spaces to delimit things... is there a simpler
> way?

If you want to make the data as small as possible, have a look at the
struct module [1]. I'm afraid I can't help you on the ActionScript
side, perhaps Thijs can pitch in.

[1] http://docs.python.org/lib/module-struct.html
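
As a rough sketch of what a struct-based encoding of the "lists of 3 numbers" could look like: the layout below (a 32-bit row count followed by three 64-bit floats per row, all big-endian) is an assumption made for this example, and pack_rows / unpack_rows are hypothetical helper names. Big-endian is chosen because, as far as I know, that is the default byte order of the AS3 ByteArray class.

```python
import struct

HEADER = struct.Struct("!I")    # row count, unsigned 32-bit, big-endian
ROW = struct.Struct("!3d")      # three 64-bit floats per row, big-endian


def pack_rows(rows):
    """Pack a list of 3-number lists into one byte string."""
    parts = [HEADER.pack(len(rows))]
    for row in rows:
        parts.append(ROW.pack(*row))
    return b"".join(parts)


def unpack_rows(blob):
    """Inverse of pack_rows, mirroring what the receiving side would do."""
    (count,) = HEADER.unpack_from(blob, 0)
    offset = HEADER.size
    rows = []
    for _ in range(count):
        rows.append(list(ROW.unpack_from(blob, offset)))
        offset += ROW.size
    return rows


rows = [[1, 2.5, 3], [4, 5, 6.25]]
blob = pack_rows(rows)
print(len(blob))
```

The packed bytes are what you would hand to buf.write(). On the ActionScript side the same layout should be readable with one readUnsignedInt() call followed by readDouble() calls in a loop, though that half is untested here.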

> secondly, I'm hesitant to use my own coding scheme because then I'd have to
> spend time in Flash decoding it, whereas its built-in array type is very
> fast for even a large amount of data.  any thoughts?

You would definitely decode it to the built-in array type. The
decoding might be slower than AMF though (remember we're trying to
speed up encoding time on the Python side), as AMF is native to the
Flash player.

There's even a chance this might not work at all, I'm just thinking out loud :)

cheers,
Arnar