Fil v0.11.0, a memory profiler for scientists and data scientists

Itamar Turner-Trauring-4
Your code reads some data, processes it, and uses too much memory. In order to reduce memory usage, you need to figure out:

 1. Where peak memory usage is, also known as the high-water mark.
 2. What code was responsible for allocating the memory that was present at that peak moment.

That's exactly what Fil will help you find.

Fil is an open source memory profiler designed for data processing applications written in Python, with native support for Jupyter. It is designed to be high-performance and easy to use.
At the moment it only runs on Linux and macOS.
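
As a rough illustration of the kind of program Fil targets, here is a minimal sketch of a data processing script. The file name `example.py` and the functions `load` and `process` are hypothetical. Assuming the `fil-profile run` command described in the project's documentation, a script like this would be profiled with `fil-profile run example.py`; check the documentation linked below for the exact invocation in your version.

```python
# example.py (hypothetical file name): load data, process it, exit.

def load(n):
    # Stand-in for reading a dataset from disk.
    return list(range(n))

def process(data):
    # The intermediate list is a second full-size allocation, so it
    # adds to peak memory even though it is discarded on return.
    squared = [x * x for x in data]
    return sum(squared)

def main():
    return process(load(1_000_000))

if __name__ == "__main__":
    print(main())
```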

You can learn more about Fil at https://pythonspeed.com/fil or on GitHub at https://github.com/pythonspeed/filprofiler/.

v0.11 includes performance improvements and less intrusive behavior under Jupyter.

Fil vs. other Python memory tools

There are two distinct patterns of Python usage, each with its own source of memory problems.

In a long-running server, memory usage can grow indefinitely due to memory leaks. That is, some memory is not being freed.

 * If the issue is in Python code, tools like `tracemalloc` <https://docs.python.org/3/library/tracemalloc.html> and Pympler <https://pypi.org/project/Pympler/> can tell you which objects are leaking and what is preventing them from being freed.
 * If you're leaking memory in C code, you can use tools like Valgrind <https://valgrind.org/>.

Fil, however, is not aimed at memory leaks, but at the other use case: data processing applications. These applications load in data, process it somehow, and then finish running.

The problem with these applications is that they can, on purpose or by mistake, allocate huge amounts of memory. It might get freed soon after, but if you allocate 16GB of RAM and only have 8GB in your computer, the lack of leaks doesn't help you.
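
The difference between current and peak usage can be seen with a small sketch using the standard library's `tracemalloc.get_traced_memory()`, which returns both numbers. This is not how Fil measures memory; it only illustrates why a program with no leaks can still have a large high-water mark. The `process` function is a hypothetical example.

```python
import tracemalloc

def process():
    # Temporary list of one million integers, freed when the function returns.
    temp = list(range(1_000_000))
    return sum(temp)

tracemalloc.start()
total = process()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Nothing leaked, so current usage is small, yet the peak is tens of megabytes.
print(f"current: {current / 1e6:.1f} MB, peak: {peak / 1e6:.1f} MB")
```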

Fil will therefore tell you, in an easy-to-understand way:

 1. Where peak memory usage is, also known as the high-water mark.
 2. What code was responsible for allocating the memory that was present at that peak moment.
 3. This includes C/Fortran/C++/whatever extensions that don't use Python's memory allocation API (`tracemalloc` only does Python memory APIs).
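
The third point can be illustrated with a minimal sketch, assuming a POSIX system (Linux or macOS) where `ctypes.CDLL(None)` exposes the C library's `malloc` and `free`. Memory allocated directly through `malloc` bypasses Python's allocation API, so `tracemalloc` does not count it, while an equally large `bytearray` is counted.

```python
import ctypes
import tracemalloc

# Assumption: POSIX platform, where CDLL(None) gives access to libc symbols.
libc = ctypes.CDLL(None)
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]

SIZE = 50 * 1024 * 1024  # 50 MiB

tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()

ptr = libc.malloc(SIZE)  # allocated by C code, outside Python's allocator
after_c, _ = tracemalloc.get_traced_memory()
libc.free(ptr)

buf = bytearray(SIZE)  # allocated through Python's memory API
after_py, _ = tracemalloc.get_traced_memory()
del buf
tracemalloc.stop()

c_seen = after_c - before      # bytes tracemalloc attributed to the malloc call
py_seen = after_py - after_c   # bytes tracemalloc attributed to the bytearray
print(f"tracemalloc saw {c_seen} bytes for malloc, {py_seen} bytes for bytearray")
```
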
_______________________________________________
Python-announce-list mailing list -- [hidden email]
To unsubscribe send an email to [hidden email]
https://mail.python.org/mailman3/lists/python-announce-list.python.org/
Member address: [hidden email]