Session file locking not working properly in multithreaded environment

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Session file locking not working properly in multithreaded environment

Ralph Heinkel-2
I have a very simple web-page which makes 20 (parallel) connections to my cherrypy server via <img> tags targetting images which are calculated programmatically on-the-fly when accessed:

<html>
<img src="/CP/getPlot?image_id=i_1"/>
<img src="/CP/getPlot?image_id=i_2"/>
[.. 18 more entries ]
<img src="/CP/getPlot?image_id=i_20"/>
</html>

In some rare, random cases some of the 'getPlot' requests fail  because cherrypy receives an incomplete session object from a session file, i.e. data that cannot be unpickled. In such a case cherrypy creates a new session object. Since authentication is bound to the session, the request then fails due to missing privileges, and hence all followup requests to 'getPlot' fail as well.

When digging in the sessions and lockfile code of cherrypy I added some log statements to see in which order things happen when a session is opened and closed. This way I found that the same lock file was locked simultaneously by different parallel threads. This of course leads to the case where a session was partially written by one thread while another thread started reading from it, causing it to receive incomplete session data, and hence unpickle failed.

The log reveilled what is happening:

UnixLockFile._lock_file called for /fs/local/run/sessions/session-3146212444067902ec35e41a4af0db64bcbacf4b.lock on fileno 32 (thread=139877983168256)
Loaded session with data: ({'csrf_token': 'c8cc04fa-6cec-4dc2-a690-d3d1398f6322', 'login': 'rh'}, datetime.datetime(2019, 5, 10, 19, 29, 26, 591346))
saving session data
: dict_items([('csrf_token', 'c8cc04fa-6cec-4dc2-a690-d3d1398f6322'), ('login', 'rh')])
UnixLockFile.release called for /fs/local/run/sessions/session-3146212444067902ec35e41a4af0db64bcbacf4b.lock (thread=139877983168256)
2019-05-10 11:29:27,141:pid=22163:127.0.0.1 - rh300070 [10/May/2019:11:29:27] "GET /CP/getPlot?image_id=i_1 HTTP/1.1" 200 14708 "http://localhost:8080/CP/static/quant.html"

UnixLockFile._lock_file called for /fs/local/run/sessions/session-3146212444067902ec35e41a4af0db64bcbacf4b.lock on fileno 36 (thread=139877479868160)
Loaded session with data: ({'csrf_token': 'c8cc04fa-6cec-4dc2-a690-d3d1398f6322', 'login': 'rh'}, datetime.datetime(2019, 5, 10, 19, 29, 27, 146440))
saving session data
: dict_items([('csrf_token', 'c8cc04fa-6cec-4dc2-a690-d3d1398f6322'), ('login', 'rh')])
UnixLockFile.release called for /fs/local/run/sessions/session-3146212444067902ec35e41a4af0db64bcbacf4b.lock (thread=139877479868160)
2019-05-10 11:29:27,153:pid=22163:127.0.0.1 - rh300070 [10/May/2019:11:29:27] "GET /CP/getPlot?image_id=i_7 HTTP/1.1" 200 14708 "http://localhost:8080/CP/static/quant.html"

UnixLockFile._lock_file called for /fs/local/run/sessions/session-3146212444067902ec35e41a4af0db64bcbacf4b.lock on fileno 32 (thread=139877454690048)
UnixLockFile._lock_file called for /fs/local/run/sessions/session-3146212444067902ec35e41a4af0db64bcbacf4b.lock on fileno 36 (thread=139877479868160)
[...]

The first two requests looked entirely ok, the lock was acquired, the session was loaded, then after the request the session got saved, and the lock released.

But then suddenly two threads obtain a lock to the same file! How is this possible? I have to mention that this does not happen every time I load the main page, sometimes I have to reload 30 or 40 times until this behavior shows up.

Looking at the code I have no idea how this can happen, the code (it uses fcntl.flock() with correct flags) looks entirely OK to me.

Any help is very much appreciated.

PS: I work under Centos7, Python 3.6.4, cherrypy ~=15.0.0 (but also the latest 18.x.x version of cherrypy shows the same behavior)

--
You received this message because you are subscribed to the Google Groups "cherrypy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at https://groups.google.com/group/cherrypy-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/cherrypy-users/5460092d-e7e7-49a3-974d-d3cdc1c05885%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Session file locking not working properly in multithreaded environment

Tim Roberts
Ralph Heinkel wrote:

The first two requests looked entirely ok, the lock was acquired, the session was loaded, then after the request the session got saved, and the lock released.

But then suddenly two threads obtain a lock to the same file! How is this possible? I have to mention that this does not happen every time I load the main page, sometimes I have to reload 30 or 40 times until this behavior shows up.

Looking at the code I have no idea how this can happen, the code (it uses fcntl.flock() with correct flags) looks entirely OK to me.

Ah, you've stumbled across one of the classic blunders, as they say in Princess Bride.  The Unix "flock" call only blocks simultaneous access between different processes.  It doesn't protect multiple threads within a single process.  A single process can call flock repeatedly on a single file, and all of the calls will succeed.

If you want to protect a resource between threads of a single process, you need to use something other than flock, like a  Lock from the "threading" module.

-- 
Tim Roberts, [hidden email]
Providenza & Boekelheide, Inc.

--
You received this message because you are subscribed to the Google Groups "cherrypy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at https://groups.google.com/group/cherrypy-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/cherrypy-users/1bb257d1-4160-0e4b-4621-4575fa3f3bc7%40probo.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Session file locking not working properly in multithreaded environment

Ralph Heinkel-2
I'm not completely convinced that this is true. Here is an example where a second lock on the same file fails within the same thread:

Python 3.6.5 (default, Mar 31 2018, 19:45:04) [GCC] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import fcntl
>>> flags = fcntl.LOCK_EX | fcntl.LOCK_NB
>>> fp1 = open('x.lock', 'a+')  # first file handle and file descriptor for the lock file
>>> fp2 = open('x.lock', 'a+')  # second file handle and descriptor for the lock file
>>> lock1 = fcntl.flock(fp1.fileno(), flags)
>>> lock2 = fcntl.flock(fp2.fileno(), flags)
Traceback (most recent call last):
 
File "<stdin>", line 1, in <module>
BlockingIOError: [Errno 11] Resource temporarily unavailable

Only if lock2 was obtained from fp1.fileno() then no exception would have been raised. As soon as a separate file descriptor (for the same file) is used a second lock also fails.
But this is exaclty the point that makes me wonder why it then works in cherrypy. From the log you can see that each thread uses a different file descriptor for the lock file, so it should fail there as well...  I'm still puzzled.

On Friday, 10 May 2019 18:33:27 UTC+2, Tim Roberts wrote:

Ah, you've stumbled across one of the classic blunders, as they say in Princess Bride.  The Unix "flock" call only blocks simultaneous access between different processes.  It doesn't protect multiple threads within a single process.  A single process can call flock repeatedly on a single file, and all of the calls will succeed.

If you want to protect a resource between threads of a single process, you need to use something other than flock, like a  Lock from the "threading" module.

-- 
Tim Roberts, <a href="javascript:" target="_blank" gdf-obfuscated-mailto="RymfcqF5AQAJ" rel="nofollow" onmousedown="this.href=&#39;javascript:&#39;;return true;" onclick="this.href=&#39;javascript:&#39;;return true;">ti...@...
Providenza & Boekelheide, Inc.

--
You received this message because you are subscribed to the Google Groups "cherrypy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at https://groups.google.com/group/cherrypy-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/cherrypy-users/82728770-ea14-4a5a-ac5b-cad92c18854c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Session file locking not working properly in multithreaded environment

Tim Roberts
Ralph Heinkel wrote:

> I'm not completely convinced that this is true. Here is an example
> where a second lock on the same file fails within the same thread:
> ...
> Only if lock2 was obtained from fp1.fileno() then no exception would
> have been raised. As soon as a separate file descriptor (for the same
> file) is used a second lock also fails.
> But this is exaclty the point that makes me wonder why it then works
> in cherrypy. From the log you can see that each thread uses a
> different file descriptor for the lock file, so it should fail there
> as well...  I'm still puzzled.
Yes, you're right.  The behavior I described is only true if you try to
flock the same file descriptor, or file descriptors that are "dup"s of
each other.  If the file descriptors came from two separate opens, then
the flock should work.

The zc.lockfile I'm looking at certainly opens a new fd each time, but
the source I'm looking at doesn't have a UnixLockFile class, as your log
shows.

Your log shows to calls to _lock_file, but it doesn't show that the both
succeed.  Do they?

--
Tim Roberts, [hidden email]
Providenza & Boekelheide, Inc.


--
You received this message because you are subscribed to the Google Groups "cherrypy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at https://groups.google.com/group/cherrypy-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/cherrypy-users/4244c768-db58-011b-2a64-edba1bcedb38%40probo.com.
For more options, visit https://groups.google.com/d/optout.

smime.p7s (5K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Session file locking not working properly in multithreaded environment

Ralph Heinkel-2
On Friday, 10 May 2019 19:31:28 UTC+2, Tim Roberts wrote:

The zc.lockfile I'm looking at certainly opens a new fd each time, but
the source I'm looking at doesn't have a UnixLockFile class, as your log
shows.

Your log shows to calls to _lock_file, but it doesn't show that the both
succeed.  Do they?


yes, both succeed. The log I showed was truncated a little bit too early. Both threads then worked on the same session file simultaneously, resulting in the described behavior that one thread reads incomplete data from the session file because the other thread hasn't finished yet writing to it.


As I said I was still working on cherrypy 15. And you are right, in the most current cherrypy version zc.lockfile.LockFile() is used. However I run into the same problems with cherrypy 18. That is probably due to the fact that "The zc.lockfile package provides a basic portable implementation of interprocess locks using lock files" as its project description states.


Since in most cases people run cherrypy with multiple threads we might need to think about a locking implementation that also supports inter-thread locking of a file. Otherwise IMO the locking for file sessions only makes half sense.

 

--
You received this message because you are subscribed to the Google Groups "cherrypy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at https://groups.google.com/group/cherrypy-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/cherrypy-users/14769e69-3b9b-4e73-917a-8ab0a4ee754e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Session file locking not working properly in multithreaded environment

Ralph Heinkel-2
In reply to this post by Ralph Heinkel-2
I've found the reason for the session lock race condition and file a bug on github:

https://github.com/cherrypy/cherrypy/issues/1779

--
You received this message because you are subscribed to the Google Groups "cherrypy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at https://groups.google.com/group/cherrypy-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/cherrypy-users/1ae1e49d-5317-4f02-9be2-420aba380dfd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.