Replicating complex query with two dense_rank() functions using ORM

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Replicating complex query with two dense_rank() functions using ORM

Brad Buran
I have a "puzzle of the day" that users can answer. I track the puzzle of the day using the Puzzle model. I track the answers using the PuzzleAnswer model. I would like to calculate the number of consecutive puzzles a particular user (i.e., the author) gets right in a row. The current SQL I use that can calculate the start date of the streak, end date of the streak and the number of days in the streak. As you can see, it does a dens_rank over the puzzles (to count them in order), then does a join with the PuzzleAnswer, then does a second dense rank over the merged tables. I figured out how to use the DenseRank function in the Django ORM on the Puzzle manager, but I cannot figure out how to do the left join next. Any advice?

SELECT min(s.id) AS id,
   count(s.date) AS streak,
   min(s.date) AS start_streak,
   max(s.date) AS end_streak,
   s.author_id
  FROM ( SELECT dense_rank() OVER (ORDER BY ROW(pa.author_id, pr.rank)) AS id,
           pa.created AS date,
           (pr.rank - dense_rank() OVER (ORDER BY ROW(pa.author_id, pr.rank))) AS g,
           pa.author_id
          FROM (( SELECT "POTD_puzzle".id,
                   dense_rank() OVER (ORDER BY "POTD_puzzle".published) AS rank
                  FROM public."POTD_puzzle") pr
            JOIN public."POTD_puzzleanswer" pa ON ((pr.id = pa.puzzle_id)))
         WHERE pa.correct) s
 GROUP BY s.author_id, s.g
 ORDER BY count(s.date) DESC;

The models are:

class PuzzleAnswer(models.Model):        
   puzzle = models.ForeignKey(Puzzle, editable=True, on_delete=models.CASCADE)
   answer = models.CharField(max_length=64)                          
   correct = models.BooleanField(editable=False)       
   created = models.DateTimeField(auto_now_add=True)
   author = models.ForeignKey(settings.AUTH_USER_MODEL, blank=True, null=True,
                              on_delete=models.SET_NULL)                                                                                          
class Puzzle(models.Model):
   category = models.ForeignKey(PuzzleCategory, on_delete=models.CASCADE, help_text=category_help)   
   notation = models.CharField(max_length=64)            
   correct_answer = models.CharField(max_length=64)
   published = models.DateField(blank=True, null=True, db_index=True, unique=True)                                                                                                                
        

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/3d8106a7-df44-4697-91fe-06c861132ae6n%40googlegroups.com.
Reply | Threaded
Open this post in threaded view
|

Re: Replicating complex query with two dense_rank() functions using ORM

Ramon NHB
Hi,

In Django queryset you can query PuzzleAnswer, filter by author (assuming that is the user), order_by('created)' and return the created dates only: values_list('created', flat=True). Then take the len() of that list and check the [0] and [-1] entries for first and last date. The rest is date math.

In general I try to keep the number of database accesses as low as possible. A pre-calculated answer takes less resources to present than calculating this over and over again, especially if you have many users. So you could do the above and store the result in a new table every time a new PuzzleAnswer has been added for a user.

Ramon

Op woensdag 4 november 2020 om 05:34:22 UTC+1 schreef Brad Buran:
I have a "puzzle of the day" that users can answer. I track the puzzle of the day using the Puzzle model. I track the answers using the PuzzleAnswer model. I would like to calculate the number of consecutive puzzles a particular user (i.e., the author) gets right in a row. The current SQL I use that can calculate the start date of the streak, end date of the streak and the number of days in the streak. As you can see, it does a dens_rank over the puzzles (to count them in order), then does a join with the PuzzleAnswer, then does a second dense rank over the merged tables. I figured out how to use the DenseRank function in the Django ORM on the Puzzle manager, but I cannot figure out how to do the left join next. Any advice?

SELECT min(s.id) AS id,
   count(s.date) AS streak,
   min(s.date) AS start_streak,
   max(s.date) AS end_streak,
   s.author_id
  FROM ( SELECT dense_rank() OVER (ORDER BY ROW(pa.author_id, pr.rank)) AS id,
           pa.created AS date,
           (pr.rank - dense_rank() OVER (ORDER BY ROW(pa.author_id, pr.rank))) AS g,
           pa.author_id
          FROM (( SELECT "POTD_puzzle".id,
                   dense_rank() OVER (ORDER BY "POTD_puzzle".published) AS rank
                  FROM public."POTD_puzzle") pr
            JOIN public."POTD_puzzleanswer" pa ON ((pr.id = pa.puzzle_id)))
         WHERE pa.correct) s
 GROUP BY s.author_id, s.g
 ORDER BY count(s.date) DESC;

The models are:

class PuzzleAnswer(models.Model):        
   puzzle = models.ForeignKey(Puzzle, editable=True, on_delete=models.CASCADE)
   answer = models.CharField(max_length=64)                          
   correct = models.BooleanField(editable=False)       
   created = models.DateTimeField(auto_now_add=True)
   author = models.ForeignKey(settings.AUTH_USER_MODEL, blank=True, null=True,
                              on_delete=models.SET_NULL)                                                                                          
class Puzzle(models.Model):
   category = models.ForeignKey(PuzzleCategory, on_delete=models.CASCADE, help_text=category_help)   
   notation = models.CharField(max_length=64)            
   correct_answer = models.CharField(max_length=64)
   published = models.DateField(blank=True, null=True, db_index=True, unique=True)                                                                                                                
        

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/953151f0-3e19-4111-ab08-e330169908bdn%40googlegroups.com.
Reply | Threaded
Open this post in threaded view
|

Re: Replicating complex query with two dense_rank() functions using ORM

charettes
This is unfortunately not possible to do through the ORM right now due to lack of support for filtering against window expressions[0]

Until this ticket is solved you'll have to rely on raw SQL.

Cheers,
Simon

[0] ticket https://code.djangoproject.com/ticket/28333

Le mercredi 4 novembre 2020 à 12:25:30 UTC-5, [hidden email] a écrit :
Hi,

In Django queryset you can query PuzzleAnswer, filter by author (assuming that is the user), order_by('created)' and return the created dates only: values_list('created', flat=True). Then take the len() of that list and check the [0] and [-1] entries for first and last date. The rest is date math.

In general I try to keep the number of database accesses as low as possible. A pre-calculated answer takes less resources to present than calculating this over and over again, especially if you have many users. So you could do the above and store the result in a new table every time a new PuzzleAnswer has been added for a user.

Ramon

Op woensdag 4 november 2020 om 05:34:22 UTC+1 schreef Brad Buran:
I have a "puzzle of the day" that users can answer. I track the puzzle of the day using the Puzzle model. I track the answers using the PuzzleAnswer model. I would like to calculate the number of consecutive puzzles a particular user (i.e., the author) gets right in a row. The current SQL I use that can calculate the start date of the streak, end date of the streak and the number of days in the streak. As you can see, it does a dens_rank over the puzzles (to count them in order), then does a join with the PuzzleAnswer, then does a second dense rank over the merged tables. I figured out how to use the DenseRank function in the Django ORM on the Puzzle manager, but I cannot figure out how to do the left join next. Any advice?

SELECT min(s.id) AS id,
   count(s.date) AS streak,
   min(s.date) AS start_streak,
   max(s.date) AS end_streak,
   s.author_id
  FROM ( SELECT dense_rank() OVER (ORDER BY ROW(pa.author_id, pr.rank)) AS id,
           pa.created AS date,
           (pr.rank - dense_rank() OVER (ORDER BY ROW(pa.author_id, pr.rank))) AS g,
           pa.author_id
          FROM (( SELECT "POTD_puzzle".id,
                   dense_rank() OVER (ORDER BY "POTD_puzzle".published) AS rank
                  FROM public."POTD_puzzle") pr
            JOIN public."POTD_puzzleanswer" pa ON ((pr.id = pa.puzzle_id)))
         WHERE pa.correct) s
 GROUP BY s.author_id, s.g
 ORDER BY count(s.date) DESC;

The models are:

class PuzzleAnswer(models.Model):        
   puzzle = models.ForeignKey(Puzzle, editable=True, on_delete=models.CASCADE)
   answer = models.CharField(max_length=64)                          
   correct = models.BooleanField(editable=False)       
   created = models.DateTimeField(auto_now_add=True)
   author = models.ForeignKey(settings.AUTH_USER_MODEL, blank=True, null=True,
                              on_delete=models.SET_NULL)                                                                                          
class Puzzle(models.Model):
   category = models.ForeignKey(PuzzleCategory, on_delete=models.CASCADE, help_text=category_help)   
   notation = models.CharField(max_length=64)            
   correct_answer = models.CharField(max_length=64)
   published = models.DateField(blank=True, null=True, db_index=True, unique=True)                                                                                                                
        

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/7282e7dc-e098-462e-b9aa-050dfcbde60dn%40googlegroups.com.