Skip to content Skip to sidebar Skip to footer

How To Optimize/simplify Heapsorting Django Objects? (can't Use Modules And Db Sorts)

I have to ask for some help with an assignment I got as a test for a django internship. I had to make and imaginary api with rabbits and their carrots. Each rabbit was supposed to

Solution 1:

This is the classic N+1 queries problem. You perform a single query to fetch all the bunnies, but then you go on to do bunnies[child].vegetables.get(vegetableType=vegetableType) for each bunny, which performs an additional query, and thus an additional database roundtrip, for each bunny. So you perform 1 query for N bunnies, plus around N queries to get all the vegetables (hence the N+1).

Database roundtrips are one of the most expensive resource available to web developers. While comparisons take somewhere in the order of nanoseconds, a database roundtrip takes in the order of milliseconds. Do ~20K queries and this will soon add up to take several minutes.

The quick solution is to use prefetch_related('vegetables') and exclusively use bunny.getVegetable('carrot') to get the carrot. prefetch_related() will perform a single query to get all the vegetables for all bunnies and cache them, so iterating self.vegetables.all() in getVegetables() won't perform any additional queries.


There are better solutions, though. In this case, it seems that each bunny should have at most 1 Vegetable object of a specific vegetableType. If you enforce this at the database level, you won't have to worry about errors in your sorting algorithm when someone decides to add a second Vegetable of type 'carrot' to a bunny. Instead, the database will stop them from doing that in the first place. To do this, you need a unique_together constraint:

classVegetable(models.Model):
    ...
    classMeta:
        unique_together = [
            ('vegetableType', 'bunny'),
        ]

Then, rather than fetching all bunnies and prefetching all related vegetables, you can fetch all vegetables of type "carrot" and join the related bunnies. Now you will only have a single query:

carrots = Vegetable.objects.filter(vegetableType='carrot').select_related('bunny')

Since the combination of vegetableType and bunny is unique, you won't get any duplicate bunnies, and you will still get all bunnies that have some carrots.

Of course you'd have to adapt your algorithm to work with the vegetables rather than the bunnies.

Post a Comment for "How To Optimize/simplify Heapsorting Django Objects? (can't Use Modules And Db Sorts)"