Django Lessons Learnt

My rough notes for when I want to get back on the Django horse

Posted by Tanya Dixit on September 15, 2021 · 7 mins read

The most important lessons learned:

  • Django is not a silver bullet for anything. If you want to create a very standard CRUD app, Django is probably the fastest framework out there to flesh out a prod-ready app, but customization takes more time. So, be patient.

  • If you can't be patient, you can't build a customized app, so you kinda have to be patient.

  • Django documentation is good, but sometimes it's hard to find stuff. When that happens, search Stack Overflow or read some books like:

  • Django for APIs

  • Django for Professionals

  • Two Scoops of Django (currently reading)

Now onto the meat (technical stuff)

  • QuerySets and managers are really handy. Use them and you will have to write way less code.

  • Break every piece of functionality down the same way, again and again: what URL the user hits, what view you're gonna present, what data is in that view, and how you are gonna present it. Some people start by writing the URL, then the view, then the model, but I advise you to think first about what you want your data to have, write a model (or a database schema), and then take it from there.

Now, my very rough notes on Django

High Performance Django

Probably the best book I have read on managing high performance websites. The most important advice - keep it simple, do not reinvent the wheel where it's not needed, and use time-tested and battle-tested tools wherever possible. Avoid writing your own.

  • Django doesn't scale? Well, neither does anything else. You need to use caching and load balancing properly.

  • For our team at Lincoln Loop, the guiding philosophy in designing high-traffic Django sites is simplicity. Without making a conscious effort to fight complexity at every turn, it is too easy to waste time building complex, unmaintainable monstrosities that will bite you down the road.

Simplicity means:

1. Using as few moving parts as possible to make it all work. "Moving parts" may be servers, services or third-party software.

2. Choosing proven and dependable moving parts instead of the new hotness.

3. Using a proven and dependable architecture instead of blazing your own trail.

4. Deflecting traffic away from complex parts and toward fast, scalable, and simple parts.

SQL vs NoSQL

The major difference is that SQL databases are table based, whereas NoSQL databases use other models, such as key-value or document stores. Most NoSQL databases don't enforce foreign keys or support joins natively, so relating records usually means extra lookups in application code. In a SQL database, the keys used in joins are typically indexed, which is what makes joins efficient. But a single entity lookup by key is probably gonna be faster in NoSQL.

Proxy vs reverse-proxy - https://www.cloudflare.com/en-au/learning/cdn/glossary/reverse-proxy/

SSL termination can happen either on the load balancer or on the reverse-proxy servers.

12 factor philosophy to manage configurations - https://12factor.net/

Config management using the env file - https://platform.sh/blog/2021/we-need-to-talk-about-the-env/
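
A minimal sketch of the 12 factor idea of reading config from the environment instead of hard-coding it. The variable names (APP_DEBUG, APP_DATABASE_URL, APP_SECRET_KEY) and the Settings class are made up for illustration; they are not Django settings names.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    debug: bool
    database_url: str
    secret_key: str


def load_settings(environ=os.environ) -> Settings:
    """Build settings from environment variables (hypothetical names)."""
    return Settings(
        # Anything other than an explicit truthy string counts as False.
        debug=environ.get("APP_DEBUG", "false").lower() in ("1", "true", "yes"),
        # A harmless local default is fine for non-secret values.
        database_url=environ.get("APP_DATABASE_URL", "sqlite:///dev.db"),
        # No default for secrets: a missing key raises KeyError at startup.
        secret_key=environ["APP_SECRET_KEY"],
    )


if __name__ == "__main__":
    demo_env = {"APP_DEBUG": "true", "APP_SECRET_KEY": "not-a-real-secret"}
    print(load_settings(demo_env))
```

Passing the environment in as an argument keeps the function easy to test, and the same code runs unchanged in dev and prod with different variables.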

When someone asks you, "Have you optimised your app?", you say: well, I have optimised the stuff that runs on every request, but you should always analyse what gets called the most and optimise that first. Follow the 80-20 rule.
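
One way to find out what is called the most is to profile before optimising. Here is a rough sketch using the standard library's cProfile and pstats; slugify and render_page are hypothetical stand-ins for real application code.

```python
import cProfile
import io
import pstats


def slugify(title):
    """Hypothetical hot helper: called once per title."""
    return title.lower().replace(" ", "-")


def render_page(titles):
    """Hypothetical request handler that leans on slugify()."""
    return [slugify(title) for title in titles]


def profile_by_call_count(func, *args, limit=5):
    """Run func under cProfile and return a report sorted by call count."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func(*args)
    finally:
        profiler.disable()
    buffer = io.StringIO()
    # "calls" sorts by number of calls; "cumulative" would sort by time spent.
    pstats.Stats(profiler, stream=buffer).sort_stats("calls").print_stats(limit)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_by_call_count(render_page, ["Hello World"] * 1000))
```

Call count is only one view. Sorting by cumulative time is often the better first look, because a function called rarely can still dominate the runtime.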

https://uwsgi-docs.readthedocs.io/en/latest/articles/TheArtOfGracefulReloading.html

Caching - https://blog.bluzelle.com/things-you-should-know-about-database-caching-2e8451656c2d

Cache stampede - https://en.wikipedia.org/wiki/Cache_stampede

Cache invalidation

Caching can be done at various levels. At the database level, there are a few common strategies:

  • Cache aside - the application asks the cache first, and on a miss the application itself queries the database and stores the result in the cache.

  • Read through - all reads go to the cache, and on a miss the cache itself loads the relevant data from the database.

  • Write through - same as read through, but for writes: every write goes to the cache, which immediately writes it on to the database.

  • Write behind - a delayed write, which is good for write-heavy apps: all the writes go to the cache, and after a specific time, let's say 2 mins, they are flushed to the database.
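
To make two of these strategies concrete, here is a toy sketch of cache aside and write behind. FakeDatabase is a made-up in-memory stand-in that counts reads and writes, and flush() is called by hand where a real system would use a timer or background worker.

```python
class FakeDatabase:
    """Stand-in for a real database; counts traffic so cache hits are visible."""

    def __init__(self):
        self.rows = {}
        self.reads = 0
        self.writes = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)

    def put(self, key, value):
        self.writes += 1
        self.rows[key] = value


class CacheAside:
    """The application checks the cache first and falls back to the database."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def read(self, key):
        if key in self.cache:          # hit: no database traffic
            return self.cache[key]
        value = self.db.get(key)       # miss: the application queries the database
        self.cache[key] = value        # and populates the cache itself
        return value


class WriteBehind:
    """Writes land in the cache and are flushed to the database later."""

    def __init__(self, db):
        self.db = db
        self.cache = {}
        self.dirty = set()

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)            # remember what still needs persisting

    def flush(self):
        # Repeated writes to one key collapse into a single database write.
        for key in sorted(self.dirty):
            self.db.put(key, self.cache[key])
        self.dirty.clear()
```

The trade-off with write behind is visible in the sketch: anything still in dirty is lost if the cache dies before flush() runs.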

Cache stampede - Multiple servers concurrently ask for one resource. The resource is cached either on the app servers or in caches between the servers and the database. If it changes or expires, all these servers request fresh data from the database at once, and the database is gonna be inundated. If those requests time out, the stampede occurs again because the servers retry.
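
One common mitigation is to let only one caller rebuild a missing entry while the others wait for it. Below is a single-process sketch using a per-key lock; SingleFlightCache and run_demo are hypothetical names, and a multi-server setup would need a shared lock (for example in the cache itself) rather than threading.Lock. Expiry is left out to keep it short.

```python
import threading
import time


class SingleFlightCache:
    """On a miss, only one thread per key calls the loader; the rest wait."""

    def __init__(self, loader):
        self._loader = loader
        self._values = {}
        self._locks = {}
        self._locks_guard = threading.Lock()

    def _lock_for(self, key):
        # Guard the lock table so two threads never create two locks for one key.
        with self._locks_guard:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key):
        if key in self._values:
            return self._values[key]
        with self._lock_for(key):
            # Re-check: another thread may have filled the entry while we waited.
            if key not in self._values:
                self._values[key] = self._loader(key)
            return self._values[key]


def run_demo(workers=8):
    """Hit one cold key from many threads and count loader calls."""
    calls = []

    def slow_loader(key):
        calls.append(key)
        time.sleep(0.05)               # pretend this is an expensive query
        return key.upper()

    cache = SingleFlightCache(slow_loader)
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(cache.get("article")))
        for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(calls), results


if __name__ == "__main__":
    print(run_demo())
```

Without the per-key lock, every worker that arrives during the slow load would call the loader itself, which is exactly the stampede.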

Invalidation - Invalidation is striking out a cache record because it is no longer fresh. Now this presents an issue - how do you know which cached stuff to invalidate?

Invalidating cached web representations when their underlying data changes can be very simple. For instance, invalidate /articles/123 when article 123 is updated. However, data usually is represented not in one but in multiple representations. Article 123 could also be represented on the articles index (/articles), the list of articles in the current year (/articles/current) and in search results (/search?name=123). In this case, when article 123 is changed, a lot more is involved in invalidating all of its representations. In other words, invalidation adds a layer of complexity to your application.
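
One way to keep track of which representations depend on which record is to tag each cached entry with the records it was built from. This is a minimal sketch; TaggedCache and the "article:123" tag format are assumptions for illustration, not any particular library's API.

```python
from collections import defaultdict


class TaggedCache:
    """Cache where each entry lists the records (tags) it was built from."""

    def __init__(self):
        self._values = {}
        self._keys_by_tag = defaultdict(set)

    def set(self, key, value, tags=()):
        self._values[key] = value
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def get(self, key):
        return self._values.get(key)

    def invalidate_tag(self, tag):
        """Drop every entry built from the given record; return the dropped keys."""
        keys = self._keys_by_tag.pop(tag, set())
        for key in keys:
            self._values.pop(key, None)
        return keys


if __name__ == "__main__":
    cache = TaggedCache()
    cache.set("/articles/123", "<detail page>", tags=["article:123"])
    cache.set("/articles", "<index page>", tags=["article:123", "article:124"])
    cache.set("/articles/124", "<other detail>", tags=["article:124"])
    # Updating article 123 invalidates its detail page and the index together.
    print(sorted(cache.invalidate_tag("article:123")))
```

The extra bookkeeping in set() is the layer of complexity mentioned above: every representation has to declare what it depends on, or it will be missed at invalidation time.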

Johnny Cache implements invalidation coarsely: when something is written to a table, it invalidates all cached entries linked to that table.
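
Coarse invalidation like this can be sketched with generation counters: every cache key includes a counter for its scope, and a write just bumps the counter so old keys are never looked up again. This is not Johnny Cache's actual code, only an illustration of the general idea; GenerationalCache and the use of a table name as the scope are assumptions.

```python
class GenerationalCache:
    """Cache keys embed a per-table generation; a write bumps the generation."""

    def __init__(self):
        self._store = {}
        self._generation = {}

    def _key(self, table, query):
        return (table, self._generation.get(table, 0), query)

    def get(self, table, query):
        return self._store.get(self._key(table, query))

    def set(self, table, query, result):
        self._store[self._key(table, query)] = result

    def table_written(self, table):
        # Entries from the old generation become unreachable; a real cache
        # would let them expire or evict them rather than delete them here.
        self._generation[table] = self._generation.get(table, 0) + 1


if __name__ == "__main__":
    cache = GenerationalCache()
    cache.set("articles", "SELECT * FROM articles", ["row 1", "row 2"])
    cache.set("authors", "SELECT * FROM authors", ["row A"])
    cache.table_written("articles")
    print(cache.get("articles", "SELECT * FROM articles"))  # miss after the write
    print(cache.get("authors", "SELECT * FROM authors"))    # untouched
```

The appeal is that nothing has to work out which entries are stale. The cost is that one write throws away every cached query for that scope, even the ones it did not affect.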

Salt, a config manager, manages configuration and keeps the same config on all our deployment servers. Also, it's important to keep dev and prod envs separate but identical, because nothing is more annoying than a bug that comes up in prod because of an env mismatch.

Also, prod envs are generally more secure, bigger, and have load balancers and stuff. Dev environments are looser with security.