This video is available to students only

Improving Performance

In web applications, performance is actually a feature that will improve the overall usability of your product and influence the success of your project. Pages that are slow to load lead to frustrated users and can result in users abandoning your site. The Google search ranking algorithm now explicitly accounts for the performance of web pages. In e-commerce applications, like Yumroad's storefronts and product page, page speed is directly linked to the bounce rate of visitors. A study by the analytics company CrazyEgg, found that a one second delay leads to a 16% decrease in customer satisfaction and a 7% decrease in conversions. Pinterest, an online discovery tool, was able to increase their sign-ups by 15% and increased their organic search traffic by reducing the perceived waiting time for users.

There's a three step process to improving performance in your web application: Measure, Optimize, and Cache. We will go through each step within Yumroad and measure our improvements as we go along.

Measurement

There are a few core metrics we want to measure when it comes to performance: the time it took Flask to render a response, the amount of time Flask spent to query the database, and what queries SQLAlchemy issued to the database. While it's possible to measure all of these manually on a one-off basis, in order to continually measure and optimize this as we continue to develop our application, we will need to integrate these measurements into our development workflow.

Flask Debug Toolbar

Flask-DebugToolbar provides all of this information and embeds it into a helpful toolbar accessible in our development environment.

Set it up by adding flask-debugtoolbar to the requirements.txt file and then initializing it.

Within yumroad/extensions.py, initialize the extension.

from flask_debugtoolbar import DebugToolbarExtension
...
debug_toolbar = DebugToolbarExtension()

By default, in addition to adding additional content to the page, the debug toolbar intercepts redirects and presents an HTML page detailing the redirect, so we will will disable that in our test configuration. If you find redirection interceptions are not useful, you can also disable it in your development environment.

In yumroad/config.py, add the following settings:

class DevConfig(BaseConfig):
    DEBUG_TB_INTERCEPT_REDIRECTS = False # Optional

class TestConfig(BaseConfig):
    DEBUG_TB_ENABLED = False

Now in yumroad/__init__.py, you can initialize the application. By default, the debug toolbar will only load when Flask is set the debug mode (which happens in the development environment by default)

from yumroad.extensions import (..., debug_toolbar)

def create_app(environment_name='dev'):
    ...
    debug_toolbar.init_app(app)
    ...

Now when you run your application (flask run) in development mode, you will see a toolbar that you can hide on the right side of your application. Flask-DebugToolbar injects HTML into the templates that are rendered to add this to our pages.

When the toolbar is hidden, it looks like this:

![Debug Toolbar Hidden](./public/images/15-performance/logged out view.png)

The full toolbar comes with a complement of tools:

Debug Toolbar
Debug Toolbar

Under "Time", we can see the time it took to render the page (19 ms in this case) and clicking into it gives us more details about the time it took to render the page.

CPU Time
CPU Time

The toolbar also gives us the ability to profile the execution of requests and get a detailed trace of where Flask was spending time. By default, it is not enabled, but if you click on "Profiler" it becomes enabled and will start tracing future requests. Once it's enabled, if you refresh the page and pull up the Profiler again, you will see a detailed trace of where Flask was spending time.

Trace
Trace

This becomes very useful if you want to focus on optimizing any functions within your own application that are a bottleneck. In this trace, everything seems pretty fast, but in practice a lot of web application optimization comes from thinking about database usage.

To do check how well we are doing on the database front, we'll have to look at the actual queries that SQLAlchemy issued. The SQLAlchemy tab will show us what queries are being issued, how long they took, and what lines are triggering them. The landing blueprint in blueprints/landing.py simply grabs up to three stores, and renders them alongside a count of the number of products.

@landing_bp.route('/')
def index():
    stores = Store.query.limit(3).all()
    return render_template('landing/index.html', stores=stores)
Before Optimization
Before Optimization

If we limit our query to a single store, the queries tab looks like this:

Before Optimization (One Store)
Before Optimization (One Store)

If I change the login in the landing route to render all three of the stores in my local database, the queries tab, looks like this.

Before Optimization (Three Stores)
Before Optimization (Three Stores)

The query to the Users table occurs as part of the call to render_template (according to the context column in the log above). For every store we render on the homepage, we access the user of the store and we'll need to issue a new query to get that user's email address.

This is known as the N+1 anti-pattern.

The N+1 Problem

Why is it called N+1? In this example, if we had N stores, we would issue one query to fetch all of the stores, and N queries to the User table. The total number of queries would be N+1. This will slow down your application as you will need to issue a lot of queries and wait for a response to each one.

Since databases are often not on the same machine as the web server and must communicate the network, it introduces a lot of unnecessary latency.

N+1s in the real world. Imagine you are cooking a recipe and for each ingredient ingredient you stopped everything you were doing, ran to the grocery store, and only got that specific ingredient, and then came back only to repeat the same process for every ingredient. It'd be painfully slow. Instead to optimize that process, you would want to get a list of everything you need and purchase all of the necessary ingredients at once.

To resolve this N+1 problem, we need to tell SQLAlchemy to load the User table through a join when loading the Store query. One way we can do that is to do what we did for Product, by specifying that the relationship should be loaded via a join (instead of being lazily loaded only when we attempt to access the user attribute, like it is now). The types of loading patterns that are available to are detailed within SQLALchemy documentation for relationships.

We will need to tell SQLAlchemy to perform a joined load on the user relationship. To do that we will import joinedload from sqlalchemy.orm and pass it in as an option to the query.

from sqlalchemy.orm import joinedload

from yumroad.models import Store
...
@landing_bp.route('/')
def index():
    stores = Store.query.options(
        joinedload(Store.products)
    ).limit(3).all()
    return render_template('landing/index.html', stores=stores)

Once we make this change, we see that we are now only executing one query and have improved the page load speed.

After optimization
After optimization
.

joinedload vs join

A common point of confusion is why we don't just use a normal join like we saw earlier in Chapter 9, and why we have to use a joinedload that results in a slightly more complicated query.

The issue is that adding a join to a SQL query can result in the results being substantially different than without a join.

For example Store.query.all() would emit a SQL query similar to select * from store and the results might look like this:

Store IDStore Name
1The newline store
2Sue's Magic Store
3Sam's Books

If we however tell SQLAlchemy to issue a query using Store.query.join(Product).limit(3).all() or Store.query.join(Store.product).limit(3).all(), this causes an issue because we will see repeated rows for each product in each store. In our case, "The newline store" has four products, so it will show up four times in this query. The SQL query emitted is SELECT store.id AS store_id, store.name AS store_name, store.user_id AS store_user_id FROM store JOIN product ON store.id = product.store_id LIMIT 3, which produces something like this:

Store IDStore Name
1The newline store
1The newline store
1The newline store

Behind the scenes, each row in that table represents a product in "The newline store" . The product details do not show up since we didn't select for any of the attributes on the product table. Even if we do end up accessing the product attribute later in our code, SQLAlchemy will have to issue another query to get the associated products. We end up with two problems: incorrect data and we still have additional queries.

How would we load the joined table fields if we wanted to? We would need to tell SQLAlchemy about the relationship passing an option using the contains_eager from sqlalchemy.orm. Example: Store.query.join(Store.products).options(contains_eager(Store.products)) would emit something similiar to select product.*, store.* FROM store JOIN product ON store.id = product.store_id

A joinedload is something that SQLAlchemy provides to solve this problem, by allowing us to not have to rewrite queries and instead having SQLAlchemy manage the complexity of figuring out how to manage the joins so that the intent of our original query still remains intact.

Our simple "pythonic" query, turns into a long SQL command that would be tedious to write by hand

    stores = Store.query.options(
        joinedload(Store.products)
    ).limit(3).all()
SELECT
   anon_1.store_id AS anon_1_store_id,
   anon_1.store_name AS anon_1_store_name,
   anon_1.store_user_id AS anon_1_store_user_id,
   product_1.id AS product_1_id,
   product_1.name AS product_1_name,
   product_1.description AS product_1_description,
   product_1.store_id AS product_1_store_id,
   product_1.creator_id AS product_1_creator_id,
   product_1.price_cents AS product_1_price_cents,
   product_1.picture_url AS product_1_picture_url
FROM
   (
      SELECT
         store.id AS store_id,
         store.name AS store_name,
         store.user_id AS store_user_id
      FROM
         store LIMIT ? OFFSET ?
   )
   AS anon_1
   LEFT OUTER JOIN
      product AS product_1
      ON anon_1.store_id = product_1.store_id

By default joinedload performs left outer joins, but you can configure it to perform inner joins, which could perform better in specific scenarios. We won't delve too deeply into the depths of SQL here, but suffice it to say that the interface SQLAlchemy provides can save you a lot of time in handcrafting queries and improving performance.

Isn't there a more efficient query here since we only end up using the count of products on the template?

Yes (only selecting the count of products directly within our SQL query) but if we do ever use product attributes (like we do on the store page), this relationship will come in handy.

Preventing N+1s

The performance impact of N+1 queries can be severe at scale. To automatically add warnings to your application, an open source extension called nplusone can automatically warm you about queries that appear to fall into the N+1 trap.

If you have a relationship where you imagine most of the queries involving it will reference the relationship, you can configure SQLAlchemy to avoid "lazyloading" and automatically do a joined load.

To do that, we'd pass in lazy='joined' as an argument to the relationship operator on SQLAlchemy. Now whenever we query the Store table, SQLAlchemy will automatically perfom the "joined load" we manually specified above.

Start a new discussion. All notification go to the author.