Zápisník Josefa Rouska

Python a React.js programátor, zaměřený na Shopify integrace

Pivot Table in Django Admin using Postgres crosstab

As a person responsible for the process that runs every week for multiple merchants, I wanted to gain insight into the reliability of our solution. Specifically, I wanted to ensure that we had not missed any runs. To accomplish this, I decided to create a pivot table that would allow me to look back a few weeks and see all of the runs that had occurred during that time period.

The query to generate pivot table looks like this and consists of following sections.

SELECT final_result.* FROM crosstab(
$$
    SELECT
        merchant.myshopify_domain,
        EXTRACT('week' FROM (analysis."created" + interval '13 hour') AT TIME ZONE 'Europe/Prague'),
        MAX(analysis."created")
    FROM
        merchant_admin_analysis analysis
        left join merchant_admin_authappshopuser merchant on analysis.shop_id = merchant.id
    WHERE analysis."created" > (NOW() - interval '7 weeks')
    GROUP BY
    1,
    2;
$$,
$$
    SELECT extract('week' from generate_series((NOW() - interval '7 week'), now(), '1 week')::date)::int;
$$
) AS final_result(shop_id varchar, "48" timestamp with time zone,"49" timestamp with time zone,"50" timestamp with time zone,"51" timestamp with time zone,"52" timestamp with time zone,"1" timestamp with time zone,"2" timestamp with time zone,"3" timestamp with time zone);
  1. The first section of this query is the main SELECT statement which selects the final result of a crosstab query. The crosstab function is used to create a pivot table from the results of a SELECT statement.
  2. The first argument of the crosstab function is a SELECT statement that retrieves data from the “merchant_admin_analysis” table and joins it with the “merchant_admin_authappshopuser” table on the “shop_id” column. The query extracts the week from the “created” column, adds 13 hours to it, and converts the time zone to ‘Europe/Prague’. This is to make sure we’re covering entire world. I’m sure there’s a better way to do this. It also gets the maximum “created” value for each merchant and week, where the analysis “created” is greater than 7 weeks ago. The query groups the results by the merchant’s myshopify_domain and the extracted week.
  3. The second argument of the crosstab function is another SELECT statement that generates a series of weeks starting from 7 weeks ago until now, and extracts the week number from it. This is used as the column names in the resulting crosstab table.
  4. The final result of the crosstab query is given the column names “shop_id” of type varchar, and “48” to “3” of type timestamp with time zone. These column names correspond to the weeks in the generate series of the second argument of the crosstab function and represent the maximum created column of the first query in each week.

One of the challenges I encountered in this endeavor was that we are using Postgres as our database, and the crosstab function built into Postgres does not support dynamic columns. To work around this limitation, I decided to split the select into two. First I generate a list of weeks, and then use that list to generate the crosstab query in Python.

import logging

from django.db import connection
from django.shortcuts import render
from django.views import View

logger = logging.getLogger(__name__)


def generate_crosstab_query(weeks):
    columns = ",".join(f'"{week}" timestamp with time zone' for week in weeks)
    return f"""
    SELECT final_result.* FROM crosstab(
    $$
        SELECT
            merchant.myshopify_domain,
            EXTRACT('week' FROM (analysis."created" + interval '13 hour') AT TIME ZONE 'Europe/Prague'),
            MAX(analysis."created")
        FROM
            merchant_admin_analysis analysis
            left join merchant_admin_authappshopuser merchant on analysis.shop_id = merchant.id
        WHERE analysis."created" > (NOW() - interval '7 weeks')
        GROUP BY
        1,
        2;
    $$,
    $$
        SELECT extract('week' from generate_series((NOW() - interval '7 week'), now(), '1 week')::date)::int;
    $$
    ) AS final_result(shop_id varchar, {columns});
    """


WEEKS_QUERY = """
SELECT extract('week' from generate_series((NOW() - interval '7 week'), now(), '1 week')::date)::int;
"""


def use_raw_query(query):
    with connection.cursor() as cursor:
        cursor.execute(query)
        rows = cursor.fetchall()
    return rows


def auto_analysis_report():
    weeks = use_raw_query(WEEKS_QUERY)
    weeks = [row[0] for row in weeks]
    report = use_raw_query(generate_crosstab_query(weeks))
    return weeks, report


class AutoAnalysisReportView(View):
    admin_site = None

    def get(self, request):
        context = dict(self.admin_site.each_context(request))
        weeks, report = auto_analysis_report()
        context |= {"report": report, "weeks": weeks}

        return render(request, "admin/report.html", context)

The template is simple table extending {% extends "admin/base_site.html" %}.


{% extends "admin/base_site.html" %}

{% block title %}
Order statuses {{ block.super }}
{% endblock %}

{% block breadcrumbs %}
<div class="breadcrumbs">
    <a href="{% url 'admin:index' %}">Home</a>
</div>
{% endblock %}

{% block content %}
<div id="content-main">
    <hr style="margin: 2vh 0 2vh 0;">

    <table>
        <thead>
            <th>Shop</th>
            {% for week in weeks %}
            <th>{{week}}</th>
            {% endfor %}
            <th>inactive more than 180 days</th>
        </thead>

        <tbody>
            {% for row in report %}
            <tr>
                {% for item in row %}
                <td>{{ item|default:"-" }}</td>
                {% endfor %}
            </tr>
            {% endfor %}
        </tbody>
    </table>
</div>
{% endblock %}

Then to add the view to Django Admin sidebar. There’s no easy/standard/right way to do this. I kinda liked the custom admin site approach i found at StackOverflow.


from django.contrib import admin
from django.urls import path, reverse

import merchant_admin.admin_views as views


class CustomAdminSite(admin.AdminSite):

    def get_urls(self):
        urls = super().get_urls()

        custom_urls = [
            path(
                "auto-analysis-report",
                self.admin_view(views.AutoAnalysisReportView.as_view(admin_site=self)),
                name="auto_analysis_report",
            ),
        ]
        return custom_urls + urls

    def _build_app_dict(self, request, label=None):
        # we create manually a dict to fake a model for our view 'auto_analysis_report'
        # this is an example how the dict should look like, i.e. which keys
        # should be present, the actual values may vary
        stats = {
            "name": "Auto analysis report",
            "admin_url": reverse("admin:auto_analysis_report"),
            "object_name": "Auto analysis report",
            "perms": {"delete": False, "add": False, "change": False},
            "add_url": "",
        }
        # get the app dict from the parent method
        app_dict = super()._build_app_dict(request, label)
        # check if there is value for label, then the app_index will be rendered
        if label:
            # append the manually created dictionary 'stats'
            app_dict["models"].append(stats)
        else:
            app = app_dict.get("merchant_admin", None)
            # if an entry for 'merchant_admin' has been found
            # we append our manually created dictionary
            if app:
                app["models"].append(stats)
        return app_dict