Generate PDF reports#

The Python ecosystem has everything needed to create PDF reports with a custom layout.

This shows how to generate a PDF report extracting data from an Atoti session and displaying it as a table spanning over multiple pages and a chart.

Creating the session#

Let’s start by creating an Atoti session and loading some data into it:

[1]:

from pathlib import Path

import atoti as tt

resources_directory = Path().cwd().parent / "getting_started" / "tutorial" / "data"

[2]:

session = tt.Session.start()

[3]:

sales_table = session.read_csv(resources_directory / "sales.csv", keys=["Sale ID"])

[4]:

products_table = session.read_csv(
    resources_directory / "products.csv",
    keys=["Product"],
)

[5]:

sales_table.join(products_table, sales_table["Product"] == products_table["Product"])

[6]:

cube = session.create_cube(sales_table)
l, m = cube.levels, cube.measures

Querying the session#

The first query will retrieve the values of a few measures for each Sub category and Brand.

Styling tables#

The result of this first query will be displayed as a table in our report. We’ll use three different techniques to style it:

Defining cell properties in the MDX query#

In the query below, we use calculated measures to style the table cells. In particular, the following portion of the query ensures that the cells for the contributors.COUNT measure are colored red if their value is less than 300: WITH MEMBER [Measures].[Count] as [Measures].[contributors.COUNT], FORE_COLOR=IIf([Measures].[Count] < 300, RGB(255, 135, 135), NULL)

[7]:

count_quantity_and_unit_prices_by_sub_category_and_brand_df = session.query_mdx(
    """
    WITH
    MEMBER [Measures].[Count] as [Measures].[contributors.COUNT],
    FORE_COLOR=IIf([Measures].[Count] < 300, RGB(255, 135, 135), NULL)
    MEMBER [Measures].[Quantity] as [Measures].[Quantity.SUM],
    FORE_COLOR=IIf([Measures].[Quantity.SUM] > 700, RGB(50, 150, 50), NULL)
    SELECT
      NON EMPTY Crossjoin(
        Hierarchize(
          Descendants(
            {
              [Products].[Sub category].[AllMember]
            },
            1,
            SELF_AND_BEFORE
          )
        ),
        Hierarchize(
          Descendants(
            {
              [Products].[Brand].[AllMember]
            },
            1,
            SELF_AND_BEFORE
          )
        )
      ) ON ROWS,
      NON EMPTY {
        [Measures].[Count],
        [Measures].[Quantity],
        [Measures].[Unit price.MEAN],
        [Measures].[Unit price.SUM]
      } ON COLUMNS
    FROM [Sales]
    CELL PROPERTIES
      FORE_COLOR,
      FORMATTED_VALUE,
      VALUE
    """,
    keep_totals=True,
)
count_quantity_and_unit_prices_by_sub_category_and_brand_df.style

[7]:

		Count	Quantity	Unit price.MEAN	Unit price.SUM
Sub category	Brand
Total		5000	8077.0	119.38	596,922.00
Bed		984	1588.0	321.25	316,110.00
	Basic	492	793.0	337.50	166,050.00
	Mega	492	795.0	305.00	150,060.00
Chair		246	401.0	60.00	14,760.00
	Basic	164	266.0	60.00	9,840.00
	Mega	82	135.0	60.00	4,920.00
Hoodie		738	1186.0	47.33	34,932.00
	Basic	164	264.0	47.00	7,708.00
	Mega	164	261.0	46.50	7,626.00
	NewBrand	164	262.0	48.50	7,954.00
	Over	246	399.0	47.33	11,644.00
Shoes		1310	2123.0	60.00	78,600.00
	Basic	409	661.0	60.00	24,540.00
	Mega	328	532.0	60.00	19,680.00
	NewBrand	327	531.0	60.00	19,620.00
	Over	246	399.0	60.00	14,760.00
T-shirt		1230	1991.0	22.00	27,060.00
	Basic	328	533.0	22.00	7,216.00
	Mega	328	529.0	21.50	7,052.00
	NewBrand	328	528.0	22.50	7,380.00
	Over	246	401.0	22.00	5,412.00
Table		492	788.0	255.00	125,460.00
	Basic	246	392.0	240.00	59,040.00
	Mega	246	396.0	270.00	66,420.00

Applying style with pandas#

pandas provides many ways to style a DataFrame.

Here, we’ll apply a background color to a specified row:

[8]:

import pandas as pd

[9]:

index_slice = pd.IndexSlice
subset = index_slice[index_slice["Chair", "Mega"], :]

count_quantity_and_unit_prices_by_sub_category_and_brand_style = (
    count_quantity_and_unit_prices_by_sub_category_and_brand_df.style.set_properties(
        **{"background-color": "#ffff92"},
        subset=subset,
    )
)

count_quantity_and_unit_prices_by_sub_category_and_brand_style

[9]:

		Count	Quantity	Unit price.MEAN	Unit price.SUM
Sub category	Brand
Total		5000	8077.0	119.38	596,922.00
Bed		984	1588.0	321.25	316,110.00
	Basic	492	793.0	337.50	166,050.00
	Mega	492	795.0	305.00	150,060.00
Chair		246	401.0	60.00	14,760.00
	Basic	164	266.0	60.00	9,840.00
	Mega	82	135.0	60.00	4,920.00
Hoodie		738	1186.0	47.33	34,932.00
	Basic	164	264.0	47.00	7,708.00
	Mega	164	261.0	46.50	7,626.00
	NewBrand	164	262.0	48.50	7,954.00
	Over	246	399.0	47.33	11,644.00
Shoes		1310	2123.0	60.00	78,600.00
	Basic	409	661.0	60.00	24,540.00
	Mega	328	532.0	60.00	19,680.00
	NewBrand	327	531.0	60.00	19,620.00
	Over	246	399.0	60.00	14,760.00
T-shirt		1230	1991.0	22.00	27,060.00
	Basic	328	533.0	22.00	7,216.00
	Mega	328	529.0	21.50	7,052.00
	NewBrand	328	528.0	22.50	7,380.00
	Over	246	401.0	22.00	5,412.00
Table		492	788.0	255.00	125,460.00
	Basic	246	392.0	240.00	59,040.00
	Mega	246	396.0	270.00	66,420.00

Adding style with CSS#

Before being a PDF, our report will be an HTML file so we can use CSS to style it:

[10]:

table_css = """
    table {
      border-collapse: collapse;
    }

    td {
      border: 1px solid black;
      padding: 8px;
      text-align: right;
    }

    th {
      border: 1px solid black;
      padding: 8px;
      text-align: center;
    }

    thead {
      background-color: lightgray;
    }
    """

Creating the chart#

Many Python libraries can create charts from pandas DataFrames. In this how-to, we’ll use seaborn which is a wrapper around Matplotlib:

[11]:

quantity_by_sub_category_df = cube.query(m["Quantity.SUM"], levels=[l["Sub category"]])

[12]:

import matplotlib_inline.backend_inline

[13]:

# Render charts as SVG to make them look crisp.
matplotlib_inline.backend_inline.set_matplotlib_formats("svg")

[14]:

import io

import matplotlib.pyplot as plt
import seaborn as sns

[15]:

sns.barplot(
    data=quantity_by_sub_category_df.reset_index(),
    x="Sub category",
    y="Quantity.SUM",
).set(title="Quantity.SUM by Sub category")

chart = io.BytesIO()
plt.savefig(chart, format="svg")

../_images/how_tos_generate_pdf_reports_19_0.svg

Creating the HTML report#

We’ll use Jinja to render this template:

[16]:

from pathlib import Path

from IPython.display import Code

[17]:

template_path = Path("__resources__") / "generate_pdf_reports" / "template.html"

Code(filename=template_path, language="html+jinja")

[17]:

<html>
  <head>
    <style>
      {{ table_css }}
    </style>
  </head>
  <body>
    <header
      style="
        display: flex;
        align-items: center;
        justify-content: space-between;
        width: 100%;
      "
    >
      <h1>{{ report_name }}</h1>
      <h5>Report Date: {{ as_of_date }}</h5>
    </header>
    <div>
      <h3>{{ table_title }}</h3>
      <div>{{ table }}</div>
    </div>
    <br />
    <div>{{ chart }}</div>
  </body>
</html>

With this data:

[18]:

from datetime import datetime

[19]:

template_data = {
    "as_of_date": datetime.now().strftime("%Y/%m/%d"),
    "chart": chart.getvalue().decode(),
    "report_name": "Atoti PDF Report",
    "table": count_quantity_and_unit_prices_by_sub_category_and_brand_style.to_html(),
    "table_css": table_css,
    "table_title": "Measures by Sub category and Brand",
}

[20]:

from jinja2 import Environment, FileSystemLoader

[21]:

environment = Environment(
    autoescape=False,  # noqa: S701
    loader=FileSystemLoader("."),
)
template = environment.get_template(str(template_path))
html = template.render(template_data)

Exporting the report as a PDF#

WeasyPrint can render HTML as a PDF:

[22]:

import weasyprint
from IPython.display import IFrame

[23]:

pdf_path = "../_static/report.pdf"

document = weasyprint.HTML(string=html)
document.write_pdf(pdf_path)

IFrame(pdf_path, width=800, height=800)

Going further#

The report could be sent by email periodically using the schedule library or:

Task Scheduler on Windows
cron on Linux or macOS
A Lambda function in the cloud