Generate PDF reports#

The Python ecosystem has everything needed to create PDF reports with a custom layout.

This shows how to generate a PDF report extracting data from an Atoti session and displaying it as a table spanning over multiple pages and a chart.

Creating the session#

Let’s start by creating an Atoti session and loading some data into it:

[1]:
import atoti as tt
[2]:
session = tt.Session()
[3]:
sales_table = session.read_csv(
    "../../atoti/atoti/tutorial/data/sales.csv", keys=["Sale ID"]
)
[4]:
products_table = session.read_csv(
    "../../atoti/atoti/tutorial/data/products.csv", keys=["Product"]
)
[5]:
sales_table.join(products_table, sales_table["Product"] == products_table["Product"])
[6]:
cube = session.create_cube(sales_table)
l, m = cube.levels, cube.measures

Querying the session#

The first query will retrieve the values of a few measures for each Sub category and Brand.

Styling tables#

The result of this first query will be displayed as a table in our report. We’ll use three different techniques to style it:

Defining cell properties in the MDX query#

In the query below, we use calculated measures to style the table cells. In particular, the following portion of the query ensures that the cells for the contributors.COUNT measure are colored red if their value is less than 300: WITH MEMBER [Measures].[Count] as [Measures].[contributors.COUNT], FORE_COLOR=IIf([Measures].[Count] < 300, RGB(255, 135, 135), NULL)

[7]:
count_quantity_and_unit_prices_by_sub_category_and_brand_df = session.query_mdx(
    """
    WITH
    MEMBER [Measures].[Count] as [Measures].[contributors.COUNT],
    FORE_COLOR=IIf([Measures].[Count] < 300, RGB(255, 135, 135), NULL)
    MEMBER [Measures].[Quantity] as [Measures].[Quantity.SUM],
    FORE_COLOR=IIf([Measures].[Quantity.SUM] > 700, RGB(50, 150, 50), NULL)
    SELECT
      NON EMPTY Crossjoin(
        Hierarchize(
          Descendants(
            {
              [Products].[Sub category].[AllMember]
            },
            1,
            SELF_AND_BEFORE
          )
        ),
        Hierarchize(
          Descendants(
            {
              [Products].[Brand].[AllMember]
            },
            1,
            SELF_AND_BEFORE
          )
        )
      ) ON ROWS,
      NON EMPTY {
        [Measures].[Count],
        [Measures].[Quantity],
        [Measures].[Unit price.MEAN],
        [Measures].[Unit price.SUM]
      } ON COLUMNS
    FROM [Sales]
    CELL PROPERTIES
      FORE_COLOR,
      FORMATTED_VALUE,
      VALUE
    """,
    keep_totals=True,
)
count_quantity_and_unit_prices_by_sub_category_and_brand_df.style
[7]:
    Count Quantity Unit price.MEAN Unit price.SUM
Sub category Brand        
Total 5000 8077.0 119.38 596,922.00
Bed 984 1588.0 321.25 316,110.00
Basic 492 793.0 337.50 166,050.00
Mega 492 795.0 305.00 150,060.00
Chair 246 401.0 60.00 14,760.00
Basic 164 266.0 60.00 9,840.00
Mega 82 135.0 60.00 4,920.00
Hoodie 738 1186.0 47.33 34,932.00
Basic 164 264.0 47.00 7,708.00
Mega 164 261.0 46.50 7,626.00
NewBrand 164 262.0 48.50 7,954.00
Over 246 399.0 47.33 11,644.00
Shoes 1310 2123.0 60.00 78,600.00
Basic 409 661.0 60.00 24,540.00
Mega 328 532.0 60.00 19,680.00
NewBrand 327 531.0 60.00 19,620.00
Over 246 399.0 60.00 14,760.00
T-shirt 1230 1991.0 22.00 27,060.00
Basic 328 533.0 22.00 7,216.00
Mega 328 529.0 21.50 7,052.00
NewBrand 328 528.0 22.50 7,380.00
Over 246 401.0 22.00 5,412.00
Table 492 788.0 255.00 125,460.00
Basic 246 392.0 240.00 59,040.00
Mega 246 396.0 270.00 66,420.00

Applying style with pandas#

pandas provides many ways to style a DataFrame.

Here, we’ll apply a background color to a specified row:

[8]:
import pandas as pd
[9]:
index_slice = pd.IndexSlice
subset = index_slice[index_slice["Chair", "Mega"], :]

count_quantity_and_unit_prices_by_sub_category_and_brand_style = (
    count_quantity_and_unit_prices_by_sub_category_and_brand_df.style.set_properties(
        **{"background-color": "#ffff92"},
        subset=subset,
    )
)

count_quantity_and_unit_prices_by_sub_category_and_brand_style
[9]:
    Count Quantity Unit price.MEAN Unit price.SUM
Sub category Brand        
Total 5000 8077.0 119.38 596,922.00
Bed 984 1588.0 321.25 316,110.00
Basic 492 793.0 337.50 166,050.00
Mega 492 795.0 305.00 150,060.00
Chair 246 401.0 60.00 14,760.00
Basic 164 266.0 60.00 9,840.00
Mega 82 135.0 60.00 4,920.00
Hoodie 738 1186.0 47.33 34,932.00
Basic 164 264.0 47.00 7,708.00
Mega 164 261.0 46.50 7,626.00
NewBrand 164 262.0 48.50 7,954.00
Over 246 399.0 47.33 11,644.00
Shoes 1310 2123.0 60.00 78,600.00
Basic 409 661.0 60.00 24,540.00
Mega 328 532.0 60.00 19,680.00
NewBrand 327 531.0 60.00 19,620.00
Over 246 399.0 60.00 14,760.00
T-shirt 1230 1991.0 22.00 27,060.00
Basic 328 533.0 22.00 7,216.00
Mega 328 529.0 21.50 7,052.00
NewBrand 328 528.0 22.50 7,380.00
Over 246 401.0 22.00 5,412.00
Table 492 788.0 255.00 125,460.00
Basic 246 392.0 240.00 59,040.00
Mega 246 396.0 270.00 66,420.00

Adding style with CSS#

Before being a PDF, our report will be an HTML file so we can use CSS to style it:

[10]:
table_css = """
    table {
      border-collapse: collapse;
    }

    td {
      border: 1px solid black;
      padding: 8px;
      text-align: right;
    }

    th {
      border: 1px solid black;
      padding: 8px;
      text-align: center;
    }

    thead {
      background-color: lightgray;
    }
    """

Creating the chart#

Many Python libraries can create charts from pandas DataFrames. In this how-to, we’ll use seaborn which is a wrapper around Matplotlib:

[11]:
quantity_by_sub_category_df = cube.query(m["Quantity.SUM"], levels=[l["Sub category"]])
[12]:
import matplotlib_inline.backend_inline
[13]:
# Render charts as SVG to make them look crisp.
matplotlib_inline.backend_inline.set_matplotlib_formats("svg")
[14]:
import io

import matplotlib.pyplot as plt
import seaborn as sns
[15]:
sns.barplot(
    data=quantity_by_sub_category_df.reset_index(), x="Sub category", y="Quantity.SUM"
).set(title="Quantity.SUM by Sub category")

chart = io.BytesIO()
plt.savefig(chart, format="svg")
../_images/how_tos_generate_pdf_reports_19_0.svg

Creating the HTML report#

We’ll use Jinja to render this template:

[16]:
from pathlib import Path

from IPython.display import Code
[17]:
template_path = Path() / "resources" / "generate_pdf_reports" / "template.html"

Code(filename=template_path, language="html+jinja")
[17]:
<html>
  <head>
    <style>
      {{ table_css }}
    </style>
  </head>
  <body>
    <header
      style="
        display: flex;
        align-items: center;
        justify-content: space-between;
        width: 100%;
      "
    >
      <h1>{{ report_name }}</h1>
      <h5>Report Date: {{ as_of_date }}</h5>
    </header>
    <div>
      <h3>{{ table_title }}</h3>
      <div>{{ table }}</div>
    </div>
    <br />
    <div>{{ chart }}</div>
  </body>
</html>

With this data:

[18]:
from datetime import datetime
[19]:
template_data = {
    "as_of_date": datetime.now().strftime("%Y/%m/%d"),
    "chart": chart.getvalue().decode("utf8"),
    "report_name": "Atoti PDF Report",
    "table": count_quantity_and_unit_prices_by_sub_category_and_brand_style.to_html(),
    "table_css": table_css,
    "table_title": "Measures by Sub category and Brand",
}
[20]:
from jinja2 import Environment, FileSystemLoader
[21]:
environment = Environment(
    autoescape=False,  # noqa: S701
    loader=FileSystemLoader("."),
)
template = environment.get_template(str(template_path))
html = template.render(template_data)

Exporting the report as a PDF#

WeasyPrint can render HTML as a PDF:

[22]:
import weasyprint
from IPython.display import IFrame
[23]:
pdf_path = "../_static/report.pdf"

document = weasyprint.HTML(string=html)
document.write_pdf(pdf_path)

IFrame(pdf_path, width=800, height=800)
[23]:

Going further#

The report could be sent by email periodically using the schedule library or:

  • Task Scheduler on Windows

  • cron on Linux or macOS

  • A Lambda function in the cloud