Generate PDF reports#
The Python ecosystem has everything needed to create PDF reports with a custom layout.
This shows how to generate a PDF report extracting data from an Atoti session and displaying it as a table spanning over multiple pages and a chart.
Creating the session#
Let’s start by creating an Atoti session and loading some data into it:
[1]:
from pathlib import Path
import atoti as tt
resources_directory = Path().cwd().parent / "getting_started" / "tutorial" / "data"
[2]:
session = tt.Session.start()
[ ]:
sales_table = session.read_csv(
resources_directory / "sales.csv",
keys={"Sale ID"},
table_name="Sales",
)
[ ]:
products_table = session.read_csv(
resources_directory / "products.csv",
keys={"Product"},
table_name="Products",
)
[5]:
sales_table.join(products_table, sales_table["Product"] == products_table["Product"])
[6]:
cube = session.create_cube(sales_table)
l, m = cube.levels, cube.measures
Querying the session#
The first query will retrieve the values of a few measures for each Sub category and Brand.
Styling tables#
The result of this first query will be displayed as a table in our report. We’ll use three different techniques to style it:
Defining cell properties in the MDX query#
In the query below, we use calculated measures to style the table cells. In particular, the following portion of the query ensures that the cells for the contributors.COUNT measure are colored red if their value is less than 300: WITH MEMBER [Measures].[Count] as [Measures].[contributors.COUNT], FORE_COLOR=IIf([Measures].[Count] < 300, RGB(255, 135, 135), NULL)
[7]:
count_quantity_and_unit_prices_by_sub_category_and_brand_df = session.query_mdx(
"""
WITH
MEMBER [Measures].[Count] as [Measures].[contributors.COUNT],
FORE_COLOR=IIf([Measures].[Count] < 300, RGB(255, 135, 135), NULL)
MEMBER [Measures].[Quantity] as [Measures].[Quantity.SUM],
FORE_COLOR=IIf([Measures].[Quantity.SUM] > 700, RGB(50, 150, 50), NULL)
SELECT
NON EMPTY Crossjoin(
Hierarchize(
Descendants(
{
[Products].[Sub category].[AllMember]
},
1,
SELF_AND_BEFORE
)
),
Hierarchize(
Descendants(
{
[Products].[Brand].[AllMember]
},
1,
SELF_AND_BEFORE
)
)
) ON ROWS,
NON EMPTY {
[Measures].[Count],
[Measures].[Quantity],
[Measures].[Unit price.MEAN],
[Measures].[Unit price.SUM]
} ON COLUMNS
FROM [Sales]
CELL PROPERTIES
FORE_COLOR,
FORMATTED_VALUE,
VALUE
""",
keep_totals=True,
)
count_quantity_and_unit_prices_by_sub_category_and_brand_df.style
[7]:
Count | Quantity | Unit price.MEAN | Unit price.SUM | ||
---|---|---|---|---|---|
Sub category | Brand | ||||
Total | 5000 | 8077.0 | 119.38 | 596,922.00 | |
Bed | 984 | 1588.0 | 321.25 | 316,110.00 | |
Basic | 492 | 793.0 | 337.50 | 166,050.00 | |
Mega | 492 | 795.0 | 305.00 | 150,060.00 | |
Chair | 246 | 401.0 | 60.00 | 14,760.00 | |
Basic | 164 | 266.0 | 60.00 | 9,840.00 | |
Mega | 82 | 135.0 | 60.00 | 4,920.00 | |
Hoodie | 738 | 1186.0 | 47.33 | 34,932.00 | |
Basic | 164 | 264.0 | 47.00 | 7,708.00 | |
Mega | 164 | 261.0 | 46.50 | 7,626.00 | |
NewBrand | 164 | 262.0 | 48.50 | 7,954.00 | |
Over | 246 | 399.0 | 47.33 | 11,644.00 | |
Shoes | 1310 | 2123.0 | 60.00 | 78,600.00 | |
Basic | 409 | 661.0 | 60.00 | 24,540.00 | |
Mega | 328 | 532.0 | 60.00 | 19,680.00 | |
NewBrand | 327 | 531.0 | 60.00 | 19,620.00 | |
Over | 246 | 399.0 | 60.00 | 14,760.00 | |
T-shirt | 1230 | 1991.0 | 22.00 | 27,060.00 | |
Basic | 328 | 533.0 | 22.00 | 7,216.00 | |
Mega | 328 | 529.0 | 21.50 | 7,052.00 | |
NewBrand | 328 | 528.0 | 22.50 | 7,380.00 | |
Over | 246 | 401.0 | 22.00 | 5,412.00 | |
Table | 492 | 788.0 | 255.00 | 125,460.00 | |
Basic | 246 | 392.0 | 240.00 | 59,040.00 | |
Mega | 246 | 396.0 | 270.00 | 66,420.00 |
Applying style with pandas#
pandas provides many ways to style a DataFrame.
Here, we’ll apply a background color to a specified row:
[8]:
import pandas as pd
[9]:
index_slice = pd.IndexSlice
subset = index_slice[index_slice["Chair", "Mega"], :]
count_quantity_and_unit_prices_by_sub_category_and_brand_style = (
count_quantity_and_unit_prices_by_sub_category_and_brand_df.style.set_properties(
**{"background-color": "#ffff92"},
subset=subset,
)
)
count_quantity_and_unit_prices_by_sub_category_and_brand_style
[9]:
Count | Quantity | Unit price.MEAN | Unit price.SUM | ||
---|---|---|---|---|---|
Sub category | Brand | ||||
Total | 5000 | 8077.0 | 119.38 | 596,922.00 | |
Bed | 984 | 1588.0 | 321.25 | 316,110.00 | |
Basic | 492 | 793.0 | 337.50 | 166,050.00 | |
Mega | 492 | 795.0 | 305.00 | 150,060.00 | |
Chair | 246 | 401.0 | 60.00 | 14,760.00 | |
Basic | 164 | 266.0 | 60.00 | 9,840.00 | |
Mega | 82 | 135.0 | 60.00 | 4,920.00 | |
Hoodie | 738 | 1186.0 | 47.33 | 34,932.00 | |
Basic | 164 | 264.0 | 47.00 | 7,708.00 | |
Mega | 164 | 261.0 | 46.50 | 7,626.00 | |
NewBrand | 164 | 262.0 | 48.50 | 7,954.00 | |
Over | 246 | 399.0 | 47.33 | 11,644.00 | |
Shoes | 1310 | 2123.0 | 60.00 | 78,600.00 | |
Basic | 409 | 661.0 | 60.00 | 24,540.00 | |
Mega | 328 | 532.0 | 60.00 | 19,680.00 | |
NewBrand | 327 | 531.0 | 60.00 | 19,620.00 | |
Over | 246 | 399.0 | 60.00 | 14,760.00 | |
T-shirt | 1230 | 1991.0 | 22.00 | 27,060.00 | |
Basic | 328 | 533.0 | 22.00 | 7,216.00 | |
Mega | 328 | 529.0 | 21.50 | 7,052.00 | |
NewBrand | 328 | 528.0 | 22.50 | 7,380.00 | |
Over | 246 | 401.0 | 22.00 | 5,412.00 | |
Table | 492 | 788.0 | 255.00 | 125,460.00 | |
Basic | 246 | 392.0 | 240.00 | 59,040.00 | |
Mega | 246 | 396.0 | 270.00 | 66,420.00 |
Adding style with CSS#
Before being a PDF, our report will be an HTML file so we can use CSS to style it:
[10]:
table_css = """
table {
border-collapse: collapse;
}
td {
border: 1px solid black;
padding: 8px;
text-align: right;
}
th {
border: 1px solid black;
padding: 8px;
text-align: center;
}
thead {
background-color: lightgray;
}
"""
Creating the chart#
Many Python libraries can create charts from pandas DataFrames. In this how-to, we’ll use seaborn which is a wrapper around Matplotlib:
[11]:
quantity_by_sub_category_df = cube.query(m["Quantity.SUM"], levels=[l["Sub category"]])
[12]:
import matplotlib_inline.backend_inline
[13]:
# Render charts as SVG to make them look crisp.
matplotlib_inline.backend_inline.set_matplotlib_formats("svg")
[14]:
import io
import matplotlib.pyplot as plt
import seaborn as sns
[15]:
sns.barplot(
data=quantity_by_sub_category_df.reset_index(),
x="Sub category",
y="Quantity.SUM",
).set(title="Quantity.SUM by Sub category")
chart = io.BytesIO()
plt.savefig(chart, format="svg")
Creating the HTML report#
We’ll use Jinja to render this template:
[16]:
from pathlib import Path
from IPython.display import Code
[17]:
template_path = Path("__resources__") / "generate_pdf_reports" / "template.html"
Code(filename=template_path, language="html+jinja")
[17]:
<html>
<head>
<style>
{{ table_css }}
</style>
</head>
<body>
<header
style="
display: flex;
align-items: center;
justify-content: space-between;
width: 100%;
"
>
<h1>{{ report_name }}</h1>
<h5>Report Date: {{ as_of_date }}</h5>
</header>
<div>
<h3>{{ table_title }}</h3>
<div>{{ table }}</div>
</div>
<br />
<div>{{ chart }}</div>
</body>
</html>
With this data:
[18]:
from datetime import datetime
[19]:
template_data = {
"as_of_date": datetime.now().strftime("%Y/%m/%d"),
"chart": chart.getvalue().decode(),
"report_name": "Atoti PDF Report",
"table": count_quantity_and_unit_prices_by_sub_category_and_brand_style.to_html(),
"table_css": table_css,
"table_title": "Measures by Sub category and Brand",
}
[20]:
from jinja2 import Environment, FileSystemLoader
[21]:
environment = Environment(
autoescape=False, # noqa: S701
loader=FileSystemLoader("."),
)
template = environment.get_template(str(template_path))
html = template.render(template_data)
Exporting the report as a PDF#
WeasyPrint can render HTML as a PDF:
[22]:
import weasyprint
from IPython.display import IFrame
[23]:
pdf_path = "../_static/report.pdf"
document = weasyprint.HTML(string=html)
document.write_pdf(pdf_path)
IFrame(pdf_path, width=800, height=800)
Going further#
The report could be sent by email periodically using the schedule library or:
Task Scheduler on Windows
cron
on Linux or macOSA Lambda function in the cloud