Tablespam - Simple Tables Made Simple¶
The objective of tablespam is to provide a “good enough” approach to
creating tables in python. tablespam is a port of the R package
tablespan.
tablespam currently builds on the awesome package
great_tables,
which allows tables created with tablespam to be exported to the
following formats:
Excel (using
openpyxl)HTML (using
great_tables)LaTeX (using
great_tables)RTF (using
great_tables)
Installation¶
tablespam is available from pip:
pip install tablespam
To install the development version from GitHub, run:
pip install git+https://github.com/jhorzek/tablespam.git#egg=tablespam
Introduction¶
Python has a large set of great packages that allow you to create and
export tables that look exactly like you envisioned. However, sometimes
you may just need a good-enough table that is easy to create and share
with others. This is where tablespam can be of help.
Let’s assume that we want to share the following table:
import polars as pl
from tablespam.Data.mtcars import mtcars
# Import mtcars data from R:
cars = mtcars()
summarized_table = cars.group_by(['cyl', 'vs'],
maintain_order=True).agg(
[
pl.len().alias('N'),
pl.col('hp').mean().alias('mean_hp'),
pl.col('hp').std().alias('sd_hp'),
pl.col('wt').mean().alias('mean_wt'),
pl.col('wt').std().alias('sd_wt'),
]
).sort(pl.col('cyl', 'vs'))
print(summarized_table)
shape: (5, 7)
┌─────┬─────┬─────┬────────────┬───────────┬──────────┬──────────┐
│ cyl ┆ vs ┆ N ┆ mean_hp ┆ sd_hp ┆ mean_wt ┆ sd_wt │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ u32 ┆ f64 ┆ f64 ┆ f64 ┆ f64 │
╞═════╪═════╪═════╪════════════╪═══════════╪══════════╪══════════╡
│ 4 ┆ 0 ┆ 1 ┆ 91.0 ┆ null ┆ 2.14 ┆ null │
│ 4 ┆ 1 ┆ 10 ┆ 81.8 ┆ 21.872357 ┆ 2.3003 ┆ 0.598207 │
│ 6 ┆ 0 ┆ 3 ┆ 131.666667 ┆ 37.527767 ┆ 2.755 ┆ 0.12816 │
│ 6 ┆ 1 ┆ 4 ┆ 115.25 ┆ 9.17878 ┆ 3.38875 ┆ 0.116216 │
│ 8 ┆ 0 ┆ 14 ┆ 209.214286 ┆ 50.976886 ┆ 3.999214 ┆ 0.759405 │
└─────┴─────┴─────┴────────────┴───────────┴──────────┴──────────┘
Note:
tablespamcurrently only supportspolarsdata frames.
We don’t want to share the table as is - the variable names are all a bit technical and the table could need some spanners summarizing columns. So, we want to share a table that looks something like this:
| | Horse Power | Weight |
| Cylinder | Engine | Mean | SD | Mean | SD |
| -------- | ------ | ----- | --- | ---- | -- |
| | |
tablespam allows us to create this table with a single formula.
Creating a Basic Table¶
In tablespam, the table headers are defined with a formula inspired by
R. For example, "cyl ~ mean_hp + sd_hp" defines a table with cyl as
the row names and mean_hp and sd_hp as columns:
from tablespam import TableSpam
tbl = TableSpam(data = summarized_table,
formula = "cyl ~ mean_hp + sd_hp")
print(tbl.as_string())
| cyl | mean_hp sd_hp |
| --- - ------- ----- |
| 4 | 91.0 None |
| 4 | 81.8 21.87 |
| 6 | 131.67 37.53 |
| ... | ... ... |
Note that the row names (cyl) are in a separate block to the left.
Adding Spanners¶
Spanners are defined using braces and spanner names. For example, the
following defines a spanner for mean_hp and sd_hp with the name
Horsepower: "cyl ~ (Horsepower = mean_hp + sd_hp)":
tbl = TableSpam(data = summarized_table,
formula = "cyl ~ (Horsepower = mean_hp + sd_hp)")
print(tbl.as_string())
| | Horsepower |
| cyl | mean_hp sd_hp |
| --- - ---------- ----- |
| 4 | 91.0 None |
| 4 | 81.8 21.87 |
| 6 | 131.67 37.53 |
| ... | ... ... |
Spanners can also be nested:
tbl = TableSpam(data = summarized_table,
formula = "cyl ~ (Horsepower = (Mean = mean_hp) + (SD = sd_hp))")
print(tbl.as_string())
| | Horsepower |
| | Mean SD |
| cyl | mean_hp sd_hp |
| --- - ---------- ----- |
| 4 | 91.0 None |
| 4 | 81.8 21.87 |
| 6 | 131.67 37.53 |
| ... | ... ... |
Renaming Columns¶
Variable names in an data frame are often very technical (e.g.,
mean_hp and sd_hp). When sharing the table, we may want to replace
those names. In the example above, we may want to replace mean_hp and
sd_hp with “Mean” and “SD”. In tablespam renaming variables is
achieved with new_name:old_name. For example,
"cyl ~ (Horsepower = Mean:mean_hp + SD:sd_hp)" renames mean_hp to
Mean and sd_hp to SD:
tbl = TableSpam(data = summarized_table,
formula = "cyl ~ (Horsepower = Mean:mean_hp + SD:sd_hp)")
print(tbl.as_string())
| | Horsepower |
| cyl | Mean SD |
| --- - ---------- ----- |
| 4 | 91.0 None |
| 4 | 81.8 21.87 |
| 6 | 131.67 37.53 |
| ... | ... ... |
Creating the Full Table¶
The combination of row names, spanners, and renaming of variables allows creating the full table:
tbl = TableSpam(data = summarized_table,
formula = """Cylinder:cyl + Engine:vs ~
N +
(`Horse Power` = Mean:mean_hp + SD:sd_hp) +
(`Weight` = Mean:mean_wt + SD:sd_wt)""",
title = "Motor Trend Car Road Tests",
subtitle = "A table created with tablespam",
footnote = "Data from the infamous mtcars data set.")
print(tbl.as_string())
Motor Trend Car Road Tests
A table created with tablespam
| | Horse Power Weight |
| Cylinder Engine | N Mean SD Mean SD |
| -------- ------ - --- ----------- ----- ------ ---- |
| 4 0 | 1 91.0 None 2.14 None |
| 4 1 | 10 81.8 21.87 2.3 0.6 |
| 6 0 | 3 131.67 37.53 2.76 0.13 |
| ... ... | ... ... ... ... ... |
Data from the infamous mtcars data set.
Exporting to Excel¶
Note: The screenshots below are from the R-package tablespan and may deviate slightly from the actual output. The screenshots will be updated soon.
Tables created with tablespam can be exported to openpyxl workbooks,
which allows saving as .xlsx files.
# Translate to openpyxl:
tbl_xlsx = tbl.as_excel()
# save the table:
# tbl_xlsx.save("my_table.xlsx")

Styling¶
While tablespam provides limited styling options, some elements can be
adjusted. For example, we may want to print some elements in bold or
format numbers differently. In tablespam, styling happens when
translating the table to an openpyxl workbook with as_excel.
Changing the Overall Look¶
The easiest way to customize tables is to change the default color
scheme. The class XlsxStyles provides control over most elements in
the table, but in many cases style_color may be sufficient. The
following creates a table with teal-colored backgrounds for the title,
header, and row names:
from tablespam.Excel.xlsx_styles import style_color
tbl_xlsx = tbl.as_excel(styles = style_color(primary_color = "008080"))
# save the table:
# tbl_xlsx.save("my_table.xlsx")

Similarly, a dark background can be defined as follows:
tbl_xlsx = tbl.as_excel(styles = style_color(primary_color = "000000"))
# save the table:
# tbl_xlsx.save("my_table.xlsx")
Formatting Cells¶
Let’s assume we want all mean_hp values with a value $\geq 100$ to be
printed in bold. To this end, we first create a function that takes in a
single openpyxl cell and applies a style to it:
import openpyxl
def bold(c):
c.font = openpyxl.styles.Font(bold=True)
Next, we have to define a CellStyle for the column mean_hp, where we
pass in the index of the rows that should be bold.
Note: openpyxl uses 1-based indexing!
# get the indices:
geq_100 = (summarized_table.with_row_index(name="index")
.filter(pl.col("mean_hp") >= 100)
.select("index")
.to_series()
.to_list())
# translate to 1-based index:
geq_100 = [i + 1 for i in geq_100]
Define cell styles:
from tablespam.Excel.xlsx_styles import XlsxStyles, CellStyle
styles=XlsxStyles(
cell_styles=[
CellStyle(
rows=geq_100,
cols=['mean_hp'],
style=bold,
),
]
)
Finally, we pass this style to as_excel:
tbl_xlsx = tbl.as_excel(styles=styles)
# save the table:
# tbl_xlsx.save("my_table.xlsx")

Formatting Data Types¶
tablespan also allows formatting specific data types. Let’s assume
that we want to round all doubles to 3 instead of the default 2 digits.
To this end, we use DataStyles, where we specify (1) a function that
checks for the data type we want to style (here for doubles) and (2) a
style for all columns that match that style:
from tablespam.Excel.xlsx_styles import DataStyle
# Define test: The function will be passed a polars data frame and should
# check for the data types defined here:
# https://docs.pola.rs/api/python/stable/reference/datatypes.html
def is_double(x: pl.DataFrame) -> bool:
return all([tp in [pl.Float32, pl.Float64] for tp in x.dtypes])
# Function that applies our style to a single cell:
def three_digits(c):
c.number_format = "0.000"
# Now we define the data_style. The data_style must be a
# dict.
data_style = {
"double": DataStyle(test = is_double,
style = three_digits)
}
tbl_xlsx = tbl.as_excel(
styles=XlsxStyles(
data_styles=data_style
)
)
# tbl_xlsx.save("my_table.xlsx")

Exporting to HTML, LaTeX, and RTF¶
Tables created with tablespam can be exported to great_tables which
allows saving as HTML, LaTeX, or RTF file. To this end, we simply have
to call as_gt on our table:
# Translate to gt:
gt_tbl = tbl.as_gt()
gt_tbl.show()
Styling Great Tables¶
The great_tables package provides a wide range of functions to adapt
the style of the table created with as_gt. For instance, opt_stylize
adds a pre-defined style to the entire table:
(gt_tbl
.opt_stylize(style = 6,
color = 'gray')
.show())
When adapting the great_tables object, there is an important detail to
keep in mind: To ensure that each table spanner has a unique ID,
tablespam will create IDs that differ from the text shown in the
spanner. To demonstrate this, Let’s assume that we want to add a spanner
above Horse Power and Weight:
(gt_tbl
.tab_spanner(label = "New Spanner",
spanners = ["Horse Power", "Weight"]))
AssertionError:
[0;31m---------------------------------------------------------------------------[0m
[0;31mAssertionError[0m Traceback (most recent call last)
Cell [0;32mIn[33], line 2[0m
[1;32m 1[0m ([43mgt_tbl[49m
[0;32m----> 2[0m [43m [49m[38;5;241;43m.[39;49m[43mtab_spanner[49m[43m([49m[43mlabel[49m[43m [49m[38;5;241;43m=[39;49m[43m [49m[38;5;124;43m"[39;49m[38;5;124;43mNew Spanner[39;49m[38;5;124;43m"[39;49m[43m,[49m[43m [49m
[1;32m 3[0m [43m [49m[43mspanners[49m[43m [49m[38;5;241;43m=[39;49m[43m [49m[43m[[49m[38;5;124;43m"[39;49m[38;5;124;43mHorse Power[39;49m[38;5;124;43m"[39;49m[43m,[49m[43m [49m[38;5;124;43m"[39;49m[38;5;124;43mWeight[39;49m[38;5;124;43m"[39;49m[43m][49m[43m)[49m)
File [0;32m~/Documents/Programming/Python/tablespam/.venv/lib/python3.13/site-packages/great_tables/_spanners.py:165[0m, in [0;36mtab_spanner[0;34m(self, label, columns, spanners, level, id, gather, replace)[0m
[1;32m 161[0m [38;5;66;03m# select spanner ids ----[39;00m
[1;32m 162[0m [38;5;66;03m# TODO: this supports tidyselect[39;00m
[1;32m 163[0m [38;5;66;03m# TODO: could we use something like resolve_vector_l[39;00m
[1;32m 164[0m [38;5;28;01mif[39;00m spanners [38;5;129;01mis[39;00m [38;5;129;01mnot[39;00m [38;5;28;01mNone[39;00m:
[0;32m--> 165[0m [38;5;28;01massert[39;00m [38;5;28mset[39m(spanners)[38;5;241m.[39missubset([38;5;28mset[39m(crnt_spanner_ids))
[1;32m 166[0m spanner_ids [38;5;241m=[39m spanners
[1;32m 167[0m [38;5;28;01melse[39;00m:
[0;31mAssertionError[0m:
This will throw an error because the spanner IDs are different from the spanner labels.
In general, the ID of a spanner is given by a concatenation of “BASE_LEVEL” and the names of all spanners above the current spanner. For example, the IDs for Horse Power and Weight are “__BASE_LEVEL__Horse Power” and “__BASE_LEVEL__Weight”:
(gt_tbl
.tab_spanner(label = "New Spanner",
spanners = ["__BASE_LEVEL__Horse Power",
"__BASE_LEVEL__Weight"])
.show())
Tables without row names¶
Using 1 on the left hand side of the formula creates a table without
row names. For example, "1 ~ (Horsepower = Mean:mean_hp + SD:sd_hp)"
defines
tbl = TableSpam(data = summarized_table,
formula = "1 ~ (Horsepower = Mean:mean_hp + SD:sd_hp)")
print(tbl.as_string())
| Horsepower |
| Mean SD |
| ---------- ----- |
| 91.0 None |
| 81.8 21.87 |
| 131.67 37.53 |
| ... ... |
References¶
great_tables: Iannone R., et al., https://posit-dev.github.io/great-tables/articles/intro.html
tables: Murdoch D (2024). tables: Formula-Driven Table Generation. R package version 0.9.31, https://dmurdoch.github.io/tables/.
openpyxl: Gazoni, E. & Clark C., https://openpyxl.readthedocs.io/en/stable/
Contents: