Structure

R
Python

import pandas as pd
from great_tables import GT, from_column, style, loc
coffee = pd.read_csv("data/coffee-sales.csv")
coffee

             product  revenue_dollars  revenue_pct  profit_dollars  profit_pct
0            Grinder           904500         0.03          567960        0.04
1           Moka pot          2045250         0.07          181080        0.01
2          Cold brew           288750         0.01          241770        0.02
3             Filter           404250         0.01           70010        0.00
4       Drip machine          2632000         0.09         1374450        0.09
5          AeroPress          2601500         0.09         1293780        0.09
6          Pour over           846000         0.03          364530        0.02
7       French press          1113250         0.04          748120        0.05
8              Cezve          2512500         0.09         1969520        0.13
9             Chemex          3137250         0.11          817680        0.06
10             Scale          3801000         0.13         2910290        0.20
11            Kettle           756250         0.03          617520        0.04
12  Espresso Machine          8406000         0.29         3636440        0.25
13             Total         29448500         1.00        14793150        1.00

Starting a GT Table

R
Python

gt_tbl1 = GT(coffee)
gt_tbl1

product	revenue_dollars	revenue_pct	profit_dollars	profit_pct
Grinder	904500	0.03	567960	0.04
Moka pot	2045250	0.07	181080	0.01
Cold brew	288750	0.01	241770	0.02
Filter	404250	0.01	70010	0.0
Drip machine	2632000	0.09	1374450	0.09
AeroPress	2601500	0.09	1293780	0.09
Pour over	846000	0.03	364530	0.02
French press	1113250	0.04	748120	0.05
Cezve	2512500	0.09	1969520	0.13
Chemex	3137250	0.11	817680	0.06
Scale	3801000	0.13	2910290	0.2
Kettle	756250	0.03	617520	0.04
Espresso Machine	8406000	0.29	3636440	0.25
Total	29448500	1.0	14793150	1.0

Working with the Header

We use tab_header() to modify the header of our table. We can supply both a title and subtitle for our Table.

R
Python

Here we’re using the pipe |> to chain our commands together.

Here we chain commands together using . - we usually use this in Python to select methods.

gt_tbl_2 = GT(coffee) \
  .tab_header(title="Sales of Coffee Equipment")
  
gt_tbl_2

Sales of Coffee Equipment
product	revenue_dollars	revenue_pct	profit_dollars	profit_pct
Grinder	904500	0.03	567960	0.04
Moka pot	2045250	0.07	181080	0.01
Cold brew	288750	0.01	241770	0.02
Filter	404250	0.01	70010	0.0
Drip machine	2632000	0.09	1374450	0.09
AeroPress	2601500	0.09	1293780	0.09
Pour over	846000	0.03	364530	0.02
French press	1113250	0.04	748120	0.05
Cezve	2512500	0.09	1969520	0.13
Chemex	3137250	0.11	817680	0.06
Scale	3801000	0.13	2910290	0.2
Kettle	756250	0.03	617520	0.04
Espresso Machine	8406000	0.29	3636440	0.25
Total	29448500	1.0	14793150	1.0

Selecting Columns

How do we specify columns to modify? We can use column selectors (in Great Tables in Python) or tidy selectors in R.

R
Python

columns = ["profit_dollars", "profit_pct"]
columns = [4, 5]
# anonymous function
columns=lambda x: "profit" in x

Much more info about column selectors in Python: https://posit-dev.github.io/great-tables/get-started/column-selection.html

Column Spanners

We can specify a spanner with tab_spanner(), which takes two main arguments:

label - what we want the spanner to say
columns - which columns we want to apply the spanner to.

R
Python

gt_tbl_3 = (GT(coffee) \
  .tab_spanner(label="Revenue", columns=["revenue_dollars", "revenue_pct"])
  .tab_spanner(label="Profit", columns=["profit_dollars", "profit_pct"]))
gt_tbl_3

product	Revenue		Profit
product	revenue_dollars	revenue_pct	profit_dollars	profit_pct
Grinder	904500	0.03	567960	0.04
Moka pot	2045250	0.07	181080	0.01
Cold brew	288750	0.01	241770	0.02
Filter	404250	0.01	70010	0.0
Drip machine	2632000	0.09	1374450	0.09
AeroPress	2601500	0.09	1293780	0.09
Pour over	846000	0.03	364530	0.02
French press	1113250	0.04	748120	0.05
Cezve	2512500	0.09	1969520	0.13
Chemex	3137250	0.11	817680	0.06
Scale	3801000	0.13	2910290	0.2
Kettle	756250	0.03	617520	0.04
Espresso Machine	8406000	0.29	3636440	0.25
Total	29448500	1.0	14793150	1.0

Column Labels

Change Column labels for readability. The column names are: product, revenue_dollars, revenue_pct, profit_dollars, profit_pct, and monthly_sales.

R
Python

gt_tbl_4 = (
    gt_tbl_3
    .cols_label(
        product="Product",
        revenue_dollars="Amount",
        revenue_pct="Percent",
        profit_dollars="Amount",
        profit_pct="Percent"
    )
)

gt_tbl_4

Product	Revenue		Profit
Product	Amount	Percent	Amount	Percent
Grinder	904500	0.03	567960	0.04
Moka pot	2045250	0.07	181080	0.01
Cold brew	288750	0.01	241770	0.02
Filter	404250	0.01	70010	0.0
Drip machine	2632000	0.09	1374450	0.09
AeroPress	2601500	0.09	1293780	0.09
Pour over	846000	0.03	364530	0.02
French press	1113250	0.04	748120	0.05
Cezve	2512500	0.09	1969520	0.13
Chemex	3137250	0.11	817680	0.06
Scale	3801000	0.13	2910290	0.2
Kettle	756250	0.03	617520	0.04
Espresso Machine	8406000	0.29	3636440	0.25
Total	29448500	1.0	14793150	1.0

Formatting the Stub

I have to use a separate table to show these. We have two columns in this dataset: row, which serve as the row labels, and group, which serve as the column labels.

R
Python

from great_tables import data
exibble = data.exibble

out =  (GT(exibble)\
  .tab_stub(rowname_col = "row")
  )

out

	num	char	fctr	date	time	datetime	currency	group
row_1	0.1111	apricot	one	2015-01-15	13:35	2018-01-01 02:22	49.95	grp_a
row_2	2.222	banana	two	2015-02-15	14:40	2018-02-02 14:33	17.95	grp_a
row_3	33.33	coconut	three	2015-03-15	15:45	2018-03-03 03:44	1.39	grp_a
row_4	444.4	durian	four	2015-04-15	16:50	2018-04-04 15:55	65100.0	grp_a
row_5	5550.0		five	2015-05-15	17:55	2018-05-05 04:00	1325.81	grp_b
row_6		fig	six	2015-06-15		2018-06-06 16:11	13.255	grp_b
row_7	777000.0	grapefruit	seven		19:10	2018-07-07 05:22		grp_b
row_8	8880000.0	honeydew	eight	2015-08-15	20:20		0.44	grp_b

R
Python

out =  (GT(exibble)\
  .tab_stub(rowname_col = "row", groupname_col = "group")
  )

out

	num	char	fctr	date	time	datetime	currency
grp_a
row_1	0.1111	apricot	one	2015-01-15	13:35	2018-01-01 02:22	49.95
row_2	2.222	banana	two	2015-02-15	14:40	2018-02-02 14:33	17.95
row_3	33.33	coconut	three	2015-03-15	15:45	2018-03-03 03:44	1.39
row_4	444.4	durian	four	2015-04-15	16:50	2018-04-04 15:55	65100.0
grp_b
row_5	5550.0		five	2015-05-15	17:55	2018-05-05 04:00	1325.81
row_6		fig	six	2015-06-15		2018-06-06 16:11	13.255
row_7	777000.0	grapefruit	seven		19:10	2018-07-07 05:22
row_8	8880000.0	honeydew	eight	2015-08-15	20:20		0.44

We can make the group its own column with row_group_as_column = TRUE. This does not appear to be the case in Python, however:

R
Python

out =  (GT(exibble)\
  .tab_stub(groupname_col = "group")
  )

out

num	char	fctr	date	time	datetime	currency	row
0.1111	apricot	one	2015-01-15	13:35	2018-01-01 02:22	49.95	row_1
2.222	banana	two	2015-02-15	14:40	2018-02-02 14:33	17.95	row_2
33.33	coconut	three	2015-03-15	15:45	2018-03-03 03:44	1.39	row_3
444.4	durian	four	2015-04-15	16:50	2018-04-04 15:55	65100.0	row_4
5550.0		five	2015-05-15	17:55	2018-05-05 04:00	1325.81	row_5
	fig	six	2015-06-15		2018-06-06 16:11	13.255	row_6
777000.0	grapefruit	seven		19:10	2018-07-07 05:22		row_7
8880000.0	honeydew	eight	2015-08-15	20:20		0.44	row_8

```