Structure

import pandas as pd
from great_tables import GT, from_column, style, loc
coffee = pd.read_csv("data/coffee-sales.csv")
coffee
             product  revenue_dollars  revenue_pct  profit_dollars  profit_pct
0            Grinder           904500         0.03          567960        0.04
1           Moka pot          2045250         0.07          181080        0.01
2          Cold brew           288750         0.01          241770        0.02
3             Filter           404250         0.01           70010        0.00
4       Drip machine          2632000         0.09         1374450        0.09
5          AeroPress          2601500         0.09         1293780        0.09
6          Pour over           846000         0.03          364530        0.02
7       French press          1113250         0.04          748120        0.05
8              Cezve          2512500         0.09         1969520        0.13
9             Chemex          3137250         0.11          817680        0.06
10             Scale          3801000         0.13         2910290        0.20
11            Kettle           756250         0.03          617520        0.04
12  Espresso Machine          8406000         0.29         3636440        0.25
13             Total         29448500         1.00        14793150        1.00

Starting a GT Table

gt_tbl1 = GT(coffee)
gt_tbl1
product revenue_dollars revenue_pct profit_dollars profit_pct
Grinder 904500 0.03 567960 0.04
Moka pot 2045250 0.07 181080 0.01
Cold brew 288750 0.01 241770 0.02
Filter 404250 0.01 70010 0.0
Drip machine 2632000 0.09 1374450 0.09
AeroPress 2601500 0.09 1293780 0.09
Pour over 846000 0.03 364530 0.02
French press 1113250 0.04 748120 0.05
Cezve 2512500 0.09 1969520 0.13
Chemex 3137250 0.11 817680 0.06
Scale 3801000 0.13 2910290 0.2
Kettle 756250 0.03 617520 0.04
Espresso Machine 8406000 0.29 3636440 0.25
Total 29448500 1.0 14793150 1.0

Working with the Header

We use tab_header() to modify the header of our table. We can supply both a title and subtitle for our Table.

Here we’re using the pipe |> to chain our commands together.

Here we chain commands together using . - we usually use this in Python to select methods.

gt_tbl_2 = GT(coffee) \
  .tab_header(title="Sales of Coffee Equipment")
  
gt_tbl_2
Sales of Coffee Equipment
product revenue_dollars revenue_pct profit_dollars profit_pct
Grinder 904500 0.03 567960 0.04
Moka pot 2045250 0.07 181080 0.01
Cold brew 288750 0.01 241770 0.02
Filter 404250 0.01 70010 0.0
Drip machine 2632000 0.09 1374450 0.09
AeroPress 2601500 0.09 1293780 0.09
Pour over 846000 0.03 364530 0.02
French press 1113250 0.04 748120 0.05
Cezve 2512500 0.09 1969520 0.13
Chemex 3137250 0.11 817680 0.06
Scale 3801000 0.13 2910290 0.2
Kettle 756250 0.03 617520 0.04
Espresso Machine 8406000 0.29 3636440 0.25
Total 29448500 1.0 14793150 1.0

Selecting Columns

How do we specify columns to modify? We can use column selectors (in Great Tables in Python) or tidy selectors in R.

columns = ["profit_dollars", "profit_pct"]
columns = [4, 5]
# anonymous function
columns=lambda x: "profit" in x

Much more info about column selectors in Python: https://posit-dev.github.io/great-tables/get-started/column-selection.html

Column Spanners

We can specify a spanner with tab_spanner(), which takes two main arguments:

  • label - what we want the spanner to say
  • columns - which columns we want to apply the spanner to.
gt_tbl_3 = (GT(coffee) \
  .tab_spanner(label="Revenue", columns=["revenue_dollars", "revenue_pct"])
  .tab_spanner(label="Profit", columns=["profit_dollars", "profit_pct"]))
gt_tbl_3
product Revenue Profit
revenue_dollars revenue_pct profit_dollars profit_pct
Grinder 904500 0.03 567960 0.04
Moka pot 2045250 0.07 181080 0.01
Cold brew 288750 0.01 241770 0.02
Filter 404250 0.01 70010 0.0
Drip machine 2632000 0.09 1374450 0.09
AeroPress 2601500 0.09 1293780 0.09
Pour over 846000 0.03 364530 0.02
French press 1113250 0.04 748120 0.05
Cezve 2512500 0.09 1969520 0.13
Chemex 3137250 0.11 817680 0.06
Scale 3801000 0.13 2910290 0.2
Kettle 756250 0.03 617520 0.04
Espresso Machine 8406000 0.29 3636440 0.25
Total 29448500 1.0 14793150 1.0

Column Labels

Change Column labels for readability. The column names are: product, revenue_dollars, revenue_pct, profit_dollars, profit_pct, and monthly_sales.

gt_tbl_4 = (
    gt_tbl_3
    .cols_label(
        product="Product",
        revenue_dollars="Amount",
        revenue_pct="Percent",
        profit_dollars="Amount",
        profit_pct="Percent"
    )
)

gt_tbl_4
Product Revenue Profit
Amount Percent Amount Percent
Grinder 904500 0.03 567960 0.04
Moka pot 2045250 0.07 181080 0.01
Cold brew 288750 0.01 241770 0.02
Filter 404250 0.01 70010 0.0
Drip machine 2632000 0.09 1374450 0.09
AeroPress 2601500 0.09 1293780 0.09
Pour over 846000 0.03 364530 0.02
French press 1113250 0.04 748120 0.05
Cezve 2512500 0.09 1969520 0.13
Chemex 3137250 0.11 817680 0.06
Scale 3801000 0.13 2910290 0.2
Kettle 756250 0.03 617520 0.04
Espresso Machine 8406000 0.29 3636440 0.25
Total 29448500 1.0 14793150 1.0

Formatting the Stub

I have to use a separate table to show these. We have two columns in this dataset: row, which serve as the row labels, and group, which serve as the column labels.

from great_tables import data
exibble = data.exibble

out =  (GT(exibble)\
  .tab_stub(rowname_col = "row")
  )

out
num char fctr date time datetime currency group
row_1 0.1111 apricot one 2015-01-15 13:35 2018-01-01 02:22 49.95 grp_a
row_2 2.222 banana two 2015-02-15 14:40 2018-02-02 14:33 17.95 grp_a
row_3 33.33 coconut three 2015-03-15 15:45 2018-03-03 03:44 1.39 grp_a
row_4 444.4 durian four 2015-04-15 16:50 2018-04-04 15:55 65100.0 grp_a
row_5 5550.0 five 2015-05-15 17:55 2018-05-05 04:00 1325.81 grp_b
row_6 fig six 2015-06-15 2018-06-06 16:11 13.255 grp_b
row_7 777000.0 grapefruit seven 19:10 2018-07-07 05:22 grp_b
row_8 8880000.0 honeydew eight 2015-08-15 20:20 0.44 grp_b
out =  (GT(exibble)\
  .tab_stub(rowname_col = "row", groupname_col = "group")
  )

out
num char fctr date time datetime currency
grp_a
row_1 0.1111 apricot one 2015-01-15 13:35 2018-01-01 02:22 49.95
row_2 2.222 banana two 2015-02-15 14:40 2018-02-02 14:33 17.95
row_3 33.33 coconut three 2015-03-15 15:45 2018-03-03 03:44 1.39
row_4 444.4 durian four 2015-04-15 16:50 2018-04-04 15:55 65100.0
grp_b
row_5 5550.0 five 2015-05-15 17:55 2018-05-05 04:00 1325.81
row_6 fig six 2015-06-15 2018-06-06 16:11 13.255
row_7 777000.0 grapefruit seven 19:10 2018-07-07 05:22
row_8 8880000.0 honeydew eight 2015-08-15 20:20 0.44

We can make the group its own column with row_group_as_column = TRUE. This does not appear to be the case in Python, however:

out =  (GT(exibble)\
  .tab_stub(groupname_col = "group")
  )

out
num char fctr date time datetime currency row
0.1111 apricot one 2015-01-15 13:35 2018-01-01 02:22 49.95 row_1
2.222 banana two 2015-02-15 14:40 2018-02-02 14:33 17.95 row_2
33.33 coconut three 2015-03-15 15:45 2018-03-03 03:44 1.39 row_3
444.4 durian four 2015-04-15 16:50 2018-04-04 15:55 65100.0 row_4
5550.0 five 2015-05-15 17:55 2018-05-05 04:00 1325.81 row_5
fig six 2015-06-15 2018-06-06 16:11 13.255 row_6
777000.0 grapefruit seven 19:10 2018-07-07 05:22 row_7
8880000.0 honeydew eight 2015-08-15 20:20 0.44 row_8

```