Buckaroo Embedding Guide
========================

This guide covers everything you need to embed interactive buckaroo tables
in your own applications, documentation, and reports.

Why embed
---------

- **Share DataFrames without Jupyter**: Send a colleague an HTML file they
  can open in any browser. No Python install required.
- **Build data apps**: Integrate the buckaroo viewer into React dashboards,
  internal tools, or customer-facing data products.
- **Static reports**: Generate HTML reports from your pipeline that include
  interactive, sortable tables with summary statistics.
- **Documentation**: Embed live data tables in your docs site (Sphinx,
  MkDocs, or plain HTML).

Choose your embedding mode
--------------------------

Buckaroo offers two static embed modes and one live widget mode:

``embed_type="DFViewer"`` (lightweight table)
   Just the data grid with sortable columns, summary stats pinned at the
   bottom, histograms, and type-aware formatting. Smaller payload. Best
   for documentation, reports, and sharing.

``embed_type="Buckaroo"`` (full experience)
   Everything in DFViewer plus the display switcher bar, multiple computed
   views, and the interactive analysis pipeline. Larger payload. Best for
   data exploration and internal tools.

**anywidget** (live in notebooks)
   The ``BuckarooWidget`` runs inside Jupyter, Marimo, VS Code notebooks,
   and Google Colab via anywidget. Full interactivity including the command
   UI for data cleaning operations. Requires a running Python kernel.

For most embedding use cases, start with ``DFViewer``.

Data size guidelines
~~~~~~~~~~~~~~~~~~~~

.. list-table::
   :header-rows: 1

   * - Row count
     - Recommended approach
   * - < 1,000 rows
     - Inline static embed. JSON payload is small (~10-50 KB).
   * - 1,000 - 100,000 rows
     - Static embed still works. Parquet encoding keeps payload
       compact (50-500 KB). Consider sampling for faster page load.
   * - > 100,000 rows
     - Host data separately. Use Parquet range queries on S3/R2 to
       fetch only the visible rows and columns.

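
The thresholds in the table can be captured in a small helper. This is only an illustrative sketch: ``choose_embed_strategy``, ``sample_size``, the returned labels, and the 10,000-row sampling cap are hypothetical and not part of buckaroo; the cut-offs simply mirror the table.

```python
def choose_embed_strategy(n_rows: int) -> str:
    """Map a row count to the approach suggested in the table (hypothetical labels)."""
    if n_rows < 0:
        raise ValueError("n_rows must be non-negative")
    if n_rows < 1_000:
        return "inline"   # small payload, embed directly
    if n_rows <= 100_000:
        return "static"   # still embeddable; consider sampling
    return "hosted"       # host the data separately and fetch ranges


def sample_size(n_rows: int, max_rows: int = 10_000) -> int:
    """Rows to keep when sampling for faster page load (the cap is an assumption)."""
    return min(n_rows, max_rows)
```

With pandas, the sampled frame could then be produced with ``df.sample(n=sample_size(len(df)), random_state=0)`` before calling ``to_html()``.
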

Generate a static embed
-----------------------

.. code-block:: python

   from buckaroo.artifact import to_html
   import pandas as pd

   df = pd.read_csv('my_data.csv')
   html = to_html(df, title="My Data", embed_type="DFViewer")

   with open('my-data.html', 'w') as f:
       f.write(html)

The HTML file references ``static-embed.js`` and ``static-embed.css``.
These are shipped in the buckaroo wheel under ``buckaroo/static/``.
Copy them alongside your generated HTML:

.. code-block:: bash

   STATIC=$(python -c "from pathlib import Path; import buckaroo; print(Path(buckaroo.__file__).parent / 'static')")
   cp "$STATIC/static-embed.js" "$STATIC/static-embed.css" ./

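
On platforms without a POSIX shell, the same copy step can be done from Python. The sketch below uses only the standard library; ``copy_embed_assets`` is a hypothetical helper name, and the two file names are the ones mentioned above.

```python
import shutil
from pathlib import Path

ASSET_NAMES = ("static-embed.js", "static-embed.css")


def copy_embed_assets(static_dir, dest_dir):
    """Copy the embed bundle files from static_dir into dest_dir.

    Returns the destination paths that were written.
    """
    static_dir, dest_dir = Path(static_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in ASSET_NAMES:
        src = static_dir / name
        if not src.is_file():
            raise FileNotFoundError(f"missing embed asset: {src}")
        copied.append(Path(shutil.copy2(src, dest_dir / name)))
    return copied
```

Assuming the assets live under ``buckaroo/static/`` as described above, a call might look like ``copy_embed_assets(Path(buckaroo.__file__).parent / "static", ".")``.
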

**With polars:**

.. code-block:: python

   import polars as pl
   from buckaroo.artifact import to_html

   df = pl.read_parquet('my_data.parquet')
   html = to_html(df, title="Polars Data")

``to_html()`` auto-detects polars DataFrames and uses the polars analysis
pipeline.

**From a file path:**

.. code-block:: python

   from buckaroo.artifact import to_html

   # Reads CSV, Parquet, JSON, or JSONL automatically
   html = to_html('/path/to/data.parquet', title="Direct from file")

Customizing appearance
----------------------

Column config overrides
~~~~~~~~~~~~~~~~~~~~~~~

Pass ``column_config_overrides`` to control per-column display:

.. code-block:: python

   html = to_html(df, column_config_overrides={
       'revenue': {
           'color_map_config': {
               'color_rule': 'color_from_column',
               'map_name': 'RdYlGn',
           }
       },
       'join_key': {
           'color_map_config': {
               'color_rule': 'color_static',
               'color': '#6c5fc7',
           }
       }
   })

Available color rules:

- ``color_from_column``: Color cells based on their value using a named
  colormap (e.g., ``RdYlGn``, ``Blues``, ``Viridis``)
- ``color_categorical``: Map categorical values to a list of colors
- ``color_static``: Constant background color for every cell in the column

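
When many columns need color rules, small builder functions keep the nested dictionaries readable. The helpers below are hypothetical conveniences, not buckaroo API; they only reproduce the dictionary shapes from the example above. ``color_categorical`` is left out because its configuration keys are not shown in this guide.

```python
def color_from_column(map_name):
    """Override that colors cells by value using a named colormap."""
    return {
        "color_map_config": {
            "color_rule": "color_from_column",
            "map_name": map_name,
        }
    }


def color_static(color):
    """Override that applies one constant background color."""
    return {
        "color_map_config": {
            "color_rule": "color_static",
            "color": color,
        }
    }


# Same overrides as the hand-written example above.
overrides = {
    "revenue": color_from_column("RdYlGn"),
    "join_key": color_static("#6c5fc7"),
}
```

The resulting ``overrides`` dictionary can be passed as ``column_config_overrides=overrides``.
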

Tooltips
~~~~~~~~

Show the value of another column on hover:

.. code-block:: python

   column_config_overrides={
       'name': {
           'tooltip_config': {
               'tooltip_type': 'simple',
               'val_column': 'full_name',
           }
       }
   }

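
Color and tooltip settings are both keyed by column name, so it can be handy to define them separately and combine them before calling ``to_html()``. The sketch below assumes that a single column's override may carry both a ``color_map_config`` and a ``tooltip_config``; that assumption and the ``merge_overrides`` helper are not taken from the buckaroo documentation.

```python
def merge_overrides(*override_sets):
    """Combine several column_config_overrides dicts, column by column.

    Later sets win when the same key appears twice for a column.
    The input dictionaries are not modified.
    """
    merged = {}
    for overrides in override_sets:
        for column, config in overrides.items():
            merged.setdefault(column, {}).update(config)
    return merged


colors = {"name": {"color_map_config": {"color_rule": "color_static",
                                        "color": "#6c5fc7"}}}
tooltips = {"name": {"tooltip_config": {"tooltip_type": "simple",
                                        "val_column": "full_name"}}}
combined = merge_overrides(colors, tooltips)
```

Here ``combined["name"]`` holds both configuration keys, while ``colors`` and ``tooltips`` stay unchanged.
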

Analysis classes
~~~~~~~~~~~~~~~~

Control which summary statistics are computed:

.. code-block:: python

   from buckaroo.artifact import to_html
   from buckaroo.pluggable_analysis_framework.analysis_management import (
       ColAnalysis,
   )

   # MyCustomAnalysis stands for your own ColAnalysis subclass
   # (its definition is not shown here).
   # Use extra_analysis_klasses to add custom stats
   # Use analysis_klasses to replace the default set
   html = to_html(df,
                  extra_analysis_klasses=[MyCustomAnalysis],
                  embed_type="Buckaroo")

See :doc:`pluggable` for details on writing custom analysis classes.

Pinned rows
~~~~~~~~~~~

Add custom pinned rows (shown at the bottom of the table):

.. code-block:: python

   html = to_html(df,
                  extra_pinned_rows=[
                      {'index': 'target', 'a': 100, 'b': 200},
                  ])

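
Pinned rows can also be computed rather than typed by hand. The sketch below builds a row in the same shape as the example above (an ``index`` label plus one value per column) from plain Python records. ``pinned_row`` is a hypothetical helper, and whether buckaroo accepts arbitrary labels such as ``"max"`` is an assumption.

```python
def pinned_row(label, records, columns, reducer=max):
    """Build one pinned-row dict by reducing each column over the records."""
    row = {"index": label}
    for col in columns:
        row[col] = reducer(rec[col] for rec in records)
    return row


records = [{"a": 1, "b": 5}, {"a": 3, "b": 2}]
max_row = pinned_row("max", records, ["a", "b"])
min_row = pinned_row("min", records, ["a", "b"], reducer=min)
```

The resulting dictionaries could then be passed as ``extra_pinned_rows=[max_row, min_row]``.
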

Integration patterns
--------------------

Static HTML file
~~~~~~~~~~~~~~~~

The simplest approach. Generate the HTML, copy ``static-embed.js`` and
``static-embed.css`` next to it, and open it in a browser or serve it from
any static file host.

.. code-block:: bash

   # Quote the package path so it survives spaces in directory names
   cp "$(python -c 'import buckaroo; print(buckaroo.__path__[0])')"/static/static-embed.* ./
   open my-data.html  # macOS; use xdg-open on Linux

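
Before publishing, it can help to confirm that both assets really sit next to the generated HTML. The check below is a minimal standard-library sketch; ``missing_assets`` is a hypothetical helper name.

```python
from pathlib import Path

REQUIRED_ASSETS = ("static-embed.js", "static-embed.css")


def missing_assets(html_path):
    """Return the required asset names that are absent next to html_path."""
    folder = Path(html_path).resolve().parent
    return [name for name in REQUIRED_ASSETS if not (folder / name).is_file()]
```

An empty list from ``missing_assets("my-data.html")`` means both files are in place.
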
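
React component
~~~~~~~~~~~~~~~

For deeper integration, import the React components directly from
``buckaroo-js-core``:

.. code-block:: bash

   npm install buckaroo-js-core

.. code-block:: typescript

   import { DFViewer } from 'buckaroo-js-core';

   // The prop names below are assumed; confirm them against the
   // buckaroo-js-core type definitions.
   function MyTable({ data, config, summaryStats }) {
     return (
       <DFViewer
         df_data={data}
         df_viewer_config={config}
         summary_stats_data={summaryStats}
       />
     );
   }
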
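
Sphinx / ReadTheDocs
~~~~~~~~~~~~~~~~~~~~

Use a ``raw`` directive to embed an iframe pointing to a pre-generated
static HTML file:

.. code-block:: rst

   .. raw:: html

      <iframe src="_static/my-data.html" width="100%" height="600"></iframe>

Generate the HTML with the ``to_html()`` function and place it in your
Sphinx ``_static`` directory. The ``src`` path and the dimensions shown
here are placeholders; adjust them to your project layout.
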
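
If several tables are embedded across a docs site, the directive text can be generated instead of hand-written. The following is a sketch under stated assumptions: ``iframe_directive`` is a hypothetical helper, and the default width and height are arbitrary choices.

```python
from html import escape


def iframe_directive(src, width="100%", height="600px"):
    """Return reStructuredText for a raw-HTML iframe pointing at src."""
    tag = (
        f'<iframe src="{escape(src, quote=True)}" '
        f'style="width: {width}; height: {height}; border: 0;"></iframe>'
    )
    # The iframe line is indented so it becomes the directive's content.
    return f".. raw:: html\n\n   {tag}\n"


snippet = iframe_directive("_static/my-data.html")
```

The returned text can be written into an ``.rst`` page. Note that a relative ``src`` is resolved from the rendered page's location.
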

What's included in the bundle
-----------------------------

The ``static-embed.js`` bundle (1.3 MB minified) includes:

- React 18 + ReactDOM
- AG-Grid Community v33 (table rendering)
- hyparquet (Parquet decoding in the browser)
- recharts (histogram rendering)
- lodash-es (utility functions, tree-shaken)

The bundle is built with esbuild and shipped as an ES module.