Plotting and interactivity

The setvis.plots module

class setvis.plots.IntersectionBarChart(data: Membership, initial_selection=Selection(columns=[], records=[], intersections=[]), sort_x_by=None, sort_y_by=None, sort_x_order=None, sort_y_order=None, bins=None)

A bar chart plot that represents the number of records for each unique set intersection.

Only the items in the initial selection of the Membership object are included in the plot. If no selection is specified, all items are included.

Parameters:
  • data (Membership) – a Membership object

  • initial_selection (Selection) – initial selection of items in the membership object to be included in the intersection bar chart

  • sort_x_by (str) – Name of the sort option for the x-axis. Sort options are:

    • “value”: sorts the bars along the x-axis by ascending or descending y-value, as specified in sort_x_order

    • “alphabetical”: sorts the bars along the x-axis in alphabetical order, as specified in sort_x_order

    • default: if neither of the above is provided, the bars are sorted in the order they appear in the dataset.

  • sort_x_order (str) – Sorting order for the x-axis. Options are “ascending” and “descending”.
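The sort options above can be illustrated with a minimal, library-independent sketch. The function name sort_bars and the sample counts are hypothetical, and the code only mirrors the documented behaviour of sort_x_by and sort_x_order; it is not taken from the setvis source.

```python
from typing import Dict, List, Optional


def sort_bars(
    counts: Dict[str, int],
    sort_x_by: Optional[str] = None,
    sort_x_order: str = "ascending",
) -> List[str]:
    """Return bar labels in the order they would appear along the x-axis."""
    labels = list(counts)  # default: order of appearance in the data
    reverse = sort_x_order == "descending"
    if sort_x_by == "value":
        # sorted() is stable, so ties keep their original relative order
        return sorted(labels, key=lambda label: counts[label], reverse=reverse)
    if sort_x_by == "alphabetical":
        return sorted(labels, reverse=reverse)
    return labels


# Hypothetical record counts per bar
counts = {"c": 5, "a": 2, "b": 9}
by_value = sort_bars(counts, sort_x_by="value", sort_x_order="descending")
by_name = sort_bars(counts, sort_x_by="alphabetical")
unsorted = sort_bars(counts)
```

Here by_value places the tallest bar first, by_name orders the labels alphabetically, and unsorted keeps the order of the input.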

plot(**kwargs) figure

Creates a figure with the intersection bar chart plot.

Parameters:
  • title (str) – title of the plot

  • tools (list[str]) – list of tools to interact with Bokeh plot. For each tool an icon appears in the plot toolbar.

  • y_range – range of the y-axis. By default, the range is determined automatically based on the plot data.

  • y_axis_type (str) – Options are “linear” and “log”. The default is “linear”.

  • **kwargs – All other arguments are forwarded to bokeh.plotting.figure()

Returns:

bar chart plot

Return type:

bokeh.plotting.figure

plot_indices_to_selection(indices: Sequence[int]) Selection

Maps the interactive selection made in the plot to a Selection object.

To identify the elements of a plot (e.g. bars, fields in a heatmap), Bokeh indexes them numerically. This method converts the list of selected Bokeh indices into a selection of records, columns or intersections understood by setvis. The conversion is specific to the plot type.

Parameters:

indices (Sequence[int]) – Bokeh indices of the elements selected in the plot

Returns:

items in Membership object that correspond to the plot selection

Return type:

Selection

selection_to_plot_indices(selection: Selection) List[int]

Maps a Selection to the corresponding Bokeh indices of the bars in the bar chart.

Setvis represents items selected in a Membership object as records, columns or intersections, while Bokeh uses numeric indices to identify the elements of a plot (e.g. bars, fields in a heatmap). This method converts a setvis Selection object into a list of the corresponding Bokeh indices. The conversion is specific to the plot type.

Parameters:

selection (Selection) – selected items of Membership object

Returns:

Bokeh indices of the plot elements that correspond to the setvis selection.

Return type:

Sequence[int]

Raises:

NotImplementedError – _description_
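The two conversions described above are inverses of each other. The sketch below shows the idea for a bar chart, under the assumption that each bar corresponds to one intersection ID and that the plot index is simply the bar's position. The names bar_order, indices_to_ids and ids_to_indices are hypothetical and do not appear in setvis.

```python
from typing import Hashable, List, Sequence

# Hypothetical: intersection IDs in the order the bars are drawn
bar_order: List[Hashable] = [7, 3, 11, 5]


def indices_to_ids(indices: Sequence[int]) -> List[Hashable]:
    """Plot indices (bar positions) -> intersection IDs."""
    return [bar_order[i] for i in indices]


def ids_to_indices(ids: Sequence[Hashable]) -> List[int]:
    """Intersection IDs -> plot indices, skipping IDs that are not plotted."""
    position = {id_: i for i, id_ in enumerate(bar_order)}
    return [position[id_] for id_ in ids if id_ in position]


selected_ids = indices_to_ids([0, 2])
round_trip = ids_to_indices(selected_ids)
```

Converting plot indices to IDs and back returns the original indices, which is the property that lets a selection made in one plot be highlighted in another.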

class setvis.plots.IntersectionCardinalityHistogram(data: Membership, initial_selection=Selection(columns=[], records=[], intersections=[]), sort_x_by=None, sort_y_by=None, sort_x_order=None, sort_y_order=None, bins=None)

A histogram plot that bins the number of records for each set intersection.

Only the items in the initial selection of the Membership object are included in the plot. If no selection is specified, all items are included.

Parameters:
  • data (Membership) – a Membership object

  • initial_selection (Selection) – initial selection of items in the membership object to be included in the intersection cardinality histogram

  • bins (int) – number of histogram bins
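As a rough illustration of what binning intersection cardinalities means, the sketch below counts how many intersections fall into each of a fixed number of equal-width bins. The helper bin_counts and the sample data are hypothetical; the binning scheme used by setvis itself may differ (for example in how bin edges are chosen).

```python
from typing import List


def bin_counts(values: List[int], bins: int) -> List[int]:
    """Count how many values fall into each of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1  # avoid zero width when all values are equal
    counts = [0] * bins
    for v in values:
        # The maximum value belongs to the last bin
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


# Hypothetical cardinalities (number of records) of six intersections
cardinalities = [1, 2, 2, 5, 9, 10]
histogram = bin_counts(cardinalities, bins=3)
```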

plot(**kwargs) figure

Creates a figure with the intersection cardinality histogram plot.

Parameters:
  • title (str) – title of the plot

  • tools (list[str]) – list of tools to interact with Bokeh plot. For each tool an icon appears in the plot toolbar.

  • y_range – range of the y-axis. By default, the range is determined automatically based on the plot data.

  • y_axis_type (str) – Options are “linear” and “log”. The default is “linear”.

  • **kwargs – All other arguments are forwarded to bokeh.plotting.figure()

Returns:

histogram plot

Return type:

bokeh.plotting.figure

plot_indices_to_selection(indices: Sequence[int]) Selection

Maps the interactive selection made in the plot to a Selection object.

To identify the elements of a plot (e.g. bars, fields in a heatmap), Bokeh indexes them numerically. This method converts the list of selected Bokeh indices into a selection of records, columns or intersections understood by setvis. The conversion is specific to the plot type.

Parameters:

indices (Sequence[int]) – Bokeh indices of the elements selected in the plot

Returns:

items in Membership object that correspond to the plot selection

Return type:

Selection

selection_to_plot_indices(selection: Selection) List[int]

Maps a Selection to the corresponding bin indices of the histogram.

Setvis represents items selected in a Membership object as records, columns or intersections, while Bokeh uses numeric indices to identify the elements of a plot (e.g. bars, fields in a heatmap). This method converts a setvis Selection object into a list of the corresponding Bokeh indices. The conversion is specific to the plot type.

Parameters:

selection (Selection) – selected items of Membership object

Returns:

Bokeh indices of the plot elements that correspond to the setvis selection.

Return type:

Sequence[int]

Raises:

NotImplementedError – _description_

class setvis.plots.IntersectionDegreeHistogram(data: Membership, initial_selection=Selection(columns=[], records=[], intersections=[]), sort_x_by=None, sort_y_by=None, sort_x_order=None, sort_y_order=None, bins=None)

A histogram plot that bins the number of sets that form a set intersection.

Only the items in the initial selection of the Membership object are included in the plot. If no selection is specified, all items are included.

Parameters:
  • data (Membership) – a Membership object

  • initial_selection (Selection) – initial selection of items in the membership object to be included in the histogram

  • bins (int) – number of histogram bins
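The degree of an intersection is the number of sets that form it. A minimal sketch of how degrees could be computed and counted is given below; the data layout (one dictionary of set name to membership flag per intersection) is an assumption made for illustration and is not the internal representation used by setvis.

```python
from collections import Counter

# Hypothetical intersections: each maps set name -> whether the set is a member
intersections = [
    {"A": True, "B": False, "C": False},
    {"A": True, "B": True, "C": False},
    {"A": True, "B": True, "C": True},
    {"A": False, "B": True, "C": False},
]

# Degree = number of sets that take part in the intersection
degrees = [sum(members.values()) for members in intersections]

# Number of intersections per degree
degree_histogram = Counter(degrees)
```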

plot(**kwargs) figure

Creates a figure with the intersection degree histogram.

Parameters:
  • title (str) – title of the plot

  • tools (list[str]) – list of tools to interact with Bokeh plot. For each tool an icon appears in the plot toolbar.

  • y_range – range of the y-axis. By default, the range is determined automatically based on the plot data.

  • y_axis_type (str) – Options are “linear” and “log”. The default is “linear”.

  • **kwargs – All other arguments are forwarded to bokeh.plotting.figure()

Returns:

histogram plot

Return type:

bokeh.plotting.figure

plot_indices_to_selection(indices: Sequence[int]) Selection

Maps the interactive selection made in the plot to a Selection object.

To identify the elements of a plot (e.g. bars, fields in a heatmap), Bokeh indexes them numerically. This method converts the list of selected Bokeh indices into a selection of records, columns or intersections understood by setvis. The conversion is specific to the plot type.

Parameters:

indices (Sequence[int]) – Bokeh indices of the elements selected in the plot

Returns:

items in Membership object that correspond to the plot selection

Return type:

Selection

selection_to_plot_indices(selection: Selection) List[int]

Maps a Selection to the corresponding bin indices of the histogram.

Setvis represents items selected in a Membership object as records, columns or intersections, while Bokeh uses numeric indices to identify the elements of a plot (e.g. bars, fields in a heatmap). This method converts a setvis Selection object into a list of the corresponding Bokeh indices. The conversion is specific to the plot type.

Parameters:

selection (Selection) – selected items of Membership object

Returns:

Bokeh indices of the plot elements that correspond to the setvis selection.

Return type:

Sequence[int]

Raises:

NotImplementedError – _description_

class setvis.plots.IntersectionHeatmap(data: Membership, initial_selection=Selection(columns=[], records=[], intersections=[]), sort_x_by=None, sort_y_by=None, sort_x_order=None, sort_y_order=None, bins=None)

A heatmap that displays a matrix of sets on the x-axis and the set intersections on the y-axis.

The number of records that are associated with a set intersection is encoded in the colour map.

Only the items in the initial selection of the Membership object are included in the plot. If no selection is specified, all items are included.

Parameters:
  • data (Membership) – a Membership object

  • initial_selection (Selection) – initial selection of items in the membership object to be included in the heatmap

  • sort_x_by (str) – Name of the sort option for the x-axis. Sort options are:

    • “alphabetical”: sorts the fields along the x-axis in alphabetical order, as specified in sort_x_order

    • default: if the above is not provided, the fields on the x-axis of the heatmap are sorted in the order they appear in the dataset.

  • sort_y_by (str) – Name of the sort option for the y-axis. Sort options are:

    • “value”: sorts the fields along the y-axis by the heatmap value, in the order specified in sort_y_order

    • “length”: sorts the fields along the y-axis by the intersection length, in the order specified in sort_y_order

    • default: if neither of the above is provided, the intersections on the y-axis of the heatmap are sorted in the order they appear in the dataset.

  • sort_x_order (str) – Sorting order for the x-axis. Options are:

    • “ascending” (default)

    • “descending”

  • sort_y_order (str) – Sorting order for the y-axis. Options are:

    • “ascending” (default)

    • “descending”
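To make the heatmap layout more concrete, the following sketch builds one cell per (set, intersection) pair and sorts the y-axis by intersection length. The data structure, the choice of None for sets that are not part of an intersection, and all names are assumptions for illustration only; setvis may encode empty cells differently.

```python
# Hypothetical data: set membership and record count per intersection
sets = ["A", "B", "C"]
intersections = {
    "i0": {"members": {"A"}, "records": 12},
    "i1": {"members": {"A", "B"}, "records": 4},
    "i2": {"members": {"B", "C"}, "records": 7},
}

# One cell per (set, intersection) pair; the value drives the colour.
# Here a cell outside the intersection gets None (drawn as empty).
cells = [
    (set_name, inter_id, info["records"] if set_name in info["members"] else None)
    for inter_id, info in intersections.items()
    for set_name in sets
]

# Sorting the y-axis by intersection "length" (number of member sets)
y_by_length = sorted(intersections, key=lambda i: len(intersections[i]["members"]))
```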

plot(**kwargs) figure

Creates a figure with the intersection heatmap plot.

Parameters:
  • title (str) – title of the plot

  • tools (list[str]) – list of tools to interact with Bokeh plot. For each tool an icon appears in the plot toolbar.

  • x_range – range of the x-axis. By default, the range is determined automatically based on the plot data.

  • y_range – range of the y-axis. By default, the range is determined automatically based on the plot data.

  • **kwargs – All other arguments are forwarded to bokeh.plotting.figure()

Returns:

heatmap plot

Return type:

bokeh.plotting.figure

plot_indices_to_selection(indices: Sequence[int]) Selection

Maps the interactive selection made in the plot to a Selection object.

To identify the elements of a plot (e.g. bars, fields in a heatmap), Bokeh indexes them numerically. This method converts the list of selected Bokeh indices into a selection of records, columns or intersections understood by setvis. The conversion is specific to the plot type.

Parameters:

indices (Sequence[int]) – Bokeh indices of the elements selected in the plot

Returns:

items in Membership object that correspond to the plot selection

Return type:

Selection

selection_to_plot_indices(selection: Selection) List[int]

Maps a Selection to the corresponding Bokeh indices of the fields in the heatmap.

Setvis represents items selected in a Membership object as records, columns or intersections, while Bokeh uses numeric indices to identify the elements of a plot (e.g. bars, fields in a heatmap). This method converts a setvis Selection object into a list of the corresponding Bokeh indices. The conversion is specific to the plot type.

Parameters:

selection (Selection) – selected items of Membership object

Returns:

Bokeh indices of the plot elements that correspond to the setvis selection.

Return type:

Sequence[int]

Raises:

NotImplementedError – _description_

class setvis.plots.PlotSession(data, session_file=None, set_mode=False, verbose=False)

An interactive plotting session in a Jupyter notebook

A session contains a sequence of named selections, each with a few corresponding Bokeh plots (tabs in a tabbed layout). New selections can be made interactively from the plots, and then other plots added.

It is possible to save a session, i.e., the user-made interactive selections in every plot to a file, and load it to restore the state of the session.

Parameters:
  • data (pd.DataFrame or Membership) – the dataset, given either as a DataFrame or as a Membership object

  • session_file (str) – file containing the interactive selections of a previously saved session, by default None

  • set_mode (bool) – whether to operate in set mode (True) or missingness mode (False), by default False

  • verbose (bool) – whether to produce verbose logging output, by default False

add_plot(name, based_on=None, notebook=True, html_display_link=True, **kwargs)

Add a new plot

Renders a set of interactive Bokeh plots in a tabbed layout (see the options below controlling how these are displayed).

The plots have different titles, depending on the mode (‘set mode’ or ‘missingness mode’, see PlotSession). The included plots are shown in the table below. More detail about each plot can be found in the documentation for its plot class.

Naming the plot allows any interactive selection made in the plot to be referred to later.

Set-mode plot                       Missingness-mode plot         Plot class
----------------------------------  ----------------------------  --------------------------------
Set bar chart                       Value bar chart               SetBarChart
Intersection heatmap                Combination heatmap           IntersectionHeatmap
Set cardinality histogram           Value count histogram         SetCardinalityHistogram
Intersection bar chart              Combination bar chart         IntersectionBarChart
Intersection cardinality histogram  Combination count histogram   IntersectionCardinalityHistogram
Intersection degree histogram       Combination length histogram  IntersectionDegreeHistogram

Parameters:
  • name (str) – The name of the plot. This name is used to refer to any selection made in the plot.

  • based_on (str or None) – The data to plot is taken from this selection (it is ‘based on’ this selection). Any selection made in this plot is a refinement of the based_on selection.

  • notebook (bool) – Should the plot be displayed inline in the notebook? A value of False starts and returns a Bokeh server for rendering the plots.

  • html_display_link (bool) – Display an inline notebook link to the Bokeh server? Only used when notebook is False, and when running in a notebook.

  • **kwargs (dict) –

    Additional keyword arguments for each plot.

    Each keyword argument should be a dictionary, whose contents are used as keyword arguments for the plot method of the class for the corresponding plot.

    The arguments that are forwarded are listed below. They have ‘set mode’ names (even in missingness mode).

    • set_bar_chart

    • intersection_heatmap

    • set_cardinality_histogram

    • intersection_bar_chart

    • intersection_cardinality_histogram

    • intersection_degree_histogram

    See the documentation for the individual plot classes for the accepted dictionary keys (any that are unknown are forwarded to bokeh.plotting.figure()).
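As an illustration of the keyword-argument layout described above, the sketch below builds per-plot option dictionaries and checks their names against the documented list before forwarding them. The option values and the name "overview" are made up, and the final call is shown only as a comment because it needs an existing PlotSession.

```python
# Option dictionaries keyed by the 'set mode' plot names listed above.
# "title" and "y_axis_type" are parameters of the plot() methods.
plot_options = {
    "set_bar_chart": {"title": "Records per set", "y_axis_type": "log"},
    "intersection_heatmap": {"title": "Intersections"},
}

# Guard against misspelt plot names before forwarding the options
allowed_names = {
    "set_bar_chart",
    "intersection_heatmap",
    "set_cardinality_histogram",
    "intersection_bar_chart",
    "intersection_cardinality_histogram",
    "intersection_degree_histogram",
}
unknown_names = set(plot_options) - allowed_names

# With an existing PlotSession named `session` (not created here):
# session.add_plot(name="overview", **plot_options)
```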

add_selection(name, based_on=None, columns=None, intersections=None, records=None, invert=False)

Adds a selection to the plot session

Parameters:
  • name (str) – name of the selection

  • based_on (str, optional) – name of the selection on which new selection is based, by default None

  • columns (list, optional) – The included column names (may be any value returned by Membership.columns(), which will generally be the same as in the underlying data source)

  • records (list, optional) – The included record IDs (may be any value in Membership.columns()["_record_id"])

  • intersections (list, optional) – The included intersection IDs (may be any value in Membership.intersections().index)

  • invert (bool, optional) – inverts selection, by default False
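The relationship between based_on and invert can be pictured with a small sketch. It assumes that a new selection is always a subset of the selection it is based on, and that invert=True yields the complement within that base selection. This reading of invert is an assumption, and the helper refine is hypothetical.

```python
from typing import Iterable, List


def refine(parent_ids: Iterable[int], chosen_ids: Iterable[int], invert: bool = False) -> List[int]:
    """Keep the chosen IDs that exist in the parent, or their complement if inverted."""
    chosen = set(chosen_ids)
    return [i for i in parent_ids if (i in chosen) != invert]


# Hypothetical record IDs of the base selection
base_records = [0, 1, 2, 3, 4]

kept = refine(base_records, [1, 3, 9])
dropped = refine(base_records, [1, 3, 9], invert=True)
```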

dict()

Returns a JSON-serializable dict representing the session

It includes:
  • the plot selections (contained in _selection_history)

  • the active (currently-selected) tab in each Bokeh ‘Tabs’ layout

It does not include any of the membership data itself

This is used by .save() to save the session state to a file.
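Because the returned dict is JSON-serializable, saving and restoring amounts to a JSON round trip. The sketch below demonstrates this with a made-up state dictionary; the keys "selections" and "active_tabs" are hypothetical, and the real structure returned by dict() may differ.

```python
import json
import os
import tempfile

# Hypothetical session state: selections by plot name and active tab indices.
# The real structure returned by dict() may differ.
state = {
    "selections": {"Plot 1": {"columns": [], "records": [], "intersections": [2, 5]}},
    "active_tabs": {"Plot 1": 1},
}

# Write the state to a temporary JSON file
path = os.path.join(tempfile.mkdtemp(), "session.json")
with open(path, "w") as f:
    json.dump(state, f)

# Read it back to restore the state
with open(path) as f:
    restored = json.load(f)
```

Note that the membership data is not part of the file, so the same dataset has to be supplied again when a session is restored.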

membership(name=None)

Return the membership instance associated with the selection

Parameters:

name (str, optional) – the name of the selection for which to construct the Membership object, by default None

Returns:

membership object associated with the selection

Return type:

Membership

save(filename: str)

Saves the session state to a JSON file

Parameters:

filename (str) – name of the file used to save the session state

selected_records(name=None, base_selection=None)

Returns the IDs of the records in the selection

Parameters:
  • name (str, optional) – name of the selection, by default None

  • base_selection (str, optional) – name of the base selection from which the selection is taken, by default None

Returns:

record IDs in the selection

Return type:

pd.Series

class setvis.plots.SetBarChart(data: Membership, initial_selection=Selection(columns=[], records=[], intersections=[]), sort_x_by=None, sort_y_by=None, sort_x_order=None, sort_y_order=None, bins=None)

A bar chart plot representing the number of records in each set.

Only the items in the initial selection of the Membership object are included in the plot. If no selection is specified, all items are included.

Parameters:
  • data (Membership) – a Membership object

  • initial_selection (Selection) – initial selection of items in the membership object to be included in the set bar chart

  • sort_x_by (str) – Name of the sort option for the x-axis. Sort options are: - “value”: sorts the bars along the x-axis with ascending or descending y-value as specified in sort_x_order - “alphabetical”: sorts the bars along the x-axis in alphabetical order as specified in sort_x_order - default: if none of the above is provided the bars are sorted in the order they appear in the dataset.

  • sort_x_order (str) – Sorting order for the x-axis. Options are: - “ascending” - “descending”

plot(**kwargs) figure

Creates a figure with the set bar chart plot.

Parameters:
  • title (str) – title of the plot.

  • tools (list[str]) – list of tools to interact with Bokeh plot. For each tool an icon appears in the plot toolbar.

  • x_range – range of the x-axis. By default, the range is determined automatically based on the plot data.

  • y_range – range of the y-axis. By default, the range is determined automatically based on the plot data.

  • y_axis_type (str) – Options are “linear” and “log”. The default is “linear”.

  • **kwargs – All other arguments are forwarded to bokeh.plotting.figure()

Returns:

bar chart plot

Return type:

bokeh.plotting.figure

plot_indices_to_selection(indices: Sequence[int]) Selection

Maps the interactive selection made in the plot to a Selection object.

To identify the elements of a plot (e.g. bars, fields in a heatmap), Bokeh indexes them numerically. This method converts the list of selected Bokeh indices into a selection of records, columns or intersections understood by setvis. The conversion is specific to the plot type.

Parameters:

indices (Sequence[int]) – Bokeh indices of the elements selected in the plot

Returns:

items in Membership object that correspond to the plot selection

Return type:

Selection

selection_to_plot_indices(selection: Selection) List[int]

Maps a Selection to the corresponding Bokeh indices of the bars in the bar chart.

Setvis represents items selected in a Membership object as records, columns or intersections, while Bokeh uses numeric indices to identify the elements of a plot (e.g. bars, fields in a heatmap). This method converts a setvis Selection object into a list of the corresponding Bokeh indices. The conversion is specific to the plot type.

Parameters:

selection (Selection) – selected items of Membership object

Returns:

Bokeh indices of the plot elements that correspond to the setvis selection.

Return type:

Sequence[int]

Raises:

NotImplementedError – _description_

class setvis.plots.SetCardinalityHistogram(data: Membership, initial_selection=Selection(columns=[], records=[], intersections=[]), sort_x_by=None, sort_y_by=None, sort_x_order=None, sort_y_order=None, bins=None)

A histogram plot that bins the number of records in a set.

The first bin only contains the sets with no records in them.

Only the items in the initial selection of the Membership object are included in the plot. If no selection is specified, all items are included.

Parameters:
  • data (Membership) – a Membership object

  • initial_selection (Selection) – initial selection of items in the membership object to be included in the histogram

  • bins (int) – number of histogram bins
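The special role of the first bin can be sketched as follows: empty sets go into bin 0, and the remaining bins divide the non-zero sizes evenly. The helper bin_set_sizes is hypothetical and requires at least two bins; how setvis places the edges of the remaining bins is an assumption here.

```python
from typing import List


def bin_set_sizes(sizes: List[int], bins: int) -> List[int]:
    """Bin 0 holds empty sets; the other bins split 1..max(sizes) evenly."""
    counts = [0] * bins
    largest = max(sizes)
    width = largest / (bins - 1) if largest > 0 else 1
    for size in sizes:
        if size == 0:
            counts[0] += 1
        else:
            # Sizes 1..largest map to bins 1..bins-1
            index = min(1 + int((size - 1) / width), bins - 1)
            counts[index] += 1
    return counts


# Hypothetical number of records in each of five sets
set_sizes = [0, 0, 1, 4, 8]
binned = bin_set_sizes(set_sizes, bins=3)
```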

plot(**kwargs) figure

Creates a figure with the set cardinality histogram plot.

Parameters:
  • title (str) – title of the plot.

  • tools (list[str]) – list of tools to interact with Bokeh plot. For each tool an icon appears in the plot toolbar.

  • y_range – range of the y-axis. By default, the range is determined automatically based on the plot data.

  • y_axis_type (str) – Options are “linear” and “log”. The default is “linear”.

  • **kwargs – All other arguments are forwarded to bokeh.plotting.figure()

Returns:

histogram plot

Return type:

bokeh.plotting.figure

plot_indices_to_selection(indices: Sequence[int]) Selection

Maps the interactive selection made in the plot to a Selection object.

To identify the elements of a plot (e.g. bars, fields in a heatmap), Bokeh indexes them numerically. This method converts the list of selected Bokeh indices into a selection of records, columns or intersections understood by setvis. The conversion is specific to the plot type.

Parameters:

indices (Sequence[int]) – Bokeh indices of the elements selected in the plot

Returns:

items in Membership object that correspond to the plot selection

Return type:

Selection

selection_to_plot_indices(selection: Selection) List[int]

Maps a Selection to the corresponding bin indices of the histogram.

Setvis represents items selected in a Membership object as records, columns or intersections, while Bokeh uses numeric indices to identify the elements of a plot (e.g. bars, fields in a heatmap). This method converts a setvis Selection object into a list of the corresponding Bokeh indices. The conversion is specific to the plot type.

Parameters:

selection (Selection) – selected items of Membership object

Returns:

Bokeh indices of the plot elements that correspond to the setvis selection.

Return type:

Sequence[int]

Raises:

NotImplementedError – _description_

Plotting outside of a notebook

Note

This is an ‘advanced’ use of Setvis, and is not as well tested as the notebook-based workflow.

After creating a PlotSession object, interactive plots can be created with the PlotSession.add_plot() method. The usual Jupyter-notebook-based workflow creates these inline in the notebook.

When calling PlotSession.add_plot() with notebook=False, a Bokeh server is started and returned, enabling Setvis to be used outside of a Jupyter notebook.

Note that this usage is still interactive, but the plots are no longer confined to a notebook. One use-case for this is to show the plots on a large external display.

The code below creates and starts the Bokeh server.

import pandas as pd
from setvis.plots import PlotSession

## This csv file can be found in the Setvis git repository
df = pd.read_csv("examples/datasets/Synthetic_APC_DIAG_Fields.csv")

session = PlotSession(df)

## Create a plot
bokeh_plot_server = session.add_plot(name="Plot 3", notebook=False, html_display_link=False)

## Display the URL of the plot that was just created
print(f"Connect to http://localhost:{bokeh_plot_server.port}/ to see the plot")

## Start and enter the event loop (this command blocks)
## Not required if running inside Jupyter
bokeh_plot_server.run_until_shutdown()

The last command (run_until_shutdown()) is not required if creating the plot from inside Jupyter (it will be attached to Jupyter’s event loop). See the Bokeh documentation for more information on using a Bokeh server, including how it can be embedded in other applications.

See also