Core API Reference

This section documents the core classes of the Xplain Python package.

xplain.Xsession

class xplain.Xsession(url='http://localhost:8080', user='user', password='xplainData', httpsession=None, http_session_id=None, jwt_dispatch_url=None, jwt_cookie_name=None, jwt_token=None)

Bases: object

Xplain session manager for data analytics operations.

The primary interface for interacting with an Xplain server. Each instance represents an authenticated session that can load data models, execute queries, perform statistical analyses, and manage data transformations.

Xsession provides comprehensive functionality for:

  • Session Management: Connect, authenticate, load configurations

  • Data Querying: Execute aggregations, group-bys, selections

  • Object Navigation: Explore hierarchical data structures

  • Statistical Modeling: Run regressions, build predictive models

  • Data Import/Export: Load from databases, export results

  • Visualization: Generate collapsible trees and data views

The class supports multiple authentication methods (credentials, JWT, session reuse) and can be used standalone or in multi-session scenarios for parallel analysis.

__url__ (str)

Base URL of the connected Xplain server

__id__ (str)

Unique 32-character session identifier

__xplain_session__ (dict)

Current session state and metadata

__requests_session__ (requests.Session)

Underlying HTTP session for server communication

  • Each Xsession instance maintains independent state and can connect to different servers or use different credentials.

  • Sessions persist on the server until explicitly terminated or until server timeout expires.

  • For production use, always call terminate() when done or use the context manager pattern to ensure proper cleanup.

Basic usage:

>>> from xplain import Xsession, Query_config
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>>
>>> query = Query_config()
>>> query.add_aggregation(object_name="Patients", dimension_name="Age", type="AVG")
>>> df = session.execute_query(query)
>>> print(df)
>>> session.terminate()

Context manager pattern (recommended):

>>> with Xsession(url="http://localhost:8080", user="admin", password="admin") as session:
...     session.startup("Analysis")
...     df = session.open_attribute("Patients", "Gender", "Gender")
...     print(df)
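
The context manager pattern guarantees that cleanup runs even when the body of the with block raises. The sketch below illustrates only the enter/exit mechanics, using a hypothetical stand-in class named FakeSession rather than the real Xsession; it makes no claim about how Xsession implements them internally.

```python
class FakeSession:
    """Hypothetical stand-in for a session object; not part of the xplain package."""

    def __init__(self):
        self.terminated = False

    def terminate(self):
        # A real session would log out and release server resources here.
        self.terminated = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always clean up, then return False so the exception still propagates.
        self.terminate()
        return False


fake = FakeSession()
try:
    with fake as s:
        raise ValueError("analysis failed")
except ValueError:
    pass

print(fake.terminated)  # True: terminate() ran despite the exception
```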

Multi-session analysis:

>>> session1 = Xsession(url="http://server1:8080", user="admin", password="pass1")
>>> session2 = Xsession(url="http://server2:8080", user="admin", password="pass2")
>>> session1.startup("Dataset_A")
>>> session2.startup("Dataset_B")
>>> # Compare results from different servers
>>> df1 = session1.execute_query(query)
>>> df2 = session2.execute_query(query)

See also:

  • XplainSession : Unified API with namespaced methods (alternative interface)

  • XplainClient : Low-level client for direct Web API calls

  • Query_config : Builder for constructing analytical queries

__init__(url='http://localhost:8080', user='user', password='xplainData', httpsession=None, http_session_id=None, jwt_dispatch_url=None, jwt_cookie_name=None, jwt_token=None)

Create a new Xplain session for data analytics operations.

Establishes an authenticated connection to an Xplain server instance. Supports multiple authentication methods: standard credentials, JWT tokens, or session reuse via existing HTTP session IDs.

url (str, default='http://localhost:8080')

The base URL of the Xplain server (including protocol and port). Can also be set via environment variable xplain_url or global variable xplain_url.

user (str, default='user')

Username for authentication. Required unless using JWT or session ID.

password (str, default='xplainData')

Password for authentication. Required unless using JWT or session ID.

httpsession (requests.Session, optional)

Existing Python requests Session object to reuse. Useful for sharing session state across multiple Xplain connections.

http_session_id (str, optional)

Existing HTTP session ID (JSESSIONID) to reuse an active session. Must be a valid 32-character session identifier.

jwt_dispatch_url (str, optional)

URL endpoint for JWT-based authentication. Required for JWT auth.

jwt_cookie_name (str, optional)

Cookie name containing the JWT token. Required for JWT auth.

jwt_token (str, optional)

JWT token string for authentication. Required for JWT auth.

RuntimeError

If the URL is not provided via any method (argument, environment, or globals).

HTTPError

If HTTP-level errors occur during authentication.

ConnectionError

If network connection to the server fails.

Timeout

If the server does not respond within the timeout period.

ValueError

If an invalid session ID format is provided.

  • Authentication is attempted in the following order:

      1. Credential-based (user/password)
      2. JWT-based (if JWT parameters provided)
      3. Session ID reuse (if http_session_id provided)

  • The session can be loaded from an existing session ID via the environment variable xplain_session_id.

  • SSL verification is disabled by default for development environments. Enable it in production by modifying the verify parameter in requests.
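
The URL lookup and the session ID check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's code: the helper names resolve_url and validate_session_id are hypothetical, the precedence (argument, then environment, then globals) is inferred from the order in which the sources are listed, and the "32 alphanumeric characters" rule is taken from the session ID description.

```python
import os
import re


def resolve_url(url=None, env=None, global_vars=None):
    """Return the first URL found: argument, then environment, then globals."""
    env = os.environ if env is None else env
    global_vars = {} if global_vars is None else global_vars
    candidate = url or env.get("xplain_url") or global_vars.get("xplain_url")
    if not candidate:
        raise RuntimeError("No Xplain URL given via argument, environment, or globals")
    return candidate


def validate_session_id(session_id):
    """Check for a 32-character alphanumeric identifier (assumed format)."""
    if not re.fullmatch(r"[A-Za-z0-9]{32}", session_id):
        raise ValueError(f"Invalid session ID format: {session_id!r}")
    return session_id


print(resolve_url(env={"xplain_url": "http://production:8080"}))
print(validate_session_id("A1B2C3D4E5F6G7H8I9J0K1L2M3N4O5P6"))
```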

Basic authentication:

>>> from xplain import Xsession
>>> session = Xsession(
...     url="http://myhost:8080",
...     user="analyst",
...     password="secret123"
... )
>>> session.startup("PatientCohort")

JWT authentication:

>>> session = Xsession(
...     url="https://secure.xplain.com",
...     jwt_dispatch_url="https://auth.example.com/dispatch",
...     jwt_cookie_name="auth_token",
...     jwt_token="eyJhbGciOi..."
... )

Reuse existing session:

>>> session = Xsession(
...     url="http://myhost:8080",
...     http_session_id="A1B2C3D4E5F6G7H8I9J0K1L2M3N4O5P6"
... )

Environment-based URL configuration:

>>> import os
>>> os.environ['xplain_url'] = 'http://production:8080'
>>> session = Xsession(user="admin", password="prod_pass")

See also:

  • startup : Load a startup configuration file

  • startup_from_xview_config : Load session from XView configuration

  • load_from_session_id : Load session by existing session ID

  • terminate : Close the session and logout

add_quantile_based_attributes(object_name=None, dimension=None, quantiles=None, quantiles_attribute_name=None, ranges_attribute_name=None, use_names_as_postfix=False, selections=None, sample_size=None, script_file=None, script_file_ownership='PUBLIC')

Generate quantile-based attributes for FLOAT/DOUBLE dimensions.

For each target dimension the server computes two optional attributes:

  1. Quantiles attribute (quantiles_attribute_name): bins whose boundaries are the actual data quantiles — non-equidistant, but each bin contains roughly the same number of records.

  2. Ranges attribute (ranges_attribute_name): equidistant bins spanning [min-quantile .. max-quantile] — equal width, unequal population.

If neither name is supplied, both are created with default names "Quantiles" and "Ranges". If exactly one name is None that attribute is skipped entirely.
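
To make the difference between the two attribute types concrete, the sketch below computes both kinds of bin boundaries for a small list of numbers. It is an illustration only: the server's quantile method and the number of equidistant bins are not specified in this reference, so the nearest-rank rule, the bins argument, and the function names are all assumptions.

```python
import math


def quantile_edges(values, fractions):
    """Bin boundaries at the requested data quantiles (nearest-rank method)."""
    ordered = sorted(values)
    n = len(ordered)
    edges = []
    for q in fractions:
        # Nearest rank: the smallest value with at least q * n values at or below it.
        rank = max(1, math.ceil(q * n))
        edges.append(ordered[rank - 1])
    return edges


def equidistant_edges(low, high, bins):
    """Equal-width boundaries from low to high, including both ends."""
    step = (high - low) / bins
    return [low + i * step for i in range(bins + 1)]


values = [1, 2, 2, 3, 3, 3, 4, 10, 20, 100]
q_edges = quantile_edges(values, [0.1, 0.5, 0.9])
r_edges = equidistant_edges(q_edges[0], q_edges[-1], bins=2)

print(q_edges)  # [1, 3, 20]: uneven widths, boundaries follow the data
print(r_edges)  # [1.0, 10.5, 20.0]: equal widths between min and max quantile
```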

Parameters
  • object_name (str, optional) – Name of the XObject. When given without dimension, attributes are created for all FLOAT/DOUBLE dimensions of that object.

  • dimension (dict, optional) – {"object": "...", "dimension": "..."} dict targeting a single dimension. object_name may be omitted when this is provided.

  • quantiles (list of float, optional) – List of quantile fractions, e.g. [0.1, 0.25, 0.5, 0.75, 0.9]. Every value must be strictly between 0 and 1, and the list must have at least 2 entries. When omitted the server uses all 1 % steps (0.01 … 0.99).

  • quantiles_attribute_name (str or None) – Name for the quantile-bins attribute. Pass None (while providing ranges_attribute_name) to skip.

  • ranges_attribute_name (str or None) – Name for the equidistant-ranges attribute. Pass None (while providing quantiles_attribute_name) to skip.

  • use_names_as_postfix (bool) – When True the dimension name is prepended to each attribute name (e.g. "Torque - Quantiles").

  • selections (dict, optional) – Active session selections to scope the quantile computation, e.g. {"object": "...", "attribute": "...", "dimension": "...", "selectedStates": [...]}.

  • sample_size (int, optional) – If provided, quantiles are computed on a random sample. The value is in permille units (1–999): 10 means a 1 % sample, 50 means a 5 % sample, 100 means a 10 % sample. Must be strictly between 0 and 1000.

  • script_file (dict, optional) – Server-side file path to persist the generated addNumberRangesAttribute calls as a re-runnable .xscript. Example: {"ownership": "PUBLIC", "filePath": ["quantiles.xscript"]}

  • script_file_ownership (str) – Ownership for script_file when only a plain string filename is given (default: "PUBLIC").

Raises
  • ValueError – If neither object_name nor dimension is given.

  • RuntimeError – On server-side errors.
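
The parameter constraints listed above can be checked on the client before a call is made. The helper below is a hypothetical pre-check, not part of the package: only the "neither object_name nor dimension" ValueError is documented, so the other errors it raises are assumptions derived from the parameter descriptions. It also shows the permille conversion and the expansion of a plain filename into the script_file dict form.

```python
def check_quantile_args(object_name=None, dimension=None, quantiles=None,
                        sample_size=None, script_file=None,
                        script_file_ownership="PUBLIC"):
    """Client-side pre-check mirroring the documented constraints (illustrative)."""
    if object_name is None and dimension is None:
        raise ValueError("Provide object_name or dimension")
    if quantiles is not None:
        if len(quantiles) < 2:
            raise ValueError("quantiles needs at least 2 entries")
        if any(not 0 < q < 1 for q in quantiles):
            raise ValueError("each quantile must be strictly between 0 and 1")
    if sample_size is not None and not 0 < sample_size < 1000:
        raise ValueError("sample_size is in permille and must be in 1..999")
    # A plain filename is expanded into the dict form shown in the parameter list.
    if isinstance(script_file, str):
        script_file = {"ownership": script_file_ownership, "filePath": [script_file]}
    # Permille to fraction: 50 -> 0.05, i.e. a 5 % sample.
    sample_fraction = None if sample_size is None else sample_size / 1000
    return {"script_file": script_file, "sample_fraction": sample_fraction}


result = check_quantile_args(
    object_name="screwing station",
    quantiles=[0.25, 0.5, 0.75],
    sample_size=50,
    script_file="quantiles.xscript",
)
print(result)
```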

Example — all numeric dims of one object, save re-runnable script:

xsession.add_quantile_based_attributes(
    object_name="screwing station",
    use_names_as_postfix=True,
    script_file={"ownership": "PUBLIC",
                 "filePath": ["screwing_station_quantiles.xscript"]},
)

Example — single dimension with explicit quantiles:

xsession.add_quantile_based_attributes(
    dimension={"object": "screwing station",
               "dimension": "Torque max"},
    quantiles=[0.03, 0.1, 0.25, 0.5, 0.75, 0.9, 0.97],
    quantiles_attribute_name="Torque max Quantiles",
    ranges_attribute_name=None,
)

build_formula(response, predictors)

Dynamically build an R-style formula for Patsy.

Parameters
  • response (str) – The dependent variable.

  • predictors (list) – A list of predictor variable names.

Returns

The constructed formula in R-style syntax.

Return type

str
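
An R-style formula places the response on the left of a tilde and the predictors, joined by plus signs, on the right. The sketch below shows that construction under the assumption that build_formula produces this plain additive form; the function name build_formula_sketch and the column names are hypothetical.

```python
def build_formula_sketch(response, predictors):
    """Join predictors with ' + ' and attach them to the response with '~'."""
    if not predictors:
        # An intercept-only model is written as 'y ~ 1' in R-style syntax.
        return f"{response} ~ 1"
    return f"{response} ~ {' + '.join(predictors)}"


formula = build_formula_sketch("readmitted", ["age", "gender", "los"])
print(formula)  # readmitted ~ age + gender + los
```

Column names that are not valid Python identifiers (for example names containing spaces) need Patsy's Q("...") quoting; whether build_formula applies it is not stated in this reference.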

build_predictive_model(model_name, xmodel_configuration_file_name, target_event_object)

Build a predictive model. [Beta]

build_tree_data(json_object)

Convert complex JSON structure into a format suitable for D3.js tree visualization. This recursively parses the JSON, building a nested dictionary format compatible with D3.js.

collapsible_tree()

Generate and visualize a collapsible tree using hierarchical data.

This function builds a tree structure based on the current focus object, processes it into a source-target DataFrame suitable for visualization, and then uses pyecharts to render the tree directly in Jupyter.

Example

session.collapsible_tree()

Parameters

None

Returns

The function directly renders the visualization in the notebook.

Return type

None

convert_to_dataframe(data)

Convert query result JSON to pandas DataFrame format.

Transforms nested JSON result structures from Xplain queries into a flat pandas DataFrame suitable for analysis. Handles hierarchical data by extracting leaf node values.

data (dict)

Query result in JSON format with ‘fields’ and ‘children’ keys. Expected structure:

{
    "fields": ["Attribute1", "Attribute2", "Count"],
    "children": [
        {"data": [{"field1": value1}, {"field2": value2}]},
        ...
    ]
}

pandas.DataFrame

Tabular data with columns corresponding to the ‘fields’ list. Each row represents a leaf node from the hierarchical result.

KeyError

If expected keys (‘fields’ or ‘children’) are missing from data.

TypeError

If data contains invalid types that cannot be converted.

  • Nested hierarchies are flattened; only leaf nodes contribute rows.

  • Dict values in result data are unwrapped (e.g., {"value": 123} becomes 123).

  • Missing values are filled with None.

  • This method is called automatically by execute_query() when data_frame=True.
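
The flattening rules above (leaf nodes only, unwrapping of {"value": ...} cells, None for missing values) can be sketched in plain Python. This is an illustration of the described behaviour, not the method's actual implementation; the function name flatten_result and the sample data are hypothetical, and the sketch returns a list of row dicts instead of a DataFrame to stay dependency-free.

```python
def flatten_result(data):
    """Collect one row per leaf node, keyed by the names in data['fields']."""
    fields = data["fields"]  # KeyError if missing, as documented
    rows = []

    def visit(node):
        children = node.get("children") or []
        if children:
            # Inner node: descend, it contributes no row of its own.
            for child in children:
                visit(child)
            return
        row = dict.fromkeys(fields)  # missing values default to None
        for cell in node.get("data", []):
            for key, value in cell.items():
                # Unwrap {"value": x} style cells to the bare value.
                if isinstance(value, dict) and "value" in value:
                    value = value["value"]
                if key in row:
                    row[key] = value
        rows.append(row)

    for child in data["children"]:
        visit(child)
    return rows


sample = {
    "fields": ["Category", "Count"],
    "children": [
        {"data": [{"Category": "A"}],
         "children": [{"data": [{"Category": "A1"}, {"Count": {"value": 100}}]}]},
        {"data": [{"Category": "B"}]},
    ],
}
rows = flatten_result(sample)
print(rows)
```

The resulting rows can be passed to pandas.DataFrame(rows, columns=sample["fields"]) to obtain a table.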

Convert query results:

>>> result_json = session.perform({"method": "getResult", "requestName": "my_query"})
>>> df = session.convert_to_dataframe(result_json)
>>> print(df.head())

Manual conversion of custom result:

>>> custom_data = {
...     "fields": ["Category", "Count"],
...     "children": [
...         {"data": [{"Category": "A"}, {"Count": 100}]},
...         {"data": [{"Category": "B"}, {"Count": 200}]}
...     ]
... }
>>> df = session.convert_to_dataframe(custom_data)

See also:

  • execute_query : Execute query and return DataFrame directly

  • get_result : Retrieve query result (optionally as DataFrame)

count_attribute(attribute_name, object_name=None, dimension_name=None, request_name=None, data_frame=True)

Convenient method to count an attribute. Automatically resolves object and dimension if not provided by searching through the object structure.

Parameters
  • attribute_name (string) – name of attribute (required)

  • object_name (string) – name of object (optional, auto-resolved if omitted)

  • dimension_name (string) – name of dimension (optional, auto-resolved if omitted)

  • request_name (string) – id or name of request

  • data_frame (boolean) – if True, the result is returned as a pandas DataFrame; otherwise as JSON

Returns

The attribute grouped by its first hierarchy level and aggregated by count.

Return type

data frame or json

Raises

ValueError – if attribute_name is ambiguous (exists in multiple locations)

Example

>>> import xplain
>>> session = xplain.Xsession(url="http://myhost:8080", user="myUser",
...                           password="myPwd")
>>> session.startup("mystartup")
>>> # Simple case - just provide attribute name
>>> session.count_attribute("Agegroup")
>>> # Explicit case - provide all three
>>> session.count_attribute("Type", object_name="Hospital Diagnose",
...                         dimension_name="Diagnose")

create_contingency_table(df, var1, var2)

Create a contingency table (frequency table) for two variables.

Parameters
  • df (pd.DataFrame) – The data frame containing the variables.

  • var1 (str) – Name of the first variable (row).

  • var2 (str) – Name of the second variable (column).

Returns

A contingency table.

Return type

pd.DataFrame
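
A contingency table simply counts how often each pair of values occurs together. The dependency-free sketch below shows the idea on a list of records; the function name contingency_counts and the sample records are hypothetical, and it returns a nested dict rather than a DataFrame.

```python
from collections import Counter


def contingency_counts(records, var1, var2):
    """Count co-occurrences of var1 (rows) and var2 (columns) in a list of dicts."""
    counts = Counter((rec[var1], rec[var2]) for rec in records)
    row_labels = sorted({r for r, _ in counts})
    col_labels = sorted({c for _, c in counts})
    # Nested dict: table[row][column] -> frequency, 0 where a pair never occurs.
    return {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}


records = [
    {"Gender": "F", "Smoker": "yes"},
    {"Gender": "F", "Smoker": "no"},
    {"Gender": "M", "Smoker": "no"},
    {"Gender": "M", "Smoker": "no"},
]
table = contingency_counts(records, "Gender", "Smoker")
print(table)
```

With a DataFrame, pandas.crosstab(df[var1], df[var2]) yields the same counts; whether create_contingency_table uses it internally is not stated here.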

download_result(filename, save_as)

Download a file from the result directory of the server and save it to the current local path.

Parameters
  • filename (string) – file name in result directory

  • save_as (string) – local file name under which the downloaded file is saved

download_selections(objects, selection_set=None)

Returns the selections as JSON for the given objects and selection set.

Parameters
  • objects (list of strings) – list of object names

  • selection_set (string) – the selection set name

execute_query(query, data_frame=True)

Execute an analytical query and return results.

Runs a query specification against the current session’s data and returns aggregated, grouped, or filtered results. Queries can be specified using the Query_config builder or as raw JSON dictionaries.

query (Query_config or dict)

Query specification containing aggregations, group-bys, and selections. Can be:

  • A Query_config object (recommended for type safety)

  • A dictionary with query structure in JSON format

data_frame (bool, default=True)

If True, return results as a pandas DataFrame. If False, return raw JSON structure.

pandas.DataFrame or dict

Query results in the requested format:

  • DataFrame: Columns correspond to grouped attributes and aggregated values

  • dict: Nested JSON structure with full hierarchy information

ValueError

If the query object is invalid or missing required fields.

RuntimeError

If query execution fails on the server or results cannot be retrieved.

AttributeError

If a Query_config object lacks the required to_json() method.

  • If no requestName is provided in the query, a unique identifier is auto-generated using format query_<8-char-uuid>.

  • The query remains available for inspection until explicitly deleted.

  • For large result sets, consider using get_result() separately to control result retrieval timing.

  • Aggregation types supported: COUNT, SUM, AVG, MIN, MAX, DISTINCT, etc.
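
The auto-generated identifier format query_<8-char-uuid> mentioned above can be reproduced as follows. This is a sketch of the naming scheme only; the helper name ensure_request_name is hypothetical, and taking the first eight hex characters of a random UUID is an assumption about how the eight characters are chosen.

```python
import uuid


def ensure_request_name(query_json):
    """Return a copy of the query dict that is guaranteed to have a requestName."""
    named = dict(query_json)
    if not named.get("requestName"):
        # First 8 hex characters of a random UUID, prefixed with "query_".
        named["requestName"] = f"query_{uuid.uuid4().hex[:8]}"
    return named


auto = ensure_request_name({"aggregations": []})
kept = ensure_request_name({"requestName": "avg_los_by_type"})
print(auto["requestName"])   # random on each run
print(kept["requestName"])   # avg_los_by_type
```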

Using Query_config (recommended):

>>> from xplain import Xsession, Query_config
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>>
>>> # Count diagnoses grouped by type
>>> query = Query_config()
>>> query.add_aggregation(
...     object_name="Diagnoses",
...     dimension_name="ICD_Code",
...     type="COUNT"
... )
>>> query.add_groupby(
...     object_name="Diagnoses",
...     dimension_name="ICD_Code",
...     attribute_name="Category"
... )
>>> df = session.execute_query(query)
>>> print(df.head())
   Category  COUNT_ICD_Code
0  Circulatory      15234
1  Respiratory      12456
2  Injury           8901

Filter by selection:

>>> query = Query_config()
>>> query.add_aggregation(
...     object_name="LabEvents",
...     dimension_name="Creatinine",
...     type="AVG"
... )
>>> query.add_groupby(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender"
... )
>>> query.add_selection(
...     object_name="Patients",
...     dimension_name="Age",
...     attribute_name="AgeGroup",
...     selected_states=["65-75", "75-85", ">85"]
... )
>>> elderly_creatinine = session.execute_query(query)

Using raw JSON (alternative):

>>> query_json = {
...     "aggregations": [{
...         "object": "Admissions",
...         "dimension": "LOS",
...         "type": "AVG"
...     }],
...     "groupBys": [{
...         "attribute": {
...             "object": "Admissions",
...             "dimension": "AdmissionType",
...             "attribute": "Type"
...         }
...     }],
...     "requestName": "avg_los_by_type"
... }
>>> df = session.execute_query(query_json)
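
When many raw JSON queries share the same shape, a small helper can assemble the dictionary. The sketch below uses only the key names visible in the raw JSON example above ("aggregations", "groupBys", "attribute", "requestName"); the helper make_query itself is hypothetical and not part of the package.

```python
import json


def make_query(object_name, dimension, agg_type, group_by=None, request_name=None):
    """Assemble a query dict using the key names from the raw JSON example."""
    query = {
        "aggregations": [
            {"object": object_name, "dimension": dimension, "type": agg_type}
        ]
    }
    if group_by is not None:
        # group_by is an (object, dimension, attribute) triple.
        obj, dim, attr = group_by
        query["groupBys"] = [
            {"attribute": {"object": obj, "dimension": dim, "attribute": attr}}
        ]
    if request_name is not None:
        query["requestName"] = request_name
    return query


built_query = make_query(
    "Admissions", "LOS", "AVG",
    group_by=("Admissions", "AdmissionType", "Type"),
    request_name="avg_los_by_type",
)
print(json.dumps(built_query, indent=2))
```

The resulting dictionary has the same structure as the hand-written example and could be passed to execute_query in the same way.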

Return raw JSON instead of DataFrame:

>>> result_json = session.execute_query(query, data_frame=False)
>>> print(result_json.keys())
dict_keys(['fields', 'children', 'requestName', 'status'])

See also:

  • Query_config : Builder class for constructing queries

  • QueryBuilder : Fluent API for building queries with method chaining

  • query : Start building a query using the fluent API

  • get_result : Retrieve results from a named query

  • open_attribute : Convenience method to open and count an attribute

gen_xtable(data, xtable_config, file_name)

get(params=None)

Send a GET request to the /xplainsession endpoint.

Parameters

params – Optional URL parameters.

Returns

API response.

get_attribute_info(object_name, dimension_name, attribute_name)

Find and retrieve the details of an attribute.

Parameters
  • object_name – the name of xobject

  • dimension_name – the name of dimension

  • attribute_name – the name of attribute

Returns

details of this attribute in json format

get_current_xplain_session()

Get the current xplain session instance.

get_dimension_info(object_name, dimension_name)

Find and retrieve the details of a dimension.

Parameters
  • object_name – the name of the xobject

  • dimension_name – the name of dimension

Returns

details of this dimension in json format

get_full_object_structure()

Returns a flat list of all objects with their parent, dimensions, and attributes.

Each entry contains:

  • object: object name

  • parent: parent object name (None for root)

  • dimensions: list of {"name", "attributes"} dicts

The flat structure makes it easy to search, filter, and read without recursive traversal.
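
A search over the flat structure needs no recursion, as the sketch below shows. It assumes that each "attributes" value is a list of attribute name strings, which this reference does not spell out; the helper find_attribute and the sample structure are hypothetical. Raising ValueError for an ambiguous name mirrors the behaviour documented for count_attribute.

```python
def find_attribute(structure, attribute_name):
    """Return (object, dimension) for the single location of attribute_name."""
    matches = [
        (entry["object"], dim["name"])
        for entry in structure
        for dim in entry["dimensions"]
        if attribute_name in dim["attributes"]
    ]
    if not matches:
        raise ValueError(f"Attribute {attribute_name!r} not found")
    if len(matches) > 1:
        raise ValueError(f"Attribute {attribute_name!r} is ambiguous: {matches}")
    return matches[0]


# Hypothetical structure in the documented flat format.
structure = [
    {"object": "Patients", "parent": None,
     "dimensions": [{"name": "Age", "attributes": ["Agegroup"]},
                    {"name": "Gender", "attributes": ["Gender"]}]},
    {"object": "Hospital Diagnose", "parent": "Patients",
     "dimensions": [{"name": "Diagnose", "attributes": ["Type"]}]},
]
location = find_attribute(structure, "Agegroup")
print(location)  # ('Patients', 'Age')
```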

get_importer()

Get the importer instance for managing database connections and imports.

get_independent_variables_of_model(model_name)

get the list of independent variables of given predictive model

Parameters

model_name (string) – name of predictive model

Returns

list of independent variables with details

Return type

array of dict

get_instance_as_dataframe(elements)

Get a pandas DataFrame representation of the Xplain artifacts referenced by elements, equivalent to the standard CSV download functionality in XOE.

Parameters

elements (list) – array of x-element paths, each one referring to an Xplain artifact: an object, a dimension or an attribute.

Returns

DataFrame representation of the requested instance

Return type

pd.DataFrame

Example:

elements = [
    {"object": "Person"},
    {"object": "Diagnosis", "dimension": "Physician"},
    {"object": "Prescription", "dimension": "Rx Code",
     "attribute": "ATC Hierarchy"},
    {"object": "Prescription", "dimension": "Rx Code",
     "attribute": "Substance"},
]

get_model_names()

List all loaded predictive models.

Returns

list of model names

Return type

array of string

get_object_info(object_name, root=None)

Find and display the details of an xobject in JSON.

Parameters
  • object_name – the name of the xobject

  • root – the object name from which the search starts. If no root is provided, the search starts at the root node of the entire object tree.

Returns

details of the xobject in JSON

get_open_sequences(sequence_name)

Retrieves details of open sequences by name.

get_queries()

get the list of the existing query ids

Returns

list of query ids

Return type

array of string

get_result(query_name, data_frame=True)

Get the result of the query.

Parameters

query_name (string) – the name or ID of the query

Returns

DataFrame result of the query

Return type

pd.DataFrame or json

get_root_object()

[Beta] Retrieve the root object.

Returns

The root object.

Return type

Xobject

Raises

KeyError – If ‘focusObject’ or ‘objectName’ is missing from the session.

get_selections()

display all global selections in the current xplain session

Returns

selections as json

Return type

list of json

get_sequence_transition_matrix(sequence_name)

Retrieves the transition matrix for the specified sequence.

Parameters

sequence_name – Name of the sequence.

Returns

Transition matrix as a dictionary with labels, sources, targets, and values.
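
A dictionary with labels, sources, targets, and values can be turned into a dense square matrix for inspection. The sketch below assumes that sources and targets are integer indices into labels and that values runs parallel to them (a common layout for flow diagrams); this reference does not confirm that layout, and both the helper to_dense_matrix and the sample data are hypothetical.

```python
def to_dense_matrix(transitions):
    """Build a square nested list where matrix[i][j] is the flow from i to j."""
    labels = transitions["labels"]
    size = len(labels)
    matrix = [[0] * size for _ in range(size)]
    for src, dst, val in zip(transitions["sources"],
                             transitions["targets"],
                             transitions["values"]):
        # Assumed: src and dst are positions in the labels list.
        matrix[src][dst] += val
    return labels, matrix


sample_transitions = {
    "labels": ["Admission", "ICU", "Discharge"],
    "sources": [0, 0, 1],
    "targets": [1, 2, 2],
    "values": [30, 70, 25],
}
labels, matrix = to_dense_matrix(sample_transitions)
for label, row in zip(labels, matrix):
    print(label, row)
```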

get_session()

get_session_id()

Get the current Xplain session identifier.

Returns the unique 32-character session ID assigned by the server when the session was created or loaded.

str

The 32-character alphanumeric session identifier (e.g., "A1B2C3D4E5F6G7H8I9J0K1L2M3N4O5P6").

  • The session ID can be used to share or resume sessions across different clients using load_from_session_id().

  • Session IDs remain valid until explicitly terminated or until server timeout expires.

  • Can be set via environment variable xplain_session_id during initialization.

Get current session ID:

>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>> session_id = session.get_session_id()
>>> print(f"Current session: {session_id}")
Current session: A1B2C3D4E5F6G7H8I9J0K1L2M3N4O5P6

Share session with another client:

>>> # Client 1
>>> session1 = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session1.startup("Analysis")
>>> shared_id = session1.get_session_id()
>>>
>>> # Client 2 (reuses same session)
>>> session2 = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session2.load_from_session_id(shared_id)

See also:

  • load_from_session_id : Load session from an existing session ID

  • terminate : Close and invalidate the current session

get_state_hierarchy(object_name, dimension_name, attribute_name, state=None, levels=None, request_name=None)

Retrieve the hierarchical structure of states for a given attribute.

Parameters
  • object_name – Name of the object.

  • dimension_name – Name of the dimension.

  • attribute_name – Name of the attribute.

  • state – The name of a state in the attribute’s hierarchy. Optional.

  • levels – The number of hierarchy levels to return. Optional.

  • data_frame – Whether to return the result as a pandas DataFrame. Default is True.

Returns

Hierarchical structure of states.

Return type

dict or DataFrame

get_tree_details(object_name=None, dimension_name=None, attribute_name=None)

get the metadata details of certain xplain object, dimension or attribute as json

Parameters
  • object_name (string, optional) – the name of the object. If empty, the whole object tree from the root is shown. If only object_name is specified, the metadata of this object is returned.

  • dimension_name (string, optional) – the name of dimension, optional. If object_name and dimension_name are specified, returns the dimension metadata.

  • attribute_name (string, optional) – the name of attribute, optional. If object_name, dimension_name and attribute_name are specified, returns the attribute metadata.

Returns

object tree details

Return type

json

get_variable_details(model_name, data_frame=True)

Retrieve the details of the independent variables for a predictive model.

Parameters
  • model_name (str) – The name of the predictive model.

  • data_frame (bool) – Whether to return the result as a pandas DataFrame.

Returns

The model’s independent variables details as a DataFrame or JSON.

Return type

pd.DataFrame or dict

Raises

ValueError – If the predictive model or its variables are not found.

get_variable_list(model_name)

get the list of independent variables of given predictive model

Parameters

model_name (string) – name of predictive model

Returns

list of independent variables

Return type

array of string

get_xobject(object_name)

[Beta] Retrieve the object with the given name.

Parameters

object_name (str) – The name of the object to retrieve.

Returns

The object with the given name, or None if not found.

Return type

Xobject or None

http_get(entrypoint, params=None)

Performs an HTTP GET request to the specified endpoint.

Parameters
  • entrypoint – API endpoint relative to the base URL.

  • params – Query parameters for the GET request.

Returns

Parsed JSON response or raw content.

Raises

RuntimeError – If the GET request fails.

http_post(entrypoint, payload_json=None, data=None, files=None, params=None)

Performs an HTTP POST request to the specified endpoint.

Parameters
  • entrypoint – API endpoint relative to the base URL.

  • payload_json (dict) – Dictionary payload for the POST request

  • data (dict) – Form data for the POST request.

  • files (dict) –

  • params (dict) –

Returns

Parsed JSON response or raw content.

Raises

RuntimeError – If the POST request fails.

list_analyses()

List available xanalysis configurations

list_existing_analyses()

List available xanalysis configurations

list_files(ownership, file_type, file_extension=None)

Lists files with the specified ownership and type.

Parameters
  • ownership – Ownership type.

  • file_type – File type.

  • file_extension – Optional file extension.

Returns

List of files or raises exception on failure.

load_analysis(file_name)

Load xanalysis

load_from_session_id(session_id)

Load an Xplain session by the given existing session ID.

Parameters

session_id (string) – the 32-character Xplain session ID

load_result_file_as_df(filename)

Load a file from the session as a pandas DataFrame.

Parameters

filename – Name of the file to load.

Returns

DataFrame containing file content.

open_attribute(object_name, dimension_name, attribute_name, request_name=None, data_frame=True)

Open an attribute and get counts grouped by its values.

Convenience method that creates a simple aggregation query counting entities grouped by the first level of the specified attribute hierarchy. Equivalent to a COUNT aggregation with a single GROUP BY.

object_name (str)

Name of the object containing the dimension.

dimension_name (str)

Name of the dimension containing the attribute.

attribute_name (str)

Name of the attribute to open and count.

request_name (str, optional)

Identifier for the query request. If None, a unique UUID is generated.

data_frame (bool, default=True)

If True, return results as a pandas DataFrame. If False, return raw JSON structure.

pandas.DataFrame or dict

Counts of entities grouped by attribute values:

  • DataFrame: Two columns (attribute value, count)

  • dict: Nested JSON with full hierarchy

RuntimeError

If the attribute cannot be opened or does not exist in the structure.

  • This method is optimized for quick exploration of categorical attributes.

  • For multi-level hierarchies, only the first level is expanded by default.

  • To navigate deeper levels, use expand() or expand_to_level() methods.

  • The request remains available for further operations (expand, collapse, etc.).

Count patients by gender:

>>> from xplain import Xsession
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>>
>>> df = session.open_attribute(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender"
... )
>>> print(df)
  Gender  Count
0      M  25431
1      F  23892

Analyze admission types:

>>> admissions_df = session.open_attribute(
...     object_name="Admissions",
...     dimension_name="AdmissionType",
...     attribute_name="Type"
... )
>>> print(admissions_df)
       Type  Count
0  EMERGENCY  45123
1   ELECTIVE  12456
2    URGENT   8901

Explore ICD diagnosis categories:

>>> diagnoses = session.open_attribute(
...     object_name="Diagnoses",
...     dimension_name="ICD_Code",
...     attribute_name="Category",
...     request_name="diag_category_counts"
... )

Return raw JSON for custom processing:

>>> result = session.open_attribute(
...     object_name="LabEvents",
...     dimension_name="ItemID",
...     attribute_name="TestName",
...     data_frame=False
... )

See also:

  • count_attribute : Auto-resolve object/dimension and count attribute

  • execute_query : Execute full query with multiple aggregations

  • expand : Expand attribute hierarchy to show child nodes

  • expand_to_level : Expand hierarchy to a specific depth level

open_query(query, data_frame=True)

Perform the query and keep it open. The result of an open query is affected by later changes to the current session, such as selection changes.

Parameters
  • query – either xplain.Query instance or JSON

  • data_frame – if True, the result will be returned as DataFrame

Returns

result of given query

Return type

JSON or DataFrame, depending on the data_frame parameter

open_sequence(target_object, base_object, ranks, reverse, names, name_postfixes, dimensions_2_replicate, sort_dimension, zero_point_dimension, selections, selection_set_definition_rank, floating_semantics, attribute_2_copy, sequence_name, rank_dimension_name, rank_zero_is_first_instance_equal_or_greater_zero_point, transition_attribute, transition_level, open_marginal_queries, open_transition_queries, selection_set)

perform(payload)

Send a POST request to the entry point /xplainsession with the payload as JSON.

Parameters

payload (json) – content of the Xplain Web API call

Returns

request response

Return type

json

Example

>>> session.perform({"method": "deleteRequest",
...                  "requestName": "abcd"})

post(payload)

Send POST request against entry point /xplainsession with payload as json

Parameters

payload – xplain web api in json

Returns

request response as JSON

post_and_broadcast(payload)

Send a POST request and notify the backend of session updates.

Parameters

payload – JSON payload for the API request.

post_file_download(file_name, file_type, ownership='PUBLIC', team=None, user=None, delete_after_download=True)

Triggers the flat table download functionality in XOE.

Parameters
  • file_name – Name of the file to be downloaded.

  • file_type – Type of the file.

  • ownership – Ownership type, defaults to “PUBLIC”.

  • team – Team identifier, optional.

  • user – User identifier, optional.

  • delete_after_download – Whether to delete the file after download, defaults to True.

Returns

HTTP response object or raises exception on failure.

print_error()

Print the last error message.

print_last_stack_trace()

Print the stack trace of the last error.

query_builder(name=None)

Start building a query using the fluent QueryBuilder API.

This is an alternative to Query_config that lets you chain aggregate, groupby, and selection calls and then finalise with execute() or open().

When only attribute is supplied to groupby / selection, the builder searches the session object tree and resolves the matching object and dimension automatically.

Parameters

name (str, optional) – A label for the query used as its requestName. Defaults to a random UUID.

Returns

A new query builder bound to this session.

Return type

QueryBuilder

Example:

df = (
    session.query_builder(name="lab_counts")
    .aggregate(object="Lab Events", type="COUNT")
    .groupby(attribute="TestType")
    .selection(attribute="Date", selected_states=["2024-01"])
    .execute()
)
read_file(ownership, file_type, file_path)

Reads the specified file.

Parameters
  • ownership – Ownership type.

  • file_type – File type.

  • file_path – Path of the file.

Returns

File content or raises exception on failure.

refresh()

Synchronize the session content with the backend.

resume_analysis(file_name)

Resume a stored session.

Parameters

file_name (str) – name of the stored session file

Returns

True on success, False on failure

Return type

bool

run(method)

Perform an Xplain Web API method and broadcast the change to other clients sharing the same session ID.

Parameters

method (json) – Xplain Web API method in JSON format

run_py(file_name, options, ownership)

Executes a Python script file on the server.

Parameters
  • file_name – Name of the Python file.

  • options – Execution options.

  • ownership – File ownership type.

Returns

Parsed JSON result or raw content.

Raises

RuntimeError – If the request fails.

run_statsmodels(df, formula, model_type='logit')

Fit a statistical model to the provided dataframe using the specified formula and model type.

Parameters
  • df (pandas.DataFrame) – The input dataframe containing the data.

  • formula (str) – A Patsy-compatible formula specifying the dependent and independent variables.

  • model_type (str) – The type of model to fit. Supported options are ‘logit’, ‘probit’, ‘ols’, ‘mnlogit’, ‘glm’, ‘poisson’, ‘negative_binomial’. Default is ‘logit’.

Returns

statsmodels.regression.linear_model.OLSResults or statsmodels.discrete.discrete_model.LogitResults or other statsmodels result object depending on the model_type.

Raises

ValueError – If the model_type is unsupported or if the dependent variable is not appropriate for the chosen model (e.g., non-binary dependent variable for logit/probit).
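
The ValueError cases above can be checked on the client before a model is fitted. The following is a minimal sketch, not the package's implementation: build_formula_string and check_model_request are hypothetical helpers, and treating "appropriate for logit/probit" as "exactly two distinct values" is an assumption.

```python
SUPPORTED_MODEL_TYPES = {
    "logit", "probit", "ols", "mnlogit", "glm", "poisson", "negative_binomial",
}


def build_formula_string(response, predictors):
    """Hypothetical helper: join column names into a Patsy-style formula."""
    if not predictors:
        raise ValueError("at least one predictor is required")
    return f"{response} ~ {' + '.join(predictors)}"


def check_model_request(model_type, response_values):
    """Hypothetical pre-flight check mirroring the documented ValueError cases."""
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise ValueError(f"unsupported model_type: {model_type!r}")
    # Assumption: logit/probit need a dependent variable with exactly two values.
    if model_type in ("logit", "probit") and len(set(response_values)) != 2:
        raise ValueError(f"{model_type} requires a binary dependent variable")


formula = build_formula_string("Died", ["Age", "Gender"])
check_model_request("logit", [0, 1, 1, 0])
# With a live session: session.run_statsmodels(df, formula, model_type="logit")
```

The column names Died, Age and Gender are placeholders for columns of the dataframe being modelled.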

property session

Returns the underlying requests.Session object. This allows external code to reuse the authenticated session.

set_default_broadcast(broadcast)

Set the default broadcast behaviour so that other Xplain clients sharing the same Xplain session are informed when the current session is updated.

Parameters

broadcast (bool) – whether, after a successful session update via a Python call, a default refresh signal is broadcast to all Xplain clients sharing the same session, forcing them to refresh.

show_tree()

Show the object tree.

Returns

The object hierarchy rendered as a tree

Return type

str

Raises
  • RuntimeError – if the session is not properly initialized.

  • Exception – if an unexpected error occurs.

show_tree_details()

Display the details of the object tree.

startup(startup_file)

Load an Xplain session from a startup configuration file.

Initializes the session’s object structure, dimensions, and default settings from a saved .xstartup configuration file. The file extension is optional and will be added automatically if not provided.

Parameters

startup_file (str) – Name of the startup configuration file. The .xstartup extension is optional and will be appended automatically if missing.

Raises

RuntimeError – If the startup file cannot be found or loaded, or if the file contains invalid configuration.

  • Startup files define the initial object tree structure, including:
    • Objects and their hierarchies (parent-child relationships)

    • Dimensions attached to each object

    • Attributes within dimensions

    • Default selections and filters

  • Loading a startup file replaces any existing session state.

  • After loading, the session is ready for query execution without additional configuration.

Load a MIMIC-IV patient cohort:

>>> from xplain import Xsession
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV_Patients")  # .xstartup extension added automatically

Load different configurations sequentially:

>>> session = Xsession(url="http://localhost:8080", user="researcher", password="pass")
>>> session.startup("ICU_Admissions.xstartup")
>>> # ... perform analysis ...
>>> session.startup("Lab_Events")  # Switch to different configuration

Check loaded structure:

>>> session.startup("MIMIC_Cohort")
>>> session.show_tree()  # Display the loaded object hierarchy
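
The extension handling described above can be pictured as follows. This is only a sketch of the documented behaviour, not the package's own code; with_startup_extension is a hypothetical helper.

```python
def with_startup_extension(name):
    """Hypothetical helper: append '.xstartup' unless it is already present."""
    return name if name.endswith(".xstartup") else name + ".xstartup"


short_form = with_startup_extension("MIMIC_IV_Patients")
full_form = with_startup_extension("ICU_Admissions.xstartup")
```

Both spellings resolve to a single file name, which is why the examples above can pass either form to startup().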

See also
  • startup_from_xview_config – Load session from an XView configuration object

  • show_tree – Display the current object structure

  • get_session – Get the current session information

startup_from_xview_config(xview_config)

Load an Xplain session from the given view configuration JSON.

Parameters

xview_config – the view configuration in JSON format

store_xsession(response_json)

Store session details from the response.

Parameters

response_json – Response parsed as JSON.

terminate()

Terminate the Xplain session and logout from the server.

Closes the current session, invalidating the session ID and releasing server resources. After termination, the session cannot be reused and a new session must be created.

  • All pending queries and results are lost after termination.

  • The session ID becomes invalid and cannot be loaded again.

  • It is good practice to terminate sessions explicitly when done to free server resources, especially in long-running applications.

  • Automatic session cleanup occurs on server timeout if not explicitly terminated.

Basic session lifecycle:

>>> from xplain import Xsession
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>> # ... perform analysis ...
>>> session.terminate()

Using context manager (recommended):

>>> with Xsession(url="http://localhost:8080", user="admin", password="admin") as session:
...     session.startup("Analysis")
...     df = session.execute_query(query)
...     # Session automatically terminated when exiting context

Multiple sessions:

>>> session1 = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session2 = Xsession(url="http://other:8080", user="admin", password="admin")
>>> session1.startup("Dataset_A")
>>> session2.startup("Dataset_B")
>>> # ... work with both sessions ...
>>> session1.terminate()
>>> session2.terminate()
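
Where the context manager is not convenient, a try/finally block gives a similar guarantee. The sketch below uses a stand-in class (FakeSession, hypothetical) so that it runs without a server; with a real Xsession only the construction line would change.

```python
class FakeSession:
    """Stand-in for Xsession so the pattern can run without a server."""

    def __init__(self):
        self.terminated = False

    def startup(self, name):
        # Simulate a startup file that cannot be loaded.
        raise RuntimeError(f"could not load {name}")

    def terminate(self):
        self.terminated = True


session = FakeSession()
error = None
try:
    session.startup("Missing_Config")
except RuntimeError as exc:
    error = str(exc)
finally:
    # Runs whether or not startup() succeeded, so the session is always released.
    session.terminate()
```

The finally clause is what guarantees terminate() is called even when startup() or a later query raises.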

See also
  • __init__ – Create a new session

  • get_session_id – Get the current session identifier

upload_data(file_name)

Upload a file from the current local directory to the data directory on the server.

Parameters

file_name (str) – the file to upload

upload_xmodel(model_or_path, filename=None, ownership='PUBLIC')

Upload an .xmodel configuration file to the server’s public model store.

The file is stored in the server directory resolved by file-type XMODEL_CONFIG (config/models/). Once uploaded it can be referenced by name in buildModel / crossValidateModel payloads:

{"method": "buildModel", "xmodelConfigurationFileName": "My_Model.xmodel", ...}

Parameters
  • model_or_path – Either an XModel instance or a local file-system path (str) to an existing .xmodel file.

  • filename (str) – Name to use on the server (e.g. "My_Model.xmodel"). Defaults to <model.name>.xmodel when an XModel is passed, or to the basename of the path otherwise. .xmodel is appended automatically if omitted.

  • ownership (str) – File-store ownership scope. One of "PUBLIC" (shared by all users, default), "TEAM", "USER", or "SYSTEM".

Raises

RuntimeError – if the HTTP upload fails or the server returns an error response.

Return type

None

Example:

from xplain.xmodel import XModel, IndependentVariableSet, AutoSpaceDefinition

model = XModel(
    name="Failure_Model",
    predictive_model_object="actuator",
    independent_variable_sets=[
        IndependentVariableSet(
            predictive_model_object="actuator",
            auto_space_definitions=[
                AutoSpaceDefinition("screwing station", ["Result"]),
            ],
        )
    ],
)
xsession.upload_xmodel(model)
# → uploaded as "Failure_Model.xmodel" to config/models/ (PUBLIC)
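
The filename defaults described under Parameters can be summarised in a few lines. This is a sketch of the documented rules only, under the assumption that the model object exposes a name attribute; resolve_xmodel_filename and the stand-in model class are hypothetical.

```python
import os


def resolve_xmodel_filename(model_or_path, filename=None):
    """Hypothetical sketch of the documented server-side filename defaults."""
    if filename is None:
        if isinstance(model_or_path, str):
            # A local path: fall back to its basename.
            filename = os.path.basename(model_or_path)
        else:
            # Assumes the model object exposes a .name attribute.
            filename = model_or_path.name
    if not filename.endswith(".xmodel"):
        filename += ".xmodel"
    return filename


class StandInModel:
    """Minimal stand-in for an XModel that only carries a name."""

    name = "Failure_Model"


from_model = resolve_xmodel_filename(StandInModel())
from_path = resolve_xmodel_filename("models/churn.xmodel")
explicit = resolve_xmodel_filename("models/churn.xmodel", filename="My_Model")
```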
validate_db(db_connection_config)

Validates a database connection configuration.

Parameters

db_connection_config – Dictionary containing DB connection settings.

Raises

RuntimeError – If validation fails or an error occurs.

Constructor Parameters:

Parameter         Type              Description
url               str               URL of the Xplain server (default: http://localhost:8080)
user              str               Username for authentication (default: user)
password          str               Password for authentication (default: xplainData)
httpsession       requests.Session  Existing requests session object (optional)
http_session_id   str               Existing HTTP session ID / JSESSIONID (optional)
jwt_dispatch_url  str               JWT authentication endpoint URL (optional)
jwt_cookie_name   str               Cookie name for JWT token (optional)
jwt_token         str               JWT token value (optional)

Note

Authentication Methods:

  • Password authentication: Provide user and password

  • JWT authentication: Provide all three: jwt_dispatch_url, jwt_cookie_name, jwt_token

  • Recommended: Use create_session() to load credentials from config file or environment variables

See Authentication & Credential Management for credential management best practices.
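
The rule that JWT authentication needs all three JWT arguments can be checked before a session is constructed. The helper below is a hypothetical sketch, not part of the xplain package; in particular, letting the JWT arguments take precedence over user and password is an assumption.

```python
def pick_auth_method(user=None, password=None, jwt_dispatch_url=None,
                     jwt_cookie_name=None, jwt_token=None):
    """Hypothetical helper: report which documented auth method the arguments select."""
    jwt_args = (jwt_dispatch_url, jwt_cookie_name, jwt_token)
    if any(arg is not None for arg in jwt_args):
        # JWT authentication needs all three values.
        if any(arg is None for arg in jwt_args):
            raise ValueError(
                "JWT authentication needs jwt_dispatch_url, "
                "jwt_cookie_name and jwt_token"
            )
        return "jwt"
    if user is not None and password is not None:
        return "password"
    raise ValueError("provide user and password, or all three JWT arguments")


method = pick_auth_method(user="admin", password="admin")
```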

Session Management Methods:

Method                                   Description
startup(startup_file)                    Load a session from a startup configuration file
startup_from_xview_config(xview_config)  Load a session from an XView configuration
load_from_session_id(session_id)         Connect to an existing session by its 32-character ID
get_session_id()                         Get the current session ID
terminate()                              Terminate the session and logout
refresh()                                Synchronize local session state with the server
set_default_broadcast(broadcast)         Enable/disable broadcasting updates to other clients

Query Methods:

Method                                                            Description
query(name=None)                                                  Start a fluent QueryBuilder chain (see below)
execute_query(query, data_frame=True)                             Execute a query (Query_config or JSON) and return results
open_query(query, data_frame=True)                                Execute a query and keep it open (results update with selections)
open_attribute(object_name, dimension_name, attribute_name, ...)  Open an attribute grouped by first level, aggregated by count
get_result(query_name, data_frame=True)                           Get results of an existing query by name
get_queries()                                                     List IDs of all open queries
convert_to_dataframe(data)                                        Convert JSON result to pandas DataFrame

Object Tree Methods:

Method                                                           Description
show_tree()                                                      Print the object hierarchy as a text tree
show_tree_details()                                              Display detailed object tree as JSON
collapsible_tree()                                               Render interactive tree in Jupyter using pyecharts
get_tree_details(object_name, dimension_name, attribute_name)    Get metadata for object/dimension/attribute
get_object_info(object_name)                                     Get detailed JSON info for an object
get_dimension_info(object_name, dimension_name)                  Get detailed JSON info for a dimension
get_attribute_info(object_name, dimension_name, attribute_name)  Get detailed JSON info for an attribute
get_root_object()                                                Get the root XObject instance [Beta]
get_xobject(object_name)                                         Get an XObject by name [Beta]
get_full_object_structure()                                      Get nested dict of all objects, dimensions, and attributes

Selection Methods:

Method                                                                 Description
get_selections()                                                       Get all global selections in the session
download_selections(objects, selection_set=None)                       Download selections for specific objects
get_state_hierarchy(object_name, dimension_name, attribute_name, ...)  Get the hierarchical state structure for an attribute

Data Export Methods:

Method                               Description
get_instance_as_dataframe(elements)  Export instance data as a pandas DataFrame (CSV download)
download_result(filename, save_as)   Download a file from the server result directory
upload_data(file_name)               Upload a local file to the server data directory

Predictive Modeling Methods:

Method                                                          Description
build_predictive_model(model_name, config_file, target_object)  Build a predictive model [Beta]
get_model_names()                                               List all loaded predictive models
get_variable_list(model_name)                                   Get independent variable names for a model
get_independent_variables_of_model(model_name)                  Get detailed independent variable info
get_variable_details(model_name, data_frame=True)               Get variable details as DataFrame or JSON

Statistical Modeling Methods:

Method                                            Description
run_statsmodels(df, formula, model_type="logit")  Fit a statistical model (logit, probit, ols, mnlogit, glm, poisson, negative_binomial)
build_formula(response, predictors)               Build an R-style formula string
create_contingency_table(df, var1, var2)          Create a cross-tabulation table

File Management Methods:

Method                                                 Description
list_files(ownership, file_type, file_extension=None)  List files of a given type and ownership
read_file(ownership, file_type, file_path)             Read a file from the server
run_py(file_name, options, ownership)                  Execute a Python script on the server
list_analyses()                                        List available xanalysis configurations
load_analysis(file_name)                               Load an xanalysis (startup + saved state)
resume_analysis(file_name)                             Resume a stored analysis session

Low-Level API Methods:

Method                                    Description
run(method)                               Execute an Xplain Web API method and broadcast changes
perform(payload)                          Send POST to /xplainsession and return JSON response
post(payload)                             Send raw POST to /xplainsession
get(params=None)                          Send GET to /xplainsession
http_get(entrypoint, params=None)         Generic HTTP GET to any endpoint
http_post(entrypoint, payload_json=None)  Generic HTTP POST to any endpoint
get_api()                                 Get an Api instance for advanced operations
get_importer()                            Get an Importer instance for data import operations

xplain.XObject

class xplain.XObject(object_name, ref_session)

Bases: object

Represents an Xplain data object with navigation and dimension manipulation.

XObjects are the fundamental building blocks of the Xplain object model, representing entities in your data domain (e.g., Patients, Admissions, Lab Events). Each XObject contains:

  • Child objects: Hierarchical relationships to other objects

  • Dimensions: Measurable or categorical properties of the object

  • Aggregations: Computed values derived from child object data

This class provides methods to explore the object structure, retrieve dimensions and child objects, and dynamically add aggregation dimensions that compute summary statistics from related data.

Parameters
  • object_name (str) – The name of the XObject in the current session.

  • ref_session (Xsession) – Reference to the active Xsession for API interactions.

Attributes
  • object_name (str) – The name of the XObject.

  • _ref_session (Xsession) – The session object used for API interactions.

Raises

TypeError – If object_name is not a string.

  • XObjects are retrieved via Xsession.get_xobject(object_name).

  • The object tree structure is defined by the loaded startup configuration or XView.

  • Aggregation dimensions enable analysis across object hierarchies without manual joins.

Get an XObject and explore its structure:

>>> from xplain import Xsession
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>>
>>> patients = session.get_xobject("Patients")
>>> print(patients.get_name())
Patients
>>> print(patients.get_dimensions())
['PatientID', 'Age', 'Gender', 'Ethnicity']
>>> print(patients.get_child_objects())
['Admissions', 'Diagnoses', 'LabEvents']

Navigate child objects:

>>> admissions = session.get_xobject("Admissions")
>>> print(admissions.get_dimensions())
['AdmissionID', 'AdmitDate', 'DischargeDate', 'LOS', 'AdmissionType']

Add aggregation dimension:

>>> # Add average length of stay to Patients object
>>> patients.add_aggregation_dimension(
...     dimension_name="AvgLOS",
...     aggregation={
...         "object": "Admissions",
...         "dimension": "LOS",
...         "type": "AVG"
...     }
... )

See also
  • Xsession.get_xobject – Retrieve an XObject by name

  • Dimension – Represents a dimension within an XObject

  • Attribute – Represents an attribute within a dimension

__init__(object_name, ref_session)

Initialize an XObject instance.

Parameters
  • object_name (str) – The name of the XObject.

  • ref_session – A session object for making API calls.

add_aggregation_dimension(dimension_name, aggregation, selections=None, floating_semantics=False)

Add an aggregation dimension that computes values from child object data.

Aggregation dimensions enable computing summary statistics from related objects without writing explicit joins. For example, add “AvgLOS” to a Patients object by averaging the Length-of-Stay dimension from the child Admissions object.

The aggregation dimension becomes part of the object’s schema and can be used in queries, groupings, and further aggregations.

Parameters
  • dimension_name (str) – Name for the new aggregation dimension. Must be unique within the object.

  • aggregation (dict) – Aggregation specification defining what to compute. Required keys:
    • "object" (str): Name of the child object to aggregate from

    • "dimension" (str): Name of the dimension to aggregate

    • "type" (str): Aggregation type (COUNT, SUM, AVG, MIN, MAX, etc.)

    Example:

    {
        "object": "Admissions",
        "dimension": "LOS",
        "type": "AVG"
    }

  • selections (list of dict, optional) – Filters to apply before aggregation. Each selection is a dict with:
    • "attribute" (dict): Object/dimension/attribute to filter on

    • "selectedStates" (list): Values to include

    Useful for conditional aggregations (e.g., “count only ICU admissions”).

  • floating_semantics (bool, default=False) – If True, the dimension uses floating semantics, meaning it updates dynamically based on current selections. If False (default), the dimension is computed once and remains static.

Returns

dict – Server response confirming the dimension was added.

Raises
  • ValueError – If dimension_name is empty, aggregation is not a dict, or selections is not a list.

  • RuntimeError – If the API call to add the dimension fails.

  • Aggregation dimensions are computed server-side and cached for performance.

  • They appear in the object’s dimension list immediately after creation.

  • Floating semantics dimensions recalculate when selections change, enabling dynamic “what-if” analysis.

  • The aggregation can reference any descendant object, not just direct children.

Add average length of stay to Patients:

>>> session.startup("MIMIC_IV")
>>> patients = session.get_xobject("Patients")
>>> patients.add_aggregation_dimension(
...     dimension_name="AvgLOS",
...     aggregation={
...         "object": "Admissions",
...         "dimension": "LOS",
...         "type": "AVG"
...     }
... )

Count total admissions per patient:

>>> patients.add_aggregation_dimension(
...     dimension_name="TotalAdmissions",
...     aggregation={
...         "object": "Admissions",
...         "dimension": "AdmissionID",
...         "type": "COUNT"
...     }
... )

Conditional aggregation - ICU admissions only:

>>> patients.add_aggregation_dimension(
...     dimension_name="ICU_AdmissionCount",
...     aggregation={
...         "object": "Admissions",
...         "dimension": "AdmissionID",
...         "type": "COUNT"
...     },
...     selections=[{
...         "attribute": {
...             "object": "Admissions",
...             "dimension": "ICU_Stay",
...             "attribute": "ICU_Flag"
...         },
...         "selectedStates": ["Yes"]
...     }]
... )

Average creatinine level (multi-level aggregation):

>>> patients.add_aggregation_dimension(
...     dimension_name="AvgCreatinine",
...     aggregation={
...         "object": "LabEvents",  # Grandchild of Patients
...         "dimension": "Creatinine",
...         "type": "AVG"
...     }
... )

Floating semantics for dynamic analysis:

>>> patients.add_aggregation_dimension(
...     dimension_name="SelectedAdmissionCount",
...     aggregation={
...         "object": "Admissions",
...         "dimension": "AdmissionID",
...         "type": "COUNT"
...     },
...     floating_semantics=True  # Updates when selections change
... )
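
The argument shapes used in the examples above can be assembled and checked locally before the call is made. The sketch below is hypothetical and not part of the xplain package: make_selection and check_arguments are illustrative helpers, and rejecting an aggregation dict with missing required keys is an assumption that goes beyond the documented ValueError cases.

```python
REQUIRED_AGGREGATION_KEYS = ("object", "dimension", "type")


def make_selection(object_name, dimension_name, attribute_name, states):
    """Build one selection dict in the shape used by the examples above."""
    return {
        "attribute": {
            "object": object_name,
            "dimension": dimension_name,
            "attribute": attribute_name,
        },
        "selectedStates": list(states),
    }


def check_arguments(dimension_name, aggregation, selections=None):
    """Hypothetical pre-flight check mirroring the documented ValueError cases."""
    if not isinstance(dimension_name, str) or not dimension_name:
        raise ValueError("dimension_name must be a non-empty string")
    if not isinstance(aggregation, dict):
        raise ValueError("aggregation must be a dict")
    # Assumption: a missing required key is also treated as a ValueError.
    missing = [key for key in REQUIRED_AGGREGATION_KEYS if key not in aggregation]
    if missing:
        raise ValueError(f"aggregation is missing keys: {missing}")
    if selections is not None and not isinstance(selections, list):
        raise ValueError("selections must be a list")


aggregation = {"object": "Admissions", "dimension": "AdmissionID", "type": "COUNT"}
icu_only = make_selection("Admissions", "ICU_Stay", "ICU_Flag", ["Yes"])
check_arguments("ICU_AdmissionCount", aggregation, [icu_only])
# With a live session:
# patients.add_aggregation_dimension("ICU_AdmissionCount", aggregation,
#                                    selections=[icu_only])
```

Checking locally surfaces malformed arguments before a request reaches the server.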

See also
  • get_dimensions – List all dimensions including aggregations

  • Xsession.execute_query – Use aggregation dimensions in queries

  • Query_config.add_aggregation – Alternative aggregation method

Return type

dict

get_child_objects()

Retrieve the names of all child objects in the hierarchy.

Child objects represent one-to-many relationships from the current object. For example, a “Patients” object might have “Admissions” and “Diagnoses” as child objects.

Returns

list of str – Names of all child objects. Returns an empty list if the object has no children.

Raises
  • KeyError – If the response from the server is missing expected keys.

  • RuntimeError – If the API call to fetch object details fails.

  • Child objects are defined in the startup configuration or XView.

  • The parent-child relationship enables aggregation dimensions that compute statistics across the hierarchy.

  • This method returns names only; use session.get_xobject(name) to get the actual child XObject instances.

Explore object hierarchy:

>>> session.startup("MIMIC_IV")
>>> patients = session.get_xobject("Patients")
>>> children = patients.get_child_objects()
>>> print(children)
['Admissions', 'Diagnoses', 'LabEvents', 'Prescriptions']

Navigate to child objects:

>>> for child_name in patients.get_child_objects():
...     child_obj = session.get_xobject(child_name)
...     print(f"{child_name}: {child_obj.get_dimensions()}")
Admissions: ['AdmissionID', 'AdmitDate', 'LOS']
Diagnoses: ['DiagnosisID', 'ICD_Code', 'DiagnosisDate']
...
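
Because get_child_objects() returns names only, walking the whole hierarchy means alternating between session.get_xobject(name) and get_child_objects(). The sketch below shows one way to do that iteratively. The stub classes stand in for Xsession and XObject so the example runs without a server, and the tree they hold is illustrative data.

```python
def walk_objects(session, root_name):
    """Yield (depth, object_name) pairs, each parent before its children."""
    stack = [(0, root_name)]
    while stack:
        depth, name = stack.pop()
        yield depth, name
        children = session.get_xobject(name).get_child_objects()
        # Push in reverse so that children are visited in their listed order.
        for child in reversed(children):
            stack.append((depth + 1, child))


class StubObject:
    """Stand-in for XObject exposing only get_child_objects()."""

    def __init__(self, children):
        self._children = children

    def get_child_objects(self):
        return list(self._children)


class StubSession:
    """Stand-in for Xsession backed by a plain dict (illustrative data only)."""

    def __init__(self, tree):
        self._tree = tree

    def get_xobject(self, name):
        return StubObject(self._tree.get(name, []))


stub = StubSession({
    "Patients": ["Admissions", "Diagnoses"],
    "Admissions": ["LabEvents"],
})
visited = list(walk_objects(stub, "Patients"))
for depth, name in visited:
    print("  " * depth + name)
```

With a real session, walk_objects(session, "Patients") would issue one get_xobject call per visited object.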

See also
  • get_dimensions – Get dimensions of the current object

  • Xsession.get_xobject – Retrieve a child object instance

Return type

list

get_dimension(dimension_name)

Retrieve a specific dimension by its name.

Parameters

dimension_name (str) – The name of the dimension to retrieve.

Returns

The name of the dimension if found, otherwise None.

Return type

str

Raises
  • ValueError – If the dimension name is invalid.

  • RuntimeError – If fetching dimensions fails.

get_dimensions()

Retrieve the names of all dimensions attached to this object.

Dimensions represent properties or measurements of the object. They can be:

  • Stored dimensions: Values imported from source data

  • Aggregation dimensions: Computed from child object data

  • Derived dimensions: Calculated from other dimensions

Returns

list of str – Names of all dimensions attached to the object. Returns an empty list if the object has no dimensions.

Raises
  • KeyError – If the response from the server is missing expected keys.

  • RuntimeError – If the API call to fetch object details fails.

  • Dimensions are defined in the object’s configuration or added dynamically.

  • Each dimension can have one or more attributes that categorize its values.

  • To retrieve dimension objects (not just names), iterate and call session.get_dimension(object_name, dimension_name).

List all dimensions of an object:

>>> session.startup("MIMIC_IV")
>>> patients = session.get_xobject("Patients")
>>> dimensions = patients.get_dimensions()
>>> print(dimensions)
['PatientID', 'Age', 'Gender', 'Ethnicity', 'DOB', 'AdmissionCount']

Explore dimension details:

>>> for dim_name in patients.get_dimensions():
...     print(f"Dimension: {dim_name}")
Dimension: PatientID
Dimension: Age
Dimension: Gender
...

Filter for specific dimensions:

>>> numeric_dims = [d for d in patients.get_dimensions()
...                 if d in ['Age', 'Weight', 'Height']]
>>> print(numeric_dims)
['Age']

See also
  • get_dimension – Retrieve a specific dimension by name

  • get_child_objects – Get child objects in the hierarchy

  • add_aggregation_dimension – Add a computed dimension

Return type

list

get_name()

Return the name of the XObject.

Return type

str

Methods:

Method                                                       Description
get_name()                                                   Return the name of the XObject
get_child_objects()                                          Get list of child object names
get_dimensions()                                             Get list of dimension names
get_dimension(dimension_name)                                Get a specific dimension by name
add_aggregation_dimension(dimension_name, aggregation, ...)  Add a computed aggregation dimension

xplain.Dimension

class xplain.Dimension(object_name, dimension_name, ref_session)

Bases: object

Represents a dimension within an Xplain object.

Dimensions are properties or measurements associated with objects in the Xplain data model. They can be numeric (e.g., Age, Temperature) or categorical (e.g., Gender, Diagnosis Code). Each dimension has one or more attributes that organize its values into hierarchies or categories.

Dimensions are the fundamental units of analysis in Xplain:

  • Aggregations compute statistics on dimensions (COUNT, AVG, SUM)

  • Attributes categorize dimension values for grouping

  • Selections filter data based on attribute states

Parameters
  • object_name (str) – Name of the parent object containing this dimension.

  • dimension_name (str) – Name of the dimension.

  • ref_session (Xsession) – Reference to the active session for API interactions.

Attributes
  • object_name (str) – Name of the associated object.

  • dimension_name (str) – Name of the dimension.

  • _ref_session (Xsession) – Reference to the session object for API interaction.

Raises

TypeError – If object_name or dimension_name are not strings.

  • Dimensions are accessed via session.get_dimension(object, dimension) or through XObject.get_dimensions().

  • Each dimension has at least one default attribute (often named the same as the dimension itself).

  • Hierarchical dimensions have multi-level attributes (e.g., Date → Year → Month → Day).

Get a dimension and explore its attributes:

>>> session.startup("MIMIC_IV")
>>> age_dim = session.get_dimension("Patients", "Age")
>>> print(age_dim.get_name())
Age
>>> attributes = age_dim.get_attributes()
>>> for attr in attributes:
...     print(attr.get_name())
Age
AgeGroup
AgeDecade

Access a specific attribute:

>>> age_group_attr = age_dim.get_attribute("AgeGroup")
>>> if age_group_attr:
...     levels = age_group_attr.get_levels()
...     print(levels)
['AgeGroup']

See also
  • XObject – Parent object containing dimensions

  • Attribute – Categorization of dimension values

  • Xsession.get_dimension – Retrieve a dimension by name

__init__(object_name, dimension_name, ref_session)

Initialize the Dimension instance.

Parameters
  • object_name (str) – Name of the object.

  • dimension_name (str) – Name of the dimension.

  • ref_session – Session object for API calls.

get_attribute(attribute_name)

Retrieve a specific attribute by name.

Searches the dimension’s attributes and returns the matching Attribute object if found. Useful for accessing hierarchical attributes or specific categorizations.

Parameters

attribute_name (str) – The name of the attribute to retrieve. Case-sensitive.

Returns

Attribute or None – The matching Attribute object, or None if no attribute with the given name exists in this dimension.

Raises
  • ValueError – If attribute_name is not a non-empty string.

  • KeyError – If the server response is missing expected keys.

  • RuntimeError – If the API call to fetch dimension details fails.

  • Returns None (not an exception) if the attribute doesn’t exist, allowing safe existence checks.

  • Attribute names are case-sensitive and must match exactly.

  • The default attribute typically has the same name as the dimension.

Check if an attribute exists:

>>> age_dim = session.get_dimension("Patients", "Age")
>>> age_group = age_dim.get_attribute("AgeGroup")
>>> if age_group:
...     print(f"Found attribute: {age_group.get_name()}")
... else:
...     print("Attribute not found")
Found attribute: AgeGroup

Get hierarchy levels:

>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> year_month = date_dim.get_attribute("YearMonth")
>>> if year_month:
...     levels = year_month.get_levels()
...     print(f"Hierarchy levels: {levels}")
Hierarchy levels: ['Year', 'Month']

Safe attribute access:

>>> gender_dim = session.get_dimension("Patients", "Gender")
>>> custom_attr = gender_dim.get_attribute("CustomGrouping")
>>> if custom_attr is None:
...     print("Custom attribute doesn't exist, using default")
...     custom_attr = gender_dim.get_attribute("Gender")
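
The fallback in the last example can be wrapped in a small helper. This is a sketch under one assumption taken from the notes above: that the default attribute shares the dimension's name, which is described as typical rather than guaranteed. attribute_or_default and StubDimension are hypothetical; the stub stands in for Dimension so the code runs without a server.

```python
def attribute_or_default(dimension, preferred_name):
    """Return the preferred attribute, else the one named like the dimension."""
    attribute = dimension.get_attribute(preferred_name)
    if attribute is None:
        # Assumption: the default attribute shares the dimension's name.
        attribute = dimension.get_attribute(dimension.get_name())
    return attribute


class StubDimension:
    """Stand-in for Dimension with get_name() and get_attribute()."""

    def __init__(self, name, attributes):
        self._name = name
        self._attributes = attributes  # maps attribute name -> attribute object

    def get_name(self):
        return self._name

    def get_attribute(self, attribute_name):
        # Mirrors the documented behaviour: None when the name is unknown.
        return self._attributes.get(attribute_name)


gender = StubDimension("Gender", {"Gender": "default-attribute"})
chosen = attribute_or_default(gender, "CustomGrouping")
```

The helper can still return None when neither name exists, so callers should keep the None check.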

See also
  • get_attributes – Retrieve all attributes of the dimension

  • Attribute.get_levels – Get hierarchy levels of an attribute

get_attributes()

Retrieve all attributes attached to this dimension.

Attributes organize dimension values into categories or hierarchies. A dimension typically has at least one default attribute, and may have additional custom or hierarchical attributes.

Returns

list of Attribute – List of Attribute objects representing all attributes of the dimension. Returns an empty list if the dimension has no attributes defined.

Raises
  • KeyError – If the server response is missing expected keys.

  • RuntimeError – If the API call to fetch dimension details fails.

  • Each returned Attribute is a fully instantiated object that can be used to explore hierarchy levels and states.

  • Attributes enable grouping in queries via add_groupby() and filtering via add_selection().

  • The default attribute usually has the same name as the dimension.

List all attributes of a dimension:

>>> session.startup("MIMIC_IV")
>>> gender_dim = session.get_dimension("Patients", "Gender")
>>> attributes = gender_dim.get_attributes()
>>> for attr in attributes:
...     print(attr.get_name())
Gender

Explore hierarchical attributes:

>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> attributes = date_dim.get_attributes()
>>> for attr in attributes:
...     print(f"{attr.get_name()}: {attr.get_levels()}")
AdmitDate: ['Date']
YearMonth: ['Year', 'Month']
Quarter: ['Year', 'Quarter']

Use attributes in queries:

>>> age_dim = session.get_dimension("Patients", "Age")
>>> age_group = age_dim.get_attribute("AgeGroup")
>>> query = Query_config()
>>> query.add_groupby(
...     object_name="Patients",
...     dimension_name="Age",
...     attribute_name="AgeGroup"  # From get_attributes()
... )

See also
  • get_attribute – Retrieve a specific attribute by name

  • Attribute – Detailed attribute information and hierarchy navigation

Return type

list

get_name()

Returns the dimension name.

Return type

str

Methods:

Method                         Description
get_name()                     Return the dimension name
get_attributes()               Get list of Attribute instances
get_attribute(attribute_name)  Get a specific Attribute by name

xplain.Attribute

class xplain.Attribute(object_name, dimension_name, attribute_name, ref_session)

Bases: object

Represents an attribute that categorizes dimension values.

Attributes organize dimension values into hierarchies or categories, enabling grouping and filtering in queries. For example, a continuous “Age” dimension might have an “AgeGroup” attribute with categories like “0-18”, “18-65”, “65+”.

Attributes can be:

  • Flat: Single-level categorization (e.g., Gender: Male/Female)

  • Hierarchical: Multi-level trees (e.g., Date → Year → Month → Day)

  • Derived: Computed from dimension values (e.g., age bins, quantiles)

Parameters
  • object_name (str) – Name of the object containing the dimension.

  • dimension_name (str) – Name of the dimension containing this attribute.

  • attribute_name (str) – Name of the attribute.

  • ref_session (Xsession) – Reference to the active session for API interactions.

Attributes
  • object_name (str) – The parent object name.

  • dimension_name (str) – The parent dimension name.

  • attribute_name (str) – The attribute name.

  • _ref_session (Xsession) – Session reference for API calls.

  • Attributes are accessed via Dimension.get_attribute(name) or Dimension.get_attributes().

  • Hierarchical attributes enable drill-down analysis (expand/collapse).

  • Attribute states (values) can be explored via get_state_hierarchy().

Get an attribute and explore its hierarchy:

>>> session.startup("MIMIC_IV")
>>> age_dim = session.get_dimension("Patients", "Age")
>>> age_group = age_dim.get_attribute("AgeGroup")
>>> print(age_group.get_name())
AgeGroup
>>> print(age_group.get_levels())
['AgeGroup']

Hierarchical attribute (Date):

>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> year_month = date_dim.get_attribute("YearMonth")
>>> levels = year_month.get_levels()
>>> print(levels)
['Year', 'Month']

Get state hierarchy:

>>> hierarchy = age_group.get_state_hierarchy()
>>> print(hierarchy)
{'stateName': 'All', 'children': [{'stateName': '0-18'}, {'stateName': '18-65'}, ...]}

See also:

  • Dimension: Parent dimension containing attributes

  • Dimension.get_attribute: Retrieve an attribute by name

get_levels()

Retrieve the hierarchy level names of this attribute.

For hierarchical attributes, this returns the ordered list of levels from coarsest to finest granularity. For flat attributes, returns a single-element list.

Returns

list of str – Ordered list of hierarchy level names. For example:

  • Flat attribute: ["Gender"]

  • Hierarchical: ["Year", "Quarter", "Month", "Week"]

Raises

ValueError – If the attribute information cannot be retrieved or if ‘hierarchyLevelNames’ is missing from the response.

  • The first level is the coarsest (e.g., “Year”), the last is finest (e.g., “Day”).

  • Level names are used in Query_config.add_groupby() to specify which hierarchy level to group by.

  • For non-hierarchical attributes, the list contains only the attribute name itself.

Flat attribute (single level):

>>> session.startup("MIMIC_IV")
>>> gender_dim = session.get_dimension("Patients", "Gender")
>>> gender_attr = gender_dim.get_attribute("Gender")
>>> print(gender_attr.get_levels())
['Gender']

Hierarchical attribute (date/time):

>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> year_month = date_dim.get_attribute("YearMonth")
>>> print(year_month.get_levels())
['Year', 'Month']

ICD diagnosis hierarchy:

>>> icd_dim = session.get_dimension("Diagnoses", "ICD_Code")
>>> icd_hierarchy = icd_dim.get_attribute("ICD_Hierarchy")
>>> print(icd_hierarchy.get_levels())
['Chapter', 'Block', 'Category', 'Code']

Use levels in queries:

>>> query = Query_config()
>>> query.add_groupby(
...     object_name="Admissions",
...     dimension_name="AdmitDate",
...     attribute_name="YearMonth",
...     groupby_level="Year"  # Group by year only
... )
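
Because a misspelled level name is easy to pass by accident, it can help to check the name against the list returned by get_levels() before building the group-by. The helper below is a minimal sketch: resolve_level is a hypothetical name, the levels list is a stand-in for a real get_levels() result, and whether the returned position matches groupby_level_number is an assumption that this reference does not confirm.

```python
def resolve_level(levels, level_name):
    """Return (level_name, position) after checking the name against known levels."""
    if level_name not in levels:
        raise ValueError(
            f"Unknown level {level_name!r}; available levels: {levels}"
        )
    # Position in the coarsest-to-finest list (0 = coarsest level).
    return level_name, levels.index(level_name)


# Stand-in for the result of Attribute.get_levels(); values are illustrative.
levels = ["Chapter", "Block", "Category", "Code"]

name, position = resolve_level(levels, "Category")
print(name, position)
```

The validated name can then be passed as groupby_level to Query_config.add_groupby().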

See also:

  • get_state_hierarchy: Explore the actual values (states) in the hierarchy

  • Query_config.add_groupby: Use hierarchy levels in queries

get_name()

Retrieve the name of the attribute.

Returns

Attribute name as a string.

Return type

str

get_root_state()

Get the root state (top-level category) of this attribute.

The root state represents the most aggregated level in the attribute hierarchy, typically named “All” or representing the total population.

Returns

str – The name of the root state (e.g., “All”, “Total”, or a custom name).

  • The root state encompasses all child states in the hierarchy.

  • For flat attributes, the root may be the only state or a summary category.

  • This is useful for understanding the top-level category before drilling down.

Get root state:

>>> session.startup("MIMIC_IV")
>>> gender_dim = session.get_dimension("Patients", "Gender")
>>> gender_attr = gender_dim.get_attribute("Gender")
>>> root = gender_attr.get_root_state()
>>> print(root)
All

Hierarchical attribute root:

>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> year_month = date_dim.get_attribute("YearMonth")
>>> root = year_month.get_root_state()
>>> print(root)
All

See also:

  • get_state_hierarchy: Get the full state hierarchy

  • get_levels: Get hierarchy level names

get_state_hierarchy(state=None, levels=None)

Retrieve the state hierarchy showing all values and their structure.

Returns a tree structure of attribute states (values), showing parent-child relationships for hierarchical attributes. Useful for exploring available categories and understanding the attribute’s organization.

Parameters
  • state (str, optional) – Specific state to retrieve the sub-hierarchy for. If None, returns the full hierarchy starting from the root.

  • levels (list of str, optional) – Specific hierarchy levels to include. If None, returns all levels.

Returns

dict – Nested dictionary representing the state hierarchy. Structure:

{
    "stateName": "Root",
    "children": [
        {"stateName": "Child1", "children": [...]},
        {"stateName": "Child2", "children": [...]}
    ]
}

  • For flat attributes, the hierarchy is shallow with no nested children.

  • For hierarchical attributes, children represent progressively finer granularities.

  • States are the actual categorical values used in selections and groupings.

  • This method delegates to Xsession.get_state_hierarchy().

Get full hierarchy for a flat attribute:

>>> session.startup("MIMIC_IV")
>>> gender_dim = session.get_dimension("Patients", "Gender")
>>> gender_attr = gender_dim.get_attribute("Gender")
>>> hierarchy = gender_attr.get_state_hierarchy()
>>> print(hierarchy)
{
    'stateName': 'All',
    'children': [
        {'stateName': 'Male'},
        {'stateName': 'Female'},
        {'stateName': 'Unknown'}
    ]
}

Hierarchical attribute (date by year/month):

>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> year_month = date_dim.get_attribute("YearMonth")
>>> hierarchy = year_month.get_state_hierarchy()
>>> print(hierarchy)
{
    'stateName': 'All',
    'children': [
        {
            'stateName': '2020',
            'children': [
                {'stateName': '2020-01'},
                {'stateName': '2020-02'},
                ...
            ]
        },
        ...
    ]
}

Get sub-hierarchy for specific state:

>>> sub_hierarchy = year_month.get_state_hierarchy(state='2020')
>>> print(sub_hierarchy)
{
    'stateName': '2020',
    'children': [
        {'stateName': '2020-01'},
        {'stateName': '2020-02'},
        ...
    ]
}
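
Since the returned hierarchy is a plain nested dictionary, it can be walked with ordinary recursion. The following is a minimal sketch, assuming only the stateName/children structure documented above; iter_states is a hypothetical helper and the dictionary is illustrative stand-in data, not server output.

```python
def iter_states(node, depth=0):
    """Yield (depth, state_name) pairs for a state-hierarchy dict, parents first."""
    yield depth, node["stateName"]
    # Leaf states may omit the "children" key, so default to an empty list.
    for child in node.get("children", []):
        yield from iter_states(child, depth + 1)


# Stand-in for the dict returned by get_state_hierarchy(); values are illustrative.
hierarchy = {
    "stateName": "All",
    "children": [
        {
            "stateName": "2020",
            "children": [{"stateName": "2020-01"}, {"stateName": "2020-02"}],
        },
        {"stateName": "2021", "children": [{"stateName": "2021-01"}]},
    ],
}

states = list(iter_states(hierarchy))
# States two levels below the root (the months in this example).
leaf_names = [name for depth, name in states if depth == 2]

for depth, name in states:
    print("  " * depth + name)
```

A flattened list like leaf_names is a convenient source of candidate values for selected_states.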

See also:

  • get_levels: Get hierarchy level names

  • get_root_state: Get the root state name

  • Xsession.get_state_hierarchy: Underlying implementation

Methods:

Method                                          Description
get_name()                                      Return the attribute name
get_levels()                                    Get hierarchy level names
get_state_hierarchy(state=None, levels=None)    Get the hierarchical state structure
get_root_state()                                Get the root state name

xplain.Query_config

class xplain.Query_config(name=None)

Bases: object

Builder for constructing Xplain analytical query configurations.

Provides a fluent API for building complex queries with aggregations, group-bys, and selections. Queries are executed via Xsession.execute_query().

The builder pattern allows chaining method calls to incrementally construct queries. Each query consists of three main components:

  1. Aggregations: Compute summary statistics (COUNT, SUM, AVG, etc.)

  2. Group-bys: Organize results by attribute categories

  3. Selections: Filter data to specific attribute states

Attributes

request (dict) – The internal query configuration containing aggregations, groupBys, and selections in JSON-serializable format.

  • The add_aggregation(), add_groupby(), and add_selection() methods return self to enable method chaining.

  • Queries are identified by a unique requestName, auto-generated if not provided.

  • The query configuration can be serialized to JSON via to_json().

  • For simpler queries, consider using Xsession.open_attribute() or the fluent QueryBuilder API via Xsession.query_builder().
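
To illustrate why returning self enables chaining, here is a minimal toy builder. It is a sketch, not the xplain implementation: MiniQueryConfig is a hypothetical class, and the field names inside each aggregation and group-by entry are assumptions. Only the top-level keys (requestName, aggregations, groupBys, selections) are taken from the description above.

```python
import uuid


class MiniQueryConfig:
    """Toy stand-in for a fluent query builder; NOT the xplain implementation."""

    def __init__(self, name=None):
        # Auto-generate a request name when none is given.
        self.request = {
            "requestName": name or str(uuid.uuid4()),
            "aggregations": [],
            "groupBys": [],
            "selections": [],
        }

    def add_aggregation(self, object_name, dimension_name, type):
        # The entry field names below are hypothetical.
        self.request["aggregations"].append(
            {"object": object_name, "dimension": dimension_name, "type": type}
        )
        return self  # returning self is what makes chaining possible

    def add_groupby(self, attribute_name, object_name=None, dimension_name=None):
        self.request["groupBys"].append(
            {
                "object": object_name,
                "dimension": dimension_name,
                "attribute": attribute_name,
            }
        )
        return self

    def to_json(self):
        return self.request


query = (
    MiniQueryConfig(name="patient_stats")
    .add_aggregation("Patients", "Age", "AVG")
    .add_groupby("Gender", object_name="Patients", dimension_name="Gender")
)
print(query.to_json())
```

Each call mutates the same request dictionary and hands the builder back, so the whole query can be assembled in one expression.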

Basic aggregation with grouping:

>>> from xplain import Query_config
>>> query = Query_config()
>>> query.add_aggregation(
...     object_name="Patients",
...     dimension_name="Age",
...     type="AVG"
... )
>>> query.add_groupby(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender"
... )
>>> df = session.execute_query(query)

Multiple aggregations:

>>> query = Query_config(name="patient_stats")
>>> query.add_aggregation(object_name="Patients", dimension_name="Age", type="AVG") \
...      .add_aggregation(object_name="Patients", dimension_name="Weight", type="AVG") \
...      .add_groupby(object_name="Patients", dimension_name="Gender", attribute_name="Gender")
>>> results = session.execute_query(query)

With selections (filtering):

>>> query = Query_config()
>>> query.add_aggregation(object_name="Admissions", dimension_name="LOS", type="AVG") \
...      .add_groupby(object_name="Admissions", dimension_name="AdmissionType", attribute_name="Type") \
...      .add_selection(
...          object_name="Patients",
...          dimension_name="Age",
...          attribute_name="AgeGroup",
...          selected_states=["65-75", "75-85", ">85"]
...      )
>>> elderly_los = session.execute_query(query)

MIMIC-IV analysis - ICU mortality by diagnosis:

>>> query = Query_config(name="icu_mortality_by_diagnosis")
>>> query.add_aggregation(object_name="Patients", dimension_name="Mortality", type="AVG") \
...      .add_groupby(object_name="Diagnoses", dimension_name="ICD_Code", attribute_name="Category") \
...      .add_selection(
...          object_name="Admissions",
...          dimension_name="ICU_Stay",
...          attribute_name="ICU_Flag",
...          selected_states=["Yes"]
...      )
>>> df = session.execute_query(query)

See also:

  • Xsession.execute_query: Execute the constructed query

  • Xsession.query_builder: Alternative fluent API via QueryBuilder

  • Xsession.open_attribute: Convenience method for simple attribute counts

__init__(name=None)

Initialize the Query_config instance with a default or provided name.

Parameters

name (str, optional) – The name or identifier for the query. Defaults to a UUID.

add_aggregation(object_name, dimension_name, type, aggregation_name=None)

Add an aggregation to compute summary statistics on a dimension.

Aggregations define what to calculate from the data. Multiple aggregations can be added to a single query, and each produces a column in the result.

Parameters
  • object_name (str) – Name of the object containing the dimension to aggregate.

  • dimension_name (str) – Name of the dimension to compute the aggregation on.

  • type (str) – Aggregation type. Supported values:

      • "COUNT": Count of non-null values

      • "COUNTDISTINCT": Count of unique values

      • "COUNTENTITY": Count of entities

      • "SUM": Sum of numeric values

      • "AVG": Average (mean) of numeric values

      • "MIN": Minimum value

      • "MAX": Maximum value

      • "VAR": Variance

      • "STDEV": Standard deviation

      • "QUANTILE": Quantile (requires additional config)

  • aggregation_name (str, optional) – Custom name for the aggregation column in results. If not provided, the server auto-generates a name (e.g., "COUNT_DimensionName").

Returns

Query_config – self, to enable method chaining.

Raises

ValueError – If required parameters are missing or if type is not a valid aggregation type.

  • Multiple aggregations on the same or different dimensions are allowed.

  • Aggregation results appear as columns in the returned DataFrame.

  • For COUNT operations, the dimension value itself doesn’t matter; it counts the number of entities with that dimension defined.
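
An invalid type is reported as a ValueError, so client code that builds queries from user input may want to check the value early. The sketch below is one way to do that, assuming the supported values are exactly those listed above; check_aggregation_type is a hypothetical helper, not part of the xplain package, and it deliberately does no case normalization.

```python
# Supported aggregation types as listed in this reference.
SUPPORTED_AGGREGATIONS = frozenset(
    {
        "COUNT", "COUNTDISTINCT", "COUNTENTITY", "SUM", "AVG",
        "MIN", "MAX", "VAR", "STDEV", "QUANTILE",
    }
)


def check_aggregation_type(agg_type):
    """Return agg_type unchanged if it is supported, else raise ValueError."""
    if agg_type not in SUPPORTED_AGGREGATIONS:
        raise ValueError(
            f"Unsupported aggregation type {agg_type!r}; "
            f"expected one of {sorted(SUPPORTED_AGGREGATIONS)}"
        )
    return agg_type


checked = check_aggregation_type("AVG")
print(checked)
```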

Count patients:

>>> query = Query_config()
>>> query.add_aggregation(
...     object_name="Patients",
...     dimension_name="PatientID",
...     type="COUNT"
... )

Average age with custom name:

>>> query.add_aggregation(
...     object_name="Patients",
...     dimension_name="Age",
...     type="AVG",
...     aggregation_name="AvgAge"
... )

Multiple aggregations (chained):

>>> query = Query_config()
>>> query.add_aggregation(object_name="LabEvents", dimension_name="Creatinine", type="AVG") \
...      .add_aggregation(object_name="LabEvents", dimension_name="Creatinine", type="MIN") \
...      .add_aggregation(object_name="LabEvents", dimension_name="Creatinine", type="MAX")

MIMIC-IV: Vital signs statistics:

>>> query = Query_config()
>>> query.add_aggregation(object_name="VitalSigns", dimension_name="HeartRate", type="AVG", aggregation_name="AvgHR") \
...      .add_aggregation(object_name="VitalSigns", dimension_name="BloodPressureSystolic", type="AVG", aggregation_name="AvgSBP") \
...      .add_groupby(object_name="Patients", dimension_name="Age", attribute_name="AgeGroup")

See also:

  • add_groupby: Add grouping to organize aggregated results

  • add_selection: Filter data before aggregation

add_groupby(attribute_name, object_name=None, dimension_name=None, groupby_level=None, groupby_level_number=None, groupby_states=None)

Add a group-by to organize aggregation results by attribute categories.

Group-bys partition the data into categories based on attribute values, creating separate rows in the result for each category. Multiple group-bys create nested hierarchies.

Parameters
  • attribute_name (str) – Name of the attribute to group by. For hierarchical attributes, this groups by the first level unless groupby_level is specified.

  • object_name (str, optional) – Name of the object containing the dimension. If omitted, the builder attempts auto-resolution (experimental).

  • dimension_name (str, optional) – Name of the dimension containing the attribute. If omitted, the builder attempts auto-resolution (experimental).

  • groupby_level (str, optional) – Specific level name in a hierarchical attribute to group by. Overrides the default first-level grouping.

  • groupby_level_number (int, optional) – Numeric level in the attribute hierarchy to group by (0-indexed). Alternative to groupby_level for hierarchical attributes.

  • groupby_states (list, optional) – Specific attribute states to include in the grouping. If provided, only these states will appear in results. Currently unused by the implementation.

Returns

Query_config – self, to enable method chaining.

Raises
  • ValueError – If attribute_name is not provided or if groupby_level_number is not an integer.

  • RuntimeError – If the group-by specification cannot be constructed.

  • Group-bys are applied in the order they are added, creating nested hierarchies.

  • Each group-by creates a new dimension in the result structure.

  • For non-hierarchical attributes, groupby_level and groupby_level_number are ignored.

  • Auto-resolution of object/dimension names is experimental and may not work in all cases; explicitly providing them is recommended.
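
The effect of stacking group-bys can be pictured with plain Python on a few rows. This is only a sketch of the concept, using made-up records and a hypothetical average_by helper; it does not reflect how the Xplain server computes results or how the returned DataFrame is laid out.

```python
from collections import defaultdict

# Illustrative stand-in records; not real data.
records = [
    {"Gender": "Female", "AgeGroup": "18-65", "LOS": 4.0},
    {"Gender": "Female", "AgeGroup": "18-65", "LOS": 6.0},
    {"Gender": "Female", "AgeGroup": "65+", "LOS": 9.0},
    {"Gender": "Male", "AgeGroup": "18-65", "LOS": 3.0},
]


def average_by(rows, group_keys, value_key):
    """Average value_key for every combination of the group_keys, in order."""
    buckets = defaultdict(list)
    for row in rows:
        # The tuple key preserves the order in which the group-bys were given.
        buckets[tuple(row[key] for key in group_keys)].append(row[value_key])
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


avg_los = average_by(records, ["Gender", "AgeGroup"], "LOS")
for (gender, age_group), value in sorted(avg_los.items()):
    print(gender, age_group, value)
```

Each added group-by multiplies the number of result cells, one per combination of categories that actually occurs in the data.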

Group by gender:

>>> query = Query_config()
>>> query.add_aggregation(object_name="Patients", dimension_name="Age", type="AVG")
>>> query.add_groupby(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender"
... )

Multiple group-bys (nested):

>>> query = Query_config()
>>> query.add_aggregation(object_name="Admissions", dimension_name="LOS", type="AVG")
>>> query.add_groupby(object_name="Patients", dimension_name="Gender", attribute_name="Gender")
>>> query.add_groupby(object_name="Patients", dimension_name="Age", attribute_name="AgeGroup")
>>> # Results are grouped first by Gender, then by AgeGroup within each gender

Hierarchical attribute grouping:

>>> query = Query_config()
>>> query.add_aggregation(object_name="Diagnoses", dimension_name="ICD_Code", type="COUNT")
>>> query.add_groupby(
...     object_name="Diagnoses",
...     dimension_name="ICD_Code",
...     attribute_name="Hierarchy",
...     groupby_level="Chapter"  # Group by ICD chapter level
... )

MIMIC-IV: Admissions by type and age group:

>>> query = Query_config(name="admissions_analysis")
>>> query.add_aggregation(object_name="Admissions", dimension_name="AdmissionID", type="COUNT") \
...      .add_groupby(object_name="Admissions", dimension_name="AdmissionType", attribute_name="Type") \
...      .add_groupby(object_name="Patients", dimension_name="Age", attribute_name="AgeGroup")

See also:

  • add_aggregation: Define what statistics to compute

  • add_selection: Filter data before grouping

add_selection(attribute_name, object_name=None, dimension_name=None, selected_states=None)

Add a selection (filter) to restrict query results to specific attribute states.

Selections filter the dataset before aggregations and group-bys are applied, effectively creating a cohort or subset of data. Multiple selections act as AND conditions, narrowing the dataset further.

Parameters
  • attribute_name (str) – Name of the attribute to filter on.

  • object_name (str, optional) – Name of the object containing the dimension. If omitted, the builder attempts auto-resolution (experimental).

  • dimension_name (str, optional) – Name of the dimension containing the attribute. If omitted, the builder attempts auto-resolution (experimental).

  • selected_states (list of str, optional) – List of specific attribute states (values) to include. Only entities with these attribute values will be included in the query results. If None, all states are selected (effectively no filter).

Returns

Query_config – self, to enable method chaining.

Raises
  • ValueError – If attribute_name is not provided.

  • RuntimeError – If the selection specification cannot be constructed.

  • Selections are applied before aggregations and groupings.

  • Multiple selections create AND conditions (all must be satisfied).

  • An empty selected_states list means no filtering (all states included).

  • For date/time selections, states often correspond to time periods or formatted date strings.

  • Auto-resolution of object/dimension names is experimental; explicitly providing them is recommended for production code.
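
The filtering semantics described in these notes can be sketched on in-memory rows: a row must satisfy every selection (AND), any of the listed states within one selection is accepted, and None or an empty list means no filtering for that attribute. The apply_selections helper and the rows are hypothetical illustrations, not the server-side logic.

```python
def apply_selections(rows, selections):
    """Keep rows that satisfy every selection.

    selections maps an attribute name to a list of accepted states;
    None or an empty list means "no filter" for that attribute.
    """
    kept = []
    for row in rows:
        if all(
            not states or row[attribute] in states
            for attribute, states in selections.items()
        ):
            kept.append(row)
    return kept


# Illustrative stand-in rows; not real data.
rows = [
    {"Gender": "Male", "Category": "Circulatory"},
    {"Gender": "Male", "Category": "Respiratory"},
    {"Gender": "Female", "Category": "Circulatory"},
]

cohort = apply_selections(
    rows, {"Gender": ["Male"], "Category": ["Circulatory"]}
)
unfiltered = apply_selections(rows, {"Gender": None})
print(len(cohort), len(unfiltered))
```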

Filter by gender:

>>> query = Query_config()
>>> query.add_aggregation(object_name="Patients", dimension_name="Age", type="AVG")
>>> query.add_selection(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender",
...     selected_states=["Female"]
... )

Filter by multiple age groups:

>>> query = Query_config()
>>> query.add_aggregation(object_name="Admissions", dimension_name="LOS", type="AVG")
>>> query.add_selection(
...     object_name="Patients",
...     dimension_name="Age",
...     attribute_name="AgeGroup",
...     selected_states=["65-75", "75-85", ">85"]
... )

Multiple filters (AND condition):

>>> query = Query_config()
>>> query.add_aggregation(object_name="Patients", dimension_name="PatientID", type="COUNT")
>>> query.add_selection(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender",
...     selected_states=["Male"]
... )
>>> query.add_selection(
...     object_name="Diagnoses",
...     dimension_name="ICD_Code",
...     attribute_name="Category",
...     selected_states=["Circulatory"]
... )
>>> # Results: Male patients with circulatory diagnoses

MIMIC-IV: ICU patients with high severity:

>>> query = Query_config(name="high_severity_icu")
>>> query.add_aggregation(object_name="Patients", dimension_name="Mortality", type="AVG") \
...      .add_selection(
...          object_name="Admissions",
...          dimension_name="ICU_Stay",
...          attribute_name="ICU_Flag",
...          selected_states=["Yes"]
...      ) \
...      .add_selection(
...          object_name="Severity",
...          dimension_name="SOFA_Score",
...          attribute_name="ScoreCategory",
...          selected_states=["High", "Very High"]
...      )

Time-based selection:

>>> query = Query_config()
>>> query.add_aggregation(object_name="Admissions", dimension_name="AdmissionID", type="COUNT")
>>> query.add_selection(
...     object_name="Admissions",
...     dimension_name="AdmitDate",
...     attribute_name="YearMonth",
...     selected_states=["2020-01", "2020-02", "2020-03"]
... )

See also:

  • add_aggregation: Define statistics to compute on filtered data

  • add_groupby: Organize filtered results by categories

set_name(request_name)

Assign a specific name or ID to the query.

Parameters

request_name (str) – The name or ID to be assigned.

Raises

ValueError – If the request_name is not a valid string.

to_json()

Return the configuration of this query as a JSON-serializable dictionary.

Returns

The query configuration.

Return type

dict

Methods:

Method                                                                       Description
set_name(request_name)                                                       Set the query name/ID
add_aggregation(object_name, dimension_name, type, aggregation_name=None)    Add an aggregation (SUM, AVG, COUNT, etc.)
add_groupby(attribute_name, object_name, dimension_name, ...)                Add a group-by specification
add_selection(attribute_name, object_name, dimension_name, selected_states)  Add a selection (filter)
to_json()                                                                    Return the query configuration as a dictionary

Aggregation Types:

  • SUM - Sum of values

  • AVG - Average

  • COUNT - Count of records

  • COUNTDISTINCT - Count of distinct values

  • COUNTENTITY - Count of entities

  • MAX - Maximum value

  • MIN - Minimum value

  • VAR - Variance

  • STDEV - Standard deviation

  • QUANTILE - Quantile

Example:

from xplain import Query_config

query = Query_config(name="my_query")
query.add_aggregation(
    object_name="Sales",
    dimension_name="Revenue",
    type="SUM"
)
query.add_groupby(
    object_name="Sales",
    dimension_name="Product",
    attribute_name="Category"
)
query.add_selection(
    object_name="Sales",
    dimension_name="Date",
    attribute_name="Year",
    selected_states=["2024"]
)

df = session.execute_query(query)

xplain.QueryBuilder

class xplain.QueryBuilder(session, name=None)

Bases: object

Fluent builder for Xplain queries.

Obtained via Xsession.query_builder(name=...). Chain calls to aggregate(), groupby(), and selection(), then finalise with execute() (one-shot result) or open() (live result that updates when session selections change).

When only attribute is supplied to groupby() or selection(), the builder searches the loaded session object tree and resolves the matching object and dimension automatically. If the attribute name is ambiguous (found in more than one place), a ValueError is raised listing all candidates.

Example:

df = (
    session.query_builder(name="lab_counts")
    .aggregate(object="Lab Events", type="COUNT")
    .groupby(attribute="TestType")
    .selection(attribute="Date", selected_states=["2024-01", "2024-02"])
    .execute()
)

Parameters

name (str, optional) – Name of the query, as passed to Xsession.query_builder(name=...).

aggregate(object, type, dimension=None, name=None)

Add an aggregation measure to the query.

Parameters
  • object (str) – Name of the Xobject to aggregate over.

  • type (str) – Aggregation function. One of SUM, AVG, COUNT, COUNTDISTINCT, MAX, MIN, COUNTENTITY, VAR, STDEV, QUANTILE.

  • dimension (str) – Dimension within the object. May be omitted for entity-level aggregations such as COUNT / COUNTENTITY.

  • name (str) – Optional display name for the resulting measure column.

Returns

self – for method chaining.

Raises

ValueError – If type is not a recognised aggregation function.

execute(data_frame=True)

Execute the query and return results immediately.

Equivalent to calling Xsession.execute_query with the built request. The query is not kept open – subsequent session changes (e.g. selection changes) will not affect the returned result.

Parameters

data_frame (bool) – When True (default) the result is returned as a pandas.DataFrame; otherwise raw JSON is returned.

Returns

DataFrame or JSON depending on data_frame.

groupby(attribute, object=None, dimension=None, groupby_level_name=None, groupby_level_number=None)

Add a group-by dimension to the query.

When object or dimension are omitted, the builder searches the session object tree for an attribute whose name matches attribute. If exactly one match is found, its object/dimension are used automatically. If more than one match exists, a ValueError is raised listing all candidates so you can disambiguate.

Parameters
  • attribute (str) – Attribute name to group by.

  • object (str) – Xobject name. Resolved automatically when omitted.

  • dimension (str) – Dimension name. Resolved automatically when omitted.

  • groupby_level_name (str) – Named level within a hierarchical dimension.

  • groupby_level_number (int) – Numeric level within a hierarchical dimension.

Returns

self – for method chaining.

Raises

ValueError – If attribute is not provided, not found in the session tree, or is ambiguous.

open(data_frame=True)

Open the query and return a live result.

Equivalent to calling Xsession.open_query. The query stays active inside the session so that further selection changes automatically update its result.

Parameters

data_frame (bool) – When True (default) the result is returned as a pandas.DataFrame; otherwise raw JSON is returned.

Returns

DataFrame or JSON depending on data_frame.

selection(attribute, object=None, dimension=None, selected_states=None)

Add a selection filter to the query.

Like groupby(), object and dimension are resolved automatically from the session tree when omitted.

Parameters
  • attribute (str) – Attribute name to filter on.

  • object (str) – Xobject name. Resolved automatically when omitted.

  • dimension (str) – Dimension name. Resolved automatically when omitted.

  • selected_states (list) – List of state values to keep. None means no state filtering (all states).

Returns

self – for method chaining.

Raises

ValueError – If attribute is not provided, not found, or ambiguous.

Obtained via Xsession.query_builder(). Every method (except the terminal execute() / open()) returns self so calls can be chained.

Builder Methods:

Method

Description

aggregate(object, type, dimension=None, name=None)

Add an aggregation measure. dimension may be omitted for entity-level types such as COUNT / COUNTENTITY.

groupby(attribute, object=None, dimension=None, groupby_level_name=None, groupby_level_number=None)

Add a group-by dimension. object and dimension are resolved automatically from the session tree when omitted.

selection(attribute, object=None, dimension=None, selected_states=None)

Add a selection filter. Auto-resolves object/dimension like groupby.

execute(data_frame=True)

Run the query and return results immediately (one-shot).

open(data_frame=True)

Run the query and keep it alive so results update with session selection changes.

Auto-resolution of object and dimension

When object or dimension is omitted from groupby / selection, QueryBuilder walks the session object tree and finds every (object, dimension) pair that contains the named attribute:

  • Unique match — used automatically, no action required.

  • No match — raises ValueError with the attribute name.

  • Multiple matches — raises ValueError listing all candidates so you can disambiguate by passing object and/or dimension explicitly.
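
The three outcomes above can be sketched as a small lookup over a nested mapping. This is an illustration under stated assumptions: the tree layout (object name to dimension name to list of attribute names) and the resolve_attribute helper are hypothetical, and the real session object tree and error messages may differ.

```python
def resolve_attribute(tree, attribute, object_name=None, dimension_name=None):
    """Return the single (object, dimension) pair that contains attribute.

    tree is assumed to look like {object: {dimension: [attribute, ...]}}.
    """
    candidates = [
        (obj, dim)
        for obj, dims in tree.items()
        for dim, attrs in dims.items()
        if attribute in attrs
        and (object_name is None or obj == object_name)
        and (dimension_name is None or dim == dimension_name)
    ]
    if not candidates:
        raise ValueError(f"Attribute {attribute!r} not found")
    if len(candidates) > 1:
        raise ValueError(
            f"Attribute {attribute!r} is ambiguous; candidates: {candidates}"
        )
    return candidates[0]


# Illustrative stand-in for a session object tree.
tree = {
    "Lab Events": {"Date": ["Date"], "Test": ["TestType"]},
    "Orders": {"Date": ["Date"]},
}

unique = resolve_attribute(tree, "TestType")
disambiguated = resolve_attribute(tree, "Date", object_name="Lab Events")
print(unique, disambiguated)
```

Passing object_name or dimension_name narrows the candidate list, which mirrors how an ambiguous attribute is disambiguated in the builder calls.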

Aggregation Types:

  • SUM - Sum of values

  • AVG - Average

  • COUNT - Count of records

  • COUNTDISTINCT - Count of distinct values

  • COUNTENTITY - Count of entities

  • MAX - Maximum value

  • MIN - Minimum value

  • VAR - Variance

  • STDEV - Standard deviation

  • QUANTILE - Quantile

Examples:

# Minimal: attribute auto-resolved from the session tree
df = (
    session.query_builder(name="lab_counts")
    .aggregate(object="Lab Events", type="COUNT")
    .groupby(attribute="TestType")
    .execute()
)

# With selection filter and hierarchical level
df = (
    session.query_builder()
    .aggregate(object="Orders", type="SUM", dimension="Revenue")
    .groupby(attribute="Month", groupby_level_name="Month")
    .selection(attribute="Year", selected_states=["2024"])
    .execute()
)

# Live query — updates when session selections change
df = (
    session.query_builder(name="live_revenue")
    .aggregate(object="Orders", type="SUM", dimension="Revenue")
    .groupby(attribute="Category")
    .open()
)

# Disambiguation: attribute "Date" exists in multiple objects
df = (
    session.query_builder()
    .aggregate(object="Lab Events", type="COUNT")
    .groupby(attribute="Date", object="Lab Events", dimension="Date")
    .execute()
)