Core API Reference

This section documents the core classes of the Xplain Python package.

xplain.Xsession

class xplain.Xsession(url='http://localhost:8080', user='user', password='xplainData', httpsession=None, http_session_id=None, jwt_dispatch_url=None, jwt_cookie_name=None, jwt_token=None)

Bases: object

Xplain session manager for data analytics operations.

The primary interface for interacting with an Xplain server. Each instance represents an authenticated session that can load data models, execute queries, perform statistical analyses, and manage data transformations.

Xsession provides comprehensive functionality for:

Session Management: Connect, authenticate, load configurations
Data Querying: Execute aggregations, group-bys, selections
Object Navigation: Explore hierarchical data structures
Statistical Modeling: Run regressions, build predictive models
Data Import/Export: Load from databases, export results
Visualization: Generate collapsible trees and data views

The class supports multiple authentication methods (credentials, JWT, session reuse) and can be used standalone or in multi-session scenarios for parallel analysis.

__url__ (str): Base URL of the connected Xplain server
__id__ (str): Unique 32-character session identifier
__xplain_session__ (dict): Current session state and metadata
__requests_session__ (requests.Session): Underlying HTTP session for server communication

Each Xsession instance maintains independent state and can connect to different servers or use different credentials.
Sessions persist on the server until explicitly terminated or until server timeout expires.
For production use, always call terminate() when done or use the context manager pattern to ensure proper cleanup.

Basic usage:

>>> from xplain import Xsession, Query_config
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>>
>>> query = Query_config()
>>> query.add_aggregation(object_name="Patients", dimension_name="Age", type="AVG")
>>> df = session.execute_query(query)
>>> print(df)
>>> session.terminate()

Context manager pattern (recommended):

>>> with Xsession(url="http://localhost:8080", user="admin", password="admin") as session:
...     session.startup("Analysis")
...     df = session.open_attribute("Patients", "Gender", "Gender")
...     print(df)

Multi-session analysis:

>>> session1 = Xsession(url="http://server1:8080", user="admin", password="pass1")
>>> session2 = Xsession(url="http://server2:8080", user="admin", password="pass2")
>>> session1.startup("Dataset_A")
>>> session2.startup("Dataset_B")
>>> # Compare results from different servers
>>> df1 = session1.execute_query(query)
>>> df2 = session2.execute_query(query)

See also:

XplainSession: Unified API with namespaced methods (alternative interface)
XplainClient: Low-level client for direct Web API calls
Query_config: Builder for constructing analytical queries

__init__(url='http://localhost:8080', user='user', password='xplainData', httpsession=None, http_session_id=None, jwt_dispatch_url=None, jwt_cookie_name=None, jwt_token=None)

Create a new Xplain session for data analytics operations.

Establishes an authenticated connection to an Xplain server instance. Supports multiple authentication methods: standard credentials, JWT tokens, or session reuse via existing HTTP session IDs.

url (str, default='http://localhost:8080'): The base URL of the Xplain server (including protocol and port). Can also be set via environment variable xplain_url or global variable xplain_url.
user (str, default='user'): Username for authentication. Required unless using JWT or session ID.
password (str, default='xplainData'): Password for authentication. Required unless using JWT or session ID.
httpsession (requests.Session, optional): Existing Python requests Session object to reuse. Useful for sharing session state across multiple Xplain connections.
http_session_id (str, optional): Existing HTTP session ID (JSESSIONID) to reuse an active session. Must be a valid 32-character session identifier.
jwt_dispatch_url (str, optional): URL endpoint for JWT-based authentication. Required for JWT auth.
jwt_cookie_name (str, optional): Cookie name containing the JWT token. Required for JWT auth.
jwt_token (str, optional): JWT token string for authentication. Required for JWT auth.

RuntimeError: If the URL is not provided via any method (argument, environment, or globals).
HTTPError: If HTTP-level errors occur during authentication.
ConnectionError: If network connection to the server fails.
Timeout: If the server does not respond within the timeout period.
ValueError: If an invalid session ID format is provided.

Authentication is attempted in the following order:

1. Credential-based (user/password)
2. JWT-based (if JWT parameters provided)
3. Session ID reuse (if http_session_id provided)

The session can be loaded from an existing session ID via the environment variable xplain_session_id.
SSL certificate verification is disabled by default, which suits development environments. For production, enable it through the verify setting of the underlying requests session.

Basic authentication:

>>> from xplain import Xsession
>>> session = Xsession(
...     url="http://myhost:8080",
...     user="analyst",
...     password="secret123"
... )
>>> session.startup("PatientCohort")

JWT authentication:

>>> session = Xsession(
...     url="https://secure.xplain.com",
...     jwt_dispatch_url="https://auth.example.com/dispatch",
...     jwt_cookie_name="auth_token",
...     jwt_token="eyJhbGciOi..."
... )

Reuse existing session:

>>> session = Xsession(
...     url="http://myhost:8080",
...     http_session_id="A1B2C3D4E5F6G7H8I9J0K1L2M3N4O5P6"
... )
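
Because http_session_id is documented as a 32-character identifier, a client-side check before constructing the session can surface typos early. The following is a minimal sketch and not part of the xplain package: looks_like_session_id is a hypothetical helper, and the assumption that IDs consist only of ASCII letters and digits is inferred from the example ID above, not from a server specification.

```python
import re

# Assumption: a session ID is exactly 32 ASCII letters or digits.
_SESSION_ID = re.compile(r"[A-Za-z0-9]{32}")


def looks_like_session_id(candidate):
    """Return True if candidate has the documented 32-character shape."""
    return isinstance(candidate, str) and _SESSION_ID.fullmatch(candidate) is not None


print(looks_like_session_id("A1B2C3D4E5F6G7H8I9J0K1L2M3N4O5P6"))  # True
print(looks_like_session_id("too-short"))                         # False
```

A check like this only catches malformed input; whether the ID refers to a live session can only be decided by the server.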

Environment-based URL configuration:

>>> import os
>>> os.environ['xplain_url'] = 'http://production:8080'
>>> session = Xsession(user="admin", password="prod_pass")

See also:

startup: Load a startup configuration file
startup_from_xview_config: Load session from XView configuration
load_from_session_id: Load session by existing session ID
terminate: Close the session and logout

add_quantile_based_attributes(object_name=None, dimension=None, quantiles=None, quantiles_attribute_name=None, ranges_attribute_name=None, use_names_as_postfix=False, selections=None, sample_size=None, script_file=None, script_file_ownership='PUBLIC')

Generate quantile-based attributes for FLOAT/DOUBLE dimensions.

For each target dimension the server computes two optional attributes:

Quantiles attribute (quantiles_attribute_name): bins whose boundaries are the actual data quantiles — non-equidistant, but each bin contains roughly the same number of records.
Ranges attribute (ranges_attribute_name): equidistant bins spanning [min-quantile .. max-quantile] — equal width, unequal population.

If neither name is supplied, both are created with default names "Quantiles" and "Ranges". If exactly one name is None that attribute is skipped entirely.
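
To make the difference between the two attribute kinds concrete, the sketch below computes both sets of bin boundaries locally with Python's statistics module. It only illustrates the concept: the server's exact quantile algorithm is not documented here, and quantile_edges and equidistant_edges are hypothetical helper names.

```python
import statistics


def quantile_edges(values, fractions):
    """Boundaries at the given quantile fractions (multiples of 0.01)."""
    # 99 cut points are returned; cuts[k - 1] is the k-th percentile.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return [cuts[round(f * 100) - 1] for f in fractions]


def equidistant_edges(low, high, bins):
    """Boundaries of `bins` equal-width bins spanning [low, high]."""
    width = (high - low) / bins
    return [low + i * width for i in range(bins + 1)]


# Skewed sample: most values are small, a few are large.
values = [1, 1, 2, 2, 3, 3, 4, 5, 8, 13, 21, 34, 55, 89]
q = quantile_edges(values, [0.1, 0.5, 0.9])
r = equidistant_edges(q[0], q[-1], 2)
print("quantile edges:   ", q)  # unequal widths, similar record counts
print("equidistant edges:", r)  # equal widths between min and max quantile
```

On skewed data the quantile boundaries crowd together where the records are dense, while the equidistant boundaries stay evenly spaced between the smallest and largest quantile.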

Parameters

object_name (str, optional) – Name of the XObject. When given without dimension, attributes are created for all FLOAT/DOUBLE dimensions of that object.
dimension (dict, optional) – {"object": "...", "dimension": "..."} dict targeting a single dimension. object_name may be omitted when this is provided.
quantiles (list of float, optional) – List of quantile fractions, e.g. [0.1, 0.25, 0.5, 0.75, 0.9]. Every value must be strictly between 0 and 1, and the list must have at least 2 entries. When omitted the server uses all 1 % steps (0.01 … 0.99).
quantiles_attribute_name (str or None) – Name for the quantile-bins attribute. Pass None (while providing ranges_attribute_name) to skip.
ranges_attribute_name (str or None) – Name for the equidistant-ranges attribute. Pass None (while providing quantiles_attribute_name) to skip.
use_names_as_postfix (bool) – When True the dimension name is prepended to each attribute name (e.g. "Torque - Quantiles").
selections (dict, optional) – Active session selections to scope the quantile computation, e.g. {"object": "...", "attribute": "...", "dimension": "...", "selectedStates": [...]}.
sample_size (int, optional) – If provided, quantiles are computed on a random sample. The value is in permille units (1–999): 10 means a 1 % sample, 50 means a 5 % sample, 100 means a 10 % sample. Must be strictly between 0 and 1000.
script_file (dict, optional) – Server-side file path to persist the generated addNumberRangesAttribute calls as a re-runnable .xscript. Example: {"ownership": "PUBLIC", "filePath": ["quantiles.xscript"]}
script_file_ownership (str) – Ownership for script_file when only a plain string filename is given (default: "PUBLIC").

Raises

ValueError – If neither object_name nor dimension is given.
RuntimeError – On server-side errors.
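
The constraints on quantiles and sample_size described above can be sketched as a small pre-flight check. This is an illustration only: check_quantile_arguments is a hypothetical helper, and whether the client validates these values locally or leaves it to the server is not stated in this reference.

```python
def check_quantile_arguments(quantiles=None, sample_size=None):
    """Validate the arguments; return the sample size in percent (or None)."""
    if quantiles is not None:
        if len(quantiles) < 2:
            raise ValueError("quantiles needs at least 2 entries")
        if any(not 0 < q < 1 for q in quantiles):
            raise ValueError("every quantile must be strictly between 0 and 1")
    if sample_size is None:
        return None
    if not 0 < sample_size < 1000:
        raise ValueError("sample_size must be strictly between 0 and 1000")
    return sample_size / 10  # permille -> percent


print(check_quantile_arguments([0.25, 0.5, 0.75], sample_size=50))  # 5.0
```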

Example — all numeric dims of one object, save re-runnable script:

xsession.add_quantile_based_attributes(
    object_name="screwing station",
    use_names_as_postfix=True,
    script_file={"ownership": "PUBLIC",
                 "filePath": ["screwing_station_quantiles.xscript"]},
)

Example — single dimension with explicit quantiles:

xsession.add_quantile_based_attributes(
    dimension={"object": "screwing station",
               "dimension": "Torque max"},
    quantiles=[0.03, 0.1, 0.25, 0.5, 0.75, 0.9, 0.97],
    quantiles_attribute_name="Torque max Quantiles",
    ranges_attribute_name=None,
)

build_formula(response, predictors)

Dynamically build an R-style formula for Patsy.

Parameters

response (str) – The dependent variable.
predictors (list) – A list of predictor variable names.

Returns

The constructed formula in R-style syntax.

Return type

str
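
As an illustration of the kind of string this method returns, an R-style formula can be assembled as shown below. This is a sketch under the assumption that predictors are joined with "+"; build_formula_sketch is a hypothetical stand-in, and the package's own build_formula may treat edge cases differently.

```python
def build_formula_sketch(response, predictors):
    """Join predictor names into an R-style formula string."""
    return f"{response} ~ {' + '.join(predictors)}"


print(build_formula_sketch("mortality", ["age", "gender"]))
# mortality ~ age + gender
```

Column names that are not valid Python identifiers (for example names containing spaces) generally need Patsy's Q("...") quoting before they can appear in a formula.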

build_predictive_model(model_name, xmodel_configuration_file_name, target_event_object): Build a predictive model (beta).

build_tree_data(json_object): Convert complex JSON structure into a format suitable for D3.js tree visualization. This recursively parses the JSON, building a nested dictionary format compatible with D3.js.

collapsible_tree()

Generate and visualize a collapsible tree using hierarchical data.

This function builds a tree structure based on the current focus object, processes it into a source-target DataFrame suitable for visualization, and then uses pyecharts to render the tree directly in Jupyter.

Example

session.collapsible_tree()

Parameters: None
Returns: The function directly renders the visualization in the notebook.
Return type: None

convert_to_dataframe(data)

Convert query result JSON to pandas DataFrame format.

Transforms nested JSON result structures from Xplain queries into a flat pandas DataFrame suitable for analysis. Handles hierarchical data by extracting leaf node values.

data (dict): Query result in JSON format with 'fields' and 'children' keys. Expected structure:

{
    "fields": ["Attribute1", "Attribute2", "Count"],
    "children": [
        {"data": [{"field1": value1}, {"field2": value2}]},
        ...
    ]
}

pandas.DataFrame: Tabular data with columns corresponding to the ‘fields’ list. Each row represents a leaf node from the hierarchical result.

KeyError: If expected keys (‘fields’ or ‘children’) are missing from data.
TypeError: If data contains invalid types that cannot be converted.

Nested hierarchies are flattened; only leaf nodes contribute rows.
Dict values in result data are unwrapped (e.g., {"value": 123} becomes 123).
Missing values are filled with None.
This method is called automatically by execute_query() when data_frame=True.
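
The flattening rules in these notes can be sketched without pandas as a recursive walk that collects one row per leaf node. This is an approximation of the described behaviour, not the package's implementation: flatten_leaves is a hypothetical helper, and the assumption that a leaf is a node with "data" and no children is inferred from the expected structure shown above.

```python
def flatten_leaves(node, fields):
    """Collect one row (dict keyed by field name) per leaf node."""
    rows = []
    children = node.get("children") or []
    if not children and "data" in node:
        row = dict.fromkeys(fields)  # missing values stay None
        for cell in node["data"]:
            for key, value in cell.items():
                # Unwrap {"value": 123} style cells to the bare value.
                if isinstance(value, dict) and "value" in value:
                    value = value["value"]
                if key in row:
                    row[key] = value
        rows.append(row)
    for child in children:
        rows.extend(flatten_leaves(child, fields))
    return rows


result = {
    "fields": ["Category", "Count"],
    "children": [
        {"data": [{"Category": "A"}, {"Count": {"value": 100}}]},
        {"data": [{"Category": "B"}]},
    ],
}
rows = flatten_leaves(result, result["fields"])
print(rows)
```

The resulting list of row dicts could then be passed to pandas.DataFrame(rows, columns=result["fields"]) to obtain a table.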

Convert query results:

>>> result_json = session.perform({"method": "getResult", "requestName": "my_query"})
>>> df = session.convert_to_dataframe(result_json)
>>> print(df.head())

Manual conversion of custom result:

>>> custom_data = {
...     "fields": ["Category", "Count"],
...     "children": [
...         {"data": [{"Category": "A"}, {"Count": 100}]},
...         {"data": [{"Category": "B"}, {"Count": 200}]}
...     ]
... }
>>> df = session.convert_to_dataframe(custom_data)

See also:

execute_query: Execute query and return DataFrame directly
get_result: Retrieve query result (optionally as DataFrame)

count_attribute(attribute_name, object_name=None, dimension_name=None, request_name=None, data_frame=True)

Convenience method to count an attribute. If the object and dimension are not provided, they are resolved automatically by searching the object structure.

Parameters

attribute_name (string) – name of attribute (required)
object_name (string) – name of object (optional, auto-resolved if omitted)
dimension_name (string) – name of dimension (optional, auto-resolved if omitted)
request_name (string) – ID or name of the request
data_frame (boolean) – if True, the result is returned as a pandas DataFrame

Returns

Counts grouped by the first level of the attribute.

Return type

data frame or json

Raises

ValueError – if attribute_name is ambiguous (exists in multiple locations)

Example

>>> import xplain
>>> session = xplain.Xsession(url="http://myhost:8080", user="myUser",
...                           password="myPwd")
>>> session.startup("mystartup")
>>> # Simple case - just provide attribute name
>>> session.count_attribute("Agegroup")
>>> # Explicit case - provide all three
>>> session.count_attribute("Type", object_name="Hospital Diagnose",
...                         dimension_name="Diagnose")

create_contingency_table(df, var1, var2)

Create a contingency table (frequency table) for two variables.

Parameters

df (pd.DataFrame) – The data frame containing the variables.
var1 (str) – Name of the first variable (row).
var2 (str) – Name of the second variable (column).

Returns

A contingency table.

Return type

pd.DataFrame

download_result(filename, save_as)

Download a file from the server's result directory and save it to the current local path.

Parameters

filename (string) – file name in the result directory
save_as (string) – local file name under which the download is saved

download_selections(objects, selection_set=None)

Return the selections as JSON for the given objects and selection set.

Parameters

objects (list of strings) – list of object names
selection_set (string) – the selection set name

execute_query(query, data_frame=True)

Execute an analytical query and return results.

Runs a query specification against the current session’s data and returns aggregated, grouped, or filtered results. Queries can be specified using the Query_config builder or as raw JSON dictionaries.

query (Query_config or dict): Query specification containing aggregations, group-bys, and selections. Can be:
- A Query_config object (recommended for type safety)
- A dictionary with query structure in JSON format
data_frame (bool, default=True): If True, return results as a pandas DataFrame. If False, return raw JSON structure.

pandas.DataFrame or dict: Query results in the requested format:
- DataFrame: Columns correspond to grouped attributes and aggregated values
- dict: Nested JSON structure with full hierarchy information

ValueError: If the query object is invalid or missing required fields.
RuntimeError: If query execution fails on the server or results cannot be retrieved.
AttributeError: If a Query_config object lacks the required to_json() method.

If no requestName is provided in the query, a unique identifier is auto-generated using format query_<8-char-uuid>.
The query remains available for inspection until explicitly deleted.
For large result sets, consider using get_result() separately to control result retrieval timing.
Aggregation types supported: COUNT, SUM, AVG, MIN, MAX, DISTINCT, etc.
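
The auto-generated name format query_<8-char-uuid> mentioned above can be reproduced with the standard uuid module. This is a sketch of one way to produce such names, for example to set requestName explicitly and reuse it with get_result(); make_request_name is a hypothetical helper, and the exact scheme used inside the package may differ.

```python
import uuid


def make_request_name(prefix="query"):
    """Return a name like 'query_1a2b3c4d' (prefix plus 8 hex characters)."""
    return f"{prefix}_{uuid.uuid4().hex[:8]}"


name = make_request_name()
print(name)
```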

Using Query_config (recommended):

>>> from xplain import Xsession, Query_config
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>>
>>> # Count diagnoses grouped by type
>>> query = Query_config()
>>> query.add_aggregation(
...     object_name="Diagnoses",
...     dimension_name="ICD_Code",
...     type="COUNT"
... )
>>> query.add_groupby(
...     object_name="Diagnoses",
...     dimension_name="ICD_Code",
...     attribute_name="Category"
... )
>>> df = session.execute_query(query)
>>> print(df.head())
   Category  COUNT_ICD_Code
0  Circulatory      15234
1  Respiratory      12456
2  Injury           8901

Filter by selection:

>>> query = Query_config()
>>> query.add_aggregation(
...     object_name="LabEvents",
...     dimension_name="Creatinine",
...     type="AVG"
... )
>>> query.add_groupby(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender"
... )
>>> query.add_selection(
...     object_name="Patients",
...     dimension_name="Age",
...     attribute_name="AgeGroup",
...     selected_states=["65-75", "75-85", ">85"]
... )
>>> elderly_creatinine = session.execute_query(query)

Using raw JSON (alternative):

>>> query_json = {
...     "aggregations": [{
...         "object": "Admissions",
...         "dimension": "LOS",
...         "type": "AVG"
...     }],
...     "groupBys": [{
...         "attribute": {
...             "object": "Admissions",
...             "dimension": "AdmissionType",
...             "attribute": "Type"
...         }
...     }],
...     "requestName": "avg_los_by_type"
... }
>>> df = session.execute_query(query_json)

Return raw JSON instead of DataFrame:

>>> result_json = session.execute_query(query, data_frame=False)
>>> print(result_json.keys())
dict_keys(['fields', 'children', 'requestName', 'status'])

See also:

Query_config: Builder class for constructing queries
QueryBuilder: Fluent API for building queries with method chaining
query: Start building a query using the fluent API
get_result: Retrieve results from a named query
open_attribute: Convenience method to open and count an attribute

gen_xtable(data, xtable_config, file_name)

get(params=None)

Send a GET request to the /xplainsession endpoint.

Parameters: params – Optional URL parameters.
Returns: API response.

get_attribute_info(object_name, dimension_name, attribute_name)

Find and retrieve the details of an attribute.

Parameters

object_name – the name of xobject
dimension_name – the name of dimension
attribute_name – the name of attribute

Returns

details of this attribute in json format

get_current_xplain_session(): Get the current xplain session instance.

get_dimension_info(object_name, dimension_name)

Find and retrieve the details of a dimension.

Parameters

object_name – the name of the xobject
dimension_name – the name of dimension

Returns

details of this dimension in json format

get_full_object_structure()

Returns a flat list of all objects with their parent, dimensions, and attributes.

Each entry contains:
- object: object name
- parent: parent object name (None for root)
- dimensions: list of {"name", "attributes"} dicts

The flat structure makes it easy to search, filter, and read without recursive traversal.
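
As a sketch of such a search, the snippet below looks up which object and dimension hold a given attribute and raises when the name is ambiguous. The sample data is invented, find_attribute and resolve_unique are hypothetical helpers, and the code assumes that each "attributes" entry is a plain list of attribute names, which this reference does not spell out.

```python
def find_attribute(structure, attribute_name):
    """Return every (object, dimension) pair that contains the attribute."""
    matches = []
    for entry in structure:
        for dim in entry.get("dimensions", []):
            if attribute_name in dim.get("attributes", []):
                matches.append((entry["object"], dim["name"]))
    return matches


def resolve_unique(structure, attribute_name):
    """Return the single matching location or raise ValueError."""
    matches = find_attribute(structure, attribute_name)
    if len(matches) != 1:
        raise ValueError(f"{attribute_name!r} matched {len(matches)} locations")
    return matches[0]


# Invented sample in the documented shape (assumption: attributes are names).
structure = [
    {"object": "Patients", "parent": None,
     "dimensions": [{"name": "Age", "attributes": ["AgeGroup"]}]},
    {"object": "Admissions", "parent": "Patients",
     "dimensions": [{"name": "AdmissionType", "attributes": ["Type"]}]},
    {"object": "Diagnoses", "parent": "Admissions",
     "dimensions": [{"name": "ICD_Code", "attributes": ["Category", "Type"]}]},
]
print(resolve_unique(structure, "AgeGroup"))  # ('Patients', 'Age')
print(find_attribute(structure, "Type"))      # two locations, so ambiguous
```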

get_importer(): Get the importer instance for managing database connections and imports.

get_independent_variables_of_model(model_name)

get the list of independent variables of given predictive model

Parameters: model_name (string) – name of predictive model
Returns: list of independent variables with details
Return type: array of dict

get_instance_as_dataframe(elements)

Get a pandas DataFrame representation of the Xplain artifacts referenced by elements, equivalent to the standard CSV download functionality in XOE.

Parameters: elements (list) – array of x-element paths, each one referring to an Xplain artifact: an object, a dimension or an attribute.
Returns: DataFrame representation of the requested instance
Return type: pd.DataFrame

Example:

elements = [
    {"object": "Person"},
    {"object": "Diagnosis", "dimension": "Physician"},
    {"object": "Prescription", "dimension": "Rx Code",
     "attribute": "ATC Hierarchy"},
    {"object": "Prescription", "dimension": "Rx Code",
     "attribute": "Substance"},
]

get_model_names()

list all loaded predictive models

Returns: list of model names
Return type: array of string

get_object_info(object_name, root=None)

Find and display the details of an xobject in JSON.

Parameters

object_name – the name of the xobject
root – the object name from which the search starts. If no root is provided, the search starts at the root node of the entire object tree.

Returns

details of the Xobject in json

get_open_sequences(sequence_name): Retrieves details of open sequences by name.

get_queries()

get the list of the existing query ids

Returns: list of query ids
Return type: array of string

get_result(query_name, data_frame=True)

Get the result of the query.

Parameters: query_name (string) – the name/ID of the query
Returns: DataFrame result of the query
Return type: pd.DataFrame or json

get_root_object()

[Beta] Retrieve the root object.

Returns: The root object.
Return type: Xobject
Raises: KeyError – If ‘focusObject’ or ‘objectName’ is missing from the session.

get_selections()

display all global selections in the current xplain session

Returns: selections as json
Return type: list of json

get_sequence_transition_matrix(sequence_name)

Retrieves the transition matrix for the specified sequence.

Parameters: sequence_name – Name of the sequence.
Returns: Transition matrix as a dictionary with labels, sources, targets, and values.
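
A dictionary with labels, sources, targets, and values can be turned into a dense square matrix for inspection. The sketch below rests on an assumption that is not confirmed by this reference: sources and targets are taken to be integer indices into labels, with values holding the matching transition weights. The sample data is invented and to_dense_matrix is a hypothetical helper.

```python
def to_dense_matrix(transitions):
    """Build a labels x labels matrix from an edge-list style dictionary."""
    labels = transitions["labels"]
    size = len(labels)
    matrix = [[0] * size for _ in range(size)]
    edges = zip(transitions["sources"], transitions["targets"], transitions["values"])
    for source, target, value in edges:
        # Assumption: source and target are positions in `labels`.
        matrix[source][target] += value
    return labels, matrix


sample = {
    "labels": ["Admission", "ICU", "Discharge"],
    "sources": [0, 0, 1],
    "targets": [1, 2, 2],
    "values": [40, 60, 35],
}
labels, matrix = to_dense_matrix(sample)
for label, row in zip(labels, matrix):
    print(label, row)
```

If the server returns state names instead of indices in sources and targets, the lookup would need a name-to-position mapping first.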

get_session()

get_session_id()

Get the current Xplain session identifier.

Returns the unique 32-character session ID assigned by the server when the session was created or loaded.

str: The 32-character alphanumeric session identifier (e.g., "A1B2C3D4E5F6G7H8I9J0K1L2M3N4O5P6").

The session ID can be used to share or resume sessions across different clients using load_from_session_id().
Session IDs remain valid until explicitly terminated or until server timeout expires.
Can be set via environment variable xplain_session_id during initialization.

Get current session ID:

>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>> session_id = session.get_session_id()
>>> print(f"Current session: {session_id}")
Current session: A1B2C3D4E5F6G7H8I9J0K1L2M3N4O5P6

Share session with another client:

>>> # Client 1
>>> session1 = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session1.startup("Analysis")
>>> shared_id = session1.get_session_id()
>>>
>>> # Client 2 (reuses same session)
>>> session2 = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session2.load_from_session_id(shared_id)

See also:

load_from_session_id: Load session from an existing session ID
terminate: Close and invalidate the current session

get_state_hierarchy(object_name, dimension_name, attribute_name, state=None, levels=None, request_name=None)

Retrieve the hierarchical structure of states for a given attribute.

Parameters

object_name – Name of the object.
dimension_name – Name of the dimension.
attribute_name – Name of the attribute.
state – The name of a state in the attribute’s hierarchy. Optional.
levels – The number of hierarchy levels to return. Optional.
data_frame – Whether to return the result as a pandas DataFrame. Default is True.

Returns

Hierarchical structure of states.

Return type

dict or DataFrame

get_tree_details(object_name=None, dimension_name=None, attribute_name=None)

Get the metadata details of an Xplain object, dimension or attribute as JSON.

Parameters

object_name (string, optional) – the name of the object. If empty, the whole object tree from the root is shown. If only object_name is specified, the metadata of this object is returned.
dimension_name (string, optional) – the name of dimension, optional. If object_name and dimension_name are specified, returns the dimension metadata.
attribute_name (string, optional) – the name of attribute, optional. If object_name, dimension_name and attribute_name are specified, returns the attribute metadata.

Returns

object tree details

Return type

json

get_variable_details(model_name, data_frame=True)

Retrieve the details of the independent variables for a predictive model.

Parameters

model_name (str) – The name of the predictive model.
data_frame (bool) – Whether to return the result as a pandas DataFrame.

Returns

The model’s independent variables details as a DataFrame or JSON.

Return type

pd.DataFrame or dict

Raises

ValueError – If the predictive model or its variables are not found.

get_variable_list(model_name)

get the list of independent variables of given predictive model

Parameters: model_name (string) – name of predictive model
Returns: list of independent variables
Return type: array of string

get_xobject(object_name)

[Beta] Retrieve the object with the given name.

Parameters: object_name (str) – The name of the object to retrieve.
Returns: The object with the given name, or None if not found.
Return type: Xobject or None

http_get(entrypoint, params=None)

Performs an HTTP GET request to the specified endpoint.

Parameters

entrypoint – API endpoint relative to the base URL.
params – Query parameters for the GET request.

Returns

Parsed JSON response or raw content.

Raises

RuntimeError – If the GET request fails.

http_post(entrypoint, payload_json=None, data=None, files=None, params=None)

Performs an HTTP POST request to the specified endpoint.

Parameters

entrypoint – API endpoint relative to the base URL.
payload_json (dict) – Dictionary payload for the POST request
data (dict) – Form data for the POST request.
files (dict) –
params (dict) –

Returns

Parsed JSON response or raw content.

Raises

RuntimeError – If the POST request fails.

list_analyses(): List available xanalysis configurations

list_existing_analyses(): List available xanalysis configurations

list_files(ownership, file_type, file_extension=None)

Lists files with the specified ownership and type.

Parameters

ownership – Ownership type.
file_type – File type.
file_extension – Optional file extension.

Returns

List of files or raises exception on failure.

load_analysis(file_name): Load xanalysis

load_from_session_id(session_id)

Load an Xplain session from a given existing session ID.

Parameters: session_id (string) – the 32-character Xplain session ID

load_result_file_as_df(filename)

Load a file from the session as a pandas DataFrame.

Parameters: filename – Name of the file to load.
Returns: DataFrame containing file content.

open_attribute(object_name, dimension_name, attribute_name, request_name=None, data_frame=True)

Open an attribute and get counts grouped by its values.

Convenience method that creates a simple aggregation query counting entities grouped by the first level of the specified attribute hierarchy. Equivalent to a COUNT aggregation with a single GROUP BY.

object_name (str): Name of the object containing the dimension.
dimension_name (str): Name of the dimension containing the attribute.
attribute_name (str): Name of the attribute to open and count.
request_name (str, optional): Identifier for the query request. If None, a unique UUID is generated.
data_frame (bool, default=True): If True, return results as a pandas DataFrame. If False, return raw JSON structure.

pandas.DataFrame or dict: Counts of entities grouped by attribute values:
- DataFrame: Two columns (attribute value, count)
- dict: Nested JSON with full hierarchy

RuntimeError: If the attribute cannot be opened or does not exist in the structure.

This method is optimized for quick exploration of categorical attributes.
For multi-level hierarchies, only the first level is expanded by default.
To navigate deeper levels, use expand() or expand_to_level() methods.
The request remains available for further operations (expand, collapse, etc.).

Count patients by gender:

>>> from xplain import Xsession
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>>
>>> df = session.open_attribute(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender"
... )
>>> print(df)
  Gender  Count
0      M  25431
1      F  23892

Analyze admission types:

>>> admissions_df = session.open_attribute(
...     object_name="Admissions",
...     dimension_name="AdmissionType",
...     attribute_name="Type"
... )
>>> print(admissions_df)
       Type  Count
0  EMERGENCY  45123
1   ELECTIVE  12456
2    URGENT   8901

Explore ICD diagnosis categories:

>>> diagnoses = session.open_attribute(
...     object_name="Diagnoses",
...     dimension_name="ICD_Code",
...     attribute_name="Category",
...     request_name="diag_category_counts"
... )

Return raw JSON for custom processing:

>>> result = session.open_attribute(
...     object_name="LabEvents",
...     dimension_name="ItemID",
...     attribute_name="TestName",
...     data_frame=False
... )

See also:

count_attribute: Auto-resolve object/dimension and count attribute
execute_query: Execute full query with multiple aggregations
expand: Expand attribute hierarchy to show child nodes
expand_to_level: Expand hierarchy to a specific depth level

open_query(query, data_frame=True)

Perform the query and keep it open. The result of an open query is affected by later modifications of the current session, such as selection changes.

Parameters

query – either xplain.Query instance or JSON
data_frame – if True, the result will be returned as DataFrame

Returns

result of given query

Return type

JSON or DataFrame, depending on the data_frame parameter

open_sequence(target_object, base_object, ranks, reverse, names, name_postfixes, dimensions_2_replicate, sort_dimension, zero_point_dimension, selections, selection_set_definition_rank, floating_semantics, attribute_2_copy, sequence_name, rank_dimension_name, rank_zero_is_first_instance_equal_or_greater_zero_point, transition_attribute, transition_level, open_marginal_queries, open_transition_queries, selection_set)

perform(payload)

Send a POST request to the /xplainsession entry point with the payload as JSON.

Parameters: payload (json) – Xplain Web API method call in JSON format
Returns: request response
Return type: json

Example

>>> session.perform({"method": "deleteRequest",
...                  "requestName": "abcd"})

post(payload)

Send POST request against entry point /xplainsession with payload as json

Parameters: payload – xplain web api in json
Returns: request response as JSON

post_and_broadcast(payload)

Send a POST request and notify the backend of session updates.

Parameters: payload – JSON payload for the API request.

post_file_download(file_name, file_type, ownership='PUBLIC', team=None, user=None, delete_after_download=True)

Triggers the flat table download functionality in XOE.

Parameters

file_name – Name of the file to be downloaded.
file_type – Type of the file.
ownership – Ownership type, defaults to “PUBLIC”.
team – Team identifier, optional.
user – User identifier, optional.
delete_after_download – Whether to delete the file after download, defaults to True.

Returns

HTTP response object or raises exception on failure.

print_error(): Print the last error message.

print_last_stack_trace(): Print the stack trace of the last error.

query_builder(name=None)

Start building a query using the fluent QueryBuilder API.

This is an alternative to Query_config that lets you chain aggregate, groupby, and selection calls and then finalise with execute() or open().

When only attribute is supplied to groupby / selection, the builder searches the session object tree and resolves the matching object and dimension automatically.

Parameters: name (str, optional) – A label for the query used as its requestName. Defaults to a random UUID.
Returns: A new query builder bound to this session.
Return type: QueryBuilder

Example:

df = (
    session.query_builder(name="lab_counts")
    .aggregate(object="Lab Events", type="COUNT")
    .groupby(attribute="TestType")
    .selection(attribute="Date", selected_states=["2024-01"])
    .execute()
)

read_file(ownership, file_type, file_path)

Reads the specified file.

Parameters

ownership – Ownership type.
file_type – File type.
file_path – Path of the file.

Returns

File content or raises exception on failure.

refresh(): synchronize the session content with the backend

resume_analysis(file_name)

Resume a stored session.

Parameters: file_name (str) – Name of the stored session file
Returns: True on success, False on failure
Return type: Boolean
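
Because resume_analysis() reports failure through its return value, callers need to check the result themselves. The following sketch shows one possible fallback to startup(); StubSession is a hypothetical stand-in so the snippet runs without a server, and the file names are made up.

```python
def resume_or_startup(session, analysis_file, startup_file):
    """Try to resume a stored analysis; fall back to a fresh startup."""
    if session.resume_analysis(analysis_file):
        return "resumed"
    session.startup(startup_file)
    return "started"


class StubSession:
    """Hypothetical stand-in for Xsession; it only mimics the two calls used."""

    def __init__(self, stored):
        self.stored = set(stored)
        self.loaded = None

    def resume_analysis(self, file_name):
        found = file_name in self.stored
        if found:
            self.loaded = file_name
        return found

    def startup(self, startup_file):
        self.loaded = startup_file


stub = StubSession(stored={"cohort_review"})
print(resume_or_startup(stub, "cohort_review", "MIMIC_IV"))
print(resume_or_startup(stub, "missing_file", "MIMIC_IV"))
```

With a real Xsession the same helper would be called as resume_or_startup(session, ...), assuming startup() behaves as documented in this section.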

run(method)

Perform an Xplain Web API method and broadcast the change to other clients sharing the same session ID.

Parameters: method (json) – Xplain Web API method in JSON format

run_py(file_name, options, ownership)

Executes a Python script file on the server.

Parameters

file_name – Name of the Python file.
options – Execution options.
ownership – File ownership type.

Returns

Parsed JSON result or raw content.

Raises

RuntimeError – If the request fails.
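
Since run_py() may return either parsed JSON or raw content, downstream code often has to handle both shapes. The helper below is a sketch under the assumption that raw content arrives as str or bytes; the actual type is not stated in this reference.

```python
import json


def normalize_run_py_result(result):
    """Return parsed JSON when possible, otherwise the content as text."""
    if isinstance(result, (dict, list)):
        return result  # already parsed JSON
    if isinstance(result, bytes):
        # Assumption: raw bytes are UTF-8 encoded text.
        result = result.decode("utf-8", errors="replace")
    try:
        return json.loads(result)
    except (TypeError, ValueError):
        return result  # raw, non-JSON content is passed through unchanged


print(normalize_run_py_result(b'{"status": "ok"}'))
print(normalize_run_py_result("plain text output"))
```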

run_statsmodels(df, formula, model_type='logit')

Fit a statistical model to the provided dataframe using the specified formula and model type.

Parameters

df (pandas.DataFrame) – The input dataframe containing the data.
formula (str) – A Patsy-compatible formula specifying the dependent and independent variables.
model_type (str) – The type of model to fit. Supported options are ‘logit’, ‘probit’, ‘ols’, ‘mnlogit’, ‘glm’, ‘poisson’, ‘negative_binomial’. Default is ‘logit’.

Returns

statsmodels.regression.linear_model.OLSResults or statsmodels.discrete.discrete_model.LogitResults or other statsmodels result object depending on the model_type.

Raises

ValueError – If the model_type is unsupported or if the dependent variable is not appropriate for the chosen model (e.g., non-binary dependent variable for logit/probit).
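
The two ValueError conditions can be checked before any data is sent for fitting. This is a rough sketch, not the library's own validation: it treats "binary" as "exactly two distinct values", which is an assumption, and the column names in the formula are hypothetical.

```python
SUPPORTED_MODEL_TYPES = {
    "logit", "probit", "ols", "mnlogit", "glm", "poisson", "negative_binomial",
}
BINARY_MODEL_TYPES = {"logit", "probit"}


def check_model_inputs(model_type, dependent_values):
    """Raise ValueError for the two cases named in the Raises section."""
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise ValueError(f"unsupported model_type: {model_type!r}")
    distinct = set(dependent_values)
    # Assumption: "binary" means exactly two distinct values.
    if model_type in BINARY_MODEL_TYPES and len(distinct) != 2:
        raise ValueError(
            f"{model_type} needs a binary dependent variable, "
            f"found {len(distinct)} distinct values"
        )


# Patsy-style formula: dependent variable on the left, predictors on the right.
# C(...) marks a predictor as categorical. Column names are hypothetical.
formula = "died ~ age + C(gender)"

check_model_inputs("logit", [0, 1, 1, 0])  # passes silently
```

A call such as session.run_statsmodels(df, formula, model_type="logit") would then follow, with df holding the columns named in the formula.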

property session: Returns the underlying requests.Session object. This allows external code to reuse the authenticated session.

set_default_broadcast(broadcast)

Set the default broadcast behaviour so that other Xplain clients sharing the same Xplain session are informed when the current session is updated.

Parameters: broadcast (bool) – Whether a default refresh signal is broadcast to all Xplain clients sharing the same session after a successful session update via a Python call, forcing them to refresh.

show_tree()

Show the object tree.

Returns

The object hierarchy rendered as a text tree.

Return type

string

Raises

RuntimeError – if the session is not properly initialized.
Exception – if an unexpected error occurs.

show_tree_details(): Display the details of the object tree.

startup(startup_file)

Load an Xplain session from a startup configuration file.

Initializes the session’s object structure, dimensions, and default settings from a saved .xstartup configuration file. The file extension is optional and will be added automatically if not provided.

Parameters: startup_file (str) – Name of the startup configuration file. The .xstartup extension is optional and will be appended automatically if missing.

Raises: RuntimeError – If the startup file cannot be found or loaded, or if the file contains invalid configuration.

Startup files define the initial object tree structure, including:
- Objects and their hierarchies (parent-child relationships)
- Dimensions attached to each object
- Attributes within dimensions
- Default selections and filters
Loading a startup file replaces any existing session state.
After loading, the session is ready for query execution without additional configuration.

Load a MIMIC-IV patient cohort:

>>> from xplain import Xsession
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV_Patients")  # .xstartup extension added automatically

Load different configurations sequentially:

>>> session = Xsession(url="http://localhost:8080", user="researcher", password="pass")
>>> session.startup("ICU_Admissions.xstartup")
>>> # ... perform analysis ...
>>> session.startup("Lab_Events")  # Switch to different configuration

Check loaded structure:

>>> session.startup("MIMIC_Cohort")
>>> session.show_tree()  # Display the loaded object hierarchy

See also

startup_from_xview_config: Load session from an XView configuration object
show_tree: Display the current object structure
get_session: Get the current session information

startup_from_xview_config(xview_config)

Load an Xplain session from the given view configuration JSON.

Parameters: xview_config – The view configuration in JSON format.

store_xsession(response_json)

Store session details from the response.

Parameters: response_json – Response parsed as JSON.

terminate()

Terminate the Xplain session and logout from the server.

Closes the current session, invalidating the session ID and releasing server resources. After termination, the session cannot be reused and a new session must be created.

All pending queries and results are lost after termination.
The session ID becomes invalid and cannot be loaded again.
It is good practice to terminate sessions explicitly when done to free server resources, especially in long-running applications.
Automatic session cleanup occurs on server timeout if not explicitly terminated.

Basic session lifecycle:

>>> from xplain import Xsession
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>> # ... perform analysis ...
>>> session.terminate()

Using context manager (recommended):

>>> with Xsession(url="http://localhost:8080", user="admin", password="admin") as session:
...     session.startup("Analysis")
...     df = session.execute_query(query)  # query: a Query_config built beforehand
...     # Session automatically terminated when exiting context

Multiple sessions:

>>> session1 = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session2 = Xsession(url="http://other:8080", user="admin", password="admin")
>>> session1.startup("Dataset_A")
>>> session2.startup("Dataset_B")
>>> # ... work with both sessions ...
>>> session1.terminate()
>>> session2.terminate()

See also

__init__: Create a new session
get_session_id: Get the current session identifier

upload_data(file_name): upload the file from current local directory to data directory on server :param file_name: file :type file_name: string

upload_xmodel(model_or_path, filename=None, ownership='PUBLIC')

Upload an .xmodel configuration file to the server’s public model store.

The file is stored in the server directory resolved by file-type XMODEL_CONFIG (config/models/). Once uploaded it can be referenced by name in buildModel / crossValidateModel payloads:

{"method": "buildModel", "xmodelConfigurationFileName": "My_Model.xmodel", ...}

Parameters

model_or_path – Either an XModel instance or a local file-system path (str) to an existing .xmodel file.
filename (str) – Name to use on the server (e.g. "My_Model.xmodel"). Defaults to <model.name>.xmodel when an XModel is passed, or to the basename of the path otherwise. .xmodel is appended automatically if omitted.
ownership (str) – File-store ownership scope. One of "PUBLIC" (shared by all users, default), "TEAM", "USER", or "SYSTEM".

Raises

RuntimeError – if the HTTP upload fails or the server returns an error response.

Return type

None

Example:

from xplain.xmodel import XModel, IndependentVariableSet, AutoSpaceDefinition

model = XModel(
    name="Failure_Model",
    predictive_model_object="actuator",
    independent_variable_sets=[
        IndependentVariableSet(
            predictive_model_object="actuator",
            auto_space_definitions=[
                AutoSpaceDefinition("screwing station", ["Result"]),
            ],
        )
    ],
)
xsession.upload_xmodel(model)
# → uploaded as "Failure_Model.xmodel" to config/models/ (PUBLIC)
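
The defaulting rules for filename can be restated as a few lines of plain Python. This is a sketch of the documented behaviour, not the package's code; FakeModel is a hypothetical stand-in for an XModel with a name attribute, and the path is made up.

```python
import os


def resolve_xmodel_filename(model_or_path, filename=None):
    """Mirror the documented defaulting rules for the server-side file name."""
    if filename is None:
        if isinstance(model_or_path, str):
            filename = os.path.basename(model_or_path)  # local path given
        else:
            filename = model_or_path.name  # XModel-like object with a name
    if not filename.endswith(".xmodel"):
        filename += ".xmodel"  # extension appended when omitted
    return filename


class FakeModel:
    """Hypothetical stand-in for XModel; only the name attribute is used."""
    name = "Failure_Model"


print(resolve_xmodel_filename(FakeModel()))
print(resolve_xmodel_filename("models/My_Model.xmodel"))
print(resolve_xmodel_filename(FakeModel(), filename="Renamed"))
```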

validate_db(db_connection_config)

Validates a database connection configuration.

Parameters: db_connection_config – Dictionary containing DB connection settings.
Raises: RuntimeError – If validation fails or an error occurs.

Constructor Parameters:

Parameter	Type	Description
`url`	str	URL of the Xplain server (default: `http://localhost:8080`)
`user`	str	Username for authentication (default: `user`)
`password`	str	Password for authentication (default: `xplainData`)
`httpsession`	requests.Session	Existing requests session object (optional)
`http_session_id`	str	Xplain session ID (32-character hex string). Attaches to an existing session without re-authenticating. Same value as returned by `get_session_id()` or shown in XOE Settings → Session (optional).
`jwt_dispatch_url`	str	JWT authentication endpoint URL (optional)
`jwt_cookie_name`	str	Cookie name for JWT token (optional)
`jwt_token`	str	JWT token value (optional)
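
Before passing a value as http_session_id, it can be useful to confirm that it has the documented shape of a 32-character hex string. The check below is a sketch; accepting both upper- and lowercase hex digits is an assumption, and the helper name is hypothetical.

```python
import re

# 32 hexadecimal characters; case-insensitivity is an assumption.
SESSION_ID_PATTERN = re.compile(r"[0-9a-fA-F]{32}")


def looks_like_session_id(value):
    """Return True if value has the documented 32-character hex shape."""
    return isinstance(value, str) and SESSION_ID_PATTERN.fullmatch(value) is not None


print(looks_like_session_id("0123456789abcdef0123456789abcdef"))
print(looks_like_session_id("not-a-session-id"))
```

A shape check like this only catches malformed input; whether the session still exists is decided by the server.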

Note

Authentication Methods:

Password authentication: Provide user and password
JWT authentication: Provide all three: jwt_dispatch_url, jwt_cookie_name, jwt_token
Recommended: Use create_session() to load credentials from config file or environment variables

See Authentication & Credential Management for credential management best practices.
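
The rule that JWT authentication needs all three JWT parameters can be expressed as a small pre-flight check. This sketch is illustrative only: the function name is hypothetical, and giving JWT precedence over password credentials is an assumption, not documented behaviour.

```python
def pick_auth_method(user=None, password=None,
                     jwt_dispatch_url=None, jwt_cookie_name=None, jwt_token=None):
    """Decide which authentication method a set of arguments describes."""
    jwt_args = (jwt_dispatch_url, jwt_cookie_name, jwt_token)
    if all(jwt_args):
        return "jwt"  # assumption: JWT wins when both kinds are supplied
    if any(jwt_args):
        raise ValueError(
            "JWT authentication needs jwt_dispatch_url, jwt_cookie_name "
            "and jwt_token together"
        )
    if user and password:
        return "password"
    raise ValueError("provide user and password, or all three JWT parameters")


print(pick_auth_method(user="admin", password="admin"))
```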

Session Management Methods:

Method	Description
`startup(startup_file)`	Load a session from a startup configuration file
`startup_from_xview_config(xview_config)`	Load a session from an XView configuration
`load_from_session_id(session_id)`	Connect to an existing session by its 32-character ID
`get_session_id()`	Get the current session ID
`terminate()`	Terminate the session and logout
`refresh()`	Synchronize local session state with the server
`set_default_broadcast(broadcast)`	Enable/disable broadcasting updates to other clients

Query Methods:

Method	Description
`query(name=None)`	Start a fluent `QueryBuilder` chain (see below)
`execute_query(query, data_frame=True)`	Execute a query (Query_config or JSON) and return results
`open_query(query, data_frame=True)`	Execute a query and keep it open (results update with selections)
`open_attribute(object_name, dimension_name, attribute_name, ...)`	Open an attribute grouped by first level, aggregated by count
`get_result(query_name, data_frame=True)`	Get results of an existing query by name
`get_queries()`	List IDs of all open queries
`convert_to_dataframe(data)`	Convert JSON result to pandas DataFrame

Object Tree Methods:

Method	Description
`show_tree()`	Print the object hierarchy as a text tree
`show_tree_details()`	Display detailed object tree as JSON
`collapsible_tree()`	Render interactive tree in Jupyter using pyecharts
`get_tree_details(object_name, dimension_name, attribute_name)`	Get metadata for object/dimension/attribute
`get_object_info(object_name)`	Get detailed JSON info for an object
`get_dimension_info(object_name, dimension_name)`	Get detailed JSON info for a dimension
`get_attribute_info(object_name, dimension_name, attribute_name)`	Get detailed JSON info for an attribute
`get_root_object()`	Get the root XObject instance [Beta]
`get_xobject(object_name)`	Get an XObject by name [Beta]
`get_full_object_structure()`	Get nested dict of all objects, dimensions, and attributes

Selection Methods:

Method	Description
`get_selections()`	Get all global selections in the session
`download_selections(objects, selection_set=None)`	Download selections for specific objects
`get_state_hierarchy(object_name, dimension_name, attribute_name, ...)`	Get the hierarchical state structure for an attribute

Data Export Methods:

Method	Description
`get_instance_as_dataframe(elements)`	Export instance data as a pandas DataFrame (CSV download)
`download_result(filename, save_as)`	Download a file from the server result directory
`upload_data(file_name)`	Upload a local file to the server data directory

Predictive Modeling Methods:

Method	Description
`build_predictive_model(model_name, config_file, target_object)`	Build a predictive model [Beta]
`get_model_names()`	List all loaded predictive models
`get_variable_list(model_name)`	Get independent variable names for a model
`get_independent_variables_of_model(model_name)`	Get detailed independent variable info
`get_variable_details(model_name, data_frame=True)`	Get variable details as DataFrame or JSON

Statistical Modeling Methods:

Method	Description
`run_statsmodels(df, formula, model_type="logit")`	Fit a statistical model (logit, probit, ols, mnlogit, glm, poisson, negative_binomial)
`build_formula(response, predictors)`	Build an R-style formula string
`create_contingency_table(df, var1, var2)`	Create a cross-tabulation table

File Management Methods:

Method	Description
`list_files(ownership, file_type, file_extension=None)`	List files of a given type and ownership
`read_file(ownership, file_type, file_path)`	Read a file from the server
`run_py(file_name, options, ownership)`	Execute a Python script on the server
`list_analyses()`	List available xanalysis configurations
`load_analysis(file_name)`	Load an xanalysis (startup + saved state)
`resume_analysis(file_name)`	Resume a stored analysis session

Low-Level API Methods:

Method	Description
`run(method)`	Execute an Xplain Web API method and broadcast changes
`perform(payload)`	Send POST to /xplainsession and return JSON response
`post(payload)`	Send raw POST to /xplainsession
`get(params=None)`	Send GET to /xplainsession
`http_get(entrypoint, params=None)`	Generic HTTP GET to any endpoint
`http_post(entrypoint, payload_json=None)`	Generic HTTP POST to any endpoint
`get_api()`	Get an Api instance for advanced operations
`get_importer()`	Get an Importer instance for data import operations

xplain.XObject

class xplain.XObject(object_name, ref_session)

Bases: object

Represents an Xplain data object with navigation and dimension manipulation.

XObjects are the fundamental building blocks of the Xplain object model, representing entities in your data domain (e.g., Patients, Admissions, Lab Events). Each XObject contains:

Child objects: Hierarchical relationships to other objects
Dimensions: Measurable or categorical properties of the object
Aggregations: Computed values derived from child object data

This class provides methods to explore the object structure, retrieve dimensions and child objects, and dynamically add aggregation dimensions that compute summary statistics from related data.

Parameters

object_name (str) – The name of the XObject in the current session.
ref_session (Xsession) – Reference to the active Xsession for API interactions.

Attributes

object_name (str) – The name of the XObject.
_ref_session (Xsession) – The session object used for API interactions.

Raises

TypeError – If object_name is not a string.

XObjects are retrieved via Xsession.get_xobject(object_name).
The object tree structure is defined by the loaded startup configuration or XView.
Aggregation dimensions enable analysis across object hierarchies without manual joins.

Get an XObject and explore its structure:

>>> from xplain import Xsession
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>>
>>> patients = session.get_xobject("Patients")
>>> print(patients.get_name())
Patients
>>> print(patients.get_dimensions())
['PatientID', 'Age', 'Gender', 'Ethnicity']
>>> print(patients.get_child_objects())
['Admissions', 'Diagnoses', 'LabEvents']

Navigate child objects:

>>> admissions = session.get_xobject("Admissions")
>>> print(admissions.get_dimensions())
['AdmissionID', 'AdmitDate', 'DischargeDate', 'LOS', 'AdmissionType']

Add aggregation dimension:

>>> # Add average length of stay to Patients object
>>> patients.add_aggregation_dimension(
...     dimension_name="AvgLOS",
...     aggregation={
...         "object": "Admissions",
...         "dimension": "LOS",
...         "type": "AVG"
...     }
... )

See also

Xsession.get_xobject: Retrieve an XObject by name
Dimension: Represents a dimension within an XObject
Attribute: Represents an attribute within a dimension

Parameters: object_name (str) –

__init__(object_name, ref_session)

Initialize an XObject instance.

Parameters

object_name (str) – The name of the XObject.
ref_session – A session object for making API calls.

add_aggregation_dimension(dimension_name, aggregation, selections=None, floating_semantics=False)

Add an aggregation dimension that computes values from child object data.

Aggregation dimensions enable computing summary statistics from related objects without writing explicit joins. For example, add “AvgLOS” to a Patients object by averaging the Length-of-Stay dimension from the child Admissions object.

The aggregation dimension becomes part of the object’s schema and can be used in queries, groupings, and further aggregations.

Parameters

dimension_name (str) – Name for the new aggregation dimension. Must be unique within the object.

aggregation (dict) – Aggregation specification defining what to compute. Required keys:
- "object" (str): Name of the child object to aggregate from
- "dimension" (str): Name of the dimension to aggregate
- "type" (str): Aggregation type (COUNT, SUM, AVG, MIN, MAX, etc.)

Example:

{
    "object": "Admissions",
    "dimension": "LOS",
    "type": "AVG"
}

selections (list of dict, optional) – Filters to apply before aggregation. Each selection is a dict with:
- "attribute" (dict): Object/dimension/attribute to filter on
- "selectedStates" (list): Values to include

Useful for conditional aggregations (e.g., “count only ICU admissions”).

floating_semantics (bool, default False) – If True, the dimension uses floating semantics, meaning it updates dynamically based on current selections. If False (default), the dimension is computed once and remains static.

Returns

dict – Server response confirming the dimension was added.

Raises

ValueError – If dimension_name is empty, aggregation is not a dict, or selections is not a list.
RuntimeError – If the API call to add the dimension fails.

Aggregation dimensions are computed server-side and cached for performance.
They appear in the object’s dimension list immediately after creation.
Floating semantics dimensions recalculate when selections change, enabling dynamic “what-if” analysis.
The aggregation can reference any descendant object, not just direct children.
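
The argument rules for add_aggregation_dimension() can be checked locally before calling the server. The helper below is a sketch with a hypothetical name; the test for missing required keys goes beyond the documented ValueError conditions and is included as an assumption.

```python
REQUIRED_AGGREGATION_KEYS = ("object", "dimension", "type")


def check_aggregation_args(dimension_name, aggregation, selections=None):
    """Raise ValueError for malformed arguments; return None when they look fine."""
    if not isinstance(dimension_name, str) or not dimension_name:
        raise ValueError("dimension_name must be a non-empty string")
    if not isinstance(aggregation, dict):
        raise ValueError("aggregation must be a dict")
    # Assumption: missing required keys are also worth rejecting early.
    missing = [key for key in REQUIRED_AGGREGATION_KEYS if key not in aggregation]
    if missing:
        raise ValueError(f"aggregation is missing keys: {missing}")
    if selections is not None and not isinstance(selections, list):
        raise ValueError("selections must be a list")


check_aggregation_args(
    "AvgLOS",
    {"object": "Admissions", "dimension": "LOS", "type": "AVG"},
)
```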

Add average length of stay to Patients:

>>> session.startup("MIMIC_IV")
>>> patients = session.get_xobject("Patients")
>>> patients.add_aggregation_dimension(
...     dimension_name="AvgLOS",
...     aggregation={
...         "object": "Admissions",
...         "dimension": "LOS",
...         "type": "AVG"
...     }
... )

Count total admissions per patient:

>>> patients.add_aggregation_dimension(
...     dimension_name="TotalAdmissions",
...     aggregation={
...         "object": "Admissions",
...         "dimension": "AdmissionID",
...         "type": "COUNT"
...     }
... )

Conditional aggregation - ICU admissions only:

>>> patients.add_aggregation_dimension(
...     dimension_name="ICU_AdmissionCount",
...     aggregation={
...         "object": "Admissions",
...         "dimension": "AdmissionID",
...         "type": "COUNT"
...     },
...     selections=[{
...         "attribute": {
...             "object": "Admissions",
...             "dimension": "ICU_Stay",
...             "attribute": "ICU_Flag"
...         },
...         "selectedStates": ["Yes"]
...     }]
... )

Average creatinine level (multi-level aggregation):

>>> patients.add_aggregation_dimension(
...     dimension_name="AvgCreatinine",
...     aggregation={
...         "object": "LabEvents",  # Grandchild of Patients
...         "dimension": "Creatinine",
...         "type": "AVG"
...     }
... )

Floating semantics for dynamic analysis:

>>> patients.add_aggregation_dimension(
...     dimension_name="SelectedAdmissionCount",
...     aggregation={
...         "object": "Admissions",
...         "dimension": "AdmissionID",
...         "type": "COUNT"
...     },
...     floating_semantics=True  # Updates when selections change
... )

See also

get_dimensions: List all dimensions including aggregations
Xsession.execute_query: Use aggregation dimensions in queries
Query_config.add_aggregation: Alternative aggregation method

Parameters

dimension_name (str) –
aggregation (dict) –
selections (list) –
floating_semantics (bool) –

Return type

dict

get_child_objects()

Retrieve the names of all child objects in the hierarchy.

Child objects represent one-to-many relationships from the current object. For example, a “Patients” object might have “Admissions” and “Diagnoses” as child objects.

Returns

list of str – Names of all child objects. Returns an empty list if the object has no children.

Raises

KeyError – If the response from the server is missing expected keys.
RuntimeError – If the API call to fetch object details fails.

Child objects are defined in the startup configuration or XView.
The parent-child relationship enables aggregation dimensions that compute statistics across the hierarchy.
This method returns names only; use session.get_xobject(name) to get the actual child XObject instances.

Explore object hierarchy:

>>> session.startup("MIMIC_IV")
>>> patients = session.get_xobject("Patients")
>>> children = patients.get_child_objects()
>>> print(children)
['Admissions', 'Diagnoses', 'LabEvents', 'Prescriptions']

Navigate to child objects:

>>> for child_name in patients.get_child_objects():
...     child_obj = session.get_xobject(child_name)
...     print(f"{child_name}: {child_obj.get_dimensions()}")
Admissions: ['AdmissionID', 'AdmitDate', 'LOS']
Diagnoses: ['DiagnosisID', 'ICD_Code', 'DiagnosisDate']
...

See also

get_dimensions: Get dimensions of the current object
Xsession.get_xobject: Retrieve a child object instance

Return type: list
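
Because get_child_objects() returns names only, walking the whole hierarchy means alternating between get_xobject() and get_child_objects(). The recursive sketch below uses hypothetical stub classes so it runs without a server; with a live session, the stub would be replaced by the Xsession instance.

```python
def walk_objects(session, object_name, depth=0):
    """Yield (depth, name) for an object and all of its descendants."""
    yield depth, object_name
    xobject = session.get_xobject(object_name)
    for child_name in xobject.get_child_objects():
        yield from walk_objects(session, child_name, depth + 1)


class StubObject:
    """Hypothetical stand-in for XObject."""

    def __init__(self, children):
        self._children = children

    def get_child_objects(self):
        return list(self._children)


class StubSession:
    """Hypothetical stand-in for Xsession, backed by a plain dict."""

    def __init__(self, tree):
        self._tree = tree

    def get_xobject(self, name):
        return StubObject(self._tree.get(name, []))


# Made-up hierarchy for illustration.
stub = StubSession({
    "Patients": ["Admissions", "Diagnoses"],
    "Admissions": ["LabEvents"],
})
for depth, name in walk_objects(stub, "Patients"):
    print("  " * depth + name)
```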

get_dimension(dimension_name)

Retrieve a specific dimension by its name.

Parameters

dimension_name (str) – The name of the dimension to retrieve.

Returns

The name of the dimension if found, otherwise None.

Return type

str

Raises

ValueError – If the dimension name is invalid.
RuntimeError – If fetching dimensions fails.

get_dimensions()

Retrieve the names of all dimensions attached to this object.

Dimensions represent properties or measurements of the object. They can be:

Stored dimensions: Values imported from source data
Aggregation dimensions: Computed from child object data
Derived dimensions: Calculated from other dimensions

Returns

list of str – Names of all dimensions attached to the object. Returns an empty list if the object has no dimensions.

Raises

KeyError – If the response from the server is missing expected keys.
RuntimeError – If the API call to fetch object details fails.

Dimensions are defined in the object’s configuration or added dynamically.
Each dimension can have one or more attributes that categorize its values.
To retrieve dimension objects (not just names), iterate and call session.get_dimension(object_name, dimension_name).

List all dimensions of an object:

>>> session.startup("MIMIC_IV")
>>> patients = session.get_xobject("Patients")
>>> dimensions = patients.get_dimensions()
>>> print(dimensions)
['PatientID', 'Age', 'Gender', 'Ethnicity', 'DOB', 'AdmissionCount']

Explore dimension details:

>>> for dim_name in patients.get_dimensions():
...     print(f"Dimension: {dim_name}")
Dimension: PatientID
Dimension: Age
Dimension: Gender
...

Filter for specific dimensions:

>>> numeric_dims = [d for d in patients.get_dimensions()
...                 if d in ['Age', 'Weight', 'Height']]
>>> print(numeric_dims)
['Age']

See also

get_dimension: Retrieve a specific dimension by name
get_child_objects: Get child objects in the hierarchy
add_aggregation_dimension: Add a computed dimension

Return type: list

get_name()

Return the name of the XObject.

Return type: str

Methods:

Method	Description
`get_name()`	Return the name of the XObject
`get_child_objects()`	Get list of child object names
`get_dimensions()`	Get list of dimension names
`get_dimension(dimension_name)`	Get a specific dimension by name
`add_aggregation_dimension(dimension_name, aggregation, ...)`	Add a computed aggregation dimension

xplain.Dimension

class xplain.Dimension(object_name, dimension_name, ref_session)

Bases: object

Represents a dimension within an Xplain object.

Dimensions are properties or measurements associated with objects in the Xplain data model. They can be numeric (e.g., Age, Temperature) or categorical (e.g., Gender, Diagnosis Code). Each dimension has one or more attributes that organize its values into hierarchies or categories.

Dimensions are the fundamental units of analysis in Xplain:

Aggregations compute statistics on dimensions (COUNT, AVG, SUM)
Attributes categorize dimension values for grouping
Selections filter data based on attribute states

Parameters

object_name (str) – Name of the parent object containing this dimension.
dimension_name (str) – Name of the dimension.
ref_session (Xsession) – Reference to the active session for API interactions.

Attributes

object_name (str) – Name of the associated object.
dimension_name (str) – Name of the dimension.
_ref_session (Xsession) – Reference to the session object for API interaction.

Raises

TypeError – If object_name or dimension_name are not strings.

Dimensions are accessed via session.get_dimension(object, dimension) or through XObject.get_dimensions().
Each dimension has at least one default attribute (often named the same as the dimension itself).
Hierarchical dimensions have multi-level attributes (e.g., Date → Year → Month → Day).

Get a dimension and explore its attributes:

>>> session.startup("MIMIC_IV")
>>> age_dim = session.get_dimension("Patients", "Age")
>>> print(age_dim.get_name())
Age
>>> attributes = age_dim.get_attributes()
>>> for attr in attributes:
...     print(attr.get_name())
Age
AgeGroup
AgeDecade

Access a specific attribute:

>>> age_group_attr = age_dim.get_attribute("AgeGroup")
>>> if age_group_attr:
...     levels = age_group_attr.get_levels()
...     print(levels)
['AgeGroup']

See also

XObject: Parent object containing dimensions
Attribute: Categorization of dimension values
Xsession.get_dimension: Retrieve a dimension by name

Parameters

object_name (str) –
dimension_name (str) –

__init__(object_name, dimension_name, ref_session)

Initialize the Dimension instance.

Parameters

object_name (str) – Name of the object.
dimension_name (str) – Name of the dimension.
ref_session – Session object for API calls.

get_attribute(attribute_name)

Retrieve a specific attribute by name.

Searches the dimension’s attributes and returns the matching Attribute object if found. Useful for accessing hierarchical attributes or specific categorizations.

Parameters

attribute_name (str) – The name of the attribute to retrieve. Case-sensitive.

Returns

Attribute or None – The matching Attribute object, or None if no attribute with the given name exists in this dimension.

Raises

ValueError – If attribute_name is not a non-empty string.
KeyError – If the server response is missing expected keys.
RuntimeError – If the API call to fetch dimension details fails.

Returns None (not an exception) if the attribute doesn’t exist, allowing safe existence checks.
Attribute names are case-sensitive and must match exactly.
The default attribute typically has the same name as the dimension.

Check if an attribute exists:

>>> age_dim = session.get_dimension("Patients", "Age")
>>> age_group = age_dim.get_attribute("AgeGroup")
>>> if age_group:
...     print(f"Found attribute: {age_group.get_name()}")
... else:
...     print("Attribute not found")
Found attribute: AgeGroup

Get hierarchy levels:

>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> year_month = date_dim.get_attribute("YearMonth")
>>> if year_month:
...     levels = year_month.get_levels()
...     print(f"Hierarchy levels: {levels}")
Hierarchy levels: ['Year', 'Month']

Safe attribute access:

>>> gender_dim = session.get_dimension("Patients", "Gender")
>>> custom_attr = gender_dim.get_attribute("CustomGrouping")
>>> if custom_attr is None:
...     print("Custom attribute doesn't exist, using default")
...     custom_attr = gender_dim.get_attribute("Gender")

See also

get_attributes: Retrieve all attributes of the dimension
Attribute.get_levels: Get hierarchy levels of an attribute

Parameters: attribute_name (str) –

get_attributes()

Retrieve all attributes attached to this dimension.

Attributes organize dimension values into categories or hierarchies. A dimension typically has at least one default attribute, and may have additional custom or hierarchical attributes.

Returns

list of Attribute – List of Attribute objects representing all attributes of the dimension. Returns an empty list if the dimension has no attributes defined.

Raises

KeyError – If the server response is missing expected keys.
RuntimeError – If the API call to fetch dimension details fails.

Each returned Attribute is a fully instantiated object that can be used to explore hierarchy levels and states.
Attributes enable grouping in queries via add_groupby() and filtering via add_selection().
The default attribute usually has the same name as the dimension.

List all attributes of a dimension:

>>> session.startup("MIMIC_IV")
>>> gender_dim = session.get_dimension("Patients", "Gender")
>>> attributes = gender_dim.get_attributes()
>>> for attr in attributes:
...     print(attr.get_name())
Gender

Explore hierarchical attributes:

>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> attributes = date_dim.get_attributes()
>>> for attr in attributes:
...     print(f"{attr.get_name()}: {attr.get_levels()}")
AdmitDate: ['Date']
YearMonth: ['Year', 'Month']
Quarter: ['Year', 'Quarter']

Use attributes in queries:

>>> age_dim = session.get_dimension("Patients", "Age")
>>> age_group = age_dim.get_attribute("AgeGroup")
>>> from xplain import Query_config
>>> query = Query_config()
>>> query.add_groupby(
...     object_name="Patients",
...     dimension_name="Age",
...     attribute_name="AgeGroup"  # From get_attributes()
... )

See also

get_attribute: Retrieve a specific attribute by name
Attribute: Detailed attribute information and hierarchy navigation

Return type: list

get_name()

Returns the dimension name.

Return type: str

Methods:

Method	Description
`get_name()`	Return the dimension name
`get_attributes()`	Get list of Attribute instances
`get_attribute(attribute_name)`	Get a specific Attribute by name

xplain.Attribute

class xplain.Attribute(object_name, dimension_name, attribute_name, ref_session)

Bases: object

Represents an attribute that categorizes dimension values.

Attributes organize dimension values into hierarchies or categories, enabling grouping and filtering in queries. For example, a continuous “Age” dimension might have an “AgeGroup” attribute with categories like “0-18”, “18-65”, “65+”.

Attributes can be:

Flat: Single-level categorization (e.g., Gender: Male/Female)
Hierarchical: Multi-level trees (e.g., Date → Year → Month → Day)
Derived: Computed from dimension values (e.g., age bins, quantiles)

Parameters

object_name (str) – Name of the object containing the dimension.
dimension_name (str) – Name of the dimension containing this attribute.
attribute_name (str) – Name of the attribute.
ref_session (Xsession) – Reference to the active session for API interactions.

Attributes

object_name (str) – The parent object name.
dimension_name (str) – The parent dimension name.
attribute_name (str) – The attribute name.
_ref_session (Xsession) – Session reference for API calls.

Attributes are accessed via Dimension.get_attribute(name) or Dimension.get_attributes().
Hierarchical attributes enable drill-down analysis (expand/collapse).
Attribute states (values) can be explored via get_state_hierarchy().

Get an attribute and explore its hierarchy:

>>> session.startup("MIMIC_IV")
>>> age_dim = session.get_dimension("Patients", "Age")
>>> age_group = age_dim.get_attribute("AgeGroup")
>>> print(age_group.get_name())
AgeGroup
>>> print(age_group.get_levels())
['AgeGroup']

Hierarchical attribute (Date):

>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> year_month = date_dim.get_attribute("YearMonth")
>>> levels = year_month.get_levels()
>>> print(levels)
['Year', 'Month']

Get state hierarchy:

>>> hierarchy = age_group.get_state_hierarchy()
>>> print(hierarchy)
{'stateName': 'All', 'children': [{'stateName': '0-18'}, {'stateName': '18-65'}, ...]}
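
A nested result of this shape can be flattened with a short recursive generator. The sketch assumes only the two keys visible in the output above ('stateName' and 'children'); the sample data is made up, and real responses may carry additional fields.

```python
def iter_state_names(node, depth=0):
    """Yield (depth, state name) for a state and all of its descendants."""
    yield depth, node["stateName"]
    for child in node.get("children", []):
        yield from iter_state_names(child, depth + 1)


# Sample data in the shape shown above; values are for illustration only.
sample_hierarchy = {
    "stateName": "All",
    "children": [
        {"stateName": "0-18"},
        {"stateName": "18-65"},
        {"stateName": "65+"},
    ],
}
print([name for _, name in iter_state_names(sample_hierarchy)])
```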

See also

Dimension: Parent dimension containing attributes
Dimension.get_attribute: Retrieve an attribute by name

get_levels()

Retrieve the hierarchy level names of this attribute.

For hierarchical attributes, this returns the ordered list of levels from coarsest to finest granularity. For flat attributes, returns a single-element list.

list of str: Ordered list of hierarchy level names. For example:

Flat attribute: ["Gender"]
Hierarchical: ["Year", "Quarter", "Month", "Week"]

ValueError: If the attribute information cannot be retrieved or if ‘hierarchyLevelNames’ is missing from the response.

The first level is the coarsest (e.g., “Year”); the last is the finest (e.g., “Day”).
Level names are used in Query_config.add_groupby() to specify which hierarchy level to group by.
For non-hierarchical attributes, the list contains only the attribute name itself.
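
Because the level list is ordered, a level name can be translated into a position, which may be handy when a numeric level is wanted instead of a name. The sketch below is a hypothetical helper, not part of the package. It assumes that position 0 in the list corresponds to level number 0 in add_groupby(), which this reference does not confirm.

```python
def level_number(levels: list[str], level_name: str) -> int:
    """Return the 0-based position of level_name in an ordered level list."""
    try:
        return levels.index(level_name)
    except ValueError:
        # Re-raise with a message that lists the valid choices.
        raise ValueError(
            f"Unknown level {level_name!r}; available levels: {levels}"
        ) from None


# A level list in the shape returned by get_levels(), coarsest level first.
icd_levels = ["Chapter", "Block", "Category", "Code"]

print(level_number(icd_levels, "Chapter"))
print(level_number(icd_levels, "Category"))
```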

Flat attribute (single level):

>>> session.startup("MIMIC_IV")
>>> gender_dim = session.get_dimension("Patients", "Gender")
>>> gender_attr = gender_dim.get_attribute("Gender")
>>> print(gender_attr.get_levels())
['Gender']

Hierarchical attribute (date/time):

>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> year_month = date_dim.get_attribute("YearMonth")
>>> print(year_month.get_levels())
['Year', 'Month']

ICD diagnosis hierarchy:

>>> icd_dim = session.get_dimension("Diagnoses", "ICD_Code")
>>> icd_hierarchy = icd_dim.get_attribute("ICD_Hierarchy")
>>> print(icd_hierarchy.get_levels())
['Chapter', 'Block', 'Category', 'Code']

Use levels in queries:

>>> query = Query_config()
>>> query.add_groupby(
...     object_name="Admissions",
...     dimension_name="AdmitDate",
...     attribute_name="YearMonth",
...     groupby_level="Year"  # Group by year only
... )

See also:

get_state_hierarchy: Explore the actual values (states) in the hierarchy
Query_config.add_groupby: Use hierarchy levels in queries

get_name(): Retrieves the name of the attribute. :return: Attribute name as a string.

get_root_state()

Get the root state (top-level category) of this attribute.

The root state represents the most aggregated level in the attribute hierarchy, typically named “All” or representing the total population.

str: The name of the root state (e.g., “All”, “Total”, or a custom name).

The root state encompasses all child states in the hierarchy.
For flat attributes, the root may be the only state or a summary category.
This is useful for understanding the top-level category before drilling down.

Get root state:

>>> session.startup("MIMIC_IV")
>>> gender_dim = session.get_dimension("Patients", "Gender")
>>> gender_attr = gender_dim.get_attribute("Gender")
>>> root = gender_attr.get_root_state()
>>> print(root)
All

Hierarchical attribute root:

>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> year_month = date_dim.get_attribute("YearMonth")
>>> root = year_month.get_root_state()
>>> print(root)
All

See also:

get_state_hierarchy: Get the full state hierarchy
get_levels: Get hierarchy level names

get_state_hierarchy(state=None, levels=None)

Retrieve the state hierarchy showing all values and their structure.

Returns a tree structure of attribute states (values), showing parent-child relationships for hierarchical attributes. Useful for exploring available categories and understanding the attribute’s organization.

state (str, optional) – Specific state to retrieve the sub-hierarchy for. If None, returns the full hierarchy starting from the root.
levels (list of str, optional) – Specific hierarchy levels to include. If None, returns all levels.

dict: Nested dictionary representing the state hierarchy. Structure:

{
    "stateName": "Root",
    "children": [
        {"stateName": "Child1", "children": [...]},
        {"stateName": "Child2", "children": [...]}
    ]
}

For flat attributes, the hierarchy is shallow with no nested children.
For hierarchical attributes, children represent progressively finer granularities.
States are the actual categorical values used in selections and groupings.
This method delegates to Xsession.get_state_hierarchy().

Get full hierarchy for a flat attribute:

>>> session.startup("MIMIC_IV")
>>> gender_dim = session.get_dimension("Patients", "Gender")
>>> gender_attr = gender_dim.get_attribute("Gender")
>>> hierarchy = gender_attr.get_state_hierarchy()
>>> print(hierarchy)
{
    'stateName': 'All',
    'children': [
        {'stateName': 'Male'},
        {'stateName': 'Female'},
        {'stateName': 'Unknown'}
    ]
}

Hierarchical attribute (date by year/month):

>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> year_month = date_dim.get_attribute("YearMonth")
>>> hierarchy = year_month.get_state_hierarchy()
>>> print(hierarchy)
{
    'stateName': 'All',
    'children': [
        {
            'stateName': '2020',
            'children': [
                {'stateName': '2020-01'},
                {'stateName': '2020-02'},
                ...
            ]
        },
        ...
    ]
}

Get sub-hierarchy for specific state:

>>> sub_hierarchy = year_month.get_state_hierarchy(state='2020')
>>> print(sub_hierarchy)
{
    'stateName': '2020',
    'children': [
        {'stateName': '2020-01'},
        {'stateName': '2020-02'},
        ...
    ]
}
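
The nested dictionaries shown above can be traversed with a short recursive function. The sketch below works on a hand-written sample in the documented shape rather than on real server output. The helper names `iter_states` and `leaf_states` are illustrative, and the code assumes that leaf states may omit the "children" key, as in the flat example.

```python
from typing import Iterator


def iter_states(node: dict, depth: int = 0) -> Iterator[tuple[int, str]]:
    """Yield (depth, stateName) pairs in depth-first order."""
    yield depth, node["stateName"]
    # Leaf states may have no "children" key at all.
    for child in node.get("children", []):
        yield from iter_states(child, depth + 1)


def leaf_states(node: dict) -> list[str]:
    """Collect the names of all states that have no children."""
    children = node.get("children", [])
    if not children:
        return [node["stateName"]]
    leaves = []
    for child in children:
        leaves.extend(leaf_states(child))
    return leaves


# Hand-written sample in the documented shape (not real server output).
hierarchy = {
    "stateName": "All",
    "children": [
        {"stateName": "2020", "children": [
            {"stateName": "2020-01"},
            {"stateName": "2020-02"},
        ]},
        {"stateName": "2021", "children": [
            {"stateName": "2021-01"},
        ]},
    ],
}

# Print an indented outline, then the finest-grained states only.
for depth, name in iter_states(hierarchy):
    print("  " * depth + name)

print(leaf_states(hierarchy))
```

The leaf list is the kind of value one would typically pass as selected_states when filtering on the finest level.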

See also:

get_levels: Get hierarchy level names
get_root_state: Get the root state name
Xsession.get_state_hierarchy: Underlying implementation

Methods:

Method	Description
`get_name()`	Return the attribute name
`get_levels()`	Get hierarchy level names
`get_state_hierarchy(state=None, levels=None)`	Get the hierarchical state structure
`get_root_state()`	Get the root state name

xplain.Query_config

class xplain.Query_config(name=None)

Bases: object

Builder for constructing Xplain analytical query configurations.

Provides a fluent API for building complex queries with aggregations, group-bys, and selections. Queries are executed via Xsession.execute_query().

The builder pattern allows chaining method calls to incrementally construct queries. Each query consists of three main components:

Aggregations: Compute summary statistics (COUNT, SUM, AVG, etc.)
Group-bys: Organize results by attribute categories
Selections: Filter data to specific attribute states

request (dict) – The internal query configuration containing aggregations, groupBys, and selections in JSON-serializable format.

The add_aggregation(), add_groupby(), and add_selection() methods all return self to enable method chaining.
Queries are identified by a unique requestName, auto-generated if not provided.
The query configuration can be serialized to JSON via to_json().
For simpler queries, consider using Xsession.open_attribute() or the fluent QueryBuilder API via Xsession.query_builder().
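
To make the chaining behaviour concrete, here is a toy stand-in that follows the same return-self pattern. It is not the Xplain implementation: the class name `MiniQueryConfig` is hypothetical, and the keys inside each aggregation and group-by entry are assumptions. Only the top-level names requestName, aggregations, groupBys, and selections come from the description above.

```python
import json
import uuid


class MiniQueryConfig:
    """Toy stand-in for a chaining query builder (not the Xplain class)."""

    def __init__(self, name=None):
        # Auto-generate a name when none is given.
        self.request = {
            "requestName": name or str(uuid.uuid4()),
            "aggregations": [],
            "groupBys": [],
            "selections": [],
        }

    def add_aggregation(self, object_name, dimension_name, type):
        # Entry keys are assumed for illustration only.
        self.request["aggregations"].append(
            {"object": object_name, "dimension": dimension_name, "type": type}
        )
        return self  # returning self is what makes chaining possible

    def add_groupby(self, attribute_name, object_name=None, dimension_name=None):
        self.request["groupBys"].append(
            {"object": object_name, "dimension": dimension_name,
             "attribute": attribute_name}
        )
        return self

    def to_json(self):
        return self.request


query = (
    MiniQueryConfig(name="patient_stats")
    .add_aggregation("Patients", "Age", "AVG")
    .add_groupby("Gender", object_name="Patients", dimension_name="Gender")
)
print(json.dumps(query.to_json(), indent=2))
```

Wrapping the chain in parentheses, as here, avoids the trailing backslashes used in the doctest examples.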

Basic aggregation with grouping:

>>> from xplain import Query_config
>>> query = Query_config()
>>> query.add_aggregation(
...     object_name="Patients",
...     dimension_name="Age",
...     type="AVG"
... )
>>> query.add_groupby(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender"
... )
>>> df = session.execute_query(query)

Multiple aggregations:

>>> query = Query_config(name="patient_stats")
>>> query.add_aggregation(object_name="Patients", dimension_name="Age", type="AVG") \
...      .add_aggregation(object_name="Patients", dimension_name="Weight", type="AVG") \
...      .add_groupby(object_name="Patients", dimension_name="Gender", attribute_name="Gender")
>>> results = session.execute_query(query)

With selections (filtering):

>>> query = Query_config()
>>> query.add_aggregation(object_name="Admissions", dimension_name="LOS", type="AVG") \
...      .add_groupby(object_name="Admissions", dimension_name="AdmissionType", attribute_name="Type") \
...      .add_selection(
...          object_name="Patients",
...          dimension_name="Age",
...          attribute_name="AgeGroup",
...          selected_states=["65-75", "75-85", ">85"]
...      )
>>> elderly_los = session.execute_query(query)

MIMIC-IV analysis - ICU mortality by diagnosis:

>>> query = Query_config(name="icu_mortality_by_diagnosis")
>>> query.add_aggregation(object_name="Patients", dimension_name="Mortality", type="AVG") \
...      .add_groupby(object_name="Diagnoses", dimension_name="ICD_Code", attribute_name="Category") \
...      .add_selection(
...          object_name="Admissions",
...          dimension_name="ICU_Stay",
...          attribute_name="ICU_Flag",
...          selected_states=["Yes"]
...      )
>>> df = session.execute_query(query)

See also:

Xsession.execute_query: Execute the constructed query
Xsession.query_builder: Alternative fluent API via QueryBuilder
Xsession.open_attribute: Convenience method for simple attribute counts

__init__(name=None)

Initialize the Query_config instance with a default or provided name.

Parameters: name (str, optional) – The name or identifier for the query. Defaults to a UUID.

add_aggregation(object_name, dimension_name, type, aggregation_name=None)

Add an aggregation to compute summary statistics on a dimension.

Aggregations define what to calculate from the data. Multiple aggregations can be added to a single query, and each produces a column in the result.

object_name (str) – Name of the object containing the dimension to aggregate.
dimension_name (str) – Name of the dimension to compute the aggregation on.
type (str) – Aggregation type. Supported values:

"COUNT" - Count of non-null values
"COUNTDISTINCT" - Count of unique values
"COUNTENTITY" - Count of entities
"SUM" - Sum of numeric values
"AVG" - Average (mean) of numeric values
"MIN" - Minimum value
"MAX" - Maximum value
"VAR" - Variance
"STDEV" - Standard deviation
"QUANTILE" - Quantile (requires additional config)

aggregation_name (str, optional) – Custom name for the aggregation column in results. If not provided, the server auto-generates a name (e.g., "COUNT_DimensionName").

Query_config: Returns self to enable method chaining.

ValueError: If required parameters are missing or if type is not a valid aggregation type.

Multiple aggregations on the same or different dimensions are allowed.
Aggregation results appear as columns in the returned DataFrame.
For COUNT operations, the dimension value itself doesn’t matter; it counts the number of entities with that dimension defined.
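
Since an invalid type raises ValueError, it can be useful to check user-supplied type strings before building a query. The sketch below is a hypothetical pre-check, not the library's own validation. It uses the documented type names and an exact, case-sensitive match, which is an assumption about how strict the real check is.

```python
# The ten aggregation types listed in this reference.
SUPPORTED_AGGREGATIONS = frozenset({
    "COUNT", "COUNTDISTINCT", "COUNTENTITY", "SUM", "AVG",
    "MIN", "MAX", "VAR", "STDEV", "QUANTILE",
})


def check_aggregation_type(agg_type: str) -> str:
    """Return agg_type unchanged if it is a documented type, else raise."""
    if agg_type not in SUPPORTED_AGGREGATIONS:
        raise ValueError(
            f"Unsupported aggregation type {agg_type!r}; "
            f"expected one of {sorted(SUPPORTED_AGGREGATIONS)}"
        )
    return agg_type


print(check_aggregation_type("AVG"))

# "MEAN" is not a documented type, so the check rejects it.
try:
    check_aggregation_type("MEAN")
except ValueError as err:
    print(err)
```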

Count patients:

>>> query = Query_config()
>>> query.add_aggregation(
...     object_name="Patients",
...     dimension_name="PatientID",
...     type="COUNT"
... )

Average age with custom name:

>>> query.add_aggregation(
...     object_name="Patients",
...     dimension_name="Age",
...     type="AVG",
...     aggregation_name="AvgAge"
... )

Multiple aggregations (chained):

>>> query = Query_config()
>>> query.add_aggregation(object_name="LabEvents", dimension_name="Creatinine", type="AVG") \
...      .add_aggregation(object_name="LabEvents", dimension_name="Creatinine", type="MIN") \
...      .add_aggregation(object_name="LabEvents", dimension_name="Creatinine", type="MAX")

MIMIC-IV: Vital signs statistics:

>>> query = Query_config()
>>> query.add_aggregation(object_name="VitalSigns", dimension_name="HeartRate", type="AVG", aggregation_name="AvgHR") \
...      .add_aggregation(object_name="VitalSigns", dimension_name="BloodPressureSystolic", type="AVG", aggregation_name="AvgSBP") \
...      .add_groupby(object_name="Patients", dimension_name="Age", attribute_name="AgeGroup")

See also:

add_groupby: Add grouping to organize aggregated results
add_selection: Filter data before aggregation

add_groupby(attribute_name, object_name=None, dimension_name=None, groupby_level=None, groupby_level_number=None, groupby_states=None)

Add a group-by to organize aggregation results by attribute categories.

Group-bys partition the data into categories based on attribute values, creating separate rows in the result for each category. Multiple group-bys create nested hierarchies.

attribute_name (str) – Name of the attribute to group by. For hierarchical attributes, this groups by the first level unless groupby_level is specified.
object_name (str, optional) – Name of the object containing the dimension. If omitted, the builder attempts auto-resolution (experimental).
dimension_name (str, optional) – Name of the dimension containing the attribute. If omitted, the builder attempts auto-resolution (experimental).
groupby_level (str, optional) – Specific level name in a hierarchical attribute to group by. Overrides the default first-level grouping.
groupby_level_number (int, optional) – Numeric level in the attribute hierarchy to group by (0-indexed). Alternative to groupby_level for hierarchical attributes.
groupby_states (list, optional) – Specific attribute states to include in the grouping. Intended to restrict the results to these states, but currently unused in the implementation.

Query_config: Returns self to enable method chaining.

ValueError: If attribute_name is not provided or if groupby_level_number is not an integer.
RuntimeError: If the group-by specification cannot be constructed.

Group-bys are applied in the order they are added, creating nested hierarchies.
Each group-by creates a new dimension in the result structure.
For non-hierarchical attributes, groupby_level and groupby_level_number are ignored.
Auto-resolution of object/dimension names is experimental and may not work in all cases; explicitly providing them is recommended.
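
The effect of group-by order can be pictured with plain Python. The sketch below averages a value per combination of two keys, with the first key acting as the outer group. It uses made-up records and a hypothetical `nested_average` helper. It illustrates the nesting idea only and says nothing about the exact layout of the result returned by the server.

```python
from collections import defaultdict
from statistics import mean

# Made-up records; field names mirror the examples but are illustrative only.
records = [
    {"Gender": "F", "AgeGroup": "18-65", "LOS": 4.0},
    {"Gender": "F", "AgeGroup": "18-65", "LOS": 6.0},
    {"Gender": "F", "AgeGroup": "65+", "LOS": 9.0},
    {"Gender": "M", "AgeGroup": "18-65", "LOS": 3.0},
    {"Gender": "M", "AgeGroup": "65+", "LOS": 7.0},
    {"Gender": "M", "AgeGroup": "65+", "LOS": 11.0},
]


def nested_average(rows, group_keys, value_key):
    """Average value_key per combination of group_keys, in the given key order."""
    buckets = defaultdict(list)
    for row in rows:
        # The tuple follows group_keys, so the first key is the outer group.
        buckets[tuple(row[k] for k in group_keys)].append(row[value_key])
    return {key: mean(values) for key, values in sorted(buckets.items())}


result = nested_average(records, ["Gender", "AgeGroup"], "LOS")
for (gender, age_group), avg_los in result.items():
    print(gender, age_group, avg_los)
```

Swapping the two keys in group_keys produces the same averages but nests Gender inside AgeGroup instead.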

Group by gender:

>>> query = Query_config()
>>> query.add_aggregation(object_name="Patients", dimension_name="Age", type="AVG")
>>> query.add_groupby(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender"
... )

Multiple group-bys (nested):

>>> query = Query_config()
>>> query.add_aggregation(object_name="Admissions", dimension_name="LOS", type="AVG")
>>> query.add_groupby(object_name="Patients", dimension_name="Gender", attribute_name="Gender")
>>> query.add_groupby(object_name="Patients", dimension_name="Age", attribute_name="AgeGroup")
>>> # Results are grouped first by Gender, then by AgeGroup within each gender

Hierarchical attribute grouping:

>>> query = Query_config()
>>> query.add_aggregation(object_name="Diagnoses", dimension_name="ICD_Code", type="COUNT")
>>> query.add_groupby(
...     object_name="Diagnoses",
...     dimension_name="ICD_Code",
...     attribute_name="Hierarchy",
...     groupby_level="Chapter"  # Group by ICD chapter level
... )

MIMIC-IV: Admissions by type and age group:

>>> query = Query_config(name="admissions_analysis")
>>> query.add_aggregation(object_name="Admissions", dimension_name="AdmissionID", type="COUNT") \
...      .add_groupby(object_name="Admissions", dimension_name="AdmissionType", attribute_name="Type") \
...      .add_groupby(object_name="Patients", dimension_name="Age", attribute_name="AgeGroup")

See also:

add_aggregation: Define what statistics to compute
add_selection: Filter data before grouping

add_selection(attribute_name, object_name=None, dimension_name=None, selected_states=None)

Add a selection (filter) to restrict query results to specific attribute states.

Selections filter the dataset before aggregations and group-bys are applied, effectively creating a cohort or subset of data. Multiple selections act as AND conditions, narrowing the dataset further.

attribute_name (str) – Name of the attribute to filter on.
object_name (str, optional) – Name of the object containing the dimension. If omitted, the builder attempts auto-resolution (experimental).
dimension_name (str, optional) – Name of the dimension containing the attribute. If omitted, the builder attempts auto-resolution (experimental).
selected_states (list of str, optional) – List of specific attribute states (values) to include. Only entities with these attribute values will be included in the query results. If None, all states are selected (effectively no filter).

Query_config: Returns self to enable method chaining.

ValueError: If attribute_name is not provided.
RuntimeError: If the selection specification cannot be constructed.

Selections are applied before aggregations and groupings.
Multiple selections create AND conditions (all must be satisfied).
An empty selected_states list means no filtering (all states included).
For date/time selections, states often correspond to time periods or formatted date strings.
Auto-resolution of object/dimension names is experimental; explicitly providing them is recommended for production code.
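
The filtering semantics described in these notes can be mimicked on a small in-memory list. In the sketch below, an entity must match every selection (AND), while the states listed within one selection are alternatives. The second point is an interpretation of "states to include", and the helper `apply_selections` and the sample data are made up for this example.

```python
# Made-up entities; names and values are illustrative only.
patients = [
    {"id": 1, "Gender": "Male", "AgeGroup": "65-75"},
    {"id": 2, "Gender": "Female", "AgeGroup": "75-85"},
    {"id": 3, "Gender": "Male", "AgeGroup": "18-65"},
    {"id": 4, "Gender": "Male", "AgeGroup": ">85"},
]


def apply_selections(rows, selections):
    """Keep the rows that satisfy every selection.

    selections maps an attribute name to a list of allowed states, or to
    None or an empty list to leave that attribute unfiltered.
    """
    def matches(row):
        return all(
            not states or row[attribute] in states
            for attribute, states in selections.items()
        )
    return [row for row in rows if matches(row)]


selected = apply_selections(
    patients,
    {"Gender": ["Male"], "AgeGroup": ["65-75", "75-85", ">85"]},
)
print([row["id"] for row in selected])
```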

Filter by gender:

>>> query = Query_config()
>>> query.add_aggregation(object_name="Patients", dimension_name="Age", type="AVG")
>>> query.add_selection(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender",
...     selected_states=["Female"]
... )

Filter by multiple age groups:

>>> query = Query_config()
>>> query.add_aggregation(object_name="Admissions", dimension_name="LOS", type="AVG")
>>> query.add_selection(
...     object_name="Patients",
...     dimension_name="Age",
...     attribute_name="AgeGroup",
...     selected_states=["65-75", "75-85", ">85"]
... )

Multiple filters (AND condition):

>>> query = Query_config()
>>> query.add_aggregation(object_name="Patients", dimension_name="PatientID", type="COUNT")
>>> query.add_selection(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender",
...     selected_states=["Male"]
... )
>>> query.add_selection(
...     object_name="Diagnoses",
...     dimension_name="ICD_Code",
...     attribute_name="Category",
...     selected_states=["Circulatory"]
... )
>>> # Results: male patients with circulatory diagnoses

MIMIC-IV: ICU patients with high severity:

>>> query = Query_config(name="high_severity_icu")
>>> query.add_aggregation(object_name="Patients", dimension_name="Mortality", type="AVG") \
...      .add_selection(
...          object_name="Admissions",
...          dimension_name="ICU_Stay",
...          attribute_name="ICU_Flag",
...          selected_states=["Yes"]
...      ) \
...      .add_selection(
...          object_name="Severity",
...          dimension_name="SOFA_Score",
...          attribute_name="ScoreCategory",
...          selected_states=["High", "Very High"]
...      )

Time-based selection:

>>> query = Query_config()
>>> query.add_aggregation(object_name="Admissions", dimension_name="AdmissionID", type="COUNT")
>>> query.add_selection(
...     object_name="Admissions",
...     dimension_name="AdmitDate",
...     attribute_name="YearMonth",
...     selected_states=["2020-01", "2020-02", "2020-03"]
... )

See also:

add_aggregation: Define statistics to compute on filtered data
add_groupby: Organize filtered results by categories

set_name(request_name)

Assign a specific name or ID to the query.

Parameters: request_name (str) – The name or ID to be assigned.
Raises: ValueError – If the request_name is not a valid string.

to_json()

Return the configuration of this query as a JSON-serializable dictionary.

Returns: The query configuration.
Return type: dict

Methods:

Method	Description
`set_name(request_name)`	Set the query name/ID
`add_aggregation(object_name, dimension_name, type, aggregation_name=None)`	Add an aggregation (SUM, AVG, COUNT, etc.)
`add_groupby(attribute_name, object_name=None, dimension_name=None, ...)`	Add a group-by specification
`add_selection(attribute_name, object_name=None, dimension_name=None, selected_states=None)`	Add a selection (filter)
`to_json()`	Return the query configuration as a dictionary

Aggregation Types:

SUM - Sum of values
AVG - Average
COUNT - Count of records
COUNTDISTINCT - Count of distinct values
COUNTENTITY - Count of entities
MAX - Maximum value
MIN - Minimum value
VAR - Variance
STDEV - Standard deviation
QUANTILE - Quantile

Example:

from xplain import Query_config

query = Query_config(name="my_query")
query.add_aggregation(
    object_name="Sales",
    dimension_name="Revenue",
    type="SUM"
)
query.add_groupby(
    object_name="Sales",
    dimension_name="Product",
    attribute_name="Category"
)
query.add_selection(
    object_name="Sales",
    dimension_name="Date",
    attribute_name="Year",
    selected_states=["2024"]
)

df = session.execute_query(query)

xplain.QueryBuilder

class xplain.QueryBuilder(session, name=None)

Bases: object

Fluent builder for Xplain queries.

Obtained via Xsession.query_builder(name=...). Chain calls to aggregate(), groupby(), and selection(), then finalise with execute() (one-shot result) or open() (live result that updates when session selections change).

When only attribute is supplied to groupby() or selection(), the builder searches the loaded session object tree and resolves the matching object and dimension automatically. If the attribute name is ambiguous (found in more than one place), a ValueError is raised listing all candidates.

Example:

df = (
    session.query_builder(name="lab_counts")
    .aggregate(object="Lab Events", type="COUNT")
    .groupby(attribute="TestType")
    .selection(attribute="Date", selected_states=["2024-01", "2024-02"])
    .execute()
)

Parameters: name (str) –

aggregate(object, type, dimension=None, name=None)

Add an aggregation measure to the query.

Parameters

object (str) – Name of the Xobject to aggregate over.
type (str) – Aggregation function. One of SUM, AVG, COUNT, COUNTDISTINCT, MAX, MIN, COUNTENTITY, VAR, STDEV, QUANTILE.
dimension (str) – Dimension within the object. May be omitted for entity-level aggregations such as COUNT / COUNTENTITY.
name (str) – Optional display name for the resulting measure column.

Returns

self – for method chaining.

Raises

ValueError – If type is not a recognised aggregation function.

execute(data_frame=True)

Execute the query and return results immediately.

Equivalent to calling Xsession.execute_query with the built request. The query is not kept open – subsequent session changes (e.g. selection changes) will not affect the returned result.

Parameters: data_frame (bool) – When True (default) the result is returned as a pandas.DataFrame; otherwise raw JSON is returned.
Returns: DataFrame or JSON depending on data_frame.

groupby(attribute, object=None, dimension=None, groupby_level_name=None, groupby_level_number=None)

Add a group-by dimension to the query.

When object or dimension are omitted, the builder searches the session object tree for an attribute whose name matches attribute. If exactly one match is found, its object/dimension are used automatically. If more than one match exists, a ValueError is raised listing all candidates so you can disambiguate.

Parameters

attribute (str) – Attribute name to group by.
object (str) – Xobject name. Resolved automatically when omitted.
dimension (str) – Dimension name. Resolved automatically when omitted.
groupby_level_name (str) – Named level within a hierarchical dimension.
groupby_level_number (int) – Numeric level within a hierarchical dimension.

Returns

self – for method chaining.

Raises

ValueError – If attribute is not provided, not found in the session tree, or is ambiguous.

open(data_frame=True)

Open the query and return a live result.

Equivalent to calling Xsession.open_query. The query stays active inside the session so that further selection changes automatically update its result.

Parameters: data_frame (bool) – When True (default) the result is returned as a pandas.DataFrame; otherwise raw JSON is returned.
Returns: DataFrame or JSON depending on data_frame.

selection(attribute, object=None, dimension=None, selected_states=None)

Add a selection filter to the query.

Like groupby(), object and dimension are resolved automatically from the session tree when omitted.

Parameters

attribute (str) – Attribute name to filter on.
object (str) – Xobject name. Resolved automatically when omitted.
dimension (str) – Dimension name. Resolved automatically when omitted.
selected_states (list) – List of state values to keep. None means no state filtering (all states).

Returns

self – for method chaining.

Raises

ValueError – If attribute is not provided, not found, or ambiguous.

Obtained via Xsession.query_builder(). Every method (except the terminal execute() / open()) returns self so calls can be chained.

Builder Methods:

Method	Description
`aggregate(object, type, dimension=None, name=None)`	Add an aggregation measure. dimension may be omitted for entity-level types such as `COUNT` / `COUNTENTITY`.
`groupby(attribute, object=None, dimension=None, groupby_level_name=None, groupby_level_number=None)`	Add a group-by dimension. object and dimension are resolved automatically from the session tree when omitted.
`selection(attribute, object=None, dimension=None, selected_states=None)`	Add a selection filter. Auto-resolves object/dimension like `groupby`.
`execute(data_frame=True)`	Run the query and return results immediately (one-shot).
`open(data_frame=True)`	Run the query and keep it alive so results update with session selection changes.

Auto-resolution of object and dimension

When object or dimension is omitted from groupby / selection, QueryBuilder walks the session object tree and finds every (object, dimension) pair that contains the named attribute:

Unique match — used automatically, no action required.
No match — raises ValueError with the attribute name.
Multiple matches — raises ValueError listing all candidates so you can disambiguate by passing object and/or dimension explicitly.
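
The three outcomes above can be sketched with a small resolver over a toy tree. This is not QueryBuilder's actual code: the tree layout ({object: {dimension: [attributes]}}), the sample names, and the function `resolve_attribute` are assumptions made for illustration. The parameter names object and dimension mirror the builder's keyword arguments.

```python
# Toy object tree: {object: {dimension: [attribute, ...]}}. Names are made up.
TREE = {
    "Lab Events": {"TestType": ["TestType"], "Date": ["Date"]},
    "Orders": {"Date": ["Date", "Year"], "Product": ["Category"]},
}


def resolve_attribute(tree, attribute, object=None, dimension=None):
    """Return the single (object, dimension) pair that holds attribute."""
    candidates = [
        (obj, dim)
        for obj, dims in tree.items()
        for dim, attributes in dims.items()
        if attribute in attributes
        and (object is None or obj == object)
        and (dimension is None or dim == dimension)
    ]
    if not candidates:
        raise ValueError(f"Attribute {attribute!r} not found")
    if len(candidates) > 1:
        raise ValueError(
            f"Attribute {attribute!r} is ambiguous; candidates: {candidates}"
        )
    return candidates[0]


# Unique match: resolved without further hints.
print(resolve_attribute(TREE, "TestType"))

# Ambiguous name: passing object narrows it to one candidate.
print(resolve_attribute(TREE, "Date", object="Lab Events"))

# Ambiguous name without a hint: the error lists the candidates.
try:
    resolve_attribute(TREE, "Date")
except ValueError as err:
    print(err)
```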

Aggregation Types:

SUM - Sum of values
AVG - Average
COUNT - Count of records
COUNTDISTINCT - Count of distinct values
COUNTENTITY - Count of entities
MAX - Maximum value
MIN - Minimum value
VAR - Variance
STDEV - Standard deviation
QUANTILE - Quantile

Examples:

# Minimal: attribute auto-resolved from the session tree
df = (
    session.query_builder(name="lab_counts")
    .aggregate(object="Lab Events", type="COUNT")
    .groupby(attribute="TestType")
    .execute()
)

# With selection filter and hierarchical level
df = (
    session.query_builder()
    .aggregate(object="Orders", type="SUM", dimension="Revenue")
    .groupby(attribute="Month", groupby_level_name="Month")
    .selection(attribute="Year", selected_states=["2024"])
    .execute()
)

# Live query — updates when session selections change
df = (
    session.query_builder(name="live_revenue")
    .aggregate(object="Orders", type="SUM", dimension="Revenue")
    .groupby(attribute="Category")
    .open()
)

# Disambiguation: attribute "Date" exists in multiple objects
df = (
    session.query_builder()
    .aggregate(object="Lab Events", type="COUNT")
    .groupby(attribute="Date", object="Lab Events", dimension="Date")
    .execute()
)