Core API Reference
This section documents the core classes of the Xplain Python package.
xplain.Xsession
- class xplain.Xsession(url='http://localhost:8080', user='user', password='xplainData', httpsession=None, http_session_id=None, jwt_dispatch_url=None, jwt_cookie_name=None, jwt_token=None)
Bases: object
Xplain session manager for data analytics operations.
The primary interface for interacting with an Xplain server. Each instance represents an authenticated session that can load data models, execute queries, perform statistical analyses, and manage data transformations.
Xsession provides comprehensive functionality for:
Session Management: Connect, authenticate, load configurations
Data Querying: Execute aggregations, group-bys, selections
Object Navigation: Explore hierarchical data structures
Statistical Modeling: Run regressions, build predictive models
Data Import/Export: Load from databases, export results
Visualization: Generate collapsible trees and data views
The class supports multiple authentication methods (credentials, JWT, session reuse) and can be used standalone or in multi-session scenarios for parallel analysis.
- __url__ : str
Base URL of the connected Xplain server
- __id__ : str
Unique 32-character session identifier
- __xplain_session__ : dict
Current session state and metadata
- __requests_session__ : requests.Session
Underlying HTTP session for server communication
Each Xsession instance maintains independent state and can connect to different servers or use different credentials.
Sessions persist on the server until explicitly terminated or until server timeout expires.
For production use, always call terminate() when done, or use the context manager pattern to ensure proper cleanup.
Basic usage:
>>> from xplain import Xsession, Query_config
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>>
>>> query = Query_config()
>>> query.add_aggregation(object_name="Patients", dimension_name="Age", type="AVG")
>>> df = session.execute_query(query)
>>> print(df)
>>> session.terminate()
Context manager pattern (recommended):
>>> with Xsession(url="http://localhost:8080", user="admin", password="admin") as session:
...     session.startup("Analysis")
...     df = session.open_attribute("Patients", "Gender", "Gender")
...     print(df)
Multi-session analysis:
>>> session1 = Xsession(url="http://server1:8080", user="admin", password="pass1")
>>> session2 = Xsession(url="http://server2:8080", user="admin", password="pass2")
>>> session1.startup("Dataset_A")
>>> session2.startup("Dataset_B")
>>> # Compare results from different servers
>>> df1 = session1.execute_query(query)
>>> df2 = session2.execute_query(query)
XplainSession : Unified API with namespaced methods (alternative interface)
XplainClient : Low-level client for direct Web API calls
Query_config : Builder for constructing analytical queries
- __init__(url='http://localhost:8080', user='user', password='xplainData', httpsession=None, http_session_id=None, jwt_dispatch_url=None, jwt_cookie_name=None, jwt_token=None)
Create a new Xplain session for data analytics operations.
Establishes an authenticated connection to an Xplain server instance. Supports multiple authentication methods: standard credentials, JWT tokens, or session reuse via existing HTTP session IDs.
- url : str, default='http://localhost:8080'
The base URL of the Xplain server (including protocol and port). Can also be set via the environment variable xplain_url or the global variable xplain_url.
- user : str, default='user'
Username for authentication. Required unless using JWT or session ID.
- password : str, default='xplainData'
Password for authentication. Required unless using JWT or session ID.
- httpsession : requests.Session, optional
Existing Python requests Session object to reuse. Useful for sharing session state across multiple Xplain connections.
- http_session_id : str, optional
Existing HTTP session ID (JSESSIONID) to reuse an active session. Must be a valid 32-character session identifier.
- jwt_dispatch_url : str, optional
URL endpoint for JWT-based authentication. Required for JWT auth.
- jwt_cookie_name : str, optional
Cookie name containing the JWT token. Required for JWT auth.
- jwt_token : str, optional
JWT token string for authentication. Required for JWT auth.
- RuntimeError
If the URL is not provided via any method (argument, environment, or globals).
- HTTPError
If HTTP-level errors occur during authentication.
- ConnectionError
If network connection to the server fails.
- Timeout
If the server does not respond within the timeout period.
- ValueError
If an invalid session ID format is provided.
Authentication is attempted in the following order:
1. Credential-based (user/password)
2. JWT-based (if JWT parameters are provided)
3. Session ID reuse (if http_session_id is provided)
The session can be loaded from an existing session ID via the environment variable xplain_session_id.
SSL verification is disabled by default for development environments. Enable it in production by modifying the verify parameter in requests.
Basic authentication:
>>> from xplain import Xsession
>>> session = Xsession(
...     url="http://myhost:8080",
...     user="analyst",
...     password="secret123"
... )
>>> session.startup("PatientCohort")
JWT authentication:
>>> session = Xsession(
...     url="https://secure.xplain.com",
...     jwt_dispatch_url="https://auth.example.com/dispatch",
...     jwt_cookie_name="auth_token",
...     jwt_token="eyJhbGciOi..."
... )
Reuse existing session:
>>> session = Xsession(
...     url="http://myhost:8080",
...     http_session_id="A1B2C3D4E5F6G7H8I9J0K1L2M3N4O5P6"
... )
Environment-based URL configuration:
>>> import os
>>> os.environ['xplain_url'] = 'http://production:8080'
>>> session = Xsession(user="admin", password="prod_pass")
startup : Load a startup configuration file
startup_from_xview_config : Load session from XView configuration
load_from_session_id : Load session by existing session ID
terminate : Close the session and logout
- add_quantile_based_attributes(object_name=None, dimension=None, quantiles=None, quantiles_attribute_name=None, ranges_attribute_name=None, use_names_as_postfix=False, selections=None, sample_size=None, script_file=None, script_file_ownership='PUBLIC')
Generate quantile-based attributes for FLOAT/DOUBLE dimensions.
For each target dimension the server computes two optional attributes:
Quantiles attribute (quantiles_attribute_name): bins whose boundaries are the actual data quantiles. The bins are non-equidistant, but each contains roughly the same number of records.
Ranges attribute (ranges_attribute_name): equidistant bins spanning [min-quantile .. max-quantile], with equal width but unequal population.
If neither name is supplied, both are created with default names "Quantiles" and "Ranges". If exactly one name is None, that attribute is skipped entirely.
- Parameters
object_name (str, optional) – Name of the XObject. When given without dimension, attributes are created for all FLOAT/DOUBLE dimensions of that object.
dimension (dict, optional) – {"object": "...", "dimension": "..."} dict targeting a single dimension. object_name may be omitted when this is provided.
quantiles (list of float, optional) – List of quantile fractions, e.g. [0.1, 0.25, 0.5, 0.75, 0.9]. Every value must be strictly between 0 and 1, and the list must have at least 2 entries. When omitted, the server uses all 1 % steps (0.01 … 0.99).
quantiles_attribute_name (str or None) – Name for the quantile-bins attribute. Pass None (while providing ranges_attribute_name) to skip.
ranges_attribute_name (str or None) – Name for the equidistant-ranges attribute. Pass None (while providing quantiles_attribute_name) to skip.
use_names_as_postfix (bool) – When True, the dimension name is prepended to each attribute name (e.g. "Torque - Quantiles").
selections (dict, optional) – Active session selections to scope the quantile computation, e.g. {"object": "...", "attribute": "...", "dimension": "...", "selectedStates": [...]}.
sample_size (int, optional) – If provided, quantiles are computed on a random sample. The value is in permille units (1–999): 10 means a 1 % sample, 50 means a 5 % sample, 100 means a 10 % sample. Must be strictly between 0 and 1000.
script_file (dict, optional) – Server-side file path to persist the generated addNumberRangesAttribute calls as a re-runnable .xscript. Example: {"ownership": "PUBLIC", "filePath": ["quantiles.xscript"]}
script_file_ownership (str) – Ownership for script_file when only a plain string filename is given (default: "PUBLIC").
- Raises
ValueError – If neither object_name nor dimension is given.
RuntimeError – On server-side errors.
Example — all numeric dims of one object, save re-runnable script:
xsession.add_quantile_based_attributes(
    object_name="screwing station",
    use_names_as_postfix=True,
    script_file={"ownership": "PUBLIC", "filePath": ["screwing_station_quantiles.xscript"]},
)
Example — single dimension with explicit quantiles:
xsession.add_quantile_based_attributes(
    dimension={"object": "screwing station", "dimension": "Torque max"},
    quantiles=[0.03, 0.1, 0.25, 0.5, 0.75, 0.9, 0.97],
    quantiles_attribute_name="Torque max Quantiles",
    ranges_attribute_name=None,
)
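The difference between the quantiles attribute and the ranges attribute can be illustrated locally. The following is only a sketch using the Python standard library, not the server's implementation: the helper names quantile_edges and equidistant_edges are hypothetical, and it assumes the ranges attribute uses as many edges as there are quantile fractions, which the reference does not state.

```python
from statistics import quantiles


def quantile_edges(values, fractions):
    """Return the data value at each quantile fraction (1 % resolution)."""
    # statistics.quantiles with n=100 yields 99 cut points: 1 %, 2 %, ..., 99 %.
    cuts = quantiles(values, n=100, method="inclusive")
    return [cuts[round(f * 100) - 1] for f in fractions]


def equidistant_edges(low, high, count):
    """Return `count` evenly spaced edges from low to high (inclusive)."""
    step = (high - low) / (count - 1)
    return [low + k * step for k in range(count)]


# Skewed sample data (hypothetical): squares of 0..100.
values = [x * x for x in range(101)]
fractions = [0.1, 0.5, 0.9]

# Quantile edges follow the data distribution (equal population per bin).
q_edges = quantile_edges(values, fractions)
# Range edges are evenly spaced between the lowest and highest quantile.
r_edges = equidistant_edges(q_edges[0], q_edges[-1], len(fractions))

print("quantile edges:   ", q_edges)
print("equidistant edges:", r_edges)
```

On skewed data the middle quantile edge sits far from the midpoint of the range, which is why the two attributes partition the same dimension differently.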
- build_formula(response, predictors)
Dynamically build an R-style formula for Patsy.
- Parameters
response (str) – The dependent variable.
predictors (list) – A list of predictor variable names.
- Returns
The constructed formula in R-style syntax.
- Return type
str
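As an illustration of what an R-style formula looks like, here is a minimal sketch. The function build_formula_sketch is a hypothetical stand-in, not the library's build_formula, and the exact string the library produces (for example, whether names are quoted) is not specified in this reference.

```python
def build_formula_sketch(response, predictors):
    """Join a response and predictor names into 'response ~ p1 + p2'."""
    if not predictors:
        # Intercept-only model in R/Patsy syntax.
        return f"{response} ~ 1"
    return f"{response} ~ " + " + ".join(predictors)


# Hypothetical variable names, for illustration only.
formula = build_formula_sketch("readmitted", ["age", "gender", "los"])
print(formula)
```

A string of this shape can be passed as the formula argument of run_statsmodels.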
- build_predictive_model(model_name, xmodel_configuration_file_name, target_event_object)
Build a predictive model. [Beta]
- build_tree_data(json_object)
Convert complex JSON structure into a format suitable for D3.js tree visualization. This recursively parses the JSON, building a nested dictionary format compatible with D3.js.
- collapsible_tree()
Generate and visualize a collapsible tree using hierarchical data.
This function builds a tree structure based on the current focus object, processes it into a source-target DataFrame suitable for visualization, and then uses pyecharts to render the tree directly in Jupyter.
Example
session.collapsible_tree()
- Parameters
None –
- Returns
The function directly renders the visualization in the notebook.
- Return type
None
- convert_to_dataframe(data)
Convert query result JSON to pandas DataFrame format.
Transforms nested JSON result structures from Xplain queries into a flat pandas DataFrame suitable for analysis. Handles hierarchical data by extracting leaf node values.
- data : dict
Query result in JSON format with ‘fields’ and ‘children’ keys. Expected structure:
{
    "fields": ["Attribute1", "Attribute2", "Count"],
    "children": [
        {"data": [{"field1": value1}, {"field2": value2}]},
        ...
    ]
}
- pandas.DataFrame
Tabular data with columns corresponding to the ‘fields’ list. Each row represents a leaf node from the hierarchical result.
- KeyError
If expected keys (‘fields’ or ‘children’) are missing from data.
- TypeError
If data contains invalid types that cannot be converted.
Nested hierarchies are flattened; only leaf nodes contribute rows.
Dict values in result data are unwrapped (e.g., {"value": 123} becomes 123).
Missing values are filled with None.
This method is called automatically by execute_query() when data_frame=True.
Convert query results:
>>> result_json = session.perform({"method": "getResult", "requestName": "my_query"})
>>> df = session.convert_to_dataframe(result_json)
>>> print(df.head())
Manual conversion of custom result:
>>> custom_data = {
...     "fields": ["Category", "Count"],
...     "children": [
...         {"data": [{"Category": "A"}, {"Count": 100}]},
...         {"data": [{"Category": "B"}, {"Count": 200}]}
...     ]
... }
>>> df = session.convert_to_dataframe(custom_data)
execute_query : Execute query and return DataFrame directly
get_result : Retrieve query result (optionally as DataFrame)
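The flattening rules described for convert_to_dataframe (only leaf nodes contribute rows, dict values are unwrapped, missing values become None) can be sketched in plain Python. This is an illustrative approximation under the documented result shape, not the library's code; flatten_result is a hypothetical name, and it returns a list of row dicts rather than a DataFrame.

```python
def flatten_result(result):
    """Flatten a {'fields': [...], 'children': [...]} result into row dicts."""
    fields = result["fields"]      # KeyError if missing, as documented
    rows = []

    def visit(node):
        children = node.get("children") or []
        if children:
            # Inner node: only its leaves contribute rows.
            for child in children:
                visit(child)
            return
        merged = {}
        for cell in node.get("data", []):
            for key, value in cell.items():
                # Unwrap {"value": 123} to 123.
                if isinstance(value, dict) and "value" in value:
                    value = value["value"]
                merged[key] = value
        # Fields absent from this leaf are filled with None.
        rows.append({field: merged.get(field) for field in fields})

    for child in result["children"]:
        visit(child)
    return rows


# Hypothetical result with one flat leaf and one nested leaf.
sample = {
    "fields": ["Category", "Count", "Extra"],
    "children": [
        {"data": [{"Category": "A"}, {"Count": {"value": 100}}]},
        {"data": [{"Category": "B"}],
         "children": [{"data": [{"Category": "B1"}, {"Count": 5}]}]},
    ],
}
rows = flatten_result(sample)
print(rows)
```

The resulting list of dicts can be handed to pandas.DataFrame(rows) to obtain a tabular view.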
- count_attribute(attribute_name, object_name=None, dimension_name=None, request_name=None, data_frame=True)
Convenience method to count an attribute. If the object and dimension are not provided, they are resolved automatically by searching through the object structure.
- Parameters
attribute_name (string) – name of attribute (required)
object_name (string) – name of object (optional, auto-resolved if omitted)
dimension_name (string) – name of dimension (optional, auto-resolved if omitted)
request_name (string) – id or name of request
data_frame (boolean) – whether to return the result as a pandas DataFrame
- Returns
The attribute grouped by its first level and aggregated by count.
- Return type
data frame or json
- Raises
ValueError – if attribute_name is ambiguous (exists in multiple locations)
Example
>>> import xplain
>>> session = xplain.Xsession(url="http://myhost:8080", user="myUser", password="myPwd")
>>> session.startup("mystartup")
>>> # Simple case - just provide attribute name
>>> session.count_attribute("Agegroup")
>>> # Explicit case - provide all three
>>> session.count_attribute("Type", object_name="Hospital Diagnose", dimension_name="Diagnose")
- create_contingency_table(df, var1, var2)
Create a contingency table (frequency table) for two variables.
- Parameters
df (pd.DataFrame) – The data frame containing the variables.
var1 (str) – Name of the first variable (row).
var2 (str) – Name of the second variable (column).
- Returns
A contingency table.
- Return type
pd.DataFrame
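To make the notion of a contingency table concrete, the sketch below counts value pairs with the standard library. It is not the library's implementation: contingency_counts is a hypothetical helper that works on a list of dicts instead of a DataFrame, and the sample records are invented. With pandas, pd.crosstab(df[var1], df[var2]) yields the equivalent table.

```python
from collections import Counter


def contingency_counts(records, var1, var2):
    """Count co-occurrences of var1 (rows) and var2 (columns)."""
    counts = Counter((rec[var1], rec[var2]) for rec in records)
    row_labels = sorted({a for a, _ in counts})
    col_labels = sorted({b for _, b in counts})
    # Nested dict: table[row][column] -> frequency (0 when never observed).
    return {
        a: {b: counts.get((a, b), 0) for b in col_labels}
        for a in row_labels
    }


# Hypothetical records, for illustration only.
records = [
    {"gender": "F", "type": "ELECTIVE"},
    {"gender": "F", "type": "EMERGENCY"},
    {"gender": "M", "type": "EMERGENCY"},
    {"gender": "M", "type": "EMERGENCY"},
]
table = contingency_counts(records, "gender", "type")
print(table)
```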
- download_result(filename, save_as)
Download a file from the server's result directory and save it to the current local path.
- Parameters
filename (string) – file name in the result directory
save_as (string) – local file name under which the downloaded file is saved
- download_selections(objects, selection_set=None)
Return the selections as JSON for the given objects and selection set.
- Parameters
objects (list of strings) – list of object names
selection_set (string) – the selection set name
- execute_query(query, data_frame=True)
Execute an analytical query and return results.
Runs a query specification against the current session’s data and returns aggregated, grouped, or filtered results. Queries can be specified using the Query_config builder or as raw JSON dictionaries.
- query : Query_config or dict
Query specification containing aggregations, group-bys, and selections. Can be:
- A Query_config object (recommended for type safety)
- A dictionary with the query structure in JSON format
- data_frame : bool, default=True
If True, return results as a pandas DataFrame. If False, return raw JSON structure.
- pandas.DataFrame or dict
Query results in the requested format:
- DataFrame: Columns correspond to grouped attributes and aggregated values
- dict: Nested JSON structure with full hierarchy information
- ValueError
If the query object is invalid or missing required fields.
- RuntimeError
If query execution fails on the server or results cannot be retrieved.
- AttributeError
If a Query_config object lacks the required to_json() method.
If no requestName is provided in the query, a unique identifier is auto-generated using the format query_<8-char-uuid>.
The query remains available for inspection until explicitly deleted.
For large result sets, consider using get_result() separately to control result retrieval timing.
Aggregation types supported: COUNT, SUM, AVG, MIN, MAX, DISTINCT, etc.
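The auto-naming rule for raw JSON queries can be sketched as follows. This is an assumption-laden illustration: ensure_request_name is a hypothetical helper, and taking the first eight hex characters of a uuid4 is only one plausible reading of the query_<8-char-uuid> format; the library may derive the suffix differently.

```python
import uuid


def ensure_request_name(query):
    """Return a copy of a raw query dict that always has a requestName."""
    named = dict(query)  # shallow copy; the caller's dict is left untouched
    if not named.get("requestName"):
        # Assumed format: "query_" followed by 8 hex characters.
        named["requestName"] = "query_" + uuid.uuid4().hex[:8]
    return named


auto_named = ensure_request_name({"aggregations": [], "groupBys": []})
kept = ensure_request_name({"requestName": "avg_los_by_type"})
print(auto_named["requestName"], kept["requestName"])
```

Supplying an explicit requestName, as in the second call, makes it easier to retrieve the result later by name.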
Using Query_config (recommended):
>>> from xplain import Xsession, Query_config
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>>
>>> # Count diagnoses grouped by type
>>> query = Query_config()
>>> query.add_aggregation(
...     object_name="Diagnoses",
...     dimension_name="ICD_Code",
...     type="COUNT"
... )
>>> query.add_groupby(
...     object_name="Diagnoses",
...     dimension_name="ICD_Code",
...     attribute_name="Category"
... )
>>> df = session.execute_query(query)
>>> print(df.head())
      Category  COUNT_ICD_Code
0  Circulatory           15234
1  Respiratory           12456
2       Injury            8901
Filter by selection:
>>> query = Query_config()
>>> query.add_aggregation(
...     object_name="LabEvents",
...     dimension_name="Creatinine",
...     type="AVG"
... )
>>> query.add_groupby(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender"
... )
>>> query.add_selection(
...     object_name="Patients",
...     dimension_name="Age",
...     attribute_name="AgeGroup",
...     selected_states=["65-75", "75-85", ">85"]
... )
>>> elderly_creatinine = session.execute_query(query)
Using raw JSON (alternative):
>>> query_json = {
...     "aggregations": [{
...         "object": "Admissions",
...         "dimension": "LOS",
...         "type": "AVG"
...     }],
...     "groupBys": [{
...         "attribute": {
...             "object": "Admissions",
...             "dimension": "AdmissionType",
...             "attribute": "Type"
...         }
...     }],
...     "requestName": "avg_los_by_type"
... }
>>> df = session.execute_query(query_json)
Return raw JSON instead of DataFrame:
>>> result_json = session.execute_query(query, data_frame=False)
>>> print(result_json.keys())
dict_keys(['fields', 'children', 'requestName', 'status'])
Query_config : Builder class for constructing queries
QueryBuilder : Fluent API for building queries with method chaining
query : Start building a query using the fluent API
get_result : Retrieve results from a named query
open_attribute : Convenience method to open and count an attribute
- gen_xtable(data, xtable_config, file_name)
- get(params=None)
Send a GET request to the /xplainsession endpoint.
- Parameters
params – Optional URL parameters.
- Returns
API response.
- get_attribute_info(object_name, dimension_name, attribute_name)
Find and retrieve the details of an attribute.
- Parameters
object_name – the name of xobject
dimension_name – the name of dimension
attribute_name – the name of attribute
- Returns
details of this attribute in json format
- get_current_xplain_session()
Get the current xplain session instance.
- get_dimension_info(object_name, dimension_name)
Find and retrieve the details of a dimension.
- Parameters
object_name – the name of the xobject
dimension_name – the name of dimension
- Returns
details of this dimension in json format
- get_full_object_structure()
Returns a flat list of all objects with their parent, dimensions, and attributes.
Each entry contains:
- object: object name
- parent: parent object name (None for root)
- dimensions: list of {"name", "attributes"} dicts
The flat structure makes it easy to search, filter, and read without recursive traversal.
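Because the structure is flat, locating an attribute is a simple loop. The sketch below shows one way to resolve an attribute name to its object and dimension, similar in spirit to the auto-resolution described for count_attribute. It is not the library's code: resolve_attribute is a hypothetical helper, the sample entries are invented, and it assumes "attributes" is a list of attribute name strings, which this reference does not confirm.

```python
def resolve_attribute(structure, attribute_name):
    """Return the (object, dimension) pair that holds attribute_name."""
    matches = [
        (entry["object"], dim["name"])
        for entry in structure
        for dim in entry["dimensions"]
        if attribute_name in dim["attributes"]  # assumes a list of names
    ]
    if not matches:
        raise LookupError(f"attribute {attribute_name!r} not found")
    if len(matches) > 1:
        # Same attribute name in several places: the caller must disambiguate.
        raise ValueError(f"attribute {attribute_name!r} is ambiguous: {matches}")
    return matches[0]


# Hypothetical flat structure in the documented entry shape.
structure = [
    {"object": "Patients", "parent": None,
     "dimensions": [{"name": "Age", "attributes": ["Agegroup"]},
                    {"name": "Gender", "attributes": ["Gender"]}]},
    {"object": "Diagnoses", "parent": "Patients",
     "dimensions": [{"name": "ICD_Code", "attributes": ["Category", "Type"]},
                    {"name": "Source", "attributes": ["Type"]}]},
]
location = resolve_attribute(structure, "Agegroup")
print(location)
```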
- get_importer()
Get the importer instance for managing database connections and imports.
- get_independent_variables_of_model(model_name)
get the list of independent variables of given predictive model
- Parameters
model_name (string) – name of predictive model
- Returns
list of independent variables with details
- Return type
array of dict
- get_instance_as_dataframe(elements)
Get a pandas DataFrame representation of the Xplain artifacts referenced by elements, equivalent to the standard CSV download functionality in XOE.
- Parameters
elements (list) – array of x-element paths, each referring to an Xplain artifact: an object, a dimension or an attribute.
- Returns
DataFrame representation of the requested instance
- Return type
pd.DataFrame
Example:
elements = [
    {"object": "Person"},
    {"object": "Diagnosis", "dimension": "Physician"},
    {"object": "Prescription", "dimension": "Rx Code", "attribute": "ATC Hierarchy"},
    {"object": "Prescription", "dimension": "Rx Code", "attribute": "Substance"},
]
- get_model_names()
list all loaded predictive models
- Returns
list of model names
- Return type
array of string
- get_object_info(object_name, root=None)
Find and display the details of an Xobject in JSON.
- Parameters
object_name – the name of the Xobject
root – the object name from which the search starts. If no root is provided, the search starts at the root node of the entire object tree.
- Returns
details of the Xobject in JSON
- get_open_sequences(sequence_name)
Retrieves details of open sequences by name.
- get_queries()
get the list of the existing query ids
- Returns
list of query ids
- Return type
array of string
- get_result(query_name, data_frame=True)
Get the result of the query.
- Parameters
query_name (string) – the name/id of the query
- Returns
DataFrame result of the query
- Return type
pd.DataFrame or json
- get_root_object()
[Beta] Retrieve the root object.
- Returns
The root object.
- Return type
Xobject
- Raises
KeyError – If ‘focusObject’ or ‘objectName’ is missing from the session.
- get_selections()
display all global selections in the current xplain session
- Returns
selections as json
- Return type
list of json
- get_sequence_transition_matrix(sequence_name)
Retrieves the transition matrix for the specified sequence.
- Parameters
sequence_name – Name of the sequence.
- Returns
Transition matrix as a dictionary with labels, sources, targets, and values.
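A dictionary with labels, sources, targets, and values can be turned into a dense square matrix for inspection. The sketch below rests on an assumption that is not confirmed by this reference: that sources and targets are integer indices into labels and that values is a parallel list, as is common for Sankey-style data. The helper to_dense_matrix and the sample data are hypothetical.

```python
def to_dense_matrix(transitions):
    """Build a labels x labels matrix from parallel source/target/value lists."""
    labels = transitions["labels"]
    size = len(labels)
    matrix = [[0] * size for _ in range(size)]
    triples = zip(transitions["sources"], transitions["targets"], transitions["values"])
    for source, target, value in triples:
        # Assumption: source and target are positions in `labels`.
        matrix[source][target] += value
    return matrix


# Hypothetical transition data, for illustration only.
transitions = {
    "labels": ["Admission", "ICU", "Discharge"],
    "sources": [0, 0, 1],
    "targets": [1, 2, 2],
    "values": [40, 60, 35],
}
dense = to_dense_matrix(transitions)
for label, row in zip(transitions["labels"], dense):
    print(label, row)
```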
- get_session()
- get_session_id()
Get the current Xplain session identifier.
Returns the unique 32-character session ID assigned by the server when the session was created or loaded.
- str
The 32-character alphanumeric session identifier (e.g., "A1B2C3D4E5F6G7H8I9J0K1L2M3N4O5P6").
The session ID can be used to share or resume sessions across different clients using load_from_session_id().
Session IDs remain valid until explicitly terminated or until the server timeout expires.
Can be set via the environment variable xplain_session_id during initialization.
Get current session ID:
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>> session_id = session.get_session_id()
>>> print(f"Current session: {session_id}")
Current session: A1B2C3D4E5F6G7H8I9J0K1L2M3N4O5P6
Share session with another client:
>>> # Client 1
>>> session1 = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session1.startup("Analysis")
>>> shared_id = session1.get_session_id()
>>>
>>> # Client 2 (reuses same session)
>>> session2 = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session2.load_from_session_id(shared_id)
load_from_session_id : Load session from an existing session ID
terminate : Close and invalidate the current session
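Before passing a stored ID to load_from_session_id() or http_session_id, a client-side format check can catch obvious mistakes early. The following is a minimal sketch based only on the "32-character alphanumeric" description above; is_valid_session_id is a hypothetical helper, and the server may apply stricter or different rules.

```python
import re

# 32 characters, letters and digits only (per the description above).
SESSION_ID_PATTERN = re.compile(r"[A-Za-z0-9]{32}")


def is_valid_session_id(session_id):
    """Return True if session_id looks like a 32-character alphanumeric ID."""
    return (
        isinstance(session_id, str)
        and SESSION_ID_PATTERN.fullmatch(session_id) is not None
    )


print(is_valid_session_id("A1B2C3D4E5F6G7H8I9J0K1L2M3N4O5P6"))
print(is_valid_session_id("too-short"))
```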
- get_state_hierarchy(object_name, dimension_name, attribute_name, state=None, levels=None, request_name=None)
Retrieve the hierarchical structure of states for a given attribute.
- Parameters
object_name – Name of the object.
dimension_name – Name of the dimension.
attribute_name – Name of the attribute.
state – The name of a state in the attribute’s hierarchy. Optional.
levels – The number of hierarchy levels to return. Optional.
data_frame – Whether to return the result as a pandas DataFrame. Default is True.
- Returns
Hierarchical structure of states.
- Return type
dict or DataFrame
- get_tree_details(object_name=None, dimension_name=None, attribute_name=None)
get the metadata details of certain xplain object, dimension or attribute as json
- Parameters
object_name (string, optional) – the name of the object. If empty, the whole object tree from the root is shown. If only object_name is specified, the metadata of this object is returned.
dimension_name (string, optional) – the name of dimension, optional. If object_name and dimension_name are specified, returns the dimension metadata.
attribute_name (string, optional) – the name of attribute, optional. If object_name, dimension_name and attribute_name are specified, returns the attribute metadata.
- Returns
object tree details
- Return type
json
- get_variable_details(model_name, data_frame=True)
Retrieve the details of the independent variables for a predictive model.
- Parameters
model_name (str) – The name of the predictive model.
data_frame (bool) – Whether to return the result as a pandas DataFrame.
- Returns
The model’s independent variables details as a DataFrame or JSON.
- Return type
pd.DataFrame or dict
- Raises
ValueError – If the predictive model or its variables are not found.
- get_variable_list(model_name)
get the list of independent variables of given predictive model
- Parameters
model_name (string) – name of predictive model
- Returns
list of independent variables
- Return type
array of string
- get_xobject(object_name)
[Beta] Retrieve the object with the given name.
- Parameters
object_name (str) – The name of the object to retrieve.
- Returns
The object with the given name, or None if not found.
- Return type
Xobject or None
- http_get(entrypoint, params=None)
Performs an HTTP GET request to the specified endpoint.
- Parameters
entrypoint – API endpoint relative to the base URL.
params – Query parameters for the GET request.
- Returns
Parsed JSON response or raw content.
- Raises
RuntimeError – If the GET request fails.
- http_post(entrypoint, payload_json=None, data=None, files=None, params=None)
Performs an HTTP POST request to the specified endpoint.
- Parameters
entrypoint – API endpoint relative to the base URL.
payload_json (dict) – Dictionary payload for the POST request
data (dict) – Form data for the POST request.
files (dict) –
params (dict) –
- Returns
Parsed JSON response or raw content.
- Raises
RuntimeError – If the POST request fails.
- list_analyses()
List available xanalysis configurations
- list_existing_analyses()
List available xanalysis configurations
- list_files(ownership, file_type, file_extension=None)
Lists files with the specified ownership and type.
- Parameters
ownership – Ownership type.
file_type – File type.
file_extension – Optional file extension.
- Returns
List of files or raises exception on failure.
- load_analysis(file_name)
Load xanalysis
- load_from_session_id(session_id)
Load an Xplain session from a given existing session ID.
- Parameters
session_id (string) – the 32-character Xplain session ID
- load_result_file_as_df(filename)
Load a file from the session as a pandas DataFrame.
- Parameters
filename – Name of the file to load.
- Returns
DataFrame containing file content.
- open_attribute(object_name, dimension_name, attribute_name, request_name=None, data_frame=True)
Open an attribute and get counts grouped by its values.
Convenience method that creates a simple aggregation query counting entities grouped by the first level of the specified attribute hierarchy. Equivalent to a COUNT aggregation with a single GROUP BY.
- object_name : str
Name of the object containing the dimension.
- dimension_name : str
Name of the dimension containing the attribute.
- attribute_name : str
Name of the attribute to open and count.
- request_name : str, optional
Identifier for the query request. If None, a unique UUID is generated.
- data_frame : bool, default=True
If True, return results as a pandas DataFrame. If False, return raw JSON structure.
- pandas.DataFrame or dict
Counts of entities grouped by attribute values:
- DataFrame: Two columns (attribute value, count)
- dict: Nested JSON with full hierarchy
- RuntimeError
If the attribute cannot be opened or does not exist in the structure.
This method is optimized for quick exploration of categorical attributes.
For multi-level hierarchies, only the first level is expanded by default.
To navigate deeper levels, use expand() or expand_to_level() methods.
The request remains available for further operations (expand, collapse, etc.).
Count patients by gender:
>>> from xplain import Xsession
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>>
>>> df = session.open_attribute(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender"
... )
>>> print(df)
  Gender  Count
0      M  25431
1      F  23892
Analyze admission types:
>>> admissions_df = session.open_attribute(
...     object_name="Admissions",
...     dimension_name="AdmissionType",
...     attribute_name="Type"
... )
>>> print(admissions_df)
        Type  Count
0  EMERGENCY  45123
1   ELECTIVE  12456
2     URGENT   8901
Explore ICD diagnosis categories:
>>> diagnoses = session.open_attribute(
...     object_name="Diagnoses",
...     dimension_name="ICD_Code",
...     attribute_name="Category",
...     request_name="diag_category_counts"
... )
Return raw JSON for custom processing:
>>> result = session.open_attribute(
...     object_name="LabEvents",
...     dimension_name="ItemID",
...     attribute_name="TestName",
...     data_frame=False
... )
count_attribute : Auto-resolve object/dimension and count attribute
execute_query : Execute full query with multiple aggregations
expand : Expand attribute hierarchy to show child nodes
expand_to_level : Expand hierarchy to a specific depth level
- open_query(query, data_frame=True)
Perform the query and keep it open. The result of this query is affected by further modifications of the current session, such as selection changes.
- Parameters
query – either xplain.Query instance or JSON
data_frame – if True, the result will be returned as DataFrame
- Returns
result of given query
- Return type
JSON or DataFrame, depending on the data_frame parameter
- open_sequence(target_object, base_object, ranks, reverse, names, name_postfixes, dimensions_2_replicate, sort_dimension, zero_point_dimension, selections, selection_set_definition_rank, floating_semantics, attribute_2_copy, sequence_name, rank_dimension_name, rank_zero_is_first_instance_equal_or_greater_zero_point, transition_attribute, transition_level, open_marginal_queries, open_transition_queries, selection_set)
- perform(payload)
Send POST request against entry point /xplainsession with payload as json
- Parameters
payload (json) – content of the Xplain web API call
- Returns
request response
- Return type
json
- Example
>>> session.perform({"method": "deleteRequest", "requestName":"abcd"})
- post(payload)
Send POST request against entry point /xplainsession with payload as json
- Parameters
payload – xplain web api in json
- Returns
request response as JSON
- post_and_broadcast(payload)
Send a POST request and notify the backend of session updates.
- Parameters
payload – JSON payload for the API request.
- post_file_download(file_name, file_type, ownership='PUBLIC', team=None, user=None, delete_after_download=True)
Triggers the flat table download functionality in XOE.
- Parameters
file_name – Name of the file to be downloaded.
file_type – Type of the file.
ownership – Ownership type, defaults to “PUBLIC”.
team – Team identifier, optional.
user – User identifier, optional.
delete_after_download – Whether to delete the file after download, defaults to True.
- Returns
HTTP response object or raises exception on failure.
- print_error()
Print the last error message.
- print_last_stack_trace()
Print the stack trace of the last error.
- query_builder(name=None)
Start building a query using the fluent QueryBuilder API.
This is an alternative to Query_config that lets you chain aggregate, groupby, and selection calls and then finalise with execute() or open().
When only attribute is supplied to groupby/selection, the builder searches the session object tree and resolves the matching object and dimension automatically.
- Parameters
name (str, optional) – A label for the query used as its requestName. Defaults to a random UUID.
- Returns
A new query builder bound to this session.
- Return type
QueryBuilder
Example:
df = (
    session.query_builder(name="lab_counts")
    .aggregate(object="Lab Events", type="COUNT")
    .groupby(attribute="TestType")
    .selection(attribute="Date", selected_states=["2024-01"])
    .execute()
)
- read_file(ownership, file_type, file_path)
Reads the specified file.
- Parameters
ownership – Ownership type.
file_type – File type.
file_path – Path of the file.
- Returns
File content or raises exception on failure.
- refresh()
synchronize the session content with the backend
- resume_analysis(file_name)
resume the stored session
- Parameters
file_name (string) – name of stored session file
- Returns
False (fail) or True (success)
- Return type
Boolean
- run(method)
Perform an Xplain web API method and broadcast the change to other clients sharing the same session ID.
- Parameters
method (json) – xplain web api method in json format
- run_py(file_name, options, ownership)
Executes a Python script file on the server.
- Parameters
file_name – Name of the Python file.
options – Execution options.
ownership – File ownership type.
- Returns
Parsed JSON result or raw content.
- Raises
RuntimeError – If the request fails.
- run_statsmodels(df, formula, model_type='logit')
Fit a statistical model to the provided dataframe using the specified formula and model type.
- Parameters
df (pandas.DataFrame) – The input dataframe containing the data.
formula (str) – A Patsy-compatible formula specifying the dependent and independent variables.
model_type (str) – The type of model to fit. Supported options are ‘logit’, ‘probit’, ‘ols’, ‘mnlogit’, ‘glm’, ‘poisson’, ‘negative_binomial’. Default is ‘logit’.
- Returns
statsmodels.regression.linear_model.OLSResults or statsmodels.discrete.discrete_model.LogitResults or other statsmodels result object depending on the model_type.
- Raises
ValueError – If the model_type is unsupported or if the dependent variable is not appropriate for the chosen model (e.g., non-binary dependent variable for logit/probit).
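The two ValueError conditions above can be pictured as a pre-flight check on the inputs. The following is a minimal sketch, not the package's implementation: `check_model_inputs`, `SUPPORTED_MODEL_TYPES`, and `BINARY_MODEL_TYPES` are hypothetical names, and treating "binary" as "exactly two distinct values" is an assumption.

```python
# Hypothetical pre-flight check; not part of the xplain package.
SUPPORTED_MODEL_TYPES = {
    "logit", "probit", "ols", "mnlogit", "glm", "poisson", "negative_binomial",
}
BINARY_MODEL_TYPES = {"logit", "probit"}


def check_model_inputs(dependent_values, model_type="logit"):
    """Raise ValueError for an unsupported model type or a non-binary target."""
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise ValueError(f"Unsupported model_type: {model_type!r}")
    if model_type in BINARY_MODEL_TYPES:
        # Assumption: "binary" means exactly two distinct values.
        distinct = set(dependent_values)
        if len(distinct) != 2:
            raise ValueError(
                f"{model_type} needs a binary dependent variable, "
                f"found {len(distinct)} distinct values"
            )
    return model_type
```

Running such a check on the dependent column before fitting gives an early, readable error instead of a failure deep inside the model fit.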
- property session
Returns the underlying requests.Session object. This allows external code to reuse the authenticated session.
- set_default_broadcast(broadcast)
Set the default broadcast behaviour so that other Xplain clients sharing the same Xplain session are informed when the current session is updated.
- Parameters
broadcast (bool) – Whether, after a successful session update via a Python call, a default refresh signal is broadcast to all Xplain clients sharing the same session, forcing them to refresh.
- show_tree()
Show the object tree.
- Returns
The object hierarchy rendered as a tree.
- Return type
str
- Raises
RuntimeError – if the session is not properly initialized.
Exception – if an unexpected error occurs.
- show_tree_details()
Display the details of the object tree.
- startup(startup_file)
Load an Xplain session from a startup configuration file.
Initializes the session’s object structure, dimensions, and default settings from a saved .xstartup configuration file. The file extension is optional and will be added automatically if not provided.
- Parameters
startup_file (str) – Name of the startup configuration file. The .xstartup extension is optional and will be appended automatically if missing.
- Raises
RuntimeError – If the startup file cannot be found or loaded, or if the file contains invalid configuration.
- Startup files define the initial object tree structure, including:
Objects and their hierarchies (parent-child relationships)
Dimensions attached to each object
Attributes within dimensions
Default selections and filters
Loading a startup file replaces any existing session state.
After loading, the session is ready for query execution without additional configuration.
Load a MIMIC-IV patient cohort:
>>> from xplain import Xsession
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV_Patients")  # .xstartup extension added automatically
Load different configurations sequentially:
>>> session = Xsession(url="http://localhost:8080", user="researcher", password="pass")
>>> session.startup("ICU_Admissions.xstartup")
>>> # ... perform analysis ...
>>> session.startup("Lab_Events")  # Switch to different configuration
Check loaded structure:
>>> session.startup("MIMIC_Cohort")
>>> session.show_tree()  # Display the loaded object hierarchy
See also:
startup_from_xview_config : Load session from an XView configuration object
show_tree : Display the current object structure
get_session : Get the current session information
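The optional-extension rule described for startup() can be sketched in a few lines. This is an illustration only: `with_xstartup_extension` is a hypothetical helper, and the case-sensitive suffix check is an assumption rather than documented behaviour.

```python
def with_xstartup_extension(startup_file):
    """Return the file name with '.xstartup' appended if it is missing."""
    suffix = ".xstartup"
    # Assumption: the suffix comparison is case-sensitive.
    if startup_file.endswith(suffix):
        return startup_file
    return startup_file + suffix


print(with_xstartup_extension("MIMIC_IV_Patients"))
print(with_xstartup_extension("ICU_Admissions.xstartup"))
```

Both call styles from the examples above therefore resolve to a name ending in .xstartup.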
- startup_from_xview_config(xview_config)
Load an Xplain session from the given view configuration JSON.
- Parameters
xview_config – The view configuration in JSON format.
- store_xsession(response_json)
Store session details from the response.
- Parameters
response_json – Response parsed as JSON.
- terminate()
Terminate the Xplain session and logout from the server.
Closes the current session, invalidating the session ID and releasing server resources. After termination, the session cannot be reused and a new session must be created.
All pending queries and results are lost after termination.
The session ID becomes invalid and cannot be loaded again.
It is good practice to terminate sessions explicitly when done to free server resources, especially in long-running applications.
Automatic session cleanup occurs on server timeout if not explicitly terminated.
Basic session lifecycle:
>>> from xplain import Xsession
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>> # ... perform analysis ...
>>> session.terminate()
Using context manager (recommended):
>>> with Xsession(url="http://localhost:8080", user="admin", password="admin") as session:
...     session.startup("Analysis")
...     df = session.execute_query(query)
...     # Session automatically terminated when exiting context
Multiple sessions:
>>> session1 = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session2 = Xsession(url="http://other:8080", user="admin", password="admin")
>>> session1.startup("Dataset_A")
>>> session2.startup("Dataset_B")
>>> # ... work with both sessions ...
>>> session1.terminate()
>>> session2.terminate()
See also:
__init__ : Create a new session
get_session_id : Get the current session identifier
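The point of the context-manager pattern is that cleanup runs even when the analysis raises. The sketch below demonstrates that guarantee with a stand-in object so it runs without a server; `StubSession` and `terminating` are hypothetical names and not part of the xplain package.

```python
from contextlib import contextmanager


class StubSession:
    """Stand-in for a session; only records whether terminate() ran."""

    def __init__(self):
        self.terminated = False

    def terminate(self):
        self.terminated = True


@contextmanager
def terminating(session):
    """Yield the session and always call terminate() on exit."""
    try:
        yield session
    finally:
        session.terminate()


stub = StubSession()
try:
    with terminating(stub):
        raise RuntimeError("analysis failed")
except RuntimeError:
    pass  # the error still propagates, but cleanup has already happened

print(stub.terminated)
```

The same try/finally shape is what makes the explicit-terminate lifecycle safe when the context manager cannot be used.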
- upload_data(file_name)
Upload a file from the current local directory to the data directory on the server.
- Parameters
file_name (str) – Name of the file to upload.
- upload_xmodel(model_or_path, filename=None, ownership='PUBLIC')
Upload an .xmodel configuration file to the server’s public model store.
The file is stored in the server directory resolved by file-type XMODEL_CONFIG (config/models/). Once uploaded it can be referenced by name in buildModel / crossValidateModel payloads:
{"method": "buildModel", "xmodelConfigurationFileName": "My_Model.xmodel", ...}
- Parameters
model_or_path – Either an XModel instance or a local file-system path (str) to an existing .xmodel file.
filename (str) – Name to use on the server (e.g. "My_Model.xmodel"). Defaults to <model.name>.xmodel when an XModel is passed, or to the basename of the path otherwise. .xmodel is appended automatically if omitted.
ownership (str) – File-store ownership scope. One of "PUBLIC" (shared by all users, default), "TEAM", "USER", or "SYSTEM".
- Raises
RuntimeError – if the HTTP upload fails or the server returns an error response.
- Return type
None
Example:
from xplain.xmodel import XModel, IndependentVariableSet, AutoSpaceDefinition

model = XModel(
    name="Failure_Model",
    predictive_model_object="actuator",
    independent_variable_sets=[
        IndependentVariableSet(
            predictive_model_object="actuator",
            auto_space_definitions=[
                AutoSpaceDefinition("screwing station", ["Result"]),
            ],
        )
    ],
)
xsession.upload_xmodel(model)
# → uploaded as "Failure_Model.xmodel" to config/models/ (PUBLIC)
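The filename defaults described for upload_xmodel can be expressed as a small resolution function. This is a sketch of the documented rules only, not the package's code: `resolve_xmodel_filename` is a hypothetical helper, and a `SimpleNamespace` with a `name` attribute stands in for an XModel instance.

```python
import os
from types import SimpleNamespace


def resolve_xmodel_filename(model_or_path, filename=None):
    """Work out the server-side file name following the documented defaults."""
    if filename is None:
        if isinstance(model_or_path, str):
            # A local path: fall back to its basename.
            filename = os.path.basename(model_or_path)
        else:
            # Assumed to be a model object exposing a .name attribute.
            filename = model_or_path.name
    # The extension is appended when it was omitted.
    if not filename.endswith(".xmodel"):
        filename += ".xmodel"
    return filename


model = SimpleNamespace(name="Failure_Model")  # stand-in for an XModel
print(resolve_xmodel_filename(model))
print(resolve_xmodel_filename("models/churn.xmodel"))
print(resolve_xmodel_filename(model, filename="My_Model"))
```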
- validate_db(db_connection_config)
Validates a database connection configuration.
- Parameters
db_connection_config – Dictionary containing DB connection settings.
- Raises
RuntimeError – If validation fails or an error occurs.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
| url | str | URL of the Xplain server (default: 'http://localhost:8080') |
| user | str | Username for authentication (default: 'user') |
| password | str | Password for authentication (default: 'xplainData') |
| httpsession | requests.Session | Existing requests session object (optional) |
| http_session_id | str | Existing HTTP session ID / JSESSIONID (optional) |
| jwt_dispatch_url | str | JWT authentication endpoint URL (optional) |
| jwt_cookie_name | str | Cookie name for JWT token (optional) |
| jwt_token | str | JWT token value (optional) |
Note
Authentication Methods:
Password authentication: Provide user and password
JWT authentication: Provide all three: jwt_dispatch_url, jwt_cookie_name, jwt_token
Recommended: Use create_session() to load credentials from config file or environment variables
See Authentication & Credential Management for credential management best practices.
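The rule that JWT authentication needs all three JWT arguments can be checked before a session is constructed. The following is a hedged sketch: `select_auth_mode` is a hypothetical helper, and giving JWT precedence over user/password when both are supplied is an assumption, not documented behaviour.

```python
def select_auth_mode(user=None, password=None, jwt_dispatch_url=None,
                     jwt_cookie_name=None, jwt_token=None):
    """Return "jwt" or "password" depending on which arguments are set."""
    jwt_fields = {
        "jwt_dispatch_url": jwt_dispatch_url,
        "jwt_cookie_name": jwt_cookie_name,
        "jwt_token": jwt_token,
    }
    provided = [name for name, value in jwt_fields.items() if value is not None]
    if provided:
        # JWT needs all three values; report the ones that are missing.
        missing = [name for name in jwt_fields if name not in provided]
        if missing:
            raise ValueError("JWT authentication also needs: " + ", ".join(missing))
        return "jwt"  # assumption: JWT takes precedence over user/password
    if user is not None and password is not None:
        return "password"
    raise ValueError("Provide user and password, or all three JWT arguments")
```

A partially specified JWT configuration is rejected with the names of the missing arguments, which is usually easier to act on than a failed login.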
Session Management Methods:
| Method | Description |
|---|---|
| startup() | Load a session from a startup configuration file |
| startup_from_xview_config() | Load a session from an XView configuration |
|  | Connect to an existing session by its 32-character ID |
| get_session_id() | Get the current session ID |
| terminate() | Terminate the session and logout |
| refresh() | Synchronize local session state with the server |
| set_default_broadcast() | Enable/disable broadcasting updates to other clients |
Query Methods:
| Method | Description |
|---|---|
| query_builder() | Start a fluent |
| execute_query() | Execute a query (Query_config or JSON) and return results |
|  | Execute a query and keep it open (results update with selections) |
|  | Open an attribute grouped by first level, aggregated by count |
|  | Get results of an existing query by name |
|  | List IDs of all open queries |
|  | Convert JSON result to pandas DataFrame |
Object Tree Methods:
| Method | Description |
|---|---|
| show_tree() | Print the object hierarchy as a text tree |
| show_tree_details() | Display detailed object tree as JSON |
|  | Render interactive tree in Jupyter using pyecharts |
|  | Get metadata for object/dimension/attribute |
|  | Get detailed JSON info for an object |
|  | Get detailed JSON info for a dimension |
|  | Get detailed JSON info for an attribute |
|  | Get the root XObject instance [Beta] |
| get_xobject() | Get an XObject by name [Beta] |
|  | Get nested dict of all objects, dimensions, and attributes |
Selection Methods:
| Method | Description |
|---|---|
|  | Get all global selections in the session |
|  | Download selections for specific objects |
| get_state_hierarchy() | Get the hierarchical state structure for an attribute |
Data Export Methods:
| Method | Description |
|---|---|
|  | Export instance data as a pandas DataFrame (CSV download) |
|  | Download a file from the server result directory |
| upload_data() | Upload a local file to the server data directory |
Predictive Modeling Methods:
| Method | Description |
|---|---|
|  | Build a predictive model [Beta] |
|  | List all loaded predictive models |
|  | Get independent variable names for a model |
|  | Get detailed independent variable info |
|  | Get variable details as DataFrame or JSON |
Statistical Modeling Methods:
| Method | Description |
|---|---|
| run_statsmodels() | Fit a statistical model (logit, probit, ols, mnlogit, glm, poisson, negative_binomial) |
|  | Build an R-style formula string |
|  | Create a cross-tabulation table |
File Management Methods:
| Method | Description |
|---|---|
|  | List files of a given type and ownership |
| read_file() | Read a file from the server |
| run_py() | Execute a Python script on the server |
|  | List available xanalysis configurations |
|  | Load an xanalysis (startup + saved state) |
| resume_analysis() | Resume a stored analysis session |
Low-Level API Methods:
| Method | Description |
|---|---|
| run() | Execute an Xplain Web API method and broadcast changes |
|  | Send POST to /xplainsession and return JSON response |
|  | Send raw POST to /xplainsession |
|  | Send GET to /xplainsession |
|  | Generic HTTP GET to any endpoint |
|  | Generic HTTP POST to any endpoint |
|  | Get an Api instance for advanced operations |
|  | Get an Importer instance for data import operations |
xplain.XObject
- class xplain.XObject(object_name, ref_session)
Bases: object
Represents an Xplain data object with navigation and dimension manipulation.
XObjects are the fundamental building blocks of the Xplain object model, representing entities in your data domain (e.g., Patients, Admissions, Lab Events). Each XObject contains:
Child objects: Hierarchical relationships to other objects
Dimensions: Measurable or categorical properties of the object
Aggregations: Computed values derived from child object data
This class provides methods to explore the object structure, retrieve dimensions and child objects, and dynamically add aggregation dimensions that compute summary statistics from related data.
- Parameters
object_name (str) – The name of the XObject in the current session.
ref_session (Xsession) – Reference to the active Xsession for API interactions.
- Attributes
object_name (str) – The name of the XObject.
_ref_session (Xsession) – The session object used for API interactions.
- Raises
TypeError – If object_name is not a string.
XObjects are retrieved via Xsession.get_xobject(object_name).
The object tree structure is defined by the loaded startup configuration or XView.
Aggregation dimensions enable analysis across object hierarchies without manual joins.
Get an XObject and explore its structure:
>>> from xplain import Xsession
>>> session = Xsession(url="http://localhost:8080", user="admin", password="admin")
>>> session.startup("MIMIC_IV")
>>>
>>> patients = session.get_xobject("Patients")
>>> print(patients.get_name())
Patients
>>> print(patients.get_dimensions())
['PatientID', 'Age', 'Gender', 'Ethnicity']
>>> print(patients.get_child_objects())
['Admissions', 'Diagnoses', 'LabEvents']
Navigate child objects:
>>> admissions = session.get_xobject("Admissions")
>>> print(admissions.get_dimensions())
['AdmissionID', 'AdmitDate', 'DischargeDate', 'LOS', 'AdmissionType']
Add aggregation dimension:
>>> # Add average length of stay to Patients object
>>> patients.add_aggregation_dimension(
...     dimension_name="AvgLOS",
...     aggregation={
...         "object": "Admissions",
...         "dimension": "LOS",
...         "type": "AVG"
...     }
... )
See also:
Xsession.get_xobject : Retrieve an XObject by name
Dimension : Represents a dimension within an XObject
Attribute : Represents an attribute within a dimension
- __init__(object_name, ref_session)
Initialize an XObject instance.
- Parameters
object_name (str) – The name of the XObject.
ref_session – A session object for making API calls.
- add_aggregation_dimension(dimension_name, aggregation, selections=None, floating_semantics=False)
Add an aggregation dimension that computes values from child object data.
Aggregation dimensions enable computing summary statistics from related objects without writing explicit joins. For example, add “AvgLOS” to a Patients object by averaging the Length-of-Stay dimension from the child Admissions object.
The aggregation dimension becomes part of the object’s schema and can be used in queries, groupings, and further aggregations.
- Parameters
dimension_name (str) – Name for the new aggregation dimension. Must be unique within the object.
aggregation (dict) – Aggregation specification defining what to compute. Required keys:
"object" (str): Name of the child object to aggregate from
"dimension" (str): Name of the dimension to aggregate
"type" (str): Aggregation type (COUNT, SUM, AVG, MIN, MAX, etc.)
Example:
{ "object": "Admissions", "dimension": "LOS", "type": "AVG" }
selections (list of dict, optional) – Filters to apply before aggregation. Each selection is a dict with:
"attribute" (dict): Object/dimension/attribute to filter on
"selectedStates" (list): Values to include
Useful for conditional aggregations (e.g., “count only ICU admissions”).
floating_semantics (bool, default=False) – If True, the dimension uses floating semantics, meaning it updates dynamically based on current selections. If False (default), the dimension is computed once and remains static.
- Returns
Server response confirming the dimension was added.
- Raises
ValueError – If dimension_name is empty, aggregation is not a dict, or selections is not a list.
RuntimeError – If the API call to add the dimension fails.
Aggregation dimensions are computed server-side and cached for performance.
They appear in the object’s dimension list immediately after creation.
Floating semantics dimensions recalculate when selections change, enabling dynamic “what-if” analysis.
The aggregation can reference any descendant object, not just direct children.
Add average length of stay to Patients:
>>> session.startup("MIMIC_IV")
>>> patients = session.get_xobject("Patients")
>>> patients.add_aggregation_dimension(
...     dimension_name="AvgLOS",
...     aggregation={
...         "object": "Admissions",
...         "dimension": "LOS",
...         "type": "AVG"
...     }
... )
Count total admissions per patient:
>>> patients.add_aggregation_dimension(
...     dimension_name="TotalAdmissions",
...     aggregation={
...         "object": "Admissions",
...         "dimension": "AdmissionID",
...         "type": "COUNT"
...     }
... )
Conditional aggregation - ICU admissions only:
>>> patients.add_aggregation_dimension(
...     dimension_name="ICU_AdmissionCount",
...     aggregation={
...         "object": "Admissions",
...         "dimension": "AdmissionID",
...         "type": "COUNT"
...     },
...     selections=[{
...         "attribute": {
...             "object": "Admissions",
...             "dimension": "ICU_Stay",
...             "attribute": "ICU_Flag"
...         },
...         "selectedStates": ["Yes"]
...     }]
... )
Average creatinine level (multi-level aggregation):
>>> patients.add_aggregation_dimension(
...     dimension_name="AvgCreatinine",
...     aggregation={
...         "object": "LabEvents",  # Grandchild of Patients
...         "dimension": "Creatinine",
...         "type": "AVG"
...     }
... )
Floating semantics for dynamic analysis:
>>> patients.add_aggregation_dimension(
...     dimension_name="SelectedAdmissionCount",
...     aggregation={
...         "object": "Admissions",
...         "dimension": "AdmissionID",
...         "type": "COUNT"
...     },
...     floating_semantics=True  # Updates when selections change
... )
See also:
get_dimensions : List all dimensions including aggregations
Xsession.execute_query : Use aggregation dimensions in queries
Query_config.add_aggregation : Alternative aggregation method
- Return type
dict
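Because the aggregation and selections arguments are plain dicts, it can help to assemble them with small helpers rather than by hand. The sketch below only builds the dict shapes shown in the examples above; `build_aggregation` and `build_selection` are hypothetical helpers and not part of the xplain package, and the non-empty-string check is an assumption.

```python
def build_aggregation(object_name, dimension_name, agg_type):
    """Return an aggregation dict with the three documented keys."""
    fields = (("object", object_name), ("dimension", dimension_name), ("type", agg_type))
    for label, value in fields:
        # Assumption: every key must be a non-empty string.
        if not isinstance(value, str) or not value:
            raise ValueError(f"{label} must be a non-empty string")
    return {"object": object_name, "dimension": dimension_name, "type": agg_type}


def build_selection(object_name, dimension_name, attribute_name, selected_states):
    """Return one selection dict in the shape used by the selections argument."""
    return {
        "attribute": {
            "object": object_name,
            "dimension": dimension_name,
            "attribute": attribute_name,
        },
        "selectedStates": list(selected_states),
    }


icu_count = build_aggregation("Admissions", "AdmissionID", "COUNT")
icu_only = [build_selection("Admissions", "ICU_Stay", "ICU_Flag", ["Yes"])]
print(icu_count)
print(icu_only)
```

The two results could then be passed as the aggregation and selections arguments of add_aggregation_dimension.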
- get_child_objects()
Retrieve the names of all child objects in the hierarchy.
Child objects represent one-to-many relationships from the current object. For example, a “Patients” object might have “Admissions” and “Diagnoses” as child objects.
- Returns
Names of all child objects (list of str). Returns an empty list if the object has no children.
- Raises
KeyError – If the response from the server is missing expected keys.
RuntimeError – If the API call to fetch object details fails.
Child objects are defined in the startup configuration or XView.
The parent-child relationship enables aggregation dimensions that compute statistics across the hierarchy.
This method returns names only; use session.get_xobject(name) to get the actual child XObject instances.
Explore object hierarchy:
>>> session.startup("MIMIC_IV")
>>> patients = session.get_xobject("Patients")
>>> children = patients.get_child_objects()
>>> print(children)
['Admissions', 'Diagnoses', 'LabEvents', 'Prescriptions']
Navigate to child objects:
>>> for child_name in patients.get_child_objects():
...     child_obj = session.get_xobject(child_name)
...     print(f"{child_name}: {child_obj.get_dimensions()}")
Admissions: ['AdmissionID', 'AdmitDate', 'LOS']
Diagnoses: ['DiagnosisID', 'ICD_Code', 'DiagnosisDate']
...
See also:
get_dimensions : Get dimensions of the current object
Xsession.get_xobject : Retrieve a child object instance
- Return type
list
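Since get_child_objects() returns names only, visiting a whole hierarchy means repeating the lookup for each child. A minimal depth-first sketch follows; `walk_objects` is a hypothetical helper, and a plain dict stands in for the server so the example runs offline.

```python
def walk_objects(root_name, get_children, depth=0):
    """Yield (depth, name) pairs depth-first, starting at root_name."""
    yield depth, root_name
    for child in get_children(root_name):
        yield from walk_objects(child, get_children, depth + 1)


# Stand-in hierarchy. With a live session, get_children could instead be
# lambda name: session.get_xobject(name).get_child_objects()
tree = {
    "Patients": ["Admissions", "Diagnoses"],
    "Admissions": ["LabEvents"],
    "Diagnoses": [],
    "LabEvents": [],
}
listing = list(walk_objects("Patients", lambda name: tree.get(name, [])))
for depth, name in listing:
    print("  " * depth + name)
```

Passing the lookup as a callable keeps the traversal independent of where the child names come from.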
- get_dimension(dimension_name)
Retrieve a specific dimension by its name.
- Parameters
dimension_name (str) – The name of the dimension to retrieve.
- Returns
The name of the dimension if found, otherwise None.
- Return type
str
- Raises
ValueError – If the dimension name is invalid.
RuntimeError – If fetching dimensions fails.
- get_dimensions()
Retrieve the names of all dimensions attached to this object.
Dimensions represent properties or measurements of the object. They can be:
Stored dimensions: Values imported from source data
Aggregation dimensions: Computed from child object data
Derived dimensions: Calculated from other dimensions
- Returns
Names of all dimensions attached to the object (list of str). Returns an empty list if the object has no dimensions.
- Raises
KeyError – If the response from the server is missing expected keys.
RuntimeError – If the API call to fetch object details fails.
Dimensions are defined in the object’s configuration or added dynamically.
Each dimension can have one or more attributes that categorize its values.
To retrieve dimension objects (not just names), iterate and call session.get_dimension(object_name, dimension_name).
List all dimensions of an object:
>>> session.startup("MIMIC_IV")
>>> patients = session.get_xobject("Patients")
>>> dimensions = patients.get_dimensions()
>>> print(dimensions)
['PatientID', 'Age', 'Gender', 'Ethnicity', 'DOB', 'AdmissionCount']
Explore dimension details:
>>> for dim_name in patients.get_dimensions():
...     print(f"Dimension: {dim_name}")
Dimension: PatientID
Dimension: Age
Dimension: Gender
...
Filter for specific dimensions (only names present on the object are kept):
>>> numeric_dims = [d for d in patients.get_dimensions()
...                 if d in ['Age', 'Weight', 'Height']]
>>> print(numeric_dims)
['Age']
See also:
get_dimension : Retrieve a specific dimension by name
get_child_objects : Get child objects in the hierarchy
add_aggregation_dimension : Add a computed dimension
- Return type
list
- get_name()
Return the name of the Xobject.
- Return type
str
Methods:
| Method | Description |
|---|---|
| get_name() | Return the name of the XObject |
| get_child_objects() | Get list of child object names |
| get_dimensions() | Get list of dimension names |
| get_dimension() | Get a specific dimension by name |
| add_aggregation_dimension() | Add a computed aggregation dimension |
xplain.Dimension
- class xplain.Dimension(object_name, dimension_name, ref_session)
Bases: object
Represents a dimension within an Xplain object.
Dimensions are properties or measurements associated with objects in the Xplain data model. They can be numeric (e.g., Age, Temperature) or categorical (e.g., Gender, Diagnosis Code). Each dimension has one or more attributes that organize its values into hierarchies or categories.
Dimensions are the fundamental units of analysis in Xplain:
Aggregations compute statistics on dimensions (COUNT, AVG, SUM)
Attributes categorize dimension values for grouping
Selections filter data based on attribute states
- Parameters
object_name (str) – Name of the parent object containing this dimension.
dimension_name (str) – Name of the dimension.
ref_session (Xsession) – Reference to the active session for API interactions.
- Attributes
object_name (str) – Name of the associated object.
dimension_name (str) – Name of the dimension.
_ref_session (Xsession) – Reference to the session object for API interaction.
- Raises
TypeError – If object_name or dimension_name is not a string.
Dimensions are accessed via session.get_dimension(object, dimension) or through XObject.get_dimensions().
Each dimension has at least one default attribute (often named the same as the dimension itself).
Hierarchical dimensions have multi-level attributes (e.g., Date → Year → Month → Day).
Get a dimension and explore its attributes:
>>> session.startup("MIMIC_IV")
>>> age_dim = session.get_dimension("Patients", "Age")
>>> print(age_dim.get_name())
Age
>>> attributes = age_dim.get_attributes()
>>> for attr in attributes:
...     print(attr.get_name())
Age
AgeGroup
AgeDecade
Access a specific attribute (get_levels() returns hierarchy level names, not the states):
>>> age_group_attr = age_dim.get_attribute("AgeGroup")
>>> if age_group_attr:
...     levels = age_group_attr.get_levels()
...     print(levels)
['AgeGroup']
See also:
XObject : Parent object containing dimensions
Attribute : Categorization of dimension values
Xsession.get_dimension : Retrieve a dimension by name
- __init__(object_name, dimension_name, ref_session)
Initialize the Dimension instance.
- Parameters
object_name (str) – Name of the object.
dimension_name (str) – Name of the dimension.
ref_session – Session object for API calls.
- get_attribute(attribute_name)
Retrieve a specific attribute by name.
Searches the dimension’s attributes and returns the matching Attribute object if found. Useful for accessing hierarchical attributes or specific categorizations.
- Parameters
attribute_name (str) – The name of the attribute to retrieve. Case-sensitive.
- Returns
The matching Attribute object, or None if no attribute with the given name exists in this dimension.
- Return type
Attribute or None
- Raises
ValueError – If attribute_name is empty or not a string.
KeyError – If the server response is missing expected keys.
RuntimeError – If the API call to fetch dimension details fails.
Returns None (not an exception) if the attribute doesn’t exist, allowing safe existence checks.
Attribute names are case-sensitive and must match exactly.
The default attribute typically has the same name as the dimension.
Check if an attribute exists:
>>> age_dim = session.get_dimension("Patients", "Age")
>>> age_group = age_dim.get_attribute("AgeGroup")
>>> if age_group:
...     print(f"Found attribute: {age_group.get_name()}")
... else:
...     print("Attribute not found")
Found attribute: AgeGroup
Get hierarchy levels:
>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> year_month = date_dim.get_attribute("YearMonth")
>>> if year_month:
...     levels = year_month.get_levels()
...     print(f"Hierarchy levels: {levels}")
Hierarchy levels: ['Year', 'Month']
Safe attribute access:
>>> gender_dim = session.get_dimension("Patients", "Gender")
>>> custom_attr = gender_dim.get_attribute("CustomGrouping")
>>> if custom_attr is None:
...     print("Custom attribute doesn't exist, using default")
...     custom_attr = gender_dim.get_attribute("Gender")
See also:
get_attributes : Retrieve all attributes of the dimension
Attribute.get_levels : Get hierarchy levels of an attribute
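Because a missing attribute yields None rather than an exception, a fallback chain is easy to write. The sketch below generalises the safe-access example to a list of candidate names; `first_available_attribute` is a hypothetical helper, and `dict.get` stands in for a dimension's get_attribute so the example runs without a server.

```python
def first_available_attribute(get_attribute, candidate_names):
    """Return the first attribute that exists, or None if none of them do."""
    for name in candidate_names:
        attribute = get_attribute(name)
        # Compare against None explicitly instead of relying on truthiness.
        if attribute is not None:
            return attribute
    return None


# Stand-in lookup; with a live session this could be gender_dim.get_attribute
known = {"Gender": "<Gender attribute>"}
chosen = first_available_attribute(known.get, ["CustomGrouping", "Gender"])
print(chosen)
```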
- get_attributes()
Retrieve all attributes attached to this dimension.
Attributes organize dimension values into categories or hierarchies. A dimension typically has at least one default attribute, and may have additional custom or hierarchical attributes.
- Returns
List of Attribute objects representing all attributes of the dimension (list of Attribute). Returns an empty list if the dimension has no attributes defined.
- Raises
KeyError – If the server response is missing expected keys.
RuntimeError – If the API call to fetch dimension details fails.
Each returned Attribute is a fully instantiated object that can be used to explore hierarchy levels and states.
Attributes enable grouping in queries via add_groupby() and filtering via add_selection().
The default attribute usually has the same name as the dimension.
List all attributes of a dimension:
>>> session.startup("MIMIC_IV")
>>> gender_dim = session.get_dimension("Patients", "Gender")
>>> attributes = gender_dim.get_attributes()
>>> for attr in attributes:
...     print(attr.get_name())
Gender
Explore hierarchical attributes:
>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> attributes = date_dim.get_attributes()
>>> for attr in attributes:
...     print(f"{attr.get_name()}: {attr.get_levels()}")
AdmitDate: ['Date']
YearMonth: ['Year', 'Month']
Quarter: ['Year', 'Quarter']
Use attributes in queries:
>>> from xplain import Query_config
>>> age_dim = session.get_dimension("Patients", "Age")
>>> age_group = age_dim.get_attribute("AgeGroup")
>>> query = Query_config()
>>> query.add_groupby(
...     object_name="Patients",
...     dimension_name="Age",
...     attribute_name="AgeGroup"  # From get_attributes()
... )
See also:
get_attribute : Retrieve a specific attribute by name
Attribute : Detailed attribute information and hierarchy navigation
- Return type
list
- get_name()
Returns the dimension name.
- Return type
str
Methods:
| Method | Description |
|---|---|
| get_name() | Return the dimension name |
| get_attributes() | Get list of Attribute instances |
| get_attribute() | Get a specific Attribute by name |
xplain.Attribute
- class xplain.Attribute(object_name, dimension_name, attribute_name, ref_session)
Bases: object
Represents an attribute that categorizes dimension values.
Attributes organize dimension values into hierarchies or categories, enabling grouping and filtering in queries. For example, a continuous “Age” dimension might have an “AgeGroup” attribute with categories like “0-18”, “18-65”, “65+”.
Attributes can be:
Flat: Single-level categorization (e.g., Gender: Male/Female)
Hierarchical: Multi-level trees (e.g., Date → Year → Month → Day)
Derived: Computed from dimension values (e.g., age bins, quantiles)
- Parameters
object_name (str) – Name of the object containing the dimension.
dimension_name (str) – Name of the dimension containing this attribute.
attribute_name (str) – Name of the attribute.
ref_session (Xsession) – Reference to the active session for API interactions.
- Attributes
object_name (str) – The parent object name.
dimension_name (str) – The parent dimension name.
attribute_name (str) – The attribute name.
_ref_session (Xsession) – Session reference for API calls.
Attributes are accessed via Dimension.get_attribute(name) or Dimension.get_attributes().
Hierarchical attributes enable drill-down analysis (expand/collapse).
Attribute states (values) can be explored via get_state_hierarchy().
Get an attribute and explore its hierarchy:
>>> session.startup("MIMIC_IV")
>>> age_dim = session.get_dimension("Patients", "Age")
>>> age_group = age_dim.get_attribute("AgeGroup")
>>> print(age_group.get_name())
AgeGroup
>>> print(age_group.get_levels())
['AgeGroup']
Hierarchical attribute (Date):
>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> year_month = date_dim.get_attribute("YearMonth")
>>> levels = year_month.get_levels()
>>> print(levels)
['Year', 'Month']
Get state hierarchy:
>>> hierarchy = age_group.get_state_hierarchy()
>>> print(hierarchy)
{'stateName': 'All', 'children': [{'stateName': '0-18'}, {'stateName': '18-65'}, ...]}
See also:
Dimension : Parent dimension containing attributes
Dimension.get_attribute : Retrieve an attribute by name
- get_levels()
Retrieve the hierarchy level names of this attribute.
For hierarchical attributes, this returns the ordered list of levels from coarsest to finest granularity. For flat attributes, returns a single-element list.
- Returns
Ordered list of hierarchy level names (list of str). For example:
Flat attribute: ["Gender"]
Hierarchical: ["Year", "Quarter", "Month", "Week"]
- Raises
ValueError – If the attribute information cannot be retrieved or if ‘hierarchyLevelNames’ is missing from the response.
The first level is the coarsest (e.g., “Year”), the last is finest (e.g., “Day”).
Level names are used in Query_config.add_groupby() to specify which hierarchy level to group by.
For non-hierarchical attributes, the list contains only the attribute name itself.
Flat attribute (single level):
>>> session.startup("MIMIC_IV")
>>> gender_dim = session.get_dimension("Patients", "Gender")
>>> gender_attr = gender_dim.get_attribute("Gender")
>>> print(gender_attr.get_levels())
['Gender']
Hierarchical attribute (date/time):
>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> year_month = date_dim.get_attribute("YearMonth")
>>> print(year_month.get_levels())
['Year', 'Month']
ICD diagnosis hierarchy:
>>> icd_dim = session.get_dimension("Diagnoses", "ICD_Code")
>>> icd_hierarchy = icd_dim.get_attribute("ICD_Hierarchy")
>>> print(icd_hierarchy.get_levels())
['Chapter', 'Block', 'Category', 'Code']
Use levels in queries:
>>> from xplain import Query_config
>>> query = Query_config()
>>> query.add_groupby(
...     object_name="Admissions",
...     dimension_name="AdmitDate",
...     attribute_name="YearMonth",
...     groupby_level="Year"  # Group by year only
... )
See also:
get_state_hierarchy : Explore the actual values (states) in the hierarchy
Query_config.add_groupby : Use hierarchy levels in queries
- get_name()
Retrieve the name of the attribute.
- Returns
Attribute name as a string.
- get_root_state()
Get the root state (top-level category) of this attribute.
The root state represents the most aggregated level in the attribute hierarchy, typically named “All” or representing the total population.
- Returns
The name of the root state (e.g., “All”, “Total”, or a custom name).
- Return type
str
The root state encompasses all child states in the hierarchy.
For flat attributes, the root may be the only state or a summary category.
This is useful for understanding the top-level category before drilling down.
Get root state:
>>> session.startup("MIMIC_IV")
>>> gender_dim = session.get_dimension("Patients", "Gender")
>>> gender_attr = gender_dim.get_attribute("Gender")
>>> root = gender_attr.get_root_state()
>>> print(root)
All
Hierarchical attribute root:
>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> year_month = date_dim.get_attribute("YearMonth")
>>> root = year_month.get_root_state()
>>> print(root)
All
See also:
get_state_hierarchy : Get the full state hierarchy
get_levels : Get hierarchy level names
- get_state_hierarchy(state=None, levels=None)
Retrieve the state hierarchy showing all values and their structure.
Returns a tree structure of attribute states (values), showing parent-child relationships for hierarchical attributes. Useful for exploring available categories and understanding the attribute’s organization.
- statestr, optional
Specific state to retrieve the sub-hierarchy for. If None, returns the full hierarchy starting from the root.
- levelslist of str, optional
Specific hierarchy levels to include. If None, returns all levels.
- dict
Nested dictionary representing the state hierarchy. Structure:
{
    "stateName": "Root",
    "children": [
        {"stateName": "Child1", "children": [...]},
        {"stateName": "Child2", "children": [...]}
    ]
}
For flat attributes, the hierarchy is shallow with no nested children.
For hierarchical attributes, children represent progressively finer granularities.
States are the actual categorical values used in selections and groupings.
This method delegates to Xsession.get_state_hierarchy().
Get full hierarchy for a flat attribute:
>>> session.startup("MIMIC_IV")
>>> gender_dim = session.get_dimension("Patients", "Gender")
>>> gender_attr = gender_dim.get_attribute("Gender")
>>> hierarchy = gender_attr.get_state_hierarchy()
>>> print(hierarchy)
{
    'stateName': 'All',
    'children': [
        {'stateName': 'Male'},
        {'stateName': 'Female'},
        {'stateName': 'Unknown'}
    ]
}
Hierarchical attribute (date by year/month):
>>> date_dim = session.get_dimension("Admissions", "AdmitDate")
>>> year_month = date_dim.get_attribute("YearMonth")
>>> hierarchy = year_month.get_state_hierarchy()
>>> print(hierarchy)
{
    'stateName': 'All',
    'children': [
        {
            'stateName': '2020',
            'children': [
                {'stateName': '2020-01'},
                {'stateName': '2020-02'},
                ...
            ]
        },
        ...
    ]
}
Get sub-hierarchy for specific state:
>>> sub_hierarchy = year_month.get_state_hierarchy(state='2020')
>>> print(sub_hierarchy)
{
    'stateName': '2020',
    'children': [
        {'stateName': '2020-01'},
        {'stateName': '2020-02'},
        ...
    ]
}
See also:
get_levels : Get hierarchy level names
get_root_state : Get the root state name
Xsession.get_state_hierarchy : Underlying implementation
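Because the returned hierarchy is a plain nested dictionary, it can be traversed with ordinary Python. The sketch below collects the leaf states of such a dictionary. It assumes nodes follow the `stateName` / `children` layout shown above and that leaf nodes may omit the `children` key; `iter_leaf_states` is a hypothetical helper, not part of the xplain package, and the sample data is hand-written rather than server output.

```python
def iter_leaf_states(node):
    """Yield the stateName of every leaf at or below ``node``."""
    # Leaves may carry no "children" key at all, or an empty list.
    children = node.get("children") or []
    if not children:
        yield node["stateName"]
        return
    for child in children:
        yield from iter_leaf_states(child)


# Hand-written sample mirroring the documented structure (not real output).
hierarchy = {
    "stateName": "All",
    "children": [
        {"stateName": "2020", "children": [
            {"stateName": "2020-01"},
            {"stateName": "2020-02"},
        ]},
        {"stateName": "2021", "children": [{"stateName": "2021-01"}]},
    ],
}

leaves = list(iter_leaf_states(hierarchy))
print(leaves)
```

A list like this can be a convenient starting point for choosing states to pass to a selection, provided the real hierarchy matches the assumed layout.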
Methods:
| Method | Description |
|---|---|
| get_name() | Return the attribute name |
| get_levels() | Get hierarchy level names |
| get_state_hierarchy() | Get the hierarchical state structure |
| get_root_state() | Get the root state name |
xplain.Query_config
- class xplain.Query_config(name=None)
Bases: object
Builder for constructing Xplain analytical query configurations.
Provides a fluent API for building complex queries with aggregations, group-bys, and selections. Queries are executed via Xsession.execute_query().
The builder pattern allows chaining method calls to incrementally construct queries. Each query consists of three main components:
Aggregations: Compute summary statistics (COUNT, SUM, AVG, etc.)
Group-bys: Organize results by attribute categories
Selections: Filter data to specific attribute states
- request : dict
The internal query configuration containing aggregations, groupBys, and selections in JSON-serializable format.
All aggregation methods return self to enable method chaining.
Queries are identified by a unique requestName, auto-generated if not provided.
The query configuration can be serialized to JSON via to_json().
For simpler queries, consider using Xsession.open_attribute() or the fluent QueryBuilder API via Xsession.query_builder().
Basic aggregation with grouping:
>>> from xplain import Query_config
>>> query = Query_config()
>>> query.add_aggregation(
...     object_name="Patients",
...     dimension_name="Age",
...     type="AVG"
... )
>>> query.add_groupby(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender"
... )
>>> df = session.execute_query(query)
Multiple aggregations:
>>> query = Query_config(name="patient_stats")
>>> query.add_aggregation(object_name="Patients", dimension_name="Age", type="AVG") \
...     .add_aggregation(object_name="Patients", dimension_name="Weight", type="AVG") \
...     .add_groupby(object_name="Patients", dimension_name="Gender", attribute_name="Gender")
>>> results = session.execute_query(query)
With selections (filtering):
>>> query = Query_config()
>>> query.add_aggregation(object_name="Admissions", dimension_name="LOS", type="AVG") \
...     .add_groupby(object_name="Admissions", dimension_name="AdmissionType", attribute_name="Type") \
...     .add_selection(
...         object_name="Patients",
...         dimension_name="Age",
...         attribute_name="AgeGroup",
...         selected_states=["65-75", "75-85", ">85"]
...     )
>>> elderly_los = session.execute_query(query)
MIMIC-IV analysis - ICU mortality by diagnosis:
>>> query = Query_config(name="icu_mortality_by_diagnosis")
>>> query.add_aggregation(object_name="Patients", dimension_name="Mortality", type="AVG") \
...     .add_groupby(object_name="Diagnoses", dimension_name="ICD_Code", attribute_name="Category") \
...     .add_selection(
...         object_name="Admissions",
...         dimension_name="ICU_Stay",
...         attribute_name="ICU_Flag",
...         selected_states=["Yes"]
...     )
>>> df = session.execute_query(query)
See also:
Xsession.execute_query : Execute the constructed query
Xsession.query_builder : Alternative fluent API via QueryBuilder
Xsession.open_attribute : Convenience method for simple attribute counts
- Parameters
name (str) –
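The chaining behaviour described above comes down to each builder method returning self. The sketch below shows that pattern with a small stand-in class. `MiniQueryConfig` is hypothetical and is not the xplain implementation: only the top-level names `requestName`, `aggregations`, `groupBys`, and `selections` come from this reference, while the keys inside each entry are illustrative guesses.

```python
import json
import uuid


class MiniQueryConfig:
    """Hypothetical stand-in for Query_config; NOT the xplain implementation."""

    def __init__(self, name=None):
        # Assumed layout: a request name plus one list per query component.
        self.request = {
            "requestName": name or str(uuid.uuid4()),
            "aggregations": [],
            "groupBys": [],
            "selections": [],
        }

    def add_aggregation(self, object_name, dimension_name, type):
        # Entry keys ("object", "dimension", "type") are illustrative guesses.
        self.request["aggregations"].append(
            {"object": object_name, "dimension": dimension_name, "type": type}
        )
        return self  # returning self is what makes chaining possible

    def add_groupby(self, attribute_name, object_name=None, dimension_name=None):
        self.request["groupBys"].append(
            {"object": object_name, "dimension": dimension_name,
             "attribute": attribute_name}
        )
        return self

    def to_json(self):
        # Mirrors the documented behaviour: a JSON-serializable dict.
        return self.request


query = (
    MiniQueryConfig(name="patient_stats")
    .add_aggregation("Patients", "Age", "AVG")
    .add_groupby("Gender", object_name="Patients", dimension_name="Gender")
)
payload = json.dumps(query.to_json(), indent=2)
print(payload)
```

Wrapping the chain in parentheses, as here, avoids the trailing backslashes used in the doctest examples; both styles build the same object.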
- __init__(name=None)
Initialize the Query_config instance with a default or provided name.
- Parameters
name (str, optional) – The name or identifier for the query. Defaults to a UUID.
- add_aggregation(object_name, dimension_name, type, aggregation_name=None)
Add an aggregation to compute summary statistics on a dimension.
Aggregations define what to calculate from the data. Multiple aggregations can be added to a single query, and each produces a column in the result.
- object_name : str
Name of the object containing the dimension to aggregate.
- dimension_name : str
Name of the dimension to compute the aggregation on.
- type : str
Aggregation type. Supported values:
"COUNT": Count of non-null values
"COUNTDISTINCT": Count of unique values
"COUNTENTITY": Count of entities
"SUM": Sum of numeric values
"AVG": Average (mean) of numeric values
"MIN": Minimum value
"MAX": Maximum value
"VAR": Variance
"STDEV": Standard deviation
"QUANTILE": Quantile (requires additional config)
- aggregation_name : str, optional
Custom name for the aggregation column in results. If not provided, the server auto-generates a name (e.g., "COUNT_DimensionName").
- Query_config
Returns self to enable method chaining.
- ValueError
If required parameters are missing or if type is not a valid aggregation type.
Multiple aggregations on the same or different dimensions are allowed.
Aggregation results appear as columns in the returned DataFrame.
For COUNT operations, the dimension value itself doesn’t matter; it counts the number of entities with that dimension defined.
Count patients:
>>> query = Query_config()
>>> query.add_aggregation(
...     object_name="Patients",
...     dimension_name="PatientID",
...     type="COUNT"
... )
Average age with custom name:
>>> query.add_aggregation(
...     object_name="Patients",
...     dimension_name="Age",
...     type="AVG",
...     aggregation_name="AvgAge"
... )
Multiple aggregations (chained):
>>> query = Query_config()
>>> query.add_aggregation(object_name="LabEvents", dimension_name="Creatinine", type="AVG") \
...     .add_aggregation(object_name="LabEvents", dimension_name="Creatinine", type="MIN") \
...     .add_aggregation(object_name="LabEvents", dimension_name="Creatinine", type="MAX")
MIMIC-IV: Vital signs statistics:
>>> query = Query_config()
>>> query.add_aggregation(object_name="VitalSigns", dimension_name="HeartRate", type="AVG", aggregation_name="AvgHR") \
...     .add_aggregation(object_name="VitalSigns", dimension_name="BloodPressureSystolic", type="AVG", aggregation_name="AvgSBP") \
...     .add_groupby(object_name="Patients", dimension_name="AgeGroup", attribute_name="AgeGroup")
See also:
add_groupby : Add grouping to organize aggregated results
add_selection : Filter data before aggregation
- Parameters
object_name (str) –
dimension_name (str) –
type (str) –
aggregation_name (str) –
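Since an unsupported type raises ValueError, it can help to check the value on the client side before building a long query. The following is a minimal sketch of such a check. `check_aggregation_type` is a hypothetical helper, and the case-sensitive comparison is an assumption: this reference does not say whether the library normalises case.

```python
# The ten aggregation types listed in this reference.
SUPPORTED_AGGREGATIONS = frozenset({
    "COUNT", "COUNTDISTINCT", "COUNTENTITY", "SUM", "AVG",
    "MIN", "MAX", "VAR", "STDEV", "QUANTILE",
})


def check_aggregation_type(agg_type):
    """Return ``agg_type`` unchanged, or raise ValueError if unsupported."""
    # Assumption: matching is case-sensitive.
    if agg_type not in SUPPORTED_AGGREGATIONS:
        raise ValueError(
            f"Unsupported aggregation type {agg_type!r}; "
            f"expected one of {sorted(SUPPORTED_AGGREGATIONS)}"
        )
    return agg_type


print(check_aggregation_type("AVG"))
try:
    check_aggregation_type("MEDIAN")
except ValueError as exc:
    print(exc)
```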
- add_groupby(attribute_name, object_name=None, dimension_name=None, groupby_level=None, groupby_level_number=None, groupby_states=None)
Add a group-by to organize aggregation results by attribute categories.
Group-bys partition the data into categories based on attribute values, creating separate rows in the result for each category. Multiple group-bys create nested hierarchies.
- attribute_name : str
Name of the attribute to group by. For hierarchical attributes, this groups by the first level unless groupby_level is specified.
- object_name : str, optional
Name of the object containing the dimension. If omitted, the builder attempts auto-resolution (experimental).
- dimension_name : str, optional
Name of the dimension containing the attribute. If omitted, the builder attempts auto-resolution (experimental).
- groupby_level : str, optional
Specific level name in a hierarchical attribute to group by. Overrides the default first-level grouping.
- groupby_level_number : int, optional
Numeric level in the attribute hierarchy to group by (0-indexed). Alternative to groupby_level for hierarchical attributes.
- groupby_states : list, optional
Specific attribute states to include in the grouping. Intended to restrict results to these states, but currently unused in the implementation.
- Query_config
Returns self to enable method chaining.
- ValueError
If attribute_name is not provided or if groupby_level_number is not an integer.
- RuntimeError
If the group-by specification cannot be constructed.
Group-bys are applied in the order they are added, creating nested hierarchies.
Each group-by creates a new dimension in the result structure.
For non-hierarchical attributes, groupby_level and groupby_level_number are ignored.
Auto-resolution of object/dimension names is experimental and may not work in all cases; explicitly providing them is recommended.
Group by gender:
>>> query = Query_config()
>>> query.add_aggregation(object_name="Patients", dimension_name="Age", type="AVG")
>>> query.add_groupby(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender"
... )
Multiple group-bys (nested):
>>> query = Query_config()
>>> query.add_aggregation(object_name="Admissions", dimension_name="LOS", type="AVG")
>>> query.add_groupby(object_name="Patients", dimension_name="Gender", attribute_name="Gender")
>>> query.add_groupby(object_name="Patients", dimension_name="Age", attribute_name="AgeGroup")
>>> # Results grouped first by Gender, then by AgeGroup within each gender
Hierarchical attribute grouping:
>>> query = Query_config()
>>> query.add_aggregation(object_name="Diagnoses", dimension_name="ICD_Code", type="COUNT")
>>> query.add_groupby(
...     object_name="Diagnoses",
...     dimension_name="ICD_Code",
...     attribute_name="Hierarchy",
...     groupby_level="Chapter"  # Group by ICD chapter level
... )
MIMIC-IV: Admissions by type and age group:
>>> query = Query_config(name="admissions_analysis")
>>> query.add_aggregation(object_name="Admissions", dimension_name="AdmissionID", type="COUNT") \
...     .add_groupby(object_name="Admissions", dimension_name="AdmissionType", attribute_name="Type") \
...     .add_groupby(object_name="Patients", dimension_name="Age", attribute_name="AgeGroup")
See also:
add_aggregation : Define what statistics to compute
add_selection : Filter data before grouping
- Parameters
attribute_name (str) –
object_name (str) –
dimension_name (str) –
groupby_level (str) –
groupby_level_number (int) –
groupby_states (list) –
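To make the effect of ordered, nested group-bys concrete, the sketch below computes an average per (Gender, AgeGroup) combination over a few toy records in plain Python. It is only a client-side analogy under stated assumptions: the records are invented, `average_by` is a hypothetical helper, and the layout of the result returned by the Xplain server is not described in this reference.

```python
from collections import defaultdict

# Invented toy records standing in for joined patient/admission data.
records = [
    {"Gender": "Female", "AgeGroup": "65-75", "LOS": 4.0},
    {"Gender": "Female", "AgeGroup": "65-75", "LOS": 6.0},
    {"Gender": "Female", "AgeGroup": "75-85", "LOS": 8.0},
    {"Gender": "Male", "AgeGroup": "65-75", "LOS": 5.0},
]


def average_by(rows, group_keys, value_key):
    """Average ``value_key`` for each combination of ``group_keys``.

    The order of ``group_keys`` sets the nesting: the first key is the
    outer grouping, later keys subdivide it.
    """
    buckets = defaultdict(list)
    for row in rows:
        key = tuple(row[k] for k in group_keys)
        buckets[key].append(row[value_key])
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}


result = average_by(records, ["Gender", "AgeGroup"], "LOS")
for key, avg in result.items():
    print(key, avg)
```

Swapping the order of the keys changes the nesting (AgeGroup outside, Gender inside) but not the averages themselves.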
- add_selection(attribute_name, object_name=None, dimension_name=None, selected_states=None)
Add a selection (filter) to restrict query results to specific attribute states.
Selections filter the dataset before aggregations and group-bys are applied, effectively creating a cohort or subset of data. Multiple selections act as AND conditions, narrowing the dataset further.
- attribute_name : str
Name of the attribute to filter on.
- object_name : str, optional
Name of the object containing the dimension. If omitted, the builder attempts auto-resolution (experimental).
- dimension_name : str, optional
Name of the dimension containing the attribute. If omitted, the builder attempts auto-resolution (experimental).
- selected_states : list of str, optional
List of specific attribute states (values) to include. Only entities with these attribute values will be included in the query results. If None, all states are selected (effectively no filter).
- Query_config
Returns self to enable method chaining.
- ValueError
If attribute_name is not provided.
- RuntimeError
If the selection specification cannot be constructed.
Selections are applied before aggregations and groupings.
Multiple selections create AND conditions (all must be satisfied).
An empty selected_states list means no filtering (all states included).
For date/time selections, states often correspond to time periods or formatted date strings.
Auto-resolution of object/dimension names is experimental; explicitly providing them is recommended for production code.
Filter by gender:
>>> query = Query_config()
>>> query.add_aggregation(object_name="Patients", dimension_name="Age", type="AVG")
>>> query.add_selection(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender",
...     selected_states=["Female"]
... )
Filter by multiple age groups:
>>> query = Query_config()
>>> query.add_aggregation(object_name="Admissions", dimension_name="LOS", type="AVG")
>>> query.add_selection(
...     object_name="Patients",
...     dimension_name="Age",
...     attribute_name="AgeGroup",
...     selected_states=["65-75", "75-85", ">85"]
... )
Multiple filters (AND condition):
>>> query = Query_config()
>>> query.add_aggregation(object_name="Patients", dimension_name="PatientID", type="COUNT")
>>> query.add_selection(
...     object_name="Patients",
...     dimension_name="Gender",
...     attribute_name="Gender",
...     selected_states=["Male"]
... )
>>> query.add_selection(
...     object_name="Diagnoses",
...     dimension_name="ICD_Code",
...     attribute_name="Category",
...     selected_states=["Circulatory"]
... )
>>> # Results: Male patients with circulatory diagnoses
MIMIC-IV: ICU patients with high severity:
>>> query = Query_config(name="high_severity_icu")
>>> query.add_aggregation(object_name="Patients", dimension_name="Mortality", type="AVG") \
...     .add_selection(
...         object_name="Admissions",
...         dimension_name="ICU_Stay",
...         attribute_name="ICU_Flag",
...         selected_states=["Yes"]
...     ) \
...     .add_selection(
...         object_name="Severity",
...         dimension_name="SOFA_Score",
...         attribute_name="ScoreCategory",
...         selected_states=["High", "Very High"]
...     )
Time-based selection:
>>> query = Query_config()
>>> query.add_aggregation(object_name="Admissions", dimension_name="AdmissionID", type="COUNT")
>>> query.add_selection(
...     object_name="Admissions",
...     dimension_name="AdmitDate",
...     attribute_name="YearMonth",
...     selected_states=["2020-01", "2020-02", "2020-03"]
... )
See also:
add_aggregation : Define statistics to compute on filtered data
add_groupby : Organize filtered results by categories
- Parameters
attribute_name (str) –
object_name (str) –
dimension_name (str) –
selected_states (list) –
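The AND semantics of multiple selections, and the rule that an empty or None state list means no filter, can be illustrated on toy data. The sketch below is a plain-Python analogy, not the server's filtering code: `matches` is a hypothetical helper and the patient records are invented.

```python
def matches(record, selections):
    """Return True if ``record`` satisfies every selection (logical AND)."""
    for attribute, states in selections.items():
        if not states:
            # None or an empty list: no filtering on this attribute.
            continue
        if record.get(attribute) not in states:
            return False
    return True


# Invented toy records; real filtering happens on the Xplain server.
patients = [
    {"id": 1, "Gender": "Male", "Category": "Circulatory"},
    {"id": 2, "Gender": "Male", "Category": "Respiratory"},
    {"id": 3, "Gender": "Female", "Category": "Circulatory"},
]

selections = {"Gender": ["Male"], "Category": ["Circulatory"]}
cohort = [p["id"] for p in patients if matches(p, selections)]
print(cohort)

# An empty state list leaves the attribute unfiltered.
unfiltered = [p["id"] for p in patients if matches(p, {"Gender": []})]
print(unfiltered)
```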
- set_name(request_name)
Assign a specific name or ID to the query.
- Parameters
request_name (str) – The name or ID to be assigned.
- Raises
ValueError – If the request_name is not a valid string.
- to_json()
Return the configuration of this query as JSON.
- Returns
The query configuration.
- Return type
dict
Methods:
| Method | Description |
|---|---|
| set_name() | Set the query name/ID |
| add_aggregation() | Add an aggregation (SUM, AVG, COUNT, etc.) |
| add_groupby() | Add a group-by specification |
| add_selection() | Add a selection (filter) |
| to_json() | Return the query configuration as a dictionary |
Aggregation Types:
SUM - Sum of values
AVG - Average
COUNT - Count of records
COUNTDISTINCT - Count of distinct values
COUNTENTITY - Count of entities
MAX - Maximum value
MIN - Minimum value
VAR - Variance
STDEV - Standard deviation
QUANTILE - Quantile
Example:
from xplain import Query_config
query = Query_config(name="my_query")
query.add_aggregation(
    object_name="Sales",
    dimension_name="Revenue",
    type="SUM"
)
query.add_groupby(
    object_name="Sales",
    dimension_name="Product",
    attribute_name="Category"
)
query.add_selection(
    object_name="Sales",
    dimension_name="Date",
    attribute_name="Year",
    selected_states=["2024"]
)
df = session.execute_query(query)
xplain.QueryBuilder
- class xplain.QueryBuilder(session, name=None)
Bases: object
Fluent builder for Xplain queries.
Obtained via Xsession.query_builder(name=...). Chain calls to aggregate(), groupby(), and selection(), then finalise with execute() (one-shot result) or open() (live result that updates when session selections change).
When only attribute is supplied to groupby() or selection(), the builder searches the loaded session object tree and resolves the matching object and dimension automatically. If the attribute name is ambiguous (found in more than one place), a ValueError is raised listing all candidates.
Example:
df = (
    session.query_builder(name="lab_counts")
    .aggregate(object="Lab Events", type="COUNT")
    .groupby(attribute="TestType")
    .selection(attribute="Date", selected_states=["2024-01", "2024-02"])
    .execute()
)
- Parameters
name (str) –
- aggregate(object, type, dimension=None, name=None)
Add an aggregation measure to the query.
- Parameters
object (str) – Name of the Xobject to aggregate over.
type (str) – Aggregation function. One of SUM, AVG, COUNT, COUNTDISTINCT, MAX, MIN, COUNTENTITY, VAR, STDEV, QUANTILE.
dimension (str) – Dimension within the object. May be omitted for entity-level aggregations such as COUNT / COUNTENTITY.
name (str) – Optional display name for the resulting measure column.
- Returns
self – for method chaining.
- Raises
ValueError – If type is not a recognised aggregation function.
- execute(data_frame=True)
Execute the query and return results immediately.
Equivalent to calling Xsession.execute_query with the built request. The query is not kept open – subsequent session changes (e.g. selection changes) will not affect the returned result.
- Parameters
data_frame (bool) – When True (default) the result is returned as a pandas.DataFrame; otherwise raw JSON is returned.
- Returns
DataFrame or JSON depending on data_frame.
- groupby(attribute, object=None, dimension=None, groupby_level_name=None, groupby_level_number=None)
Add a group-by dimension to the query.
When object or dimension is omitted, the builder searches the session object tree for an attribute whose name matches attribute. If exactly one match is found, its object/dimension are used automatically. If more than one match exists, a ValueError is raised listing all candidates so you can disambiguate.
- Parameters
attribute (str) – Attribute name to group by.
object (str) – Xobject name. Resolved automatically when omitted.
dimension (str) – Dimension name. Resolved automatically when omitted.
groupby_level_name (str) – Named level within a hierarchical dimension.
groupby_level_number (int) – Numeric level within a hierarchical dimension.
- Returns
self – for method chaining.
- Raises
ValueError – If attribute is not provided, not found in the session tree, or is ambiguous.
- open(data_frame=True)
Open the query and return a live result.
Equivalent to calling Xsession.open_query. The query stays active inside the session so that further selection changes automatically update its result.
- Parameters
data_frame (bool) – When True (default) the result is returned as a pandas.DataFrame; otherwise raw JSON is returned.
- Returns
DataFrame or JSON depending on data_frame.
- selection(attribute, object=None, dimension=None, selected_states=None)
Add a selection filter to the query.
Like groupby(), object and dimension are resolved automatically from the session tree when omitted.
- Parameters
attribute (str) – Attribute name to filter on.
object (str) – Xobject name. Resolved automatically when omitted.
dimension (str) – Dimension name. Resolved automatically when omitted.
selected_states (list) – List of state values to keep. None means no state filtering (all states).
- Returns
self – for method chaining.
- Raises
ValueError – If attribute is not provided, not found, or ambiguous.
Obtained via Xsession.query_builder(). Every method (except the terminal
execute() / open()) returns self
so calls can be chained.
Builder Methods:
| Method | Description |
|---|---|
| aggregate() | Add an aggregation measure. dimension may be omitted for entity-level types such as COUNT / COUNTENTITY. |
| groupby() | Add a group-by dimension. object and dimension are resolved automatically from the session tree when omitted. |
| selection() | Add a selection filter. Auto-resolves object/dimension like groupby(). |
| execute() | Run the query and return results immediately (one-shot). |
| open() | Run the query and keep it alive so results update with session selection changes. |
Auto-resolution of object and dimension
When object or dimension is omitted from groupby / selection, QueryBuilder walks the session object tree and finds every (object, dimension) pair that contains the named attribute:
Unique match: used automatically, no action required.
No match: raises ValueError with the attribute name.
Multiple matches: raises ValueError listing all candidates so you can disambiguate by passing object and/or dimension explicitly.
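The three outcomes of auto-resolution can be sketched over a toy object tree. The nested-dict layout used here (object name, then dimension name, then a list of attribute names) is an assumption made for illustration; the real session object tree and the library's internal resolver are not shown in this reference, and `resolve_attribute` is a hypothetical name.

```python
def resolve_attribute(tree, attribute):
    """Return the single (object, dimension) pair that holds ``attribute``.

    Raises ValueError when the attribute is missing or ambiguous.
    """
    candidates = [
        (obj, dim)
        for obj, dimensions in tree.items()
        for dim, attributes in dimensions.items()
        if attribute in attributes
    ]
    if not candidates:
        raise ValueError(f"Attribute {attribute!r} not found")
    if len(candidates) > 1:
        raise ValueError(
            f"Attribute {attribute!r} is ambiguous; candidates: {candidates}"
        )
    return candidates[0]


# Toy tree: {object: {dimension: [attributes]}} (assumed layout).
tree = {
    "Lab Events": {"TestType": ["TestType"], "Date": ["Date"]},
    "Orders": {"Date": ["Date"], "Revenue": ["Revenue"]},
}

print(resolve_attribute(tree, "TestType"))  # unique match
try:
    resolve_attribute(tree, "Date")  # present under two objects
except ValueError as exc:
    print(exc)
```

When the ambiguous case arises with the real builder, passing object and/or dimension explicitly sidesteps the lookup altogether.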
Aggregation Types:
SUM - Sum of values
AVG - Average
COUNT - Count of records
COUNTDISTINCT - Count of distinct values
COUNTENTITY - Count of entities
MAX - Maximum value
MIN - Minimum value
VAR - Variance
STDEV - Standard deviation
QUANTILE - Quantile
Examples:
# Minimal: attribute auto-resolved from the session tree
df = (
    session.query_builder(name="lab_counts")
    .aggregate(object="Lab Events", type="COUNT")
    .groupby(attribute="TestType")
    .execute()
)
# With selection filter and hierarchical level
df = (
    session.query_builder()
    .aggregate(object="Orders", type="SUM", dimension="Revenue")
    .groupby(attribute="Month", groupby_level_name="Month")
    .selection(attribute="Year", selected_states=["2024"])
    .execute()
)
# Live query: updates when session selections change
df = (
    session.query_builder(name="live_revenue")
    .aggregate(object="Orders", type="SUM", dimension="Revenue")
    .groupby(attribute="Category")
    .open()
)
# Disambiguation: attribute "Date" exists in multiple objects
df = (
    session.query_builder()
    .aggregate(object="Lab Events", type="COUNT")
    .groupby(attribute="Date", object="Lab Events", dimension="Date")
    .execute()
)