TRAPI Response Validation

class reasoner_validator.validator.TRAPIResponseValidator(default_test: str | None = None, default_target: str | None = None, trapi_version: str | None = None, biolink_version: str | None = None, target_provenance: Dict[str, str] | None = None, strict_validation: bool | None = None, suppress_empty_data_warnings: bool = False)

Bases: BiolinkValidator

TRAPIResponseValidator is an overall wrapper class for validating conformance of full TRAPI Responses to TRAPI and the Biolink Model.

Methods

add_messages(new_messages)

Batch addition of MESSAGES_BY_TARGET messages to a ValidationReporter instance.

Parameters:
  • new_messages – MESSAGES_BY_TARGET, messages indexed by target, test and category (one of "information", "skipped tests", "warnings", "errors" or "critical"), with code-keyed dictionaries of (structured) message parameters.
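As a rough illustration of the nesting described above, the following sketch builds a small target, test, category, code structure and counts the message codes per category. The exact layout of MESSAGES_BY_TARGET is not spelled out on this page, so the shape, the target and test names, the message code and the helper count_by_category are all assumptions made for illustration.

```python
from collections import Counter
from typing import Any, Dict

# Hypothetical messages, indexed by target, then test, then category.
# The target/test names and the message code are made up for this sketch.
messages_by_target: Dict[str, Dict[str, Dict[str, Dict[str, Any]]]] = {
    "example-target": {
        "example-test": {
            "information": {},
            "skipped tests": {},
            "warnings": {},
            "errors": {
                # code-keyed dictionary of (structured) message parameters
                "error.example.code": {"identifier": "EXAMPLE:123"},
            },
            "critical": {},
        }
    }
}


def count_by_category(
    messages: Dict[str, Dict[str, Dict[str, Dict[str, Any]]]]
) -> Counter:
    """Count distinct message codes per category, across all targets and tests."""
    counts: Counter = Counter()
    for tests in messages.values():
        for categories in tests.values():
            for category, coded in categories.items():
                counts[category] += len(coded)
    return counts


counts = count_by_category(messages_by_target)
print(dict(counts))
```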

apply_validation(validation_method, *args, ...)

Wrapper to allow validation_methods direct access to the ValidationReporter.

build_source_trail(sources)

Returns a 'source_trail' path from 'primary_knowledge_source' upwards.

category_matched(source_categories, ...)

For each 'source' Biolink Model category given (a list of CURIE strings), first get the union set of all parent (ancestral) categories, then check if at least one of these categories is matched to the list of target categories.

check_biolink_model_compliance(graph, graph_type)

Validate a TRAPI-schema compliant Message graph-like data structure against the currently active Biolink Model Toolkit model version.

check_biolink_model_compliance_of_input_edge(edge)

Validate the contents of a templated test input edge against the current BMT Biolink Model release.

check_compliance_of_trapi_response(response)

One stop validation of all components of a TRAPI-schema compliant Query.Response, including its Message against a designated Biolink Model release.

dump([title, id_rows, msg_rows, ...])

Dump all available messages captured by the ValidationReporter, printed as formatted human-readable text, on a specified file device.

dump_all_messages([test, target, flat])

Dump all messages for a given test from a given target, as JSON.

dump_critical([test, target, flat])

Dump critical error messages as JSON.

dump_errors([test, target, flat])

Dump 'error' messages as JSON.

dump_info([test, target, flat])

Dump 'information' messages as JSON.

dump_messages_type(message_type[, test, ...])

Dump ValidationReporter messages of type 'message_type' as JSON.

dump_skipped([test, target, flat])

Dump 'skipped test' messages as JSON.

dump_warnings([test, target, flat])

Dump 'warning' messages as JSON.

dumps([id_rows, msg_rows, compact_format])

Text string version of dump(): returns all available messages captured by the ValidationReporter, as a formatted human-readable text blob.

get_aliases(curie)

Get clique of related identifiers from the Node Normalizer. Except for the cases of a missing or invalid CURIE input, this method is guaranteed to return at least the input CURIE as one of the aliases; however, it reports various validation warnings based on the completeness of the entry reported by the Node Normalizer.

Parameters:
  • curie – str, CURIE of node identifier for which aliases are needed.

Returns:

List[str], all aliases (including at least the CURIE itself); None if a validation error is encountered.

get_all_messages()

Get copy of all MESSAGES_BY_TARGET as a Python data structure.

get_all_messages_of_type(message_type)

Get MESSAGE_PARTITION dictionary of all ValidationReporter messages of a given 'message_type', harvested from all target and test contexts.

get_biolink_version()

Returns:

Biolink Model version currently tracked by the TRAPISchemaValidator.

get_critical([test, target])

Get copy of all recorded 'critical' error messages, for a given test from a given target.

get_default_target()

Returns the current target of the ValidationReporter.

get_default_test()

Returns the current default test identifier of the ValidationReporter.

get_errors([test, target])

Get copy of all recorded 'error' messages, for a given test from a given target.

get_info([test, target])

Get copy of all recorded 'information' messages, for a given test from a given target.

get_inverse_predicate(predicate)

Utility wrapper of logic to robustly test if a predicate exists and has an inverse.

get_message_type(code)

Get type of message code.

get_messages_by_target([target])

Returns a block of MESSAGES_BY_TEST corresponding to a given or default target.

get_messages_by_test([test, target])

Returns MESSAGE_CATALOG corresponding to a given or default target. Note that the dictionary returned is not a copy of the original, so take care not to mutate it.

Parameters:
  • test – str, specified test (gets current 'default' test if not given)

  • target – str, specified target (gets current 'default' target if not given)

Returns:

MESSAGES_BY_TEST corresponding to a resolved target.

get_messages_of_type(message_type[, test, ...])

Get Python data dictionary of ValidationReporter messages of 'message_type', for a specified (or default) target and test.

get_node_categories(node_id)

Categories by 'node_id'.

Parameters:
  • node_id – identifier of the node of interest

Returns:

For a given node_id, returns the associated categories; None if node_id is currently unknown or has no categories.

get_node_identifiers()

Returns:

List of currently registered node_ids.

get_result()

Get result of validation.

get_skipped([test, target])

Get copy of all recorded 'skipped test' messages, for a given test from a given target.

get_target_provenance()

Returns infores-prefix-normalized target provenance metadata.

get_trapi_version()

Returns:

str, TRAPI (SemVer) version currently targeted by the TRAPISchemaValidator.

get_warnings([test, target])

Get copy of all recorded 'warning' messages, for a given test from a given target.

has_critical([test, target])

Predicate to detect any recorded critical error messages.

has_errors([test, target])

Predicate to detect any recorded error messages.

has_information([test, target])

Predicate to detect any recorded information messages.

has_message_type(message_type[, test, target])

Predicate to detect if ValidationReporter has any non-empty messages of type 'message_type'.

has_messages([test, target])

Predicate to detect any recorded validation messages.

has_skipped([test, target])

Predicate to detect any recorded 'skipped test' messages.

has_valid_knowledge_graph(message[, edges_limit])

Validate a TRAPI Knowledge Graph.

has_valid_query_graph(message)

Validate a TRAPI Query Graph.

has_valid_results(message[, sample_size])

Validate TRAPI Results.

has_warnings([test, target])

Predicate to detect any recorded warning messages.

is_strict_validation(graph_type[, ...])

Predicate to test if strict validation is to be applied.

is_symmetric(name)

Checks if a given element, identified by name, is a symmetric (predicate) slot.

is_valid_trapi_query(instance[, component])

Make sure that the Message is a syntactically valid TRAPI Query JSON object.

merge(reporter)

Merge all messages and metadata from a second BiolinkValidator, into the calling TRAPISchemaValidator instance.

merge_coded_messages(aggregated, additions)

Merge additional MESSAGE_PARTITION content into an already aggregated MESSAGE_PARTITION.

minimum_required_biolink_version(version)

Parameters:
  • version – simple 'major.minor.patch' Biolink Model SemVer

minimum_required_trapi_version(version)

Parameters:
  • version – simple 'major.minor.patch' TRAPI schema release SemVer

report(code[, test, target, source_trail])

Capture a single validation message, as per specified 'code' (with any code-specific contextual parameters).

Parameters:
  • code – str, dot delimited validation path code

  • test – str, specified test (gets current 'default' test if not given)

  • target – str, specified target (gets current 'default' target if not given)

  • source_trail – Optional[str], audit trail of knowledge source provenance for a given Edge, as a string. Defaults to "global" if not specified.

  • message – **Dict, named parameters representing extra (str-formatted) context for the given code message

Returns:

None (the validation message is recorded internally).

report_header([title, compact_format])

Return a suitably generated report header.

Parameters:
  • title – Optional[str], if title is None, then only the 'reasoner-validator' version is printed out in the header. If the title is an empty string (the default), then 'Validation Report' is used.

  • compact_format – bool, whether to print the header in compact format (default: True). Otherwise, extra line feeds are added to provide space around the header, and control characters are output to underline the header.

Returns:

str, generated header.

reset_biolink_version(version)

Reset Biolink Model version tracked by the ValidationReporter.

reset_default_target(name)

Resets the default target identifier of the ValidationReporter to a new string.

reset_default_test(name)

Resets the default test identifier of the ValidationReporter to a new string.

reset_trapi_version(version)

Reset TRAPI version tracked by the TRAPISchemaValidator.

resolve_testcase_node(target, testcase, nodes)

Resolve the knowledge graph node identifiers against the testcase identifier of the 'target' context ('subject' or 'object' node).

sample_graph(graph[, edges_limit])

Only process a strict subsample of the TRAPI Response Message knowledge graph.

sample_results(results[, sample_size])

Subsample the results to a maximum size of 'sample_size'

sanitize_workflow(response)

Workflows in TRAPI Responses cannot be validated further due to missing tags and None values.

set_nodes(nodes)

Records additional nodes, uniquely by node_id, with specified categories.

test_case_has_validation_errors(tag, case)

Check if test case has validation errors.

testcase_edge_bindings(query_edges, ...)

Check if target query edge id and knowledge graph edge id are in specified edge_bindings.

testcase_input_found_in_response(testcase, ...)

Predicate to validate whether the edge specified by the test case is returned in the Knowledge Graph of the TRAPI Response Message.

testcase_node_bindings(query_nodes, ...)

Check if the specified subject and object identifier are found in the result node bindings.

testcase_node_category_found(target, ...)

Retrieve the most specific Biolink Model category match of knowledge graph node to testcase.

testcase_node_found(target, ...)

Check for presence of at least one of the given identifiers, with expected categories, in the "nodes" catalog.

testcase_result_found(query_graph, ...)

Validate that the testcase S-P->O edge is found bound in the Results.

Parameters:
  • query_graph – Dict, query graph to which the results pertain

  • subject_id – str, subject node (CURIE) identifier

  • subject_query_id – Optional[str], subject node (CURIE) query node identifier (if applicable)

  • object_id – str, object node (CURIE) identifier

  • object_query_id – Optional[str], object node (CURIE) query node identifier (if applicable)

  • edge_id – str, edge identifier

  • results – List of (TRAPI-version specific) Result objects

Returns:

bool, True if testcase S-P->O edge was found in the results

to_dict()

Export BiolinkValidator contents as a Python dictionary (including Biolink version and parent class dictionary content).

validate(instance, component)

Validate instance against schema.

validate_agent_type(edge_id, found, value)

Validate the value of an 'agent_type' of the given edge.

validate_attribute_constraints(edge_id, edge)

Validate Query Edge Attributes.

validate_attributes(graph_type, edge_id, edge)

Validate Knowledge Edge Attributes.

validate_binding(q_node_entry, target_id, ...)

Validate that a specified target_id has a valid node_binding in specified node binding details.

validate_biolink()

Predicate to check if the Biolink (version) is tagged to 'suppress' compliance validation.

validate_category(context, node_id, category)

Validate a Biolink category.

validate_element_status(graph_type, context, ...)

Detect an element that is missing from Biolink, or is deprecated, abstract or a mixin; this is signalled as a failure or warning.

validate_graph_edge(edge, graph_type)

Validate slot properties of a relationship ('biolink:Association') edge.

validate_graph_node(node_id, slots, graph_type)

Validate slot properties (mainly 'categories') of a node.

validate_infores(context, edge_id, identifier)

Validate that the specified identifier is a well-formed Infores CURIE.

validate_knowledge_level(edge_id, found, value)

Validate the value of a 'knowledge_level' of the given edge.

validate_predicate(edge_id, predicate, ...)

Validates predicates based on their meta-nature: existence, mixin, deprecation, etc.

validate_provenance(edge_id, ara_source, ...)

Validates ARA and KP infores knowledge sources based on surveyed Edge slots (recorded in edge "attributes" pre-1.4.0; in "sources", post-1.4.0).

validate_qualifier_constraints(edge_id, edge)

Validate Query Edge Qualifier Constraints.

validate_qualifier_entry(context, edge_id, ...)

Validate Qualifier Entry (JSON Object).

validate_qualifiers(edge_id, edge[, ...])

Validate Knowledge Edge Qualifiers.

validate_slot_value(slot_name, context, ...)

Validate the single-valued value of a specified slot of the given knowledge graph entity.

Parameters:
  • slot_name – str, name of a valid slot, a value for which is to be validated

  • context – str, context of the validation (e.g. node or edge id)

  • found – bool, current status of slot detection; should be True if the slot was already previously seen

  • value – Optional[str], the value to be validated

Returns:

bool, True if valid slot and value (validation messages recorded in the BiolinkValidator).

validate_sources(edge_id, edge)

Validate (TRAPI 1.4.0-beta ++) Edge sources provenance.

count_node

get_attribute_type_exclusions

get_bmt

has_dangling_nodes

has_valid_node_information

is_trapi_1_4_or_later

is_treats

merge_identified_messages

merge_scoped_messages

reset_node_info

validate_input_edge_node

category_matched(source_categories: List[str], target_categories: List[str]) str | None

For each ‘source’ Biolink Model category given (a list of CURIE strings), first get the union set of all parent (ancestral) categories, then check if at least one of these categories is matched to the list of target categories.

Parameters:
  • source_categories – List[str], list of ‘source’ categories whose category hierarchy is to be matched.

  • target_categories – List[str], list of ‘target’ categories to be matched against ‘source’ (or ‘source parent’) categories

Returns:

Optional[str], the category matched (could be a generic parent of a ‘source’ category)
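The matching logic described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the parent map TOY_PARENTS is a hand-written, simplified stand-in for the Biolink Model hierarchy, and the names ancestors and category_matched_sketch are hypothetical.

```python
from typing import Dict, List, Optional, Set

# Toy, simplified parent map (child -> parent); NOT loaded from the Biolink Model.
TOY_PARENTS: Dict[str, str] = {
    "biolink:Disease": "biolink:DiseaseOrPhenotypicFeature",
    "biolink:DiseaseOrPhenotypicFeature": "biolink:BiologicalEntity",
    "biolink:BiologicalEntity": "biolink:NamedThing",
    "biolink:Gene": "biolink:BiologicalEntity",
}


def ancestors(category: str, parents: Dict[str, str]) -> Set[str]:
    """Return the category itself plus all of its ancestors in the parent map."""
    found: Set[str] = set()
    current: Optional[str] = category
    while current is not None and current not in found:
        found.add(current)
        current = parents.get(current)
    return found


def category_matched_sketch(
    source_categories: List[str],
    target_categories: List[str],
    parents: Dict[str, str] = TOY_PARENTS,
) -> Optional[str]:
    """Return the first target category found in the union of source lineages."""
    lineage: Set[str] = set()
    for category in source_categories:
        lineage |= ancestors(category, parents)
    for target in target_categories:
        if target in lineage:
            return target
    return None


print(category_matched_sketch(["biolink:Disease"], ["biolink:BiologicalEntity"]))
```

Returning the first hit in target order keeps the result deterministic; the real method may choose its match differently.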

check_compliance_of_trapi_response(response: Dict | None, max_kg_edges: int = 0, max_results: int = 0)

One stop validation of all components of a TRAPI-schema compliant Query.Response, including its Message against a designated Biolink Model release. The high level structure of a Query.Response is described in https://github.com/NCATSTranslator/ReasonerAPI/blob/master/docs/reference.md#response-.

The TRAPI Query.Response.Message is a Python Dictionary with three entries:

  • Query Graph (“QGraph”): knowledge graph query input parameters

  • Knowledge Graph: output knowledge (sub-)graph containing knowledge (Biolink Model compliant nodes, edges) returned from the target resource (KP, ARA) for the query.

  • Results: a list of (annotated) node and edge bindings pointing into the Knowledge Graph, to represent the specific answers (subgraphs) satisfying the query graph constraints.

Parameters:
  • response – Optional[Dict], Query.Response to be validated.

  • max_kg_edges – int, maximum number of edges to be validated from the knowledge graph of the response. A value of zero triggers validation of all edges in the knowledge graph (Default: 0 - use all edges)

  • max_results – int, target sample number of results to validate (default: 0 for ‘use all results’).
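To make the three-entry Message layout concrete, here is a minimal hand-written response together with a small pre-check. The identifiers and predicate are illustrative only, and the helper missing_message_entries is hypothetical (it is not part of the library). The commented lines at the end show how such a response might be passed to the validator, going only by the signatures listed on this page.

```python
from typing import Any, Dict, List

# Minimal, hand-written response; identifiers and predicate are illustrative only.
response: Dict[str, Any] = {
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {"ids": ["MONDO:0005148"]},
                "n1": {"categories": ["biolink:Drug"]},
            },
            "edges": {
                "e0": {"subject": "n1", "object": "n0", "predicates": ["biolink:treats"]}
            },
        },
        "knowledge_graph": {"nodes": {}, "edges": {}},
        "results": [],
    }
}


def missing_message_entries(resp: Dict[str, Any]) -> List[str]:
    """List which of the three expected Message entries are absent."""
    message = resp.get("message") or {}
    expected = ("query_graph", "knowledge_graph", "results")
    return [key for key in expected if key not in message]


missing = missing_message_entries(response)
print(missing)

# With the library installed, the response could then be handed to the validator:
# from reasoner_validator.validator import TRAPIResponseValidator
# validator = TRAPIResponseValidator()
# validator.check_compliance_of_trapi_response(response, max_kg_edges=10, max_results=10)
# print(validator.dumps())
```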

get_aliases(curie: str) List[str] | None

Get clique of related identifiers from the Node Normalizer. Except for the cases of a missing or invalid CURIE input, this method is guaranteed to return at least the input CURIE as one of the aliases; however, it reports various validation warnings based on the completeness of the entry reported by the Node Normalizer.

Parameters:
  • curie – str, CURIE of node identifier for which aliases are needed.

Returns:

List[str], all aliases (including at least the CURIE itself); None if a validation error is encountered.

has_valid_knowledge_graph(message: Dict, edges_limit: int = 0) bool

Validate a TRAPI Knowledge Graph.

Parameters:
  • message – Dict, input message expected to contain the ‘knowledge_graph’

  • edges_limit – int, integer maximum number of edges to be validated in the knowledge graph. A value of zero triggers validation of all edges in the knowledge graph (Default: 0 - use all edges)

Returns:

bool, False if validation errors are found

has_valid_query_graph(message: Dict) bool

Validate a TRAPI Query Graph.

Parameters:
  • message – input message expected to contain the ‘query_graph’

Returns:

bool, False if validation errors are found

has_valid_results(message: Dict, sample_size: int = 0) bool

Validate TRAPI Results.

Parameters:
  • message – input message expected to contain the ‘results’

  • sample_size – int, sample number of results to validate (default: 0 for ‘use all results’).

Returns:

bool, False if validation errors are found

is_trapi_1_4_or_later() bool

resolve_testcase_node(target: str, testcase: Dict, nodes: Dict) Tuple[str, str, str | None] | None

Resolve the knowledge graph node identifiers against the testcase identifier of the ‘target’ context (‘subject’ or ‘object’ node). If a direct match is not found for the testcase identifier, check if the node identifiers returned in the knowledge graph are strict ontological subclasses of the target testcase identifier (e.g. the knowledge graph may return a subclass of the MONDO disease requested by the testcase). Node matches must also be compatible in terms of Biolink Model category.

Parameters:
  • target – ‘subject’ or ‘object’

  • testcase – Dict, full test testcase (to access the target node ‘category’)

  • nodes – Dict, details about knowledge graph nodes, indexed by node identifiers

Returns:

Optional[Tuple[str, str, Optional[str]]], returns the KG node identifier, category, and query identifier matched (if applicable); None if no match

static sample_graph(graph: Dict, edges_limit: int = 0) Dict

Only process a strict subsample of the TRAPI Response Message knowledge graph.

Parameters:
  • graph (Dict) – original knowledge graph

  • edges_limit (int) – integer maximum number of edges to be validated in the knowledge graph. A value of zero triggers validation of all edges in the knowledge graph (Default: 0 - use all edges)

Returns:

Dict, ‘edges_limit’ sized subset of knowledge graph
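One plausible way to subsample a knowledge graph is sketched below. It assumes the first ‘edges_limit’ edges are kept, along with only the nodes those edges reference; the library's actual selection strategy is not documented here, and the name sample_graph_sketch is hypothetical.

```python
from itertools import islice
from typing import Any, Dict


def sample_graph_sketch(graph: Dict[str, Any], edges_limit: int = 0) -> Dict[str, Any]:
    """Keep at most 'edges_limit' edges plus the nodes they reference.

    A limit of zero (or a graph already within the limit) returns the graph as-is.
    """
    edges = graph.get("edges", {})
    if edges_limit <= 0 or len(edges) <= edges_limit:
        return graph

    # Assumption: take the first 'edges_limit' edges in dictionary order.
    kept_edges = dict(islice(edges.items(), edges_limit))

    # Retain only nodes that are the subject or object of a kept edge.
    node_ids = {
        edge[key]
        for edge in kept_edges.values()
        for key in ("subject", "object")
        if key in edge
    }
    nodes = {
        node_id: details
        for node_id, details in graph.get("nodes", {}).items()
        if node_id in node_ids
    }
    return {"nodes": nodes, "edges": kept_edges}


# Hypothetical three-edge graph
kg = {
    "nodes": {"A": {}, "B": {}, "C": {}, "D": {}},
    "edges": {
        "e1": {"subject": "A", "object": "B"},
        "e2": {"subject": "B", "object": "C"},
        "e3": {"subject": "C", "object": "D"},
    },
}
sampled = sample_graph_sketch(kg, edges_limit=2)
print(sorted(sampled["edges"]), sorted(sampled["nodes"]))
```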

static sample_results(results: List, sample_size: int = 0) List

Subsample the results to a maximum size of ‘sample_size’

Parameters:
  • results – List, original list of Results

  • sample_size – int, target sample size (default: 0 for ‘use all results’).

Returns:

List, ‘sample_size’ sized subset of Results
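The same convention, where zero means "use everything", can be illustrated for results. This sketch simply takes the leading ‘sample_size’ entries; whether the library slices or samples differently is an assumption, and sample_results_sketch is a hypothetical name.

```python
from typing import Any, List


def sample_results_sketch(results: List[Any], sample_size: int = 0) -> List[Any]:
    """Return at most 'sample_size' results; zero means 'use all results'."""
    if sample_size > 0:
        # Assumption: keep the leading entries rather than a random sample.
        return results[:sample_size]
    return results


all_results = [{"id": i} for i in range(5)]
print(len(sample_results_sketch(all_results, 2)), len(sample_results_sketch(all_results)))
```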

static sanitize_workflow(response: Dict) Dict

Workflows in TRAPI Responses cannot be validated further due to missing tags and None values. This method is a temporary workaround to sanitize the query for additional validation.

Parameters:
  • response – Dict, full TRAPI Response JSON object

Returns:

Dict, response with discretionary removal of content which triggers (temporarily) unwarranted TRAPI validation failures

static testcase_edge_bindings(query_edges: Dict, target_edge_id: str, data: Dict) bool

Check if target query edge id and knowledge graph edge id are in specified edge_bindings.

Parameters:
  • query_edges – List[str], expected query edge identifiers in a matching result

  • target_edge_id – str, expected knowledge edge identifier in a matching result

  • data – TRAPI version-specific Response context from which the ‘edge_bindings’ may be retrieved

Returns:

True, if found
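The edge binding check can be pictured with the following sketch. It assumes ‘data’ is a dictionary carrying an ‘edge_bindings’ entry keyed by query edge identifier (an analysis entry in TRAPI 1.4 and later, or the result itself in earlier releases); the function name and the identifiers are hypothetical.

```python
from typing import Any, Dict, Iterable


def edge_binding_found(
    query_edges: Iterable[str], target_edge_id: str, data: Dict[str, Any]
) -> bool:
    """True if any listed query edge is bound to the target knowledge graph edge."""
    edge_bindings = data.get("edge_bindings") or {}
    for query_edge_id in query_edges:
        for binding in edge_bindings.get(query_edge_id, []):
            if binding.get("id") == target_edge_id:
                return True
    return False


# TRAPI 1.4-style analysis entry with made-up identifiers
analysis = {
    "resource_id": "infores:example",
    "edge_bindings": {"e0": [{"id": "kg-edge-1"}]},
}
print(edge_binding_found(["e0"], "kg-edge-1", analysis))
```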

testcase_input_found_in_response(testcase: Dict, response: Dict) bool

Predicate to validate whether the edge specified by the test case is returned in the Knowledge Graph of the TRAPI Response Message. This method assumes that the TRAPI response has already been validated as generally well-formed.

Parameters:
  • testcase – Dict, input data test case

  • response – Dict, TRAPI Response whose message ought to contain the test case edge

Returns:

True if test case edge found; False otherwise

testcase_node_bindings(query_nodes: Dict, subject_id: str, subject_query_id: str | None, object_id: str, object_query_id: str | None, data: Dict) bool

Check if the specified subject and object identifiers are found in the result node bindings. Expected query_ids are also validated.

Parameters:
  • query_nodes – query nodes dictionary

  • subject_id – expected node identifier of the knowledge graph subject

  • subject_query_id – expected bound ‘query_id’ if not the ‘subject_id’ (see TRAPI spec)

  • object_id – expected node identifier of the knowledge graph object

  • object_query_id – expected bound ‘query_id’ if not the ‘object_id’ (see TRAPI spec)

  • data – the result object

Returns:

bool, True if node_bindings found for specified subject and object
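A simplified picture of the node binding check follows. It assumes a result whose ‘node_bindings’ map each query node identifier to a list of bindings with an ‘id’ and an optional ‘query_id’, and it ignores the ‘query_nodes’ argument for brevity. The function names and identifiers are hypothetical.

```python
from typing import Any, Dict, Optional


def node_binding_found(
    result: Dict[str, Any], node_id: str, query_id: Optional[str] = None
) -> bool:
    """True if 'node_id' is bound in the result (and 'query_id' matches, if given)."""
    for bindings in (result.get("node_bindings") or {}).values():
        for binding in bindings:
            if binding.get("id") != node_id:
                continue
            if query_id is None or binding.get("query_id") == query_id:
                return True
    return False


def subject_and_object_bound(
    result: Dict[str, Any],
    subject_id: str,
    subject_query_id: Optional[str],
    object_id: str,
    object_query_id: Optional[str],
) -> bool:
    """Both the subject and the object must have a matching node binding."""
    return node_binding_found(result, subject_id, subject_query_id) and node_binding_found(
        result, object_id, object_query_id
    )


# Made-up result: the object binding carries a 'query_id' differing from its 'id'
result = {
    "node_bindings": {
        "n0": [{"id": "EX:drug1"}],
        "n1": [{"id": "EX:disease-subtype", "query_id": "EX:disease"}],
    }
}
print(subject_and_object_bound(result, "EX:drug1", None, "EX:disease-subtype", "EX:disease"))
```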

testcase_node_category_found(target, node_id, testcase, node_details) str | None

Retrieve the most specific Biolink Model category match of knowledge graph node to testcase.

Parameters:
  • target – the concept node type of interest: the ‘subject’ or the ‘object’

  • node_id – str, identifier of node in “nodes” catalog whose category is to be matched against the testcase

  • testcase – Dict, full test testcase against which the input node is being matched

  • node_details – Dict, details about an individual knowledge graph node being processed.

Returns:

str, most specific Biolink Model category match of knowledge graph node to testcase; None if not found

testcase_node_found(target: str, target_id_aliases: List[str], testcase: Dict, nodes: Dict) Tuple[str, str, str | None] | None

Check for presence of at least one of the given identifiers, with expected categories, in the “nodes” catalog. A match is returned if such an identifier is found and at least one KG node category is the expected category or a proper subclass category of the testcase category. No match is returned (None) if the node is found but the testcase category is a subclass category of the KG node categories (i.e. KG node categories are too general), if the identifier is NOT found in the nodes list, or if there is no overlap in the (expected or parent) testcase and node categories.

Parameters:
  • target – the concept node type of interest: the ‘subject’ or the ‘object’

  • target_id_aliases – List of (CURIE) target identifier aliases to be matched against the “nodes” catalog

  • testcase – Dict, full test testcase (to access the target node ‘category’)

  • nodes – Dict, catalog of knowledge graph nodes, indexed by node identifiers, with node details as values.

Returns:

Optional[Tuple[str, str, Optional[str]]], returns the KG node identifier, category, and query identifier matched (if applicable); None if no match

testcase_result_found(query_graph: Dict, subject_id: str, subject_query_id: str | None, object_id: str, object_query_id: str | None, edge_id: str, results: List) bool

Validate that the testcase S-P->O edge is found bound in the Results.

Parameters:
  • query_graph – Dict, query graph to which the results pertain

  • subject_id – str, subject node (CURIE) identifier

  • subject_query_id – Optional[str], subject node (CURIE) query node identifier (if applicable)

  • object_id – str, object node (CURIE) identifier

  • object_query_id – Optional[str], object node (CURIE) query node identifier (if applicable)

  • edge_id – str, edge identifier

  • results – List of (TRAPI-version specific) Result objects

Returns:

bool, True if testcase S-P->O edge was found in the results

validate_binding(q_node_entry: Dict, target_id: str, target_query_id: str | None, node_binding_details: Dict) bool

Validate that a specified target_id has a valid node_binding in specified node binding details.

Parameters:
  • q_node_entry – Dict, query node data currently being searched

  • target_id – str, target knowledge graph node identifier to be matched to node binding

  • target_query_id – Optional[str], query identifier related to the target identifier, if not identical to the target_id

  • node_binding_details – Dict, data relating to a given node_binding of query to knowledge graph identifier

Returns:

bool, True if a valid node binding was found