TRAPI Response Validation
- class reasoner_validator.validator.TRAPIResponseValidator(default_test: str | None = None, default_target: str | None = None, trapi_version: str | None = None, biolink_version: str | None = None, target_provenance: Dict[str, str] | None = None, strict_validation: bool | None = None, suppress_empty_data_warnings: bool = False)
Bases:
BiolinkValidator
TRAPIResponseValidator is an overall wrapper class for validating conformance of full TRAPI Responses to TRAPI and the Biolink Model.
Methods
add_messages
(new_messages)Batch addition of MESSAGES_BY_TARGET messages to a ValidationReporter instance.
- Parameters:
new_messages – MESSAGES_BY_TARGET, messages indexed by target, test and categories: one of "information", "skipped tests", "warnings", "errors" or "critical", with code-keyed dictionaries of (structured) message parameters.
apply_validation
(validation_method, *args, ...)Wrapper to allow validation_methods direct access to the ValidationReporter.
build_source_trail
(sources)Returns a 'source_trail' path from 'primary_knowledge_source' upwards.
category_matched
(source_categories, ...)For each 'source' Biolink Model category given (list of CURIEs as strings?), first get the union set of all parent (ancestral) categories, then check if at least one of these categories is matched to the list of target categories.
check_biolink_model_compliance
(graph, graph_type)Validate a TRAPI-schema compliant Message graph-like data structure against the currently active Biolink Model Toolkit model version.
check_biolink_model_compliance_of_input_edge
(edge)Validate a templated test input edge contents against the current BMT Biolink Model release.
check_compliance_of_trapi_response
(response)One stop validation of all components of a TRAPI-schema compliant Query.Response, including its Message against a designated Biolink Model release.
dump
([title, id_rows, msg_rows, ...])Dump all available messages captured by the ValidationReporter, printed as formatted human-readable text, on a specified file device.
dump_all_messages
([test, target, flat])Dump all messages for a given test from a given target, as JSON.
dump_critical
([test, target, flat])Dump critical error messages as JSON.
dump_errors
([test, target, flat])Dump 'error' messages as JSON.
dump_info
([test, target, flat])Dump 'information' messages as JSON.
dump_messages_type
(message_type[, test, ...])Dump ValidationReporter messages of type 'message_type' as JSON.
dump_skipped
([test, target, flat])Dump 'skipped test' messages as JSON.
dump_warnings
([test, target, flat])Dump 'warning' messages as JSON.
dumps
([id_rows, msg_rows, compact_format])Text string version of dump(): returns all available messages captured by the ValidationReporter, as a formatted human-readable text blob.
get_aliases
(curie)Get clique of related identifiers from the Node Normalizer. Note that except for the cases of a missing or invalid CURIE input, this method is guaranteed to succeed in returning at least the input CURIE as one of the aliases; however, the method reports various validation warnings based on the completeness of the entry reported by the Node Normalizer.
- Parameters:
curie – str, CURIE of node identifier for which aliases are needed.
- Returns:
List[str], of all aliases (including at least the CURIE itself, unless validation error is encountered, then None).
get_all_messages
()Get copy of all MESSAGES_BY_TARGET as a Python data structure.
get_all_messages_of_type
(message_type)Get MESSAGE_PARTITION dictionary of all ValidationReporter messages of a given 'message_type', harvested from all target and test contexts.
get_biolink_version
()Returns the Biolink Model version currently tracked by the TRAPISchemaValidator.
get_critical
([test, target])Get copy of all recorded 'critical' error messages, for a given test from a given target.
get_default_target
()Returns the current target of the ValidationReporter.
get_default_test
()Returns the current default test identifier of the ValidationReporter.
get_errors
([test, target])Get copy of all recorded 'error' messages, for a given test from a given target.
get_info
([test, target])Get copy of all recorded 'information' messages, for a given test from a given target.
get_inverse_predicate
(predicate)Utility wrapper of logic to robustly test if a predicate exists and has an inverse.
get_message_type
(code)Get type of message code.
get_messages_by_target
([target])Returns a block of MESSAGES_BY_TEST corresponding to a given or default target.
get_messages_by_test
([test, target])Returns MESSAGE_CATALOG corresponding to a given or default target. Note that the dictionary returned is not a copy of the original, so take care not to mutate it.
- Parameters:
test – str, specified test (gets current 'default' test if not given)
target – str, specified target (gets current 'default' target if not given)
- Returns:
MESSAGES_BY_TEST corresponding to a resolved target.
get_messages_of_type
(message_type[, test, ...])Get Python data dictionary of ValidationReporter messages of 'message_type', for a specified (or default?) target and test.
get_node_categories
(node_id)Categories by 'node_id'.
- Returns:
For a given node_id, the associated categories; None if node_id is currently unknown or has no categories.
get_node_identifiers
()Returns the list of currently registered node_ids.
get_result
()Get result of validation.
get_skipped
([test, target])Get copy of all recorded 'skipped test' messages, for a given test from a given target.
get_target_provenance
()Returns infores-prefix-normalized target provenance metadata.
get_trapi_version
()Returns the TRAPI (SemVer) version, as a str, currently targeted by the TRAPISchemaValidator.
get_warnings
([test, target])Get copy of all recorded 'warning' messages, for a given test from a given target.
has_critical
([test, target])Predicate to detect any recorded critical error messages.
has_errors
([test, target])Predicate to detect any recorded error messages.
has_information
([test, target])Predicate to detect any recorded information messages.
has_message_type
(message_type[, test, target])Predicate to detect if ValidationReporter has any non-empty messages of type 'message_type'.
has_messages
([test, target])Predicate to detect any recorded validation messages.
has_skipped
([test, target])Predicate to detect any recorded 'skipped test' messages.
has_valid_knowledge_graph
(message[, edges_limit])Validate a TRAPI Knowledge Graph.
has_valid_query_graph
(message)Validate a TRAPI Query Graph.
has_valid_results
(message[, sample_size])Validate TRAPI Results.
has_warnings
([test, target])Predicate to detect any recorded warning messages.
is_strict_validation
(graph_type[, ...])Predicate to test if strict validation is to be applied.
is_symmetric
(name)Checks if a given element identified by name, is a symmetric (predicate) slot.
is_valid_trapi_query
(instance[, component])Make sure that the Message is a syntactically valid TRAPI Query JSON object.
merge
(reporter)Merge all messages and metadata from a second BiolinkValidator, into the calling TRAPISchemaValidator instance.
merge_coded_messages
(aggregated, additions)Merge additional MESSAGE_PARTITION content into an already aggregate MESSAGE_PARTITION.
minimum_required_biolink_version
(version)
- Parameters:
version – simple 'major.minor.patch' Biolink Model SemVer
minimum_required_trapi_version
(version)
- Parameters:
version – simple 'major.minor.patch' TRAPI schema release SemVer
report
(code[, test, target, source_trail])Capture a single validation message, as per specified 'code' (with any code-specific contextual parameters).
- Parameters:
code – str, dot-delimited validation path code
test – str, specified test (gets current 'default' test if not given)
target – str, specified target (gets current 'default' target if not given)
source_trail – Optional[str], audit trail of knowledge source provenance for a given Edge, as a string. Defaults to "global" if not specified.
message – **Dict, named parameters representing extra (str-formatted) context for the given code message
- Returns:
None (the validation message is recorded internally).
report_header
([title, compact_format])Return a suitably generated report header.
- Parameters:
title – Optional[str], if title is None, then only the 'reasoner-validator' version is printed out in the header. If the title is an empty string (the default), then 'Validation Report' is used.
compact_format – bool, whether to print the header in compact format (default: True). Otherwise, extra line feeds add space around the header and control characters are output to underline it.
- Returns:
str, generated header.
reset_biolink_version
(version)Reset Biolink Model version tracked by the ValidationReporter.
reset_default_target
(name)Resets the default target identifier of the ValidationReporter to a new string.
reset_default_test
(name)Resets the default test identifier of the ValidationReporter to a new string.
reset_trapi_version
(version)Reset TRAPI version tracked by the TRAPISchemaValidator.
resolve_testcase_node
(target, testcase, nodes)Resolve the knowledge graph node identifiers against the testcase identifier of the 'target' context ('subject' or 'object' node).
sample_graph
(graph[, edges_limit])Only process a strict subsample of the TRAPI Response Message knowledge graph.
sample_results
(results[, sample_size])Subsample the results to a maximum size of 'sample_size'
sanitize_workflow
(response)Workflows in TRAPI Responses cannot be validated further due to missing tags and None values.
set_nodes
(nodes)Records additional nodes, uniquely by node_id, with specified categories.
test_case_has_validation_errors
(tag, case)Check if test case has validation errors.
testcase_edge_bindings
(query_edges, ...)Check if target query edge id and knowledge graph edge id are in specified edge_bindings.
testcase_input_found_in_response
(testcase, ...)Predicate to validate if test data test case specified edge is returned in the Knowledge Graph of the TRAPI Response Message.
testcase_node_bindings
(query_nodes, ...)Check if the specified subject and object identifier are found in the result node bindings.
testcase_node_category_found
(target, ...)Retrieve the most specific Biolink Model category match of knowledge graph node to testcase.
testcase_node_found
(target, ...)Check for presence of at least one of the given identifiers, with expected categories, in the "nodes" catalog.
testcase_result_found
(query_graph, ...)Validate that the testcase S-P->O edge is found bound to the Results.
- Parameters:
query_graph – Dict, query graph to which the results pertain
subject_id – str, subject node (CURIE) identifier
subject_query_id – Optional[str], subject node (CURIE) query node identifier (if applicable)
object_id – str, object node (CURIE) identifier
object_query_id – Optional[str], object node (CURIE) query node identifier (if applicable)
edge_id – str, edge identifier
results – List of (TRAPI-version specific) Result objects
- Returns:
bool, True if testcase S-P-O edge was found in the results
to_dict
()Export BiolinkValidator contents as a Python dictionary (including Biolink version and parent class dictionary content).
validate
(instance, component)Validate instance against schema.
validate_agent_type
(edge_id, found, value)Validate the value of the 'agent_type' of the given edge.
validate_attribute_constraints
(edge_id, edge)Validate Query Edge Attributes.
validate_attributes
(graph_type, edge_id, edge)Validate Knowledge Edge Attributes.
validate_binding
(q_node_entry, target_id, ...)Validate that a specified target_id has a valid node_binding in specified node binding details.
validate_biolink
()Predicate to check if the Biolink (version) is tagged to 'suppress' compliance validation.
validate_category
(context, node_id, category)Validate a Biolink category.
validate_element_status
(graph_type, context, ...)Detect element missing from Biolink, or is deprecated, abstract or mixin, signalled as a failure or warning.
validate_graph_edge
(edge, graph_type)Validate slot properties of a relationship ('biolink:Association') edge.
validate_graph_node
(node_id, slots, graph_type)Validate slot properties (mainly 'categories') of a node.
validate_infores
(context, edge_id, identifier)Validate that the specified identifier is a well-formed Infores CURIE.
validate_knowledge_level
(edge_id, found, value)Validate the value of a 'knowledge_level' of the given edge.
validate_predicate
(edge_id, predicate, ...)Validates predicates based on their meta-nature: existence, mixin, deprecation, etc.
validate_provenance
(edge_id, ara_source, ...)Validates ARA and KP infores knowledge sources based on surveyed Edge slots (recorded in edge "attributes" pre-1.4.0; in "sources", post-1.4.0).
validate_qualifier_constraints
(edge_id, edge)Validate Query Edge Qualifier Constraints.
validate_qualifier_entry
(context, edge_id, ...)Validate Qualifier Entry (JSON Object).
validate_qualifiers
(edge_id, edge[, ...])Validate Knowledge Edge Qualifiers.
validate_slot_value
(slot_name, context, ...)Validate the single value of a specified slot of the given knowledge graph entity.
- Parameters:
slot_name – str, name of a valid slot, a value for which is to be validated
context – str, context of the validation (e.g. node or edge id)
found – bool, current status of slot detection. Should be True if the slot was already previously seen
value – Optional[str], the value to be validated
- Returns:
bool, True if valid slot and value (validation messages recorded in the BiolinkValidator).
validate_sources
(edge_id, edge)Validate (TRAPI 1.4.0-beta ++) Edge sources provenance.
count_node
get_attribute_type_exclusions
get_bmt
has_dangling_nodes
has_valid_node_information
is_trapi_1_4_or_later
is_treats
merge_identified_messages
merge_scoped_messages
reset_node_info
validate_input_edge_node
- category_matched(source_categories: List[str], target_categories: List[str]) str | None
For each ‘source’ Biolink Model category given (list of CURIEs as strings?), first get the union set of all parent (ancestral) categories, then check if at least one of these categories is matched to the list of target categories.
- Parameters:
source_categories – List[str], list of ‘source’ categories whose category hierarchy is to be matched.
target_categories – List[str], list of ‘target’ categories to be matched against ‘source’ (or ‘source parent’) categories
- Returns:
Optional[str], the category matched (could be a generic parent of a ‘source’ category)
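The ancestor-union matching described above can be illustrated with a minimal, self-contained sketch. It does not use the validator's own code: the `TOY_PARENTS` hierarchy and the helper names are hypothetical, and the hierarchy is deliberately simplified, whereas the real method presumably consults the Biolink Model Toolkit.

```python
from typing import Dict, List, Optional

# Hypothetical, simplified hierarchy (child -> parent) for illustration only;
# it is not the real Biolink Model class hierarchy.
TOY_PARENTS: Dict[str, Optional[str]] = {
    "biolink:Gene": "biolink:BiologicalEntity",
    "biolink:Disease": "biolink:BiologicalEntity",
    "biolink:BiologicalEntity": "biolink:NamedThing",
    "biolink:NamedThing": None,
}


def ancestors(category: str) -> List[str]:
    """Return the category followed by its ancestors, most specific first."""
    trail: List[str] = []
    current: Optional[str] = category
    while current is not None:
        trail.append(current)
        current = TOY_PARENTS.get(current)
    return trail


def category_matched_sketch(
    source_categories: List[str], target_categories: List[str]
) -> Optional[str]:
    """Return the first source category (or ancestor) found among the targets."""
    for source in source_categories:
        for candidate in ancestors(source):
            if candidate in target_categories:
                return candidate
    return None
```

In this sketch a 'source' of `biolink:Gene` matches a 'target' of `biolink:BiologicalEntity` through its parent, which mirrors the note that the returned category could be a generic parent of a 'source' category.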
- check_compliance_of_trapi_response(response: Dict | None, max_kg_edges: int = 0, max_results: int = 0)
One stop validation of all components of a TRAPI-schema compliant Query.Response, including its Message against a designated Biolink Model release. The high level structure of a Query.Response is described in https://github.com/NCATSTranslator/ReasonerAPI/blob/master/docs/reference.md#response-.
The TRAPI Query.Response.Message is a Python Dictionary with three entries:
- Query Graph (“QGraph”): knowledge graph query input parameters
- Knowledge Graph: output knowledge (sub-)graph containing knowledge (Biolink Model compliant nodes, edges) returned from the target resource (KP, ARA) for the query.
- Results: a list of (annotated) node and edge bindings pointing into the Knowledge Graph, to represent the specific answers (subgraphs) satisfying the query graph constraints.
- Parameters:
response – Optional[Dict], Query.Response to be validated.
max_kg_edges – int, maximum number of edges to be validated from the knowledge graph of the response. A value of zero triggers validation of all edges in the knowledge graph (Default: 0 - use all edges)
max_results – int, target sample number of results to validate (default: 0 for ‘use all results’).
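As a rough usage sketch, the call might be wired up as follows. Only names documented on this page are used (the constructor defaults, check_compliance_of_trapi_response and dumps); the helper functions, the skeletal response and the limits of 10 are illustrative assumptions, and the reasoner-validator package must be installed for `run_validation` to work.

```python
from typing import Any, Dict


def build_minimal_response() -> Dict[str, Any]:
    """Build a skeletal Query.Response with the three Message entries."""
    return {
        "message": {
            "query_graph": {"nodes": {}, "edges": {}},
            "knowledge_graph": {"nodes": {}, "edges": {}},
            "results": [],
        }
    }


def run_validation(response: Dict[str, Any]) -> str:
    """Validate a response and return the human-readable report text."""
    # Imported lazily so this sketch loads without the package installed.
    from reasoner_validator.validator import TRAPIResponseValidator

    validator = TRAPIResponseValidator()  # all constructor defaults
    # Sample at most 10 KG edges and 10 results (illustrative limits only).
    validator.check_compliance_of_trapi_response(
        response, max_kg_edges=10, max_results=10
    )
    return validator.dumps()
```

An empty skeleton like this one would be expected to produce validation messages rather than a clean report; it is shown only to make the three-entry Message layout concrete.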
- get_aliases(curie: str) List[str] | None
Get clique of related identifiers from the Node Normalizer. Note that except for the cases of a missing or invalid CURIE input, this method is guaranteed to succeed in returning at least the input CURIE as one of the aliases; however, the method reports various validation warnings based on the completeness of the entry reported by the Node Normalizer.
- Parameters:
curie – str, CURIE of node identifier for which aliases are needed.
- Returns:
List[str], of all aliases (including at least the CURIE itself, unless validation error is encountered, then None)
- has_valid_knowledge_graph(message: Dict, edges_limit: int = 0) bool
Validate a TRAPI Knowledge Graph.
- Parameters:
message – Dict, input message expected to contain the ‘knowledge_graph’
edges_limit – int, integer maximum number of edges to be validated in the knowledge graph. A value of zero triggers validation of all edges in the knowledge graph (Default: 0 - use all edges)
- Returns:
bool, False, if validation errors
- has_valid_query_graph(message: Dict) bool
Validate a TRAPI Query Graph.
- Parameters:
message – input message expected to contain the ‘query_graph’
- Returns:
bool, False, if validation errors
- has_valid_results(message: Dict, sample_size: int = 0) bool
Validate TRAPI Results.
- Parameters:
message – input message expected to contain the ‘results’
sample_size – int, sample number of results to validate (default: 0 for ‘use all results’).
- Returns:
bool, False, if validation errors
- is_trapi_1_4_or_later() bool
- resolve_testcase_node(target: str, testcase: Dict, nodes: Dict) Tuple[str, str, str | None] | None
Resolve the knowledge graph node identifiers against the testcase identifier of the ‘target’ context (‘subject’ or ‘object’ node). If a direct match is not found for the testcase identifier, check whether the node identifiers returned in the knowledge graph are strict ontological subclasses of the target testcase identifier (e.g. the knowledge graph may return a subclass of a MONDO disease requested by the testcase). Node matches must also be compatible in terms of Biolink Model category.
- Parameters:
target – ‘subject’ or ‘object’
testcase – Dict, full test testcase (to access the target node ‘category’)
nodes – Dict, details about knowledge graph nodes, indexed by node identifiers
- Returns:
Optional[Tuple[str, str, Optional[str]]], returns the KG node identifier, category, and query identifier matched (if applicable); None if no match
- static sample_graph(graph: Dict, edges_limit: int = 0) Dict
Only process a strict subsample of the TRAPI Response Message knowledge graph.
- Parameters:
graph (Dict) – original knowledge graph
edges_limit (int) – integer maximum number of edges to be validated in the knowledge graph. A value of zero triggers validation of all edges in the knowledge graph (Default: 0 - use all edges)
- Returns:
Dict, ‘edges_limit’ sized subset of knowledge graph
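A minimal sketch of such edge-limited sampling is shown below. It assumes the first 'edges_limit' edges are kept along with the nodes they reference; the actual selection strategy of the library is not stated here, and the function name is hypothetical.

```python
from typing import Any, Dict


def sample_graph_sketch(graph: Dict[str, Any], edges_limit: int = 0) -> Dict[str, Any]:
    """Keep at most 'edges_limit' edges plus the nodes those edges reference."""
    if edges_limit <= 0:
        return graph  # zero means "use all edges"
    nodes = graph.get("nodes", {})
    sample: Dict[str, Any] = {"nodes": {}, "edges": {}}
    for edge_id, edge in list(graph.get("edges", {}).items())[:edges_limit]:
        sample["edges"][edge_id] = edge
        # Carry over the endpoint nodes so sampled edges are not left dangling.
        for slot in ("subject", "object"):
            node_id = edge.get(slot)
            if node_id in nodes:
                sample["nodes"][node_id] = nodes[node_id]
    return sample
```

Copying the endpoint nodes alongside each sampled edge keeps the subgraph internally consistent, so that later node checks do not fail merely because of the sampling.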
- static sample_results(results: List, sample_size: int = 0) List
Subsample the results to a maximum size of ‘sample_size’
- Parameters:
results – List, original list of Results
sample_size – int, target sample size (default: 0 for ‘use all results’).
- Returns:
List, ‘sample_size’ sized subset of Results
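The 'zero means all' convention can be sketched in a few lines. This assumes the sample is simply the leading 'sample_size' results; whether the library samples from the front or at random is not stated here, and the function name is hypothetical.

```python
from typing import Any, List


def sample_results_sketch(results: List[Any], sample_size: int = 0) -> List[Any]:
    """Return at most 'sample_size' results; zero means 'use all results'."""
    if sample_size <= 0:
        return results
    # Slicing past the end is safe, so no explicit length check is needed.
    return results[:sample_size]
```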
- static sanitize_workflow(response: Dict) Dict
Workflows in TRAPI Responses cannot be validated further due to missing tags and None values. This method is a temporary workaround to sanitize the query for additional validation.
- Parameters:
response – Dict full TRAPI Response JSON object
- Returns:
Dict, response with discretionary removal of content which triggers (temporarily) unwarranted TRAPI validation failures
- static testcase_edge_bindings(query_edges: Dict, target_edge_id: str, data: Dict) bool
Check if target query edge id and knowledge graph edge id are in specified edge_bindings.
- Parameters:
query_edges – List[str], expected query edge identifiers in a matching result
target_edge_id – str, expected knowledge edge identifier in a matching result
data – TRAPI version-specific Response context from which the ‘edge_bindings’ may be retrieved
- Returns:
True, if found
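The edge binding check can be approximated with a short standalone sketch. It assumes that 'data' is a dictionary carrying an 'edge_bindings' entry that maps query edge identifiers to lists of bindings, each with an 'id' (a Result before TRAPI 1.4.0, an Analysis afterwards); the function name is hypothetical.

```python
from typing import Any, Collection, Dict


def edge_binding_found(
    query_edges: Collection[str], target_edge_id: str, data: Dict[str, Any]
) -> bool:
    """True if a known query edge is bound to the target knowledge graph edge."""
    for query_edge_id, bindings in data.get("edge_bindings", {}).items():
        if query_edge_id not in query_edges:
            continue  # binding belongs to a query edge we were not asked about
        if any(binding.get("id") == target_edge_id for binding in bindings):
            return True
    return False
```

Because membership is tested with `in`, the sketch accepts either a list of query edge identifiers or a dictionary keyed by them.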
- testcase_input_found_in_response(testcase: Dict, response: Dict) bool
Predicate to validate whether the edge specified by the test case is returned in the Knowledge Graph of the TRAPI Response Message. This method assumes that the TRAPI response has already been validated as well-formed.
- Parameters:
testcase – Dict, input data test case
response – Dict, TRAPI Response whose message ought to contain the test case edge
- Returns:
True if test case edge found; False otherwise
- testcase_node_bindings(query_nodes: Dict, subject_id: str, subject_query_id: str | None, object_id: str, object_query_id: str | None, data: Dict) bool
Check if the specified subject and object identifier are found in the result node bindings. Expected query_id’s are also validated.
- Parameters:
query_nodes – query nodes dictionary
subject_id – expected node identifier of the knowledge graph subject
subject_query_id – expected bound ‘query_id’ if not the ‘subject_id’ (see TRAPI spec)
object_id – expected node identifier of the knowledge graph object
object_query_id – expected bound ‘query_id’ if not the ‘object_id’ (see TRAPI spec)
data – the result object
- Returns:
bool, True if node_bindings found for specified subject and object
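A simplified sketch of this check follows. It assumes a result dictionary whose 'node_bindings' entry maps query node identifiers to lists of bindings with an 'id' and an optional 'query_id', and it ignores the 'query_nodes' dictionary that the real method also receives; the function names are hypothetical.

```python
from typing import Any, Dict, List, Optional


def binding_matches(
    binding: Dict[str, Any], target_id: str, target_query_id: Optional[str]
) -> bool:
    """True if the binding points at target_id (and at query_id, if expected)."""
    if binding.get("id") != target_id:
        return False
    if target_query_id is None:
        return True  # no distinct query identifier is expected
    return binding.get("query_id") == target_query_id


def node_bindings_found(
    result: Dict[str, Any],
    subject_id: str,
    subject_query_id: Optional[str],
    object_id: str,
    object_query_id: Optional[str],
) -> bool:
    """True if both the subject and the object are bound in the result."""
    bindings: List[Dict[str, Any]] = [
        binding
        for entries in result.get("node_bindings", {}).values()
        for binding in entries
    ]
    subject_found = any(
        binding_matches(b, subject_id, subject_query_id) for b in bindings
    )
    object_found = any(
        binding_matches(b, object_id, object_query_id) for b in bindings
    )
    return subject_found and object_found
```

The 'query_id' comparison only applies when an expected query identifier is supplied, which corresponds to the case where the bound knowledge graph identifier differs from the identifier used in the query.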
- testcase_node_category_found(target, node_id, testcase, node_details) str | None
Retrieve the most specific Biolink Model category match of knowledge graph node to testcase.
- Parameters:
target – the concept node type of interest: the ‘subject’ or the ‘object’
node_id – str, identifier of node in “nodes” catalog whose category is to be matched against the testcase
testcase – Dict, full test testcase against which the input node is being matched
node_details – Dict, details about an individual knowledge graph node being processed.
- Returns:
str, most specific Biolink Model category match of knowledge graph node to testcase; None if not found
- testcase_node_found(target: str, target_id_aliases: List[str], testcase: Dict, nodes: Dict) Tuple[str, str, str | None] | None
Check for the presence of at least one of the given identifiers, with expected categories, in the “nodes” catalog. A match is reported if such an identifier is found and at least one KG node category is the expected category, or a proper subclass category, of the testcase category. No match is reported if the node is found but the testcase category is only a subclass category of the KG node categories (i.e. the KG node categories are too general), if the identifier is not found in the nodes catalog, or if there is no overlap between the (expected or parent) testcase and node categories.
- Parameters:
target – the concept node type of interest: the ‘subject’ or the ‘object’
target_id_aliases – List of (CURIE) target identifier aliases to be matched against the “nodes” catalog
testcase – Dict, full test testcase (to access the target node ‘category’)
nodes – Dict, catalog of knowledge graph nodes, indexed by node identifiers, with node details as values.
- Returns:
Optional[Tuple[str, str, Optional[str]]], returns the KG node identifier, category, and query identifier matched (if applicable); None if no match
- testcase_result_found(query_graph: Dict, subject_id: str, subject_query_id: str | None, object_id: str, object_query_id: str | None, edge_id: str, results: List) bool
Validate that the testcase S-P->O edge is found bound to the Results.
- Parameters:
query_graph – Dict, query graph to which the results pertain
subject_id – str, subject node (CURIE) identifier
subject_query_id – Optional[str], subject node (CURIE) query node identifier (if applicable)
object_id – str, object node (CURIE) identifier
object_query_id – Optional[str], object node (CURIE) query node identifier (if applicable)
edge_id – str, edge identifier
results – List of (TRAPI-version specific) Result objects
- Returns:
bool, True if testcase S-P-O edge was found in the results
- validate_binding(q_node_entry: Dict, target_id: str, target_query_id: str | None, node_binding_details: Dict) bool
Validate that a specified target_id has a valid node_binding in specified node binding details.
- Parameters:
q_node_entry – Dict, query node data currently being searched
target_id – str, target knowledge graph node identifier to be matched to node binding
target_query_id – Optional[str], query identifier related to the target identifier, if not identical to the target_id
node_binding_details – Dict, data relating to a given node_binding of query to knowledge graph identifier
- Returns:
bool, True if a valid node binding was found