Biolink Validation
Version-specific Biolink Model semantic validation of knowledge graph components.
- class reasoner_validator.biolink.BMTWrapper(biolink_version: str | None = None)
Bases: object
Methods
get_biolink_version
()Returns the Biolink Model version currently targeted by the ValidationReporter.
get_inverse_predicate
(predicate)Utility wrapper of logic to robustly test if a predicate exists and has an inverse.
is_symmetric
(name)Checks whether a given element, identified by name, is a symmetric (predicate) slot.
reset_biolink_version
(version)Reset Biolink Model version tracked by the ValidationReporter.
get_bmt
- get_biolink_version() str
- Returns:
Biolink Model version currently targeted by the ValidationReporter.
- Rtype biolink_version:
str
- get_bmt() Toolkit | None
- get_inverse_predicate(predicate: str | None) str | None
Utility wrapper of logic to robustly test if a predicate exists and has an inverse.
- Parameters:
predicate – CURIE or string name of predicate for which the inverse is sought.
- Returns:
CURIE string of inverse predicate, if it exists; None otherwise.
- is_symmetric(name: str) bool
Checks whether a given element, identified by name, is a symmetric (predicate) slot.
- Parameters:
name – name of the element
- Returns:
True if element is a symmetric (predicate) slot.
- reset_biolink_version(version: str)
Reset Biolink Model version tracked by the ValidationReporter.
- Parameters:
version – new version
- Returns:
None
- class reasoner_validator.biolink.BiolinkValidator(default_test: str | None = None, default_target: str | None = None, trapi_version: str | None = None, biolink_version: str | None = None, target_provenance: Dict[str, str] | None = None, strict_validation: bool | None = None)
Bases: TRAPISchemaValidator, BMTWrapper
Wrapper class for Biolink Model validation of a TRAPI message.
Methods
add_messages
(new_messages)Batch addition of MESSAGES_BY_TARGET messages to a ValidationReporter instance. The new_messages argument is a MESSAGES_BY_TARGET structure: messages indexed by target, test and category (one of "information", "skipped tests", "warnings", "errors" or "critical"), with code-keyed dictionaries of (structured) message parameters.
apply_validation
(validation_method, *args, ...)Wrapper to allow validation_methods direct access to the ValidationReporter.
build_source_trail
(sources)Returns a 'source_trail' path from 'primary_knowledge_source' upwards.
check_biolink_model_compliance
(graph, graph_type)Validate a TRAPI-schema compliant Message graph-like data structure against the currently active Biolink Model Toolkit model version.
check_biolink_model_compliance_of_input_edge
(edge)Validate the contents of a templated test input edge against the current BMT Biolink Model release.
dump
([title, id_rows, msg_rows, ...])Dump all available messages captured by the ValidationReporter, printed as formatted human-readable text, on a specified file device.
dump_all_messages
([test, target, flat])Dump all messages for a given test from a given target, as JSON.
dump_critical
([test, target, flat])Dump critical error messages as JSON.
dump_errors
([test, target, flat])Dump 'error' messages as JSON.
dump_info
([test, target, flat])Dump 'information' messages as JSON.
dump_messages_type
(message_type[, test, ...])Dump ValidationReporter messages of type 'message_type' as JSON.
dump_skipped
([test, target, flat])Dump 'skipped test' messages as JSON.
dump_warnings
([test, target, flat])Dump 'warning' messages as JSON.
dumps
([id_rows, msg_rows, compact_format])Text string version of dump(): returns all available messages captured by the ValidationReporter, as a formatted human-readable text blob.
get_all_messages
()Get copy of all MESSAGES_BY_TARGET as a Python data structure.
get_all_messages_of_type
(message_type)Get MESSAGE_PARTITION dictionary of all ValidationReporter messages of a given 'message_type', harvested from all target and test contexts.
get_biolink_version
()Returns the Biolink Model version currently tracked by the TRAPISchemaValidator.
get_critical
([test, target])Get copy of all recorded 'critical' error messages, for a given test from a given target.
get_default_target
()Returns the current target of the ValidationReporter.
get_default_test
()Returns the current default test identifier of the ValidationReporter.
get_errors
([test, target])Get copy of all recorded 'error' messages, for a given test from a given target.
get_info
([test, target])Get copy of all recorded 'information' messages, for a given test from a given target.
get_inverse_predicate
(predicate)Utility wrapper of logic to robustly test if a predicate exists and has an inverse.
get_message_type
(code)Get type of message code.
get_messages_by_target
([target])Returns a block of MESSAGES_BY_TEST corresponding to a given or default target.
get_messages_by_test
([test, target])Returns MESSAGE_CATALOG corresponding to a given or default target. Note that the dictionary returned is not a copy of the original, so take care not to mutate it!
- Parameters:
test – str, specified test (gets current 'default' test if not given)
target – str, specified target (gets current 'default' target if not given)
- Returns:
MESSAGES_BY_TEST corresponding to a resolved target.
get_messages_of_type
(message_type[, test, ...])Get Python data dictionary of ValidationReporter messages of 'message_type', for a specified (or default?) target and test.
get_node_categories
(node_id)Returns the categories associated with a given node_id; None if node_id is currently unknown or has no categories.
get_node_identifiers
()Returns the list of currently registered node_ids.
get_result
()Get result of validation.
get_skipped
([test, target])Get copy of all recorded 'skipped test' messages, for a given test from a given target.
get_target_provenance
()Returns infores-prefix-normalized target provenance metadata.
get_trapi_version
()Returns the TRAPI (SemVer) version currently targeted by the TRAPISchemaValidator, as a str.
get_warnings
([test, target])Get copy of all recorded 'warning' messages, for a given test from a given target.
has_critical
([test, target])Predicate to detect any recorded critical error messages.
has_errors
([test, target])Predicate to detect any recorded error messages.
has_information
([test, target])Predicate to detect any recorded information messages.
has_message_type
(message_type[, test, target])Predicate to detect if ValidationReporter has any non-empty messages of type 'message_type'.
has_messages
([test, target])Predicate to detect any recorded validation messages.
has_skipped
([test, target])Predicate to detect any recorded 'skipped test' messages.
has_warnings
([test, target])Predicate to detect any recorded warning messages.
is_strict_validation
(graph_type[, ...])Predicate to test if strict validation is to be applied.
is_symmetric
(name)Checks whether a given element, identified by name, is a symmetric (predicate) slot.
is_valid_trapi_query
(instance[, component])Make sure that the Message is a syntactically valid TRAPI Query JSON object.
merge
(reporter)Merge all messages and metadata from a second BiolinkValidator, into the calling TRAPISchemaValidator instance.
merge_coded_messages
(aggregated, additions)Merge additional MESSAGE_PARTITION content into an already aggregate MESSAGE_PARTITION.
minimum_required_biolink_version
(version)Takes a simple 'major.minor.patch' Biolink Model SemVer.
minimum_required_trapi_version
(version)Takes a simple 'major.minor.patch' TRAPI schema release SemVer.
report
(code[, test, target, source_trail])Capture a single validation message, as per specified 'code' (with any code-specific contextual parameters).
- Parameters:
code – str, dot delimited validation path code
test – str, specified test (gets current 'default' test if not given)
target – str, specified target (gets current 'default' target if not given)
source_trail – Optional[str], audit trail of knowledge source provenance for a given Edge, as a string. Defaults to "global" if not specified.
message – **Dict, named parameters representing extra (str-formatted) context for the given code message
- Returns:
None (the validation message is recorded internally).
report_header
([title, compact_format])Return a suitably generated report header.
- Parameters:
title – Optional[str], if title is None, then only the 'reasoner-validator' version is printed out in the header. If the title is an empty string (the default), then 'Validation Report' is used.
compact_format – bool, whether to print the header in compact format (default: True). Otherwise, extra line feeds add space around the header and control characters are output to underline it.
- Returns:
str, generated header.
reset_biolink_version
(version)Reset Biolink Model version tracked by the ValidationReporter.
reset_default_target
(name)Resets the default target identifier of the ValidationReporter to a new string.
reset_default_test
(name)Resets the default test identifier of the ValidationReporter to a new string.
reset_trapi_version
(version)Reset TRAPI version tracked by the TRAPISchemaValidator.
set_nodes
(nodes)Records additional nodes, uniquely by node_id, with specified categories.
test_case_has_validation_errors
(tag, case)Check if test case has validation errors.
to_dict
()Export BiolinkValidator contents as a Python dictionary (including Biolink version and parent class dictionary content).
validate
(instance, component)Validate instance against schema.
validate_agent_type
(edge_id, found, value)Validate the value of an 'agent_type' of the given edge.
validate_attribute_constraints
(edge_id, edge)Validate Query Edge Attributes.
validate_attributes
(graph_type, edge_id, edge)Validate Knowledge Edge Attributes.
validate_biolink
()Predicate to check if the Biolink (version) is tagged to 'suppress' compliance validation.
validate_category
(context, node_id, category)Validate a Biolink category.
validate_element_status
(graph_type, context, ...)Detect an element that is missing from Biolink, or that is deprecated, abstract or a mixin; each case is signalled as a failure or warning.
validate_graph_edge
(edge, graph_type)Validate slot properties of a relationship ('biolink:Association') edge.
validate_graph_node
(node_id, slots, graph_type)Validate slot properties (mainly 'categories') of a node.
validate_infores
(context, edge_id, identifier)Validate that the specified identifier is a well-formed Infores CURIE.
validate_knowledge_level
(edge_id, found, value)Validate the value of a 'knowledge_level' of the given edge.
validate_predicate
(edge_id, predicate, ...)Validates predicates based on their meta-nature: existence, mixin, deprecation, etc.
validate_provenance
(edge_id, ara_source, ...)Validates ARA and KP infores knowledge sources based on surveyed Edge slots (recorded in edge "attributes" pre-1.4.0; in "sources", post-1.4.0).
validate_qualifier_constraints
(edge_id, edge)Validate Query Edge Qualifier Constraints.
validate_qualifier_entry
(context, edge_id, ...)Validate Qualifier Entry (JSON Object).
validate_qualifiers
(edge_id, edge[, ...])Validate Knowledge Edge Qualifiers.
validate_slot_value
(slot_name, context, ...)Validate the single-valued value of a specified slot of the given knowledge graph entity.
- Parameters:
slot_name – str, name of a valid slot, a value for which is to be validated
context – str, context of the validation (e.g. node or edge id)
found – bool, current status of slot detection; should be True if the slot was already previously seen
value – Optional[str], the value to be validated
- Returns:
bool, True if valid slot and value (validation messages recorded in the BiolinkValidator).
validate_sources
(edge_id, edge)Validate (TRAPI 1.4.0-beta ++) Edge sources provenance.
count_node
get_attribute_type_exclusions
get_bmt
has_dangling_nodes
has_valid_node_information
is_treats
merge_identified_messages
merge_scoped_messages
reset_node_info
validate_input_edge_node
- CATEGORY_INCLUSIONS = ['biolink:BiologicalEntity', 'biolink:InformationContentEntity']
- PREDICATE_INCLUSIONS = ['biolink:interacts_with', 'biolink:treats']
- static build_source_trail(sources: Dict[str, List[str]] | None) str | None
Returns a ‘source_trail’ path from ‘primary_knowledge_source’ upwards. The “sources” should have exactly one primary knowledge source (with an empty ‘upstream_resource_ids’ list).
- Parameters:
sources – Optional[Dict[str, List[str]]], catalog of upstream knowledge sources indexed by resource_id
- Returns:
Optional[str] source (“audit”) trail (‘path’) from primary to topmost wrapper knowledge source infores
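The traversal described above can be illustrated with a minimal standalone sketch. It is not the library's implementation: the function name `sketch_source_trail`, the ` -> ` separator and the choice to return None when there is not exactly one primary source are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def sketch_source_trail(sources: Optional[Dict[str, List[str]]]) -> Optional[str]:
    """Hypothetical re-implementation of the idea, for illustration only."""
    if not sources:
        return None
    # The primary knowledge source is the entry with no upstream resources.
    primaries = [rid for rid, upstream in sources.items() if not upstream]
    if len(primaries) != 1:
        return None  # assumed behaviour: exactly one primary source is required
    current = primaries[0]
    trail = [current]
    visited = {current}
    while True:
        # Find a not-yet-visited resource that lists the current one as upstream.
        downstream = [
            rid for rid, upstream in sources.items()
            if current in upstream and rid not in visited
        ]
        if not downstream:
            break
        current = downstream[0]
        visited.add(current)
        trail.append(current)
    return " -> ".join(trail)  # separator is an assumption


example_sources = {
    "infores:chebi": [],
    "infores:molepro": ["infores:chebi"],
    "infores:arax": ["infores:molepro"],
}
print(sketch_source_trail(example_sources))
```

The `visited` set only guards the sketch against cyclic input; whether the library does anything similar is not stated in this reference.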
- check_biolink_model_compliance(graph: Dict, graph_type: TRAPIGraphType)
Validate a TRAPI-schema compliant Message graph-like data structure against the currently active Biolink Model Toolkit model version.
- Parameters:
graph – Dict, knowledge graph to be validated
graph_type – TRAPIGraphType, component type of TRAPI graph to be validated
- check_biolink_model_compliance_of_input_edge(edge: Dict[str, str])
Validate the contents of a templated test input edge against the current BMT Biolink Model release.
Sample ‘edge’ argument, with the expected dictionary tags:
{
    'subject_category': 'biolink:AnatomicalEntity',
    'object_category': 'biolink:AnatomicalEntity',
    'predicate': 'biolink:subclass_of',
    'subject': 'UBERON:0005453',
    'object': 'UBERON:0035769'
}
- Parameters:
edge (Dict[str,str]) – basic dictionary of a templated input edge - S-P-O including concept Biolink Model categories
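Before handing such an edge to the validator, it can help to confirm that the dictionary carries the tags shown in the sample. The following is a small standalone sketch: the helper `missing_edge_tags` is hypothetical, and treating all five tags as required is an assumption drawn only from the sample above.

```python
from typing import Dict, List

# Tags taken from the sample edge in this reference; whether the library
# treats every one of them as mandatory is an assumption.
EXPECTED_TAGS = (
    "subject_category",
    "object_category",
    "predicate",
    "subject",
    "object",
)


def missing_edge_tags(edge: Dict[str, str]) -> List[str]:
    """Return the expected tags that are absent or empty in the edge."""
    return [tag for tag in EXPECTED_TAGS if not edge.get(tag)]


sample_edge = {
    "subject_category": "biolink:AnatomicalEntity",
    "object_category": "biolink:AnatomicalEntity",
    "predicate": "biolink:subclass_of",
    "subject": "UBERON:0005453",
    "object": "UBERON:0035769",
}
print(missing_edge_tags(sample_edge))
```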
- count_node(node_id: str)
- get_attribute_type_exclusions() List[str]
- get_biolink_version() str
- Returns:
Biolink Model version currently tracked by the TRAPISchemaValidator.
- Rtype biolink_version:
str
- get_node_categories(node_id: str) List[str] | None
Categories by ‘node_id’.
- Parameters:
node_id – str
- Returns:
For a given node_id, returns the associated categories; None if node_id is currently unknown or has no categories.
- get_node_identifiers() List[str]
- Returns:
List of currently registered node_ids
- get_result() Tuple[str, Dict[str, Dict[str, Dict[str, Dict[str, Dict[str, Dict[str, List[Dict[str, str]] | None] | None]]]]]]
Get result of validation.
- Returns:
Tuple of the model version used for the validation and the dictionary of reported validation messages (nested as given in the signature above).
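The messages dictionary is deeply nested, so a small walker can make it easier to summarize. The sketch below assumes only the outer three levels described for MESSAGES_BY_TARGET elsewhere in this reference (target, then test, then message type, each holding code-keyed dictionaries). The helper name, the sample target and test names, and the message codes are hypothetical.

```python
from typing import Any, Dict

MESSAGE_TYPES = ("information", "skipped tests", "warnings", "errors", "critical")


def count_codes_by_type(
    messages: Dict[str, Dict[str, Dict[str, Dict[str, Any]]]]
) -> Dict[str, int]:
    """Count distinct message codes per message type, across targets and tests."""
    counts = {message_type: 0 for message_type in MESSAGE_TYPES}
    for tests in messages.values():            # level 1: target
        for by_type in tests.values():         # level 2: test
            for message_type, coded in by_type.items():  # level 3: message type
                counts[message_type] = counts.get(message_type, 0) + len(coded)
    return counts


# Hypothetical sample; the codes and the inner layout are placeholders.
sample_messages = {
    "Hypothetical Target": {
        "Hypothetical Test": {
            "information": {},
            "skipped tests": {},
            "warnings": {"warning.example.code": {"global": None}},
            "errors": {
                "error.example.one": {"global": None},
                "error.example.two": {"global": None},
            },
            "critical": {},
        }
    }
}
print(count_codes_by_type(sample_messages))
```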
- get_target_provenance() Tuple[str | None, str | None, str | None]
Returns infores-prefix-normalized target provenance metadata.
- Returns:
Tuple[Optional[str], Optional[str], Optional[str]] of ara_source, kp_source, kp_source_type
- has_dangling_nodes() List[str]
- has_valid_node_information(graph_type: TRAPIGraphType) bool
- is_treats(predicate: str | None) bool
- merge(reporter)
Merge all messages and metadata from a second BiolinkValidator, into the calling TRAPISchemaValidator instance.
- Parameters:
reporter – second BiolinkValidator
- minimum_required_biolink_version(version: str) bool
- Parameters:
version – simple ‘major.minor.patch’ Biolink Model SemVer
- Returns:
True if current version is equal to, or newer than, a targeted ‘minimum_version’
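A "current version is equal to, or newer than, the minimum" test over simple ‘major.minor.patch’ strings can be sketched as a tuple comparison. This is a standalone illustration with hypothetical helper names, not the library's code, and it deliberately ignores pre-release or build suffixes.

```python
from typing import Tuple


def parse_semver(version: str) -> Tuple[int, int, int]:
    """Split a simple 'major.minor.patch' string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def meets_minimum(current: str, minimum: str) -> bool:
    """True if 'current' is equal to, or newer than, 'minimum'."""
    # Tuples compare element by element, so 3.10.0 correctly beats 3.9.0.
    return parse_semver(current) >= parse_semver(minimum)


print(meets_minimum("3.10.0", "3.9.0"))
```

Comparing integer tuples rather than raw strings avoids the lexicographic trap in which "3.10.0" would sort before "3.9.0".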
- report_header(title: str | None = None, compact_format: bool = True) str
Return a suitably generated report header.
- Parameters:
title – Optional[str], if title is None, then only the ‘reasoner-validator’ version is printed out in the header. If the title is an empty string (the default), then ‘Validation Report’ is used.
compact_format – bool, whether to print the header in compact format (default: True). Otherwise, extra line feeds add space around the header and control characters are output to underline it.
- Returns:
str, generated header.
- reset_biolink_version(version: str)
Reset Biolink Model version tracked by the ValidationReporter.
- Parameters:
version – new version
- Returns:
None
- reset_node_info(graph_type: TRAPIGraphType)
- set_nodes(nodes: Dict)
Records additional nodes, uniquely by node_id, with specified categories.
- Parameters:
nodes – Dict, node_id indexed node categories. A given node_id is tagged with “None” if the categories are missing?
- Returns:
None
- to_dict() Dict
Export BiolinkValidator contents as a Python dictionary (including Biolink version and parent class dictionary content).
- Returns:
Dict
- validate_agent_type(edge_id: str, found: bool, value: str | None) bool
Validate the value of an ‘agent_type’ of the given edge.
- Parameters:
edge_id – str, identifier of the edge being validated
found – bool, current status of slot detection, True if already seen previously (return ‘True’ value here)
value – Optional[str], the value to be validated
- Returns:
bool, True if a valid slot and value were found (validation messages recorded in the BiolinkValidator)
- validate_attribute_constraints(edge_id: str, edge: Dict)
Validate Query Edge Attributes.
- Parameters:
edge_id – str, string identifier for the edge (for reporting purposes)
edge – Dict, the edge object in which the attributes are expected to be found
- Returns:
None (validation messages captured in the ‘self’ BiolinkValidator context)
- validate_attributes(graph_type: TRAPIGraphType, edge_id: str, edge: Dict, source_trail: str | None = None) str | None
Validate Knowledge Edge Attributes. For TRAPI 1.3.0, may also return an ordered audit trail of Edge provenance infores-specified knowledge sources, as parsed in from the list of attributes (returns ‘None’ otherwise).
- Parameters:
graph_type – TRAPIGraphType, type of TRAPI graph component being validated
edge_id – str, string identifier for the edge (for reporting purposes)
edge – Dict, the edge object in which the attributes are expected to be found
source_trail – Optional[str], audit trail of knowledge source provenance for a given Edge, as a string.
- Returns:
Optional[str], audit trail of knowledge source provenance for a given Edge, as a string.
- validate_biolink() bool
Predicate to check if the Biolink (version) is tagged to ‘suppress’ compliance validation.
- Returns:
bool, returns ‘True’ if Biolink Validation is expected.
- validate_category(context: str, node_id: str | None, category: str | None) ClassDefinition
Validate a Biolink category.
Reports ‘unknown’ or ‘missing’ (None or empty string) category names as errors, and deprecated categories as warnings. Both ‘mixin’ and ‘abstract’ categories are silently accepted as valid, but they are not considered ‘concrete’, so the method returns None for them. Only a ‘concrete’ category yields a non-None return value.
- Parameters:
context – str, label for context of concept whose category is being validated, i.e. ‘Subject’ or ‘Object’
node_id – str, CURIE of concept node whose category is being validated
category – str, CURIE of putative concept ‘category’
- Returns:
category as a ClassDefinition, only returned if ‘concrete’; None otherwise.
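The decision rules above can be sketched against a toy lookup table. Everything here is hypothetical: the `ToyCategory` class, the flags in `TOY_MODEL` (which are not read from any Biolink Model release), the message labels, and the choice to still return a deprecated but concrete category. The real method resolves categories through the Biolink Model Toolkit and records coded messages in the validator.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ToyCategory:
    name: str
    deprecated: bool = False
    abstract: bool = False
    mixin: bool = False


# Illustrative flags only; not taken from any actual Biolink Model release.
TOY_MODEL: Dict[str, ToyCategory] = {
    "biolink:Gene": ToyCategory("biolink:Gene"),
    "biolink:OldThing": ToyCategory("biolink:OldThing", deprecated=True),
    "biolink:BiologicalEntity": ToyCategory("biolink:BiologicalEntity", abstract=True),
    "biolink:GeneOrGeneProduct": ToyCategory("biolink:GeneOrGeneProduct", mixin=True),
}


def check_category(
    category: Optional[str], messages: List[Tuple[str, str]]
) -> Optional[ToyCategory]:
    """Return the category only if it is concrete; record errors and warnings."""
    if not category:
        messages.append(("error", "missing"))
        return None
    found = TOY_MODEL.get(category)
    if found is None:
        messages.append(("error", "unknown"))
        return None
    if found.deprecated:
        messages.append(("warning", "deprecated"))
    if found.abstract or found.mixin:
        return None  # accepted silently, but not 'concrete'
    return found


log: List[Tuple[str, str]] = []
print(check_category("biolink:Gene", log), log)
```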
- validate_element_status(graph_type: TRAPIGraphType, context: str, identifier: str, edge_id: str, source_trail: str | None = None, ignore_graph_type: bool = False) Element | None
Detect an element that is missing from Biolink, or that is deprecated, abstract or a mixin; each case is signalled as a failure or warning.
- Parameters:
graph_type – TRAPIGraphType, type of TRAPI graph component being validated
context – str, parsing context (e.g. ‘Node’)
identifier – str, name of the putative Biolink element (‘class’)
edge_id – str, identifier of enclosing edge containing the element (e.g. the ‘edge_id’)
source_trail – Optional[str], audit trail of knowledge source provenance for a given Edge, as a string.
ignore_graph_type – bool, if strict validation is None (not set globally), then only apply graph-type-differential strict validation if ‘ignore_graph_type’ is False
- Returns:
Optional[Element], Biolink Element resolved to ‘name’ if the element has no validation error; None otherwise.
- validate_graph_edge(edge: Dict, graph_type: TRAPIGraphType)
Validate slot properties of a relationship (‘biolink:Association’) edge.
- Parameters:
edge – Dict[str, str], dictionary of slot properties of the edge.
graph_type – TRAPIGraphType, type of TRAPI component being validated
- validate_graph_node(node_id: str, slots: Dict[str, Any], graph_type: TRAPIGraphType)
Validate slot properties (mainly ‘categories’) of a node.
- Parameters:
node_id – str, identifier of a concept node
slots – Dict, properties of the node
graph_type – TRAPIGraphType, type of TRAPI graph component being validated
- validate_infores(context: str, edge_id: str, identifier: str) bool
Validate that the specified identifier is a well-formed Infores CURIE. Note that here we also now accept that the identifier can be a semicolon delimited list of such infores.
- Parameters:
context – reporting context as specified by a validation code prefix
edge_id – specific edge validated, for the purpose of reporting validation context
identifier – candidate (list of) infores curie(s) to be validated.
- Returns:
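Handling a semicolon-delimited list of infores identifiers can be sketched as follows. The regular expression is an assumed notion of "well-formed" chosen for illustration, and the helper `malformed_infores` is hypothetical; the library's actual rule may differ.

```python
import re
from typing import List

# Assumed pattern: 'infores:' prefix followed by a lowercase identifier.
INFORES_PATTERN = re.compile(r"^infores:[a-z0-9][a-z0-9\-_.]*$")


def malformed_infores(identifier: str) -> List[str]:
    """Return the entries of a semicolon-delimited list that fail the pattern."""
    entries = [entry.strip() for entry in identifier.split(";")]
    return [entry for entry in entries if not INFORES_PATTERN.match(entry)]


print(malformed_infores("infores:molepro;molepro"))
```

An empty result means that every entry in the list passed the assumed check.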
- validate_input_edge_node(context: str, node_id: str | None, category_name: str | None)
- validate_knowledge_level(edge_id: str, found: bool, value: str | None) bool
Validate the value of a ‘knowledge_level’ of the given edge.
- Parameters:
edge_id – str, identifier of the edge being validated
found – bool, current status of slot detection, True if already seen previously (return ‘True’ value here)
value – Optional[str], the value to be validated
- Returns:
bool, True if a valid slot and value were found (validation messages recorded in the BiolinkValidator)
- validate_predicate(edge_id: str, predicate: str, graph_type: TRAPIGraphType, source_trail: str | None = None)
Validates predicates based on their meta-nature: existence, mixin, deprecation, etc. with some notable hard-coded explicit PREDICATE_INCLUSIONS exceptions in earlier Biolink Model releases.
- Parameters:
edge_id – str, identifier of the edge whose predicate is being validated
predicate – str, putative Biolink Model predicate to be validated
source_trail – Optional[str], audit trail of knowledge source provenance for a given Edge, as a string.
graph_type – TRAPIGraphType, type of TRAPI graph component being validated
- Returns:
None (validation communicated via class instance of method)
- validate_provenance(edge_id, ara_source, found_ara_knowledge_source, kp_source, found_kp_knowledge_source, kp_source_type, found_primary_knowledge_source, source_trail: str | None)
Validates ARA and KP infores knowledge sources based on surveyed Edge slots (recorded in edge “attributes” pre-1.4.0; in “sources”, post-1.4.0).
- Parameters:
edge_id – str, string identifier for the edge (for reporting purposes)
ara_source – str, user specified target ARA infores
found_ara_knowledge_source – bool, True if target ARA infores knowledge source was found
kp_source – str, user specified target KP infores
found_kp_knowledge_source – bool, True if target KP infores knowledge source was found
kp_source_type – str, user specified KP knowledge source type (i.e. primary, aggregate, etc.)
source_trail – Optional[str], audit trail of knowledge source provenance for a given Edge, as a string.
found_primary_knowledge_source – List[str], list of all infores discovered tagged as ‘primary’
- Returns:
- validate_qualifier_constraints(edge_id: str, edge: Dict)
Validate Query Edge Qualifier Constraints.
- Parameters:
edge_id – str, string identifier for the edge (for reporting purposes)
edge – Dict, the edge object in which the qualifier constraints are expected to be found
- Returns:
None (validation messages captured in the ‘self’ BiolinkValidator context)
- validate_qualifier_entry(context: str, edge_id: str, qualifiers: List[Dict[str, str]], associations: List[str] | None = None, source_trail: str | None = None)
Validate Qualifier Entry (JSON Object).
- Parameters:
context – str, validation (subcode) context: either query graph qualifier constraints (“query_graph.edge.qualifier_constraints.qualifier_set”) or knowledge graph edge qualifiers (“knowledge_graph.edge.qualifiers”)
edge_id – str, string identifier for the edge (for reporting purposes)
qualifiers – List[Dict[str, str]], qualifier entries to be validated.
associations – Optional[List[str]], Biolink association subclasses possibly related to the current edge (default: None).
source_trail – Optional[str], audit trail of knowledge source provenance for a given Edge, as a string. Defaults to “global” if not specified.
- Returns:
None (validation messages captured in the ‘self’ BiolinkValidator context)
- validate_qualifiers(edge_id: str, edge: Dict, associations: List[str] | None = None, source_trail: str | None = None)
Validate Knowledge Edge Qualifiers.
- Parameters:
edge_id – str, string identifier for the edge (for reporting purposes)
edge – Dict, the edge object in which the qualifiers are expected to be found
associations – Optional[List[str]], Biolink association subclasses possibly related to the current edge.
source_trail – Optional[str], audit trail of knowledge source provenance for a given Edge, as a string. Defaults to “global” if not specified.
- Returns:
None (validation messages captured in the ‘self’ BiolinkValidator context)
- validate_slot_value(slot_name: str, context: str, found: bool, value: str | None) bool
Validate the single-valued value of a specified slot of the given knowledge graph entity.
- Parameters:
slot_name – str, name of a valid slot, a value for which is to be validated
context – str, context of the validation (e.g. node or edge id)
found – bool, current status of slot detection; should be True if the slot was already previously seen
value – Optional[str], the value to be validated
- Returns:
bool, True if valid slot and value (validation messages recorded in the BiolinkValidator)
- validate_sources(edge_id: str, edge: Dict) str | None
Validate (TRAPI 1.4.0-beta ++) Edge sources provenance.
- Parameters:
edge_id – str, string identifier for the edge (for reporting purposes)
edge – Dict, the edge object in which the sources are expected to be found
- Returns:
Optional[str], audit trail of knowledge source provenance for a given Edge, as a string.
- reasoner_validator.biolink.get_biolink_model_toolkit(biolink_version: str | None = None) Toolkit
Return Biolink Model Toolkit corresponding to specified version of the model (Default: current ‘latest’ version).
- Parameters:
biolink_version – Optional[str], caller-specified Biolink Model version (default: None)
- Returns:
Biolink Model Toolkit.
- Return type:
Toolkit
- reasoner_validator.biolink.get_reference(curie: str) str | None
Get the object_id reference of a given CURIE.
- Parameters:
curie – str, the CURIE
- Returns:
Optional[str], the reference of the CURIE
- reasoner_validator.biolink.is_curie(s: str) bool
Check if a given string is a CURIE.
- Parameters:
s – str, string to be validated as a CURIE
- Returns:
bool, whether the given string is a CURIE
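The two CURIE helpers above can be approximated in a few lines. This is a standalone sketch with hypothetical function names and a deliberately simple assumed pattern (one colon, no whitespace); it is not the pattern used by the library.

```python
import re
from typing import Optional

# Assumed, simplified shape of a CURIE: 'prefix:reference', no whitespace.
CURIE_PATTERN = re.compile(r"^[^\s:]+:[^\s:]+$")


def looks_like_curie(s: str) -> bool:
    """True if the string matches the simplified CURIE pattern."""
    return bool(s) and CURIE_PATTERN.match(s) is not None


def reference_of(curie: str) -> Optional[str]:
    """Return the part after the prefix, or None if the input is not a CURIE."""
    if not looks_like_curie(curie):
        return None
    return curie.split(":", 1)[1]


print(looks_like_curie("biolink:treats"), reference_of("UBERON:0005453"))
```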