Ubergraph Ontological Hierarchy Reference Ingest Guide
Source Information
InfoRes ID: infores:ubergraph
Description: Ubergraph is a unified OWL ontology integrating multiple biomedical ontologies, including GO, UBERON, CL, CHEBI, HP, and others. It provides both a redundant graph (with full inference closure for subclass relations) and a non-redundant graph. This ingest uses the redundant version, which contains the complete transitive, reflexive subclass relations needed for ontological hierarchy reasoning.
Citations:
- Balhoff JP, Bayindir U, Caron AR, Matentzoglu N, Osumi-Sutherland D, Mungall CJ. Ubergraph: integrating OWL ontologies into a unified semantic graph. bioRxiv. 2022.

Data Access Locations:
- Ubergraph Downloads: https://ubergraph.apps.renci.org/downloads/current/
  - redundant-graph-table.tgz contains complete inference closure
  - nonredundant-graph-table.tgz contains direct assertions only
Data Provision Mechanisms: file_download
Data Formats: tsv
Data Versioning and Releases: Ubergraph is updated regularly. Version information is available via SPARQL query against the ontology metadata. The created date is stored in the ontology and can be retrieved programmatically.
Ingest Information
Ingest Categories: primary_knowledge_provider
Utility: Ubergraph provides comprehensive ontological relationships essential for Translator's reasoning capabilities. The complete inference closure enables ontology-based query expansion, ancestor/descendant lookups, and semantic similarity calculations. These hierarchical relationships are fundamental for mapping between different levels of biological granularity and supporting TRAPI queries that require ontological reasoning.
Scope: This ingest focuses on the redundant graph which includes all subclass (rdfs:subClassOf) relationships with full transitive closure. The graph contains nodes and edges from integrated ontologies represented as CURIEs, enabling seamless integration with other Translator knowledge sources.
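To illustrate why the full closure matters for ancestor lookups, here is a minimal sketch. It assumes the closure has already been reduced to (subclass, superclass) CURIE pairs; the `EX:` identifiers and the `build_ancestor_index` helper are hypothetical and not part of the ingest code. Because every transitive and reflexive pair is materialized, an ancestor lookup becomes a single dictionary access rather than a graph traversal.

```python
from collections import defaultdict


def build_ancestor_index(closure_edges):
    """Group superclasses by subclass from (subclass, superclass) pairs."""
    index = defaultdict(set)
    for subclass, superclass in closure_edges:
        index[subclass].add(superclass)
    return index


# Hypothetical toy closure for the asserted chain EX:A -> EX:B -> EX:C.
# The redundant graph would also contain the inferred pair (EX:A, EX:C)
# and the reflexive pairs.
closure = [
    ("EX:A", "EX:A"), ("EX:B", "EX:B"), ("EX:C", "EX:C"),
    ("EX:A", "EX:B"), ("EX:B", "EX:C"),
    ("EX:A", "EX:C"),
]

ancestors = build_ancestor_index(closure)
print(sorted(ancestors["EX:A"]))
```

With the non-redundant graph, the same lookup would require walking the asserted edges recursively.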
Relevant Files
| File Name | Location | Description |
|---|---|---|
| redundant-graph-table.tgz | https://ubergraph.apps.renci.org/downloads/current/redundant-graph-table.tgz | Tar archive containing node-labels.tsv, edge-labels.tsv, and edges.tsv with complete inference closure |
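The three files in the archive can be joined roughly as sketched below. This assumes that node-labels.tsv and edge-labels.tsv are two-column tables (integer ID, IRI) and that edges.tsv holds three integer IDs per row (subject, predicate, object); verify the layout against the downloaded archive before relying on it. The helper names `read_labels` and `resolve_edges` are hypothetical, and in-memory strings stand in for the real files.

```python
import csv
import io


def read_labels(handle):
    """Map integer IDs to IRIs from a two-column TSV (assumed layout: id, IRI)."""
    return {int(row[0]): row[1] for row in csv.reader(handle, delimiter="\t") if row}


def resolve_edges(edge_handle, nodes, predicates):
    """Yield (subject_iri, predicate_iri, object_iri) for each row of ID triples."""
    for row in csv.reader(edge_handle, delimiter="\t"):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        s, p, o = (int(value) for value in row)
        yield nodes[s], predicates[p], nodes[o]


# Tiny in-memory stand-ins for node-labels.tsv, edge-labels.tsv, and edges.tsv.
nodes = read_labels(io.StringIO(
    "1\thttp://purl.obolibrary.org/obo/CL_0000540\n"
    "2\thttp://purl.obolibrary.org/obo/CL_0000000\n"
))
predicates = read_labels(io.StringIO(
    "1\thttp://www.w3.org/2000/01/rdf-schema#subClassOf\n"
))
triples = list(resolve_edges(io.StringIO("1\t1\t2\n"), nodes, predicates))
print(triples)
```

For the real archive, the same functions could be fed text streams obtained from Python's `tarfile` module (`tarfile.open(path, "r:gz")` plus `extractfile`); the member paths inside the archive are not stated here and would need to be checked.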
Included Content
| File Name | Included Records | Fields Used |
|---|---|---|
| redundant-graph-table.tgz | Subclass (rdfs:subClassOf) edges where both subject and object are from selected biomedical ontologies | subject_id, predicate_id, object_id (mapped via node-labels.tsv and edge-labels.tsv) |
Filtered Content
| File Name | Filtered Records | Rationale |
|---|---|---|
| redundant-graph-table.tgz | Edges where subject, predicate, or object IRIs cannot be mapped to CURIEs | CURIE conversion is essential for integration with Translator infrastructure. IRIs that cannot be mapped lack appropriate prefix mappings in Biolink Model, OBO, or custom converters. |
| redundant-graph-table.tgz | All non-subClassOf predicates (e.g., part_of, has_part, regulates, etc.) | Initial focus on hierarchical subclass relationships only. Other predicate types may be added in future iterations. |
| redundant-graph-table.tgz | Edges involving ontology terms not in the selected prefix list (e.g., non-biomedical ontologies, deprecated terms) | Focused on core biomedical ontologies most relevant to Translator use cases. Reduces graph size from 90+ million edges to ~2-3 million while maintaining comprehensive biomedical coverage. |
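The three filters in the table can be sketched as a single predicate function. This is a simplified illustration only: it handles OBO PURLs with a hand-written rule, whereas the ingest relies on Biolink Model, OBO, and custom converters, and `ALLOWED_PREFIXES` is an illustrative subset rather than the actual selected prefix list. The names `iri_to_curie` and `keep_edge` are hypothetical.

```python
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
OBO_BASE = "http://purl.obolibrary.org/obo/"
ALLOWED_PREFIXES = {"UBERON", "CL", "GO", "CHEBI", "MONDO"}  # illustrative subset


def iri_to_curie(iri):
    """Return a CURIE for an OBO PURL, or None if the IRI cannot be mapped."""
    if not iri.startswith(OBO_BASE):
        return None
    prefix, sep, local_id = iri[len(OBO_BASE):].partition("_")
    if not sep or not prefix or not local_id:
        return None
    return f"{prefix}:{local_id}"


def keep_edge(subject_iri, predicate_iri, object_iri):
    """Return (subject_curie, object_curie) if the edge passes every filter, else None."""
    if predicate_iri != RDFS_SUBCLASS_OF:
        return None  # filter: non-subClassOf predicates
    subject, obj = iri_to_curie(subject_iri), iri_to_curie(object_iri)
    if subject is None or obj is None:
        return None  # filter: IRIs that cannot be mapped to CURIEs
    if subject.split(":")[0] not in ALLOWED_PREFIXES:
        return None  # filter: subject prefix not selected
    if obj.split(":")[0] not in ALLOWED_PREFIXES:
        return None  # filter: object prefix not selected
    return subject, obj


print(keep_edge(OBO_BASE + "CL_0000540", RDFS_SUBCLASS_OF, OBO_BASE + "CL_0000000"))
```

Checking the predicate first is a cheap way to discard most rows before any CURIE conversion is attempted.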
Future Content Considerations
edge_content: Current ingest includes only rdfs:subClassOf (strict ontological hierarchy). In the future, consider adding support for other predicate types (e.g., part_of, regulates) as separate ingests or options.
node_property_content: Ubergraph provides node descriptions via SPARQL query (IAO_0000115 definitions). These could be added as node properties in future iterations.
edge_property_content: Additional edge properties from OWL axiom annotations could be included to provide provenance and context for ontological relationships.
other: Evaluate ingesting the non-redundant graph as a separate resource for users who need only directly asserted relationships without inference closure.
Target Information
Target InfoRes ID: infores:translator-ubergraph-kgx
Edge Types
| Subject Categories | Predicate | Object Categories | Knowledge Level | Agent Type | UI Explanation |
|---|---|---|---|---|---|
| biolink:NamedThing | biolink:subclass_of | biolink:NamedThing | knowledge_assertion | manual_agent | Ontological subclass relationships from Ubergraph representing hierarchical is-a relationships between biomedical concepts. These edges support ontology-based reasoning and query expansion in Translator. |
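As a rough illustration of the edge type above, the sketch below builds one KGX-style edge record as a JSON line. The `make_edge` helper is hypothetical, the use of `biolink:subclass_of` as the predicate for rdfs:subClassOf is an assumption, and so is recording infores:ubergraph as the primary knowledge source. The knowledge level and agent type values are taken from the table.

```python
import json


def make_edge(subject_curie, object_curie):
    """Build a KGX-style edge dict for one subclass relationship (illustrative only)."""
    return {
        "subject": subject_curie,
        "predicate": "biolink:subclass_of",  # assumed mapping for rdfs:subClassOf
        "object": object_curie,
        "primary_knowledge_source": "infores:ubergraph",  # assumed from the source InfoRes ID
        "knowledge_level": "knowledge_assertion",
        "agent_type": "manual_agent",
    }


edge = make_edge("CL:0000540", "CL:0000000")
print(json.dumps(edge))
```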
Node Types
| Node Category | Source Identifier Types | Additional Notes |
|---|---|---|
| biolink:NamedThing | UBERON (anatomical structures), CL (cell types), GO (gene ontology: biological processes, molecular functions, cellular components), CHEBI (chemical entities), PR (proteins), NCIT (NCI Thesaurus), HP (human phenotype ontology), MONDO (diseases), RO (relation ontology - used as subjects/objects in some cases), SO (sequence ontology), MP (mammalian phenotype), PATO (phenotype and trait ontology), ECTO (environmental conditions), ENVO (environmental ontology), OBI (ontology for biomedical investigations), MAXO (medical actions), ECO (evidence codes), NCBITAXON (taxonomic classifications), FOODON (food ontology), MI (molecular interactions), UO (units of measurement) | Nodes represent ontology terms from selected biomedical ontologies integrated into Ubergraph. In the raw ingest, all nodes are assigned the generic biolink:NamedThing category. NodeNormalization subsequently enriches these nodes with more specific Biolink categories (e.g., biolink:Disease, biolink:ChemicalEntity, biolink:AnatomicalEntity, biolink:BiologicalProcess, biolink:CellularComponent, biolink:MolecularActivity, biolink:PhenotypicFeature) based on the ontology prefix and internal mappings. This normalization step provides proper semantic typing for downstream Translator operations. |
Future Modeling Considerations
predicates: Map additional OWL/RDF predicates beyond rdfs:subClassOf to appropriate Biolink predicates where applicable
edge_properties: Add edge properties for axiom annotations, provenance, and source ontology identifiers
node_properties: Include node labels, definitions, synonyms, and other metadata from Ubergraph SPARQL endpoint
Provenance Information
Contributors:
- Sierra Moxon: code, data modeling
- Evan Morris: code, data modeling
- Jim Balhoff: code, domain expertise, Ubergraph development, artifact development

Artifacts:
- Ubergraph GitHub: https://github.com/INCATools/ubergraph