
Ubergraph Ontological Hierarchy Reference Ingest Guide

Source Information

InfoRes ID: infores:ubergraph

Description: Ubergraph is a unified OWL ontology integrating multiple biomedical ontologies, including GO, UBERON, CL, CHEBI, HP, and others. It is distributed as both a redundant graph (with full inference closure for subclass relations) and a non-redundant graph. This ingest uses the redundant version, which contains the complete transitive, reflexive subclass relations needed for ontological hierarchy reasoning.

Citations:
- Balhoff JP, Bayindir U, Caron AR, Matentzoglu N, Osumi-Sutherland D, Mungall CJ. Ubergraph: integrating OWL ontologies into a unified semantic graph. bioRxiv. 2022.

Data Access Locations:
- Ubergraph Downloads: https://ubergraph.apps.renci.org/downloads/current/
  - redundant-graph-table.tgz contains complete inference closure
  - nonredundant-graph-table.tgz contains direct assertions only

Data Provision Mechanisms: file_download

Data Formats: tsv

Data Versioning and Releases: Ubergraph is updated regularly. Version information is available via SPARQL query against the ontology metadata. The created date is stored in the ontology and can be retrieved programmatically.
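As a minimal sketch of retrieving the version programmatically, the Python below builds a SPARQL query for a creation date and parses a response in the standard SPARQL JSON results format. The ontology IRI, the choice of dcterms:created as the property, and the helper names are assumptions made for illustration and are not confirmed by this guide. Sending the query to the Ubergraph SPARQL endpoint is left out so the example stays self-contained.

```python
import json
from typing import Optional

# Assumptions (not confirmed by this guide): the IRI that identifies the
# merged ontology and the property that records its build date.
ONTOLOGY_IRI = "http://reasoner.renci.org/ontology"
CREATED_PROPERTY = "http://purl.org/dc/terms/created"


def build_version_query(ontology_iri: str = ONTOLOGY_IRI) -> str:
    """Return a SPARQL query asking for the ontology's created date."""
    return (
        "SELECT ?created WHERE { "
        f"<{ontology_iri}> <{CREATED_PROPERTY}> ?created "
        "} LIMIT 1"
    )


def parse_version(response_text: str) -> Optional[str]:
    """Extract the first ?created value from a SPARQL JSON results document."""
    bindings = json.loads(response_text)["results"]["bindings"]
    if not bindings:
        return None
    return bindings[0]["created"]["value"]
```

Returning None when no binding is present lets the caller decide whether a missing date should abort the ingest or fall back to another version label.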

Ingest Information

Ingest Categories: primary_knowledge_provider

Utility: Ubergraph provides comprehensive ontological relationships essential for Translator's reasoning capabilities. The complete inference closure enables ontology-based query expansion, ancestor/descendant lookups, and semantic similarity calculations. These hierarchical relationships are fundamental for mapping between different levels of biological granularity and supporting TRAPI queries that require ontological reasoning.

Scope: This ingest focuses on the redundant graph which includes all subclass (rdfs:subClassOf) relationships with full transitive closure. The graph contains nodes and edges from integrated ontologies represented as CURIEs, enabling seamless integration with other Translator knowledge sources.
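To illustrate why the transitive closure matters, the sketch below indexes a handful of subclass edges so that every ancestor or descendant of a term is a single dictionary lookup, with no graph traversal. The EX: identifiers and the function name are hypothetical and exist only for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def index_closure(
    edges: Iterable[Tuple[str, str]],
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Index (subclass, superclass) pairs for one-hop ancestor/descendant lookup."""
    ancestors: Dict[str, Set[str]] = defaultdict(set)
    descendants: Dict[str, Set[str]] = defaultdict(set)
    for child, parent in edges:
        ancestors[child].add(parent)
        descendants[parent].add(child)
    return ancestors, descendants


# Hypothetical closure for a three-level hierarchy EX:c -> EX:b -> EX:a.
# Because the closure is transitive and reflexive, the indirect edge
# (EX:c, EX:a) and the self edges are already present as rows.
closure_edges = [
    ("EX:c", "EX:b"), ("EX:b", "EX:a"), ("EX:c", "EX:a"),
    ("EX:a", "EX:a"), ("EX:b", "EX:b"), ("EX:c", "EX:c"),
]
ancestors, descendants = index_closure(closure_edges)
```

With a non-redundant graph, the same lookup would require walking the hierarchy edge by edge; here the set of ancestors of EX:c is read directly from the index.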

Relevant Files

File Name: redundant-graph-table.tgz
Location: https://ubergraph.apps.renci.org/downloads/current/redundant-graph-table.tgz
Description: Tar archive containing node-labels.tsv, edge-labels.tsv, and edges.tsv with complete inference closure

Included Content

File Name: redundant-graph-table.tgz
Included Records: Subclass (rdfs:subClassOf) edges where both subject and object are from selected biomedical ontologies
Fields Used: subject_id, predicate_id, object_id (mapped via node-labels.tsv and edge-labels.tsv)
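The mapping step described above can be sketched as follows. The sketch assumes that node-labels.tsv and edge-labels.tsv are two-column files (integer id, IRI) and that edges.tsv holds three integer columns (subject, predicate, object); this layout and the function names are assumptions for illustration, so check them against the extracted archive before relying on them.

```python
import csv
from typing import Dict, Iterable, Iterator, Tuple

SUBCLASS_OF_IRI = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


def load_labels(lines: Iterable[str]) -> Dict[str, str]:
    """Map each integer id (kept as a string) to its IRI."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def subclass_edges(
    edge_lines: Iterable[str],
    node_iris: Dict[str, str],
    predicate_iris: Dict[str, str],
) -> Iterator[Tuple[str, str]]:
    """Yield (subject IRI, object IRI) for every rdfs:subClassOf edge."""
    reader = csv.reader(edge_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 3:
            continue  # skip malformed rows
        subject_id, predicate_id, object_id = row[:3]
        if predicate_iris.get(predicate_id) != SUBCLASS_OF_IRI:
            continue  # keep only subclass edges
        if subject_id in node_iris and object_id in node_iris:
            yield node_iris[subject_id], node_iris[object_id]
```

Both functions accept any iterable of text lines, so they work equally on open file handles and on members read from the tar archive with Python's tarfile module. Streaming the edges through a generator avoids holding the full redundant edge table in memory.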

Filtered Content

File Name: redundant-graph-table.tgz
Filtered Records: Edges where subject, predicate, or object IRIs cannot be mapped to CURIEs
Rationale: CURIE conversion is essential for integration with Translator infrastructure. IRIs that cannot be mapped lack appropriate prefix mappings in Biolink Model, OBO, or custom converters.

File Name: redundant-graph-table.tgz
Filtered Records: All non-subClassOf predicates (e.g., part_of, has_part, regulates)
Rationale: The initial focus is on hierarchical subclass relationships only. Other predicate types may be added in future iterations.

File Name: redundant-graph-table.tgz
Filtered Records: Edges involving ontology terms not in the selected prefix list (e.g., non-biomedical ontologies, deprecated terms)
Rationale: The ingest focuses on the core biomedical ontologies most relevant to Translator use cases. This reduces graph size from 90+ million edges to ~2-3 million while maintaining comprehensive biomedical coverage.
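The CURIE and prefix filters above can be sketched as follows. This sketch handles only the standard OBO PURL pattern and uses a small hypothetical subset of the selected prefix list; the ingest itself draws on Biolink Model, OBO, and custom converters, which are not reproduced here. The function names are illustrative.

```python
from typing import Optional, Set, Tuple

OBO_BASE = "http://purl.obolibrary.org/obo/"
# Hypothetical subset of the selected prefix list, for illustration only.
ALLOWED_PREFIXES = {"GO", "UBERON", "CL", "CHEBI", "MONDO", "HP"}


def iri_to_curie(iri: str) -> Optional[str]:
    """Convert an OBO-style IRI to a CURIE, or return None if it cannot be mapped."""
    if not iri.startswith(OBO_BASE):
        return None
    # Simplification: split on the first underscore, e.g. GO_0008150.
    prefix, sep, local_id = iri[len(OBO_BASE):].partition("_")
    if not sep or not prefix or not local_id:
        return None
    return f"{prefix}:{local_id}"


def keep_edge(
    subject_iri: str,
    object_iri: str,
    allowed: Set[str] = ALLOWED_PREFIXES,
) -> Optional[Tuple[str, str]]:
    """Return the (subject, object) CURIE pair, or None if the edge is filtered out."""
    subject = iri_to_curie(subject_iri)
    obj = iri_to_curie(object_iri)
    if subject is None or obj is None:
        return None  # unmappable IRI
    if subject.split(":", 1)[0] not in allowed or obj.split(":", 1)[0] not in allowed:
        return None  # prefix outside the selected list
    return subject, obj
```

Applying the prefix check after CURIE conversion means a single pass handles both filters, and an edge is dropped as soon as either endpoint fails.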

Future Content Considerations

edge_content: Current ingest includes only rdfs:subClassOf (strict ontological hierarchy). In the future, consider adding support for other predicate types (e.g., part_of, regulates) as separate ingests or options.

node_property_content: Ubergraph provides node descriptions via SPARQL query (IAO_0000115 definitions). These could be added as node properties in future iterations.

edge_property_content: Additional edge properties from OWL axiom annotations could be included to provide provenance and context for ontological relationships.

other: Evaluate ingesting the non-redundant graph as a separate resource for users who need only direct asserted relationships without inference closure

Target Information

Target InfoRes ID: infores:translator-ubergraph-kgx

Edge Types

Subject Categories: biolink:NamedThing
Predicate:
Object Categories: biolink:NamedThing
Knowledge Level: knowledge_assertion
Agent Type: manual_agent
UI Explanation: Ontological subclass relationships from Ubergraph representing hierarchical is-a relationships between biomedical concepts. These edges support ontology-based reasoning and query expansion in Translator.
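As a rough illustration of the edge type described above, the sketch below assembles one edge record as a Python dictionary. The edge table does not show a predicate value, so biolink:subclass_of is an assumption based on the rdfs:subClassOf scope of this ingest. The field names follow common Biolink edge slots, and attributing the edge to infores:ubergraph as the primary knowledge source is likewise an assumption.

```python
from typing import Dict


def make_subclass_edge(subject_curie: str, object_curie: str) -> Dict[str, str]:
    """Build one subclass edge record with the knowledge level and agent type above."""
    return {
        "subject": subject_curie,
        # Assumption: the Biolink counterpart of rdfs:subClassOf.
        "predicate": "biolink:subclass_of",
        "object": object_curie,
        "knowledge_level": "knowledge_assertion",
        "agent_type": "manual_agent",
        # Assumption: the source InfoRes ID is used for provenance.
        "primary_knowledge_source": "infores:ubergraph",
    }


edge = make_subclass_edge("GO:0009987", "GO:0008150")
```

Keeping the constant fields inside one constructor ensures that every emitted edge carries the same knowledge level, agent type, and provenance.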

Node Types

Node Category: biolink:NamedThing
Source Identifier Types: UBERON (anatomical structures), CL (cell types), GO (gene ontology: biological processes, molecular functions, cellular components), CHEBI (chemical entities), PR (proteins), NCIT (NCI Thesaurus), HP (human phenotype ontology), MONDO (diseases), RO (relation ontology, used as subjects/objects in some cases), SO (sequence ontology), MP (mammalian phenotype), PATO (phenotype and trait ontology), ECTO (environmental conditions), ENVO (environmental ontology), OBI (ontology for biomedical investigations), MAXO (medical actions), ECO (evidence codes), NCBITAXON (taxonomic classifications), FOODON (food ontology), MI (molecular interactions), UO (units of measurement)
Additional Notes: Nodes represent ontology terms from selected biomedical ontologies integrated into Ubergraph. In the raw ingest, all nodes are assigned the generic biolink:NamedThing category. NodeNormalization subsequently enriches these nodes with more specific Biolink categories (e.g., biolink:Disease, biolink:ChemicalEntity, biolink:AnatomicalEntity, biolink:BiologicalProcess, biolink:CellularComponent, biolink:MolecularActivity, biolink:PhenotypicFeature) based on the ontology prefix and internal mappings. This normalization step provides proper semantic typing for downstream Translator operations.

Future Modeling Considerations

predicates: Map additional OWL/RDF predicates beyond rdfs:subClassOf to appropriate Biolink predicates where applicable

edge_properties: Add edge properties for axiom annotations, provenance, and source ontology identifiers

node_properties: Include node labels, definitions, synonyms, and other metadata from Ubergraph SPARQL endpoint

Provenance Information

Contributors:
- Sierra Moxon: code, data modeling
- Evan Morris: code, data modeling
- Jim Balhoff: code, domain expertise, Ubergraph development, artifact development

Artifacts:
- Ubergraph GitHub: https://github.com/INCATools/ubergraph