SIGnaling Network Open Resource (Signor) Reference Ingest Guide

Source Information

InfoRes ID: infores:signor

Description: SIGNOR 3.0, https://signor.uniroma2.it, is a public repository that captures causal information and represents it according to an 'activity-flow' model. SIGNOR provides freely-accessible static maps of causal interactions that can be tailored, pruned and refined to build dynamic and predictive models. Each signaling relationship is annotated with an effect (up/down-regulation) and with the mechanism (e.g. binding, phosphorylation, transcriptional activation, etc.) causing the regulation of the target entity. Since its latest release, SIGNOR has undergone a significant upgrade including: (i) a new website that offers an improved user experience and novel advanced search and graph tools; (ii) a significant content growth adding up to a total of approx. 33,000 manually-annotated causal relationships between more than 8900 biological entities; (iii) an increase in the number of manually annotated pathways, currently including pathways deregulated by SARS-CoV-2 infection or involved in neurodevelopment synaptic transmission and metabolism, among others; (iv) additional features such as new model to represent metabolic reactions and a new confidence score assigned to each interaction.

Citations:
- Prisca Lo Surdo, Marta Iannuccelli, Silvia Contino, Luisa Castagnoli, Luana Licata, Gianni Cesareni, Livia Perfetto, SIGNOR 3.0, the SIGnaling network open resource 3.0: 2022 update, Nucleic Acids Research, Volume 51, Issue D1, 6 January 2023, Pages D631–D637, https://doi.org/10.1093/nar/gkac883

Data Access Locations:
- Signor 3.0 Downloads: https://signor.uniroma2.it/downloads.php (this page includes file sizes and simple data dictionaries for each download)

Data Provision Mechanisms: file_download

Data Formats: csv

Data Versioning and Releases: No consistent cadence for releases, but on average there are 1-2 releases each month. Versioning is based on the month and year of the release. Releases page / change log: https://signor.uniroma2.it/downloads.php

Ingest Information

Ingest Categories: primary_knowledge_provider

Utility: Signor is a rich source of manually curated genetic associations to other biological entities, which are an important type of edge for Translator query and reasoning use cases, including treatment predictions, gene-gene regulation predictions, and pathfinder queries. It is one of the sources that focus on drugs and genes.

Scope: This initial ingest of Signor covers manually curated causal associations between proteins, protein families, complexes, small molecules (mainly endogenous chemical entities), and chemicals (mainly non-endogenous chemical entities). It also covers parthood associations between proteins and complexes, and physical interaction associations that can be inferred from certain mechanisms of action in the SIGNOR dataset.

Relevant Files

| File Name | Location | Description |
| --- | --- | --- |
| all_data_.tsv | https://signor.uniroma2.it/downloads.php | Associations generated by knowledge assertions between Entity Types 'Protein', 'Complex', 'Chemical', 'Smallmolecule', 'Proteinfamily' |

Included Content

| File Name | Included Records | Fields Used |
| --- | --- | --- |
| all_data_.tsv | Associations generated by knowledge assertions with quality controlled edges between 'Protein', 'Complex', 'Chemical', 'Smallmolecule', 'Proteinfamily' | ENTITYA, TYPEA, IDA, DATABASEA, ENTITYB, TYPEB, IDB, DATABASEB, EFFECT, MECHANISM, TAX_ID, CELL_DATA, TISSUE_DATA, PMID, DIRECT, SENTENCE, SCORE, SIGNOR_ID |
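
As a rough illustration of the field selection above, the following minimal sketch reads a tab-separated file and keeps only the columns listed under "Fields Used". It assumes the download has a header row with these exact column names, which this guide does not state; the function name `read_signor_rows` and the sample record are hypothetical and are not taken from the actual ingest code or from SIGNOR data.

```python
import csv
import io

# Columns listed under "Fields Used" in this guide.
FIELDS_USED = [
    "ENTITYA", "TYPEA", "IDA", "DATABASEA",
    "ENTITYB", "TYPEB", "IDB", "DATABASEB",
    "EFFECT", "MECHANISM", "TAX_ID", "CELL_DATA", "TISSUE_DATA",
    "PMID", "DIRECT", "SENTENCE", "SCORE", "SIGNOR_ID",
]


def read_signor_rows(handle):
    """Yield one dict per record, restricted to FIELDS_USED.

    Assumption: the file is tab-separated and starts with a header row.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [name for name in FIELDS_USED if name not in header]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    for row in reader:
        yield {name: row[name] for name in FIELDS_USED}


# Made-up single-record sample with one extra column that should be dropped.
_header = FIELDS_USED + ["EXTRA_COLUMN"]
_values = [f"value_{i}" for i in range(len(_header))]
SAMPLE_TSV = "\t".join(_header) + "\n" + "\t".join(_values) + "\n"

rows = list(read_signor_rows(io.StringIO(SAMPLE_TSV)))
print(len(rows), sorted(rows[0]) == sorted(FIELDS_USED))
```

Failing early on a missing column is a design choice for this sketch: a renamed or dropped column in a new release would otherwise surface only later as missing edge properties.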

Filtered Content

| File Name | Filtered Records | Rationale |
| --- | --- | --- |
| all_data_.tsv | Entity types: records with entity types in [stimulus, fusion protein] are removed | Awaiting new Biolink node types for stimulus and fusion protein, which would allow more accurate mapping into the Biolink model |
| all_data_.tsv | Entity types: records with entity types in [mirna, antibody, phenotype, ncrna] are removed | These make up a very small proportion of the SIGNOR data, may need modeling in later phases, and are currently considered less impactful in Translator |
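
The filtering rule can be sketched as a simple predicate over a record. This is a minimal sketch under two assumptions that this guide does not confirm: a record is dropped when either endpoint (TYPEA or TYPEB) has an excluded type, and type names are compared case-insensitively. The names `EXCLUDED_TYPES` and `keep_record` are hypothetical.

```python
# Entity types this guide lists as filtered out.
EXCLUDED_TYPES = {
    "stimulus", "fusion protein",               # awaiting new Biolink node types
    "mirna", "antibody", "phenotype", "ncrna",  # small share of the data
}


def keep_record(row):
    """Return True when neither endpoint has an excluded entity type.

    Assumption: one excluded endpoint is enough to drop the whole record.
    """
    type_a = row["TYPEA"].strip().lower()
    type_b = row["TYPEB"].strip().lower()
    return type_a not in EXCLUDED_TYPES and type_b not in EXCLUDED_TYPES


# Illustrative records only; not real SIGNOR rows.
records = [
    {"TYPEA": "protein", "TYPEB": "complex"},
    {"TYPEA": "stimulus", "TYPEB": "protein"},
    {"TYPEA": "chemical", "TYPEB": "miRNA"},
]
kept = [r for r in records if keep_record(r)]
print(len(kept))
```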

Target Information

Target InfoRes ID: infores:translator-signor

Edge Types

| Subject Categories | Predicate | Object Categories | Knowledge Level | Agent Type | UI Explanation |
| --- | --- | --- | --- | --- | --- |
| biolink:Protein, biolink:ProteinFamily, biolink:MacromolecularComplex, biolink:SmallMolecule | | biolink:Protein, biolink:ProteinFamily, biolink:MacromolecularComplex, biolink:SmallMolecule | knowledge_assertion | manual_agent | Signor records indicate that a protein, protein family, complex, or small molecule 'regulates' another protein, protein family, complex, or small molecule, which maps best to the Biolink predicate 'regulates' with an additional directional qualifier. |
| biolink:Protein | biolink:part_of | biolink:MacromolecularComplex | knowledge_assertion | manual_agent | Signor records indicate a 'form complex' relationship between the protein and a complex, which maps best to the Biolink predicate 'part of' with an additional directional qualifier. |
| biolink:ChemicalEntity | | biolink:Protein, biolink:MacromolecularComplex, biolink:ProteinFamily, biolink:SmallMolecule | knowledge_assertion | manual_agent | Signor records indicate that the chemical 'affects' a protein, complex, protein family, or small molecule, which maps best to the Biolink predicate 'affects' with an additional directional qualifier. |
| biolink:Protein, biolink:MacromolecularComplex, biolink:ProteinFamily, biolink:SmallMolecule | | biolink:ChemicalEntity | knowledge_assertion | manual_agent | Signor records indicate that the protein, complex, protein family, or small molecule 'affects' a chemical, which maps best to the Biolink predicate 'affects' with an additional directional qualifier. |
| biolink:ChemicalEntity | | biolink:ChemicalEntity | knowledge_assertion | manual_agent | When the mechanism implies a physical interaction (e.g. 'binding'), the ingest creates a separate 'physically_interacts_with' edge in addition to the affects/regulates edge. |
| biolink:Protein, biolink:ProteinFamily, biolink:MacromolecularComplex, biolink:SmallMolecule, biolink:ChemicalEntity | | biolink:Protein, biolink:ProteinFamily, biolink:MacromolecularComplex, biolink:SmallMolecule, biolink:ChemicalEntity | knowledge_assertion | manual_agent | The 'MECHANISM' provided by Signor indicates that a physical interaction must exist between the subject and object entities, so this edge is created to report that, in addition to the primary affects/regulates causal association edge. |
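
The edge logic described in the UI explanations might be sketched as follows. This is an illustration, not the ingest's actual code, and it rests on several assumptions: that 'form complex' appears in the EFFECT column, that any record with a chemical endpoint takes 'affects', that all other records take 'regulates', and that 'binding' is the only mechanism implying a physical interaction (it is the only one this guide names; the real set is probably larger). The function names, the use of raw IDA/IDB values as node identifiers, and the sample rows are hypothetical, and qualifiers are omitted.

```python
# Only 'binding' is named in this guide; treat this set as a placeholder.
PHYSICAL_MECHANISMS = {"binding"}


def primary_predicate(type_a, type_b, effect):
    """Pick the primary predicate from the entity types and the effect."""
    type_a, type_b, effect = type_a.lower(), type_b.lower(), effect.lower()
    # Assumption: 'form complex' is reported in the EFFECT column.
    if effect == "form complex" and type_a == "protein" and type_b == "complex":
        return "biolink:part_of"
    if "chemical" in (type_a, type_b):
        return "biolink:affects"
    return "biolink:regulates"


def build_edges(row):
    """Return the primary edge plus an optional physical interaction edge."""
    shared = {
        "subject": row["IDA"],
        "object": row["IDB"],
        "knowledge_level": "knowledge_assertion",
        "agent_type": "manual_agent",
    }
    predicate = primary_predicate(row["TYPEA"], row["TYPEB"], row["EFFECT"])
    edges = [{**shared, "predicate": predicate}]
    if row["MECHANISM"].strip().lower() in PHYSICAL_MECHANISMS:
        edges.append({**shared, "predicate": "biolink:physically_interacts_with"})
    return edges


# Illustrative rows with placeholder identifiers; not real SIGNOR records.
chem_row = {"IDA": "ID_A", "TYPEA": "chemical", "IDB": "ID_B", "TYPEB": "protein",
            "EFFECT": "down-regulates", "MECHANISM": "binding"}
complex_row = {"IDA": "ID_C", "TYPEA": "protein", "IDB": "ID_D", "TYPEB": "complex",
               "EFFECT": "form complex", "MECHANISM": ""}

chem_edges = build_edges(chem_row)
complex_edges = build_edges(complex_row)
print([e["predicate"] for e in chem_edges])
print([e["predicate"] for e in complex_edges])
```

Emitting the physical interaction edge as a second, separate edge mirrors the last two rows of the table, where it is reported in addition to the causal edge rather than replacing it.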

Node Types

| Node Category | Source Identifier Types | Additional Notes |
| --- | --- | --- |
| biolink:Protein | UNIPROT | |
| biolink:ProteinFamily | UNIPROT | |
| biolink:SmallMolecule | ChEMBL | |
| biolink:ChemicalEntity | ChEMBL | |
| biolink:MacromolecularComplex | SIGNOR ID | |
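
A lookup from SIGNOR entity types to the node categories above could look like the sketch below. The pairing of 'smallmolecule' with biolink:SmallMolecule and 'chemical' with biolink:ChemicalEntity follows the Scope description; the lowercase keys, the `node_category` helper, and the choice to return `None` for unmapped types are assumptions made for illustration.

```python
# SIGNOR entity type (lowercased) -> Biolink node category.
CATEGORY_BY_TYPE = {
    "protein": "biolink:Protein",
    "proteinfamily": "biolink:ProteinFamily",
    "smallmolecule": "biolink:SmallMolecule",
    "chemical": "biolink:ChemicalEntity",
    "complex": "biolink:MacromolecularComplex",
}


def node_category(signor_type):
    """Return the Biolink category for a SIGNOR entity type, or None."""
    return CATEGORY_BY_TYPE.get(signor_type.strip().lower())


print(node_category("Proteinfamily"))
print(node_category("stimulus"))
```

Returning `None` for types such as 'stimulus' lets a caller treat an unmapped type as a signal to skip the record, consistent with the Filtered Content section.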

Future Modeling Considerations

node_content: Consider adding new Biolink node types for stimulus and fusion protein to allow more accurate mapping into the Biolink model.

edge_content: Note that the fact that X modifies Y (e.g. phosphorylates) is represented indirectly here, by an 'affects' predicate plus a mechanism qualifier that captures the type of modification. Do we want a more direct edge (e.g. X phosphorylates Y) to be created at ingest? Or derived afterward? Or perhaps to live only in a 'predicate-semantics' KG derived from the primary 'qualifier-semantics' KG?

Provenance Information

Contributors:
- Qi Wei: code author, data modeling
- Yue Zhang: data modeling
- Guangrong Qin: data modeling, domain expertise
- Sierra Moxon: code support
- Matthew Brush: data modeling, domain expertise

Artifacts:
- Ingest Survey: https://docs.google.com/spreadsheets/d/1tqimhXxpWzQdfNxanpW-rAaLnmsP80YZUthY5O4mEc8/edit?gid=1223527032#gid=1223527032
- Ingest Ticket: https://github.com/NCATSTranslator/Data-Ingest-Coordination-Working-Group/issues/29