SIGnaling Network Open Resource (Signor) Reference Ingest Guide
Source Information
InfoRes ID: infores:signor
Description: SIGNOR 3.0, https://signor.uniroma2.it, is a public repository that captures causal information and represents it according to an 'activity-flow' model. SIGNOR provides freely accessible static maps of causal interactions that can be tailored, pruned and refined to build dynamic and predictive models. Each signaling relationship is annotated with an effect (up/down-regulation) and with the mechanism (e.g. binding, phosphorylation, transcriptional activation) causing the regulation of the target entity. Since its latest release, SIGNOR has undergone a significant upgrade including: (i) a new website that offers an improved user experience and novel advanced search and graph tools; (ii) significant content growth, adding up to a total of approx. 33,000 manually annotated causal relationships between more than 8,900 biological entities; (iii) an increase in the number of manually annotated pathways, currently including pathways deregulated by SARS-CoV-2 infection or involved in neurodevelopment, synaptic transmission and metabolism, among others; (iv) additional features such as a new model to represent metabolic reactions and a new confidence score assigned to each interaction.
Citations:
- Prisca Lo Surdo, Marta Iannuccelli, Silvia Contino, Luisa Castagnoli, Luana Licata, Gianni Cesareni, Livia Perfetto, SIGNOR 3.0, the SIGnaling network open resource 3.0: 2022 update, Nucleic Acids Research, Volume 51, Issue D1, 6 January 2023, Pages D631–D637, https://doi.org/10.1093/nar/gkac883

Data Access Locations:
- Signor 3.0 Downloads: https://signor.uniroma2.it/downloads.php (this page includes file sizes and simple data dictionaries for each download)
Data Provision Mechanisms: file_download
Data Formats: csv
Data Versioning and Releases: No consistent cadence for releases, but on average there are 1-2 releases each month. Versioning is based on the month and year of the release. Releases page / change log: https://signor.uniroma2.it/downloads.php
Ingest Information
Ingest Categories: primary_knowledge_provider
Utility: Signor is a rich source of manually curated genetic associations to other biological entities. These associations are an important type of edge for Translator query and reasoning use cases, including treatment predictions, gene-gene regulation predictions, and pathfinder queries. It is one of the sources that focus on drugs and genes.
Scope: This initial ingest of Signor covers manually curated causal associations between proteins, protein families, complexes, small molecules (mainly endogenous chemical entities) and chemicals (mainly non-endogenous chemical entities). It also covers parthood associations between proteins and complexes, and physical interaction associations that can be inferred from certain mechanisms of action in the SIGNOR dataset.
Relevant Files
| File Name | Location | Description |
|---|---|---|
| all_data_ | https://signor.uniroma2.it/downloads.php | Associations generated by knowledge assertions between Entity Types 'Protein', 'Complex', 'Chemical', 'Smallmolecule', 'Proteinfamily' |
Included Content
| File Name | Included Records | Fields Used |
|---|---|---|
| all_data_ | Associations generated by knowledge assertions with quality controlled edges between 'Protein', 'Complex', 'Chemical', 'Smallmolecule', 'Proteinfamily' | ENTITYA, TYPEA, IDA, DATABASEA, ENTITYB, TYPEB, IDB, DATABASEB, EFFECT, MECHANISM, TAX_ID, CELL_DATA, TISSUE_DATA, PMID, DIRECT, SENTENCE, SCORE, SIGNOR_ID |
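
The field selection above can be sketched as a small reader that keeps only the listed columns. This is a minimal illustration, not the ingest's actual code: the function name `read_signor_records` is hypothetical, and the delimiter is a parameter because the download may be comma- or tab-separated.

```python
import csv
import io

# Columns listed under "Fields Used" in the table above.
FIELDS_USED = [
    "ENTITYA", "TYPEA", "IDA", "DATABASEA",
    "ENTITYB", "TYPEB", "IDB", "DATABASEB",
    "EFFECT", "MECHANISM", "TAX_ID", "CELL_DATA", "TISSUE_DATA",
    "PMID", "DIRECT", "SENTENCE", "SCORE", "SIGNOR_ID",
]


def read_signor_records(handle, delimiter=","):
    """Yield one dict per row, restricted to FIELDS_USED.

    `handle` is any open text file object. The default delimiter is an
    assumption; pass delimiter="\t" if the download is tab-separated.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    header = reader.fieldnames or []
    missing = [field for field in FIELDS_USED if field not in header]
    if missing:
        # Fail early if the file layout differs from what the guide lists.
        raise ValueError(f"missing expected columns: {missing}")
    for row in reader:
        yield {field: row[field] for field in FIELDS_USED}
```

Failing on missing columns rather than silently skipping them makes a change in the upstream file layout visible at load time.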
Filtered Content
| File Name | Filtered Records | Rationale |
|---|---|---|
| all_data_ | Records whose entity type is 'stimulus' or 'fusion protein' | Deferred until Biolink adds node types for stimulus and fusion protein, which would allow a more accurate mapping into the Biolink model |
| all_data_ | Records whose entity type is 'mirna', 'antibody', 'phenotype' or 'ncrna' | These make up a very small proportion of the SIGNOR data, may need additional modeling in later phases, and are currently considered less impactful in Translator |
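
As an illustration of the filtering rules above, the sketch below drops any record in which either participant has an excluded entity type. The names `EXCLUDED_ENTITY_TYPES`, `keep_record` and `filter_records` are hypothetical, and the assumption that the check applies to both `TYPEA` and `TYPEB` (case-insensitively) is not stated in this guide.

```python
# Entity types removed by the two filter rules in the table above.
EXCLUDED_ENTITY_TYPES = {
    "stimulus", "fusion protein",              # awaiting new Biolink node types
    "mirna", "antibody", "phenotype", "ncrna",  # small share of the data
}


def keep_record(record):
    """Return True when neither participant has an excluded entity type.

    Assumption: the filter applies to both TYPEA and TYPEB.
    """
    type_a = record["TYPEA"].strip().lower()
    type_b = record["TYPEB"].strip().lower()
    return (type_a not in EXCLUDED_ENTITY_TYPES
            and type_b not in EXCLUDED_ENTITY_TYPES)


def filter_records(records):
    """Return only the records that pass keep_record."""
    return [record for record in records if keep_record(record)]
```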
Target Information
Target InfoRes ID: infores:translator-signor
Edge Types
| Subject Categories | Predicate | Object Categories | Knowledge Level | Agent Type | UI Explanation |
|---|---|---|---|---|---|
| biolink:Protein, biolink:ProteinFamily, biolink:MacromolecularComplex, biolink:SmallMolecule | biolink:regulates | biolink:Protein, biolink:ProteinFamily, biolink:MacromolecularComplex, biolink:SmallMolecule | knowledge_assertion | manual_agent | Signor records indicate that a protein, protein family, complex, or small molecule 'regulates' another protein, protein family, complex, or small molecule. This maps best to the Biolink predicate 'regulates' with an additional directional qualifier. |
| biolink:Protein | biolink:part_of | biolink:MacromolecularComplex | knowledge_assertion | manual_agent | Signor records with a 'form complex' annotation indicate that the protein is a component of the complex, which maps best to the Biolink predicate 'part of' with an additional directional qualifier. |
| biolink:ChemicalEntity | biolink:affects | biolink:Protein, biolink:MacromolecularComplex, biolink:ProteinFamily, biolink:SmallMolecule | knowledge_assertion | manual_agent | Signor records indicate that the chemical 'affects' a protein, complex, protein family, or small molecule. This maps best to the Biolink predicate 'affects' with an additional directional qualifier. |
| biolink:Protein, biolink:MacromolecularComplex, biolink:ProteinFamily, biolink:SmallMolecule | biolink:affects | biolink:ChemicalEntity | knowledge_assertion | manual_agent | Signor records indicate that the protein, complex, protein family, or small molecule 'affects' a chemical. This maps best to the Biolink predicate 'affects' with an additional directional qualifier. |
| biolink:ChemicalEntity | | biolink:ChemicalEntity | knowledge_assertion | manual_agent | When the mechanism implies a physical interaction (e.g. 'binding'), the ingest creates a separate 'physically_interacts_with' edge in addition to the affects/regulates edge. |
| biolink:Protein, biolink:ProteinFamily, biolink:MacromolecularComplex, biolink:SmallMolecule, biolink:ChemicalEntity | biolink:physically_interacts_with | biolink:Protein, biolink:ProteinFamily, biolink:MacromolecularComplex, biolink:SmallMolecule, biolink:ChemicalEntity | knowledge_assertion | manual_agent | The 'MECHANISM' provided by Signor indicates that a physical interaction must exist between the subject and object entities, so this edge is created to report that, in addition to the primary affects/regulates causal association edge. |
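
The predicate choices in the table above could be sketched roughly as follows. This is an illustration under stated assumptions, not the ingest's actual logic: the function names are hypothetical, 'binding' is the only physical-interaction mechanism named in this guide (the real list is probably longer), the literal `EFFECT` value 'form complex' is assumed, and treating chemical-to-chemical records as 'affects' is a guess.

```python
# Assumption: only 'binding' is named in this guide; extend as needed.
PHYSICAL_INTERACTION_MECHANISMS = {"binding"}
CAUSAL_PREDICATES = {"biolink:affects", "biolink:regulates"}


def primary_predicate(type_a, type_b, effect):
    """Pick the primary predicate from entity types and the EFFECT value."""
    # Assumption: parthood is signalled by the EFFECT value 'form complex'.
    if effect == "form complex" and type_a == "protein" and type_b == "complex":
        return "biolink:part_of"
    # Any edge with a chemical participant uses 'affects'.
    # Assumption: this also covers chemical-to-chemical records.
    if "chemical" in (type_a, type_b):
        return "biolink:affects"
    return "biolink:regulates"


def predicates_for_record(record):
    """Return the predicates of all edges generated for one record."""
    type_a = record["TYPEA"].strip().lower()
    type_b = record["TYPEB"].strip().lower()
    effect = record["EFFECT"].strip().lower()
    mechanism = record["MECHANISM"].strip().lower()

    predicates = [primary_predicate(type_a, type_b, effect)]
    # The extra edge accompanies causal (affects/regulates) edges only.
    if (predicates[0] in CAUSAL_PREDICATES
            and mechanism in PHYSICAL_INTERACTION_MECHANISMS):
        predicates.append("biolink:physically_interacts_with")
    return predicates
```

Returning a list keeps the "one record, possibly two edges" behavior explicit: the causal edge always comes first and the physical-interaction edge is added only when the mechanism warrants it.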
Node Types
| Node Category | Source Identifier Types | Additional Notes |
|---|---|---|
| biolink:Protein | UNIPROT | |
| biolink:ProteinFamily | UNIPROT | |
| biolink:SmallMolecule | ChEMBL | |
| biolink:ChemicalEntity | ChEMBL | |
| biolink:MacromolecularComplex | SIGNOR ID | |
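
The category assignment above can be sketched as a lookup from the SIGNOR entity type to a Biolink category. The entity type spellings follow the 'Relevant Files' table; the function name `node_for` is hypothetical, and building the identifier as `DATABASE:ID` is an assumption, since the actual ingest may normalize prefixes differently.

```python
# SIGNOR entity type (lower-cased) -> Biolink node category, per the table above.
BIOLINK_CATEGORY_BY_SIGNOR_TYPE = {
    "protein": "biolink:Protein",
    "proteinfamily": "biolink:ProteinFamily",
    "smallmolecule": "biolink:SmallMolecule",
    "chemical": "biolink:ChemicalEntity",
    "complex": "biolink:MacromolecularComplex",
}


def node_for(name, entity_type, entity_id, database):
    """Build a minimal node dict, or return None for unmapped entity types.

    Assumption: the identifier is formed as '<DATABASE>:<ID>'; a real ingest
    would likely map the database name to a registered CURIE prefix.
    """
    category = BIOLINK_CATEGORY_BY_SIGNOR_TYPE.get(entity_type.strip().lower())
    if category is None:
        return None
    return {"id": f"{database}:{entity_id}", "name": name, "category": category}
```

Returning `None` for unmapped types keeps this lookup consistent with the filtering step, where entity types such as 'stimulus' are dropped rather than forced into an existing category.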
Future Modeling Considerations
node_content: Consider adopting new Biolink node types for stimulus and fusion protein, once available, for a more accurate mapping into the Biolink model.
edge_content: Note that the fact that X modifies Y (e.g. phosphorylates) is represented indirectly here, by an 'affects' predicate plus a mechanism qualifier that captures the type of modification. Do we want a more direct edge (e.g. X phosphorylates Y) to be created at ingest? Or derived afterward? Or should it perhaps live only in a 'predicate-semantics' KG derived from the primary 'qualifier-semantics' KG?
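
To make the 'derived afterward' option concrete, the sketch below shows how a direct edge might be computed from a qualifier-based edge. Everything here is hypothetical: the edge dictionary keys, the `mechanism_qualifier` field name, and the direct predicate 'phosphorylates' are illustrative only and are not existing Biolink terms or ingest outputs.

```python
# Hypothetical mapping from a mechanism qualifier value to a direct predicate.
# Only the example named in the text above is included.
HYPOTHETICAL_DIRECT_PREDICATES = {
    "phosphorylation": "phosphorylates",
}


def derive_direct_edge(edge):
    """Return a 'predicate-semantics' edge for a qualifier-based edge, or None.

    `edge` is assumed to be a dict with 'subject', 'predicate', 'object' and
    an optional 'mechanism_qualifier' key (all hypothetical field names).
    """
    if edge.get("predicate") != "biolink:affects":
        return None
    direct = HYPOTHETICAL_DIRECT_PREDICATES.get(edge.get("mechanism_qualifier"))
    if direct is None:
        return None
    # The original edge is left untouched; the derived edge is a new object.
    return {
        "subject": edge["subject"],
        "predicate": direct,
        "object": edge["object"],
    }
```

Because the derivation is a pure function of the primary edge, it could run either at ingest time or as a later pass that builds a separate derived graph, which is the choice the question above leaves open.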
Provenance Information
Contributors:
- Qi Wei: code author, data modeling
- Yue Zhang: data modeling
- Guangrong Qin: data modeling, domain expertise
- Sierra Moxon: code support
- Matthew Brush: data modeling, domain expertise

Artifacts:
- Ingest Survey: https://docs.google.com/spreadsheets/d/1tqimhXxpWzQdfNxanpW-rAaLnmsP80YZUthY5O4mEc8/edit?gid=1223527032#gid=1223527032
- Ingest Ticket: https://github.com/NCATSTranslator/Data-Ingest-Coordination-Working-Group/issues/29