***************************************************************
SympGAN: an integrated symptom-gene associations database
Description: SympGAN collects a large set of symptom terms and the associations among symptoms, genes, diseases, and clinical drugs.
***************************************************************

symptoms.tsv
Description: Basic information on all symptom terms in SympGAN.
The columns in the file are:
Symptom_CUI -> Symptom identifier in UMLS
Symptom_name -> Symptom name
Alias -> Symptom alias
Definition -> Symptom definition
External_Ids -> External IDs that link to other databases
---------------------------------------------------------------

diseases.tsv
Description: Basic information on all disease terms in SympGAN.
The columns in the file are:
Disease_CUI -> Disease identifier in UMLS
Disease_name -> Disease name
Alias -> Disease alias
Definition -> Disease definition
External_Ids -> External IDs that link to other databases
---------------------------------------------------------------

drugs.tsv
Description: Basic information on all drug terms in SympGAN.
The columns in the file are:
Drug_id -> Drug identifier in DrugBank
Drug_name -> Drug name
Description -> Drug background
State -> Drug state
Cas_number -> CAS registry number
Synonyms -> Drug synonyms
Sequences -> Molecular sequence
Indication -> Drug indication
Pharmacodynamics -> Drug pharmacodynamics
Mechanism_of_action -> Drug mechanism of action
Metabolism -> Drug metabolism
Toxicity -> Drug toxicity
---------------------------------------------------------------

genes.tsv
Description: Basic information on all gene terms in SympGAN.
The columns in the file are:
Gene_id -> Gene identifier in SympGAN
Gene_symbol -> Official gene symbol
Chromosome -> Chromosome on which the gene is located
Gene_name -> Official gene name
Protein_name -> Protein name
Ensembl_id -> Identifier in the Ensembl database
Ncbi_id -> Identifier in the NCBI database
Hgnc_id -> Identifier in the HGNC database
Vega_id -> Identifier in the VEGA database
Genbank_gene_id -> Gene identifier in the GenBank database
Genbank_protein_id -> Protein identifier in the GenBank database
Uniprot_id -> Identifier in the UniProt database
Pdb_id -> Identifier in the PDB database
Mim_id -> Identifier in the OMIM database
Mirbase_id -> Identifier in the miRBase database
Imgt_gene_db_id -> Identifier in the IMGT/GENE-DB database
---------------------------------------------------------------

curated_symptom_gene_associations.tsv
Description: Symptom-gene associations extracted from phenotype-genotype associations.
The columns in the file are:
Symptom_CUI -> Symptom identifier in UMLS
Symptom_name -> Symptom name
Gene_Symbol -> Official gene symbol
Source -> Data source
---------------------------------------------------------------

literature_symptom_gene_associations.tsv
Description: Symptom-gene associations derived from literature databases (i.e., PubMed and SemMed).
The columns in the file are:
Symptom_CUI -> Symptom identifier in UMLS
Symptom_name -> Symptom name
Gene_Symbol -> Official gene symbol
No. of PMID -> Number of PubMed articles in which the symptom and gene co-occur
P-value -> P-value of the symptom-gene association using Fisher's exact test
Source -> Data source
PubMed_IDs -> PubMed IDs of the articles in which the symptom and gene co-occur (at most 1000 PubMed IDs per association)
---------------------------------------------------------------

inferred_symptom_gene_associations.tsv
Description: Symptom-gene associations predicted by the LSGER prediction algorithm.
The columns in the file are:
Symptom_CUI -> Symptom identifier in UMLS
Symptom_name -> Symptom name
Gene_Symbol -> Official gene symbol
P_value -> P-value of the symptom-gene association using Fisher's exact test
Score -> Confidence score of the inferred association
---------------------------------------------------------------

all_symptom_gene_associations.tsv
Description: All symptom-gene associations, integrating the curated, literature-derived, and inferred associations.
The columns in the file are:
Symptom_CUI -> Symptom identifier in UMLS
Symptom_name -> Symptom name
Gene_Symbol -> Official gene symbol
Source -> Data source
---------------------------------------------------------------

symptom_disease_associations.tsv
Description: Symptom-disease associations integrated from multiple databases (DO, HPO, MalaCards, HSDN, Orphanet, UMLS).
The columns in the file are:
Symptom_CUI -> Symptom identifier in UMLS
Symptom_name -> Symptom name
Disease_CUI -> Disease identifier in UMLS
Disease_Name -> Disease name
Source -> Data source
---------------------------------------------------------------

symptom_drug_associations.tsv
Description: Symptom-drug associations integrated from the SIDER database.
The columns in the file are:
Symptom_CUI -> Symptom identifier in UMLS
Symptom_name -> Symptom name
Drug_ID -> Drug identifier in DrugBank
Drug_Name -> Drug name in DrugBank
Type -> Symptom type (side effect or indication) in relation to the clinical drug
Source -> Data source
---------------------------------------------------------------

drug_target_associations.tsv
Description: Drug-target associations integrated from the DrugBank database.
The columns in the file are:
drug_id -> Drug identifier in DrugBank
gene_symbol -> Official gene symbol
---------------------------------------------------------------

If you have any further questions, please email us at xzzhou@bjtu.edu.cn
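
---------------------------------------------------------------

Usage sketch: because symptom_drug_associations.tsv and drug_target_associations.tsv both carry a DrugBank identifier, they can be joined to list the target genes reachable from each symptom through its associated drugs. The Python sketch below is only an illustration and is not part of the SympGAN distribution. It assumes the files are tab-separated with a header row that matches the column names listed above. The rows are made-up placeholders, not real SympGAN data, and the helper names `read_tsv` and `genes_by_symptom`, as well as the exact `Type` and `Source` strings, are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Placeholder rows, NOT real SympGAN data. All identifiers are invented,
# and the Type/Source strings are assumptions about the file contents.
SYMPTOM_DRUG_TSV = (
    "Symptom_CUI\tSymptom_name\tDrug_ID\tDrug_Name\tType\tSource\n"
    "CUI_A\tsymptom A\tDB_X\tdrug X\tside effect\tplaceholder\n"
    "CUI_B\tsymptom B\tDB_X\tdrug X\tindication\tplaceholder\n"
    "CUI_A\tsymptom A\tDB_Y\tdrug Y\tindication\tplaceholder\n"
)
DRUG_TARGET_TSV = (
    "drug_id\tgene_symbol\n"
    "DB_X\tGENE1\n"
    "DB_X\tGENE2\n"
    "DB_Y\tGENE3\n"
)


def read_tsv(handle):
    """Parse a tab-separated file with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def genes_by_symptom(symptom_drug_rows, drug_target_rows, type_filter=None):
    """Map each Symptom_CUI to the sorted gene symbols targeted by its drugs.

    If type_filter is given, only rows whose Type column equals it are used.
    """
    # Index drug targets by drug identifier (lower-case column names here).
    targets = defaultdict(set)
    for row in drug_target_rows:
        targets[row["drug_id"]].add(row["gene_symbol"])

    # Follow symptom -> drug -> gene; note the differently cased Drug_ID column.
    result = defaultdict(set)
    for row in symptom_drug_rows:
        if type_filter is not None and row["Type"] != type_filter:
            continue
        result[row["Symptom_CUI"]] |= targets.get(row["Drug_ID"], set())
    return {cui: sorted(genes) for cui, genes in result.items()}


symptom_drug = read_tsv(io.StringIO(SYMPTOM_DRUG_TSV))
drug_target = read_tsv(io.StringIO(DRUG_TARGET_TSV))
linked = genes_by_symptom(symptom_drug, drug_target)

for cui, genes in sorted(linked.items()):
    print(cui, ",".join(genes))
```

To run this against the downloaded files, the `io.StringIO(...)` arguments could be replaced with file handles such as `open("symptom_drug_associations.tsv", newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption. The two files spell the drug identifier column differently (`Drug_ID` and `drug_id`), so the join names each column explicitly.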