STANDARD_NAME NABA_ECM_AFFILIATED SYSTEMATIC_NAME M5880 COLLECTION C2:CGP MSIGDB_URL https://www.gsea-msigdb.org/gsea/msigdb/human/geneset/NABA_ECM_AFFILIATED NAMESPACE HGNC_ID DESCRIPTION_BRIEF Genes encoding proteins affiliated structurally or functionally to extracellular matrix proteins DESCRIPTION_FULL One hallmark of ECM proteins is their domain-based structure. Exploiting this characteristic, we established a list of diagnostic InterPro domains commonly found in ECM proteins. This domain list was used to screen the UniProt protein database. We know that some of the domains used to select positively for ECM proteins are also found in transmembrane receptors and proteins involved in cell adhesion (growth factor receptors, integrins, etc.) that do not belong to the ECM. These families of proteins also display a subset of specific domains and transmembrane domains incompatible with definition as “extracellular matrix” proteins. Therefore, a second step comprised a negative selection using another set of domains and a transmembrane domain prediction. Manual curation of the matrisome lists also allowed us to add a few known ECM proteins that do not contain any known domains. Protein-centric predictions were then converted to gene-centric lists. Finally, knowledge-based annotation of these gene lists allowed us to define subcategories within the core matrisome; namely, ECM glycoproteins, collagens, and proteoglycans. We also defined separate lists of domains commonly found in 1) ECM-affiliated proteins (proteins that either share some architectural similarities with ECM proteins or are known to be associated with ECM proteins); 2) ECM regulators: ECM-remodeling enzymes, crosslinkers, proteases, regulators, etc.; 3) secreted factors, many of which are known to bind to ECM and others that may. As for the core matrisome list, we also defined lists of domains that excluded mis-assigned proteins from these categories.
Using bioinformatic pipelines similar to those used for the core matrisome, we defined three categories of “matrisome-associated” proteins: ECM-affiliated proteins, ECM regulators, and secreted factors. PMID 22159717 GEOID AUTHORS Naba A,Clauser KR,Hoersch S,Liu H,Carr SA,Hynes RO CONTRIBUTOR Alexandra Naba CONTRIBUTOR_ORG Massachusetts Institute of Technology EXACT_SOURCE FILTERED_BY_SIMILARITY EXTERNAL_NAMES_FOR_SIMILAR_TERMS EXTERNAL_DETAILS_URL http://matrisome.org SOURCE_MEMBERS HGNC:10576,HGNC:10658,HGNC:10659,HGNC:10660,HGNC:10661,HGNC:10723,HGNC:10724,HGNC:10725,HGNC:10726,HGNC:10727,HGNC:10728,HGNC:10729,HGNC:10730,HGNC:10731,HGNC:10732,HGNC:10734,HGNC:10735,HGNC:10736,HGNC:10737,HGNC:10738,HGNC:10739,HGNC:10740,HGNC:10741,HGNC:10798,HGNC:10799,HGNC:10801,HGNC:10802,HGNC:10803,HGNC:11891,HGNC:1241,HGNC:1242,HGNC:1245,HGNC:13257,HGNC:13258,HGNC:13523,HGNC:14324,HGNC:14325,HGNC:14326,HGNC:14342,HGNC:14343,HGNC:14344,HGNC:14346,HGNC:14351,HGNC:14362,HGNC:14554,HGNC:14555,HGNC:14556,HGNC:14558,HGNC:14956,HGNC:15449,HGNC:15582,HGNC:15788,HGNC:16016,HGNC:16041,HGNC:1641,HGNC:16770,HGNC:16800,HGNC:16916,HGNC:17213,HGNC:17279,HGNC:18259,HGNC:18386,HGNC:18387,HGNC:19359,HGNC:19832,HGNC:2001,HGNC:2014,HGNC:2052,HGNC:2053,HGNC:2054,HGNC:20599,HGNC:20945,HGNC:21013,HGNC:21661,HGNC:21969,HGNC:2220,HGNC:22977,HGNC:23282,HGNC:23334,HGNC:23399,HGNC:24181,HGNC:24182,HGNC:24191,HGNC:24355,HGNC:24356,HGNC:24536,HGNC:24591,HGNC:2466,HGNC:2467,HGNC:24842,HGNC:25012,HGNC:25172,HGNC:25357,HGNC:25396,HGNC:26705,HGNC:28538,HGNC:28732,HGNC:29396,HGNC:29595,HGNC:30054,HGNC:30388,HGNC:30400,HGNC:30588,HGNC:31374,HGNC:31416,HGNC:31713,HGNC:31966,HGNC:33154,HGNC:33849,HGNC:33874,HGNC:34520,HGNC:34522,HGNC:3623,HGNC:3624,HGNC:3625,HGNC:39755,HGNC:40039,HGNC:4449,HGNC:4450,HGNC:4451,HGNC:4452,HGNC:4453,HGNC:4454,HGNC:4577,HGNC:5171,HGNC:533,HGNC:534,HGNC:535,HGNC:536,HGNC:537,HGNC:541,HGNC:542,HGNC:543,HGNC:544,HGNC:545,HGNC:546,HGNC:547,HGNC:6561,HGNC:6562,HGNC:6563,HGNC:6565,HGNC:6568,HGNC:6569,HGNC:6570,HGNC:6631,HGNC:6632,HGNC:6922,HGNC:7508,HGNC:7510,HGNC:7511,HGNC:7512,HGNC:7513,HGNC:7514,HGNC:7515,HGNC:7516,HGNC:7517,HGNC:7518,HGNC:7519,HGNC:8524,HGNC:8601,HGNC:9099,HGNC:9100,HGNC:9101,HGNC:9102,HGNC:9103,HGNC:9104,HGNC:9105,HGNC:9106,HGNC:9107,HGNC:9951,HGNC:9952 GENE_SYMBOLS CLEC11A,SDC1,SDC2,SDC3,SDC4,SEMA3A,SEMA3B,SEMA3C,SEMA3D,SEMA3E,SEMA3F,SEMA4A,SEMA4B,SEMA4C,SEMA4D,SEMA4F,SEMA4G,SEMA5A,SEMA5B,SEMA6A,SEMA6B,SEMA6C,SEMA7A,SFTPA1,SFTPA2,SFTPB,SFTPC,SFTPD,CLEC3B,C1QA,C1QB,C1QC,CLEC4A,CLEC4C,CLEC4M,C1QTNF1,C1QTNF2,C1QTNF3,C1QTNF7,C1QTNF6,C1QTNF5,C1QTNF4,CLEC2D,MUC19,CLEC4D,CLEC4E,CLEC6A,CLEC7A,MUC15,LGALS13,MUC16,LGALS12,COLEC12,EMCN,CD209,SEMA6D,MUC17,CLEC10A,COLEC11,OPRPN,ITLN1,SFTA2,SFTA3,C1QL3,CLEC14A,GREM1,CLC,CLEC3A,CLEC2B,CLEC5A,ITLN2,PLXDC1,PLXDC2,MUC21,CLEC2L,COLEC10,REG4,MUC20,ANXA8L1,FREM1,C1QL2,C1QL1,CLEC2A,CLEC1A,CLEC1B,PARM1,CLEC4G,CSPG4,CSPG5,LGALS9B,LGALSL,FREM3,CLEC4F,FREM2,CLEC9A,CLEC18C,C1QTNF9,ELFN2,REG3G,LGALS14,CLEC18A,SEMA3G,MUCL1,C1QTNF8,C1QL4,CLEC12A,CLEC12B,ELFN1,CLEC18B,LGALS9C,CLEC17A,CLEC19A,FCN1,FCN2,FCN3,MUC22,LGALS16,GPC1,GPC2,GPC3,GPC4,GPC5,GPC6,GRIFIN,HPX,ANXA1,ANXA10,ANXA11,ANXA13,ANXA2,ANXA3,ANXA4,ANXA5,ANXA6,ANXA7,ANXA8,ANXA9,LGALS1,LGALS2,LGALS3,LGALS4,LGALS7,LGALS8,LGALS9,LMAN1,LMAN1L,MBL2,MUC1,MUC12,MUC13,MUC2,MUC3A,MUC4,MUC5AC,MUC5B,MUC6,MUC7,,OVGP1,REG3A,PLXNA1,PLXNA2,PLXNA3,PLXNA4,PLXNB1,PLXNB2,PLXNB3,PLXNC1,PLXND1,REG1A,REG1B FOUNDER_NAMES
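
The selection procedure summarized in DESCRIPTION_FULL (positive screen on diagnostic domains, negative selection on excluding domains and predicted transmembrane domains, manual additions, then conversion from proteins to genes) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Protein` class, the `select_genes` function, the `apply_tm_filter` switch, and every domain, accession, and gene label below are hypothetical placeholders, and the real InterPro domain lists are not reproduced here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class Protein:
    """Hypothetical record: one protein with its annotated domains."""
    accession: str
    gene: str
    domains: FrozenSet[str]
    has_tm_domain: bool = False  # stand-in for a transmembrane prediction


def select_genes(
    proteins: Iterable[Protein],
    include_domains: Set[str],
    exclude_domains: Set[str],
    manual_additions: Iterable[str] = (),
    apply_tm_filter: bool = True,
) -> List[str]:
    """Two-step domain screen followed by protein-to-gene conversion."""
    selected: Set[str] = set()
    for protein in proteins:
        # Step 1: positive selection on diagnostic domains.
        if not (protein.domains & include_domains):
            continue
        # Step 2: negative selection on excluding domains ...
        if protein.domains & exclude_domains:
            continue
        # ... and, optionally, on a predicted transmembrane domain.
        if apply_tm_filter and protein.has_tm_domain:
            continue
        # Protein-centric hit converted to a gene-centric entry.
        selected.add(protein.gene)
    # Manual curation: known members that carry no diagnostic domain.
    selected.update(manual_additions)
    return sorted(selected)


# Toy input; all identifiers are placeholders, not real annotations.
toy_proteins = [
    Protein("P1", "GENE_A", frozenset({"POS_1"})),
    Protein("P2", "GENE_B", frozenset({"POS_1", "NEG_1"})),
    Protein("P3", "GENE_C", frozenset({"POS_2"}), has_tm_domain=True),
    Protein("P4", "GENE_D", frozenset({"OTHER"})),
]

genes = select_genes(
    toy_proteins,
    include_domains={"POS_1", "POS_2"},
    exclude_domains={"NEG_1"},
    manual_additions=["GENE_E"],
)
print(genes)
```

In this toy run GENE_B is dropped by the excluding domain, GENE_C by the transmembrane flag, and GENE_D never passes the positive screen, while GENE_E enters only through manual curation. The transmembrane filter is left optional because the description states it explicitly only for the core matrisome step.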