-----------------------------------------------
Pentacon NSAID Project Curation
Princeton University, NJ, USA
University of Pennsylvania, PA, USA
-----------------------------------------------

Readme file name:
Orthologs_and_analogs_Readme_20140709_production.txt

Readme for the following files:
Orthologs_and_analogs_20140709_production.txt
Orthologs_and_analogs_Evidence_Codes_20140421_production.txt

Contents:
Mouse (Mus musculus), rat (Rattus norvegicus), zebrafish (Danio rerio), and yeast (Saccharomyces cerevisiae) orthologs/analogs of the 133 human genes PENTACON curators have identified as being involved in the arachidonic acid pathway. These non-human genes are suggested as good models of the corresponding human genes in their respective model organisms.

Date: 7/9/2014

Curation Overview
-----------------
PENTACON curators identified consensus orthologs/analogs using P-POD version 4 and IMP. Orthologs were identified using P-POD's OrthoMCL analysis when possible or, when the human gene was not assigned to an OrthoMCL family, using P-POD's MultiParanoid analysis. Functional analogs were obtained from IMP using a cutoff of p < 0.05. In some cases, functional analogs were identified directly from the literature; in these cases the supporting PMID and evidence code "9" are noted.

PENTACON curators reviewed the ortholog and analog calls and evidence and, using the following evidence codes, identified the consensus ortholog/analog:

Evidence code   Description
1               P-POD identifies a single ortholog, IMP identifies a single analog, and they agree.
1P              P-POD identifies a single ortholog, IMP identifies a single analog, and they agree; P-POD ortholog is found in MultiParanoid family.
2               Call based on orthology (P-POD) only. P-POD identified 1 or more orthologs, but IMP did not identify an analog.
2P              Call based on orthology (P-POD) only. P-POD identified 1 or more orthologs, but IMP did not identify an analog; P-POD ortholog is found in MultiParanoid family.
3               Call based on analogy (IMP) only. IMP identified 1 or more analogs, but P-POD did not identify an ortholog.
4               Multiple or ambiguous orthologs resolved by referring to analogs.
5               Multiple or ambiguous orthologs resolved by referring to both IMP and P-POD.
6               P-POD and IMP disagree. Selection made by curator judgment.
7               IMP identifies no analog, and P-POD identifies no ortholog.
8               P-POD and IMP identify the same proteins, and curator selected a subset based on additional evidence.
9               Manual annotation of functional analogs based on published literature.

The P-POD and IMP web sites can be found here:
http://ppod.princeton.edu/
http://imp.princeton.edu/

References:

Heinicke S, Livstone MS, Lu C, Oughtred R, Kang F, Angiuoli SV, White O, Botstein D, Dolinski K. "The Princeton Protein Orthology Database (P-POD): a comparative genomics analysis tool for biologists." PLoS One. 2007 Aug 22;2(8):e766.

Wong AK, Park CY, Greene CS, Bongo LA, Guan Y, Troyanskaya OG. "IMP: A multi-species functional genomics portal for integration, visualization and prediction of protein functions and networks." Nucleic Acids Research. July 2012;40:W484-W490.

--------------------------------------------------------------
Orthologs_and_analogs_20140709_production.txt File Information
--------------------------------------------------------------

The file contains 15 columns as follows:

Columns 1-3: The standard name, UniProt ID, and NCBI Gene ID for the 133 human genes in the arachidonic acid pathway.

Column 4: The name and NCBI Gene ID of the mouse gene identified as the consensus ortholog/analog of the human gene, separated by a pipe symbol. Example: "Abcc1|17250". When more than one ortholog/analog is listed, they are separated by semicolons. Example: "Acot1|26897; Acot2|171210".

Column 5: The evidence code (see above) supporting the identification of the mouse gene(s) as the ortholog/analog.

Column 6: Relevant free-text notes pertaining to the mouse gene(s).

Columns 7-9: Similar to columns 4-6, but for rat genes.

Columns 10-12: Similar to columns 4-6, but for zebrafish genes.

Columns 13-15: Similar to columns 4-6, but for yeast genes. Also, the standard yeast ORF identifier (e.g., "YDR135C") is used in place of the NCBI Gene ID.

--------------------------------------------------------------
Updates and changes
--------------------------------------------------------------

• The initial production release (20130710) contained mouse, zebrafish, and yeast orthologs/analogs of 112 human genes.

• The second production release (20131125) had the following changes:
  1) Standard names were added as aliases to two human genes: C1ORF93 (UniProt ID Q8TBF2, NCBI GeneID 127281) became "FAM213B/C1ORF93", and GPR44 (UniProt ID Q9Y5Y4, NCBI GeneID 11251) became "PTGDR2/GPR44".
  2) The following 21 human genes were added, and orthologs/analogs were identified, in order to match the AAP gene list: ABCC1, ABCC4, ACOT1, ACOT11, ACOT13, ACOT2, ACOT4, ACOT7, ACOT8, ACSL6, BAAT, ELOVL2, ELOVL5, FADS1, FADS2, GPR17, MGST2, MGST3, OXER1, PLA2G16, and PNPLA2.
  3) The capitalization of a number of mouse and zebrafish genes was standardized.
  4) Standard names were added as aliases to the following mouse and zebrafish gene names (shown with both gene names):
     mouse: Alox15/Alox12l, Fam213b/2810405K02RIK
     zebrafish: acsl1b/Zgc:101071, dpep1/zgc:153024

• The third production release (20140421) had the following changes:
  1) Rat orthologs were identified for specified genes from the pathway. Three new columns were added to accommodate this data.
  2) New yeast analogs were added based on information obtained from the literature during the curation process and supported with a new evidence code (9).
  3) Evidence code 9 was added with the following description: "Manual annotation of functional analogs based on published literature."

• The fourth production release (20140709) had the following changes:
  1) Rat orthologs and analogs were identified for the remainder of the 133 genes.
  2) The order of the column groupings was changed to: Human, mouse, rat, zebrafish, yeast.
  3) Several minor typographical errors were fixed.

--------------------------------------------------------------------------------
For questions please contact Mike Livstone (livstone at genomics.princeton.edu).
--------------------------------------------------------------------------------
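
Example: reading the 15-column file
-----------------------------------
The column layout described under File Information can be read with a short script. The Python sketch below is illustrative only and is not part of the PENTACON release. It assumes the file is tab-delimited, which this readme does not state, and it makes the presence of a header row an explicit option because that is not documented either. The function names (parse_gene_cell, parse_row, read_table) and the dictionary keys are hypothetical and chosen for this example.

```python
import csv

# Column groups after the three human columns, in the order given in this readme.
SPECIES = ("mouse", "rat", "zebrafish", "yeast")


def parse_gene_cell(cell):
    """Split a cell such as 'Acot1|26897; Acot2|171210' into (name, id) pairs.

    Entries are separated by semicolons; name and identifier by a pipe.
    For yeast the identifier is an ORF name (e.g. 'YDR135C'), so it stays a string.
    """
    pairs = []
    for entry in cell.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # empty cell or trailing semicolon
        name, _, identifier = entry.partition("|")
        pairs.append((name.strip(), identifier.strip()))
    return pairs


def parse_row(fields):
    """Map one 15-field row onto the layout: 3 human columns + 4 groups of 3."""
    if len(fields) != 15:
        raise ValueError(f"expected 15 columns, got {len(fields)}")
    record = {
        "human_name": fields[0],
        "uniprot_id": fields[1],
        "ncbi_gene_id": fields[2],
    }
    for i, species in enumerate(SPECIES):
        genes, code, notes = fields[3 + 3 * i : 6 + 3 * i]
        record[species] = {
            "genes": parse_gene_cell(genes),
            "evidence_code": code.strip(),  # kept as text: codes include '1P', '2P'
            "notes": notes.strip(),
        }
    return record


def read_table(handle, has_header=False):
    """Read all rows from an open text handle.

    ASSUMPTION: fields are tab-separated. Set has_header=True if the first
    line of the file turns out to be a header row.
    """
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)
    return [parse_row(row) for row in reader if row]


if __name__ == "__main__":
    # The two example cells quoted in the File Information section.
    print(parse_gene_cell("Abcc1|17250"))
    print(parse_gene_cell("Acot1|26897; Acot2|171210"))
```

Evidence codes are kept as strings rather than integers because codes such as "1P" and "2P" are not numeric. To load the real file, open it and pass the handle to read_table, adjusting the delimiter if the file is not tab-separated.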