IRMNG March 2012


www.obis.org.au/irmng

IRMNG, the Interim Register of Marine and Nonmarine Genera: rationale and current status
Tony Rees, CSIRO Marine and Atmospheric Research, Australia
for: GN-CoL names and taxonomy sharing workshop, Hawaii, March 2012

The Dream

Imagine a system that would:
  • Automatically classify any genus & species name to kingdom / phylum / class / order / family (as far down as possible), i.e. "what is this critter?", plus hierarchical relations e.g. parents / children / siblings
  • Return whether a current (valid) or non-current name, e.g. synonym
  • Check spelling for correctness, also authority details, plus supply original publication ref. as available
  • Return associated attributes such as extant / fossil status, habitat information, geographic / geologic range, more
  • Work seamlessly, with a single point of entry, across all groups and geologic epochs including present day
  • Be as up-to-date as possible (latest content), and authoritative (maintained by relevant experts)

Tony Rees: IRMNG March 2012

Realising the Dream

  • For extant taxa: role of Cat. of Life, however ~30% of species still to go; for fossil taxa: PaleoDB (unknown proportion missing, maybe 50%?)
  • In mean time, could make progress by assembling global genera list, and infilling with species names as available [diagram labels: genera, species]
  • IRMNG is an attempt along these lines: a work in progress, with modest resourcing, but available for use now.

IRMNG data sources

  • Animal genera + auths from Nomenclator Zoologicus and elsewhere; tax. placements and synonymies from multiple sources including CoL, individual taxon treatments and printed works
  • Botanical genera and auths from Index Nominum Genericorum (ING) supplemented with other sources; tax. placements and synonymies from multiple sources including GRIN (APGIII in the main), Index Fungorum, AlgaeBase, CyanoDB, more
  • Prokaryote genera, auths and tax. placements from LPSN (Euzéby list); previous/non-valid names from multiple sources
  • Virus genera and tax. placements from ICTV db (multiple versions, very different through time)
  • Species lists (all groups) from CoL 2006, Aphia/WoRMS 2006, AFD, NZ Organisms Register + more.

IRMNG content as at March 2012 (cf. e.g. Cat. of Life)

Cat. of Life (2011 version): 8k families, 178k genera, 2.25m species names (including synonyms)
IRMNG: 19k families, 454k genera, 1.46m species names (including synonyms)
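As a rough worked comparison of the rounded counts quoted above (the variable names are illustrative only, and the inputs are the slide's rounded figures rather than exact database totals):

```python
# Rounded counts as quoted on the slide (k = thousand, m = million).
col_2011 = {"families": 8_000, "genera": 178_000, "species_names": 2_250_000}
irmng_2012 = {"families": 19_000, "genera": 454_000, "species_names": 1_460_000}

# Ratio of IRMNG holdings to Cat. of Life holdings at each level.
ratios = {level: irmng_2012[level] / col_2011[level] for level in col_2011}

for level, ratio in ratios.items():
    print(f"{level}: IRMNG holds {ratio:.2f}x the Cat. of Life count")
```

On these figures IRMNG holds roughly two and a half times as many genus names as Cat. of Life 2011, but only about 65% as many species names.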

  • Not all IRMNG genera yet linked to relevant families, but ~370k are (remainder linked to higher taxon i.e. phylum, class or order)
  • Extant/fossil, marine/nonmarine flags held for majority of names
  • Nomenclatural status known for most names; tax. status i.e. valid name/synonym for only a subset at this time (varies by group)
  • Authority known for >97% of genera, publication details for animal subset (from Nomenclator Zoologicus in the main)
  • Fuzzy matching (TAXAMATCH) deployed over all web-based queries for correction of potential errors in input names to be matched.

IRMNG in practice

example genus = Lawsonia
Same name is currently a valid genus in 3 Codes i.e. plants, animals and bacteria (no barriers to this)

Required base information is scattered in multiple systems / printed works at this time
[Figure labels: plant, animal, bacterium (etc.)]

IRMNG query as at March 2012
[Figure callouts: extant, habitat flags; children; parents; synonym of (as known)]

Note: IRMNG fields displayed on the web are only a subset of full information held for any name, e.g.:

IRMNG core fields

  • IRMNG ID, Rank
  • Scientific name (for species: epithet + parent ID)
  • Authority
  • Publication (as microcitation subset with link to refs. module)
  • Source(s) for above
  • Orthography verified against (authoritative source)
  • Parent ID (+ according to); Linnaean ranks only at this time
  • Extant/fossil, marine/nonmarine flags + according to (could be as per parent)
  • Date entered, last modified, deprecated (where required)
  • (under consideration) Intermediate ranks e.g. subfamily, subgenus, also infraspecies (not currently held)
  • Nomenclatural status (+ relation with other names as needed) + according to
  • Type genus / species indicator
  • Taxonomic status (same)
  • Geo flags (country codes etc.)
  • Nomenclatural Code
  • Palaeo range (periods/epochs)
  • Taxonomic or nomenclatural remarks
  • Vernacular names as available
  • Freshwater / terrestrial flags vs. present nonmarine

IRMNG is not just a passive aggregator

Editorial / curatorial decisions / actions required to:
  • Correct obvious data errors
  • Assemble complete records from multiple sources (where one source data deficient)
  • Normalise authority data (in particular) to a house style
  • Digitise or transcribe print material into electronic form where not otherwise available
  • Decide between conflicting content in data sources e.g. for authority orthography/year, taxonomic placement, valid/synonym status and more
  • Cross-link names e.g. synonyms -> current names, basionyms -> replacement names, misspelled names to their correctly spelled counterparts, etc.
  • Reconcile variant higher taxonomies as supplied to a single hierarchy
  • Add nomenclatural or taxonomic remarks as required.

Relevance to present meeting?

  • Demonstrates utility of a single entry point to a system permitting query on any name i.e., a [comprehensive] Taxonomic Name Resolution Service (TNRS) covering all life
  • Envisage something like OBIS or GBIF, but for taxonomy: the aggregator / central query point is not a content author, but provides integration and value-added services
  • IRMNG based on static snapshot/s of multiple data sources; cf. a super catalogue should be based on live feeds from relevant authoritative sources, continuously updated as available (?+ some static data not available as feeds)
  • Maybe the static data lives outside the data aggregation/query point, becomes a separately managed source
  • How does / should GNA facilitate this?
  • Will the need for an IRMNG (or IRMNG equivalent) disappear or grow in the above scenario? (for example could this role be taken by another player or group of players)

Thank you!

(supplementary slides)

Size of the task: IRMNG 2011 content cf. Cat. of Life 2011

Columns: Cat. of Life 2011 edition; IRMNG Oct 2011, extant + fossil; IRMNG Oct 2011, fossil only (in parentheses). "% auth's" = % with auth's for the column to its left.

Rank                   CoL 2011  % auth's  IRMNG Oct 2011  % auth's  IRMNG fossil only
Kingdoms                      8         -               7         -                (0)
Phyla                       111         -             153         -               (12)
Classes                     288         -             509         -               (64)
Orders                    1,233         -           2,645         -              (715)
Families                  8,071        0%          19,639     22.1%            (6,542)
Subfamilies                   -         -               -         -                  -
Genera                  178,515        0%         452,848     97.1%           (90,278)
Subgenera                     -         -               -         -                  -
Species (valid)       1,347,224     ~100%       1,020,519     ~100%           (16,792)
Species (synonyms)      895,441     ~100%         440,738     ~100%              (100)

  • CoL has 70% of valid extant species names (of est. 1.9m total), thus maybe also 70% of valid extant genera (with subset of genus-level synonyms)
  • IRMNG has further ~180k extant genus names and ~90k fossil names at this time (including syns); est. ~25k still missing

Taxonomic names: what the customer is currently offered (+ more)

[Diagram. Groupings shown: publication discovery; official registers; taxon-specific DBs; integrated DBs; all names; Botany; Zoology.]
[Resources shown: ICTV Viruses DB; CyanoDB; ITIS; NCBI Taxonomy; WoRMS etc.; Index Fungorum; MycoBank; LPSN (Prokaryote names); AlgaeBase; New names published (in primary literature); Plant GSDs; ICBN Decisions; Catalogue of Life; The Plant List, IPNI, TROPICOS, ING; Journal TOCs, RSS feeds, text mining; PaleoDB; Animal GSDs; Abstracting services; Nomenclator Zoologicus; Subject bibliographies; Reviews, secondary literature; Zoological Record; ICZN Decisions; GNI; GNUB; ChecklistBank; ION (Index of Organism Names); other compilations e.g. regional lists, Wikispecies, Wikipedia, more.]

Two approaches - GNI and Cat. of Life

NameBank / GNI
  • 20m+ names
  • all ranks, no hierarchy
  • mix of clean and dirty names
  • many duplicates
  • extant + fossil, most sectors with at least some names

GNI search result Lawsonia (all ranks returned) (Mar 2012)

  • candidate genus names highlighted in red (although could be other ranks too)
  • need access to original taxonomic / nomenclatural resources to sort out / see if anything missed

Cat. of Life
  • <2m names
  • Linnaean ranks, in hierarchy
  • all clean / vetted names / relationships
  • extant only, sectors either complete or absent

Cat. of Life search result Lawsonia (Mar 2012)
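A minimal sketch of the single-entry-point lookup that the Lawsonia searches illustrate, assuming a toy in-memory register. The record layout, the IDs and the names Exampleus and Oldnameus are hypothetical, and the spelling-correction step uses Python's difflib as a stand-in, not TAXAMATCH itself:

```python
from dataclasses import dataclass
from difflib import get_close_matches
from typing import List, Optional


@dataclass(frozen=True)
class NameRecord:
    """Hypothetical, cut-down record loosely modelled on the core fields slide."""
    record_id: int                     # placeholder ID, not a real IRMNG ID
    name: str
    rank: str
    code: str                          # nomenclatural Code the name falls under
    parent: str                        # higher taxon the name is linked to
    accepted_id: Optional[int] = None  # set when the name is a synonym


# Illustrative data only: one homonym held under three Codes, plus an
# invented synonym pair to exercise the cross-link.
RECORDS = [
    NameRecord(1, "Lawsonia", "genus", "botanical", "Plantae"),
    NameRecord(2, "Lawsonia", "genus", "zoological", "Animalia"),
    NameRecord(3, "Lawsonia", "genus", "prokaryote", "Bacteria"),
    NameRecord(4, "Exampleus", "genus", "zoological", "Animalia"),
    NameRecord(5, "Oldnameus", "genus", "zoological", "Animalia", accepted_id=4),
]
BY_ID = {r.record_id: r for r in RECORDS}


def lookup(query: str, cutoff: float = 0.8) -> List[NameRecord]:
    """Return every record for the name; fall back to the nearest spelling."""
    names = sorted({r.name for r in RECORDS})
    if query not in names:
        close = get_close_matches(query, names, n=1, cutoff=cutoff)
        if not close:
            return []
        query = close[0]
    return [r for r in RECORDS if r.name == query]


def current_name(record: NameRecord) -> NameRecord:
    """Follow synonym links until a record with no accepted_id is reached."""
    seen = set()
    while record.accepted_id is not None and record.record_id not in seen:
        seen.add(record.record_id)
        record = BY_ID[record.accepted_id]
    return record


if __name__ == "__main__":
    for rec in lookup("Lawsonnia"):    # misspelled on purpose
        print(rec.name, rec.code, rec.parent)
    print(current_name(BY_ID[5]).name)
```

The lookup returns all homonyms rather than picking one, since (as with Lawsonia) the same name can be valid under several Codes and only the caller's context can say which is meant.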
