Overview
Documentation on the ICD-O mapping project.
Meeting Notes
Details
Computing Scope
Use the attached transitive_closure.pl script to compute the project scope with the commands below. The resulting file, icdoScope.txt, is attached (1558 entries). A file of the unmapped codes is also attached.
cd Snapshot/Refset/Map
grep 446608001 *Simple* | grep '/' | perl -ne '@_=split/\t/; print if $_[2];' | cut -f 6 | sort -u -o icdo.txt
./transitive_closure.pl ../../Terminology/*_Relationships*txt out.txt
sort -t\ -k 1,1 -o out.txt out.txt
join -j 1 -o 2.2 icdo.txt out.txt > scope.txt
sort -u -o icdoScope.txt scope.txt icdo.txt
/bin/rm -f icdo.txt out.txt scope.txt
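The join and final sort steps above can be pictured on toy data. The following is only a minimal sketch, not the project's script: the function name expand_scope is hypothetical, and it assumes the closure file has two whitespace-separated columns with the join key in column 1 (the actual output layout of transitive_closure.pl is not documented here).

```shell
# expand_scope SEEDS CLOSURE   (hypothetical helper, for illustration only)
#   SEEDS:   one concept id per line (like icdo.txt)
#   CLOSURE: two whitespace-separated columns (assumed layout, like out.txt)
# Prints the sorted, de-duplicated union of the seed ids and column 2 of
# every closure row whose column 1 is a seed id.
expand_scope() {
  seeds_sorted=$(mktemp)
  closure_sorted=$(mktemp)

  # join needs both inputs sorted on the join field, in the same collation.
  LC_ALL=C sort -u "$1" > "$seeds_sorted"
  LC_ALL=C sort -k 1,1 "$2" > "$closure_sorted"

  {
    # Keep the seed ids themselves.
    cat "$seeds_sorted"
    # Add the second column of each closure row that matches a seed id.
    LC_ALL=C join -1 1 -2 1 -o 2.2 "$seeds_sorted" "$closure_sorted"
  } | LC_ALL=C sort -u

  rm -f "$seeds_sorted" "$closure_sorted"
}
```

For example, with seed ids A and C and closure rows "A X", "A Y" and "B Z", the function prints A, C, X and Y, one per line. Counting the lines of the real result with wc -l gives a quick check against the 1558 entries noted above.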
Preparing the Data
In lieu of obtaining official ICD-O-3.1 files from WHO, we proceeded with ICDO morphology data from the NCI Metathesaurus (available from NCI in the US). As we understand it, this is the correct ICDO version to map to, and it contains the actual morphology data.
Steps for processing the data:
- The /home/ihtsdo/data/ICDO directory (on PROD or UAT) contains the files with just the morphology code data:
icdo3.txt (see format below)
Code|Level|Term|Note|Code reference|obs|See also|See note|Includes|Excludes|Other text|comment_April_Fritz
|1|MORPHOLOGY|||||||||
800|2|Neoplasms, NOS|||||||||
801-804|2|Epithelial neoplasms, NOS|||||||||
chdPar.txt (a child → parent code list; sample entries below)
800|MORPHOLOGY
801-804|MORPHOLOGY
805-808|MORPHOLOGY
809-811|MORPHOLOGY
- The same directory has a conversion script (icdo3.pl) that produces a valid ClaML representation of the morphology codes (icdo-3-1.xml).
- This file can be loaded using the standard ClaML loader.
- In the same directory is a subset of the ICDO mapping file containing just the morphology entries.
- der2_sRefset_IcdoMorphSimpleMapSnapshot_INT_20160131.txt
- The mapping tool is loaded with only the morphology terminology and only the morphology mappings. RF2 release output from the tool can therefore simply be applied to the previous full release, since it will only indicate modifications to morphology codes. It is not necessary to load all of ICD-O to make this work.
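The two pipe-delimited input files described in the steps above can be inspected before running the conversion. This is a rough sketch, not part of icdo3.pl: the function names list_level and orphan_children are hypothetical, and it assumes icdo3.txt has a single header row with Code, Level and Term as its first three fields, and that chdPar.txt holds child|parent pairs.

```shell
# list_level FILE LEVEL   (hypothetical helper)
# Print "code<TAB>term" for every row of an icdo3.txt-style file at the given
# level, skipping the header row.
list_level() {
  awk -F'|' -v lvl="$2" 'NR > 1 && $2 == lvl { print $1 "\t" $3 }' "$1"
}

# orphan_children ICDO3 CHDPAR   (hypothetical helper)
# Print each child code from a chdPar.txt-style file that has no matching
# Code entry in the icdo3.txt-style file.
orphan_children() {
  awk -F'|' '
    NR == FNR { if (FNR > 1) known[$1] = 1; next }  # first file: remember codes
    !($1 in known) { print $1 }                     # second file: unknown children
  ' "$1" "$2"
}
```

Running orphan_children icdo3.txt chdPar.txt in the data directory and getting no output would suggest that every child code in the hierarchy file also has a term row, which is a cheap check to make before the ClaML file is generated and loaded.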
If official files are later obtained from WHO in a different format, work will be done (as part of warranty/maintenance) either to convert that format into suitable ClaML or to create a new loader for the native data format.
Production Deployment
- Load the ICDO-3-1 data
- Load the prior version ICDO map (as starting point)
- Create and configure the project (with scope definition based on icdoScope.txt "plus descendants").
- Use "REVIEW" workflow
- Configure Nicki/Donna as users
- Add basic productivity reports.
- Compute workflow
- Q: should we create a "published" project to represent last release state?
Publishing ICD-O
TBD
References/Links
- Initial files for ICDO load can be found in https://uat-mapping.ihtsdotools.org/doc/ICDO.zip