Overview
Information on the ICDO mapping project.
- ModuleId: 900000000000207008
- RefsetId: 446608001
Helpful Admin Commands
Remove
cd ~/code/admin/remover
mvn install -PMapRecords -D$rc -Drefset.id=446608001
mvn install -PTerminology -D$rc -Dterminology=ICDO -Dversion=16_1
Load
Get maintenance window
cd ~/code/admin/loader
mvn install -PICDO -Dinput.dir=/home/ihtsdo/data/ICDO -Dterminology=ICDO -Dversion=16_1 -D$rc
cd ~/code/admin/loader
set file = /home/ihtsdo/data/ICDO/der2_sRefset_IcdoMorphSimpleMapSnapshot_INT_20170131.txt
/bin/rm -f /tmp/x.txt
# Keep only morphology codes (with '/')
perl -ne '@_=split/\t/; print if $_[4] eq "446608001"' $file | grep '/' > /tmp/x.txt
mvn install -PSimpleMapRecords -D$rc -Dinput.file=/tmp/x.txt -Dmember.flag=true -Drecord.flag=false
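The perl/grep filter above can be hard to read at a glance. The following Python sketch mirrors its logic for illustration only: keep rows whose refsetId column (fifth tab-separated field of an RF2 simple map row) equals 446608001 and that contain a '/'. The function name and the sample rows are made up for this example and are not part of the mapping tool.

```python
REFSET_ID = "446608001"


def filter_morphology_members(lines, refset_id=REFSET_ID):
    """Keep rows for the given refset that contain a '/' (morphology codes)."""
    kept = []
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        # Index 4 is refsetId in an RF2 simple map row:
        # id, effectiveTime, active, moduleId, refsetId,
        # referencedComponentId, mapTarget
        if len(fields) > 4 and fields[4] == refset_id and "/" in line:
            kept.append(line)
    return kept


# Illustrative rows only (ids and component ids are invented).
sample_rows = [
    "id1\t20170131\t1\t900000000000207008\t446608001\t100000001\t8000/0\n",
    "id2\t20170131\t1\t900000000000207008\t446608001\t100000002\tC50.9\n",
    "id3\t20170131\t1\t900000000000207008\t999999999\t100000003\t8010/3\n",
]
morphology_rows = filter_morphology_members(sample_rows)
```

Like the original pipeline, the '/' check applies to the whole line rather than to the mapTarget column alone.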
Compute
cd ~/code/admin/loader
mvn install -PComputeWorkflow -D$rc -Drefset.id=446608001
Release
cd ~/code/admin/release
mvn install -PBeginRelease -D$rc -Drefset.id=446608001 -Dtest.mode.flag=true
mvn install -PRelease -D$rc -Drefset.id=446608001 -Doutput.dir=. -Dtime=20170731 -Dmodule.id=900000000000207008 -Dtest.mode.flag=true
mvn install -PFinishRelease -D$rc -Drefset.id=446608001
mvn install -PBeginEditingCycle -D$rc -Drefset.id=446608001
Release with Alpha/Beta Iteration
SNOMED is now typically reloaded on prod-mapping before the clone-over, so steps 2-5 should be skipped on the alpha iteration.
1.) Take the server down. Check the RAM allocated to MAVEN_OPTS with the export command. If necessary, adjust the server size and the allocated RAM with this command:
export MAVEN_OPTS="-XX:MaxPermSize=512m -Xmx3000M -Xmx7000M"
2.) cd ~/code/admin/remover
3.) mvn install -PTerminology -Drun.config=/home/ihtsdo/config/config.properties -Dterminology=SNOMEDCT -Dversion=latest
4.) cd ~/code/admin/loader
5.) mvn install -PRF2-snapshot -Drun.config=/home/ihtsdo/config/config.properties -Dterminology=SNOMEDCT -Dinput.dir=/home/ihtsdo/data/xSnomedCT_InternationalRF2_ALPHA_20170731T120000Z/Snapshot
6.) DELETE from simple_map_refset_members where refsetId = 446608001;
7.) set file = /home/ihtsdo/data/ICDO/der2_sRefset_SimpleMapSnapshot_INT_20170131.txt
# Keep only morphology codes (with '/')
perl -ne '@_=split/\t/; print if $_[4] eq "446608001"' $file | grep '/' > /tmp/icdo.txt
mvn install -PSimpleMapRecords -Drun.config=/opt/mapping-rest/config.properties -Dinput.file=/tmp/icdo.txt -Dmember.flag=true -Drecord.flag=false
8.) Bring the server back up.
9.) Rerun the compute workflow, begin release, and process release steps.
Loading "Human Readable" View
TBD
Updating Published Project
The approach is similar to ICD10 but uses the "simple" map loader, e.g.
cd /opt/mapping-admin/remover
mvn install -PMapRecords -Drun.config=/opt/mapping-rest/config.properties -Drefset.id=P446608001
cd /opt/mapping-admin/loader
grep 446608001 /opt/mapping-data/doc/release/SNOMEDCT_to_ICDO_446608001/20200731/der2_sRefset_SimpleMapActiveSnapshot_INT_20200731.txt | perl -pe 's/446608001/P446608001/;' > x.txt
mvn install -PSimpleMapRecords -Drun.config=/opt/mapping-rest/config.properties -Dinput.file=x.txt -Dmember.flag=false -Drecord.flag=true >&! mvn.log
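The grep/perl step rewrites the refset id to its "published" form (P446608001) wherever the id first appears on a line. As a hedged illustration, the sketch below does the same rewrite but only on the refsetId column, which is slightly stricter than the pipeline above. The function name and sample rows are hypothetical.

```python
def to_published_rows(lines, refset_id="446608001", prefix="P"):
    """Return rows of the given refset with the refsetId prefixed (e.g. P446608001)."""
    published = []
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        # Only touch rows whose refsetId column (index 4) matches exactly.
        if len(fields) > 4 and fields[4] == refset_id:
            fields[4] = prefix + refset_id
            published.append("\t".join(fields) + "\n")
    return published


# Illustrative rows only (ids and component ids are invented).
release_rows = [
    "id1\t20200731\t1\t900000000000207008\t446608001\t100000001\t8000/0\n",
    "id2\t20200731\t1\t900000000000207008\t999999999\t100000002\t8010/3\n",
]
published_rows = to_published_rows(release_rows)
```

Restricting the substitution to one column avoids accidentally rewriting the same digit sequence elsewhere in a row, at the cost of assuming the standard RF2 column order.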
Meeting Notes
Dev Requested
- Ability to export map (suggestion is to support export from search results; there should be a ticket for this already)
- Auto-mapping capability (as part of compute workflow?)
- exact match handler → e.g. support auto mapping where strings exactly match (e.g. without semantic tag)
- Support feedback from Resolved→Editing in workflows (e.g. REVIEW_RESOLVED to REVIEW_IN_PROGRESS) via SAVE_FOR_LATER
- Support for optional codes after the slash, e.g. things not explicitly in the ICDO terminology but allowed, such as /6 where only /3 exists
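To make the "exact match handler" request more concrete, here is a minimal sketch of one way it could work, not a description of anything implemented in the tool. It assumes the semantic tag is the trailing parenthetical of a SNOMED CT fully specified name, that matching is case-insensitive, and that ambiguous matches are left for a human. All function names are hypothetical.

```python
import re

# Trailing "(...)" at the end of a fully specified name, e.g. "(morphologic abnormality)".
_SEMANTIC_TAG = re.compile(r"\s*\([^()]*\)\s*$")


def strip_semantic_tag(fsn):
    """Remove a trailing semantic tag from a fully specified name."""
    return _SEMANTIC_TAG.sub("", fsn).strip()


def build_term_index(icdo_terms):
    """Index (code, term) pairs by case-folded term."""
    index = {}
    for code, term in icdo_terms:
        index.setdefault(term.strip().casefold(), []).append(code)
    return index


def exact_match(fsn, index):
    """Return the single ICD-O code whose term equals the name, else None."""
    codes = index.get(strip_semantic_tag(fsn).casefold(), [])
    # Assumption: only auto-map when exactly one code matches.
    return codes[0] if len(codes) == 1 else None


# Illustrative terms only.
term_index = build_term_index([
    ("8000/0", "Neoplasm, benign"),
    ("8000/3", "Neoplasm, malignant"),
])
suggested = exact_match("Neoplasm, malignant (morphologic abnormality)", term_index)
```

Returning None for zero or multiple matches keeps the handler conservative, so anything uncertain still goes through the normal editing workflow.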
Details
Computing Scope
Use the attached transitive_closure.pl script. The file icdoScope.txt is attached (1558 entries). A file of those unmapped codes is also attached.
cd Snapshot/Refset/Map
grep 446608001 *Simple* | grep '/' | perl -ne '@_=split/\t/; print if $_[2];' | cut -f 6 | sort -u -o icdo.txt
./transitive_closure.pl ../../Terminology/*_Relationships*txt out.txt
sort -t\ -k 1,1 -o out.txt out.txt
join -j 1 -o 2.2 icdo.txt out.txt > scope.txt
sort -u -o icdoScope.txt scope.txt icdo.txt
/bin/rm -f icdo.txt out.txt scope.txt
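The attached transitive_closure.pl is not reproduced on this page, so the following is only a sketch of the general idea under stated assumptions: the scope is taken to be each mapped concept plus its descendants over active is-a relationships (typeId 116680003) in the RF2 relationship file. If the script's output columns run the other way (ancestors rather than descendants), the edge direction below would need to be swapped. Function names are hypothetical.

```python
from collections import defaultdict, deque

IS_A = "116680003"  # SNOMED CT "Is a" relationship type


def read_isa_children(lines):
    """Build a parent -> {children} map from RF2 relationship rows."""
    children = defaultdict(set)
    for line in lines:
        f = line.rstrip("\r\n").split("\t")
        # RF2 relationship columns: id, effectiveTime, active, moduleId,
        # sourceId, destinationId, relationshipGroup, typeId, ...
        if len(f) >= 8 and f[2] == "1" and f[7] == IS_A:
            children[f[5]].add(f[4])  # source (child) is-a destination (parent)
    return children


def with_descendants(seeds, children):
    """Return the seed concepts plus all of their descendants."""
    seen = set(seeds)
    queue = deque(seen)
    while queue:
        concept = queue.popleft()
        for child in children.get(concept, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Illustrative relationships only (letters stand in for concept ids).
sample_relationships = [
    "id\teffectiveTime\tactive\tmoduleId\tsourceId\tdestinationId\trelationshipGroup\ttypeId\tcharacteristicTypeId\tmodifierId",
    "r1\t20170131\t1\tm\tB\tA\t0\t116680003\tc\tq",
    "r2\t20170131\t1\tm\tC\tB\t0\t116680003\tc\tq",
    "r3\t20170131\t0\tm\tD\tA\t0\t116680003\tc\tq",
    "r4\t20170131\t1\tm\tE\tA\t0\t363698007\tc\tq",
]
scope = with_descendants({"A"}, read_isa_children(sample_relationships))
```

A breadth-first walk from the mapped concepts avoids materializing the full closure for all of SNOMED CT when only the scope of a few thousand seeds is needed.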
Preparing the Data
In lieu of obtaining official ICD-O-3.1 files from WHO, we proceeded with ICDO morphology data from the NCI Metathesaurus (available from NCI in the US). As we understand it, this is the correct ICDO version to map to and contains the real data.
Steps for processing the data:
- The /home/ihtsdo/data/ICDO directory (on PROD or UAT) contains the files with just the morphology code data.
icdo3.txt (see format below)
Code|Level|Term|Note|Code reference|obs|See also|See note|Includes|Excludes|Other text|comment_April_Fritz
|1|MORPHOLOGY|||||||||
800|2|Neoplasms, NOS|||||||||
801-804|2|Epithelial neoplasms, NOS|||||||||
chdPar.txt (sample entries below, it's a child → parent code list).
800|MORPHOLOGY
801-804|MORPHOLOGY
805-808|MORPHOLOGY
809-811|MORPHOLOGY
- The same directory has a conversion script (icdo3.pl) that produces a valid ClaML representation of the morphology codes (icdo-3-1.xml).
- This file can be loaded using the standard ClaML loader.
- In the same directory is a subset of the ICDO mapping file containing just the morphology entries.
- der2_sRefset_IcdoMorphSimpleMapSnapshot_INT_20160131.txt
- The mapping tool is loaded with just morphology terminology and just morphology mappings. Thus, RF2 release output from the tool can be simply applied to the previous full as it will only indicate modifications to morphology codes. Thus, it is not necessary to load all of ICD-O in order to make this work.
At such time as official files are obtained from WHO and are in a different format, work will be done (as part of warranty/maintenance) to either convert that format into suitable ClaML or a new loader will be created for the native data format.
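For anyone who needs to inspect the two input files (icdo3.txt and chdPar.txt) outside of icdo3.pl, the sketch below shows one way to read them. It assumes only what the samples above show: both files are pipe-delimited, icdo3.txt has a header row, and chdPar.txt holds child|parent pairs. The function names are hypothetical and this is not the conversion script itself.

```python
def parse_icdo3(lines):
    """Yield one dict per data row of icdo3.txt, keyed by the header names."""
    header = None
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("|")
        if header is None:
            # First non-empty line is the header row.
            header = [name.strip() for name in fields]
            continue
        yield dict(zip(header, fields))


def parse_child_parent(lines):
    """Return a child -> parent dict from chdPar.txt lines."""
    parents = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        child, parent = line.split("|", 1)
        parents[child] = parent
    return parents


# Sample lines copied from the format descriptions above.
icdo3_sample = [
    "Code|Level|Term|Note|Code reference|obs|See also|See note|Includes|Excludes|Other text|comment_April_Fritz",
    "|1|MORPHOLOGY|||||||||",
    "800|2|Neoplasms, NOS|||||||||",
]
chd_par_sample = ["800|MORPHOLOGY", "801-804|MORPHOLOGY"]

icdo3_rows = list(parse_icdo3(icdo3_sample))
parent_of = parse_child_parent(chd_par_sample)
```

Note that the top-level MORPHOLOGY row has an empty Code field, so any consumer should key on Term or Level for that row.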
Production Deployment
- Load the ICDO-3-1 data
- Create and configure the project (with scope definition based on icdoScope.txt "plus descendants").
- Use "REVIEW" workflow
- Configure Nicki/Donna as users
- Add basic productivity reports.
- Load the prior version ICDO map (as starting point) - both records and members.
- Begin editing cycle
- Compute workflow
- Create a "published" form of the project as well
Here's the process in code:
service tomcat7 stop
# 1. load ICDO-3-1 data
cd ~/code/admin/loader
set file = ~/data/ICDO/icdo-3-1.xml
mvn install -PClaML -D$rc -Dterminology=ICDO -Dversion=3_1 -Dinput.file=$file >&! mvn.log
# 2. do in mapping tool
service tomcat7 start
...
service tomcat7 stop
# 3. Load the prior version ICDO map
cd ~/code/admin/loader
set file = ~/data/ICDO/der2_sRefset_IcdoMorphSimpleMapSnapshot_INT_20160131.txt
mvn install -PSimpleMapRecords -D$rc -Dinput.file=$file -Dmember.flag=true -Drecord.flag=true >&! mvn.log
# 4. Begin editing cycle
cd ~/code/admin/release
mvn install -PBeginEditingCycle -D$rc -Drefset.id=446608001 >&! mvn.log
# 5. Compute workflow
cd ~/code/admin/loader
mvn install -PComputeWorkflow -D$rc -Drefset.id=446608001 >&! mvn.log
# load published project data
cd ~/code/admin/loader
set file = ~/data/ICDO/der2_sRefset_IcdoMorphSimpleMapSnapshot_INT_20160131.txt
perl -pe 's/446608001/P446608001/;' $file >! x.txt
mvn install -PSimpleMapRecords -Drun.config=/home/ihtsdo/config/config.properties -Dinput.file=x.txt -Dmember.flag=false -Drecord.flag=true >&! mvn.log
service tomcat7 start
References/Links
- Initial files for ICDO load can be found in https://uat-mapping.ihtsdotools.org/doc/ICDO.zip
3 Comments
Donna Morgan
So what is the next step? Will the unmapped concepts be loaded into a project within the mapping tool? Do we need further discussion around the workflow?
Yongsheng Gao
A few things to consider for the next step:
Donna Morgan
Okay thanks Yong....as soon as we are ready to progress then please let me know.