SEP maps
The SEP mapping table was developed in MySQL for the anatomy redesign. The table contains associations between Structure (S), Entire (E) and Part (P) concepts. The following is an example of the key columns in the SEP mapping table.
SEPID | Basename | S_sctid | S_name | E_sctid | E_name | P_sctid | P_name |
---|---|---|---|---|---|---|---|
146 | hand | 85562004 | Hand structure (body structure) | 302539009 | Entire hand (body structure) | 120577001 | Hand part (body structure) |
225 | finger | 7569003 | Finger structure (body structure) | 302541005 | Entire finger (body structure) | | |
9 | digestive tract | 51289009 | Digestive tract structure (body structure) | | | 709519000 | Digestive tract part (body structure) |
The mapping table provides some key functions to help achieve the objectives listed for the SEP assignment project:
- Represent the association between S, E and P concepts
- Identify missing concepts
- Ensure concept modeling for appropriate S, E, and P concepts
- Identify concept types (S, E, P) by concept id
- Serve as one of the three tables used for conversion to the OWL file
- Apply the naming convention to all S, E and P concepts
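Two of the functions listed above, identifying the concept type by concept id and identifying missing concepts, can be sketched in a few lines. This is only an illustrative sketch: the row layout, the function names `concept_type_index` and `missing_concepts`, and the use of `None` for an empty cell are assumptions, not the actual MySQL schema. The identifiers come from the example rows of the mapping table.

```python
# Example rows from the mapping table; None marks an empty S, E or P cell (assumed encoding).
ROWS = [
    {"sepid": 146, "basename": "hand", "S": "85562004", "E": "302539009", "P": "120577001"},
    {"sepid": 225, "basename": "finger", "S": "7569003", "E": "302541005", "P": None},
    {"sepid": 9, "basename": "digestive tract", "S": "51289009", "E": None, "P": "709519000"},
]

KINDS = ("S", "E", "P")


def concept_type_index(rows):
    """Map every concept id in the table to its type: 'S', 'E' or 'P'."""
    index = {}
    for row in rows:
        for kind in KINDS:
            if row[kind] is not None:
                index[row[kind]] = kind
    return index


def missing_concepts(rows):
    """Return {basename: [missing types]} for rows that lack an S, E or P concept."""
    gaps = {}
    for row in rows:
        missing = [kind for kind in KINDS if row[kind] is None]
        if missing:
            gaps[row["basename"]] = missing
    return gaps


if __name__ == "__main__":
    print(concept_type_index(ROWS).get("302539009"))
    print(missing_concepts(ROWS))
```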
SEP reference sets
The key information in the SEP mapping table can be represented formally as a refset in RF2 release format. In the production release, the refset will provide services similar to those the mapping table provides during development. In addition, it will enable us to set more detailed MRCM rules for quality assurance. For example, some disorders or procedures should be modelled with E concepts, e.g. amputation, abnormal shortening, enlargement. In MRCM rules, the finding site or procedure site can be restricted to the E concepts in the SEP refsets by an expression such as the following in a future SNOMED CT Query Language or ECL (Expression Constraint Language).
^1000005|Anatomy structure and entire association reference set|.900000000000533001 |Association target component|
There are different options for the reference set type. The Association Reference Set was chosen for the SEP refset.
- Simple Reference Set - The S, E and P concepts can be represented by three separate Simple Reference Sets. However, the associations among them cannot be represented.
- Simple Map Reference Set - The map target is codes from other terminology, classification or code system. The data type for mapTarget is string. Therefore, it is not suitable for association between SNOMED CT concepts.
- Association Reference Set - Represents a set of unordered associations of a particular type between components. Since the Association Reference Set only allows a single target component, the SEP associations are represented by two Association Reference Sets. The association between Entire and Part can be derived by joining the two refsets on their shared S concept.
  - Anatomy Structure and Entire Association Reference Set - The S concept is the referenced component and the E concept is the target component, e.g. refset ID = 734138000
  - Anatomy Structure and Part Association Reference Set - The S concept is the referenced component and the P concept is the target component, e.g. refset ID = 734139008
For example:
der2_sRefset_AssociationReferenceSnapshot_INT_20170731.txt
id | effectiveTime | active | moduleId | refsetId | referencedComponentId | targetComponentId |
---|---|---|---|---|---|---|
54553d8e-36c9-4004-a1bd-594319f77609 | 20170731 | 1 | 900000000000207008 | 734138000 | 85562004 | 302539009 |
40d092d9-1bd0-4f3a-9045-d284c928763d | 20170731 | 1 | 900000000000207008 | 734139008 | 85562004 | 120577001 |
64069576-bf3d-4249-99ee-fbcfad5a6f54 | 20170731 | 1 | 900000000000207008 | 734138000 | 7569003 | 302541005 |
17f4df92-1601-4426-acec-cf4fca8c86cf | 20170731 | 1 | 900000000000207008 | 734139008 | 51289009 | 709519000 |
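As a rough illustration of how the Entire and Part association could be derived by aggregating the two refsets, the sketch below groups the example rows by their shared S concept. The tuple layout and the function names `build_sep_triples` and `entire_part_pairs` are assumptions made for this example. A real implementation would read the tab-delimited RF2 file rather than a hard-coded list.

```python
from collections import defaultdict

SE_REFSET_ID = "734138000"  # Anatomy Structure and Entire Association Reference Set
SP_REFSET_ID = "734139008"  # Anatomy Structure and Part Association Reference Set

# (active, refsetId, referencedComponentId, targetComponentId) from the example rows.
ROWS = [
    ("1", "734138000", "85562004", "302539009"),
    ("1", "734139008", "85562004", "120577001"),
    ("1", "734138000", "7569003", "302541005"),
    ("1", "734139008", "51289009", "709519000"),
]


def build_sep_triples(rows):
    """Group active members by S concept: {S: {"E": id or None, "P": id or None}}."""
    triples = defaultdict(lambda: {"E": None, "P": None})
    for active, refset_id, s_id, target_id in rows:
        if active != "1":
            continue  # skip inactive members
        if refset_id == SE_REFSET_ID:
            triples[s_id]["E"] = target_id
        elif refset_id == SP_REFSET_ID:
            triples[s_id]["P"] = target_id
    return dict(triples)


def entire_part_pairs(triples):
    """Return (E, P) pairs for every S concept that has both associations."""
    return [(t["E"], t["P"]) for t in triples.values() if t["E"] and t["P"]]


if __name__ == "__main__":
    print(entire_part_pairs(build_sep_triples(ROWS)))
```

With the four example rows, only the hand structure has both an E and a P association, so it is the only S concept that yields an Entire and Part pair.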
The scope for SEP refsets
The SE refset (Anatomy Structure and Entire Association Reference Set) includes all associations between Structure and Entire concepts in the body structure hierarchy.
The SP refset (Anatomy Structure and Part Association Reference Set) includes all associations between Structure and Part concepts in the body structure hierarchy.
The method for construction and review
Construction of the refsets
The first step is to generate a baseline of the SE refset by converting the IS A relationships between E concepts and S concepts. Under the SEP model, each S must be the parent of its corresponding E, and all descriptions of an E concept must contain the word ‘entire’. Therefore, the S and E matches are identified from the IS A relationships of the E concepts in the relationship table.
The second step is to identify the proper S and E matches from the baseline. There should be one E for each S, and vice versa. However, some E concepts have more than one IS A relationship to S concepts. These have been reviewed to identify the proper S and E match for inclusion in the SE refset. The review results can be found in the lists sep_es_match_duplicate_review.txt and sep_es_match_duplicate_review1.txt.
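The two steps described so far can be sketched as follows. This is a minimal illustration under stated assumptions, not the script that was actually used. The input structures (`is_a_pairs` as child and parent id pairs, `descriptions` as a map from concept id to its terms) and all function names are hypothetical, as are the placeholder ids prefixed with `hypothetical-`.

```python
from collections import defaultdict


def is_entire(terms):
    """An E concept is assumed to have at least one term, all containing 'entire'."""
    return bool(terms) and all("entire" in term.lower() for term in terms)


def baseline_matches(is_a_pairs, descriptions):
    """Step 1: map each E concept to the set of parents in its IS A relationships."""
    parents = defaultdict(set)
    for child_id, parent_id in is_a_pairs:
        if is_entire(descriptions.get(child_id, [])):
            parents[child_id].add(parent_id)
    return dict(parents)


def split_for_review(parents):
    """Step 2: separate E concepts with one parent from those needing manual review."""
    unique = {e: next(iter(s)) for e, s in parents.items() if len(s) == 1}
    review = {e: sorted(s) for e, s in parents.items() if len(s) > 1}
    return unique, review


if __name__ == "__main__":
    descriptions = {
        "302539009": ["Entire hand (body structure)", "Entire hand"],
        "hypothetical-E": ["Entire X (body structure)"],
    }
    is_a_pairs = [
        ("302539009", "85562004"),
        ("hypothetical-E", "hypothetical-S1"),
        ("hypothetical-E", "hypothetical-S2"),
    ]
    unique, review = split_for_review(baseline_matches(is_a_pairs, descriptions))
    print(unique)
    print(review)
```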
The majority of E concepts have only one S in the IS A relationship table. They have been validated by lexical matching to ensure proper S and E matches in the refset. The naming convention for S is “Structure of X” or “X structure”. The convention for E is “Entire X”. An exact lexical match is checked after removing “Entire”, “Structure” and “Structure of” from the FSN. Lexical matching also identified some additional associations that had not been covered in the baseline. The pairs of S and E with an exact lexical match are included in the SE refset.
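The lexical check could look roughly like the sketch below. It assumes that the semantic tag, such as "(body structure)", is stripped before comparison and that matching is case-insensitive. Neither detail is stated in the text, and the function names `base_name` and `is_lexical_match` are hypothetical.

```python
import re

# Trailing semantic tag of an FSN, e.g. " (body structure)".
SEMANTIC_TAG = re.compile(r"\s*\([^()]*\)\s*$")


def base_name(fsn):
    """Reduce an FSN to its base name by removing the tag and the S or E naming words."""
    name = SEMANTIC_TAG.sub("", fsn).strip().lower()
    if name.startswith("entire "):
        name = name[len("entire "):]
    if name.startswith("structure of "):
        name = name[len("structure of "):]
    elif name.endswith(" structure"):
        name = name[: -len(" structure")]
    return name.strip()


def is_lexical_match(s_fsn, e_fsn):
    """True when the S and E FSNs share the same base name."""
    return base_name(s_fsn) == base_name(e_fsn)


if __name__ == "__main__":
    print(is_lexical_match("Hand structure (body structure)",
                           "Entire hand (body structure)"))
    print(is_lexical_match("Hand structure (body structure)",
                           "Entire finger (body structure)"))
```

Pairs for which `is_lexical_match` returns false would go to manual review rather than being rejected outright.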
The pairs of S and E that do not have an exact lexical match have been reviewed, and the proper S and E matches among them are included in the refset. The outcomes of the review can be found in the file sep_refset_es_match_review.txt.
Review and quality assurance
The SE refset has been crosschecked with the output file from the technical team and the anatomy SEP mapping table. The differences are reviewed and changes are made accordingly.
The following technical quality assurance has been performed on the refset.
1. Each referencedComponentId of S should only have one targetComponentId of E
2. There is no duplicate association of S and E
3. The referencedComponentId must not be the same as the targetComponentId
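The three checks above can be expressed as a small validation routine. The sketch below is illustrative only and is not the technical team's actual QA tooling. It assumes the members of one refset are available as (referencedComponentId, targetComponentId) pairs, and the function name `qa_issues` and the issue labels are hypothetical.

```python
from collections import Counter, defaultdict


def qa_issues(pairs):
    """Return a list of (issue, referencedComponentId, detail) tuples for one refset."""
    issues = []
    targets = defaultdict(set)
    for referenced_id, target_id in pairs:
        targets[referenced_id].add(target_id)
        # Check 3: a concept must not be associated with itself.
        if referenced_id == target_id:
            issues.append(("self_reference", referenced_id, target_id))
    # Check 1: each referenced S concept has only one target.
    for referenced_id, target_ids in targets.items():
        if len(target_ids) > 1:
            issues.append(("multiple_targets", referenced_id, tuple(sorted(target_ids))))
    # Check 2: no association appears more than once.
    for (referenced_id, target_id), count in Counter(pairs).items():
        if count > 1:
            issues.append(("duplicate", referenced_id, target_id))
    return issues


if __name__ == "__main__":
    se_pairs = [("85562004", "302539009"), ("7569003", "302541005")]
    print(qa_issues(se_pairs))
```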
The SP refset has followed the same construction and review process.
A member of the content team also reviewed the final refsets. In addition, the technical team performed quality assurance from a referential and structural content perspective.
The planned future work
We have already seen that the refset helps to identify content issues.
Regarding the completeness of the SEP model, we identified about 500 E concepts that do not have associations to S during the construction of the refset. This is a content gap rather than a quality issue of the SEP refsets. The new associations will be included when the new S concepts are added.
Furthermore, the refset has helped to identify inconsistent naming, demonstrated by descriptions that do not have a lexical match. These will be addressed in a future release.