Introduction
The International Edition of SNOMED CT contains three release types (Full, Snapshot and Delta), and each of these release types includes 20 files (2019-07-31 release). Each of those files has a distinct structure representing either a type of component (e.g. concept, relationship or description) or a type of reference set. SNOMED CT Extensions contain additional files, and most of these conform to the same structure as one of the International Edition release files (see note 1).
When designing a database to accommodate SNOMED CT release files, decisions need to be made about the names to give to each of the database tables. One option is to give the tables exactly the same names as the release files they represent. However, analysis of the release file naming conventions indicates that these conventions are not directly applicable to table names.
SNOMED CT release file naming conventions include some elements that represent information about the provenance, language and release date of a specific file. This information is useful, and in some cases essential, as a way of distinguishing release files. However, it is neither essential nor helpful when naming tables that may contain data from different SNOMED CT versions, editions and extensions. The release file naming conventions do, however, include some essential elements that relate directly to the nature and structure of the data the files contain. The following sections summarize the release file naming conventions, identify the elements in release file names that are relevant to database table naming, and describe a set of rules that can be applied to derive consistent table names from release file names.
Release File Naming
All SNOMED CT release files are named in accordance with the 3.3.2 Release File Naming Convention. The naming conventions result in names that can be decomposed into parts, as illustrated by the examples in the following table.
Description of the pattern or file illustrated | Example release file names |
---|---|
General pattern (see note 2) | prefix_[refsetPattern]componentType_[refsetType][extensionName]releaseType[-language]_country_releaseDate.txt |
International Edition full release concepts file for 2019-07-31 | sct2_Concept_Full_INT_20190731.txt |
International Edition snapshot release English descriptions file | sct2_Description_Snapshot-en_INT_20190731.txt |
Spanish extension full release Spanish descriptions file | sct2_Description_SpanishExtensionFull-es_INT_20190430.txt |
International Edition snapshot release extended maps reference set file | der2_iisssccRefset_ExtendedMapSnapshot_INT_20190731.txt |
Spanish extension full release Spanish language reference set file | der2_cRefset_LanguageSpanishExtensionFull-es_INT_20190430.txt |
International Edition snapshot release English language reference set file | der2_cRefset_LanguageSnapshot-en_INT_20190731.txt |
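As a rough illustration of how these names decompose, the following Python sketch splits a release file name into the elements of the general pattern. The regular expression, the `ReleaseFileName` type and the `KNOWN_EXTENSIONS` list are assumptions made for this example and are not part of the SNOMED CT specification. In particular, the file name itself does not mark where refsetType ends and extensionName begins, so the sketch relies on a list of known extension names.

```python
import re
from typing import NamedTuple, Optional

# Assumption: extension names must be known in advance, because the file
# name does not delimit refsetType from extensionName.
KNOWN_EXTENSIONS = ("SpanishExtension",)


class ReleaseFileName(NamedTuple):
    prefix: str
    refset_pattern: str
    component_type: str
    refset_type: str
    extension_name: str
    release_type: str
    language: Optional[str]
    country: str
    release_date: str


# One named group per element of the general pattern; "content" holds
# [refsetType][extensionName], which is split in a second step.
_FILE_RE = re.compile(
    r"^(?P<prefix>[a-z]+\d)_"
    r"(?P<refset_pattern>[a-z]*)(?P<component_type>[A-Z][A-Za-z]*)_"
    r"(?P<content>[A-Za-z]*?)(?P<release_type>Full|Snapshot|Delta)"
    r"(?:-(?P<language>[A-Za-z-]+))?_"
    r"(?P<country>[A-Za-z0-9]+)_"
    r"(?P<release_date>\d{8})\.txt$"
)


def parse_release_file_name(file_name: str) -> ReleaseFileName:
    """Split a release file name into its naming-convention elements."""
    m = _FILE_RE.match(file_name)
    if m is None:
        raise ValueError(f"not a recognised release file name: {file_name}")
    refset_type, extension_name = m["content"], ""
    for ext in KNOWN_EXTENSIONS:
        if refset_type.endswith(ext):
            refset_type, extension_name = refset_type[: -len(ext)], ext
            break
    return ReleaseFileName(
        prefix=m["prefix"],
        refset_pattern=m["refset_pattern"],
        component_type=m["component_type"],
        refset_type=refset_type,
        extension_name=extension_name,
        release_type=m["release_type"],
        language=m["language"],  # None when the file name has no language part
        country=m["country"],
        release_date=m["release_date"],
    )


if __name__ == "__main__":
    print(parse_release_file_name(
        "der2_cRefset_LanguageSpanishExtensionFull-es_INT_20190430.txt"
    ))
```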
File Name Element Relevance to Table Names
The following table identifies the elements of the release file naming pattern that are relevant to the naming of the database tables containing content from those files. It also outlines the reasons why some elements that form an important part of the release file names can or should be omitted from the relevant database table names.
Filename Element | Relevant to Table Name | Explanation |
---|---|---|
prefix | No | The prefix sct2 or der2 distinguishes components from derivatives (refsets). This information is present in the componentType and refsetType. |
refsetPattern | No | This information relates to the datatypes of additional columns in the file and the table. The table structure includes the required columns so there is no reason to include this in the table name. |
componentType | Yes | This is essential as it indicates either the type of components represented in the table or that the table represents a reference set. |
refsetType | Yes | This is essential to distinguish the tables representing different reference set types. It is not present in the names of component files. |
extensionName | No | This is not required as data from extension files should be included in the same tables as the equivalent data from the international release. Individual records maintained in extensions can be distinguished by moduleId. |
releaseType | Yes | This is essential if importing data from both the full and snapshot release. However, since this is a fundamental grouping, it is probably sensible for this to be a prefix to the table name. Otherwise with long table names this key distinction may be easier to miss. A short prefix denoting release types with a convention that also allows database views to be named in a similar consistent manner is recommended. |
language | No | This is not applicable to the description table name. All descriptions should be accommodated in a single table with the languageCode column indicating the language of the associated term. Similarly, it is not applicable to a language reference set table name. All language reference sets should be accommodated in a single table with the refsetId column indicating the language and dialect of each language preference. |
country | No | This is not required in the table name as the country or other point of origin of the components and reference set members is indicated by the moduleId. |
releaseDate | No | This is not required as data from many releases is included in the full release file tables. In the case of the snapshot, it would be possible to include the date of the snapshot in the table name. However, this is not recommended because, as noted in 4.2. Release Type Options, multiple sets of tables representing different snapshot releases multiply the required storage capacity. |
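The reasoning for omitting language, extensionName and country can be sketched with a small in-memory SQLite example in Python. The column names follow the description release file. The table name `snap_description`, the helper functions and every row value (identifiers, module names and terms) are placeholders invented for this illustration and are not real release data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One description table for all languages, editions and extensions.
# Identifier columns are TEXT here only to allow readable placeholder values.
conn.execute(
    """
    CREATE TABLE snap_description (
        id                 TEXT PRIMARY KEY,
        effectiveTime      TEXT NOT NULL,
        active             INTEGER NOT NULL,
        moduleId           TEXT NOT NULL,
        conceptId          TEXT NOT NULL,
        languageCode       TEXT NOT NULL,
        typeId             TEXT NOT NULL,
        term               TEXT NOT NULL,
        caseSignificanceId TEXT NOT NULL
    )
    """
)

# Placeholder rows: one as if loaded from an international file, one as if
# loaded from a Spanish extension file.
rows = [
    ("d1", "20190731", 1, "module-int", "c1", "en", "t1", "example term", "cs1"),
    ("d2", "20190430", 1, "module-es", "c1", "es", "t1", "ejemplo", "cs1"),
]
conn.executemany(
    "INSERT INTO snap_description VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows
)


def terms_by_language(language_code: str) -> list:
    """Select terms by the languageCode column rather than by table name."""
    cur = conn.execute(
        "SELECT term FROM snap_description WHERE languageCode = ? ORDER BY id",
        (language_code,),
    )
    return [row[0] for row in cur]


def terms_by_module(module_id: str) -> list:
    """Select terms by moduleId to isolate one edition or extension."""
    cur = conn.execute(
        "SELECT term FROM snap_description WHERE moduleId = ? ORDER BY id",
        (module_id,),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    print(terms_by_language("es"))
    print(terms_by_module("module-int"))
```

Because the language and origin of each row are carried in columns, a single table serves data loaded from any number of language-specific or extension-specific files.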
Deriving Table Names from Release File Names
The analysis in the preceding table identifies three elements in the release file name that are relevant to table names. There are various ways in which table names could be derived by combining these elements, and one of these is shown in the table of derivation steps below. The end result (shown in the final table of this section) is a set of table names that:
- Are as short as possible while clearly identifying:
  - The release type from which they are derived
  - The component or reference set type specification to which they conform
- Are not specific to a particular SNOMED CT release or edition.
Note
The rules shown here are those applied to the example SNOMED CT database. Alternative table naming patterns may be preferred by those developing their own SNOMED CT database. However, it is important to ensure that the table naming pattern is consistently applicable to all release files. It should also be readily applicable to any additional reference set types that may be added to future releases of the International Edition (or included in other SNOMED CT editions or extensions).
Start with file name pattern | prefix_[refsetPattern]componentType_[refsetType][extensionName]releaseType[-language]_country_releaseDate.txt |
Remove elements that are not required | componentType[_refsetType]releaseType |
Make release type the prefix | releaseType_componentType[_refsetType] |
Abbreviate the prefix to 4 characters (full or snap) | rtyp_componentType[_refsetType] |
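The derivation steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `table_name`, the regular expression and the `KNOWN_EXTENSIONS` list are invented for the example, only the `full` and `snap` prefixes named above are handled, and the letter case of the resulting names simply follows the file name, so it may differ from the names used in the example database.

```python
import re

# Assumption: extension names are known in advance so they can be removed.
KNOWN_EXTENSIONS = ("SpanishExtension",)

# Four-character prefixes for the release types (full or snap).
RELEASE_PREFIX = {"Full": "full", "Snapshot": "snap"}

# Captures only the three elements relevant to table names; the prefix,
# refsetPattern, language, country and releaseDate are matched and discarded.
_NAME_RE = re.compile(
    r"^[a-z]+\d_[a-z]*(?P<component_type>[A-Z][A-Za-z]*)_"
    r"(?P<content>[A-Za-z]*?)(?P<release_type>Full|Snapshot)"
    r"(?:-[A-Za-z-]+)?_[A-Za-z0-9]+_\d{8}\.txt$"
)


def table_name(file_name: str) -> str:
    """Derive rtyp_componentType[_refsetType] from a release file name."""
    m = _NAME_RE.match(file_name)
    if m is None:
        raise ValueError(f"cannot derive a table name from: {file_name}")
    refset_type = m["content"]
    for ext in KNOWN_EXTENSIONS:
        if refset_type.endswith(ext):
            refset_type = refset_type[: -len(ext)]  # drop extensionName
            break
    parts = [RELEASE_PREFIX[m["release_type"]], m["component_type"]]
    if refset_type:
        parts.append(refset_type)
    return "_".join(parts)


if __name__ == "__main__":
    print(table_name("sct2_Concept_Full_INT_20190731.txt"))
    # full_Concept
    print(table_name("der2_iisssccRefset_ExtendedMapSnapshot_INT_20190731.txt"))
    # snap_Refset_ExtendedMap
    print(table_name("der2_cRefset_LanguageSpanishExtensionFull-es_INT_20190430.txt"))
    # full_Refset_Language
```

Note that the English and Spanish language reference set files map to the same table name, which is the intended effect of dropping the language and extension elements.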
Release File Names | Corresponding Table Names in the Example Database |
---|---|
sct2_Concept_Full_INT_20190731.txt | |
sct2_Description_Full-en_INT_20190731.txt | |
der2_cRefset_AssociationFull_INT_20190731.txt | |
der2_cRefset_AttributeValueFull_INT_20190731.txt | |
der2_ciRefset_DescriptionTypeFull_INT_20190731.txt | |
der2_iisssccRefset_ExtendedMapFull_INT_20190731.txt | |
der2_cRefset_LanguageFull-en_INT_20190731.txt | |
der2_ssRefset_ModuleDependencyFull_INT_20190731.txt | |
der2_cissccRefset_MRCMAttributeDomainFull_INT_20190731.txt | |
der2_ssccRefset_MRCMAttributeRangeFull_INT_20190731.txt | |
der2_sssssssRefset_MRCMDomainFull_INT_20190731.txt | |
der2_cRefset_MRCMModuleScopeFull_INT_20190731.txt | |
sct2_sRefset_OWLExpressionFull_INT_20190731.txt | |
der2_cciRefset_RefsetDescriptorFull_INT_20190731.txt | |
der2_Refset_SimpleFull_INT_20190731.txt | |
der2_sRefset_SimpleMapFull_INT_20190731.txt | |
sct2_Relationship_Full_INT_20190731.txt | |
sct2_StatedRelationship_Full_INT_20190731.txt | |
sct2_TextDefinition_Full-en_INT_20190731.txt | |
sct2_Concept_Snapshot_INT_20190731.txt | |
sct2_Description_Snapshot-en_INT_20190731.txt | |
... list continues for all Snapshot release files | ... list continues for all the snap_ tables |
Ref | Notes |
---|---|
1 | A few files in an extension may conform to a reference set that has been defined by the organization responsible for that extension. |
2 | Pattern elements in square brackets [ ] are optional depending on file type. |