Proposed URI's
The SNOMED CT URI specification defines the following patterns:
http://snomed.info/sct – the SNOMED CT code system (specifically intended for use with FHIR)
http://snomed.info/sct/{sctid}[/version/{timestamp}] – editions and versions
http://snomed.info/id/{sctid} - SNOMED CT "Component" (any item identified by an SCTID such as Concept, Relationship, or Description)
http://snomed.info/id/{uuid} - SNOMED CT reference set member
In principle, this sounds like a great idea ... but I have a few comments on the specific syntax proposed:
The 'fol' (Family of Languages) abbreviation is problematic, as the community associates the word 'language' with languages like English, Spanish, Danish, French and German. If we are going to use a field to separate the 'snomed.info' from the 'ecl'/'scg' part, then another suggestion is to use 'syntax' (noting that this also permits reference to the 'template syntax', which in itself is not a whole language, unlike the 'expression template language' or 'expression constraint template language').
http://snomed.info/syntax/{syntax name}
The abbreviation for Compositional Grammar used to refer to the specification document is 'scg' (i.e. snomed.org/scg). It would therefore be good to be consistent in the URI spec - i.e.
The SNOMED CT Expression Constraint Template Language:
http://snomed.info/syntax/ctl
The SNOMED CT Query Template Language:
http://snomed.info/syntax/qtl
With respect to use cases - I think we have a use case for trying to define expressions, expression constraints and expression templates as URIs - as this seems to be the preferred mechanism for terminology bindings to FHIR resources. Do you have any suggestions on how this could be done (given the special characters included)?
Also, just to reiterate what we discussed in the meeting, once we have finished discussing this within the Languages Project Group, we will take this proposal to the Modelling Advisory Group for further action (and updates to the URI standard).
I think we're looking at (at least) two different use cases/requirements:
Davide Sattora / Mayo has asked for a common name (URI) to identify a particular SNOMED language or version thereof. He will be passing around expressions in FHIR and other documents where he needs to state "The expression below is written in scg or ecl or ..."
Michael's use case is the ability to embed expressions in a URI (URL?).
12 Comments
Harold Solbrig
Requirements
1) A shared, standard URI to identify the particular language within the family of languages (Compositional Grammar, Expression Constraint Language, etc.):
<expression language="URI">(expression)</expression>
Examples:
<expression language="http://snomed.info/fol/ecl">
<![CDATA[<< 77400008 | Appendicitis| ]]>
</expression>
{"expression":
{"language": "http://snomed.info/fol/ecl/version/1.0",
"_contents": "<< 77400008 |Appendicitis|"}
}
2) A mime type to identify referenced languages:
<import href="pointer to a resource" type="MIME" />
Example:
<import href="samples/appendicitis_sample" type="application/snomed-ecl"/>
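As a rough sketch of how a consumer might use the language URI from requirement 1, the Python below reads the JSON example above and separates the optional /version/ suffix from the language URI. The function name read_expression is hypothetical, and the "expression", "language" and "_contents" keys are taken from the example rather than from any published schema.

```python
import json


def read_expression(document: str):
    """Return (language_uri, version, contents) from a JSON envelope.

    Hypothetical helper: assumes the envelope layout shown in the example above.
    """
    body = json.loads(document)["expression"]
    # Separate an optional "/version/{n}" suffix from the language URI.
    base, sep, version = body["language"].partition("/version/")
    return base, (version if sep else None), body["_contents"]


sample = """
{"expression":
  {"language": "http://snomed.info/fol/ecl/version/1.0",
   "_contents": "<< 77400008 |Appendicitis|"}
}
"""
print(read_expression(sample))
```

A receiver could then dispatch on the base URI to pick a parser and use the version, when present, to choose a grammar revision.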
In particular, we note that we do not have any requirement to be able to embed a CG or ECL expression directly within a URL. The closest that we would come to this might be where an expression would be included as a parameter:
http://example.org/server/valueset?expr=%3C%3C+77400008+%7CAppendicitis%7C
(given the fact that none of these languages are particularly URI friendly, I suspect that we'd always pass it as mime encoded data (another reason for the mime type)...)
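For illustration, the query-parameter form above can be produced and reversed with standard percent-encoding. This is only a sketch: the example.org endpoint and the "expr" parameter name come from the example URL, not from any specification.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

expression = "<< 77400008 |Appendicitis|"

# urlencode percent-encodes "<" and "|" and turns spaces into "+".
url = "http://example.org/server/valueset?" + urlencode({"expr": expression})
print(url)

# The receiving side recovers the original expression from the query string.
decoded = parse_qs(urlsplit(url).query)["expr"][0]
print(decoded)
```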
Proposed URI's
The SNOMED CT URI specification defines the following patterns:
The pattern that appears consistently across all but (arguably) the first is:
http://snomed.info/{category}/{identifier}/{subcategory}/{identifier}/{subsubcategory}/{identifier}
I believe that we should remain consistent with this approach. We need to identify languages, so we should create a new category ("lpg", "fol", "scl", ...) with a new set of identifiers. I would recommend:
http://snomed.info/fol/ecl – for expression constraint language
http://snomed.info/fol/cgl – compositional grammar
http://snomed.info/fol/sql – query language
etc.
And
http://snomed.info/fol/ecl/version/0.9 – for version 0.9 of the SCT expression constraint language
...
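One way to see the alternating category/identifier pattern is to split a URI path into pairs. The sketch below is illustrative only; category_pairs is a hypothetical helper, and the 'fol' URIs are the ones proposed in this comment, not part of the published URI specification.

```python
from urllib.parse import urlsplit


def category_pairs(uri: str):
    """Split a snomed.info URI path into (category, identifier) pairs."""
    segments = [s for s in urlsplit(uri).path.split("/") if s]
    if len(segments) % 2:
        # A trailing category with no identifier, e.g. the bare ".../sct" URI.
        segments.append(None)
    return list(zip(segments[0::2], segments[1::2]))


print(category_pairs("http://snomed.info/fol/ecl/version/0.9"))
print(category_pairs("http://snomed.info/sct"))
```

The second call shows why the bare code system URI is the arguable exception: it has a category but no identifier.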
Linda Bird
Thanks very much for your URI proposal Harold!
In principle, this sounds like a great idea ... but I have a few comments on the specific syntax proposed:
Harold Solbrig
While I don't care deeply what category is used, 'syntax' seems particularly odd, given that the word 'language' appears in "ecl" and "etl" – the only one that is a syntax is "sts". Is "language" completely banned from the category slot, or could we use something like "lsg" (language, syntax, grammar)?
Linda Bird
When qualified with "Expression constraint" or "Expression template", I think it is sufficiently clear that the word "Language" refers to a computable language. However, when defining a general category the word 'language' may imply human-languages - so it's not suitable as a URI category.
The reason for proposing 'syntax' (which was David's suggestion) is because all of the things that we want to represent in that category are syntaxes. As you say, we call some 'languages', others 'grammars', and the 'template syntax' is neither a language nor a grammar ... but they all have a syntax.
The alternative is to follow Michael's suggestion and drop the 'fol/syntax' grouper category completely, and define 'scg', 'ecl', 'qry', 'sts', 'etl', 'ctl', 'qtl' as their own categories, and use this as a base for adding an instance of the language - for example:
If this is possible to do in a valid way, then it would be incredibly useful (e.g. for terminology binding use cases).
Brian Carlsen
I support this last suggestion. Clear, easy to understand, and consistent with the style of "id/...".
Harold Solbrig
I think we're looking at (at least) two different use cases/requirements:
If you take use case 1, I would argue that it is not consistent with "id/..." – "id" says a SNOMED CT concept identifier follows. "http://snomed.info/id" by itself doesn't make a lot of sense. "field/..." says a SNOMED CT field follows. "http://snomed.info/field" doesn't make a lot of sense either. "module/..." says a SNOMED CT module identifier, "sct/..." a release. Why should "scg" say the SNOMED CT Compositional Grammar language instead of "syntax/..." – which, following the above approach, says a SNOMED CT language/syntax identifier follows?
Use case 2 presents a different situation. I think it would be bad form to use "id/..." to say that either a SNOMED CT concept identifier follows or a URI encoded compositional grammar expression, because existing software wouldn't know whether 12345:6789 was an sctid with a typo or an expression. This would be where "scg/..." would be a perfect fit.
I would propose:
http://snomed.info/syntax/scg as the name of the SNOMED CT Compositional grammar
http://snomed.info/syntax/scg/version/1.1 as the name of a particular version of scg and
http://snomed.info/id/74400008 as the concept identifier "74400008" and
http://snomed.info/scg/74400008 as the URI encoded compositional grammar expression with one focus code.
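To make the distinction between the four forms concrete, here is a minimal sketch of how software might tell them apart. The classify function and its return labels are hypothetical; the only outside fact assumed is that an SCTID is a string of 6 to 18 digits.

```python
import re
from urllib.parse import unquote, urlsplit

SCTID = re.compile(r"^\d{6,18}$")  # assumption: SCTIDs are 6 to 18 digits


def classify(uri: str):
    """Return (kind, value, version) for the URI forms proposed above."""
    parts = urlsplit(uri)
    if parts.netloc != "snomed.info":
        raise ValueError(f"not a snomed.info URI: {uri}")
    # Split the raw path first so percent-encoded characters stay in one segment.
    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        raise ValueError(f"no category in: {uri}")
    category, rest = segments[0], segments[1:]
    if category == "syntax" and len(rest) == 1:
        return ("syntax", rest[0], None)
    if category == "syntax" and len(rest) == 3 and rest[1] == "version":
        return ("syntax", rest[0], rest[2])
    if category == "id" and len(rest) == 1 and SCTID.match(rest[0]):
        return ("id", rest[0], None)
    if category == "scg" and len(rest) == 1:
        return ("expression", unquote(rest[0]), None)
    raise ValueError(f"unrecognised form: {uri}")


print(classify("http://snomed.info/syntax/scg/version/1.1"))
print(classify("http://snomed.info/scg/74400008"))
```

Because "id/" accepts only an identifier, a value such as 12345:6789 is rejected there rather than silently treated as an expression, which is the ambiguity use case 2 is trying to avoid.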
Linda Bird
Hi Harold,
I would definitely support this proposal (as you've just described with /syntax, /id and /scg).
Let's raise this to the Modelling Advisory Group, as the SL Project Group's recommendation ... with the intention of moving forward with this change (if there are no objections). Once the MAG is okay with this, then I can make this change to the URI specification (in the migrated Confluence version of the document).
Thanks for raising this! A very useful addition to the specification!
Kind regards,
Linda.
Daniel Karlsson
Do we also consider version-specific URI-encoded expressions, e.g. http://snomed.info/ecl/%3C404684003:%3C%3C47429007=%3C%3C267038008/version/1.1 ? The end might not be a good place for version but what alternatives are there? Cf. http://www.lexicalscope.com/blog/2012/03/12/how-are-rest-apis-versioned/
Regards,
Daniel
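As a sketch of how a trailing version could be handled, the Python below takes the example URI apart. It assumes the encoded expression occupies exactly one path segment (so any "/" inside the expression would have to be percent-encoded) and that the version, if present, follows as /version/{n}; parse_expression_uri is a hypothetical name.

```python
from urllib.parse import unquote, urlsplit


def parse_expression_uri(uri: str):
    """Return (syntax, expression, version) for a URI-encoded expression."""
    # Split the raw path before decoding so an encoded "/" cannot add segments.
    segments = urlsplit(uri).path.split("/")[1:]
    if len(segments) < 2:
        raise ValueError(f"no expression in: {uri}")
    syntax, encoded = segments[0], segments[1]
    version = None
    if len(segments) == 4 and segments[2] == "version":
        version = segments[3]
    return syntax, unquote(encoded), version


uri = "http://snomed.info/ecl/%3C404684003:%3C%3C47429007=%3C%3C267038008/version/1.1"
print(parse_expression_uri(uri))
```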
Michael Lawley
I've suddenly a need for a mime type for the postcoordination syntax. I was thinking something like text/snomed-pcg, but I see above that Harold Solbrig proposed (the equivalent of) application/snomed-pcg. Did this go any further?
Linda Bird
Michael - We have not discussed mime types for SNOMED languages further, but happy to discuss this briefly at our SLPG meeting this week. In terms of a 3 character abbreviation for compositional grammar ... we are using "SCG" (SNOMED Compositional Grammar) in the short URLs to the specification (i.e. http://snomed.org/scg) and in the planned updates to the URI standard (e.g. http://snomed.info/syntax/scg) - so I would definitely favour staying consistent - e.g. "text/snomed-scg".
Michael Lawley
Great - scg rather than pcg.
Until they're registered then it should probably be:
text/x-snomed-scg
Michael Lawley
Hi Harold Solbrig, I've been tasked with following up on this topic with you to clarify the use-case / requirements for these URIs.
Looking at your original requirements, is there any reason why requirement 1 can't also be satisfied with a MIME type? For example:
<expression language="application/snomed-ecl">
<![CDATA[<< 77400008 | Appendicitis| ]]>
</expression>
{"expression":
{"language": "application/snomed-ecl; version=1.0",
"_contents": "<< 77400008 |Appendicitis|"}
}
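If the MIME type route were taken, a receiver could read the type and the version parameter with standard header parsing. The sketch below uses Python's email.message module for that; the media types shown are the unregistered ones suggested in this thread, and parse_language_type is a hypothetical helper.

```python
from email.message import Message


def parse_language_type(value: str):
    """Return (media_type, version) from a Content-Type style string."""
    msg = Message()
    msg["Content-Type"] = value
    # get_param returns None when the parameter is absent.
    return msg.get_content_type(), msg.get_param("version")


print(parse_language_type("application/snomed-ecl; version=1.0"))
print(parse_language_type("text/x-snomed-scg"))
```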