20 Comments
Daniel Karlsson
Queries tried:
(Note that some implementations, most notably the two SNOMED "International" ones, work badly (i.e. not at all) with a non-English alphabet.)
Sorry I could not (easily) get comparable SNOMED CT versions... Active=true where the setting is available.
sct-snapshot-rest-api, commit 3ce4ab6
I was expecting the two queries to retrieve the same results. Is there a bug in snowstorm? I cannot find an answer in the specs.
Then, how is "<<* : { 272741003 | lateralitet | = 7771000 | vanster | }" to be interpreted? snowstorm gives 0, OntoServer gives 1068.
Happy New Year,
Daniel
Peris Brodsky
Hi all,
Sorry I am late to this conversation, and that my implementation (Slang) has been down (technical difficulties I am working on–it is memory-starved right now).
I've run both queries on my local Slang instance, and they return the same concept count–339 on the INT20170731 substrate (I presume it's the same set as well).
I'm not sure whether I've gotten it right, but many thanks again to Linda for her help with implementing the finer points of attribute groups and grouping.
-Peris
Linda Bird
Hi Daniel,
Interesting. Yes - query 1 and 2 should yield the same results. Query 1 finds how many subtypes of |Procedure| have a role group that matches the given refinement, while query 2 finds the subtypes of |Procedure| that match the given refinement, irrespective of whether or not the match is in a role group. Since all relationships with the attribute |Has intent| are treated as being in a role group, these 2 queries should yield the same results. I haven't double checked the result set against my own database ... but I assume that the 347 brought back by snowstorm in the browser and Ontoserver is correct for both queries (based on a small sampling of the results, which all seem correct).
The query "<<* : { 272741003 | laterality | = 7771000 | Left | }" should return 0 because |laterality| should never appear in a role group ... so based on the results you report I would say that snowstorm is correct and OntoServer is incorrect on that one. Michael - do you agree?
Kind regards,
Linda.
(P.S. - I'm officially on holidays ... but couldn't resist a bit of ECL from the beach )
Michael Lawley
Daniel & Linda, you are correct about Ontoserver and
<<* : { 272741003 | laterality | = 7771000 | Left | }
It should return zero matches. This is a bug in Ontoserver and will be fixed in the next release (5.3.0). For the public sandbox (that Shrimp uses), it is fixed already.
(Sorry for the delay in sorting this - I was studiously avoiding work over my Xmas holidays :-)
Michael Lawley
BTW, why "<<* : { 272741003 | laterality | = 7771000 | Left | }" and not just "* : { 272741003 | laterality | = 7771000 | Left | }"?
Daniel Karlsson
Copy and paste + sloppiness
Ed Cheetham
No need to respond til holidays end, but...
>> "...Since all relationships with the attribute |Has intent| are treated in a role group, these 2 queries should yield the same results..."
Not so. For the July 2018 international data there are 340 |Has intent| ungrouped (RG=0) roles and 7 in a non-zero group. This suggests to me that snowstorm is behaving as 'intended'.
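For anyone wanting to reproduce this kind of count, here is a minimal Python sketch that tallies active relationships of one attribute type by role group number. It assumes the standard RF2 relationship column order; the rows are made-up placeholders rather than release data, and the |Has intent| identifier is an assumption to verify against your own release.

```python
import csv
import io

# Column order of an RF2 relationship file (tab-separated).
RF2_COLUMNS = [
    "id", "effectiveTime", "active", "moduleId", "sourceId",
    "destinationId", "relationshipGroup", "typeId",
    "characteristicTypeId", "modifierId",
]

HAS_INTENT = "363703001"  # assumed identifier for |Has intent|; verify before use


def count_by_grouping(rows, type_id):
    """Count active relationships of one type in group 0 vs. any other group."""
    counts = {"group 0": 0, "non-zero group": 0}
    for row in rows:
        if row["active"] != "1" or row["typeId"] != type_id:
            continue
        key = "group 0" if row["relationshipGroup"] == "0" else "non-zero group"
        counts[key] += 1
    return counts


# Made-up sample rows (placeholder ids) standing in for a real snapshot file.
SAMPLE_ROWS = [
    ("r1", "20180731", "1", "m", "c1", "v1", "0", HAS_INTENT, "inferred", "some"),
    ("r2", "20180731", "1", "m", "c2", "v1", "0", HAS_INTENT, "inferred", "some"),
    ("r3", "20180731", "1", "m", "c3", "v1", "1", HAS_INTENT, "inferred", "some"),
    ("r4", "20180731", "0", "m", "c4", "v1", "0", HAS_INTENT, "inferred", "some"),
    ("r5", "20180731", "1", "m", "c5", "v2", "0", "other-type", "inferred", "some"),
]
sample_text = "\n".join("\t".join(r) for r in [tuple(RF2_COLUMNS)] + SAMPLE_ROWS)
rows = list(csv.DictReader(io.StringIO(sample_text), delimiter="\t"))

print(count_by_grouping(rows, HAS_INTENT))
```

Against a real snapshot you would open the relationship file instead of the in-memory sample and pass the file object to `csv.DictReader` with `delimiter="\t"`.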
Ed
Daniel Karlsson
Hi Ed, Linda and all,
so, is this a confusion of "ungrouped" and "in relationship group 0"? AFAIK, RG0 means self-grouped (a singleton group) for | has intent | but ungrouped (no group whatsoever) for | laterality | (as specified in the MRCM).
Happy New Year,
Daniel
PS. The sooner we get rid of implicit relationship grouping the better. If this causes confusion in SLPG, how are "normal" users of SNOMED CT to cope? DS.
PS2. Linda, that beach sounds lovely. +1 C and rain here. DS2.
Rory Davidson
Already added to the GitHub issue raised by Daniel Karlsson, but added here for others on the thread. Comparable results from our current authoring platform terminology server (Snow Owl):
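implementation | SNOMED CT release | Query 1 | Query 2
Snow Owl v5 | International 2018-01-31 | 7 | 347
Snow Owl v5 | SE edition 2018-11-30 | 9 | 523
Snow Owl v5 | International 2018-07-31 | 7 | 347
Snomed Query Service (used in refset tool) | SE edition 2018-11-30 | not supported | 523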
The results just reflect the confusion on what is the correct implementation for the first query as they match all the snowstorm results (the snapshot-rest-api is deprecated and its results can be ignored at this point).
Kai Kewley
Hi all,
If the SLPG group agrees that any attribute which is grouped in the MRCM Attribute Domain reference set should be treated as grouped for ECL query purposes I would be happy to change the Snowstorm implementation. We would also need to get Snow Owl changed to match.
Kind regards, Kai
Linda Bird
Hi Kai,
Yes - Any attribute which is grouped in the MRCM Attribute Domain reference set should be treated as grouped for ECL query purposes. It would be great if the Snowstorm and SnowOwl implementations could be updated to reflect this.
This is one of the reasons that Daniel, Yong and I have all been supporters of only using role group 0 for attributes where grouped=0 (e.g. |Laterality|). Implementers, who use the inferred Relationship file, could then determine whether or not a relationship is grouped or not (as far as the classifier is concerned) by simply looking at the role group number.
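To make the two readings concrete, here is a rough Python sketch of how a grouped refinement might be evaluated literally (only non-zero group numbers count as groups) versus MRCM-aware (a group 0 relationship whose attribute is marked grouped counts as a singleton group). The concept names, the `has-intent` and `some-intent` placeholders, and the `MRCM_GROUPED` set are hypothetical; values are matched exactly, with no subsumption.

```python
from collections import defaultdict

HAS_INTENT = "has-intent"    # placeholder, not a real identifier
SOME_INTENT = "some-intent"  # placeholder value
LATERALITY = "272741003"
LEFT = "7771000"

# Hypothetical stand-in for the attributes flagged grouped=1 in the MRCM.
MRCM_GROUPED = {HAS_INTENT}

# Each concept maps to (relationshipGroup, typeId, destinationId) triples.
CONCEPTS = {
    "A": [(0, HAS_INTENT, SOME_INTENT)],
    "B": [(1, HAS_INTENT, SOME_INTENT)],
    "C": [(0, LATERALITY, LEFT)],
}


def role_groups(rels, mrcm_aware):
    """Return the role groups of a concept as sets of (typeId, value) pairs."""
    groups = defaultdict(set)
    singletons = []
    for group, type_id, value in rels:
        if group != 0:
            groups[group].add((type_id, value))
        elif mrcm_aware and type_id in MRCM_GROUPED:
            # Group 0 + grouped attribute: treat as a self-grouped singleton.
            singletons.append({(type_id, value)})
    return list(groups.values()) + singletons


def grouped_query(type_id, value, mrcm_aware):
    """Concepts with a role group containing type_id = value ({ ... } form)."""
    return sorted(
        concept for concept, rels in CONCEPTS.items()
        if any((type_id, value) in g for g in role_groups(rels, mrcm_aware))
    )


def ungrouped_query(type_id, value):
    """Concepts with type_id = value anywhere, regardless of grouping."""
    return sorted(
        concept for concept, rels in CONCEPTS.items()
        if any((t, v) == (type_id, value) for _, t, v in rels)
    )


print(grouped_query(HAS_INTENT, SOME_INTENT, mrcm_aware=False))  # literal reading
print(grouped_query(HAS_INTENT, SOME_INTENT, mrcm_aware=True))   # MRCM-aware reading
print(ungrouped_query(HAS_INTENT, SOME_INTENT))
print(grouped_query(LATERALITY, LEFT, mrcm_aware=True))
```

Under the literal reading the grouped query misses concept A, which mirrors the 7 versus 347 split reported above. Under the MRCM-aware reading both query forms agree, and the grouped |laterality| query matches nothing.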
Kind regards,
Linda.
Kai Kewley
Hi Linda,
Indeed. This is something I have been pushing for with the migration to the OWL stated form. The inferred form is based on the stated form and the self grouping is explicit in the stated OWL axiom expressions - so this seems like a good excuse to fix the inferred form! I think you will get your wish in the 2019 July International Edition!
I will get this ECL fix into the next Snowstorm release and raise a ticket for Snow Owl.
Cheers.
Michael Lawley
I am also a supporter of only using role group 0 for never-grouped attributes.
However, I worry about your follow-on statement; it has a real potential for misleading people into trouble. Yes, avoiding role group 0 where possible would help avoid inadvertent errors, but it's not an assumption that can be safely made; the RF2 specification doesn't change, so an extension may still use role group 0 for a self-grouped attribute and your determination would be wrong.
People are confused enough as it is about role grouping and how to interpret the role group numbers. Short-cut heuristics about interpreting role group 0 are only likely to make things worse.
Daniel Karlsson
Hi Michael,
what I think we need to do is to move in this direction in bringing clarity to relationship grouping, and that then implementers might be more confident in the interpretation of relationship groups and their numbers. However, the road to this relationship group nirvana might be long and winding.
/Daniel
Kai Kewley
Hi Daniel,
The ambiguity of role group 0 within the stated form will be resolved with the migration to OWL axioms in the July release of the International Edition this year. Any attribute in group 0 and marked as grouped in the MRCM will be self grouped when converted to a stated OWL axiom. This matches the behaviour within the classifier when preparing the ontology for reasoning.
There is a MAG meeting next Monday where SI will be checking the agreement to allow this change in the stated form to trickle through into the inferred form. If this change goes through in the July release in both the stated and inferred form the ambiguity of how to evaluate ECL queries against attributes in group 0 will be mitigated because these attributes will be self grouped using another role group number.
Kind regards, Kai
Daniel Karlsson
Thanks Kai and SI,
to be clear, does this change mean that attributes will now be stated in group 1+ even if they are the only attribute in the group? This is currently not allowed (at least not in the Managed Service Authoring tool). Will, for clarity, existing stated RG0 attributes be moved to RGx, x > 0, or will this be for inferred form only?
Regards,
Daniel
Kai Kewley
All stated attributes in group 0 which are grouped in OWL terms will move to another available role group during the OWL axiom conversion. The authoring platform will be changed to allow self grouped relationships.
If we can get agreement from the MAG the same change will appear in the inferred relationships also.
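As an illustration only, the move described here could look something like the Python sketch below. How the real conversion picks "another available role group" is not stated in this thread, so taking the next number above the highest one in use is an assumption, as are the attribute names.

```python
def self_group(rels, grouped_attrs):
    """Move group 0 relationships with a grouped attribute into their own group.

    rels is a list of (relationshipGroup, typeId, destinationId) triples for
    one concept. Never-grouped attributes stay in group 0.
    """
    used = {group for group, _, _ in rels}
    next_group = max(used | {0}) + 1  # assumption: first number above the max
    result = []
    for group, type_id, value in rels:
        if group == 0 and type_id in grouped_attrs:
            result.append((next_group, type_id, value))
            next_group += 1  # each moved relationship gets its own group
        else:
            result.append((group, type_id, value))
    return result


# Hypothetical concept: attribute names are placeholders, not identifiers.
before = [
    (0, "finding-site", "x"),
    (0, "laterality", "left"),
    (1, "method", "m"),
]
print(self_group(before, grouped_attrs={"finding-site", "method"}))
```

After such a step, a relationship left in group 0 is one whose attribute is never grouped, at least for content that has been through the conversion.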
Ed Cheetham
OK, clearly I'm wrong, but I'd be really grateful if someone could take a step back and explain the practical (analytic, interpretive) distinction between self-grouped and ungrouped.
I appreciate there is a syntactic difference, but what will us '..."normal" users of SNOMED CT...' be doing or getting wrong by only recognising a naïve distinction between 'ungrouped (with one or more other roles)' and 'grouped (with one or more other roles)'?
Thanks
Ed
Linda Bird
Hi Ed,
This is not an easy question to answer ... which is why we're working to make the distinction less confusing.
When an author refers to a relationship being "ungrouped", they generally mean that there are no other relationships that are grouped together with it.
"Self-grouped" tends to be a phrase used by people with an understanding of what the corresponding Description Logic axioms look like (or the classification process). "Self-grouped" means that the relationship is not grouped together with any other relationships, but that a classifier needs to treat the relationship as though it was in a group (with only 1 member) to ensure that subsumption works correctly.
When SNOMED CT is converted into DL for classification, every relationship with an attribute that is considered to be "groupable" is placed into a "role group". "Role group" is actually a concept that is wrapped around the relationship. For example, the concept |Respiratory finding| has 1 defining relationship: |Finding site| = |Respiratory system structure|. To a classifier, the definition of |Respiratory finding| looks something like "|Role group| (|Finding site| = |Respiratory system structure|)" (note: excuse the lack of formal syntax). This consistency is really important to ensure that subsumption / analytics gets the correct results. So, for example, a classifier will infer that:
- |Mass of respiratory structure| === |Clinical finding|: |Role group| ( |Finding site| = |Respiratory system structure|, |Associated morphology| = |Mass|)
is a subtype of
1. |Respiratory finding| === |Clinical finding|: |Role group| ( |Finding site| = |Respiratory system structure| ).
However, if we left |Finding site| 'ungrouped' (i.e. we did not use the |Role group| concept in the DL syntax) and instead defined it like:
2. |Respiratory finding| === |Clinical finding|: |Finding site| = |Respiratory system structure|
then the |Mass of respiratory structure| would not be inferred as a subtype ... because its definition does not include the ungrouped "|Finding site| = |Respiratory system structure|" required by the 'ungrouped' version 2 of |Respiratory finding|.
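The example above can be mimicked with a very simplified structural check in Python. This is a sketch, not how a DL classifier actually works: it ignores the value and attribute hierarchies, assumes both concepts share the same parent, and the `Definition` class and string labels are made up for illustration.

```python
from dataclasses import dataclass, field

FINDING_SITE = "Finding site"
ASSOC_MORPH = "Associated morphology"
RESP_STRUCTURE = "Respiratory system structure"
MASS = "Mass"


@dataclass
class Definition:
    """Defining relationships of a concept, as (attribute, value) pairs."""
    ungrouped: set = field(default_factory=set)  # not wrapped in a role group
    groups: list = field(default_factory=list)   # each role group is a set


def is_subtype(child, parent):
    """True if the child structurally satisfies everything the parent requires."""
    # Every ungrouped pair of the parent must be ungrouped in the child too.
    ungrouped_ok = parent.ungrouped <= child.ungrouped
    # Every parent role group must fit inside a single child role group.
    groups_ok = all(
        any(parent_group <= child_group for child_group in child.groups)
        for parent_group in parent.groups
    )
    return ungrouped_ok and groups_ok


mass_of_resp = Definition(
    groups=[{(FINDING_SITE, RESP_STRUCTURE), (ASSOC_MORPH, MASS)}]
)
resp_finding_grouped = Definition(groups=[{(FINDING_SITE, RESP_STRUCTURE)}])
resp_finding_ungrouped = Definition(ungrouped={(FINDING_SITE, RESP_STRUCTURE)})

print(is_subtype(mass_of_resp, resp_finding_grouped))    # version 1
print(is_subtype(mass_of_resp, resp_finding_ungrouped))  # version 2
```

With version 1 the check succeeds because the parent's singleton group fits inside the child's group. With version 2 it fails because the child has no ungrouped |Finding site|.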
Apologies if I have oversimplified some of the DL details or misunderstood your question .... however I hope this explains why this conversation is so important for analytics.
Kind regards,
Linda.
Ed Cheetham
Thanks Linda.
Most of this is actually familiar (from the behaviour of at least one of the old perl scripts), and in the meantime I'd salvaged similar material from the owl guide and various places on confluence after searching confluence for “self-grouped”.
My concern as to where this is going is that it risks polluting the relatively human-readable and understandable SCG (and associated syntax) form with MRCM-held authoring business rules and DL reasoner requirements; in particular the calls for ‘no more implicit grouping’.
The notion of ‘never grouped’ roles is relevant to modelling activities (“roles X, Y and Z should never be grouped with another role”) - fine. The notion of representing self-grouped roles is required for an owl-representation of role grouping to work as intended for DL classification, as you explain - fine. I also *think* I see, on the EAG site, a proposal (but it’s not clear from the slides) to automate the grouping of many ‘ungrouped but groupable’ role sets. All of these can be enacted when/where they need to in the authoring and classification workflows, but I don’t see why any of these need to ‘leak out’ into the explicit SCG forms of expressions and expression constraints.
Ed