ECL and grouped attributes

Created by Daniel Karlsson, last modified by Anne Randorff Højen on 2020-Mar-04

Owner: Daniel Karlsson

Tried four implementations of ECL using grouped and non-grouped attributes in the query and got different results.

Contributors (7)

contributor	comments	accepted comments
Daniel Karlsson	5	0
Ed Cheetham	3	0
Kai Kewley	4	0
Linda Bird	3	0
Michael Lawley	3	0
Peris Brodsky	1	0
Rory Davidson	1	0

20 Comments

Daniel Karlsson

Queries tried:

Query 1: << 71388002 | atgard | : { 363703001 | har avsikt | = << 129428001 | preventiv avsikt | }
Query 2: << 71388002 | atgard | : 363703001 | har avsikt | = << 129428001 | preventiv avsikt |

(Note that some implementations, most notably the two SNOMED "International" ones, work badly (i.e. not at all) with a non-English alphabet.)

Sorry I could not (easily) get comparable SNOMED CT versions... Active=true where the setting is available.

implementation	SNOMED CT release	Query 1	Query 2
sct-snapshot-rest-api, commit 3ce4ab6	The one I had on my hard drive, likely International 2018-01-31	91	91
snowstorm, v 2.1.0	SE edition 2018-11-30	9	523
OntoServer through Shrimp UI, buildid ddd5953f1d34f52fb9f5d79a5d910e5d2f4bfaf4487755d3f8f6a5c7ea12a81c	International 2018-01-31	347	347
snowstorm through browser, v 2.0.0 (https://browser.ihtsdotools.org/ecl/)	International 2018-07-31	7	347

I was expecting the two queries to retrieve the same results. Is there a bug in snowstorm? I cannot find an answer in the specs.

Then, how is "<<* : { 272741003 | lateralitet | = 7771000 | vanster | }" to be interpreted? snowstorm gives 0, OntoServer gives 1068.

Happy New Year,
Daniel

Permalink

2019-Jan-02

Peris Brodsky
Hi all,
Sorry I am late to this conversation, and that my implementation (Slang) has been down (technical difficulties I am working on; it is memory-starved right now).
I've run both queries on my local Slang instance, and they return the same concept count: 339 on the INT20170731 substrate (I presume it's the same set as well).
I'm not sure whether I've gotten it right, but many thanks again to Linda for her help with implementing the finer points of attribute groups and grouping.
-Peris
- Permalink
- 2019-Jan-10

Linda Bird
Hi Daniel,
Interesting. Yes - query 1 and 2 should yield the same results. Query 1 finds the subtypes of |Procedure| that have a role group matching the given refinement, while query 2 finds the subtypes of |Procedure| that match the given refinement, irrespective of whether or not the matching relationship is in a role group. Since all relationships with the attribute |Has intent| are treated as being in a role group, these 2 queries should yield the same results. I haven't double checked the result set against my own database ... but I assume that the 347 brought back by snowstorm in the browser and Ontoserver is correct for both queries (based on a small sampling of the results, which all seem correct).
The query "<<* : { 272741003 | laterality | = 7771000 | Left | }" should return 0 because |laterality| should never appear in a role group ... so based on the results you report I would say that snowstorm is correct and OntoServer is incorrect on that one. Michael - do you agree?
Kind regards,
Linda.
(P.S. - I'm officially on holidays ... but couldn't resist a bit of ECL from the beach.)
- Permalink
- 2019-Jan-03
1. Michael Lawley
  Daniel & Linda you are correct about Ontoserver and <<* : { 272741003 | laterality | = 7771000 | Left | } – it should return zero matches.
  This is a bug in Ontoserver and will be fixed in the next release (5.3.0). For the public sandbox (that Shrimp uses), it is fixed already.
  (Sorry for the delay in sorting this - I was studiously avoiding work over my Xmas holidays :-)
  Permalink
  
  2019-Jan-16
  1. Michael Lawley
    BTW, why "<<* : { 272741003 | laterality | = 7771000 | Left | }" and not just "* : { 272741003 | laterality | = 7771000 | Left | }" ?
    
    Permalink
    
    2019-Jan-16
    1. Daniel Karlsson
      
      Copy and paste + sloppiness
      
      Permalink
      
      2019-Jan-16
Ed Cheetham
No need to respond til holidays end, but...
>> "...Since all relationships with the attribute |Has intent| are treated in a role group, these 2 queries should yield the same results..."
Not so. For the July 2018 international data there are 340 |Has intent| ungrouped (RG=0) roles and 7 in a non-zero group. This suggests to me that snowstorm is behaving as 'intended'.
Ed
- Permalink
- 2019-Jan-07
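
A tally like the one Ed reports (|Has intent| rows in group 0 versus a non-zero group) can be sketched as below. This is only an illustration: the rows and the source ids `P1` to `P5` are made up, and only the column names follow the RF2 Relationship file.

```python
from collections import Counter

HAS_INTENT = "363703001"  # |Has intent|, as used in the queries in this thread

# Hypothetical relationship rows (not real release data), RF2 column names.
rows = [
    {"sourceId": "P1", "typeId": HAS_INTENT, "relationshipGroup": 0, "active": 1},
    {"sourceId": "P2", "typeId": HAS_INTENT, "relationshipGroup": 0, "active": 1},
    {"sourceId": "P3", "typeId": HAS_INTENT, "relationshipGroup": 1, "active": 1},
    {"sourceId": "P4", "typeId": HAS_INTENT, "relationshipGroup": 2, "active": 0},
    {"sourceId": "P5", "typeId": "272741003", "relationshipGroup": 0, "active": 1},
]


def tally_by_group(rows, type_id):
    """Count active rows of one attribute, split into group 0 and non-zero groups."""
    counts = Counter()
    for row in rows:
        if row["active"] == 1 and row["typeId"] == type_id:
            key = "group 0" if row["relationshipGroup"] == 0 else "non-zero group"
            counts[key] += 1
    return dict(counts)


print(tally_by_group(rows, HAS_INTENT))  # {'group 0': 2, 'non-zero group': 1}
```

The inactive row and the row for another attribute are ignored, so the tally only reflects active |Has intent| relationships.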
Daniel Karlsson
Hi Ed, Linda and all,
so, is this a confusion of "ungrouped" and "in relationship group 0"? AFAIK, RG0 means self-grouped (a singleton group) for | has intent | but ungrouped (no group whatsoever) for | laterality | (as specified in the MRCM).
Happy New Year,
Daniel
PS. The sooner we get rid of implicit relationship grouping the better. If this causes confusion in SLPG, how are "normal" users of SNOMED CT to cope? DS.
PS2. Linda, that beach sounds lovely. +1 C and rain here. DS2.
- Permalink
- 2019-Jan-07
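
The distinction Daniel draws can be written down as a small lookup. This is a sketch under stated assumptions: `MRCM_GROUPED` is a hypothetical, hand-written stand-in for the attributes marked as grouped in the MRCM Attribute Domain reference set, and `interpret_group` is an illustrative helper, not part of any implementation discussed here.

```python
# Assumption: attribute ids that the MRCM marks as grouped (hand-written here).
MRCM_GROUPED = {"363703001"}  # |Has intent|
LATERALITY = "272741003"      # |Laterality|, never grouped, so not in the set


def interpret_group(type_id, relationship_group, mrcm_grouped):
    """Interpret a relationship's group number in the light of the MRCM."""
    if relationship_group != 0:
        return "grouped"
    # Group 0 is ambiguous: its meaning depends on the attribute.
    return "self-grouped" if type_id in mrcm_grouped else "ungrouped"


print(interpret_group("363703001", 0, MRCM_GROUPED))  # self-grouped
print(interpret_group(LATERALITY, 0, MRCM_GROUPED))   # ungrouped
print(interpret_group("363703001", 2, MRCM_GROUPED))  # grouped
```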

Rory Davidson

Already added to the GitHub issue raised by Daniel Karlsson but added here for others on the thread. Comparable results from our current authoring platform terminology server (Snow Owl):

implementation	SNOMED CT release	Query 1	Query 2
Snow Owl v5	International 2018-01-31	7	347
Snow Owl v5	SE edition 2018-11-30	9	523
Snow Owl v5	International 2018-07-31	7	347
Snomed Query Service (used in refset tool)	SE edition 2018-11-30	not supported	523

The results just reflect the confusion on what is the correct implementation for the first query as they match all the snowstorm results (the snapshot-rest-api is deprecated and its results can be ignored at this point).

Permalink

2019-Jan-09

Kai Kewley
Hi all,
If the SLPG group agrees that any attribute which is grouped in the MRCM Attribute Domain reference set should be treated as grouped for ECL query purposes I would be happy to change the Snowstorm implementation. We would also need to get Snow Owl changed to match.
Kind regards, Kai
- Permalink
- 2019-Jan-09
1. Linda Bird
  Hi Kai,
  Yes - any attribute which is grouped in the MRCM Attribute Domain reference set should be treated as grouped for ECL query purposes. It would be great if the Snowstorm and Snow Owl implementations could be updated to reflect this.
  This is one of the reasons that Daniel, Yong and I have all been supporters of only using role group 0 for attributes where grouped=0 (e.g. |Laterality|). Implementers who use the inferred Relationship file could then determine whether a relationship is grouped (as far as the classifier is concerned) by simply looking at the role group number.
  Kind regards,
  Linda.
  Permalink
  
  2019-Jan-21
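
The effect of this rule on the two query styles can be illustrated with a toy evaluator. It is a sketch only: the relationship tuples and source ids (`P1`, `P2`, `P3`, `B1`) are invented, `MRCM_GROUPED` is a hand-written stand-in for the MRCM Attribute Domain reference set, and the focus constraint and the `<<` on the value are left out for brevity.

```python
HAS_INTENT, PREVENTIVE = "363703001", "129428001"
LATERALITY, LEFT = "272741003", "7771000"

# Hypothetical rows: (sourceId, typeId, destinationId, relationshipGroup)
RELS = [
    ("P1", HAS_INTENT, PREVENTIVE, 0),
    ("P2", HAS_INTENT, PREVENTIVE, 0),
    ("P3", HAS_INTENT, PREVENTIVE, 1),
    ("B1", LATERALITY, LEFT, 0),
]
MRCM_GROUPED = {HAS_INTENT}  # assumption: attributes the MRCM marks as grouped


def match_plain(rels, type_id, value_id):
    """Query 2 style: attribute = value, inside or outside a group."""
    return {s for s, t, d, _ in rels if t == type_id and d == value_id}


def match_in_group(rels, type_id, value_id, mrcm_grouped=None):
    """Query 1 style: { attribute = value }.

    With mrcm_grouped=None only non-zero group numbers count as groups.
    Otherwise group 0 rows of MRCM-grouped attributes count as self-grouped.
    """
    matched = set()
    for s, t, d, g in rels:
        if t != type_id or d != value_id:
            continue
        if g != 0 or (mrcm_grouped is not None and t in mrcm_grouped):
            matched.add(s)
    return matched


print(sorted(match_plain(RELS, HAS_INTENT, PREVENTIVE)))                   # ['P1', 'P2', 'P3']
print(sorted(match_in_group(RELS, HAS_INTENT, PREVENTIVE)))                # ['P3']
print(sorted(match_in_group(RELS, HAS_INTENT, PREVENTIVE, MRCM_GROUPED)))  # ['P1', 'P2', 'P3']
print(sorted(match_in_group(RELS, LATERALITY, LEFT, MRCM_GROUPED)))        # []
```

Reading group numbers literally gives the smaller result for the grouped query, while the MRCM-aware reading makes both query styles agree for |Has intent| and still returns nothing for |Laterality| in a group.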
  1. Kai Kewley
    Hi Linda,
    Indeed. This is something I have been pushing for with the migration to the OWL stated form. The inferred form is based on the stated form and the self grouping is explicit in the stated OWL axiom expressions - so this seems like a good excuse to fix the inferred form! I think you will get your wish in the 2019 July International Edition!
    I will get this ECL fix into the next Snowstorm release and raise a ticket for Snow Owl.
    Cheers.
    
    Permalink
    
    2019-Jan-21
  2. Michael Lawley
    I am also a supporter of only using role group 0 for never-grouped attributes.
    However, I worry about your follow-on statement; it has a real potential for misleading people into trouble. Yes, avoiding role group 0 where possible would help avoid inadvertent errors, but it's not an assumption that can be safely made; the RF2 specification doesn't change, so an extension may still use role group 0 for a self-grouped attribute and your determination would be wrong.
    People are confused enough as it is about role grouping and how to interpret the role group numbers. Short-cut heuristics about interpreting role group 0 are only likely to make things worse.
    
    Permalink
    
    2019-Jan-21
    1. Daniel Karlsson
      
      Hi Michael,
      what I think we need to do is to move in this direction in bringing clarity to relationship grouping, and that then implementers might be more confident in the interpretation of relationship groups and their numbers. However, the road to this relationship group nirvana might be long and winding.
      /Daniel
      
      Permalink
      
      2019-Jan-21
Kai Kewley
Hi Daniel,
The ambiguity of role group 0 within the stated form will be resolved with the migration to OWL axioms in the July release of the International Edition this year. Any attribute in group 0 and marked as grouped in the MRCM will be self grouped when converted to a stated OWL axiom. This matches the behaviour within the classifier when preparing the ontology for reasoning.
There is a MAG meeting next Monday where SI will be checking the agreement to allow this change in the stated form to trickle through into the inferred form. If this change goes through in the July release in both the stated and inferred form the ambiguity of how to evaluate ECL queries against attributes in group 0 will be mitigated because these attributes will be self grouped using another role group number.
Kind regards, Kai
- Permalink
- 2019-Jan-09
1. Daniel Karlsson
  Thanks Kai and SI,
  to be clear, does this change mean that attributes will now be stated in group 1+ even if they are the only attribute in the group? This is currently not allowed (at least not in the Managed Service Authoring tool). Will, for clarity, existing stated RG0 attributes be moved to RGx, x > 0, or will this be for inferred form only?
  Regards,
  Daniel
  Permalink
  
  2019-Jan-09
  1. Kai Kewley
    All stated attributes in group 0 which are grouped in OWL terms will move to another available role group during the OWL axiom conversion. The authoring platform will be changed to allow self grouped relationships.
    If we can get agreement from the MAG the same change will appear in the inferred relationships also.
    
    Permalink
    
    2019-Jan-09
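
The renumbering Kai describes might look roughly like the sketch below. It is not the actual conversion code: the rows are invented, `attrX`/`valX` are placeholder ids, `MRCM_GROUPED` is a hand-written stand-in for the MRCM, and picking "highest group in use plus one" as the available group number is an assumption made for the example.

```python
HAS_INTENT, PREVENTIVE = "363703001", "129428001"
LATERALITY, LEFT = "272741003", "7771000"
MRCM_GROUPED = {HAS_INTENT}  # assumption: attributes the MRCM marks as grouped


def self_group(rels, mrcm_grouped):
    """Move group 0 rows of MRCM-grouped attributes into their own unused group.

    rels holds (sourceId, typeId, destinationId, relationshipGroup) tuples.
    """
    # Highest group number already in use for each source concept.
    highest = {}
    for s, _, _, g in rels:
        highest[s] = max(highest.get(s, 0), g)

    result = []
    for s, t, d, g in rels:
        if g == 0 and t in mrcm_grouped:
            highest[s] += 1  # take the next unused number for this concept
            g = highest[s]
        result.append((s, t, d, g))
    return result


rels = [
    ("P1", HAS_INTENT, PREVENTIVE, 0),  # groupable attribute in group 0
    ("P1", "attrX", "valX", 1),         # placeholder attribute already in group 1
    ("B1", LATERALITY, LEFT, 0),        # never-grouped attribute stays in group 0
]
for row in self_group(rels, MRCM_GROUPED):
    print(row)
# ('P1', '363703001', '129428001', 2)
# ('P1', 'attrX', 'valX', 1)
# ('B1', '272741003', '7771000', 0)
```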
Ed Cheetham
OK, clearly I'm wrong, but I'd be really grateful if someone could take a step back and explain the practical (analytic, interpretive) distinction between self-grouped and ungrouped.
I appreciate there is a syntactic difference, but what will we '..."normal" users of SNOMED CT...' be doing or getting wrong by only recognising a naïve distinction between 'ungrouped (with one or more other roles)' and 'grouped (with one or more other roles)'?
Thanks
Ed
- Permalink
- 2019-Jan-09
Linda Bird
Hi Ed,
This is not an easy question to answer ... which is why we're working to make the distinction less confusing.
When an author refers to a relationship being "ungrouped", they generally mean that there are no other relationships that are grouped together with it.
"Self-grouped" tends to be a phrase used by people with an understanding of what the corresponding Description Logic axioms look like (or the classification process). "Self-grouped" means that the relationship is not grouped together with any other relationships, but that a classifier needs to treat the relationship as though it was in a group (with only 1 member) to ensure that subsumption works correctly.
When SNOMED CT is converted into DL for classification, every relationship with an attribute that is considered to be "groupable" is placed into a "role group". "Role group" is actually a concept that is wrapped around the relationship. For example, the concept |Respiratory finding| has 1 defining relationship: |Finding site| = |Respiratory system structure|. To a classifier, the definition of |Respiratory finding| looks something like "|Role group| (|Finding site| = |Respiratory system structure|)" (note: excuse the lack of formal syntax). This consistency is really important to ensure that subsumption / analytics gets the correct results. So, for example, a classifier will infer that:
- |Mass of respiratory structure| === |Clinical finding|: |Role group| ( |Finding site| = |Respiratory system structure|, |Associated morphology| = |Mass|)
is a subtype of
1. |Respiratory finding| === |Clinical finding|: |Role group| ( |Finding site| = |Respiratory system structure| ).
However, if we left |Finding site| 'ungrouped' (i.e. we did not use the |Role group| concept in the DL syntax) and instead defined it like:
2. |Respiratory finding| === |Clinical finding|: |Finding site| = |Respiratory system structure|
then the |Mass of respiratory structure| would not be inferred as a subtype ... because its definition does not include the ungrouped "|Finding site| = |Respiratory system structure|" required by the 'ungrouped' version 2 of |Respiratory finding|.
Apologies if I have oversimplified some of the DL details or misunderstood your question .... however I hope this explains why this conversation is so important for analytics.
Kind regards,
Linda.
- Permalink
- 2019-Feb-12
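
Linda's example can be mimicked with a toy structural check. It is a deliberate simplification and not how a DL classifier works: parents and value hierarchies are ignored, attribute values must match exactly, and the dictionary layout and the `subsumes` helper are made up for this illustration.

```python
def subsumes(general, specific):
    """True if `specific` has every ungrouped attribute of `general` and,
    for every role group of `general`, some role group that contains it."""
    ungrouped_ok = general["ungrouped"] <= specific["ungrouped"]
    groups_ok = all(
        any(group <= candidate for candidate in specific["groups"])
        for group in general["groups"]
    )
    return ungrouped_ok and groups_ok


FINDING_SITE = ("Finding site", "Respiratory system structure")
MORPHOLOGY = ("Associated morphology", "Mass")

# |Mass of respiratory structure|: both relationships in one role group.
mass = {"ungrouped": frozenset(), "groups": [frozenset({FINDING_SITE, MORPHOLOGY})]}
# Version 1 of |Respiratory finding|: self-grouped |Finding site|.
respiratory_v1 = {"ungrouped": frozenset(), "groups": [frozenset({FINDING_SITE})]}
# Version 2 of |Respiratory finding|: |Finding site| left ungrouped.
respiratory_v2 = {"ungrouped": frozenset({FINDING_SITE}), "groups": []}

print(subsumes(respiratory_v1, mass))  # True
print(subsumes(respiratory_v2, mass))  # False
```

With the self-grouped definition the single-member group is contained in the group of |Mass of respiratory structure|, so the subtype is found. With the ungrouped definition there is no matching ungrouped relationship, so it is not.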
Ed Cheetham
Thanks Linda.
Most of this is actually familiar (from the behaviour of at least one of the old perl scripts), and in the meantime I'd salvaged similar material from the owl guide and various places on confluence after searching confluence for “self-grouped”.
My concern as to where this is going is that it risks polluting the relatively human-readable and understandable SCG (and associated syntax) form with MRCM-held authoring business rules and DL reasoner requirements; in particular the calls for ‘no more implicit grouping’.
The notion of ‘never grouped’ roles is relevant to modelling activities (“roles X, Y and Z should never be grouped with another role”) - fine. The notion of representing self-grouped roles is required for an owl-representation of role grouping to work as intended for DL classification, as you explain - fine. I also *think* I see, on the EAG site, a proposal (but it’s not clear from the slides) to automate the grouping of many ‘ungrouped but groupable’ role sets. All of these can be enacted when/where they need to in the authoring and classification workflows, but I don’t see why any of these need to ‘leak out’ into the explicit SCG forms of expressions and expression constraints.
Ed
- Permalink
- 2019-Feb-12
