NegEx is a simple regular expression algorithm initially created by Will Bridewell.
NegEx's input is a sentence with indexed findings and diseases; output is whether the indexed terms are explicitly negated in the text (e.g., "The patient denies chest pain" -> chest pain (negated)) or are mentioned as a hypothetical possibility (e.g., "Rule out pneumonia" -> pneumonia (possible)).
NegEx version 2 (updated 4/2003) includes the following updates from the previous version (1/2002):
Publications on NegEx:
(1) Chapman WW, Bridewell W, Hanbury P, Cooper GF, Buchanan B. Evaluation of negation phrases in narrative clinical reports. Proc AMIA Symp. 2001:105-9.
(2) Chapman WW, Bridewell W, Hanbury P, Cooper GF, Buchanan BG. A simple algorithm for identifying negated findings and diseases in discharge summaries. J Biomed Inform. 2001;34:301-10.
(3) Goldin I, Chapman WW. Learning to detect negation with `not' in medical texts. Proc ACM: SIGIR 2003.
I. Regular Expressions | II. Types of Negation Phrases | III. NegEx Algorithm | IV. List of Negation Phrases | V. UMLS Terms Considered for Negation
I. Regular Expressions:
NegEx uses two regular expressions that are triggered by three types of negation phrases.
Regular Expression 1: <negation phrase> * <indexed term>
Regular Expression 2: <indexed term> * <negation phrase>
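As an illustration only, the two patterns above could be written as Python regular expressions. This is a minimal sketch, not the authors' implementation: it assumes the asterisk gap allows up to five intervening whitespace-separated words, and that a multi-word UMLS phrase has already been joined into a single token with underscores (e.g., chest_pain). The names `GAP`, `pre_pattern`, and `post_pattern` are hypothetical.

```python
import re

# Assumption: "*" is read as "up to five intervening words".
GAP = r"(?:\S+\s+){0,5}"


def pre_pattern(phrase, term):
    """Regular Expression 1: <negation phrase> * <indexed term>."""
    return re.compile(
        rf"\b{re.escape(phrase)}\s+{GAP}{re.escape(term)}\b", re.IGNORECASE
    )


def post_pattern(term, phrase):
    """Regular Expression 2: <indexed term> * <negation phrase>."""
    return re.compile(
        rf"\b{re.escape(term)}\s+{GAP}{re.escape(phrase)}\b", re.IGNORECASE
    )


# One intervening word ("any") fits inside the gap, so this matches.
print(bool(pre_pattern("denies", "chest_pain").search(
    "The patient denies any chest_pain today")))
# The negation phrase directly follows the term, so this matches too.
print(bool(post_pattern("pneumonia", "was ruled out").search(
    "Pneumonia was ruled out")))
```

A plain regular expression like this does not handle pseudo-negation phrases or conjunctions; those need the additional scope rules of the full algorithm.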
The asterisk (*) represents five words. A word can be a single word or a UMLS phrase.
II. Negation Phrases:
Three types of negation phrases are used by NegEx. Depending on the specific negation phrase in the sentence, an indexed term will be marked as either negated or possible:
(1) Pseudo negation phrases - phrases that look like negation phrases but are not reliable indicators of pertinent negatives. If a pseudo-negation phrase is found, NegEx skips to the next negation phrase.
(2) Pre-UMLS negation phrases - phrases that occur before the term they are negating. Pre-UMLS phrases are used in Regular Expression 1.
(3) Post-UMLS negation phrases - phrases that occur after the term they are negating. Post-UMLS phrases are used in Regular Expression 2.
The list of actual phrases is shown in section IV below.
III. NegEx Algorithm:
For each sentence, find all negation phrases (from the lists described in (IV) below):
Go to the first negation phrase in the sentence (Neg1).
If Neg1 is a pseudo-negation phrase, skip to the next negation phrase in the sentence.
If Neg1 is a pre-UMLS negation phrase, define a window of six words after Neg1 (one UMLS phrase counts as one word - e.g., "chest pain" is considered one word); if Neg1 is a post-UMLS negation phrase, define a window of six words before Neg1.
Decide whether to decrease window size:
A. If another negation phrase (Neg2) is found within the six-word window, decrease the size of the window to end immediately preceding Neg2.
Example: in the sentence "The patient denies [chest_pain and has no shortness_of_breath]", the negation phrases are "denies" (Neg1) and "no" (Neg2), and the UMLS terms considered for negation are chest_pain and shortness_of_breath. The initial window for possible pertinent negatives for Neg1 is contained in brackets. Because another negation phrase (Neg2) also falls within the window, the window is decreased as follows: "The patient denies [chest_pain and has] no shortness_of_breath." According to the algorithm, chest_pain will be negated with Neg1 (denies) and shortness_of_breath will be negated with Neg2 (no).
B. If a conjunction from the conjunction list described in (IV-4) occurs within the six-word window, decrease the size of the window to end immediately preceding the conjunction.
Example: in the sentence "The patient denies [chest_pain but has experienced some shortness_of_breath]", the negation phrase is "denies" (Neg1), the UMLS terms considered for negation are chest_pain and shortness_of_breath, and "but" is a conjunction from the conjunction list. The initial window for possible pertinent negatives for Neg1 is contained in brackets. Because a relevant conjunction also falls within the window, the window is decreased as follows: "The patient denies [chest_pain] but has experienced some shortness_of_breath." According to the algorithm, chest_pain is negated with Neg1 (denies) but shortness_of_breath is not negated.
Mark all UMLS terms within the window that are indexed as findings or diseases (from the list described in (V) below) as either negated (if the negation phrase is a negating phrase) or possible (if the negation phrase is a hypothetical possibility phrase).
Repeat for all negation phrases in the sentence.
Repeat for all sentences.
IV. List of all Negation Phrases Currently Used by NegEx (updated 4/10/03):
(1) Pseudo negation phrases
no increase
no suspicious change
no significant change
no change
no interval change
no definite change
no significant interval change
not extend
not cause
not drain
not certain if
not certain whether
gram negative
without difficulty
not necessarily
not only
(2) Pre-UMLS phrases
A. Negating phrase (used to mark an indexed term as negated):
absence of
cannot
cannot see
checked for
declined
declines
denied
denies
denying
evaluate for
fails to reveal
free of
negative for
never developed
never had
no
no abnormal
no cause of
no complaints of
no evidence
no new evidence
no other evidence
no evidence to suggest
no findings of
no findings to indicate
no mammographic evidence of
no new
no radiographic evidence of
no sign of
no significant
no signs of
no suggestion of
no suspicious
not
not appear
not appreciate
not associated with
not complain of
not demonstrate
not exhibit
not feel
not had
not have
not know of
not known to have
not reveal
not see
not to be
patient was not
rather than
resolved
test for
to exclude
unremarkable for
with no
without
without any evidence of
without evidence
without indication of
without sign of
rules out
rules him out
rules her out
rules the patient out
rules out for
rules him out for
rules her out for
rules the patient out for
ruled out
ruled him out
ruled her out
ruled the patient out
ruled out for
ruled him out for
ruled her out for
ruled the patient out for
ruled out against
ruled him out against
ruled her out against
ruled the patient out against
did rule out
did rule out for
did rule out against
did rule him out
did rule her out
did rule the patient out
did rule him out for
did rule her out for
did rule him out against
did rule her out against
did rule the patient out for
did rule the patient out against
can rule out
can rule out for
can rule out against
can rule him out
can rule her out
can rule the patient out
can rule him out for
can rule her out for
can rule the patient out for
can rule him out against
can rule her out against
can rule the patient out against
adequate to rule out
adequate to rule him out
adequate to rule her out
adequate to rule the patient out
adequate to rule out for
adequate to rule him out for
adequate to rule her out for
adequate to rule the patient out for
adequate to rule the patient out against
sufficient to rule out
sufficient to rule him out
sufficient to rule her out
sufficient to rule the patient out
sufficient to rule out for
sufficient to rule him out for
sufficient to rule her out for
sufficient to rule the patient out for
sufficient to rule out against
sufficient to rule him out against
sufficient to rule her out against
sufficient to rule the patient out against
B. Conditional possibility phrase (used to mark an indexed term as possible):
rule out
r/o
ro
rule him out
rule her out
rule the patient out
rule out for
rule him out for
rule her out for
rule the patient out for
be ruled out for
should be ruled out for
ought to be ruled out for
may be ruled out for
might be ruled out for
could be ruled out for
will be ruled out for
can be ruled out for
must be ruled out for
is to be ruled out for
what must be ruled out is
(3) Post-UMLS phrases
A. Negating phrase (used to mark an indexed term as negated):
unlikely
free
was ruled out
is ruled out
are ruled out
have been ruled out
has been ruled out
B. Conditional possibility phrase (used to mark an indexed term as possible):
did not rule out
not ruled out
not been ruled out
being ruled out
be ruled out
should be ruled out
ought to be ruled out
may be ruled out
might be ruled out
could be ruled out
will be ruled out
can be ruled out
must be ruled out
is to be ruled out
(4) Conjunctions to decrease scope of negation phrase
but
however
nevertheless
yet
though
although
still
aside from
except
apart from
secondary to
as the cause of
as the source of
as the reason of
as the etiology of
as the origin of
as the cause for
as the source for
as the reason for
as the etiology for
as the origin for
as the secondary cause of
as the secondary source of
as the secondary reason of
as the secondary etiology of
as the secondary origin of
as the secondary cause for
as the secondary source for
as the secondary reason for
as the secondary etiology for
as the secondary origin for
as a cause of
as a source of
as a reason of
as a etiology of
as a cause for
as a source for
as a reason for
as a etiology for
as a secondary cause of
as a secondary source of
as a secondary reason of
as a secondary etiology of
as a secondary origin of
as a secondary cause for
as a secondary source for
as a secondary reason for
as a secondary etiology for
as a secondary origin for
as an cause of
as an source of
as an reason of
as an etiology of
as an origin of
as an cause for
as an source for
as an reason for
as an etiology for
as an origin for
as an secondary cause of
as an secondary source of
as an secondary reason of
as an secondary etiology of
as an secondary origin of
as an secondary cause for
as an secondary source for
as an secondary reason for
as an secondary etiology for
as an secondary origin for
cause of
cause for
causes of
causes for
source of
source for
sources of
sources for
reason of
reason for
reasons of
reasons for
etiology of
etiology for
trigger event for
origin of
origin for
origins of
origins for
other possibilities of
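With phrase lists like those above, the windowing procedure from section III can be sketched in Python. This is a hedged illustration under stated assumptions, not the NegEx distribution itself: `LEXICON` holds only a handful of phrases from the lists, UMLS phrases are assumed to be pre-joined with underscores so that each counts as one word, the longest phrase is preferred when several start at the same position, pseudo-negation phrases do not shrink a window, and windows for post-UMLS phrases are shrunk in mirror image of the pre-UMLS rule. All function and variable names are hypothetical.

```python
import re

# Tiny illustrative subset of the section IV lists: phrase -> (type, label).
LEXICON = {
    ("no", "increase"): ("pseudo", None),
    ("not", "only"): ("pseudo", None),
    ("denies",): ("pre", "negated"),
    ("no",): ("pre", "negated"),
    ("rule", "out"): ("pre", "possible"),
    ("was", "ruled", "out"): ("post", "negated"),
    ("be", "ruled", "out"): ("post", "possible"),
    ("but",): ("conj", None),
    ("however",): ("conj", None),
}
MAX_LEN = max(len(p) for p in LEXICON)
WINDOW = 6  # window size in words, as stated in section III


def find_phrases(tokens):
    """Scan left to right; prefer the longest phrase at each position (assumption)."""
    found, i = [], 0
    while i < len(tokens):
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            entry = LEXICON.get(tuple(tokens[i:i + n]))
            if entry:
                found.append((i, i + n, entry[0], entry[1]))
                i += n
                break
        else:
            i += 1
    return found


def negex(sentence, indexed_terms):
    """Return {term: 'negated' | 'possible'} for indexed terms in one sentence."""
    tokens = re.findall(r"[\w/]+", sentence.lower())
    phrases = find_phrases(tokens)
    marks = {}
    for start, end, kind, label in phrases:
        if kind in ("pseudo", "conj"):
            continue  # pseudo-negations and conjunctions never mark terms
        if kind == "pre":
            lo, hi = end, min(len(tokens), end + WINDOW)
            # Shrink the window to end just before the next phrase or conjunction.
            for s, _, k, _ in phrases:
                if k != "pseudo" and lo <= s < hi:
                    hi = s
                    break
        else:
            lo, hi = max(0, start - WINDOW), start
            # Assumption: mirror the rule, starting just after the nearest earlier phrase.
            for _, e, k, _ in reversed(phrases):
                if k != "pseudo" and lo < e <= hi:
                    lo = e
                    break
        for tok in tokens[lo:hi]:
            if tok in indexed_terms:
                marks.setdefault(tok, label)
    return marks


terms = {"chest_pain", "shortness_of_breath", "pneumonia"}
print(negex("The patient denies chest_pain and has no shortness_of_breath", terms))
print(negex("The patient denies chest_pain but has experienced some shortness_of_breath", terms))
print(negex("Rule out pneumonia", terms))
```

On the two example sentences from section III, the sketch marks chest_pain and shortness_of_breath as negated in the first, and only chest_pain in the second, because the conjunction "but" closes the window.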
V. UMLS Terms Considered for Negation:
Input to NegEx is a sentence whose findings and diseases have been indexed. We index findings and diseases by automatically matching strings from the UMLS that could be described as findings or diseases. Approximately 133,550 UMLS phrases belonging to any of the following fourteen semantic types are the findings and diseases we consider for negation:
Finding (e.g., “decreased capillary fragility,” “absent tendon reflex”)
Disease or Syndrome (e.g., “diabetes mellitus,” “dumping syndrome”)
Sign or Symptom (e.g., “dyspnea,” “nausea,” “pain”)
Congenital abnormality (e.g., “cleft palate,” “gastroschisis”)
Acquired abnormality (e.g., “hemorrhoids,” “hernia, femoral,” “varicose veins”)
Lab result (e.g., “forced expiratory volume,” “abnormal skin pH”)
Injury or Poisoning (e.g., “carbon monoxide poisoning,” “frostbite,” “accident caused by bench saw”)
Biologic function (e.g., “obesity endogenous,” “adrenal cortex effects”)
Physiologic function (e.g., “energy expenditure,” “fetal development”)
Mental process (e.g., “anger,” “auditory fatigue”)
Mental or Behavioral dysfunction (e.g., “agoraphobia,” “cyclothymic disorder”)
Cell or Molecular dysfunction (e.g., “DNA damage,” “wallerian degeneration”)
Anatomic abnormality (e.g., “hypertrophy of male breast,” “hand deformities”)
Experimental model of disease (e.g., “alloxan diabetes,” “carcinoma”)
There are two types of exceptions in the UMLS lists that we do not consider for negation:
(1) Irrelevant terms: Some of the terms in the UMLS tables are not actual findings or diseases that we want to negate. Although they appear in one of the fourteen UMLS files, we have removed the following words from the list of phrases that are candidates for pertinent negatives:
ruled out
neg
negative for
negative
none
normal limits
normal
unremarkable
nothing
no change
no complaints
no follow-up nos
not treated for
neg <2>
history
history <3>
source
will
very
presence
(2) Findings or diseases that already contain a negation: Some of the UMLS phrases are internally negated (e.g., the UMLS finding “no chest pain”). Any UMLS phrase that is composed of a negation phrase from List IV and another UMLS phrase has been automatically deleted from the list of phrases considered for negation. A total of 85 such phrases were deleted. For example, the phrase “no chest pain” is not in our list of UMLS phrases, because it is composed of the trigger phrase “no” and another UMLS phrase “chest pain.”
** We have a Python version of NegEx available upon request **
Please email Wendy Chapman with any questions or requests: chapman@cbmi.pitt.edu
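As a rough illustration of exception (2) in section V, internally negated UMLS phrases could be filtered out as follows. This is a sketch under assumptions: the function name `drop_internally_negated` is hypothetical, and a phrase is treated as internally negated only when a negation phrase appears as its prefix or suffix and the remainder is itself in the UMLS phrase list.

```python
def drop_internally_negated(umls_phrases, negation_phrases):
    """Remove phrases made of a negation phrase plus another UMLS phrase."""
    known = set(umls_phrases)
    kept = []
    for phrase in umls_phrases:
        internally_negated = any(
            # Negation phrase first, e.g. "no" + "chest pain".
            (phrase.startswith(neg + " ") and phrase[len(neg) + 1:] in known)
            # Negation phrase last, e.g. "pneumonia" + "unlikely".
            or (phrase.endswith(" " + neg) and phrase[:-len(neg) - 1] in known)
            for neg in negation_phrases
        )
        if not internally_negated:
            kept.append(phrase)
    return kept


phrases = ["chest pain", "no chest pain", "nausea", "pneumonia", "pneumonia unlikely"]
print(drop_internally_negated(phrases, ["no", "unlikely"]))
```

Here "no chest pain" and "pneumonia unlikely" are dropped because the remainder of each is another phrase in the list.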