ICD-10 PCS

\b[0-9BCDFGHX][0-9A-HJ-NP-Z]{6}\b

False positives exist! E.g. “G0BBLDG” would be identified as an ICD-10 PCS code by this expression even though it isn’t one.
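A minimal sketch of one way to deal with such false positives: treat the regex as a candidate filter and then check each candidate against a set of known codes. The names `PCS_PATTERN`, `KNOWN_PCS_CODES`, and `find_pcs_codes` are hypothetical, and the two-entry set is only an illustrative placeholder for the full CMS code list.

```python
import re

# Same expression as above, compiled once for reuse.
PCS_PATTERN = re.compile(r"\b[0-9BCDFGHX][0-9A-HJ-NP-Z]{6}\b")

# Placeholder set (assumption): in practice this would be loaded
# from the full CMS code file rather than hard-coded.
KNOWN_PCS_CODES = {"0016070", "0DTJ4ZZ"}


def find_pcs_codes(text, known_codes=KNOWN_PCS_CODES):
    """Return regex candidates that also appear in the known-code set."""
    return [c for c in PCS_PATTERN.findall(text) if c in known_codes]


text = "Procedure 0DTJ4ZZ was performed; G0BBLDG is not a real code."
candidates = PCS_PATTERN.findall(text)   # shape-only matches
confirmed = find_pcs_codes(text)         # matches that are also known codes
print(candidates)
print(confirmed)
```

The regex alone reports both strings as candidates; the set lookup drops the one that only has the right shape.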

But no false negatives (tested against all 2023 CMS-approved codes):

In [13]: import re
In [14]: with open("icd10pcs_codes_2023.txt") as pcs:
    ...:     pcs_codes = pcs.readlines()
    ...:
In [15]: len(pcs_codes)
Out[15]: 78530
In [16]: matches = 0
In [17]: for line in pcs_codes:
    ...:     if re.search(r'\b[0-9BCDFGHX][0-9A-HJ-NP-Z]{6}\b', line):
    ...:         matches += 1
    ...:
In [18]: matches
Out[18]: 78530
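The session above uses `re.search` on each whole line, so if the code file also carries descriptions, a match anywhere in the line would count. A stricter variant, sketched here under the assumption that the code is the first whitespace-delimited token on each line, applies `fullmatch` to that token only. The function name `count_code_matches` and the sample lines are hypothetical.

```python
import re

# The PCS expression without \b anchors, since fullmatch anchors both ends.
PCS_FULL = re.compile(r"[0-9BCDFGHX][0-9A-HJ-NP-Z]{6}")


def count_code_matches(lines, pattern=PCS_FULL):
    """Count lines whose first token fully matches the pattern.

    Returns (matched, total), where total ignores blank lines.
    """
    matched = 0
    total = 0
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        total += 1
        if pattern.fullmatch(fields[0]):
            matched += 1
    return matched, total


# Illustrative input only; "0DTJ4IZ" contains the excluded letter I.
sample = ["0016070\n", "0DTJ4ZZ placeholder description\n", "\n", "0DTJ4IZ\n"]
result = count_code_matches(sample)
print(result)
```

With a real code file, the same function could be called as `count_code_matches(open(path))`, and a result where `matched == total` would correspond to zero false negatives.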

ICD-10 CM

\b[A-TV-Z]\d[A-Z\d]\.?[A-Z\d]{0,4}\b

This follows the specification*:

3 - 7 characters
Character 1 is alpha (all letters except U are used)
Character 2 is numeric
Characters 3 - 7 are alpha or numeric
Use of decimal after 3 characters
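The rules above can be spot-checked against a few strings. This is a small sketch, not a validator: it only tests the shape of a string, and the helper name `is_cm_shaped` and the sample inputs are made up for illustration. `fullmatch` is used so the whole string has to fit the pattern.

```python
import re

# The CM expression without \b anchors, since fullmatch anchors both ends.
CM_FULL = re.compile(r"[A-TV-Z]\d[A-Z\d]\.?[A-Z\d]{0,4}")


def is_cm_shaped(code):
    """True if the string has the shape described by the rules above."""
    return CM_FULL.fullmatch(code) is not None


samples = [
    "A00",        # minimum length: 3 characters
    "A00.0",      # decimal after the third character
    "A000",       # same code written without the decimal
    "S72.001A",   # maximum length: 7 characters plus the decimal
    "U07.1",      # starts with U, which the pattern excludes
    "1A0",        # character 1 is not alpha
    "A0",         # too short
    "S72.001AB",  # too long
]
results = {s: is_cm_shaped(s) for s in samples}
for s, ok in results.items():
    print(f"{s:10} {ok}")
```

The first four samples pass and the last four are rejected, one for each rule in the list.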

This was also validated by brute force (against the 2024 code list), though there were 3 false negatives:

In [1]: import re
In [2]: with open("icd10cm_codes_2024.txt") as cm:
   ...:     icd10cm = cm.readlines()
   ...:
In [3]: len(icd10cm)
Out[3]: 74044
In [5]: matches = 0
In [6]: for line in icd10cm:
   ...:     if re.search(r'\b[A-TV-Z]\d[A-Z\d]\.?[A-Z\d]{0,4}\b', line):
   ...:         matches += 1
   ...:
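To see which codes are missed rather than only counting matches, the loop can collect the non-matching lines instead. The sketch below uses a hypothetical helper `find_misses` and three made-up sample lines. Because the pattern excludes U as a first character, any code beginning with U (U07.1, for example) is missed by construction; whether such codes account for the three false negatives is not confirmed here.

```python
import re

# Same expression as in the session above.
CM_PATTERN = re.compile(r"\b[A-TV-Z]\d[A-Z\d]\.?[A-Z\d]{0,4}\b")


def find_misses(lines, pattern=CM_PATTERN):
    """Return the stripped lines in which the pattern finds no match."""
    return [line.strip() for line in lines if not pattern.search(line)]


# Illustrative input only, written without decimals.
sample_lines = ["A000\n", "S72001A\n", "U071\n"]
misses = find_misses(sample_lines)
print(misses)
```

Running `find_misses(icd10cm)` on the list loaded in the session would print the actual false negatives for inspection.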

adosib/icd10_regex.md
