I am looking for a complete list of ICD-9 codes (medical codes) for diseases and procedures, in a format that can be imported into a database and referenced programmatically. My question is essentially the same as "Looking for resources for ICD-9 codes", but the original poster neglected to mention where exactly he "got ahold of" his complete list.

Google has not helped here. After many hours of searching I have found plenty of rich-text lists (such as the CDC's) and websites where I can drill down to the complete list interactively, but I cannot find the underlying list that populates these websites in a form that can be parsed into a database. I believe the files at ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Publications/ICD9-CM/2009/ have what I am looking for, but they are in Rich Text Format and contain a lot of garbage and formatting that would be difficult to remove accurately.

I know this has to have been done by others, and I am trying to avoid duplicating other people's effort, but I just cannot find an XML/CSV/Excel list.

TJ.
  • You can see the response [here](http://stackoverflow.com/a/1596643/65400) for the discussion of how to format – Aaron Feb 02 '12 at 18:06

4 Answers

The Centers for Medicare & Medicaid Services (CMS) provides Excel files that contain just the codes and diagnoses. These can be imported directly into some SQL databases without conversion.

Zipped Excel files, by version number

(Update: New link based on comment below)

chris

Once the RTF formatting was removed, it wasn't too hard to parse the file and turn it into a CSV. My resulting parsed files, containing all 2009 ICD-9 codes for diseases and procedures, are here: http://www.jacotay.com/files/Disease_and_ProcedureCodes_Parsed.zip The parser I wrote is here: http://www.jacotay.com/files/RTFApp.zip It is a two-step process: first take the files from the CDC FTP site and remove the RTF formatting from them, then select the RTF-free files and parse them into CSV files. The code is pretty rough because I only needed to get the results out once.

Here is the code for the parsing app in case the external links go down. It is the back end to a form that lets you select a filename and click the buttons to run each step.

Public Class Form1

Private Sub btnBrowse_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnBrowse.Click
    Dim p As New OpenFileDialog With {.CheckFileExists = True, .Multiselect = False}
    Dim pResult = p.ShowDialog()
    If pResult = Windows.Forms.DialogResult.Cancel OrElse pResult = Windows.Forms.DialogResult.Abort Then
        Exit Sub
    End If
    txtFileName.Text = p.FileName
End Sub

Private Sub btnGo_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnGo.Click
    Dim pFile = New IO.FileInfo(txtFileName.Text)
    Dim FileText = IO.File.ReadAllText(pFile.FullName)
    FileText = RemoveRTF(FileText)
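    ' Write <name>_fixed<ext> next to the source file. Building the path explicitly avoids
    ' replacing an extension-like substring elsewhere in the path, or overwriting the input when there is no extension.
    IO.File.WriteAllText(IO.Path.Combine(pFile.DirectoryName, IO.Path.GetFileNameWithoutExtension(pFile.Name) & "_fixed" & pFile.Extension), FileText)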

End Sub


Function RemoveRTF(ByVal rtfText As String) As String
    '// Let a RichTextBox control interpret the RTF markup. Note that when the
    '// contents are stored in the string, they are encoded as UTF-16.
    Using rtBox As New System.Windows.Forms.RichTextBox
        rtBox.Rtf = rtfText
        '// The Text property returns the content with all RTF formatting stripped.
        Return rtBox.Text
    End Using
End Function


Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    Dim pFile = New IO.FileInfo(txtFileName.Text)
    Dim FileText = IO.File.ReadAllText(pFile.FullName)
    Dim DestFileLine As String = ""
    Dim DestFileText As New System.Text.StringBuilder

    'A line starting with a code (digits, or E/V followed by a digit) opens a new record.
    'Following lines extend its description until an all-caps heading or the next code is reached.
    FileText = Strings.Replace(FileText, vbCr, "")
    Dim pFileLines = FileText.Split(vbLf)
    Dim CurCode As String = ""
    For Each pLine In pFileLines
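        'Normalize tabs and trim before the empty check, so that whitespace-only
        'lines are skipped instead of making Substring(0, 1) throw below
        pLine = pLine.Replace(ChrW(9), " ").Trim()
        If pLine.Length = 0 Then
            Continue For
        End If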

        Dim NonCodeLine As Boolean = False
        If IsNumeric(pLine.Substring(0, 1)) OrElse (pLine.Length > 3 AndAlso (pLine.Substring(0, 1) = "E" OrElse pLine.Substring(0, 1) = "V") AndAlso IsNumeric(pLine.Substring(1, 1))) Then
            Dim SpacePos As Int32
            SpacePos = InStr(pLine, " ")
            Dim NewCode As String
            NewCode = ""
            If SpacePos >= 3 Then
                NewCode = Strings.Left(pLine, SpacePos - 1)
            End If

            If SpacePos < 3 OrElse Strings.Mid(pLine, SpacePos - 1, 1) = "." OrElse InStr(NewCode, "-") > 0 Then
                NonCodeLine = True
            Else
                If CurCode <> "" Then
                    DestFileLine = Strings.Replace(DestFileLine, ",", "&#44;")
                    DestFileLine = Strings.Replace(DestFileLine, """", "&quot;").Trim
                    DestFileText.AppendLine(CurCode & ",""" & DestFileLine & """")
                    CurCode = ""
                    DestFileLine = ""
                End If

                CurCode = NewCode
                DestFileLine = Strings.Mid(pLine, SpacePos + 1)
            End If
        Else
            NonCodeLine = True
        End If


        If NonCodeLine = True AndAlso CurCode <> "" Then 'Non-code line while a code is open: either extend its description or close the record
            Dim pReg As New System.Text.RegularExpressions.Regex("[a-z]")
            Dim pRegCaps As New System.Text.RegularExpressions.Regex("[A-Z]")
            If pReg.IsMatch(pLine) OrElse pLine.Length <= 5 OrElse pRegCaps.IsMatch(pLine) = False OrElse (Strings.Left(pLine, 3) = "NOS" OrElse Strings.Left(pLine, 2) = "IQ") Then
                DestFileLine &= " " & pLine
            Else 'Is all caps word
                DestFileLine = Strings.Replace(DestFileLine, ",", "&#44;")
                DestFileLine = Strings.Replace(DestFileLine, """", "&quot;").Trim
                DestFileText.AppendLine(CurCode & ",""" & DestFileLine & """")
                CurCode = ""
                DestFileLine = ""
            End If
        End If
    Next

    If CurCode <> "" Then
        DestFileLine = Strings.Replace(DestFileLine, ",", "&#44;")
        DestFileLine = Strings.Replace(DestFileLine, """", "&quot;").Trim
        DestFileText.AppendLine(CurCode & ",""" & DestFileLine & """")
        CurCode = ""
        DestFileLine = ""
    End If

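    'Write <name>_parsed<ext> next to the source file, touching only the final extension
    IO.File.WriteAllText(IO.Path.Combine(pFile.DirectoryName, IO.Path.GetFileNameWithoutExtension(pFile.Name) & "_parsed" & pFile.Extension), DestFileText.ToString)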
End Sub

End Class
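
Note that the parser above does not use standard CSV escaping: it writes each row as `code,"description"` and replaces commas and double quotes inside the description with the HTML entities `&#44;` and `&quot;`. A minimal sketch of reading such a file back and restoring the original characters might look like the following. The function name `read_parsed_codes` is hypothetical, and the sketch assumes the two-column layout produced by the code above.

```python
import csv
import html


def read_parsed_codes(fileobj):
    """Read rows of code,"description" and decode the HTML entities.

    Assumes the layout written by the VB parser above: two columns, with
    commas and quotes in the description stored as &#44; and &quot;.
    """
    rows = []
    for row in csv.reader(fileobj):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        code, description = row[0], row[1]
        # html.unescape turns &#44; back into "," and &quot; back into '"'
        rows.append((code, html.unescape(description)))
    return rows
```

Pass it an open text file (or any iterable of lines) and it returns a list of `(code, description)` tuples ready for a database insert.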

TJ.
  • Could you please mention where you found the codes? – Pranav Shah Mar 17 '11 at 17:56
  • Thank you for this. I've been looking for a set of all codes myself. It's incredible how difficult the government makes it to work with this stuff. – Yuck Apr 27 '11 at 17:27
  • The zipped files linked contain an RTF file, and there seems to be more than one line per code. It was not useful to me. – userJT May 16 '12 at 19:36
  • I used your code to remove the RTF formatting, but the next step did not work properly and left a lot of garbage. So I took a file with the RTF formatting removed and used Notepad++ to extract codes/descriptions using regular expressions. For diagnosis codes the regex is "^(V\d{2}(\.\d{1,2})?|\d{3}(\.\d{1,2})?|E\d{3}(\.\d)?).*$" and for procedure codes it is "^\d{2}(\.\d{1,2})?\t.*". Thanks for your code, it gave me a good start! – mishkin Feb 14 '14 at 15:50
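
The regex approach from the last comment can also be scripted instead of run in Notepad++. The following is a rough sketch only: the code patterns are taken from that comment, but the capture group for the description, the requirement that whitespace follows a diagnosis code, and the names `DIAG_RE`, `PROC_RE` and `extract_codes` are assumptions added for illustration. It keeps only the first line of each description, so multi-line entries would need extra handling.

```python
import re

# Code patterns from the comment above; the trailing description group is an
# added assumption. Requiring whitespace after a diagnosis code also skips
# range headings such as "001-009 ...".
DIAG_RE = re.compile(
    r"^(V\d{2}(?:\.\d{1,2})?|\d{3}(?:\.\d{1,2})?|E\d{3}(?:\.\d)?)\s+(.+)$"
)
# Procedure codes: two digits, optional decimals, then a tab (as in the comment).
PROC_RE = re.compile(r"^(\d{2}(?:\.\d{1,2})?)\t(.+)$")


def extract_codes(text, pattern):
    """Return (code, description) pairs for every line matching the pattern."""
    rows = []
    for line in text.splitlines():
        match = pattern.match(line.strip())
        if match:
            rows.append((match.group(1), match.group(2).strip()))
    return rows
```

Call `extract_codes(plain_text, DIAG_RE)` on the RTF-free diagnosis file and `extract_codes(plain_text, PROC_RE)` on the procedure file.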

The Centers for Medicare & Medicaid Services (CMS) is actually the agency charged with ICD, so I think the CDC versions you reference may just be copies or reprocessed copies. Here is the (hard to find) Medicare page, which I think contains the original raw data (the "source of truth").

http://www.cms.gov/Medicare/Coding/ICD9ProviderDiagnosticCodes/codes.html

As of this post, the latest version appears to be v32. The zip you download contains four plain-text files that map code to description (one file for every combination of DIAG|PROC and SHORT|LONG). It also contains two Excel files (one each for DIAG|PROC) with three columns, which map each code to both its long and short descriptions.
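
As a rough illustration of getting one of those plain-text files into a database, here is a minimal sketch using SQLite. It assumes each line holds a code, then whitespace, then the description; check the actual files before relying on that. The table name `icd9_codes` and the function names are made up for this example.

```python
import sqlite3


def parse_code_lines(lines):
    """Split each line into (code, description) at the first run of whitespace."""
    rows = []
    for line in lines:
        parts = line.strip().split(None, 1)
        if len(parts) == 2:  # skip blank lines and lines without a description
            rows.append((parts[0], parts[1].strip()))
    return rows


def load_codes(conn, rows):
    """Create the (hypothetical) icd9_codes table if needed and insert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS icd9_codes ("
        "code TEXT PRIMARY KEY, description TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO icd9_codes (code, description) VALUES (?, ?)",
        rows,
    )
    conn.commit()
```

To use it, open one of the text files, pass the file object to `parse_code_lines`, and hand the result to `load_codes` together with a `sqlite3` connection. Diagnosis and procedure codes may overlap, so separate tables (or an extra type column) would likely be needed to hold both.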

Benny
  • Looks like I found a fellow gravedigger. I noticed that the answer involved a slew of parsing, and I realized I have used the raw sets before, so where did I get them? I was just about to post... and saw your link. (visited!) – DoverAudio Nov 29 '14 at 23:38
  • If all you want are the ICD-9-CM codes, then this CMS zip has all you need, but if you would like sub-category names, i.e. those codes which represent groups of other codes, then you're out of luck. For this, you need the RTF and awkward parsing. Incidentally, the descriptions often differ between the RTF and CDC versions. I think it's fair to say the whole thing is a mess, designed for printing and reading, not automated parsing. There is XML for ICD-10. – Jack Wasey Oct 11 '15 at 01:59

You can get the original RTF code files here: http://ftp.cdc.gov/pub/Health_Statistics/NCHS/Publications/ICD9-CM/2009/