Data is Not Information

Data is not information. That phrase, which I owe to a podcast I’ve long since forgotten, conveys something fundamental about making data a useful component of an analytics or machine learning application. Namely, it is not enough to collect it, clean it up, and standardize it. We need to do something else. We need to extract the information.

To give you a practical example of what that can look like, I pulled a sample of claimants who received arthroscopic knee surgery. I marked all of their diagnoses as belonging to a surgery recipient (with a 1), and I marked all the diagnoses of other claimants as belonging to a non-recipient (with a 0). Then I grouped the records by diagnosis code and ranked the codes by average surgical incidence, highest first. The result looks like this:

But it turns out that many of these diagnoses were not recorded in reasonable proximity to the claimant's surgery: some were recorded months earlier, others not until afterward. So I marked those diagnoses as 0s. Yes, the claimant was associated with the diagnosis, but it does not seem to be the driver of the surgery. Performing the same grouping with my revised data, I get this result:

So that’s similar, but various meniscus tears vault to the top of the list. We now have a truer sense of the diagnoses likely to precede arthroscopic knee surgery.
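As a rough illustration of the marking, proximity filter, and grouping described above, here is a minimal Python sketch. Everything in it is hypothetical: the toy records, the diagnosis labels, the function name `incidence_by_diagnosis`, and especially the 90-day window, since the post does not say what "reasonable proximity" means in practice.

```python
from collections import defaultdict
from datetime import date

# Hypothetical toy records: (claimant_id, diagnosis, diagnosis_date).
DIAGNOSES = [
    ("A", "medial meniscus tear", date(2023, 1, 10)),
    ("A", "low back pain", date(2022, 3, 1)),
    ("B", "medial meniscus tear", date(2023, 2, 1)),
    ("B", "knee pain", date(2023, 2, 5)),
    ("C", "low back pain", date(2023, 1, 15)),
    ("C", "knee pain", date(2023, 4, 1)),
    ("D", "knee pain", date(2023, 3, 1)),
]

# Hypothetical surgery dates; claimants not listed had no surgery.
SURGERIES = {"A": date(2023, 2, 1), "B": date(2023, 3, 1)}


def incidence_by_diagnosis(diagnoses, surgeries, window_days=None):
    """Rank diagnoses by average surgery flag, highest first.

    With window_days=None, every diagnosis of a surgery recipient is a 1.
    With a window, a diagnosis stays a 1 only if it was recorded at most
    window_days before the surgery (and not after it).
    """
    flags = defaultdict(list)
    for claimant, dx, dx_date in diagnoses:
        surgery_date = surgeries.get(claimant)
        flag = 1 if surgery_date is not None else 0
        if flag and window_days is not None:
            days_before = (surgery_date - dx_date).days
            if not 0 <= days_before <= window_days:
                flag = 0  # too early, or recorded after the surgery
        flags[dx].append(flag)
    rates = {dx: sum(f) / len(f) for dx, f in flags.items()}
    # Sort by rate descending, then by name so ties are deterministic.
    return sorted(rates.items(), key=lambda item: (-item[1], item[0]))


naive = incidence_by_diagnosis(DIAGNOSES, SURGERIES)
windowed = incidence_by_diagnosis(DIAGNOSES, SURGERIES, window_days=90)

for label, table in (("naive", naive), ("windowed", windowed)):
    print(label)
    for dx, rate in table:
        print(f"  {dx:<22} {rate:.2f}")
```

In this toy data, "low back pain" looks moderately associated with surgery in the naive ranking only because claimant A had it nearly a year before the operation; once the window is applied, its incidence drops to zero.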

But still, this has a problem. The person with a torn meniscus probably does not have one diagnosis in isolation. So, we can refine our analysis by taking the claimants associated with a torn medial meniscus and setting them aside. Now, only those claimants without an association to a torn medial meniscus remain in our data. We repeat the grouping and find a sprain of the cruciate ligament is the next most likely to be associated with the surgery. We remove those claimants; now only claimants without either a torn medial meniscus or sprain of cruciate ligament remain. We keep going until a stopping criterion is reached. Now we see the following:

Comparing the first table to the last reveals a fairly different set of diagnoses. But the latter set much better reflects the diagnoses that are associated with the procedure. That is the product of applying our assumption that for a diagnosis to be relevant to the surgery, it should be invoked in reasonable proximity, and then trying to isolate the impact of each diagnosis. That is turning data into information.
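The set-aside-and-repeat procedure can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the exact method used for the tables: the toy claimants, the name `peel_diagnoses`, and both stopping thresholds (`min_rate` and `min_claimants`) are hypothetical, because the post does not specify its stopping criterion.

```python
from collections import defaultdict

# Hypothetical toy data: claimant -> diagnoses that already passed the
# proximity filter. Labels are illustrative, not real codes.
CLAIMANT_DX = {
    "A": {"medial meniscus tear", "knee pain"},
    "B": {"medial meniscus tear", "cruciate ligament sprain"},
    "C": {"cruciate ligament sprain", "knee pain"},
    "D": {"knee pain"},
    "E": {"knee pain", "low back pain"},
    "F": {"low back pain"},
    "G": {"cruciate ligament sprain"},
}
HAD_SURGERY = {"A", "B", "C"}


def peel_diagnoses(claimant_dx, had_surgery, min_rate=0.5, min_claimants=2):
    """Repeatedly pick the diagnosis with the highest surgery rate,
    then set aside every claimant who has that diagnosis.

    Stops (assumed criterion) when no diagnosis with at least
    min_claimants remaining claimants reaches min_rate.
    """
    remaining = dict(claimant_dx)
    selected = []
    while remaining:
        flags = defaultdict(list)
        for claimant, dxs in remaining.items():
            for dx in dxs:
                flags[dx].append(1 if claimant in had_surgery else 0)
        candidates = [
            (sum(f) / len(f), dx)
            for dx, f in flags.items()
            if len(f) >= min_claimants
        ]
        if not candidates:
            break
        # Highest rate first; ties broken alphabetically.
        rate, best = min(candidates, key=lambda c: (-c[0], c[1]))
        if rate < min_rate:
            break
        selected.append((best, rate))
        # Set aside every claimant associated with the chosen diagnosis.
        remaining = {c: d for c, d in remaining.items() if best not in d}
    return selected


for dx, rate in peel_diagnoses(CLAIMANT_DX, HAD_SURGERY):
    print(f"{dx:<26} {rate:.2f}")
```

On this toy data, "knee pain" starts with a 50% surgery rate, but every knee-pain claimant who had surgery also had a meniscus tear or a cruciate ligament sprain. Once those claimants are set aside, knee pain no longer qualifies, which is the isolating effect the procedure is meant to achieve.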

Identifying the diagnoses most strongly associated with arthroscopic knee surgery narrows the universe of claimants for whom that surgery is likely and, coupled with predictive modeling, shines a light on the most probable claims within it. Because these procedures are not inevitable and are frequently over-prescribed, we can highlight the claims where a surgery may be forthcoming and allow you to be proactive about that possibility in a way that benefits all parties.

In real case studies, our clients have acted on our surgical forecasts, and many claimants have been successfully treated non-invasively.

For more information:

Visit our Website

Request a Consultation

Follow us on LinkedIn


