CGM accuracy – Calibration is King!

Over the years I’ve spent some time living with multiple CGM systems:

  • Dexcom G4 and G5
  • Medtronic Guardian2/Enlite (640G pump)
  • FreeStyle Libre
  • FreeStyle Libre with Spike/xDrip+

Incidentally, for the purposes of this technical discussion the Libre is a Continuous Glucose Monitor. Its lack of alarms and the like does not change this, even if “Flash” monitoring doesn’t have the same end features as other CGMs.

I think it’s worth discussing the accuracy of these. Manufacturers like to quote “MARD” (Mean Absolute Relative Difference: an indication of how far the readings are from a lab reference on average), and we see various people in peer-support forums saying things like “I love brand X: it’s always accurate” or “brand Y was terrible: often way off.”
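As a rough illustration of what a MARD figure summarises, here is a minimal Python sketch. It assumes the usual definition (the average of |CGM − reference| / reference over paired readings, expressed as a percentage). The function name `mard` and the paired readings are invented for the example and do not come from any manufacturer's data.

```python
def mard(cgm_readings, reference_readings):
    """Mean Absolute Relative Difference between paired readings, in percent."""
    if len(cgm_readings) != len(reference_readings) or not cgm_readings:
        raise ValueError("need two equal-length, non-empty lists")
    relative_errors = [
        abs(cgm - ref) / ref
        for cgm, ref in zip(cgm_readings, reference_readings)
    ]
    return 100.0 * sum(relative_errors) / len(relative_errors)


# Made-up paired readings in mmol/L: CGM value vs reference value.
cgm = [5.5, 9.0, 4.0]
ref = [5.0, 10.0, 4.0]
print(f"MARD = {mard(cgm, ref):.1f}%")
```

Note that a single averaged number like this hides where the errors occur: two systems with the same MARD can behave very differently at the low and high ends.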

Unfortunately it’s hard to make objective judgements when you’re only dealing with one CGM.

As well as using some of these systems individually, I’m currently part-way through an exercise where I’m wearing three systems concurrently. This has meant I can compare them throughout the day, not just when I decide to do a fingerprick.

I plan to write up the results of this in a later article once it’s complete, but I can at least make some comments already.

Any CGM can only be as good as its calibration

I’ve discussed in the past (after an exercise where I wore Libre and Dexcom G5 systems concurrently) how the Libre Reader’s algorithms don’t always match fingerpricks, but when you feed the raw data into a CGM app such as Spike or xDrip+ and apply calibrations, it can be similar to the Dexcom in accuracy. See “How accurate is the Libre Reader?”.

The Dexcom G6 promises no need for calibrations (but does allow you to apply calibrations if you need to) and feedback from users in other countries is very positive. But in this discussion I’m comparing the following systems:

  • Dexcom G5 (using the transmitter’s internal calibration)
  • Libre through xDrip+ (using a Nightrider device as the interface, but the results are the same whether you use LibreAlarm, MiaoMiao, BlueReader, etc). The same applies to a Dexcom G5 read through xDrip+. Spike’s calibration routines are similar to xDrip+’s.
  • Medtronic 640G (Enlite) CGM

I have been calibrating these systems at the same time, with the same fingerprick value (obtained through Contour Next One or Accu-Chek Guide meters, as discussed in “Do you trust your meter?”). The BG meter is not guaranteed to be 100% accurate, but it’s important to use a device that’s as accurate as possible. Any error in the BG reading can be amplified in the CGM calibration.

How does calibration work?

There’s lots of research and science that goes into these algorithms, and what I’m going to describe is very much a simplification of the calibration routines in these systems. But the basic concept applies to all, so it’s worth getting your head around it.

The sensor returns a “raw” value, and the calibration routine has to translate this into an equivalent glucose value. Generally as the raw number increases so does the glucose number, but the trick is to work out at what rate. Each sensor can return slightly different ranges of values, and in fact during the life of the sensor the results can change (e.g. as the chemicals in the sensor are used up).

Consider this graph, with several calibration points and a line that the calibration algorithm has chosen to fit them:

Using this line, if the sensor returns a raw value of 50 (on the left side of the graph) it’s obvious that the CGM is going to translate that to just under 3 mmol/L.

The line isn’t always going to be a perfect fit for the points recorded, but hopefully they’ll be fairly close to the line:
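To make the idea concrete, here is a minimal Python sketch that fits a straight line to several calibration points using ordinary least squares. This is an assumption for illustration only: the real algorithms in these systems are more elaborate and not public in detail. The raw values, the calibration points and the names `fit_line` and `to_glucose` are all invented, chosen so that a raw value of 50 lands just under 3 mmol/L.

```python
def fit_line(points):
    """Least-squares fit of glucose = slope * raw + offset.

    points: list of (raw_value, fingerprick_mmol_per_l) pairs.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two calibration points")
    mean_raw = sum(raw for raw, _ in points) / n
    mean_bg = sum(bg for _, bg in points) / n
    sxx = sum((raw - mean_raw) ** 2 for raw, _ in points)
    sxy = sum((raw - mean_raw) * (bg - mean_bg) for raw, bg in points)
    if sxx == 0:
        raise ValueError("calibration points must have different raw values")
    slope = sxy / sxx
    offset = mean_bg - slope * mean_raw
    return slope, offset


def to_glucose(raw, slope, offset):
    """Translate a raw sensor value into a glucose estimate."""
    return slope * raw + offset


# Hypothetical calibration points: none sits exactly on the fitted line.
calibrations = [(70, 4.0), (90, 4.8), (130, 7.0), (170, 8.8)]
slope, offset = fit_line(calibrations)
print(f"slope={slope:.4f} offset={offset:.2f}")
print(f"raw 50 -> {to_glucose(50, slope, offset):.1f} mmol/L")
```

The points are spread over a wide range of raw values, which is what pins down the slope of the line.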

But what if we only have one calibration point? How does the algorithm decide where the line goes? How steep is it? It has to guess.

In this example the chosen slope has ended up a bit less steep than before, and now a raw value of 50 would be translated to a glucose value of over 3 mmol/L.
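One simple way an algorithm could "guess" is to assume a default slope and pass the line through the single point it has. The Python sketch below does exactly that. The default slope, the raw values and the "true" line are invented for illustration and are not taken from any of these systems.

```python
ASSUMED_SLOPE = 0.045   # hypothetical default guess, mmol/L per raw unit
TRUE_SLOPE = 0.05       # hypothetical "real" sensor behaviour
TRUE_OFFSET = 0.4


def single_point_line(raw, bg, assumed_slope=ASSUMED_SLOPE):
    """Line with a guessed slope that passes through one calibration point."""
    offset = bg - assumed_slope * raw
    return assumed_slope, offset


def to_glucose(raw, slope, offset):
    """Translate a raw sensor value into a glucose estimate."""
    return slope * raw + offset


# One calibration: raw value 92 matched with a fingerprick of 5.0 mmol/L.
slope, offset = single_point_line(92, 5.0)

for raw in (92, 50, 200):
    guessed = to_glucose(raw, slope, offset)
    actual = to_glucose(raw, TRUE_SLOPE, TRUE_OFFSET)
    print(f"raw {raw}: guessed {guessed:.2f}, actual {actual:.2f}, "
          f"error {guessed - actual:+.2f}")
```

At the calibration point itself the two lines agree. The error grows the further the raw value is from that point, in either direction.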

Calibration is rarely accurate until you have multiple samples, at different points on the graph!

After I supply the initial calibration for a sensor, I need to remind myself that until I supply more calibration points, the further away from that calibration level I am, the less accurate the CGM is likely to be. Once I enter a second point it improves a lot, and a third point usually improves it further.

“Good” calibrations are essential

But just adding a second point doesn’t necessarily fix the problem. Any error in the value can in fact make things worse. Consider this graph, where two points of 4.9 and 5.1 have resulted in a line almost identical to the one I initially showed.

But there’s a time lag between a blood glucose value and when that flows through to the interstitial fluid the CGM is measuring. If my BG was rising/falling when I sampled it, the value measured by the CGM won’t be a good match. And our BG meters can introduce their own errors through lack of precision. Hopefully my fingers were clean as well.

Also keep in mind that some of these systems can take 5-15 minutes to process the calibration. If your glucose levels change dramatically in that time this can introduce further errors.

Consider this graph where each of those points was off by only 0.1 mmol/L, a difference so small that we would generally regard it as insignificant.

It should be immediately obvious that any raw value of 50 would now be translated into a glucose value of 1 mmol/L (instead of just under 3). It’s very annoying being woken up by hypo alarms when you’re not actually low! At the same time, high values will be exaggerated with this graph.
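The arithmetic behind this is easy to reproduce. The Python sketch below draws a line through two closely spaced calibration points, then nudges each point by 0.1 mmol/L in opposite directions. The raw values (90 and 94) and the helper names are hypothetical, picked so the results roughly match the figures quoted above.

```python
def line_through(p1, p2):
    """Slope and offset of the straight line through two (raw, bg) points."""
    (raw1, bg1), (raw2, bg2) = p1, p2
    slope = (bg2 - bg1) / (raw2 - raw1)
    offset = bg1 - slope * raw1
    return slope, offset


def to_glucose(raw, slope, offset):
    """Translate a raw sensor value into a glucose estimate."""
    return slope * raw + offset


good = line_through((90, 4.9), (94, 5.1))
skewed = line_through((90, 4.8), (94, 5.2))  # each point off by 0.1 mmol/L

for raw in (50, 92, 150):
    print(f"raw {raw}: good line {to_glucose(raw, *good):.1f}, "
          f"skewed line {to_glucose(raw, *skewed):.1f} mmol/L")
```

Because the two points are so close together, a 0.1 mmol/L error in each doubles the slope. Midway between the points the two lines still agree, but lows read lower and highs read higher the further you move away.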

Incidentally, my observations of the “calibration-free” Abbott algorithm in LibreLink and the Libre Reader are that it seems to result in a steeper-than-actual calibration slope. Both hypos and hypers are usually exaggerated (which is probably safer than hiding them).

Older calibrations are ignored

As the sensor ages the calibration will usually vary (some systems more than others) so fresh calibrations are important.

The details of this vary between systems, but this seems a reasonable summary:

  • The Medtronic Guardian2 CGM only uses the last four calibration points.
  • The Dexcom G5 uses more (at least the last five).
  • xDrip+ uses as many as you give it. By default it ignores calibrations older than a day, but you can allow it to use older points if there aren’t enough new ones. You can see the calibration points on a graph (similar to those shown here) and easily judge if a recent point was a long way away from the line. And you can delete bad points.

The Medtronic and Dexcom systems do not allow you to easily identify or delete a bad calibration point. The most straightforward solution is to work the bad point out of the system by supplying 3 (Dexcom) or 4 (Medtronic) new calibrations. Again ideally at varying BG levels, but each when your levels were stable. This can be a pain!
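The "work the bad point out" approach can be pictured as a sliding window over the calibration history. The Python sketch below is only a model of that idea, assuming a system that keeps the newest N points and ignores anything older than a maximum age. The function name, the timestamps and the values are invented, and this is not code from any of these products.

```python
from datetime import datetime, timedelta


def active_calibrations(calibrations, now, max_points, max_age):
    """Return the newest `max_points` calibrations no older than `max_age`.

    calibrations: list of (timestamp, raw_value, fingerprick_mmol_per_l).
    """
    if max_points < 1:
        raise ValueError("max_points must be at least 1")
    fresh = [c for c in calibrations if now - c[0] <= max_age]
    fresh.sort(key=lambda c: c[0])  # oldest first
    return fresh[-max_points:]


now = datetime(2024, 1, 2, 12, 0)  # arbitrary example time
bad = (now - timedelta(hours=10), 90, 7.5)  # a suspect fingerprick
history = [
    (now - timedelta(hours=30), 88, 4.8),   # older than a day: ignored
    bad,
    (now - timedelta(hours=8), 75, 4.2),
    (now - timedelta(hours=6), 120, 6.4),
    (now - timedelta(hours=4), 150, 7.9),
]

window = dict(max_points=4, max_age=timedelta(days=1))
print("bad point in use:", bad in active_calibrations(history, now, **window))

# A fourth new calibration pushes the bad point out of a four-point window.
history.append((now - timedelta(hours=2), 100, 5.4))
print("bad point in use:", bad in active_calibrations(history, now, **window))
```

With a four-point window it takes four new calibrations before the bad one stops influencing the result, which is why flushing it out is such a chore.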

For the Dexcom G5, if you provide it with 3 calibrations spaced 15 minutes apart, that’s supposed to reset the calibration from scratch. That takes 45 minutes (if you don’t stuff it up) so some people prefer to stop then start the sensor and cope with a 2-hour outage instead.

Summary

All these CGM systems can be accurate. But only if they’re calibrated well. Poor calibrations will stuff up any system! Unfortunately this is a major “pain point” of current CGM technology.

But if you keep these concepts in mind you can turn most CGMs into very useful tools.

  • Only calibrate when your levels are flat, and likely to remain flat for a while.
    Don’t get stuck into a meal immediately after applying a calibration test.
  • Keep in mind that the systems need more than one calibration point to be able to extrapolate well.
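The first rule can be turned into a quick check before reaching for the meter. The Python sketch below looks at the last few CGM readings and only calls the trend flat if they stay within a small band. The 0.5 mmol/L band, the minimum of three readings and the function name `is_flat` are assumptions for illustration, not clinical guidance, and the check only looks backwards: it cannot know about the meal you are about to eat.

```python
def is_flat(readings, max_change=0.5, min_readings=3):
    """True if recent readings stay within `max_change` mmol/L of each other.

    readings: list of (minutes_elapsed, glucose_mmol_per_l), oldest first,
    covering roughly the last 15 minutes.
    """
    values = [bg for _, bg in readings]
    if len(values) < min_readings:
        return False  # not enough data to judge the trend
    return max(values) - min(values) <= max_change


steady = [(0, 5.4), (5, 5.5), (10, 5.4), (15, 5.6)]
rising = [(0, 5.4), (5, 6.0), (10, 6.7), (15, 7.3)]

print("steady trend, OK to calibrate:", is_flat(steady))
print("rising trend, OK to calibrate:", is_flat(rising))
```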

My own closed-loop pump system makes insulin dosing decisions off the values coming out of my CGM. I do have a lot of trust in the system, but I know that I need to keep an eye on the calibration.

12 thoughts on “CGM accuracy – Calibration is King!”

    1. Great article. I’m new to using a libre with miaomiao2 and xdrip+

      Had trouble these first 2 days and still need help to understand, as I thought that on the first day you couldn’t calibrate more than once. Reading this I may be mistaken, as xDrip+ is still a point and a half off the actual reading.

      Would recalibrating, say with xDrip+ at 5.1 and actual BG at 6.7, damage xDrip+’s software or anything?

      Your advice most welcome
      Best regards

      1. David Burren

        It won’t damage the software. It needs to be at least an hour after the last calibration to be entered as a new one (instead of overriding the previous).

        You can see the calibration points in xDrip+’s Calibration Graph.
        And if you’ve enabled viewing the datatables in the settings you can disable incorrect calibration points via the Calibration Datatable.

  1. Thanks for this post. Very illustrative. Keep posting.

    I’m starting to compare LibreLink & LibreReader with the glucose levels reported through xDrip+ using MiaoMiao. I’m calibrating xDrip+ with BG finger pricks. My experience so far is that the xDrip+ readings are much more accurate than those from the LibreReader, which are generally ‘higher’, both for hypos and hypers.

    This behaviour doesn’t match your observation above: “Incidentally, my observations of the “calibration-free” Abbott algorithm in LibreLink and the Libre Reader is that it seems to result in a steeper-than-actual calibration slope. Both hypos and hypers are usually exaggerated (which is probably safer than hiding them).”

    If the calibration slope in Libre is steeper than the actual (BG) calibration, this would lead to higher glucose values being shown for both hypos and hypers given the same raw input data. I don’t understand why you say this. Steeper calibration slopes would exaggerate hyper values (which is not good) but also hypo values, which is even worse.

    1. A steeper slope can result in higher highs AND lower lows.
      Consider a see-saw that pivots around a point near the centre of the graph. When on one side it goes higher, on the other side it goes lower.
      The simplistic graphs I showed here can be defined by two values. The “slope” (angle) of the line, and the vertical position (offset).

      1. Got it! “Consider a see-saw that pivots around a point near the centre of the graph” –> Yes, that is what I thought once I wrote you.
        On the other hand, I already have two weeks of observation of Libre Reader vs Libre/xDrip+ via MiaoMiao and, indeed, I am starting to see quite considerable differences in the readings when low or high: the hypers are higher in the Libre Reader than in Libre/xDrip+, and the hypos are the other way around (lower in the Libre Reader than in Libre/xDrip+).
        As you very well mentioned in another post, either you should have only one reader or three, but not two 😉

  2. xDrip+ delays applying calibration by 15 minutes. This means calibration points generally match the interstitial reading when it’s finally applied. Because of this it’s safer to calibrate if your trend is not exactly flat (although it’s still recommended). This also should result in a better calibration profile in most circumstances. The only exception is the initial calibration is applied immediately, which is frustrating, so after a few more calibration points I look to see if it makes sense to delete these initial points.

    Being able to delete *bad* calibration points is a useful feature.

    Also, xDrip uses calibration points from the previous 3 days, not one day as you said in the article. But as you indicated it could be set to use more.

    1. And from what I know, this 15 minute delay is not necessarily the correct behaviour. CGM latency is often quoted as 15 minutes, but that number refers to the output from the official CGM receivers and crucially includes the processing done to the readings, which delays the output by one reading. Papers on this subject indicate the interstitial glucose is only 5-10 minutes late from the blood glucose and thus on systems that use the true RAW data from the sensor, 10 minutes (2 readings) delay would be more accurate.

    1. In xDrip+’s “Less common settings” section you can tell it to “Show Datatables”.
      New options then appear in the main menu. In the “Calibration Data Table” you can long-press on a calibration and disable it.

  3. Hello,

    Thank you for the article and the useful info! I’m very new to this and I’m still a bit confused about how often I should calibrate xDrip+ by finger pricks with my FreeStyle Libre & MiaoMiao2?

    The whole reason I got into Libre and MiaoMiao was to reduce the amount of finger pricking, but I feel like now I have to go back to doing it once every day/two days to get accurate readings? It was easier with the Libre by itself then because even if it wasn’t entirely accurate, it was a good indicator of whether my blood sugar is low, in range, or high. Now, my xDrip shows a reading of 65 while the sensor is 130 and finger pricking is 127…

    Can I just calibrate with the values I get from the Libre app? I wouldn’t mind xDrip reporting those exact values. I don’t get why we need a whole other algorithm for that app? I personally can do with only using the readings from the freestyle libre reported on xDrip via MiaoMiao

    1. Calibrating every day or two seems reasonable with Libre. Some people go longer, but the Libre can go wonky and if you don’t test you won’t know.
      A few extra calibrations to start with of course (at two or more different BG levels). And if you stuff up the calibrations it can take a few more pricks to clean things up. Your xDrip=65, fingerprick=127 seems to be a sign of stuffed calibrations.

      No I would NOT calibrate using the numbers from the Libre app. Mathematically that does not make sense, and in fact is a recipe for bad results.
