
Have you ever wondered what formula Excel uses to make its “curved line” scatter plots? I have!
Admittedly, this sounds like an odd thing to wonder about. But it’s actually a super-practical question: what if you want to approximate an intermediate point along the curve you have just plotted? For some reason, Excel hides this formula from you, and according to my source post, Microsoft has left this formula a black-box and given users no access to it. WTF, mate.
An intrepid engineer named Brian Murphy at XLRotor seems to have “cracked the code” – he has a worksheet that can reproduce Excel’s curved line with a custom formula. It seems Excel uses a spline (as one might expect), but there are many different kinds of splines and he has found the right one. As I confirmed today, the kind of spline it uses is not a simple cubic spline like the one described in the Wikipedia spline article; Excel uses a (third-order) Bezier Spline, which is commonly used in computer graphics (like PostScript).
You can download Brian’s Bezier example file (deep link) and play with it yourself to convince yourself he has the right solution. Unfortunately, his workbook is an Excel 2003 sheet with some macros. The way I am dealing with this is putting my interpolations into his example sheet, and then pasting them back out as I need them.
If you want to do this as well, you need to understand his argument structure, which I think could use a bit more documentation. For those interested in using his example file, the syntax is:
FEvaluate_Bezier(x-range,y-range,i1,i2,i3,i4,delta, chart_name)
Where the x-range and y-ranges are the data points, i1-i4 are integers representing the indices of the x and y arrays to use as the points to interpolate between. You need four points instead of two in order to get the curviness right – this is also how you know it’s a cubic spline. For the interpolations at the edge of the curve, just use the first or last point twice.
Delta is a dimensionless number representing the increment along the path from {x[i2],y[i2]} and {x[i3],y[i3]}. When delta=0.0, the function gives {x[i2],y[i2]}, and delta=1.0 gives {x[i3],y[i3]}, so you can sample the intermediate curve as densely as you care to. Pretty nifty!
I have no idea why you need the name of the chart, so I just left the example chart in there.
One potential confusion – since the interpolation can return a fully parametric path, not just a function, the formula returns an array (as opposed to, say, just a y-value). If you’re not familiar with array formulas in Excel, the up-shot of this is that you should highlight two adjacent cells before you type in the formula, and then press Ctrl+Enter instead of just Enter to get both the x and y values to appear. Happy interpolating!
Hi Jon,
Your post really helped me out! In fact I wrote an article about it in which I explained part of my quest to generate smooth curves in VBA and in which you are mentioned. Have a look:
http://www.connexa.nl/?p=870
Sorry for the wicked slow reply to your comment Emiel – awesome! I had a look at your article, and so glad I could help out by documenting just a bit more what Brian Murphy had done.
The article has moved:
http://www.redcirclesoftware.com/determine-the-control-points-of-an-excel-curve-shape/
Not found…
Sorry Arto. Had to remove the article.
Using Excell 2007 (Windows 7)
I can put my own data into this example it works fine with manually inserting my numbers into the base four point data array.
When I try to add this macro code into my own worksheet I cannot get it to work – is the full macro function hidden? Keeps saying it cannot find the variable:
Set cht = ActiveSheet.DrawingObjects(chtname).Chart
Am I missing something?
Further to above I am not quite explaining the problem. I can change/use the existing example as much as I like. If I try to copy/use the same spread sheet formula array anywhere elsewhere the:
Set cht = ActiveSheet.DrawingObjects(chtname).Chart code fails every time with VALUE.
I cannot find where it is trying to reference for the failure. The real chart name I have is correct. Any advice welcome ???
Hi Chris,
I struggled with this too, and had much more success by just building my whole sheet around the example and modifying it carefully, undoing if I hit any VALUE errors. I know this is weird, but Excel macros are very brittle and are totally hell to debug. I’m sorry I can’t offer more specific advice – you might want to try contacting Brian at XLRotor? (see above)
Thanks for the reply! – After much pain I have already found that I could do the same thing that you suggest by building my work into the “example” but my main spreadsheet is very complex and I still have not got all of it into the “example” sheet.
What I have done so far appears to work along the llines you have found but It does not like array references achieved from cells in other sheets in the same document e.g. array in Sheet1 A1 getting data from Sheet2!A1 (more pain).
The macro code for the “FEvaluate function” does not display anywhere that I can find directly if I ask to view it from the “example”. I found the macro code (accidentally) and copied it into my own sheet VBA but it does not work there (original problem) but only in the example itself!
Further to above . . . Viewing via the Excel “VBA Project Explorer”, If I take (copy the code from) all three function code modules from the example and add then to my own project and keep my own code in a separate module I can “escape” from being tied to the example as I was before. Previously I had tried to copy ALL the example module function code into my own module. Sorry – not very good at explaining this but I have mine working now completely independent of the example.
Hi Jon,
This is a really good work, i always wondered how excel did this.
But it seems that there is still some mystery there.
I tried with this couple of points :
1;1.00E-04
2;1
3;10
4;15
The result doesn’t really fit, do you have an idea ?
http://imageshack.us/photo/my-images/59/5s5u.jpg/
Mystery not quite solved…
Entering (9,8) (10,12) (11,12) (14,6) gives clearly incorrect values for 10 < x < 11 in the example file.
Accurate code is here:
http://answers.microsoft.com/en-us/office/forum/office_2007-excel/how-does-excel-plot-smooth-curves/c751e8ff-9f99-4ac7-a74a-fba41ac80300?msgId=5b59b6de-5099-48d1-b205-1b791ea2df05
Hi,
I am trying to use this for a semi-logarithmic X-Y plot and its not acurately predicting the spline in the semi-log space. Any quick fixes i can apply?
If reducing overshoot is really an issue, why shouldn’t Microsoft consider using a monotonicity preserving spline… like this one.
https://en.wikipedia.org/wiki/Monotone_cubic_interpolation
I decided to make the article (and sample Excel-file) publicly available again. You can find it here:
http://www.delegate.nl/calculate-smooth-control-points-for-an-excel-curve-shape/