Monday 20 June 2011

When not finding something is an error

Or to be exact is a #VALUE! - a passing tip for those using spreadsheet functions that return an error code instead of a usable outcome.

I was using the FIND function to look for a character in a text string. Specifically I was trying to find "/" so that I could rip domain parts from a huge list of URLs. My problem came when some of the URLs were just domains and contained no trailing "/" - at this point my formula

=FIND("/",A1)

returned a #VALUE! code instead of a 0, where FIND's outcome is normally a numeric reference indicating the location of string. A bug, I thought, so let's get it fixed. As I was using LibreOffice Calc and their release cycle is more or less monthly, I figured I could live with the issue for now and get the bug fixed for next month.

A conscientious software user always, I duly posted the bug and a day later I get an email indicating someone was processing it. I open the email and... the bug has been closed, marked NOTABUG. The associated comment said "This is not a bug. It is the expected behavior in ODF 1.2". As the newly completed OpenDocument Format 1.2 had some of the world's top experts developing the formula specification, this seemed, well, odd.

For those who, like me, couldn't understand why 0 isn't a logical outcome for "not found" in the FIND function, I asked someone I know who has some expertise in this sort of thing (Rob Weir of IBM, a spreadsheet developer since 1990ish and the chair of the ODF Technical Committee). Turns out it's that damned legacy thing again...

Back in the early days of spreadsheet development, 20 or so years ago, it was believed that a user wouldn't understand 0 as the outcome of a test that finds nothing, potentially causing subtle errors to cascade through the spreadsheet. So spreadsheet developers adopted the practice of outputting an error as the null result for this type of function. As it is now a well-established practice it was decided ODF should stick to it in the formula specification.

As my FIND function was nested in a larger formula, a error output just caused the whole formula to fail, breaking my automated process. Luckily it turns out there is a workaround using another function called ISERR (ain't it great knowing really clever people?). So if like me you need FIND to return an integer or otherwise testable answer rather than an error, wrap it in a ISERR:

 =IF(ISERR(FIND("my string",<the cell to search>)),0,FIND("my string",<the cell to search))

Sorted. Or rather found...