Wednesday 26 October 2011

What we can learn from monkeys (part 2)

If, the saying goes, you take enough monkeys and typewriters then given enough time it is statistically probable that they will reproduce the entire works of Shakespeare. Not a word or a phrase at a time, but each play as a coherent whole. The chance of the letter "n" being pressed, out of the 46 keys on our typewriter, is 1/46, the same for "o", the same for "w", for " "[1], for "i","s"," ","t","h","e"," ","w","i","n","t","e","r" and so on until we get the complete and excellent opening line of Richard III:
"Now is the winter of our discontent made glorious summer by this sun of York"
The probability of this occurring with just one monkey is 1/46*1/46*1/46 and so on for the number of characters (76). To be more specific, the probability of creating just this first line is

1 in a number far too big to even write out in normal numbers (46^76)

Hmmm. Ok, to bring it down to numbers worth writing out, let's just look at the first two words "Now is", a total of 6 characters, which the chances of one monkey randomly tapping out is

1 in 9,474,296,896 (making the odds of winning the UK lottery jackpot, roughly 1 in 14,000,000, seem positively likely!)
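For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes, as the post does, that each of the 46 keys is equally likely to be pressed; the function name `odds_against` is just for illustration.

```python
KEYS = 46  # keys on the typewriter, each assumed equally likely

def odds_against(text_length: int, keys: int = KEYS) -> int:
    """Return N such that the chance of typing a given text is 1 in N."""
    return keys ** text_length

first_two_words = "Now is"
first_line = ("Now is the winter of our discontent made glorious "
              "summer by this sun of York")

# Odds for the first two words (6 characters, counting the space)
print(f"1 in {odds_against(len(first_two_words)):,}")

# The full line is 76 characters; just report how many digits the odds have
print(len(first_line), "characters,",
      len(str(odds_against(len(first_line)))), "digit number")
```

Python integers have no size limit, so the "far too big to write out" number can be computed exactly; it just isn't much fun to read.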

But with enough monkeys and enough time it becomes more and more likely that not only will one of them tap out those opening 6 characters, but also the opening 76 characters, and even all of the large number of characters that is Richard III, all the rest of Shakespeare's surviving plays and even the lost ones (although I'm not sure how we would know the lost ones had been correctly typed out...). By combining an infinite number of monkeys and typewriters, it would not be statistically significant that the monkeys produced the complete works; if you started with nothing and in 6 weeks someone delivers a typewritten manuscript, the fact that infinite monkeys were involved means that statistically it would be not unlikely, i.e. not *mathematically* improbable, that the manuscript has been produced by pure random chance by a bunch of monkeys with typewriters.
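To see how "enough monkeys" changes things, here is a rough sketch of the chance that at least one of n independent attempts succeeds, which is 1 - (1 - p)^n. It assumes every attempt is independent and has the same chance p; the function name is made up for this example.

```python
import math

def chance_at_least_one(p_single: float, attempts: int) -> float:
    """P(at least one success in `attempts` independent tries) = 1 - (1 - p)^n.

    Uses log1p/expm1 so that very small probabilities do not round away.
    """
    return -math.expm1(attempts * math.log1p(-p_single))

p_now_is = 1 / 46**6  # one monkey, one go at typing "Now is"

for attempts in (1, 10**6, 10**9, 10**10, 10**11):
    print(f"{attempts:>15,} attempts: {chance_at_least_one(p_now_is, attempts):.6f}")
```

The probability creeps towards 1 as the number of attempts grows, which is all the saying really claims.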
Arthur looked up. "Ford!" he said, "there's an infinite number of monkeys outside who want to talk to us about this script for Hamlet they've worked out."[2]
So here our ever helpful monkeys are helping teach us something about statistics; that they are a dangerous source of truths. While not mathematically improbable, this truth is heavily dependent on a few highly unlikely things, like having an infinite living space in which to house infinite numbers of monkeys with their typewriters.

This becomes even more of an issue when numbers are turned into statistics to be used by politicians and news outlets and those others with an agenda, who all too often mistake correlation for causation, using statistics to demonstrate why some new policy or other is needed or why a current one should be changed, while not understanding (1) that statistical significance is mostly about having, or assuming, the right number of monkeys and (2) that the fact that two measurements correlate does not mean one caused the other. News outlets in particular also have a tendency to reproduce statistics as the agenda-pusher would have them reproduced: "the murder rate in the country has gone up 10%" (was 10 in year 1, 11 in year 2); the same numbers could just as easily, and probably less misleadingly, have been reported with a more qualitative statement such as "the murder rate in the country was stable".
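The murder rate example can be put into numbers with a small sketch. The assumption here (mine, not the headline writer's) is that yearly counts of rare events behave roughly like a Poisson process, where the typical year-to-year wobble is about the square root of the count.

```python
import math

def percent_change(before: int, after: int) -> float:
    """Headline figure: change from one year to the next, in percent."""
    return (after - before) * 100 / before

def poisson_noise(count: int) -> float:
    """One standard deviation of a Poisson count (assumed model): sqrt(count)."""
    return math.sqrt(count)

year_1, year_2 = 10, 11

print(f"headline change: {percent_change(year_1, year_2):.0f}%")
print(f"actual change:   {year_2 - year_1} case")
print(f"expected wobble: about {poisson_noise(year_1):.1f} cases either way")
```

A change of one case against an expected wobble of about three is well within noise, which is why "stable" is the less misleading report.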

It is this misuse and misunderstanding that help give rise to the saying "Lies, damned lies and statistics"[3]; but everyone is at it. Internal "news outlets" (news inlets?) are equally prone to misleading messages of the type "90% of users rate the IT support service as 4/5 or higher", which is more precisely reported as "of the people who got around to filling out the satisfaction survey when their helpdesk ticket was closed, 90% rated the service as 4/5 or higher". A statement that would be further informed by the information that the survey defaults to 5 and you have to change it to anything lower; then take into account such truisms as people being more likely to complain than praise. Eventually a qualitative statement would work out to be more useful: "the IT support service is making very few people angry", which if new software is being rolled out is good news indeed!

Qualitative statements are seen to carry less weight than ones laden with numbers, which in organisations is probably the fault of the CFO; this seems ironic, given the quantity of assumptions and informed guesswork that is the basis of corporate accountants...

In the end remember this: statistics just provide information in a numerical form. What it all actually means is a matter of interpretation. In other words don't be misled into believing the numbers are not just another qualitative measure...

[1] Although surely the space bar is so much bigger it would be more likely to be pressed? Damn these complications and assumptions...
[2] Douglas Adams, The Hitchhiker's Guide to the Galaxy, with a little help from the Improbability Drive
[3] Said by someone, some time in some form: http://www.york.ac.uk/depts/maths/histstat/lies.htm

Monday 1 August 2011

Installing Linux on Dell E5420

As is customary, to help others following along the path, here are some instructions for installing Linux (Kubuntu 11.04 but I'd guess any modern distro would be a similar experience) on a 2011 Dell E5420. In case the particular specification has an impact on the steps needed I've appended it to the bottom of this post.

These steps assume you have a prepared bootable install medium. I use a USB key, but CD/DVD would be similar.

Step 1 (optional): Ensure laptop is connected to mains and network (best to use wired).

Step 2: Press F12 during BIOS splash screen to reach boot menu, and select your boot media.

Step 3: When install option comes up, select it. Accept all defaults and complete user information as requested/desired.

Step 4: When instructed, reboot laptop. If using USB and an SSD this point is reached in less than 5 minutes.

Step 5: Everything works. Enjoy :)

Sadly this traditional post is basically pointless, because it was that easy. All but one of my nieces could have done it (the youngest is only a month old, give her a chance...). In less time than it takes to install a Windows service pack you can take a blank laptop and transform it into a complete working environment with all the core applications (office suite, etc.) already installed. No technical expertise required.

PS Rather nice, solidly built, nice to use laptop, in case you're interested. 3 mouse buttons on the Trackstick too. Battery lasted 7 hours out of the box, including all the install work :)

Spec (yes, no Microsoft Windows was supplied nor paid for):

Latitude E5420
Processor : One Intel Core i3-2310M (2.10GHz, 3MB cache, Dual Core)
Display : 14.0in UltraSharp HD+ (1600X900) AntiGlare LED-backlit with Premium Panel Guarantee
Palmrest : Dual Point Touchpad and Trackstick
Camera : Integrated Full HD Camera with Microphone
Memory : 2GB (1x2GB) 1333MHz DDR3 Dual Channel
Hard Drive : 128GB Mobility Solid State Drive
Optical Drive : DVD +/-RW Drive
Battery : Primary 9-cell 87W/HR, 3 Year Warranty
Wireless : Dell 375 Bluetooth Card
Wireless Card : EMEA Intel Centrino Advanced-N 6205 (802.11 a/b/g/n) Half Mini Card
Keyboard : Internal UK/Irish Qwerty Backlit Dual Pointing Keyboard (antimicrobial)
OS Media : Not Included

Monday 20 June 2011

When not finding something is an error

Or to be exact is a #VALUE! - a passing tip for those using spreadsheet functions that return an error code instead of a usable outcome.

I was using the FIND function to look for a character in a text string. Specifically I was trying to find "/" so that I could rip domain parts from a huge list of URLs. My problem came when some of the URLs were just domains and contained no trailing "/" - at this point my formula

=FIND("/",A1)

returned a #VALUE! code instead of a 0, where FIND's outcome is normally a numeric reference indicating the location of the string. A bug, I thought, so let's get it fixed. As I was using LibreOffice Calc and their release cycle is more or less monthly, I figured I could live with the issue for now and get the bug fixed for next month.

A conscientious software user always, I duly posted the bug and a day later I get an email indicating someone was processing it. I open the email and... the bug has been closed, marked NOTABUG. The associated comment said "This is not a bug. It is the expected behavior in ODF 1.2". As the newly completed OpenDocument Format 1.2 had some of the world's top experts developing the formula specification, this seemed, well, odd.

For those who, like me, couldn't understand why 0 isn't a logical outcome for "not found" in the FIND function, I asked someone I know who has some expertise in this sort of thing (Rob Weir of IBM, a spreadsheet developer since 1990ish and the chair of the ODF Technical Committee). Turns out it's that damned legacy thing again...

Back in the early days of spreadsheet development, 20 or so years ago, it was believed that a user wouldn't understand 0 as the outcome of a test that finds nothing, potentially causing subtle errors to cascade through the spreadsheet. So spreadsheet developers adopted the practice of outputting an error as the null result for this type of function. As it is now a well-established practice it was decided ODF should stick to it in the formula specification.

As my FIND function was nested in a larger formula, an error output just caused the whole formula to fail, breaking my automated process. Luckily it turns out there is a workaround using another function called ISERR (ain't it great knowing really clever people?). So if like me you need FIND to return an integer or otherwise testable answer rather than an error, wrap it in an ISERR:

=IF(ISERR(FIND("my string",<the cell to search>)),0,FIND("my string",<the cell to search>))

Sorted. Or rather found...
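For comparison, the same "sentinel or error?" choice turns up outside spreadsheets. In Python, str.find returns -1 when nothing is found while str.index raises a ValueError, so you get to pick. A minimal sketch of the domain-ripping job follows; it assumes the list holds entries like "example.com/page" with no "http://" prefix, and the function name domain_part is made up for this example.

```python
def domain_part(url: str) -> str:
    """Return everything before the first "/", or the whole string if there is none."""
    position = url.find("/")  # str.find gives -1 for "not found", no error raised
    if position == -1:
        return url            # a bare domain: nothing to strip
    return url[:position]

# Hypothetical sample data in the assumed format
for entry in ("example.com/some/page", "example.org"):
    print(domain_part(entry))

# str.index is the "#VALUE!" flavour: not finding something is an error
try:
    "example.org".index("/")
except ValueError:
    print("index() raised ValueError")
```

The -1 sentinel has its own trap, of course: used unchecked as a slice end it silently chops off the last character, which is rather the sort of subtle cascading error the early spreadsheet developers were worried about.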

Monday 2 May 2011

The public good of software

Or public goods, to be more precise. Every economics student can promptly skip three paragraphs.

The subject of economics likes to classify and simplify things; some of these classifications and simplifications are useful, some not. Several in classical economics are downright dangerous, but I digress. One classification that more or less all economists would agree on is the concept of the public good. A public good is one that is collectively paid for because it makes no sense to pay for it any other way; the majority of services provided by the public sector are, or should be, public goods per se.

A classical example of a public good is streetlights. Streetlights have been around since 800AD or so; the first streetlights as we recognise them today were gas powered and switched on in Pall Mall in 1807; the first streets to be lit by electric light were in 1881 (Cleveland, Ohio and Newcastle-upon-Tyne appear to share this honour). All these streetlights share one distinctive feature; it is effectively impossible to have streetlights and charge individually for benefiting from them.

Early streetlights were either installed for public benefit or as a marketing ploy; modern day streetlights are more or less universally funded from local taxation. No one has ever managed to charge directly the user of the streetlight; creating and managing a use-based charging model would be several orders of magnitude more expensive than the provision of the light itself. Light being tricky stuff would provide much of the complication. How much light is enough before you get charged? So how close to the light can you get for free? Do you get charged different amounts depending on the how dark it would be if the light was turned off? How foggy it is? Do people who get the light spilling into their houses get charged? Anyhow, it's far too complex a question and to my knowledge it has never even been attempted. Residents in the town/region pay, everyone who passes by at night benefits.

Recently I argued against open source software being a cartel market. I closed that post by pointing out that software could not actually be seen as a normal economic good; can software be best seen as a public good? Interesting question...

The formal definition of a public good is that it is non-rivalrous and non-excludable. A good is seen to be excludable if you can easily and cheaply prevent those who have not paid for the good from consuming it, while a rivalrous good is one where your consumption of it prevents another from having it.

So here I am with my computer, using LibreOffice to write this post. If software is a normal good then my using it should be preventing you using it. Which clearly it does not as even if you don't yet have it you could just follow the download link to get it, I could give you a copy of the one I'm using and, on a thin client setup, we could all be using the same installed copy, so software (all software) is non-rivalrous; one tick in the box. Interestingly the common example of a non-rivalrous good is broadcast media - yet software is utterly non-rivalrous while broadcast media has a number of restrictions (no simultaneous use of wavelengths, requirement of near line of sight to transmitter, limited range). An alternative definition of a non-rivalrous good is one that could be provided at a very low additional (marginal) cost for each additional customer. Software clearly satisfies this definition and then some as the marginal cost for each additional customer/user is effectively zero.

So is software excludable? Well, companies whose business model is the production of software for sale have historically gone to great lengths to prevent people from using software without paying for it; and yet even with significant public monies being spent in helping prevent this illicit use, people continue to make use of all sorts of software they are not supposed to, entirely free. Arguably even the reasonably effective methods that have been used to prevent software being used illicitly, like dongles, are so expensive relative to the cost of the copy of the software that they would automatically fail the excludability test.

So software is non-rivalrous and non-excludable. A pure public good by normal economic definition. Yet it is not how many people currently view it, and much software is sold at high prices to individuals as if it was a normal good. Actually much software is sold at prices so high relative to the cost of the copy of the software that it's priced at what economics would view as monopoly prices.
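The two tests above make a simple two-by-two grid, which can be sketched in a few lines of Python. The labels for the other three boxes (private, club and common-pool) are the standard textbook names rather than anything defined in this post, and the true/false values given to each example simply follow the arguments made above.

```python
def classify_good(rivalrous: bool, excludable: bool) -> str:
    """Place a good in the standard rivalrous/excludable grid."""
    if rivalrous and excludable:
        return "private good"
    if rivalrous:
        return "common-pool resource"
    if excludable:
        return "club good"
    return "public good"

# (rivalrous, excludable) as argued in this post
examples = {
    "streetlights": (False, False),
    "software": (False, False),
}

for name, (rivalrous, excludable) in examples.items():
    print(f"{name}: {classify_good(rivalrous, excludable)}")
```

Selling software as a normal good amounts to trying to drag it from the bottom-right box to the top-left one, which is where all the money spent on artificial exclusion goes.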

Why isn't software provided as a public good? Why isn't it something we collectively provide? Surely this would be to the greatest benefit of all, economically speaking, and be far more in keeping with the nature of software. Oddly, it is not uncommon for companies who produce software for sale to spend more money on selling, marketing and protecting the sales of the software than they do on making the software better. Economically speaking this makes no sense - it is the attempt to create artificial exclusions that costs so much money, rather than improvements to the product.

So software could be, should be a public good. Come to think of it, wouldn't it be even better, something more than a public good, if software was not only provided collectively but in a way that made it easy for others to improve? Like if the underlying programming code was available and people could fix stuff and add new features and then make the resulting improved, added value version available to everyone else allowing the virtuous cycle to start again. Wouldn't that be a truly amazing thing? Ever more added value for all, the economics are just brilliant; just need a name for the approach and then surely this "available code" software will spread like wildfire...

Tuesday 12 April 2011

Big Society

The phrase of the era, for good or for bad. But it is of course an old idea, something some people have been practising all their lives, a concept that is as old as thought. Giving is indeed better than receiving. And the best “gives” in the Big Society concept are those that just keep on giving. The volunteers that pick up the litter on the local fields are great as is the outcome of their time contribution, litter-free green areas; but when they stop doing it, whether through age, illness or a move away, the fields stop being litter-free. All too quickly :(

Open source project contributions on the other hand just keep on giving: we add a fix, everyone benefits from the fix; you add an enhancement, everyone gets the enhancement; they add some documentation, everyone gets to use the documentation. Ain’t that great? An excellently large Big Society example that everyone can help with and everyone benefit from, even after people stop contributing. So, all you councils, voluntary organisations, government departments and agencies, come on, join in…

Thursday 24 February 2011

The economics of a (software) cartel

Over here, thanks to @glynmoody, I read what is to all intents and purposes an article with the best of academic credentials. It reads like the summary of a thesis or maybe a major paper, and it is published on a website that tags itself as "Research-based policy analysis and commentary from leading economists" and is backed by the Centre for Economic Policy Research, surely a most respectable institution. Submissions are reviewed by an editorial board, which while not a strict academic peer review process should be fairly close to one.

So given all this, why is the article so flawed? And in its basic premise...

Early on it becomes fairly clear the article is written with an agenda:
How, if at all, should governments use [open source software (OSS)]? One important theoretical insight starts from the observation that [OSS is] ...imperfect [and] has distinct areas of advantage and disadvantage (von Engelhardt 2008). This implies that large modern economies will usually require a mix of both [OSS and closed source software (CSS)].
The article goes on:
[Engelhardt and Maurer] point out that the existence of CSS code increases OSS output and vice versa. To see why, consider an all-OSS world in which each company offers consumers exactly the same shared code as every other company. By definition no company can then compete by writing more OSS code than its rivals. This lack of competition suppresses code production for the same reason that cartels suppress output.
Quotes from "Open vs closed source software: The quest for balance" linked above, emphasis added

From this point the argument is reasonably constructed and more or less appropriate in its conclusions. But this premise, that a pure open source world would (a) result in less code production and the implication (b) that that would inherently be "a bad thing" is totally unfounded.

So, as it is a very good place to start, I'll start at the beginning; with the definition of the economic concept referred to, a cartel.

A cartel in economic theory is generally seen to occur at a particular point in a range of market types. This range stretches from perfect competition to monopoly. A monopoly market is the condition which the game of the same name defines as victory, that is the absence of competition. Perfect competition at the other end of the scale is a market where all parties know all things about the goods sold in the market (known as perfect knowledge) and it is easy to set up in business. As is clear in the terminology used, perfect competition is seen to be good and monopolies bad.

Economists see a sliding scale between monopoly and perfect competition, and degrees along the way. It is generally accepted that a near or effective monopoly is as bad as a monopoly; a near monopoly can be seen to exist in a market where a single company controls more than two thirds of that market. Below a monopoly in economic badness lies an oligopoly, where a small number of large companies control the majority of a market. It is at this point in the scale that cartels are seen to form. A cartel is where a number of firms in the oligopoly get together and conspire to fix pricing, using their power to inhibit competition, to create an effective monopoly.

One of the consequences of suppressing competition in this way is that as there is less need to compete, not only are prices maintained artificially high but the members of the cartel have no need to try and compete in other ways, through improved production methods, higher quality output, higher rates of output etc. This is the cartel effect the bad, bad pure open source world will have. No incentive to compete, so reduced output (where output is defined as code).

It is important at this point to note that the basic models and concepts of mainstream economics are very old, and that these models essentially assume a physical product, known as a "good", is being produced from raw materials and being sold into the market of whatever type. It is also useful to understand that there are several theoretically possible consequences of a cartel, and that the one used in the article does not take into account the search for profits, i.e. that while operating a pricing cartel, companies will still seek to improve their individual profit position and thus continue to evolve in areas centred on reduced production costs and increased output.

Phew. Right, so reasons why monopolies, oligopolies and cartels are bad while competition is good: one basic concept is that while an accounting profit is considered ok an economic profit is considered bad. So for your company to return a profit in its accounts, no matter how large, is absolutely fine; but for the resources used in your company to produce more than a very small (marginal) amount of return over what they would produce if used in their next best option, e.g. if your staff worked for another company which would make a lot less money than yours? This is bad. A second and pertinent concept is that because you have little incentive to make your products better or in a cheaper way, resources are not being utilised in the most efficient manner, i.e. not contributing their maximum value to the economy.

So back to the premise: that is that if all software firms were selling the same open source codebase, output of code would fall and this would be "a bad thing". Ok, first off. Software companies that sell software don't do open source. You don't sell software if you're in the open source market, you sell services. Moreover you do so in something approaching a perfectly competitive market.

So why would I give you money to use your Linux when I can get this other Linux, based on the same codebase for free? Oh, you'll ensure patches are timely and applied in a tested manner before being provided to me? You'll provide a ridiculously good SLA if I have problems with my Linux desktops and servers? You have many of the best coders and therefore will be able to fix any issues I encounter in very short timeframes? In which case, your differentiations, albeit minor, are important to me, the buyer in this near perfect market, so I'll give you my money even though there is a free version available.

Then to the second and defined negative aspect: that code production would fall and this would be a bad thing. I will with great willpower refrain from swearing at this point and try to calmly point out the fallacy of applying pure manufacturing thinking to software code. Quantity is utterly irrelevant in software code. Quality is everything. This is then back to the conversation outlined above. The quality differentiator is what I will pay for.

You can make an office productivity tool a little faster, a little more standards-compliant, a little less prone to arbitrarily adding and removing spaces? And you can do it for me now, although other people will get the benefit later? Why, that sounds good to me.  I suppose others will be paying to get a shinier interface and some new functionality, which I will gain from later? That's nice. Why, that sounds suspiciously like a win-win from that game theory stuff doesn't it...

So back to the economics: no, a completely open source software world would not act as a cartel, with a negative consequence of reduced code output. Sure the amount of code written would be less, but not in a negative way; it would be a result of the increased efficiency of the perfectly competitive market software would become.

So open source software would be, more or less, a perfectly competitive market? Well, sort of.

You see, actually there's another underlying premise in that article that is flawed, a premise the article doesn't acknowledge. Quite simply you can't treat software as if it is a normal good - it doesn't exhibit any of the features of one. But I'll leave that for another day...

Thursday 3 February 2011

Having the basics

Every now and again I'm reminded of why I'm so sure that senior IS managers must have a grasp of the technical basics. I look at recruitment processes for technical staff and see a reliance on certification as a base measure of suitability. Yet that certification, especially vendor certification (Cisco exempted!), tends to leave out the basics and focus on the applied product. So if the senior IS manager doesn't have the basics, how will they know when their technical staff are misleading them? Or when the sales people are...

So during a client's technical team discussion a forum link was circulated as an answer to a question. The question is irrelevant, the fact that the answer on the forum was essentially correct is also irrelevant; what worried me was a bold factual statement made in passing in the answer and the qualification (Microsoft Most Valuable Professional) of the person who posted it.

I had this vision of some senior IS manager somewhere. They've recruited this really capable person, someone whose "high-quality, real-world technical expertise" has been recognised with this MVP award, and this person is someone they are dependent on for technical advice. And this person can make a factual statement in public like

"VLANs are just subnets"[1]

and still be trusted and qualified to provide technical advice on anything vaguely connected to networks? Networks are one of the basics, like an ABC of a modern information system; the concepts need to be understood as a fundamental of doing the job. There are others too; like a good retail manager has worked every shopfloor department before moving on, like a surgeon has been through general medical qualifications before specialising, a CIO should have a grasp of the basics of each function they oversee.

This is why a CIO must have both technical and business expertise, why just management experience in some other field is insufficient to be really good at the job; and why IS management should be seen as a professional career in its own right, with required and tested levels of knowledge to be fully qualified.

[1] For the record... No, a VLAN is not a subnet. A VLAN may have many subnets, or many VLANs a single subnet. Subnets even existed before "Virtual" LANs, and LANs existed before subnets.  They are used for different purposes, although they do often correlate. If you (like the MVP) need to brush up, then VLANs and subnets are good articles to read.

Friday 14 January 2011

A Tablet Conundrum

CES came and went without any significant/useful/timely Microsoft announcements in the tablet sphere. If 2011 is finally to be the year of the tablet, and ownership of tablets starts to become as common as ownership of smartphones, what can Microsoft do about their current uncompetitive position?

Some have asked why Microsoft has not reacted more quickly to the new tablet form factor. After all they have considerable experience making software for tablets and touchscreen devices. I encountered a modified version of XP on tablets in 2003 (and have owned an XP-licenced IBM X41T for over 5 years) and touchscreen handhelds running Microsoft operating systems had been around for some years before that. Even the acquisition of the Surface product should have brought in relevant expertise.

We can point the finger of blame at Microsoft's incumbent status, or at their arguably over-bureaucratic HR and management systems; but there is another far more problematic issue: the business model currently required to succeed in this new marketplace.

Consider the leading operating systems in the new tablet marketplace, Apple iOS and Android. Consider also the bit-players of desktop Linux derivatives like Ubuntu, as well as the established business solution provider RIM with its PlayBook.

So what is the challenge for Microsoft from these tablet operating systems? Simply that those listed above have one thing in common - they are not in themselves sold. All are made by companies with different business models. Apple design and sell consumer hardware based on a premium brand and iOS is a by-product of that business. Android is made by multiple companies; the largest contributor, Google, gets its revenue from advertising and most of the other contributors design and/or make and sell hardware, both components and end-user products. The heavy lifting in Linux is done by companies who sell services or hardware, with significant input from academics and volunteers, and RIM sell a high value end-to-end service that is embedded in those companies that have adopted it.

Microsoft makes much of its money by selling software licences in bulk; specifically most of its profits are generated by software sales, with a large proportion of that being OS licences in massive quantities. But in the current tablet market there seems little opportunity for charging for an OS. Apple's pricing certainly has a premium, but that is accounted for by the brand; the Android tablet market is highly competitive on both price and design innovation, as will be the desktop Linux tablet market, while RIM will sell to RIM users with, maybe, a slight premium based on their unique proposition.

So there's a conundrum - whether or not to rush to invest large amounts of money in developing a tablet OS to sell in a market that does not charge for the OS. So if a decent entry level tablet stabilises at a 2011 retail price of USD350 (based on component and assembly costs and normal channel markups) what can Microsoft really charge manufacturers for the OS?

Depending on local sales taxes, a charge of USD50 for the operating system would put a 15-20% premium onto the tablet price, which can only be sustained if the market can be led to believe the product is premium. Given Microsoft's lack of recent success in mobile devices in general, to achieve this requires some considerable thought and work; they need to ensure that their tablet operating system is launched with minimum bugs, a decent upgrade process and roadmap and enough innovative elements to gain mindshare and thus justify a premium on the cost of the tablet. Of course they will also need a thriving application marketplace where most apps are free or nearly so.
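A back-of-envelope sketch of that premium, in Python. The assumption (one possible reading of the figures above, not a statement of how any manufacturer prices) is that the USD350 retail price may include sales tax while the OS charge is compared against the pre-tax price; the function name and the 20% tax rate are illustrative only.

```python
def os_premium(os_fee: float, retail_price: float, sales_tax_rate: float = 0.0) -> float:
    """OS charge as a fraction of the tablet price before sales tax."""
    pre_tax_price = retail_price / (1 + sales_tax_rate)
    return os_fee / pre_tax_price

OS_FEE = 50         # USD, the charge discussed above
RETAIL_PRICE = 350  # USD, the entry level tablet price discussed above

for tax_rate in (0.0, 0.20):  # no sales tax, and an illustrative 20%
    premium = os_premium(OS_FEE, RETAIL_PRICE, tax_rate)
    print(f"sales tax {tax_rate:.0%}: OS premium {premium:.1%}")
```

Under these assumptions the premium lands at roughly 14% with no sales tax and about 17% with a 20% tax, in the same region as the 15-20% quoted; as hardware prices fall, the same USD50 becomes a bigger slice.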

That gets Microsoft to a requirement for more time. They need time to come up with innovations, time to optimise an operating system for a new hardware platform, time to have a strategy that allows a premium in mass market sales and time to generate market numbers sufficient to sustain the Microsoft operating cost model. The length of time required is likely to be at least 18 months, so nothing can be forthcoming until 2012 or later.

There is every sign that the advances in hardware that allowed the new tablet designs are also allowing tablets to have sufficient processing power to do most everything that a desktop PC can, but with 12 hours battery life. Then there is the increasing move online via wireless broadband, 3G or wifi - it is more and more common to have documents edited and published via the web, more common to have corporate and social interaction and collaboration via the Internet; even thin client and VDI developments lend themselves to a move to tablets.

Watch the tablet become ubiquitous, encouraging ever more people to move corporate and personal IT/IS functionality online; if Microsoft do not have a significant presence in the tablet market as this happens this hits their revenue stream from operating systems as the tablets eat at desktop and laptop sales, and hits it again as more people move functionality online, diminishing interest in high-priced desktop productivity tools.

This then is the real conundrum: do Microsoft rush to bring out an uncompetitive (features, user experience, extendability) operating system into an already competitive marketplace and risk failure because of the revenue requirement, the presence of incumbents and their alternative business models, or take time to bring out a competitive OS while their main revenue streams are under attack, the players in the new market become ever stronger and ever more established, and hardware prices reduce, increasing the premium their business model requires? And even then they still have a significant risk that the new OS is unable to get sufficient (or any) market share to replace their lost revenues...

I think Lorenzo Dow said it best a couple of centuries ago:

"And you will be damned if you do - And you will be damned if you don't"