One Less Cut In A Thousand

Wednesday, 1 January 2014

2013 - 4000km, 47km ascent, 140k calories.

So 2013 all wrapped up with most exercise done on the bike. From somewhere near zero to...well....

2484 miles this year (4000 km exactly) with about 47,807m climbing in that.

139,837 calories burned, average speed of 16mph (25.5kph), which sounds low but is fine given the climbing done and winter miles that are in there (well, for me anyway).

Total time on the bike: 166 hours. (Turbo time: another 45 hours.)

Best ride in terms of overall performance was RideLondon 100, which took me 5 hrs (on the nose).

Best ride in terms of average speed was Great Manchester Cycle; 37.6 kph average over 52 miles. Lucky to be in with some great riders on that.

To match this level of training in 2014 will be a challenge, given work and family commitments - but the time on the bike is an investment that always pays back. It's worth making the time to do it.

Best kit of 2013? These were all great buys:

GripGrab Hurricane gloves
Castelli Gabba long sleeved jersey
Rapha bib shorts
Rapha team jersey
Continental GP4000S and GP 4 Season tyres. Both are superb.

Worst kit of 2013? Not keen on the Endura overshoes; they haven't lasted. On the whole, I think I've chosen kit well.

Bring on 2014.

Friday, 1 March 2013

Killing a social network, Facebook shareholder style

Thought this was a really interesting blog post and, with Sabisu in mind, I clearly have an interest in how social networks function - or don't.

In brief, the blog post describes research into the now defunct Friendster network that ascribes its failure to the fact that each user had too few friends for it to succeed, meaning the 'cost' to each user of maintaining their network was too high for the 'benefit' it gave them. The low number of friends per user also meant that the network wasn't sufficiently robust when users left, leading to a cascade of leavers.
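To make the cascade idea concrete, here's a toy sketch. It assumes a very simple rule: a user stays only while at least `min_friends` of their friends are still on the network. The graphs, the threshold and the function name are my own illustrative assumptions, not the model or the data from the research.

```python
def cascade(friends, min_friends, first_leavers):
    """Return the users still on the network once leaving stops spreading.

    friends: dict mapping each user to the set of their friends.
    min_friends: hypothetical threshold below which staying isn't worth the cost.
    first_leavers: users who quit for their own reasons and start the cascade.
    """
    remaining = set(friends) - set(first_leavers)
    changed = True
    while changed:
        changed = False
        for user in list(remaining):
            # Count only friends who are still around.
            if len(friends[user] & remaining) < min_friends:
                remaining.discard(user)
                changed = True
    return remaining


# Sparse network: a ring, so everyone has exactly two friends.
ring = {
    "a": {"b", "e"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d", "a"},
}
# Dense network: everyone is friends with everyone else.
clique = {user: set("abcde") - {user} for user in "abcde"}

sparse_left = cascade(ring, min_friends=2, first_leavers={"c"})
dense_left = cascade(clique, min_friends=2, first_leavers={"c"})
```

In the ring, one person leaving drags everyone else out; in the densely connected group, the same departure changes nothing.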

This fits with some of my own research (Strogatz et al) and seems a decent qualitative theory. And though it's obvious that Facebook/LinkedIn shareholders need to see return on their investment, it should also be a warning for any software vendor interested in social networks to keep the noise down.

Here's my thinking.

No offence to my Facebook & LinkedIn buddies but they're clogging my timeline up with stuff that's less and less relevant. Some of it is their 'fault', some of it is Facebook's.

You could narrow the really valuable (in this case, probably I mean 'interesting') conversations I have on Facebook to a very small group. Maybe only 5% or 10% of my 'friends'. The rest is noise.

Adverts are all over the place - side bars and in the feed, with the latter being particularly intrusive, especially on mobile.

Game updates are all over the place and Facebook has not yet realised that certain users are never going to go for games. Again, the feed is clogged with irrelevant nonsense.

Social marketing messages are all over the place; "Share/like this to win X". Not interested.

Chain letters are all over the place. "Like this if your a mother/father/potato lover".

Sure, Facebook is free. The apps can be ignored, as can irrelevant status updates...but that misses a key point...

Filtering out the crap takes effort.

Even though that 'cost' tends to zero, it's there and it's irritating. And if you're only getting a few valuable interactions out of your network for a lot of filtering...well, at some stage you're going to cut back on that effort and not really maintain your network.

Like Clay Shirky's 'mental transaction costs' (see 'penny gap') it's not the financial cost that will drive people away. All that filtering costs brain time.

I understand that Facebook and LinkedIn need to monetise their user-base. I understand that adverts are a really easy way to do that. But with all the other nonsense on my timeline, it's getting to the point where I simply can't be bothered.

(Interestingly, my Twitter feed isn't suffering as badly.)

Wednesday, 23 January 2013

The first thing you should do with Facebook Graph Search

Just reading this (HT LettersOfNote) and got to thinking: what am I going to do with Graph Search?

(If you don't know what Facebook's Graph Search is...well, check out the link above and perhaps this. It's interesting and scary in equal measure.)

First, I'm going to show my 5 year old a couple of innocent searches. We're going to talk about privacy. (I know, I'm such a crazy, fun parent.)

I'm beginning to think that privacy education is a serious advantage as they grow up. It may become the most important bit of education kids get as Graph Search proves that anything online with your name on it is a potential threat. One misplaced post could be the end of all opportunities in life.

Then, I'm going to work through some less palatable searches, particularly of my colleagues. Why? So we can pre-emptively address anything that shows up which is a threat. Of course, if it shows up some really unpalatable stuff, we may need to have a chat but it's unlikely that employees of a tech-savvy start-up are going to put up anything that's in that category.

What it will do is force everyone to assess how much slack they cut people. If you have an employee in the EDL, is that a problem for you? If your employees are friends with lots of competitor employees, does that matter?

Then we're going to talk within the company about our attitude to Graph Search. Sales people will see it as a powerful tool to be used by the dark side; those in recruitment will be terrified and excited in equal measure. We need to decide just how far into each other's lives we want to look.

I can't help but wonder whether it's a bit like Sauron's ring; even those who would use it for good are ultimately sucked into that which is bad.

Perhaps you start by trying to weed out the racists, but ultimately you end up weeding out everyone you disagree with?

Perhaps Graph Search will increase prejudice? Perhaps it'll play on prejudices you don't know you have?

Thursday, 17 January 2013

It's not the fast miles that count

Runners call it building a 'base'...cyclists tend to look a bit morose, or guilty, usually addressing a loved one or cycling companion with the confession; "I'm not doing the miles".

These days there's a constant focus on speed. It seems expected that you'll have a visionary idea on the train, text it to your development team while the Starbucks drone grunts his/her way through making you a latte, and have a prototype awaiting you by the time you've swiped in.

Well, here's the truth:

Starbucks coffees tend to be mostly sweet milk.
No lasting change has ever been effected in a morning.
If you need a swipe card to get into your office, you're probably not an innovator.

See, like the runners, cyclists, triathletes out there that have to get up early and do the miles, innovation is 90% base mileage. It's all about grinding out the 1% improvements in all areas, day after day. (See Team Sky Procycling.)

Over time, those 1% gains accrete into a win - however you define it.
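A quick bit of arithmetic shows why the small gains matter. This is an idealised sketch that assumes each 1% gain builds on the last one and arrives every single day, which real training (or real software) never quite manages; `compounded_gain` is just a name I've made up for the illustration.

```python
def compounded_gain(rate, periods):
    """Overall multiplier after `periods` gains of `rate`, each building on the last."""
    return (1 + rate) ** periods


for days in (30, 90, 365):
    print(f"{days} days of 1% gains: x{compounded_gain(0.01, days):.2f}")
```

Under those assumptions you're about a third better after a month and roughly 38 times better after a year. The exact figure isn't the point; the shape of the curve is.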

I hear stories cranked out about Twitter/Facebook/<insert favourite> being invented in 10 minutes on a hack day. Yeah? And the infrastructure? In fact, the most crucial bit, the business model, isn't yet finished for most of these dominating platforms, some 6/7 years after they hit the market.

Even the initial idea moves on. Facebook now is not Facebook as was. Twitter also. They evolve because they know that you don't conceive a complete vision in a single blinding flash; you refine it over a period.

The code is the least important bit. The infrastructure is the second least important.

The implementation is what gives the software life. The business model is what gives the software a future.

Tuesday, 15 January 2013

Is the end in sight for bloated MI/BI systems?

There are 2 reasons you do not need a big, monolithic, singular, global datawarehouse.

1. Distributed processing

The techniques and architectures exist to move the aggregation process close to the data - in fact, right on top of it. This makes it quick, efficient and easy to modify without significant overhead - unlike cube-dependent data-warehouses.

Hybrid-cloud architectures make it possible to aggregate on a local basis while making the aggregates available over the cloud to the global enterprise, which is then dealing with relatively small datasets.

So there's no need to spend a fortune on the communications and storage involved with pulling data to a central location, and no need to take the risk of having all your data centralised onto a single point of failure.

So no need for a big datawarehouse.
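As a rough sketch of the idea, each site can boil its raw data down to a small, mergeable summary, and only those summaries travel to the centre. Everything here (the `Partial` class, the site names, the readings) is hypothetical and only there to illustrate the pattern; it isn't how any particular product does it.

```python
from dataclasses import dataclass


@dataclass
class Partial:
    """Small, mergeable summary computed right next to the data."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def add(self, value):
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def merge(self, other):
        """Combine two summaries without ever seeing the raw values."""
        return Partial(
            self.count + other.count,
            self.total + other.total,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )

    @property
    def mean(self):
        return self.total / self.count if self.count else None


def aggregate_locally(readings):
    """Runs on site: raw readings in, one small Partial out."""
    partial = Partial()
    for value in readings:
        partial.add(value)
    return partial


def aggregate_globally(partials):
    """Runs centrally: only ever handles the small Partials."""
    result = Partial()
    for partial in partials:
        result = result.merge(partial)
    return result


# Hypothetical readings held at two sites; only the partials leave each site.
site_a = aggregate_locally([10.0, 12.0, 14.0])
site_b = aggregate_locally([20.0])
overall = aggregate_globally([site_a, site_b])
```

The global step deals with one tiny record per site however many raw readings sit behind it, which is the whole point.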

2. Cloud power

At Sabisu we do much work in the process industry where the perennial question is: do we connect our essential production systems to the cloud?

Sure, you can take advantage of the virtualisation and outsourcing available for risk mitigation and cost reduction, but in fact you're just shifting the risk to the communications provider and you're unlikely to find multi-tenant cost benefits because you're going to want a very private cloud indeed for all your valuable process data.

Cloud computing is valuable to our customers because it gives unlimited, immediately available processing power. This means that all those clever data network modelling techniques that have been the preserve of those with entire datacentres at their disposal are now accessible by anyone with a bit of budget.

So what we have now is an opportunity to try new analysis techniques that do not need a local, on-premise, expensive data-warehouse. All you need is enough communications capability to get the dataset you want to analyse to the cloud, or as described in (1) above, get the right level of aggregate to the cloud.

In fact, you don't need to persist any data in the cloud; you can reconstruct the set of results later if required by supplying the raw/aggregated data.

So, distributed processing and cloud power; an antidote for bloated MIS perhaps?

Thursday, 10 January 2013

Future for big data is small, fast and powered by end-users

I was intrigued by this article on the hype around big data: http://venturebeat.com/2012/12/21/big-data-club/

Last year I was invited to speak at a Corporate IT Forum workshop on MIS with lots of big data debate included. Some of the attendees were bemoaning a lack of 'accessible' big data technology, along the lines of 'we have petabytes of data to process and nothing to do it with', whereas others saw this as absolutely irrelevant as their organisations weren't generating this kind of data in the first place.

At Sabisu we do a lot of work with organisations that generate a lot of data. Some is structured well, lots is structured badly, lots is unstructured. But even these guys don't really have 'big data' issues along the lines of the link above. Virtually everyone we talk to has plain old data issues - the same ones they had 10 years ago, just on a bigger scale - but not multi-petabyte big data issues.

To put it into perspective, a big enterprise might have 30,000 users, all storing Excel/Word/Ppt docs and emails. Facebook has a billion, all storing video and photos. So chill out. Your organisation probably has an MIS or data management problem but not a big data problem.

That's not to say the technology and techniques pioneered by the Facebooks and Googles of this world don't have value. Every organisation would benefit from working with unstructured, non-relational data in a distributed, resilient architecture...and that's what I take big data technology to mean.

As a definition that's pretty sloppy. The fact is that distributed algorithms have been around a while. They've just not been 'accessible', which brings us back to our friends running the IT functions at 'normal' sized enterprises.

Our friends are being sold - and are buying - huge data-warehouses that cost a fortune. It is in the interests of the vendors to push the need for big data capability even if a 'normal' sized enterprise doesn't need it. And I don't believe they do.

I suggest that 99% of enterprises could function magnificently on 5% of the KPIs they currently capture. Most of the KPIs have little operational relevance. Most of the data-warehouse manipulation and exploitation is a waste of time. The reason for this is that the end-users cannot ask the questions they need to ask - there is no interface in place, so they ask a question they can get an answer to instead.

Sure, you have an MIS system. And it's self-service right? And your users love it, right? So how many of those KPIs affect your organisation's bottom-line?

Here's where 'accessible' implementations of unstructured, non-relational, distributed data processing will change things. Users would be able to ask questions that directly affect the bottom-line and it won't matter whether the right cube has been built, or batch job run, or ETL function completed, or whatever; the answer will be constructed on the fly by millions of small worker algorithms distributed throughout the IT architecture.
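Here's a very small sketch of what 'constructed on the fly' might look like: the user's question is handed to one worker per data source, each worker answers for its own source only, and the partial answers are combined. The sources, field names and functions are all invented for the illustration, and a thread pool on one machine stands in for workers that would really be spread across the architecture.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data sources scattered around the organisation.
SOURCES = {
    "finance.xlsx": [
        {"site": "north", "downtime_hours": 4.0},
        {"site": "south", "downtime_hours": 1.5},
    ],
    "shift_log.docx": [
        {"site": "north", "downtime_hours": 2.5},
    ],
    "erp": [
        {"site": "south", "downtime_hours": 3.0},
        {"site": "north", "downtime_hours": 0.5},
    ],
}


def worker(source_name, question):
    """Answer the question against a single source and return a partial result."""
    return sum(question(record) for record in SOURCES[source_name])


def ask(question):
    """Fan the question out to one worker per source, then combine the partials."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda name: worker(name, question), SOURCES))
    return sum(partials)


def north_downtime(record):
    """The user's question: how much downtime did the north site have?"""
    return record["downtime_hours"] if record["site"] == "north" else 0.0


total_north = ask(north_downtime)
```

Nothing was pre-built for that question; no cube, no batch job. A different question is just a different function passed to `ask`.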

In this way, companies can exploit the data they already have but can't get to - the data in spreadsheets, documents, presentations along with the structured/unstructured line-of-business data. Data Scientists will be roving consultants, building pre-packaged algorithms that users can exploit easily.

Wednesday, 9 January 2013

Running kit list

Chatting to the guys in work about getting in shape...not that I am at the moment.

Here's my rough kit list:

http://www.wiggle.co.uk/ronhill-pursuit-short/

http://www.wiggle.co.uk/ronhill-pursuit-tight/

http://www.wiggle.co.uk/ronhill-pursuit-square-cut-short/

http://www.wiggle.co.uk/ronhill-vizion-long-sleeve-crew-top/

http://www.wiggle.co.uk/inov-8-racesoc-16-socks-twin-pack/

I'm running in old model Inov-8 Terraflys at the moment which they seem to have discontinued. These are the nearest equivalent I think and will be my next shoe:

http://www.inov-8.com/New/Global/Product-View-Trailroc-255.html?L=26

(The new model Terraflys have a bigger heel/toe differential.)

Typically what I wear is roughly temp dependent:

>11C = short sleeve top, shorts
Between 3C and 11C = long sleeve top, shorts
<3C = long sleeve top, tights
<0C = long sleeve top, tights, base layer (gloves, beanie if req'd)

I take into account wind/rain by regarding it as lowering the temp slightly. I've never got it wrong. Nothing I run in is waterproof - I'm only ever 30 mins or so from a warm car/house. Mountain running...well, I'd have more kit.
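For the guys in work who like things spelled out, the rules above boil down to something like this. The 2C knocked off for wind/rain is a made-up figure (I only ever judge it as 'slightly'), and the function name is just for illustration.

```python
def running_kit(temp_c, wind_or_rain=False, weather_penalty_c=2.0):
    """Pick kit from the temperature, following the thresholds in the list above.

    weather_penalty_c is a hypothetical figure for how much wind/rain
    lowers the temperature I dress for.
    """
    effective = temp_c - weather_penalty_c if wind_or_rain else temp_c
    if effective > 11:
        return ["short sleeve top", "shorts"]
    if effective >= 3:
        return ["long sleeve top", "shorts"]
    if effective >= 0:
        return ["long sleeve top", "tights"]
    # Below freezing: add a base layer (plus gloves and beanie if required).
    return ["long sleeve top", "tights", "base layer"]


print(running_kit(12, wind_or_rain=True))
```

So a dry 12C is a short sleeve day, but 12C with wind or rain gets treated as 10C and the long sleeve top comes out.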