When I hired engineers at my company, Server Density (now acquired by StackPath), the first part of the selection process was a writing exercise. We asked every candidate to spend no more than 1 hour researching the answer to a simple question such as “Compare and contrast MySQL and MongoDB”.
The purpose was not necessarily to test their technical skills but to assess their written communication. Errors with spelling or grammar meant immediate rejection but the ability to convey complex ideas was the main focus of the assessment.
Although anecdotal, I found there was a strong correlation between writing capability and engineering skill. Showing that you take the time to proof-read your work is a good indicator that you will put the same care into your day job. This is especially important for coding.
I would now apply this to all roles, not just engineering. The ability to convey ideas accurately, precisely and in such a way that others can understand is a crucial skill.
Examples from successful companies
Some of the most successful companies follow this sentiment.
Amazon famously uses 6-page narratives to propose new ideas, which are read in silence at the beginning of each meeting. Not only does the exercise force everyone to consider the details, it also ensures a significant amount of time and effort goes into the idea in the first place.
We don’t do PowerPoint (or any other slide-oriented) presentations at Amazon. Instead, we write narratively structured six-page memos. We silently read one at the beginning of each meeting in a kind of “study hall.” Not surprisingly, the quality of these memos varies widely. Some have the clarity of angels singing. They are brilliant and thoughtful and set up the meeting for high-quality discussion…
…The great memos are written and re-written, shared with colleagues who are asked to improve the work, set aside for a couple of days, and then edited again with a fresh mind. They simply can’t be done in a day or two.
This results in Amazon consistently operating at a high level of efficiency:
Imagine for a moment that you could go into a meeting and everyone in the meeting would have very deep context on the topic you’re going to discuss. They would be well-versed in the critical data for your business. Imagine if everyone understood the core tenets you operate by and internalized how you’re applying them to your decisions.
How great would it be not to be constantly interrupted by clarifying questions? How great would it be not to have the decisions in the meeting based on the social networking advocacy that happened before the meeting? How great would it be if executives deeply understood your organization from your perspective before asserting they know better how to do it? How great would it be to be able to review the core data going into a decision rather than have someone summarize it and assert that correlation is causality without revealing their work?
This is what meetings are like at Amazon and it is magical.
It’s not just Amazon. If you have ever used any Apple apps on iOS or macOS, you will have noticed how consistent they are in design and style. This is because they have strict, detailed Human Interface Guidelines for all their platforms.
Design consistency is an indicator of quality and the same applies to writing. Apple has a style guide which is just as relevant as the Interface Guidelines:
The Apple Style Guide provides editorial guidelines for text in Apple instructional materials, technical documentation, reference information, training programs, and user interfaces. The intent of these guidelines is to help maintain a consistent voice in Apple materials.
Writers, editors, and developers can use this document as a guide to writing style, usage, and Apple product terminology. Writers and editors should thoroughly review the guide to become familiar with the range of issues involved in creating high-quality, readable, and consistent materials. Apple developers and third-party developers should follow these guidelines for user-facing text.
It’s not just the large, successful companies that take this approach. It applies to smaller, successful businesses as well. Basecamp, a ~40 person company, uses a very similar written pitch to propose and discuss ideas for feature development:
Why don’t we pitch in person? For a few reasons:
1. We feel it’s better to write something up completely. This forces the floor — the person who’s making the pitch can’t be interrupted. They are guaranteed to be able to present their story completely, exactly as they want.
2. Further, we believe writing things out makes you consider them at a deeper level.
3. We’re big believers in asynchronous communication — write it down now and other people can absorb it later when they’re ready. Real-time communication in person or virtually forces synchronization of schedules which is highly inefficient.
4. And last, when it’s posted to Basecamp as a message, all feedback and follow up questions are automatically attached to the original post. This keeps all communication about the pitch centralized in one place on one page so everyone has access to the same story. One version of the truth.
Indeed, the importance of writing skills is something that Basecamp consider as part of their hiring process:
Our top hiring criteria — in addition to having the skills to do the job — is, are you a great writer? You have to be a great writer to work here, in every single position, because the majority of our communication is written, primarily because a lot of us work remotely but also because writing is quieter. And we like long-form writing where people really think through an idea and present it.
The available time does not change. You might be able to develop a system to increase the number of focus hours but you cannot increase the number of available hours.
If you read that paragraph in isolation then it is an obvious fact. Of course you can’t increase the available hours.
So why is this ignored when we’re actually planning work and progressing through a task list?
An interrupt-driven culture, whereby new tasks appear without reducing the pool of tasks the team are expected to complete, is something I see on a regular basis across many companies.
This is especially prevalent in engineering teams. A sprint or development cycle will begin with a set number of tasks, and then mid-cycle new tasks will be dropped in due to critical bugs, customer complaints or other reasons. This doesn’t necessarily have to be an issue – it becomes a problem when those tasks are forced in without removing other tasks. The available time is the same but suddenly there are now more things to do.
The results of this include:
Deadlines being missed – the work still has to be done, it’s just that there is now more of it than when the original time estimate was discussed. Without adjustment, it is inevitable that everything will now take more time.
Existing work taking longer than expected – if a critical bug comes in that needs to be dealt with immediately, that will cause a context switch that has an impact on individual or team productivity.
Multiple teams being impacted – the original work plan may have external dependencies from other teams e.g. a marketing campaign launch.
At my company, Server Density, we eventually implemented a solution to this that we called “parachutes”.
Development cycles were a fixed length.
All the tasks to be completed were scoped in tickets each of a fixed length.
Every member of the team had a known level of productivity i.e. how many tasks they could complete in a cycle.
This gave us a total known capacity for the cycle, which allowed us to slot in the tickets that would be completed in the cycle.
We discussed which tickets needed to be done for each cycle, with stakeholders across the business sponsoring the tickets they prioritised, everyone debating the priorities.
As a result, each ticket had a priority relative to the other tickets in the cycle.
If an unexpected task came in, it was first triaged to scope and size it. Once agreed that it needed to be dealt with urgently, the lowest priority ticket in the current cycle was “kicked out” and this new ticket “parachuted” in to replace it.
This meant we knew exactly what we were sacrificing to get the new ticket solved, with the stakeholders for the kicked-out ticket being involved in the discussion about whether to parachute the new one in. The cycle remained a fixed length, which meant that we had a good level of certainty about what was going to be completed and when.
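The bookkeeping behind this can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Ticket`, `cycle_capacity`, `plan_cycle` and `parachute` are hypothetical, and so is the convention that a lower number means a higher priority. It is not a description of any real tooling.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Ticket:
    name: str
    priority: int  # assumption for this sketch: lower number = higher priority
    sponsor: str   # stakeholder who prioritised the ticket


def cycle_capacity(tickets_per_person: dict[str, int]) -> int:
    """Total number of fixed-size tickets the team can complete in one cycle."""
    return sum(tickets_per_person.values())


def plan_cycle(backlog: list[Ticket], capacity: int) -> list[Ticket]:
    """Slot the highest-priority tickets into the cycle, up to its capacity."""
    return sorted(backlog, key=lambda t: t.priority)[:capacity]


def parachute(
    cycle: list[Ticket], urgent: Ticket, capacity: int
) -> tuple[list[Ticket], Ticket | None]:
    """Add an urgent ticket to the cycle without exceeding its capacity.

    If the cycle is already full, the lowest-priority ticket is kicked out.
    Returns the new cycle and the kicked-out ticket (None if there was room).
    """
    if len(cycle) < capacity:
        return cycle + [urgent], None
    kicked = max(cycle, key=lambda t: t.priority)
    remaining = [t for t in cycle if t is not kicked]
    return remaining + [urgent], kicked


if __name__ == "__main__":
    capacity = cycle_capacity({"alice": 2, "bob": 1})
    backlog = [
        Ticket("new onboarding flow", priority=1, sponsor="product"),
        Ticket("invoice export", priority=2, sponsor="finance"),
        Ticket("dashboard polish", priority=3, sponsor="marketing"),
        Ticket("refactor alerts", priority=4, sponsor="engineering"),
    ]
    cycle = plan_cycle(backlog, capacity)
    urgent = Ticket("critical login bug", priority=0, sponsor="support")
    cycle, kicked = parachute(cycle, urgent, capacity)
    print("kicked out:", kicked.name if kicked else "nothing")
```

Returning the kicked-out ticket, along with its sponsor, is the important design choice here: it makes the cost of the urgent work explicit so the affected stakeholder can be brought into the discussion.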
Dropping a new task into an existing cycle always has a cost. If you don’t acknowledge that time is fixed then the cost ends up being hidden under overtime, surprise missed deadlines and frustration from supporting teams. It is better to explicitly incur that cost in a defined way by sacrificing some other known task than to keep it hidden behind a bad process.
Parachuting tasks is always frustrating, especially for the owners of the kicked out ticket. But the key is visibility and shared decision-making. Without this, teams continue to maintain a fiction that everything will still be completed on time with the same effort, until it isn’t. That’s significantly more frustrating.
Once you have more than 1 person in a company, I think communication is the biggest challenge. I don’t think it matters whether you’re working 100% remotely or have everyone in the same office, the challenge of communication remains similarly difficult.
The challenge and how to solve it changes depending on the size of the company but there are common questions that everyone should have a concise and precise answer for at all times:
What is the overall mission of the company? Why are we doing what we’re doing? How is the company tracking against that mission? What is going well and what isn’t?
What are the company values? How should we evaluate the ways of achieving our mission?
How do we work and collaborate together?
What is expected of me today, this week, this month and this year? How does that link to the overall mission? How am I tracking against those goals? What is going well and what isn’t?
What is the rest of the company doing?
Practicing complete transparency is the best way to smooth the flow of information to different people across the company, whether they are employees, executives, board members or investors.
There are very few classes of information which may not benefit from full sharing – disciplinary and HR matters, complaints and perhaps individual compensation packages (but having a consistent, open set of salary bands makes up for that).
So what do these different areas involve? What are the key things that need to be communicated?
What is the overall mission of the company?
Later in the life of a company it should be obvious what the product is, but it may be harder to explain the why: Why are we building this? Why have we built that feature? What is the big problem we are trying to solve over the next 10-20 years?
This is crucial to understand because every other aspect of the company flows from this single mission. Everyone needs to know the mission because otherwise they cannot understand why the work they are doing is important to achieving that mission.
The best mission statements are short but specific. They are easy to remember and obvious, but hard to achieve, and involve playing the long game.
One of my favourites is Tesla:
…the overarching purpose of Tesla Motors (and the reason I am funding the company) is to help expedite the move from a mine-and-burn hydrocarbon economy towards a solar electric economy, which I believe to be the primary, but not exclusive, sustainable solution.
This is very clear about the problem and what the company is trying to do.
But the mission doesn’t need to be as big picture or ambitious as what Elon Musk is trying to achieve – it’s just as important to have a clear mission if your goal is to build a small but successfully sustainable company.
Without a clear, written statement, you cannot align the rest of the company goals or understand how to prioritise what to do next, and you will certainly find it difficult to hire and retain your team.
Even simple mission statements require constant communication. The whole company need to know not just what the company is working towards but how things are tracking against that goal. With real numbers and commentary on both positive and negative progress, the mission will be reinforced.
This means having regular reporting of KPIs to the whole company. You are probably doing this for investors and your board already so this format can be reused for the rest of the company.
Information about company performance either gets out through actively sharing the real numbers or it is distributed through rumour and gossip. There is no option where information doesn’t travel, so why would you not want to be in full control of the narrative and detail?
Doing this builds trust in good times so that you can call on the team when you enter more challenging times. With everyone aware of the same numbers, everyone has the same context for making decisions, understanding why decisions were made and making their contribution as valuable as possible.
What are the company values?
The company mission is always very high level – it describes “what” and “why” but doesn’t really discuss “how”.
I used to think that the idea of “company values” was a cliché but that is only true if they are vague and written in generic management speak. If the company values don’t prescribe specific behaviour then they aren’t particularly useful.
Netflix has perhaps the most famous description of its culture and values, from the culture deck CEO Reed Hastings published almost 10 years ago.
More recently, these have been written out as a set of high level principles which they then break down into specific values.
Like all great companies, we strive to hire the best and we value integrity, excellence, respect, inclusivity, and collaboration. What is special about Netflix, though, is how much we:
1. encourage independent decision-making by employees
2. share information openly, broadly, and deliberately
3. are extraordinarily candid with each other
4. keep only our highly effective people
5. avoid rules
Whether you use the term “values” or not isn’t the point – it is about having a written description of how you approach things. They describe how the team behaves, help you recruit new people who match those values and evaluate the existing team as to whether they are meeting those expectations.
Note the keyword: written.
The task of writing down the values is a good thought exercise in itself but then they can also be referenced and modified over time. If they’re not written down then there is too much of an opportunity for digression, forgetting key values or people just misremembering.
Another example is GitLab, who are unusually transparent in everything they do. Indeed, this is one of their core values.
GitLab’s six values are Collaboration, Results, Efficiency, Diversity, Iteration, and Transparency, and together they spell the CREDIT we give each other by assuming good intent.
The important thing is making them actionable, for example under “Results” they explain:
Measure results not hours
We care about what you achieve; the code you shipped, the user you made happy, and the team member you helped. Someone who took the afternoon off shouldn’t feel like they did something wrong. You don’t have to defend how you spend your day. We trust team members to do the right thing instead of having rigid rules. Do not incite competition by proclaiming how many hours you worked yesterday. If you are working too long hours talk to your manager to discuss solutions.
Making the values actionable and measurable is what transforms them from high level clichés into a true set of operating principles for the company.
How do we work and collaborate together?
Do you message someone directly on Slack? Do you send an email with an announcement? Is it expected that you will engage in discussion in the comments on a Google Doc? Does documentation go in Confluence?
These are questions everyone has when they first join a company and so should be part of the standard operating system of the business. Everyone should be using the same tools and processes and there should be exactly one way of doing things. You know this is going wrong when you see such behaviour as:
Missing crucial information because it was sent as a Slack message late at night in your timezone and there were hundreds of other messages afterwards so you didn’t see it.
Bugs getting reported through anecdote, mixing multiple issues that then get fixed in an ad-hoc way without tracking the resolution, often resulting in duplicated work as other people try to reproduce an issue that has already been fixed.
People don’t check their email often because they don’t get many messages and instead work in real time chat all day, meaning they miss meeting notifications or an important customer message.
Documentation is written in Confluence by one team but another team is using Sharepoint to work on a Word doc.
People are mentioned in document comments but the answer is provided through another channel and the original comment is left open and unresolved.
Information and documentation exists in one place but it isn’t easily searchable so people end up asking individuals for work that has already been done.
Decisions get made without you being aware and you only find out through luck or because someone happens to mention it in passing.
Discipline is easier when the company is small and just getting started – that’s the best opportunity to set the rules and processes, and tell new hires how to do things. When there is a lack of direction, teams will end up using their own tools and systems because it’s better for them but to the detriment of the overall efficiency of the company.
Sometimes changes to working processes are needed, especially when a new tool is discovered that makes a difference to the working practices of the organisation. But adopting something new should be a deliberate evaluation and decision process.
If teams end up spinning off their own processes and tools, that is an indication of poor direction and/or failure of an existing approach. Dissatisfaction with the status quo should trigger a review and debate rather than a secret deployment of a new vendor product by a single team.
I personally dislike Slack (and any real-time chat) for anything other than general, social chat or deliberate collaboration on an ongoing project e.g. triaging a customer issue with a group of people in real time, doing incident response, or backchannel chat between participants during a sales meeting. These are all instances where a small number of people come together for a specific task and for a specific period of time, after which they disband. Real time chat is an awful tool for anything else because it implies everyone must be available and responsive immediately, at all times. Information is too easily lost.
My personal preference is defaulting to asynchronous communication via email, where I can set up my own workflow, control notifications and respond in my own time and in appropriate detail. Chat has its uses but I consider it (and notifications in general) highly damaging for any kind of focused knowledge work like writing or coding. This is not a unique view:
What we’ve learned is that group chat used sparingly in a few very specific situations makes a lot of sense. What makes a lot less sense is chat as the primary, default method of communication inside an organization. A slice, yes. The whole pie, no. All sorts of eventual bad happens when a company begins thinking one-line-at-a-time most of the time.
After 10 years of remote working, the majority of it spent with chat as a core communication tool, my preferred working style now is to assume that everything is asynchronous by default, with the ability to do urgent interrupts rarely.
Real-time sometimes, asynchronous most of the time.
At Basecamp our perfect-world rule of thumb is “real-time sometimes, asynchronous most of the time”.
Basically, right now should be the exception, not the rule. That creates space and attention for the things that really do have to be discussed right now, and allows everything else to be thoroughly discussed asynchronously and thoughtfully over time.
Real time chat like Slack has its place, but should be limited by both configuration and convention. For example:
1. Slack is to be used for informal communication only. Only 90 days of activity will be retained.
2. If you use Slack, please avoid private messages. Private messages discourage collaboration. You might actually be contacting the wrong person and they cannot easily redirect you to the right person. If the person is unavailable at the moment, it is less efficient because other people cannot jump in and help. Use a public channel and mention the person or group you want to reach. This ensures it is easy for other people to chime in, involve other people if needed, and learn from whatever is discussed.
… 6. Slack messages should be considered asynchronous communication and you should not expect an instantaneous response; you have no idea what the other person is doing.
Limiting retention forces everyone to treat chat as ephemeral, and minimising private messaging encourages transparency and avoids interrupting specific people. And if it really does need to be synchronous, a video/audio call (which can then be recorded, transcribed and stored in a searchable format) is still probably better.
What is expected of me?
At this point we are down to the day to day activities of individual team members. If the above items are done well, each person will understand what the company is doing, what guides their approach and the key tools and processes they should be using. That leaves the question: what should I actually be doing?
The worst kind of work is where you are simply following orders. Sometimes specific tasks will be assigned e.g. preparing a spreadsheet of KPIs or fixing a reported bug. However, most work should be driven by outcomes, leaving the individual to decide how they achieve it guided by the mission, values and tools.
There are plenty of systems that can be used to decide what should be worked on, with the most popular (at least in startups and Silicon Valley) being OKRs. I’m actually not a big fan of them and couldn’t get them to work well at my company, but I do subscribe to some of the key principles of OKRs, specifically:
“The ability to track results on a quantitative basis.” Key results are not general or subjective actions you plan to take. They should always include numbers to make it clear how much has been achieved. For example, if Mary’s objective is to improve her sales prospecting skills, one key result might be to spend two hours a week shadowing Jennifer, the team member who demonstrates the most prospecting success.
Whatever system you use, the key is that people know what they should be doing and how it links to the overall company mission. OKRs are one way to achieve this but need to be implemented with all the other things discussed above – they won’t just magically solve a communication problem if there is no understanding of the company mission, values and processes. The team should not only know what they’re supposed to be working on and why, but be able to explain it to others:
Following the HR rep’s suggestion, I start to ask questions. Actually, a single two-part question: What are you doing, and Why?
For the What, I rely on a variant of my old I Can’t Be Stupid gambit: If I don’t understand what you’re saying, either you don’t actually know what you’re talking about, or you’re withholding something. As for the Why, please don’t say you’re just following orders — I know you have a mind of your own; and don’t hide behind “marketing demanded it” as if you respect marketing. Tell me how your work serves the common purpose. Does it improve performance, reduce cost, increase reliability, forge a killer feature? If you can’t tell me, perhaps you should go back to your desk, gather your thoughts, and come up with some answers that make me feel smarter and safer.
What is the rest of the company doing?
With all of the above implemented, everyone should have a good idea of where the company is going and what is being worked on, at least at a high level. Knowing what is actually happening on a day to day basis is easier in small companies but as the team grows and the product becomes more complex (possibly with multiple products), it becomes harder to know who everyone is and what they’re individually doing.
This essentially comes down to two things:
Transparency: are the systems, tools and processes open enough so that anyone can look at any document or ticket if they wish (subject to certain restrictions around confidentiality – GitLab again have good guidelines on what is not default-public)?
Real communication: is there regular discussion from the executive team explaining what is going on throughout the business – whether that is highlighting big (and small) wins, releases, problems or just regular reporting on activities?
There is also a question of relevance. In a company of 10 people it makes sense for everyone to know everything, and to be specifically told about all activities. If you have 10,000 people then the level of detail needs to change. Information should still be available if individuals wish to look it up, they just might not be told the complete detail on a regular basis.
Most companies do a poor job at this, especially the bigger ones. Coordination across teams within a single department is often frustrating enough that trying to get all teams fully aware is a big challenge. That doesn’t mean it can’t be done.
One good example where this is a regular problem is with the selection of technology and building internal systems. Large companies, organisations and especially the public sector suffer from duplication of systems which can only really be solved through proper value chain mapping.
Wardley Maps are a great way to understand the landscape, something which most companies are awful at.
Establishing a regular communication and reporting cadence is a standard part of running a public company – financials and progress get announced in an expected format on an expected timeline. If public companies can do this on a quarterly basis, there’s no reason why smaller companies can’t do it too, and likely much more regularly.
At my company, I used to send out monthly updates to all my investors via email. This tied in with my monthly board meetings and included details about all KPIs, an analysis of the results and what the plans were for the next month. I sent out the exact same report to all team members internally as well.
I also ran weekly whole company meetings where everyone had a minute to talk about something interesting they had done during the week, and as the CEO I commented on positive developments, good work I’d seen and provided an update on challenges, problems or anything else that was relevant.
If I was implementing this again I would think about the following:
Having all individual goals and measures for the “What is expected of me” section public on some kind of internal profile directory. This helps to answer the question of “What is the rest of the company doing?” and means people can directly link to their goals when helping others prioritise work. If everyone knows what you’re working on and why, they can figure out how their request can help achieve the goals of both parties.
Regularly discussing all aspects of the company mission, values and processes to get input on what is working and what isn’t. Just because you decide on an approach when the company is founded does not mean that you have to stick to it.
Being more deliberate about having a single source of truth for all documentation, values and the overall company handbook from the beginning of the founding of the company.
Communication is not static. The best way to do something today might change tomorrow. You have to repeat things and ask obvious questions. It can be difficult to know how well communication is actually working but you can certainly tell when it’s poor. Things are often getting done despite the communication overhead. People are guessing what they should be doing and there is a low level buzz about how well the company is truly doing. Nobody has any guiding principles helping them decide what they should be doing and there is often a lot of sudden urgency around arbitrary deadlines like working in a feature factory.
This is a big challenge all companies face, and it is one that is never solved. Whether you have an extreme level of secrecy and control like Apple or sit at the other end of the spectrum like GitLab, I think monitoring communication and actively working on improving it should be a top priority for the entire executive team.
The product organisation is unusual in that it achieves its goals through influencing other parts of the company rather than through a managerial relationship with individuals that sit within its direct reporting line.
A typical product group will have a leader reporting into the CEO who then has product managers reporting into them. The number and complexity of those products will determine how deep that org chart is but the people actually doing the implementation are usually part of another reporting group e.g. engineers report up to the VP, Engineering (or perhaps the CTO).
This is of course a generalisation, but it’s rare to have engineers reporting into the product manager, for example. The same applies to other aspects of the company – marketing reports through the VP, Marketing or CMO. Sales is led by the VP, Sales or CRO. And so on.
Despite this, the product manager is working very closely with all these areas of the organisation. They define large parts of what the work is (but not the how) but are unlikely to be directly assigning tasks or “managing” the team they are working with.
This means that good product managers must rely on their powers of influence and persuasion to make a case and explain why something should be implemented. They have no direct reporting authority or rank, so must take care to build up respect and authority by the quality of their work.
In practical terms, this means:
Understanding the customer better than anyone else. Sales might know best about what prospects are looking for before they become customers, and support might understand key pain points in product usage for existing customers, but it is the product manager who should have the best understanding of the customer as a whole, across their entire lifecycle using the product. Product manager activities should involve being on sales calls, visiting customers on-site, participating in UX reviews, dipping into support tickets and chats, running feedback sessions with sales and support, and triaging customer feedback.
Understanding the market better than anyone else. Marketing probably know how to generate the widest reach in the most targeted way and finance truly understand how margins should be optimised, but it is the job of the product manager to understand why one marketing idea might be better than another based on an understanding of how customers consider alternatives, buy and then use the product. And to give finance the input they need about competitive pricing and operational costs. Product manager activities should involve collaborating on the go-to-market plan, reviewing funnel statistics, engaging with analysts, understanding infrastructure costs, testing competitor products, positioning product pricing packages against features and customer personas, discussing deal win/loss reasons with sales and understanding churn reasons.
Communicating across the organisation. Communication is the biggest challenge in any company, and only gets more difficult as it grows. There are so many levels of granularity in delivering a successful product so it is crucial that everyone understands the long term vision and how that translates into each feature release. Product manager activities should involve discussing the strategic plan with the CEO/executive team and with the wider product manager group, testing out pitches with real customers (can you articulate how you got to where you are today, and where the product is going?), sending out regular product updates to customers (newsletters) and internal groups (changelogs), building training decks for sales/support, then delivering the training itself, writing up roadmaps, converting feedback into product specs, helping debate the prioritisation of key initiatives and explaining to sales/support why something is going to be done (or not done).
Product management truly is the job with the most cross-company influence but is least able to rely on rank and position to derive authority. It means it is one of the more difficult roles to start from scratch in, but has the most opportunity to build influence and authority through results.
The product itself: what should we build, why, and when? This involves working with sales (deal win/loss reasons), support (patterns in support requests), engineering (product development, testing and release) and operations (supporting the infrastructure behind the product).
The go-to-market approach: what should we do to raise awareness and generate new revenue? This involves working with sales (training on new releases, working on sales collateral, being on calls with prospects) and marketing (influencing product marketing campaigns, attending conferences, creating content).
There are nuances within these categories e.g. if the release is about improving a difficult area of the product interface, the go-to-market approach may involve only existing customers and churn reduction rather than new revenue. Or a release may focus on reducing specific alerts and incidents for the on-call team, rather than direct customer functionality. But much of the day-to-day of a product manager will revolve around these two aspects.
Amongst all the things that the product manager is responsible for coordinating it is important to note the key aspect they are not deciding: how.
Whether you agree or disagree that the product manager is like the CEO of the product, once you get past the early product prototypes and have the scope to hire dedicated product managers, their job does not involve deciding how things should be done.
At the point when your team is big enough to include product management, it will also include specialists in engineering, operations, sales and marketing. They will have taken over from the generalist founders and should be much better than the original team who hired them. That’s why you hired them in the first place!
The same applies to product managers. Their job is to get into the details of what should be done and when, coordinating all the resources and people to hit product goals (revenue, launch, NPS, churn, etc). Their job is to define the outcome everyone is aiming for, not to determine how it should be done.
Designers decide how to build the best user experience.
Engineering decides how to implement a feature.
Operations decides how code should be deployed.
Marketing decides how to spend their budget.
Sales decides how to engage potential customers.
Support decides how to deal with inbound customer requests.
The “how” of product is deciding how the product delivers on the business goals through the teams that the product manager then has to coordinate. This involves “what”, “why” and “when”. It includes providing ideas and input into helping the other teams figure out their “how”, but that decision rests with the team actually doing the work.
However, it’s just as important to consider how new employees are onboarded into the new role. The first few weeks and months are critical for setting people up for success, and for determining whether you actually made the right choice in hiring. There’s a reason most employment contracts have a probation period – this is the time to discover if your hiring process worked.
What does success look like?
The biggest mistake I’ve seen is that the job specification only describes the characteristics of someone who you would like to do the job, but gives no criteria for checking if that person is succeeding or failing. It should include a detailed description of what you expect the person to be doing and how you measure that output.
Indeed, how do any of your team know they are achieving what is needed for the business? There may not be quite as direct a correlation as with sales quotas, but there should be some objective criteria for every role which determines whether that person is performing or not.
This is the idea behind OKRs – having objective measures which are regularly reviewed. We didn’t actually manage to make them work at Server Density but we still had metrics that we reviewed on a weekly basis to ensure that we were tracking towards our overall objectives, which were then discussed at the monthly board meetings.
For SaaS businesses, the key metrics are well understood. MRR, churn (net MRR churn and customer churn) and NPS are typically the three key numbers relevant for all areas of the business. Then each team might get something more granular e.g. cashflow for finance, funnel metrics for sales, critical out of hours alerts for ops, ticket satisfaction for support, CPA/payback rate for marketing, etc.
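As a rough illustration of how these three numbers are commonly calculated, here is a minimal sketch in Python. The function names and the example figures are hypothetical, and the formulas are the widely used textbook definitions rather than anything specific to how we measured things at Server Density.

```python
def customer_churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Fraction of the customers at the start of the period who cancelled during it."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


def net_mrr_churn_rate(
    starting_mrr: float,
    churned_mrr: float,
    contraction_mrr: float,
    expansion_mrr: float,
) -> float:
    """MRR lost to cancellations and downgrades, offset by upgrades,
    as a fraction of MRR at the start of the period.
    A negative result means expansion outweighed the losses."""
    if starting_mrr <= 0:
        raise ValueError("starting_mrr must be positive")
    return (churned_mrr + contraction_mrr - expansion_mrr) / starting_mrr


def net_promoter_score(scores: list[int]) -> float:
    """NPS from 0-10 survey answers: % promoters (9-10) minus % detractors (0-6)."""
    if not scores:
        raise ValueError("at least one survey response is required")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


# Hypothetical month: 200 customers and 50,000 of MRR at the start.
monthly_customer_churn = customer_churn_rate(200, 5)
monthly_net_mrr_churn = net_mrr_churn_rate(50_000, 1_500, 500, 1_000)
nps = net_promoter_score([10, 9, 8, 7, 6, 3, 10, 9, 5, 10])
```

Keeping each metric as a small, separate function makes it easier to agree on the definition once and then review the same number every week.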
At Server Density, our new engineers had a goal of deploying to production by the end of the first day, although our expectation was that it would be a good 3-6 months before they were fully up to speed to work independently on the codebase. These short term vs long term goals really help set expectations.
This is where most onboarding starts and ends. It consists of setting up the mechanics of how the employee will actually get work done: laptop, email, chat tools, etc. This is obviously important, but should really be the easiest aspect of onboarding, and the one that is the most optimised.
That means all the crucial accounts, applications and systems are already set up and the computer the employee requested is ready to go. So much time is wasted with initial setup of laptops, creating email accounts, etc when this can easily be automated and be completed in advance of the first day.
Everyone has a manager. This is someone who will help you get your job done, remove obstacles, act as a career mentor and hopefully offer proper feedback in one to ones. Getting off to the right start with your manager is often daunting – you don’t know how they work, what they’re like and how best to get what you need from them.
I like the idea of a Managerial Manual which describes exactly this. My favourite example is How to Rands, something I’ve seen implemented by several managers at my company, StackPath, as well.
As the team grows, the culture will diffuse from “how the founders do things”, so the values and mission of the business need to become more formalised. This was something I wasn’t really aware of when I started Server Density – we were very informal about explaining how we did things and only wrote down our approach when we faced a problem with how something was being done.
The Netflix culture is probably the most famous and it is worth reading about what they mean by the term, but every founder and every company has a different approach to things. If I was starting a new company today, one of the first things I’d do would be to write out the mission and core values I wanted to build out. Values are what you hire against and describe how you operate. The mission is a short statement describing why the business exists through what it aims to achieve long term.
You also have to be careful of “culture fit” meaning “same as me”:
Finding the right people is also not a matter of “culture fit.” What most people really mean when they say someone is a good fit culturally is that he or she is someone they’d like to have a beer with. But people with all sorts of personalities can be great at the job you need done. This misguided hiring strategy can also contribute to a company’s lack of diversity, since very often the people we enjoy hanging out with have backgrounds much like our own.
But you do need to find people who both understand and believe in the mission as well as agree that the values are the right way to achieve it.
In addition to considering all the above, there are some impressive but gimmicky things you can do to make new employees feel welcome. I use the word “gimmick” in the sense that these are good ideas to impress someone on their first day, but aren’t really part of “onboarding” in the same way as the more important things above.
Swag – have a parcel of tshirts, hoodies, socks, stickers, etc ready and waiting for them on their desk. All correctly sized, of course.
High quality printed version of the company culture deck/values/mission.
Handwritten welcome note from their manager / the CEO.
Lunch / dinner with the management team / CEO.
Announcement introducing the new employee to the whole company.
Isn’t this a lot of work?
Yes. That’s the point. You can’t achieve anything without a great team, and it is absolutely necessary to spend a significant amount of time on the full end to end process: hiring, onboarding and then ongoing management.
You say you’re too busy. You work at a tiny startup after all. But that’s no excuse, Guthrie says. “You’re not too busy. You just spent all this time, energy and money getting this new person to join. Blowing it is going to cost you even more when you have to start the hiring process again from scratch. Don’t make someone feel like you’re too busy to make them feel good about choosing to work with your company.”
Your goal should always be to make people believe on a gut level that your organization is so amazing that they couldn’t possibly work anywhere else. Building that attitude starts immediately once an offer letter is signed. And if you do it right, when the phone rings and it’s a recruiter on the end of the line offering the next big thing, they’ll say, “Sorry, I’m happy where I am.”
There are many times when you have a goal, you try something to achieve that goal, but it doesn’t seem to work. Whether it is in trying to grow a startup, writing a piece of code or in a more personal area of life, this will be a regular occurrence. It’s often called iteration and improvement, experimentation or trial and error.
But how far does that approach go? When does the trial become an error? What do you do when things aren’t working?
1. Define what “working” looks like
This should be obvious but unless you know what the end result should look like, how can you evaluate whether whatever it is you’re trying is “working”? The best way to do this is with a single number you can measure easily. That is not always possible but the closer you can get to a quantitative metric, the better.
2. Set a timeline
You can’t continue indefinitely. You need to see progress within a reasonable period of time. You already defined what the end goal looks like, but you also need a measure of “progress” towards that goal.
This has to have a time frame because how do you know whether to keep going or stop? Knowing when to stop something is just as important as deciding when to start something.
Are you almost there or do things just keep creeping forward? Is there an illusion of progress if you look at things in isolation, but when you zoom out on the timeline then you really haven’t come far at all?
3. Get a second opinion
It’s difficult to be objective when you are deep in something day to day. You have the full context to make micro, operational decisions but you also need someone who can consider things more independently. In business, this is usually the role played by the board. Talking to close friends is similar in a personal situation.
You need people who will listen and understand, but also who are not afraid to give their true opinion.
Sometimes you can make a decision by yourself but asking a few experienced people what they think gives you additional data points to align with. You might decide to discount their opinion this time round but each time you come back with the same result, your threshold for dismissing their opinion should increase.
the definition of insanity is trying the same thing over and over and expecting a different result
Going back and tweaking variables, making changes, taking different approaches. These are all normal responses when things aren’t working. This is how debugging works! But there is a limit to how many things you can try. Ask yourself:
How is it different this time from the last time I tried it?
Is it actually different, or just a minor tweak on something we already tried?
Do I need to make a minor tweak or should we have a major rethink?
If I am trying the same thing again, why would it be different this time?
What needs to change in my approach to get an order of magnitude difference in result?
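One way to picture this is as a loop with an explicit stopping condition. The sketch below is purely illustrative: `keep_trying`, its parameters and the idea of expressing the limit as a single budget number are all assumptions made for the example, not a formula to apply literally.

```python
from typing import Any, Callable


def keep_trying(
    attempt: Callable[[int], Any],
    is_working: Callable[[Any], bool],
    cost_per_attempt: float,
    budget: float,
) -> tuple[bool, int]:
    """Repeat `attempt` until the result counts as working or the budget runs out.

    `attempt` receives the number of attempts made so far, so each call can
    change its approach. `budget` is a hypothetical stand-in for how much time
    or money you are prepared to spend before stopping.
    Returns (succeeded, attempts_made).
    """
    spent = 0.0
    attempts = 0
    # The loop condition is the important part: it forces a decision
    # about when to stop before the first attempt is ever made.
    while spent + cost_per_attempt <= budget:
        result = attempt(attempts)
        attempts += 1
        spent += cost_per_attempt
        if is_working(result):
            return True, attempts
    return False, attempts


# Hypothetical usage: each attempt costs 1 unit and we allow ourselves 10.
succeeded, tries = keep_trying(
    attempt=lambda n: n,               # a different "approach" each time
    is_working=lambda result: result >= 2,
    cost_per_attempt=1.0,
    budget=10.0,
)
```

The point of writing it this way is that the definition of “working” and the limit on attempts are both arguments you have to supply up front, rather than something you work out as you go.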
5. What are the consequences of making the decision?
Some decisions are consequential and irreversible or nearly irreversible — one-way doors — and these decisions must be made methodically, carefully, slowly, with great deliberation and consultation. If you walk through and don’t like what you see on the other side, you can’t get back to where you were before. We can call these Type 1 decisions.
But most decisions aren’t like that — they are changeable, reversible — they’re two-way doors. If you’ve made a sub-optimal Type 2 decision, you don’t have to live with the consequences for that long. You can reopen the door and go back through. Type 2 decisions can and should be made quickly by high judgment individuals or small groups.
It is usually easier to take the path of least resistance and just “keep going”. Repeating with minor tweaks appears like it should be a type 2 decision, and it probably is for a while.
However, the cost of repeating compounds over time. Each time you try one thing, you’re not trying something else. This is the opportunity cost, and a type 2 decision can quickly morph into a type 1 decision.
6. What is the opportunity cost?
The expression in your while loop above needs to be determined by your opportunity cost. What would you otherwise be doing with your time/money/team/life if you weren’t doing the above?
Everyone has a different understanding of their opportunity cost but the cost is never zero. Nobody has unlimited time or money. And it’s not just your own opportunity cost you might be wasting – what about everyone else who is involved? What do you care about? What do they care about? Would your time be better spent pursuing those goals in other ways than what you are doing right now?
I think opportunity cost is the one most people underappreciate. It has the highest impact yet is the most difficult to see, because it only becomes obvious over a long period of time.
As an early startup founder, you have unlimited flexibility to work when and where you want, and for as long as you want. It only makes sense to extend that flexibility to the whole team, right? Indeed, that’s where the idea of “unlimited” holiday came about. I adopted this as our holiday policy when Server Density was founded in 2009.
However, it didn’t really work out as intended. The problem is that there is always something else you should be doing. It’s never a good time to go away and you certainly can’t disconnect completely. I travelled quite extensively in 2010 both to the US for the company and my first ever trip to Japan on holiday. But I didn’t ever have a true “holiday” in that I could completely disconnect from work.
If the founder and CEO can’t take real time off, how do you expect the rest of the team to be able to interpret the “unlimited holiday policy”?
With this policy, there is no quota and anyone can take as much time off as they like, subject to manager approval. The company does not count the days off each person takes, although it is of course still recorded so everyone knows if someone is off.
It sounds good, especially to those new to startups. Everyone likes the idea of “unlimited” because it means they don’t need to worry about whether taking a day from their allowance is a good use of time, and won’t miss out on events because they’ve used all their allowance.
You can use it to make the most of travel opportunities. The further away you go from home, the longer it takes to get into the timezone and so the more time you want to spend in a destination. I know 2 weeks is the minimum amount of time I would want to take for a trip to Japan.
There is no guidance about how much is “reasonable”. You will probably not hire someone who will take advantage and disappear for 3 months, but what is the right amount? Expectations are ambiguous and if the founders/senior team are not taking much time off, the rest of the company may feel pressure to avoid holiday.
Without guidance, some people will take off more time than others. Perhaps those with fewer deadlines or less pressure will find it easier to take holidays. Managers will be reluctant to approve holiday, especially longer trips. And when holiday is taken there will be an unspoken pressure to “check in”, with email or Slack.
If you don’t actually track time off, you can’t tell if someone is abusing the system or let other people know who is off. So you do have to track it, which means it is not “unlimited holiday” but “no measured allowances”.
The startup marketing doesn’t match the reality, which hurts your long term reputation. If a new candidate asks your team (or reads a Glassdoor review), will they be enthusiastic about the reality of working there?
Bonuses to encourage people to take time off
With the realisation that the pure “unlimited holiday” policy wasn’t necessarily working out for everyone, we decided to introduce an incentive to take time off. This amounted to a £1000 cash bonus at the end of the year if you took at least 25 days in the calendar year. This was originally a 50% salary bonus in the final month of the year but we adjusted that to a fixed amount because it became expensive.
Using a financial incentive to encourage behaviour is a well understood approach. It is used in all sorts of areas of life, from sales commissions through to encouraging people to give up smoking.
It makes it clear what the holiday policy expectations are. The actual number of 25 days could be changed but it sets a clear guideline for how many days you are encouraged to take.
It assumes everyone is incentivised by money.
How do you set the amount? £1000 means different things to different people but a percentage of salary is also somewhat arbitrary and can prove expensive.
How do you deal with edge cases e.g. someone who takes 24 days? Is it fair that someone who takes 25 days gets the bonus whereas the person who just misses it does not?
If someone suddenly realises towards the end of the year that they need to take time off, what happens if it is not a good time or lots of people are in a similar position? You could see lots of holiday requests right at the end of the year.
How do you deal with people who join the company part of the way through the year? Pro-rate the 25 days? Pro-rate the bonus?
It introduces a large cash outflow in the final month of the year. As your team grows, this could become a problem for cashflow.
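To make the pro-rating question concrete, here is a minimal sketch of one possible answer: scale both the 25-day threshold and the £1000 bonus by the fraction of the calendar year the person was employed. The function name, the rounding choices and the decision to pro-rate both numbers are assumptions for illustration only. This is not the rule we used.

```python
import math
from datetime import date


def prorated_bonus_terms(
    start: date,
    year: int,
    full_days: int = 25,
    full_bonus: float = 1000.0,
) -> tuple[int, float]:
    """Return (days_required, bonus_amount) for someone who joined on `start`.

    Both values are scaled by the fraction of `year` the person was employed.
    Required days are rounded up; the bonus is rounded to the nearest penny.
    """
    year_start = date(year, 1, 1)
    year_end = date(year, 12, 31)
    days_in_year = (year_end - year_start).days + 1

    first_day = max(start, year_start)  # joined in an earlier year: full year
    if first_day > year_end:            # joins after this year: nothing due
        return 0, 0.0

    fraction = ((year_end - first_day).days + 1) / days_in_year
    return math.ceil(full_days * fraction), round(full_bonus * fraction, 2)


# Hypothetical example: someone who joined on 1 July 2018.
days_required, bonus = prorated_bonus_terms(date(2018, 7, 1), 2018)
```

Even this simple version has to make arbitrary choices about rounding and start dates, which illustrates how quickly the edge cases pile up.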
Setting an expected minimum with active encouragement
After trying the above, this is the approach we were using before Server Density was acquired. We had an unlimited holiday policy, tracked all time off but set an expectation that everyone would take at least the UK statutory minimum – 28 days. Anything over that was generally allowed, subject to sufficient notice, availability of any cover and in consideration of occasional deadlines. In practice, we rarely denied requests for time off and most people were in the 30-35 day range each year. We also removed the bonus.
The expectation is clear and it follows the legal requirements. The UK fits nicely in-between the crazily low allowances of the US and the quite-high expectations common in Europe, so it feels like a reasonable middle-ground.
It is a good benefit to talk about when recruiting because it’s still unusual to have “unlimited” holiday in the UK, outside of tech startups.
It is down to managers to ensure their team take the minimum time off, spread throughout the year. This removes some of the responsibility from the team member and makes the manager think about any unreasonable pressure they might be applying to keep working.
It is still “unlimited” and so the value of having a day off is less than if each person had an annual allowance. To me, this means taking paid holiday doesn’t have the same feeling of value. If you can have as many days off as you like, you value them less. Using something like time off in lieu therefore has no effect and individuals may appreciate the holiday time less.
There is an upper limit. As a founder, you have to set the example and what if you want to go on a 3 week holiday? That’s almost all your allowance gone.
Introducing financial incentives also doesn’t work. The differences are marginal and it can end up being expensive for the company.
However, the motivations for exploring these kinds of policies are still essentially valid. You want employees to feel trusted to balance the needs of the business and their desire for time off, give them the flexibility to enjoy life and make expectations clear so that nobody feels unnecessary pressure to work so much they burn out.
If I was creating a new company holiday policy today, I would go with fixed allowances. This would probably start at the UK statutory minimum during the employee’s first year and then gradually increase the longer they have been with the business. This is similar to how Basecamp run things – 3 weeks plus a few extra days here and there if needed.
I would also be flexible e.g. if someone had used all their days off but wanted a day or two more for an important event, it seems unnecessarily mean to deny the request. I would also consider allowing some unused time off to be rolled over into the next year. This would encourage employees to think about time off as something of value, but also that the company encouraged people to use it.
Ultimately this is about balancing the requirements of the business and the needs of the team, which should align almost exactly most of the time. If employees feel rested and relaxed, and feel like they don’t need to strictly count time off, the business benefits as well as the employee. The advantage of early-stage startups is that you can experiment with these things.
In December 2017, I decided to set myself a goal of reading 1 book every week, reviewing each one once I finished it. I’m pleased that in 2018 I managed to read 57 (with an average length of 306 pages).
Looking back on the 9 years of running my own company, one of the three lessons I wrote about after the acquisition was that I would have liked to read more books. Although I can’t possibly remember all the details, over the course of the last year I have felt my thinking being influenced by what I have read even if I don’t remember specifics.
My “to read” list has stubbornly remained at around 100 books because I keep adding titles that get cited in other things I’m reading! I’m aiming to keep up the same pace so at that rate, I have at least another 2 years of reading to go!
So to end 2018, here are the 12 books I rated 5 stars:
I particularly enjoyed historical fiction in 2018, and Robert Harris is the only author I read to achieve 5 stars in 2018:
This book was published in the 1960s and represents the situation at the time, so things have certainly moved on since then, but it is amazing to read about the ignorance and willful disregard for evidence resulting in the extreme damage that humans have done to the environment.
There is no such thing as an entirely pure free market economy. Where negative externalities come into play, such as with the environment, regulation is a necessary component. The degree of carelessness and blindness to any evidence highlighted in this book shows why there are certain areas which always need effective government oversight.
Whilst there is sometimes too much of a play on emotion, this is a great example of how science can be communicated effectively to a general audience, not only to help teach but also to push a policy objective. It’s no wonder that this book kick-started the environmentalist movement and helped policy-makers truly understand what their role should be.
Written in 2012, it’s interesting to see how the degeneration model has played out over the following ~5 years. This has come to a head with Brexit, and the election of Trump.
China continues to grow and the idea that it will adopt a Western approach has basically been discarded. But can it maintain its approach without the institutions and structures that made the West so successful? Perhaps it is developing its own model.
I am increasingly of the view that a proper study of history is the most important thing for anyone who is interested in politics or business, and wants to avoid the mistakes of the past. The specifics certainly aren’t predicted by Ferguson, but the direction is. All it requires is knowledge and understanding of the key events of history.
This book isn’t just about historical events, it is about their importance and relevance to today, and to the future. Applied history.
Anyone familiar with The School of Life will notice the beginning of many common themes. A clever merger of romantic philosophy, relationship advice and a love story, even more impressive knowing it was written when de Botton was only 21.
It’s clear that the current approach to funding the NHS is not going to work in the long term. The UK tax burden is already the highest it has ever been, significantly higher than global averages. With changing age demographics, funding for health is only going to become even more of a problem. That doesn’t even consider social care. How we go about addressing this at the same time as maintaining the crucial principle of universal healthcare is the subject of this book.
The instant reaction to any discussion of changes to NHS funding is to cry about the extreme inequity of the US system. Yet most healthcare systems, especially those in similar-income countries globally, and within Europe, use some form of social health insurance. These are much better comparators as likely alternatives than the US. The outcome measures tell you everything you need to know about those systems – the NHS ranks near the bottom and whilst it has improved significantly over time, the current snapshot of performance today is still poor, relative to alternative systems.
This is a discussion that needs to be had. Unfortunately, given the inability to have any form of rational debate about the NHS within the UK, it will take a brave politician to set about implementing the reforms the system badly needs. This book does a good job at describing how we got to where we are now, how we compare to alternative systems in Europe and what changes might be workable given the religious nature of anything to do with the NHS. It serves as a good basis for starting that discussion.
In a very concise and logical manner and by using real examples from the time it was written (1940-43), Hayek convincingly and systematically dismantles the idea of socialism as a realistic form of government. Walking through the key concepts of individual freedom, planning, the rule of law, security, democracy and truth, Hayek demonstrates that socialism inevitably leads to totalitarianism at the hands of the few.
Hayek provides an important understanding of his beliefs about the role of government and what the idea of traditional 19th Century liberalism means (liberty and freedom), which is quite distinct from liberalism as the views of a progressive political party. In writing before the establishment of the welfare state in Britain, his definition of socialism is in relation to the control of the economy. Most of his examples are from Nazi Germany and the 20-30 years before, whereas for many readers today, examples of socialism which come to mind are more likely to be post-WW2 East Germany and Soviet Russia. He assumes the historical context is known which means it might be difficult for the modern reader to relate, and to apply his theory to how the state should provide social security, healthcare and market regulation today.
Although I started the book already believing that socialism doesn’t work, Hayek’s style and straightforward logic have helped me significantly develop my thinking on the topic. It offers a level of clarity that is so often missing from contemporary debates. There are so many parallels with current events that you can easily replace the historical examples with more modern instances.
The final chapter discusses how an international federal system might help to prevent further war and how it could be structured to avoid socialist control. I found the parallels with the early to mid stages of the EU project fascinating. I feel the historical context of why the EU was created was missing from the Brexit referendum debate, and Hayek only serves to highlight concern about what might happen following any breakup of the EU. Parallels with the WTO are also interesting.
I will now be looking both for essays which can help to add detail to the arguments Hayek makes and for dissenting opinions and counters to his arguments.
Friedman is not quite as easy to read as Hayek. Whereas I found I could hardly put The Road to Serfdom down it was so compelling, Friedman was more of a challenging read. He gets very technical but this is because he uses real world examples of how to apply his philosophy to the current state of things (as it was in early-1960s America, which is mostly still relevant today).
I like to understand how the philosophy I read can be applied to real world situations and so I enjoyed how Friedman combines both the theoretical and practical. He doesn’t waste time repeating things unnecessarily and his logic is often very straightforward and comes to a rapid conclusion. For example, he dismisses Marx in just a few sentences, entirely effectively as well!
Friedman’s thoughts on inheritance particularly stood out for me as it instantly changed my opinion on the merits of inherited wealth vs “working hard/earned wealth”:
It is widely argued that it is essential to distinguish between inequality in personal endowments and in property, and between inequalities arising from inherited wealth and from acquired wealth. Inequality resulting from differences in personal capacities, or from differences in wealth accumulated by the individual in question, are considered appropriate, or at least not so clearly inappropriate as differences resulting from inherited wealth. This distinction is untenable. Is there any greater ethical justification for the high returns to the individual who inherits from his parents a peculiar voice for which there is a great demand than for the high returns to the individual who inherits property?
Regardless of your political viewpoint, this is a key text to understand the capitalist ideal. Friedman has chapters on taxation, education, discrimination, licensing, social welfare and others. In all he provides solutions that he thinks will much better tackle the problems at hand. I would particularly like to implement his flat rate negative income tax and simplify the entire system of taxation. This book should be required reading for anyone in government, politics and probably even society in general.
There are 2 parts to Poverty Safari. The first is a series of auto-biographical stories which provide the sad backstory for Darren McGarvey’s upbringing and experiences growing up and living in poverty in Glasgow.
It is a brave thing to do to recount such personal tales. Whilst it certainly helps to share these experiences so others can try and understand how many people live, I feel their real purpose is to provide legitimacy and authenticity to what I feel is the main, and second part of the book: McGarvey’s commentary and analysis of the current state of poverty politics.
This is important because McGarvey’s assessment is no doubt controversial and, without understanding his background, it would be right to ask how he is qualified to judge the current state of political discussions. Of course, the stories themselves are important to help the reader understand what life is like struggling with poverty but I get the feeling that society is becoming somewhat immune and numb to such otherwise emotive events. The specific, harrowing details might surprise the reader, but the fact that this happens in general probably does not. As such, the dual purpose is crucial: factual information as well as providing credibility for the author to attack the current political landscape.
It is refreshing to read an assessment of the current state of things that to me sounds entirely accurate. Not only does it criticise the entire strategic approach to dealing with poverty:
There is a big disconnect between the grand social engineering agenda of government and the far simpler, unglamorous aspirations and needs of local people, many of whom are not fluent in the ways of jargon.
but it also provides a compelling critique of how the problem is even discussed. Not only does McGarvey examine his own beliefs, but he asks questions of everyone involved:
I always just thought the aim was to dismantle poverty. However, once you see the mechanics of the poverty industry up close, you realise it’s in a state of permanent growth and that without individuals, families and communities in crisis there would no longer be a role for these massive institutions.
It is also good to see acknowledgement of the difficulties in “solving” the problem. We are stuck in a blame narrative designed to score political points, which only serves to distract from solving the real problems:
Blame for poverty is always ascribed to someone else; an outgroup that we are told not only enables poverty and benefits from it, but also gets a kick out of people being poor.
If you are interested in learning about what it is truly like to experience poverty but are frustrated by how things seem to be progressing (or not), you will find this relatively short book an enlightening read regardless of your political views. And especially if you have political views (left or right), it cuts through the challenges of having a real discussion in our current age of outrage.
Back in August 2017 I wrote a post about: Who has the serverless advantage? AWS Lambda vs GCP Cloud Functions vs Azure Functions. We were three years into commercial serverless offerings and there were still a lot of limitations. My main observation was:
What really matters is the availability and consumption of other services within the cloud provider ecosystem.
Being able to execute functions in response to events is only as useful as what you can actually do within the execution pipeline. This is where the services differ: their ability to pass data to backend services, perform calculations, transform data, store results, and quickly retrieve data.
AWS benefits from being the leader in cloud by the sheer size of its product portfolio. The core services of compute, storage and networking are commodities — the differentiation is what’s built on top of them.
At the time, Azure Functions and AWS Lambda were similar in functionality and significantly ahead of Google Cloud Functions, which was still in Beta. So how have things progressed for each provider since then, and what does this say about their approach to serverless?
It is an interesting coincidence that both Alexa and Lambda were released in Nov 2014, which suggests that the requirements of one product may have influenced the other. Alexa, as deployed on the Echo devices and in other hardware, is an inherently event-driven product. It only does things when you trigger it. It fits the Lambda execution model exactly.
If we assume AWS created Lambda to service Alexa as the first customer, this is important because it means the product development is driven by internal stakeholders who provide the initial demand and use case validation well in advance of input from the general market, rather than copying competitors.
Given how well formed Lambda was on release, I think this is a reasonable assumption. The continued development of the product since 2014 means it has a significant headstart, and the internal use case meant there was always an incentive to invest even in advance of adoption by external customers.
This means that Lambda essentially had no competition for 2 years until Azure Functions and Google Cloud Functions came out in 2016, and Google wasn’t really in the game until March 2017 due to the gap between the Feb 2016 Alpha and March 2017 Beta.
Today, Lambda supports runtimes in Java, Go, PowerShell, Node.js, C#, Python, and Ruby and has added complex functionality such as Step Functions and integration into other AWS products. Many other AWS products generate events which can be processed in Lambda, which shows how you can tie together the full ecosystem to deal with logging, metrics and security in particular. The original Alexa use case is still valid, but many of these improvements are clearly driven by external customers.
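To make the event-driven model concrete, here is a minimal sketch of a Python function in the handler style Lambda uses, reacting to an S3-style "object created" notification. The bucket name, object key and sample event are hypothetical, and the function only collects what it was told about rather than calling any real AWS service.

```python
def handler(event, context):
    """Called once per event; does nothing until something triggers it."""
    processed = []
    # S3 notifications arrive as a list of records, one per affected object.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # A real function would transform or store the object here.
        processed.append(f"{bucket}/{key}")
    return {"processed": processed}


# Hypothetical event, trimmed to the fields the handler reads.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "logs/app.log"},
            }
        }
    ]
}

result = handler(sample_event, None)
print(result)
```

The point is less the code than the shape: the function has no server loop of its own and simply waits for another product in the ecosystem to hand it an event.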
Lambda@Edge is also interesting because it allows you to execute your functions in CloudFront CDN locations, allowing for low latency applications close to the user. However, it does have some major limitations (such as 1MB response limits and a maximum of 25 deployed functions) so is clearly still an immature product. But what we know about AWS suggests this will progress rapidly. As nicely visualised by CloudZero, serverless is truly a spectrum:
Azure offers very similar functionality when compared with Lambda, with more language support in experimental runtimes e.g. Bash and PHP.
A big difference is the availability of on-premise functions, where you can run the workers wherever you wish (so long as you have a Windows server). Until the recent release of AWS Outposts, this was a major difference in product philosophy between AWS and Azure. The former was focused on pure cloud, with functionality to help you move there whereas the latter adopted a hybrid approach from the beginning. This makes sense for Azure given Microsoft’s history of on-premise software. Now we see AWS expanding to try and include the relatively small but still significant share of workloads which will always run on-premise.
But it’s not just functions being used – as I mentioned in the original article, the rest of the Azure product portfolio is just as important. HIBP makes extensive use of Azure Table Storage, which you could call another “serverless” product because it is a database as a service – no need to deal with running SQL Server. This is where AWS has often had a lead in the past because of the range of products they have. That’s less relevant today because both Azure and Google have products in all the key areas: storage, compute and databases.
Google Cloud Functions
GCF has only been Generally Available since July 2018 which makes it the youngest platform of the major three. However, it has actually been available in public since 2016. This is something I’ve called App Engine Syndrome in the past:
Cloud Functions seems to be suffering from App Engine syndrome — big announcements of Alpha/Beta features, followed by silence/minimal progress until the next big announcement the following year. The focus of Google’s serverless ambitions seems to be Firebase, not Cloud Functions.
The release notes show regular updates and new functionality during the beta but there was nothing between July 2018 and November 2018.
This is somewhat frustrating because at Server Density we made extensive use of Google Cloud Platform and it is my preferred vendor of the big three – they have the best console UX, documentation and APIs, and I’ve found all their GA products to be well designed and robust.
Indeed, GCF is a good product that is easy to work with through the command line and console. It has good integration into the rest of GCP whether as direct triggers, through pub/sub, storage, monitoring and most recently scheduled functions. The lack of development on the core product is somewhat misleading because of how it continues to be built into the rest of the platform, but it is concerning that development of what is a growing technology in the industry is seemingly being neglected, at least publicly and in comparison to the rapid progress of AWS Lambda.
My criticism of Google Cloud has long been that not enough of Google’s own products use it. Of course, they use the underlying technologies and the physical infrastructure, but unlike Amazon.com using AWS for their critical retail operations, it is unclear how much of Google is using the same platform that GCP customers are using. If Google had a use case similar to AWS and Alexa, that might provide incentives to increase the velocity of development in addition to any GCP customers. Maybe they do. But we still see Google falling further and further behind.
AWS, Azure and Google have the position of being the main three cloud providers. Indeed, they are the only ones that matter for general use cases. Their advantage is the vast resources each company can invest in infrastructure and product development. However, that doesn’t mean they are the only players in serverless. There are other, specialist providers who have particular use cases in mind.
Moving caching and request response serving to the edge reduced HIBP’s cost from $9 per day to $0.02 per day. This was not just by being able to serve many requests within Cloudflare’s free tier but also by eliminating almost 500GB of Azure network traffic:
As HIBP shows, avoiding network traffic is a second order benefit of using serverless. Colocating other services on the same platform eliminates a large amount of traffic so all you need to deal with is internal networking. Serverless certainly means not having to deal with scaling server infrastructure and so spending only based on what you use. But it also means that where you use other products like CDN, you can avoid costly traffic back to the origin.
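As a rough worked example using only the two daily figures quoted above ($9 and $0.02), the following sketch computes the implied saving. The function name and the 365-day period are assumptions for illustration, and the calculation ignores any other costs HIBP may have had.

```python
def daily_cost_saving(before: float, after: float, days: int = 365) -> dict:
    """Summarise the saving from moving a daily cost from `before` to `after`."""
    saved_per_day = before - after
    return {
        "saved_per_day": round(saved_per_day, 2),
        "saved_over_period": round(saved_per_day * days, 2),
        "reduction_percent": round(100 * saved_per_day / before, 1),
    }


result = daily_cost_saving(9.00, 0.02)
print(result)
```

On those figures the saving is about $8.98 a day, or roughly 99.8% of the original daily cost.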
This is a big part of the use case for why we launched EdgeEngine at StackPath – we have huge volumes of CDN traffic where we serve as the edge delivery provider, but requests still have to go back to the origin for dynamic processing. One of the major selling points of StackPath CDN is the diversity of PoP locations around the world. Your origin might be in AWS US East but if you are serving traffic to Spain, you will still have a significant volume of requests needing to go back to the centralised cloud provider.
Network costs at cloud providers are one of the biggest hidden taxes on using public cloud, especially if you use third party services like a CDN. So if you can eliminate those requests entirely you can not only provide better application latency but also save on your bill. Now we have our EdgeEngine product, you can do things like API token validation without it ever having to hit that central infrastructure.
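The idea of validating a token at the edge can be sketched as follows. This is not the EdgeEngine API: it is a generic illustration in Python, and the "payload.signature" token format, the shared secret and the function names are all assumptions made for the example. What matters is that an invalid request is rejected before the origin is ever contacted.

```python
import hashlib
import hmac

SECRET = b"demo-secret"  # hypothetical shared secret known to the edge


def sign(payload: str, secret: bytes = SECRET) -> str:
    """Build a token of the assumed form 'payload.signature'."""
    sig = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"


def is_valid(token: str, secret: bytes = SECRET) -> bool:
    """Recompute the signature and compare it in constant time."""
    payload, sep, sig = token.rpartition(".")
    if not sep:
        return False  # no separator, so not a token in the assumed format
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig.encode(), expected.encode())


def handle_at_edge(token: str, forward_to_origin):
    """Reject bad tokens locally; only valid requests reach the origin."""
    if not is_valid(token):
        return 401
    return forward_to_origin()


origin_calls = []


def fake_origin():
    # Stand-in for the centralised cloud infrastructure.
    origin_calls.append("hit")
    return 200


print(handle_at_edge(sign("user-42"), fake_origin))
print(handle_at_edge("user-42.tampered", fake_origin))
print(len(origin_calls))
```

Only the first request reaches the stand-in origin; the tampered one is answered at the edge, which is where both the latency and the network cost savings come from.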
AWS Lambda vs Azure Functions vs Google Cloud Functions
Lambda created the market for serverless and continues to innovate and lead on functionality. It benefits from the vast AWS product portfolio so is often a default choice for those already on AWS.
Azure Functions are just as mature as Lambda. It’s the Lambda equivalent for Microsoft fans, so is an easy default if you use Azure. .NET languages are well supported, as you would expect.
Google Cloud Functions still has the same problems of slow product development but the overall platform portfolio has improved significantly. It’s likely to be the first choice for developers starting from scratch, just because the overall experience of working with Google Cloud is better than AWS or Azure. Google is also innovating in other areas, such as with the Cloud Spanner database product. GCF may benefit from those who want something in the platform they chose for other reasons.
Ultimately, serverless is not just about functions. If you want more than just simple request manipulation or one time processing then you need to be able to connect to a datastore and other services like logging and monitoring. It’s been possible to build full applications using only serverless, like HIBP, for a long time. How sophisticated those applications get will now depend on what other services appear to support serverless functions, and what role low latency edge deployment plays in adding to the use cases. To quote Troy Hunt of HIBP:
So, in summary, the highlights here are:
Choose the right storage construct to optimise query patterns, performance and cost. It may well be one you don’t expect (as blob storage was for me).
Run serverless on the origin to keep cost down and performance up. Getting away from the constraints of logical infrastructure boundaries means you can do amazing things.
The fastest, most cost-effective way to serve requests is to respond immediately from the edge. Don’t hit the origin server unless you absolutely have to!