Jun 2020 – Dave Rogers

The symptoms of toxic technology

Posted on 29th Jun 2020 (updated 21st Jul 2020) by Dave Rogers in Uncategorized

Part 2 of a series on toxic technology. In part one I introduced toxic technology, and defined what I mean by the term.

The characteristics of toxic technology can be invisible to people working in organisations. It is hard to directly observe the security flaws, tangled architecture or unreadable code without expertise. But the symptoms of toxic technology are common and visible.

Delivery

Toxic technology makes delivery slower, harder and riskier. This is caused by shortcomings in the less visible design of technology: the code and architecture. This internal design has a significant effect on the ease and risk of change. Some changes can be simple, like adding lego blocks to an already assembled structure. But some changes require the existing structure to be pulled apart, or new kinds of lego block to be fabricated. The most complex changes ripple out into connected systems. Toxicity makes even the simplest changes surprisingly complicated.

There’s no perfect internal design for technology – it’s a balancing act between preparing for the future vs. reacting to immediate needs. Preparing involves making technology more malleable and understandable (known as refactoring), and therefore ready for likely future changes. But teams can be wasteful if they over-prepare, investing in changes to anticipate something that never happens. When the pressure to deliver is high, teams refactor less. Failing to refactor over a long time will make technology brittle and hard to understand – change becomes harder and harder to achieve. If the pain of changing technology is too high, it will become perceived as ‘legacy’.

Where legacy technology exists, its impact on delivery can be hard to avoid. If legacy systems hold data, or process transactions which are important to the organisation, then they become bottlenecks for delivery. It can be very challenging to replace legacy systems, so it is common for new technology to layer on top of the old, accumulating over time. Teams must be wary of the infectious nature of toxic technology – when new technology integrates with legacy technology, some changes can only move at the pace of the slowest system.

Operations

The ease of changing technology is also important for its operation – it improves the ability to manage failure. Teams need to be able to release fixes with confidence, knowing they’re unlikely to cause side effects. To help detect and diagnose failure, operators sometimes need to add new ways to observe how the system is behaving. Toxic technology increases the risk of accidentally making things worse when trying to deal with an emergency.

Toxic technology makes systems brittle. If changes cause unwanted side effects, small incidents can grow into catastrophic failures. Legacy systems will often grow a reputation for stability, but this reputation is a result of infrequent change. When incidents do occur, it takes a long time to recover and the impact on the organisation and users is more severe.

The organisation

A catastrophic IT outage is a big threat to organisations. It creates an existential threat to commercial organisations, and can cause the loss of trust in a public institution. In 2019 British Airways and TSB experienced outages which impacted customers over several days. When outages last this long, compound effects are seen – customers will shop elsewhere and public service users could see impacts on their lives. Modern technology companies, currently less burdened by toxic technology, have outages which usually only last a few minutes or hours. Sometimes the press coverage of these is high-profile, but compound effects are avoided as business returns to normal. However, in the public and banking sectors, the loss of critical systems for several days is becoming increasingly normal.

Communications around these kinds of persistent outages state, or imply, that there is a single root cause. The truth is that any major technology failure will have multiple complex causes: a succession of technical and organisational failures, alongside human error and bad luck. But organisations are unlikely to reveal these complex causes publicly, because it begins to expose the depth of toxic technology the organisation relies upon. Where they have accumulated toxic technology at scale, small failures can cascade into catastrophes.

Toxic technology is a cyber security risk. Neglect means that technology is not kept up-to-date, and vulnerabilities emerge. Pressure on delivery means good security practices are routinely deprioritised. Meanwhile, cyber crime is increasing year-by-year, in terms of both frequency and sophistication. Insufficient cyber security, caused by toxic technology, will someday result in the failure of a major corporation or institution.

Challenging new legislation like GDPR is beginning to take effect, with British Airways fined £183 million in July 2019 for a data breach. Their rivals easyJet are being investigated for the unauthorised access of 9 million customers’ personal data. Most organisations being fined so far are established organisations with high volumes of toxic technology. Legislation of technology is expanding further – including accessibility, online platforms, and encryption – meaning toxic technology is also becoming a legal concern.

Where technology is subject to standards, the impact of toxic technology can be critical to doing business. Toxic technology is very likely to breach standards through poor security, or a failure to keep up with changing requirements. PCI-DSS is a standard used by the payment cards industry to protect customers’ financial information. Failing to meet the standard can result in legal action, cost of fraud, loss of revenue and a range of other negative impacts.

Cost

Toxic technology places a financial burden on organisations. It makes it harder to be strategic and innovative because resources drain away on tactical or emergency manoeuvres. Only the highest priority work can be done, because the cost and risk of change have risen so high.

Toxic technology can get stuck – becoming both too hard to change and too costly to replace. But even if remaining unchanged, costs increase because of a changing context. Most technology is connected with networks, platforms, APIs and ad-hoc user integrations. Whenever external systems change, there are costs to re-integrating the technology. New security vulnerabilities mean new investment in cyber defences. Changing supporting platforms can trigger expensive migrations. People costs increase as access to niche skills and knowledge becomes harder. Commercial contract renewals are renegotiated at higher prices to support ageing technology. And throughout this process of toxic technology growing in cost, modern replacements tend to become cheaper, but remain out of reach due to the cost of switching.

The simplest way for toxic technology to impact an organisation’s finances is by buying or building something that’s not needed. The technology industry is full of ‘silver bullet’ solutions to complex problems, particularly in domains like data warehousing and cyber security. Faced with a complex public health and economic challenge like Covid-19, governments across the world have responded by spending significant sums on apps. Whilst these apps sound intuitively useful, and are very announceable for political leaders, there’s no evidence to show they’re effective. If they’re not scrapped entirely, governments are left with complex sustainability and data privacy challenges.

Users

Pain and frustration are common experiences when using technology. But design is even more challenging when the medium is brittle and slow-to-change. Technology provides its most painful experiences inside large slow-moving institutions, where toxicity has accumulated over decades. Inside these organisations, staff are mandated to use dire technology in order to go about their daily duties. In August 2019, Dr Dominic Pimenta described his experience as a junior doctor in the UK’s National Health Service:

I need up to TEN different programmes to run a clinic. At least ONE will always crash EVERY DAY. /4 pic.twitter.com/qaVOtZ4Lr5
— Dr Dominic Pimenta (@juniordrblog) August 9, 2019

His experiences are very typical for public servants and administrative staff in large, established organisations across the world.

It is common for users of toxic systems to increase their usability with a layer of spreadsheets and paper-based work-arounds. Whilst this can optimise use for long-term users, it makes it harder for new users to learn. As new staff enter the workforce with the raised expectations of internet-era services, they will be less tolerant of technology which is frustrating and confusing to use. The impact of poor user experience is likely to hit certain groups more than others, with the effect of excluding those with permanent, temporary or situational disabilities.

Service outages, caused by neglect of technology, can erode trust with users, and prevent them from meeting their needs. The impact can range from the inconvenient to the life-threatening. Outages to services such as medical advice, security monitoring, housing, and access to money could have enormous impacts on the lives of users.

Toxic technology is also more vulnerable to cyber attack, which can have a significant impact on users, such as the suicides following the Ashley Madison breach, or the leak of the HIV status of 14,000 individuals in Singapore.

Institutions such as governments or monopoly service providers present a big risk to their users if toxic technology accumulates. When the quality of the user experience diminishes, there is no choice of alternative, and users must suffer the consequences.

The symptoms are everywhere

These symptoms are common in larger, more stable organisations. But their causes are systemic, and so start-ups and high-growth tech companies are not immune. Without bold new approaches to building and sustaining technology, supported by changes to how we fund, staff and govern teams, the outcome is the same: the accumulation of complexity and toxicity.

These systemic causes, and ways to mitigate them, are the subject of future instalments.

Continue reading part 3.

Toxic technology

Posted on 22nd Jun 2020 (updated 21st Jul 2020) by Dave Rogers in Uncategorized

Part 1 of a series on Toxic Technology.

In 2018 I wrote about toxic technology, a short post explaining the threat organisations face from the legacy technology they accumulate. To explain the idea in more detail, I wanted to write more. This series of blogposts will cover a range of topics which contribute to toxic technology – the way teams work, the strategies we use, core operational processes, and market incentives. Later in the series, I will write about how to avoid, manage and mitigate the risks of toxic technology. This post is the first of many instalments, so if you’re interested, please do sign up for more.

Toxic technology is eating the world

In 2011 Marc Andreessen suggested that software is eating the world. He described the phenomenon of new companies using internet-enabled business models to disrupt established markets.

“we are in the middle of a dramatic and broad technological and economic shift in which software companies are poised to take over large swathes of the economy….all of the technology required to transform industries through software finally works and can be widely delivered at global scale”
Marc Andreessen, 2011

At the time he wrote this, in 2011, over two billion people used the internet, up from an estimated 50 million in 2000. Andreessen predicted that in the next ten years “at least five billion people worldwide [will] own smartphones [with] instant access to the full power of the Internet”. A decade later, there are an estimated 5.11 billion mobile users and 4.39 billion internet users globally. The majority of the world’s population are now internet users.

In the decade since, this pattern has continued as companies such as Airbnb, Uber and Snap Inc. have disrupted markets. But a different pattern better characterises the most recent decade: not market disruption, but the accelerating use of digital technology in existing organisations. This has created an impact across diverse industry sectors such as government, finance, retail and transportation.

New digital technology has been a trigger for widespread change in the public sector. Governments across the world are now transforming how they work using internet-era methods. The US Digital Service, UK Government Digital Service and e-Estonia movements led the way, and many more are following. In the US and UK, these changes represent a rebirth after an era of outsourcing, where investment in technology was principally done through procurement. Now governments and public sector bodies are building technically skilled workforces, and producing large volumes of their own technology. There are currently over 900 national and local governments and agencies contributing code on GitHub.

Disruption has also occurred across the financial sector. Large technology companies such as Apple, Google and Tencent have disrupted consumer-facing payment services. Fintech companies like Stripe, Square and Ant Financial have created innovative and popular products. But, at the heart of the financial system, established banks remain the dominant force. To compete with market entrants, conglomerates have invested large sums in digital transformation. BNP Paribas invested $3 billion in 2017, HSBC $17 billion in 2018, and JP Morgan $10.8 billion in 2019. Challenger banks like Monzo and Starling represent a more direct challenge to established banks, but whilst their growth is rapid, they remain niche players in the global banking sector.

Small to medium enterprises represent the majority employer in most countries and sectors. These small organisations collectively make a significant contribution to the software produced globally. Typical software produced by SMEs includes systems to help with routine administrative tasks, such as case management and customer records management. SMEs also produce millions of websites – whilst many are constructed using templated tools, many also involve writing bespoke code.

Outside professionalised software development communities, people use general purpose tools to create software. They may not identify as software developers, yet they create abundant software. Excel formulas, Microsoft Access databases and customised ‘low code’ platforms are examples. Millions more create software online, editing the markup code of their websites. Rudimentary knowledge of HTML (Hypertext Markup Language) and CSS (Cascading Style Sheets) can help them go beyond standard templates. Content building platforms like WordPress and Shopify democratise software development. This long tail could perhaps be the largest software sector – the user-generated content of the software industry.

A crisis in the sustainability of software

Software is growing in every sector, and within organisations large and small. Software is playing an increasingly vital economic and social role. But is it sustainable for software to keep eating the world? Do we have the resources to ensure all this software remains healthy and effective? Many patterns exist to suggest this is not the case. There is a crisis in the sustainability of software.

When technology is not sustainable, basic cyber security and maintenance practices lapse. This causes organisations to experience data breaches with increasing frequency and at increasing scale. But many of these attacks and accidents are preventable. High-visibility outages are becoming more common as a result of neglected technology. Systematic records on service outages are not kept, making trends hard to observe. But, as technology in most organisations is ageing, it is reasonable to assume the trend is worsening.

The European Union introduced the General Data Protection Regulation (GDPR) in 2018 to strengthen pre-existing privacy legislation. Whilst compliance remains subjective, it has become harder to argue that a large legacy technology estate is compliant. Data protection is now a bigger challenge for organisations – it requires more investment in modernisation, and nurturing a culture of maintenance.

It is now expected that digital services are accessible – designed inclusively to make them simple and easy for all to use. Inclusivity affects everyone by including permanent, temporary and circumstantial needs (e.g. deafness, ear infections and noisy environments). Some countries are beginning to legislate in this area, adding new legal responsibilities. Yet, the ways in which existing services exclude are often trivial to identify. Away from mass consumer markets, niche software is often inaccessible for many – staff and specialist users must work around the flaws. Low quality niche software can be the daily working experience for administrative staff in large organisations, made worse when the design excludes them.

High-growth tech companies provide many of the services that consumers experience daily. When consumers order a taxi, a takeaway or buy a book online, they use technology which has recently been renewed or replaced. High growth provides the abundant resources to make this possible. Established organisations rarely experience high growth, so accumulating technology becomes a maintenance burden. With limited investment in technology, organisations prioritise high profile services. Established airlines provide a good example of this. Buying a flight online feels like a modern internet-era service. A less-used service like changing your flight can be very challenging. Lowest priority of all, office administrators will often use ageing, low quality ‘back end’ systems.

Unsustainable technology inhibits the agility and stability of organisations. It will become a threat to their existence. Businesses will not be able to compete with the agility of younger, leaner organisations. The role of institutions will erode through lack of trust, with citizens opting for market alternatives. The importance of sustainability goes beyond the impact on organisations. If software is eating the institutions which form the structure of our societies, it must not cause them to fail. We must find ways to make digital technology sustainable over decades if these institutions, and public trust in them, are to endure.

Network and data centre energy consumption is already set to increase as a proportion of global energy consumption. If digital technology is not made sustainable, inefficiency will result in avoidable accumulating energy use. Sustainable digital technology is necessary to avoid the internet revolution being a key contributor to climate change.

Digital technology will not stop eating the world – the promise of automation is too great, and technology can have positive, transformative effects on people’s lives. If it cannot, and should not, stop, it needs to become sustainable.

What is toxic technology?

Toxic technology describes the harmful characteristics caused by poor design, or neglect. Poor design is common, in an industry where outputs are often favoured over outcomes. Neglect is systemic, caused by short-termist cultures, processes and practices which inhibit sustainability.

Whilst the impacts of toxic technology are significant, examples of toxic technology are mundane, everyday, and recognisable to most. It is: the broken kiosk at the local museum, the ageing computer-on-wheels trolleyed around the hospital, the unpatched web server that led to the embarrassing data breach, or the strange green-on-black interface from the 1990s used by the back office staff at a big bank. Toxic technology is around us all, powering our banks, care homes, warehouses and submarines. It’s pervasive.

The following are typical toxic characteristics in technology. Each is challenging and subjective to measure – making toxicity hard to expose.

Insecure – unacceptable risk of breaches of confidentiality, loss of integrity or lack of availability
Unscalable – an inability to respond to change of scale, such as increased usage, number of users, or complexity of the domain
Unreliable – lacking durability, availability and predictability
Non-compliant – non-compliance with the law, standards or an organisation’s policies
Inaccessible – the design excludes users
Hard to support – cannot be maintained effectively and efficiently
Hard to change – cannot be changed effectively and efficiently
Opaque – important information about the service cannot be obtained when needed
Overly expensive – the service isn’t value-for-money
Poorly understood – the service and its technology are poorly understood

Software in particular can move fast to toxicity, more so than physical technologies. Bridges can fail and buildings can decay but the patterns of neglect are reasonably well understood, and occur over decades. Software decay is faster, less predictable and subject to more complex external factors. Cyber security vulnerabilities can emerge in any component part. Open-source communities may become unreliable. Commercial suppliers may go out of business, or stop working in your interests. Even doing the basics like patches and upgrades is challenging due to the norms of culture, practice and process. The software industry is not yet mature enough to match the risk-management rigour of civil engineering.

The term ‘toxic’ is intentionally evocative language to give a sense of active harm, worthy of attention. Terms like ‘legacy’, ‘technical risk’ and ‘technical debt’ are useful, but don’t give a sense of urgency. For most organisations, toxic technology is a growing and ignored problem, so a change of language could help.

Systemic issues are the principal cause of toxic technology, not individuals or teams. This is important to recognise when using the very negative term ‘toxic’. The assumption should be that historic creators and decision makers made decisions in good faith. Ageing technology accrues toxic characteristics which become more visible from a contemporary perspective. Historic code reveals the culture, language and decision making of the time. It should be valued as a form of communication from the past to the present – perhaps even aesthetically appreciated like historic buildings. Toxicity is avoided through understanding that it can emerge over time from even the most thoughtfully designed technology.

Continue to part 2…

Credits

Nick Rowlands (@rowlando) for the idea to publish as a series of blogposts, reviews, and general encouragement to write more.

Steve Marshall (@SteveMarshall) and James Stewart (@jystewart) for their many second opinions on my writing.

Giles Turnbull (@gilest) for timely advice to improve my writing.
