I’ve known Jamie Sutherland since he first signed up to be Xero’s US president. His tenure was somewhat angst-ridden, and the personable software industry insider recently left the company to head back to the world of nimble startups. To this end, Sutherland has co-founded a new company, Sonix, that aims to reinvent a sector that dates back as far as managers have written reports and notes. And that is a long time: the Greeks, the Romans and the early Egyptians all had their own examples of people proclaiming missives from on high, while underlings were expected to write down their proclamations for circulation to the masses.

Enter the industrial era and it was no longer Roman rulers or Egyptian pharaohs, but managers and their need to write copious numbers of reports to their own bosses. And the task of taking down these riveting missives generally fell to secretaries and typing pools. While the advent of the word processor and personal computer meant that typing pools were pretty much a thing of the past, the idea of dictation is still alive and well. Indeed, in my role as a director, many of the boards I sit on record board meetings, with the recordings being sent to an agency to be transcribed into a document.

It’s kind of a backwards approach and not exactly at the forefront of what technology can do, which is where Sonix comes in. Sonix is an automated transcription service that has just rolled out of beta. It aims to democratize transcription, which is still a relatively expensive process: Sonix wants to change what is today both costly and time-consuming.

Artificial intelligence, of course

Sonix is, of course, a startup. Add to that the fact that it is located at the epicenter of startup life, Silicon Valley, and it should come as no surprise that the company is disrupting the status quo by way of an oft-used buzzword. In its case, artificial intelligence. As Sonix sees it, AI (and its partner in crime, machine learning) has made its way into the modern company’s business strategy. Sutherland points to predictions from consulting firm Accenture which suggest that the AI market will grow to $9.2 billion by 2019. Speech recognition, in particular, is pitting some of the largest tech companies on the planet against each other. Behemoths like Google, Amazon, IBM, and Microsoft are all vying for leadership in this space. And with machines approaching human parity for voice recognition, the competition has become even more intense. Sutherland states, including the use of the mandatory three-letter acronym:

We’ve reached an inflection point in ASR [automated speech recognition] opening up a world of opportunities with voice data and given the amount of resources and dollars that are pouring into this space, the technology is only going to get better and better. There will come a day when the technology can match the complexity of how the human ear, voice and brain interact.

Sonix leverages technology from several of the major players and combines it with its own proprietary algorithms and machine intelligence. The company claims to have the most accurate automated transcription on the market. Given that all of the major public cloud vendors habitually announce ever-improving levels of speed and accuracy from their own transcription services, that would be no mean feat.

Founders with a track record

Apart from Sutherland, Sonix’s leadership has some impressive history. Co-founders David Nguyen and Stephen Hopkins spent the last five years reinventing payroll by building Gusto. All three co-founders point to their experience disrupting traditional industries with an intense focus on design and user experience as key to delivering success for Sonix.

Multiple use cases

Sonix is pushing multiple strategies around its value proposition. The first is the democratization angle and in this they have enlisted Sonix user Ryan McColeman to wax poetic about the time and money saving aspects of Sonix:

Transcribing and editing is painful. Sonix actually makes the experience enjoyable.

On top of these benefits, Sonix also touts SEO wins from its service. As the company points out, almost 80% of all Internet traffic is video. And video, of course, isn’t easily searchable. Transcribing the audio parts of video all of a sudden opens up the world of SEO for video. And, if that weren’t enough, there is a Trump angle in all of this. Sonix is selling itself as a tool for the new world of fake news. As they say:

Internet companies have become media empires. Anyone with an internet connection can now publish news. This presents huge challenges for readers, especially those that aren’t as tech-savvy. Many of the software tools and fact check services that help identify what’s real and what’s fake are designed to help readers. Sometimes journalists make mistakes and sometimes those mistakes can be very costly. With Sonix, the interview audio is stitched to the transcript so a journalist can be 100% sure what they publish is accurate.

All of these benefits and use cases relate to transcription itself, and are in no way specific to Sonix. While the company has gotten some early traction from the likes of Fortune, KQED radio, and the Washington Post, I wasn’t 100% certain of a defensible differentiation here. The fact that Sonix is cheap (at least compared to traditional dictation services) is a bonus – $5 per transcribed hour is far less than a manual transcriber – but the giants of the tech world, Google, Amazon and Microsoft, can always do it cheaper than Sonix can.

I put this to Sutherland who, having (I’m sure) been asked this question many times before, provided a number of reasons why he believes Sonix can compete. As he sees it, the benefits that Sonix can offer that differentiate it from the large cloud vendors include:

  1. Sonix has unique proprietary algorithms that make it the most accurate automated service available – While automated transcription is closing in on parity with human transcription, it still isn’t 100% perfect. Sonix’s proprietary algorithms make it the most accurate among automated services in the market today.
  2. Sonix is not a typical transcription service; it’s an online platform – Upload a file to Sonix, and in minutes, you will receive an email notifying you that your transcription is finished. The online transcript includes timestamps, highlighting, and editing functionality built right in so you can play back a recording and edit the text simultaneously. Sonix is like Google Docs with the audio stitched to the text.
  3. Accuracy heatmapping – Sonix’s proprietary system highlights areas and words to identify accuracy confidence. This makes it even easier to pinpoint problem areas and quickly clean them up.
  4. Multi-user capability – Sonix has rethought the way businesses create, edit and publish transcripts. Because the transcript is online, it’s easy to share and invite users to edit the same transcript making the whole process much better for individuals collaborating on a project.
  5. Design & usability – All the founders have experience rooted in design. And all have experience in legacy industries that long needed a facelift. Sonix is completely rethinking how transcription should work.
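
To make the accuracy heatmapping idea a little more concrete, here is a minimal Python sketch of how confidence-based highlighting could work in principle. It is not Sonix’s implementation, which is proprietary and not described in any detail. The `Word` fields, the thresholds (0.6 and 0.85) and the bracket notation are all assumptions chosen purely for illustration; the only thing it relies on is that speech recognition engines commonly report a per-word confidence score.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float       # seconds into the audio (hypothetical field)
    confidence: float  # 0.0 to 1.0, assumed to come from the ASR engine


def bucket(confidence, low=0.6, high=0.85):
    """Map a confidence score to a coarse band; thresholds are illustrative."""
    if confidence < low:
        return "low"
    if confidence < high:
        return "medium"
    return "high"


def render_heatmap(words):
    """Plain-text stand-in for colour highlighting:
    [[word]] = low confidence, [word] = medium, word = high."""
    marks = {"low": "[[{}]]", "medium": "[{}]", "high": "{}"}
    return " ".join(marks[bucket(w.confidence)].format(w.text) for w in words)


def review_queue(words, low=0.6):
    """Timestamps and text of the words an editor should check first."""
    return [(w.start, w.text) for w in words if w.confidence < low]


words = [
    Word("the", 0.0, 0.98),
    Word("board", 0.4, 0.72),
    Word("aproved", 0.9, 0.41),
]
print(render_heatmap(words))
print(review_queue(words))
```

Because each word carries a timestamp, an editor could jump straight to the doubtful stretch of audio rather than replaying the whole recording, which is roughly the workflow the third and second points above describe.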


It’s my job to be slightly skeptical in the face of hype and vendor-excitement. While Sonix looks pretty interesting, I’m not convinced that there is massive defensibility here – Sonix feels like a really nice feature addition on top of the cloud vendors’ transcription offering, but I wouldn’t go so far as to suggest that it’s a defensible platform play.

The job of a startup, however, is to fight in the face of adversity and create for itself an opportunity – the Sonix founding team has a track record and I’m going to watch with interest as it finds its place in the market.

Ben Kepes

Ben Kepes is a technology evangelist, an investor, a commentator and a business adviser. Ben covers the convergence of technology, mobile, ubiquity and agility, all enabled by the Cloud. His areas of interest extend to enterprise software, software integration, financial/accounting software, platforms and infrastructure as well as articulating technology simply for everyday users.
