Simon Crosby is something of an over-achiever. Not only does he have a tendency to wear lovely pink shirts, he’s also a technology-company-founding machine. After completing his Ph.D., Crosby went on to found a string of technology companies, CPlane, XenSource, and Bromium among them. An inveterate tinkerer with real intellectual curiosity, he is prone to starting things in order to solve big problems.

I last caught up with Crosby when he was busy with Bromium, a company that created so-called micro-virtualization, a way of isolating individual processes to contain any damage from rogue workloads. Bromium still exists and is in its scaling phase, so Crosby decided to shake things up and has moved on to his new thing, SWIM. To be clear, SWIM isn’t new, but Crosby’s involvement as CTO, alongside his regular sidekick, Simon Aspinall, as head of sales and marketing, is.

SWIMming in a sea of data

Just in case you haven’t heard, analyzing the massive and rapidly growing amount of data out there is going to change the world. Well, maybe not cure cancer or end world hunger, but it is fair to say that making sense of the mass of data produced by everything from autonomous vehicles to electric toothbrushes will enable better products and services.

But analyzing all that data is inherently difficult, and has historically meant gathering it all up and dropping it into a big old database somewhere. Generally what happens then is… nothing. Since no one really knows what question to ask of the data, and since very few organizations have the data science resources to actually do anything purposeful with it, most of these big data warehouses end up as cold, dark, impenetrable places where nothing happens.

SWIM wants to change all that by discovering hidden insights and predicting future performance from streaming data. But instead of creating a separate place for the data to live, SWIM sits in the stream, slurping in data at the edge and learning on the fly. By moving away from centralized, cloud-centric, batch-based approaches, SWIM plans to deliver real value, real quick.

SWIM was actually founded a few years ago by current CEO Rusty Cumpston who says, of his company’s vision:

We founded SWIM.AI to address the growing edge data problem, where real-time data from new edge devices and sensors requires fast, local analytics and prediction to aid local decision making. To address these needs, we’ve focused on building a new edge software stack which can efficiently process streaming edge data and generate business insights that can easily be shared and accessed by other applications on any device and across any network. Our leadership team brings an incredibly strong background in software development and innovation, and has developed a product that revolutionizes how edge data is processed, accessed and used for continuous learning.

And those who can’t SWIM, drown in data

As Crosby sees it, the big cloud vendors want customers to store their data and run their IoT, big-data, ML, and analytics apps in the cloud. He contends that this approach is fine for cloud-native apps like Twitter, but if organizations already operate infrastructure that delivers streams of data, a cloud-first architecture won’t cut it. As he explains, bandwidth, storage, and processing sound cheap, but the costs add up fast. And despite the appeal of big-data analysis and machine-learning services, and notwithstanding tools such as AWS SageMaker, data cleaning and labeling, and the development and training of models, still require deep domain expertise. He also contends that on-premises solutions are no better, since they demand substantial investment ahead of any return and saddle customers with maintenance costs. Add the fact that they still need a data scientist with domain expertise to make sense of it all, and you have a proposition which doesn’t stack up.

Enter SWIM EDX

SWIM EDX is, in Crosby’s view, the answer to all these problems. SWIM EDX is designed to gain insights directly from noisy data, without the need to perform batch-based analytics or create complex machine learning models. This edge-based platform is designed to learn on the fly as data is created. Designed to be implemented at the edge, and on existing hardware, SWIM EDX is enabled by the SWIM Fabric. And so, what does SWIM actually do? From the marketing briefing material:

  • SWIM builds an autonomous, resilient edge data processing fabric that spans edge devices. The Fabric supports the distributed actor framework for stateful, distributed, actor-based edge computing – and includes real-time analysis and learning. Actors are stateful “active” objects/services. The Fabric supports true parallelism, actor migration, and replication, automatic load-balancing, and is resilient to failures.
  • SWIM hides the distribution of actors across the fabric: Actors interact with each other as though all are local. Communication is consumer driven, and back-pressure regulated. The fabric is eventually consistent, and state updates are efficiently shared between instances.
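To make the actor idea above concrete, here is a minimal sketch of stateful actors that keep local state and push updates to consumers. All names here (`Actor`, `link`, `receive`) are illustrative assumptions — the article doesn’t show SWIM’s actual Fabric API — so treat this as a toy model of the pattern, not SWIM’s implementation.

```python
# Toy sketch of stateful, message-driven actors (illustrative only;
# not SWIM's real API). Each actor holds local state and forwards
# updates to consumers that linked to it, loosely mimicking
# consumer-driven, eventually consistent state sharing.

class Actor:
    """A stateful 'active' object: keeps local state, reacts to messages."""

    def __init__(self, name):
        self.name = name
        self.state = {}
        self.subscribers = []  # downstream actors consuming our state

    def link(self, consumer):
        # Consumer-driven: the downstream actor asks to be linked.
        self.subscribers.append(consumer)

    def receive(self, key, value):
        # Update local state, then share the change with consumers,
        # namespacing the key by the producing actor's name.
        self.state[key] = value
        for sub in self.subscribers:
            sub.receive(f"{self.name}.{key}", value)

sensor = Actor("sensor-1")
aggregator = Actor("aggregator")
sensor.link(aggregator)
sensor.receive("temp", 21.5)
print(aggregator.state)  # the aggregator sees sensor-1's update as local state
```

The point of the sketch is the locality illusion: the aggregator never knows whether `sensor-1` runs in the same process or on a device across the network; it just receives state updates.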

Without getting too far down in the weeds, the platform finds a balance between analysis at the edge and centralized creation and iteration of machine learning models through the use of digital twins. Again, per the literature:

  • EDX represents each real-world system (a data source) as a stateful actor that executes on the SWIM Fabric. Each actor is a stateful “digital twin” of the real-world system it represents.
  • Digital twins are automatically detected in the data and dynamically instantiated. Each analyzes and learns from its own real-world time-series data, much as you do: observing the world and continually re-computing its analysis and predictions.
  • Each twin efficiently publishes its state to other twins. Twins share data to solve problems that require insights from multiple twins – creating a dynamic “join” on their combined state.
  • Each digital twin offers a real-time API that makes it easy to visualize its state and the results of its analysis and its predictions. Application developers use these APIs at the digital twin level to quickly create custom logic for multi-actor joins to combine/correlate/analyze or learn and predict from multiple twins.
  • Each digital twin consumes data from the real world, analyzes it using a rich set of analytical functions, and trains a deep learning model for the real-world system. Each sample is matched against predicted behavior, and errors are back-propagated to improve predictions.
  • Edge-optimized implementations (for GPUs and CPUs) of the twin-centric deep learning algorithm give SWIM EDX an almost 100x speedup over cloud-based ML and are key to its success on low-power edge devices. EDX learns a specific model for each twin, rather than a system-wide model. This has many advantages: the models are robust, contextually rich, and permit local responses in real time.
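The per-twin learning loop described in those bullets — consume a sample, predict, measure the error, feed the error back into the model — can be sketched as follows. The briefing material says each twin trains a deep learning model; the exponential-smoothing "model" below is a deliberately simple stand-in for that, and all names (`DigitalTwin`, `observe`) are hypothetical.

```python
# Hedged sketch of a twin's learn-on-the-fly loop (toy stand-in for the
# deep learning model the briefing material describes). Each observed
# sample is compared against the twin's prediction, and the error is
# fed back to update the model's state.

class DigitalTwin:
    def __init__(self, source_id, alpha=0.5):
        self.source_id = source_id
        self.alpha = alpha      # learning rate for the toy model
        self.prediction = None  # the model's current predicted next value

    def observe(self, value):
        """Consume one real-world sample: predict, score, update."""
        if self.prediction is None:
            self.prediction = value  # bootstrap from the first sample
            return 0.0
        error = value - self.prediction        # prediction error
        self.prediction += self.alpha * error  # error feedback updates the model
        return error

twin = DigitalTwin("pump-42")
errors = [twin.observe(v) for v in [10.0, 12.0, 12.0, 12.0]]
# The errors shrink sample by sample as the twin adapts to its own stream.
```

The design choice the sketch illustrates is the one the bullets emphasize: each twin learns only from its own stream, so its model stays small, contextual, and cheap enough to run on the edge device itself.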

MyPOV

Simon Crosby (alongside the other SWIM folks, of course) is a smart guy who nicely combines deep technology chops with a laser focus on the more commercial aspects. SWIM would seem to be a very smart response to some very real problems that organizations are grappling with. SWIM also very much goes against the orthodox view, which is that anyone wanting to be a big player in machine learning needs to amass a mountain of data to train their various ML models.

A good example is IBM, which famously acquired The Weather Company at least in part to better train Watson, its Jeopardy-beating AI beast. SWIM goes against that grain and offers the best of both worlds: fast, effective analysis right at the edge, with centralized model-improvement benefits. One to watch, for sure.

Ben Kepes

Ben Kepes is a technology evangelist, an investor, a commentator and a business adviser. Ben covers the convergence of technology, mobile, ubiquity and agility, all enabled by the Cloud. His areas of interest extend to enterprise software, software integration, financial/accounting software, platforms and infrastructure as well as articulating technology simply for everyday users.
