Podcast #53
Welcome everybody to another episode of our Momenta Digital Disruptor Series. This is Ed Maguire, Insights Partner with Momenta Partners, and today our guest is the CEO and co-founder of Crate.io, Christian Lutz. Crate is one of the most interesting new software companies operating in the Connected Industry space. Momenta Partners is an investor, so I just want to make that disclosure upfront, but we’re going to dive into some of the context around what Crate is doing and explore some of the ways that Crate as a platform, an enabling technology, is actually helping their customers transform their businesses.
So, Christian it’s great to have you with us this morning.
Thank you, Ed, very much for the invitation, I’m thrilled to be part of this.
Terrific. So first, could you provide a bit of background and context about your role at Crate, and some of the experiences that led you to found the company?
Yeah, I’m happy to. I’ve been with Crate since the beginning as one of the co-founders, and my background is, I studied Mechanical Engineering in Vienna, Austria, and this is my fourth venture-funded enterprise software business. My co-founder and I had experienced these challenges first-hand when you approach the dynamics of handling machine data use cases. About five years ago we felt the need for a radically new approach to that, and with our experience from scaling out a lot of backends we basically gained insight into what the requirements are for smart data management for IoT, and that was the basis on which we started the company.
Could you provide a bit of context on the market environment before you started the company? What were some of the technological and business challenges that led you to identify a gap in the market, as you saw it?
Jodok, my co-founder and CTO, was CTO of a company called studiVZ back then, and studiVZ, you can imagine, was the Facebook of Germany, the largest website in Europe, and he had to scale from, I think when he started it was like 20 servers, to over 1,000 servers. The requirements for the databases were: how do you scale out, and how do you deal with the fact that developers can do SQL, but SQL cannot scale? On the other hand you had the first NoSQL-type scalable database solutions, but then you didn’t have the developers. So this was the triggering moment where the idea was created: why wouldn’t we take a completely new approach and use a cloud-native, NoSQL-style document database, with all the benefits of the lovely NoSQL world, but implement a distributed SQL query engine on top, which then combines both worlds; so standard SQL, the lingua franca of everybody in this space, and in particular of industrial companies, with all the benefits of a NoSQL architecture. This was the idea, and it was quite a leap, because at that time NoSQL-scale know-how was very difficult to find, and at the same time there was massive demand to scale data management solutions. That was the core idea, actually.
Could you provide a bit of context, for those who may not be familiar, on the advantages of NoSQL databases, and why it is that traditional relational databases which use SQL have not been able to achieve the same level of scale?
Traditional SQL databases, which handle relational data and very often transactional workloads, were never intended from an architecture point of view to store hundreds of billions of records per day. That means the only way you could scale them was to buy bigger, fatter, beefier machines. On the other hand, the great invention that came along with NoSQL architectures was that you suddenly have a distributed data store where the workload is shared across many nodes, and you have high availability, self-healing, real-time search, full-text search, very large data sets, and all these beauties; but the disadvantage so far was that you had to give up the familiar language. So if you look at, for example, Mongo, you need to learn how to code in Mongo. If you do Elastic, you need to do the same in Elastic. There have been NoSQL databases that started with SQL-like languages, so they brought it very close to SQL; Cassandra is an example of that, they call it CQL, the Cassandra Query Language. But all of that is a problem because it’s not SQL, it’s just not compatible, which means all the applications, especially in an industrial environment where a lot of investment has happened in teams, frameworks, and applications, they’re all SQL-based. Our intention was, and this is what we basically deliver now, to combine these two worlds in a very straightforward way, so it’s standard ANSI SQL, and at the same time you have those benefits of the NoSQL world.
A little disclaimer here, because there is no warm ice-cream as you will know: we of course focus on machine data use cases, so we’re not transactional, we’re not interested in transactions and ACID compliance; we want to deal with billions of records of sensor data, and focus on high concurrency and real-time queries over JSON documents and relational data within one database.
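To make that concrete, here is a minimal sketch of what querying machine data through a standard SQL interface can look like, using CrateDB’s Python client; the connection URL, table, and column names are illustrative rather than taken from the conversation:

```python
# pip install crate  -- CrateDB's Python DB-API client
import time
from crate import client

# Connect to a CrateDB node over HTTP (URL is illustrative).
conn = client.connect("http://localhost:4200")
cursor = conn.cursor()

# Plain SQL over sensor readings, even though underneath the rows live as
# schemaless documents spread across a distributed cluster.
one_hour_ago = int((time.time() - 3600) * 1000)  # epoch milliseconds
cursor.execute("""
    SELECT sensor_id, AVG(payload['temperature']) AS avg_temp
    FROM sensor_readings
    WHERE ts > ?
    GROUP BY sensor_id
    ORDER BY avg_temp DESC
    LIMIT 10
""", (one_hour_ago,))

for sensor_id, avg_temp in cursor.fetchall():
    print(sensor_id, avg_temp)

cursor.close()
conn.close()
```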
Great, could you talk about how you have evolved the product and the technology to focus on that business case? There certainly are a lot of different database alternatives for somebody who’s trying to solve a problem, but machine data is certainly an area where there’s not… I would say you guys are hacking away at a lot of the jungle in a pretty new area; how has this vision informed the way that you have designed and architected the product?
Yeah, you’re absolutely right. When we started, we had a very naïve and very big ambition which was simply, ‘Hey, we’re going to do a MySQL-style product that is distributed, and it’s just awesome and everybody is going to love it’. Then, after winning TechCrunch Disrupt for example, we had a massive inflow of users using the product, and we realized that we cannot live up to the promise that comes with saying, ‘This is SQL-compatible’, because people would automatically assume the feature richness of MySQL or SQL Server developed over 20 years, and we were a start-up just out of the gate; of course you don’t have that full feature set.
So, when we realized that, we started to ask users, ‘Hey, what are you actually using Crate for?’ and we found that 80 percent were machine data use cases. So actually our customers pointed us in that direction, where we had originally had a much broader, and of course unrealistic, ambition. When we realized that, and this was probably 3½ years ago now, we saw it as our chance and dove into perfectly solving machine data use cases, which usually means a combination of a firehose of data, a lot of sensors with high volumes of data that have to be written in real-time to a database where they also have to be queried in real-time, at the same time as they are being written, and where you deal with what we call complex data.
So it’s JSON documents, it’s relational data, it’s time series, it’s geospatial data, and in an industrial context almost every use case involves a BLOB store, which means for example a plastic packaging manufacturer x-raying every single plastic bottle needs to store the images of those x-rays. Storing all of that in one system, eliminating the hassle of dealing with two or three different systems, that’s what we then realized we actually have to focus on, and we dig deeper every day into delivering features for these use cases.
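As a rough illustration of what ‘complex data in one system’ can look like, here is a hypothetical CrateDB-style schema created through the Python client; the table, columns, and shard count are made up for the example:

```python
from crate import client

conn = client.connect("http://localhost:4200")
cursor = conn.cursor()

# One table combining relational columns, a dynamic JSON object, a timestamp
# for time series, and a geo point -- all queryable with the same SQL.
cursor.execute("""
    CREATE TABLE IF NOT EXISTS bottle_inspections (
        line_id        TEXT,
        inspected_at   TIMESTAMP WITH TIME ZONE,
        measurements   OBJECT(DYNAMIC),   -- raw JSON payload from the sensor
        plant_location GEO_POINT,
        xray_digest    TEXT               -- reference to the stored x-ray image
    ) CLUSTERED INTO 6 SHARDS
""")

# Large binaries such as the x-ray images themselves can live in a BLOB table
# alongside the structured data, instead of in a separate system.
cursor.execute("CREATE BLOB TABLE xray_images")

conn.close()
```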
Could you talk about how your customers can implement the technology; how do you deliver the Crate.io database? Are there multiple options that your customers are looking for?
Yes, so when we started the company, we only had in mind to deliver code that you download and deploy yourself. So, we have many thousands of clusters running globally where users would just download the database and deploy it on a virtual machine, or on an IoT device, or wherever they want. About two years ago we realized that we should actually start offering this as a fully-managed database service, and now most of our customers consume it through a fully managed solution that we provide currently on Azure, with additional clouds coming. If I look at the market, we see a very strong foothold of Azure in the industrial users’ space in Europe, and of course Amazon in the US, and at the same time Alibaba in China, and that’s how we also look at the market and how we roll out our own cloud offerings.
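Either way the application sees the same SQL interface; a small sketch with the Python client, where both endpoints and the credentials are purely illustrative:

```python
from crate import client

# Self-hosted: a cluster you downloaded and deployed yourself (address made up).
self_hosted = client.connect("http://10.0.0.12:4200")

# Fully managed: a hosted cluster endpoint with credentials (hostname made up).
# From the application's point of view, both are queried the same way.
managed = client.connect(
    "https://my-cluster.example-managed-cratedb.net:4200",
    username="admin",
    password="********",
)

for conn in (self_hosted, managed):
    cursor = conn.cursor()
    cursor.execute("SELECT name FROM sys.cluster")
    print(cursor.fetchone())
```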
Great. Let’s dive a bit into some of the use cases and the ways that your customers have been able to solve problems uniquely using the technology that you’ve developed. I think one of the aspects of a database is that it is a technology that provides a platform to users to solve problems, rather than being an application itself. What are some of the ways that you are addressing some of the unique needs that your customers have?
I’ll take the example of a company called ALPLA in North America, a plastic packaging manufacturer, one of the leaders in the world for rigid plastic products. So, if you open your fridge or step into your shower, you’ll definitely have a product of theirs in your hand, because they deliver the packaging for Johnson & Johnson, Unilever, Henkel, all those kinds of companies. They started out two years ago with a data-driven manufacturing platform using what I would call the traditional approach at that time, which is you have a relational database, in this particular case Microsoft SQL Server, they scaled it to the absolute maximum, the most expensive machine you can buy, and at the same time hit the limits of scaling once you want to start storing hundreds of millions of records per day, per factory, and this company has 180 factories around the world in 45 countries. You realize that’s a scalability challenge.
So, the next step you take in solving that is: you still need the relational database because it holds the topology of your sensors, it has ERP information, it may have other relevant information, and all of that is relational. So as the next step you put an event store on the side; this is where you store all the sensor data, the JSON documents that are coming in, and now you have two databases, and you need to merge them somehow to build proper applications and queries. So you start to create extra tables and duplicate the data across the two systems just so your queries make sense. This gets expensive in cloud footprint, but it’s also a huge hassle to handle and keep all of that data in sync. At the same time, you maybe can’t find the people to deal with these kinds of technologies as you grow the team. This was when we started working together with them, and with Crate we basically replaced all of the other existing data-management solutions, running on Azure.
Just to give you an idea, one factory has 950 different sensor types, and the original approach was to have one table per sensor, store the data in those tables, and then run joins and complex queries to drive dashboards. Obviously, that fails from a simple performance point of view. Now that they have switched over to Crate, they basically made one huge table that is thousands of columns wide and billions of records deep, and they created an IoT real-time data lake for all the factories, where basically all the information flows together. As of today we have connected 20 of the 180 factories, and there’s a tough rollout plan over the next two years to bring all the factories online, and this creates a data pool where their own teams, external companies, and other tools, including open-source ones, can be put on top of this data platform. And since it’s running in Azure they also benefit from that integration, so they can use all the surrounding APIs, applications, and services that are, from our point of view, commodity services, like how to connect to a sensor with an OPC UA gateway, or how to connect an MQTT sensor, or how to do a visualization with Power BI; all of that is commodity really.
The trick is that you build the foundation with a real-time IoT data platform, where the SQL interface allows anybody to work with it, and any other application to be integrated with it.
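A minimal sketch of what such a wide, partitioned data-lake table and a dashboard query might look like in CrateDB-style SQL; the schema, factory and line names, and metrics are invented for illustration:

```python
from crate import client

conn = client.connect("http://localhost:4200")
cursor = conn.cursor()

# Instead of one table per sensor type, a single wide table partitioned by day
# serves as the real-time IoT data lake that every factory writes into.
cursor.execute("""
    CREATE TABLE IF NOT EXISTS factory_readings (
        factory_id TEXT,
        line_id    TEXT,
        sensor_id  TEXT,
        ts         TIMESTAMP WITH TIME ZONE,
        day        TIMESTAMP WITH TIME ZONE GENERATED ALWAYS AS date_trunc('day', ts),
        payload    OBJECT(DYNAMIC)
    ) PARTITIONED BY (day)
      CLUSTERED INTO 12 SHARDS
""")

# A typical dashboard query: hourly averages for one production line, read
# straight off the same table the sensors are writing into.
cursor.execute("""
    SELECT date_trunc('hour', ts) AS hour,
           AVG(payload['pressure']) AS avg_pressure
    FROM factory_readings
    WHERE factory_id = ? AND line_id = ?
    GROUP BY date_trunc('hour', ts)
    ORDER BY hour DESC
    LIMIT 24
""", ("plant-01", "line-03"))
print(cursor.fetchall())
```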
What’s interesting is, they’ve been able to use the data and the analytics to improve their uptime, and also save quite a bit of money on the amount of labor that’s required to ensure that the factories are running. We’ve spoken about this before, but what I thought was quite interesting is that this simple backplane of data analytics technology has enabled them to rethink their entire processes and digitize their factory in a way that they had not been able to do before.
I fully agree, and to be honest it’s always quite astonishing. With a lot of the offers we make and the POCs we run, we get quite some insight into how companies are dealing with this. When you start talking to companies about this whole real-time IoT approach, everybody thinks, ‘Okay, we have to apply AI, and now let’s do machine learning to figure it out’; the reality is much more mundane. There is so much low-hanging fruit for these companies, which means just the fact that they have all the production data in real-time allows them to trigger real-time alerts to the people on the shop floor. That’s pretty obvious you would think, but it’s very difficult with this amount of data, and the data cleaning involved, and the data enrichment you have to do, and everything. But this is what immediately improves the efficiency of the shop floor, and that’s just connecting all the dots together with the right tools. The future of course is that you then apply AI and machine learning capabilities to detect problems ahead of time, but for most of the companies it’s just about, ‘Hey, let’s get this stuff all online and in real-time’, and this opens a box of things that saves money.
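As a hedged sketch of what such a real-time shop-floor alert could look like on top of the same SQL interface, here is a deliberately simple polling loop; the table, threshold, and notification mechanism are all assumptions for illustration:

```python
import time
from crate import client

conn = client.connect("http://localhost:4200")
cursor = conn.cursor()

PRESSURE_LIMIT = 8.5  # illustrative threshold

while True:
    # Look at the last five minutes of readings and flag lines over the limit.
    cutoff = int((time.time() - 300) * 1000)  # epoch milliseconds
    cursor.execute("""
        SELECT factory_id, line_id, MAX(payload['pressure']) AS max_pressure
        FROM factory_readings
        WHERE ts > ?
        GROUP BY factory_id, line_id
        HAVING MAX(payload['pressure']) > ?
    """, (cutoff, PRESSURE_LIMIT))
    for factory_id, line_id, max_pressure in cursor.fetchall():
        # In a real deployment this would push to a pager or mobile app.
        print(f"ALERT {factory_id}/{line_id}: pressure {max_pressure} over limit")
    time.sleep(30)
```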
What you were pointing to on the labor side, this is not just about saving labor costs in a shift, this is about enabling people to do a much more educated, better, and more interesting job. For example, in such a factory there may be 10 production lines, and maybe three or four people per shift, 24/7, work those 10 lines. Before the system was in place, each of them had to walk about 10 kilometers per day just to do all the checks on the machines and make sure everything was running smoothly. Now this is being checked by the system, and they only go where there’s actually something to fix, or a problem is building up, or something has stopped; they immediately know, they go straight there, they fix it directly. This not only results in higher output, it also results in a much cooler job for the people there, because they’re not just stupidly ticking off checklists, they actually deal with the issues and are focused on fixing problems fast, so no waste is produced and no unnecessary downtime occurs. The return on investment came literally within the first year of deployment.
Could you talk about some of the lessons that you’ve learned working with customers like ALPLA? How have the success and the process of arriving at a real transformation in the business helped you make a broader case to the other companies in the industrial sector that you’re talking to and working with; how do you build on your lessons and articulate value that goes beyond just being a different technology solution, and instead leverages the technology to drive business value?
I think what we learn and experience every day is that we’re blessed to have had this opportunity to show our capabilities with ALPLA, a hardcore discrete manufacturing company that has been optimized down to the last corner of each factory. They had hit a kind of glass ceiling on increasing their overall equipment efficiency; they tried hard, they just couldn’t get significantly further, and this platform allowed them to leapfrog so they could significantly improve the numbers. When we now talk to customers or potential customers and are able to tell them the story of the journey of the past two years with all this learning, that’s super-valuable for these companies, because very often they’re not yet at that point, and they can benefit a lot from this learning. Of course, when you deploy such a technology, I remember very well that the initial reaction in the first factory was pretty negative, because it was perceived as an NSA-style surveillance device that now checks what you’re doing and what you’re not doing, and that’s a tough thing. After working with them, they slowly started to see, ‘Actually, this is helping me. Actually, this is taking away a lot of stupid repetitive control work, and suddenly I’m immediately dealing with issues, I can help to improve the efficiency of the whole process’, and that’s a very rewarding thing.
Now it’s turned around; the factories where the system hasn’t yet been deployed are asking, ‘So, when can we actually have that?’ ‘How can we help prepare, so we can get it as fast as possible?’, simply because the benefit across factories is very visible, and the numbers just prove that this is a valuable thing. What we learned, and I admit this was an unexpected learning, is that the human factor in implementing all of that and making it work well is a big factor. It’s not about replacing the people on the shop floor, it’s about winning them as partners. The person at ALPLA who ran this had been CEO of the whole North America operation for 14 years before he did this digitalization and industrial optimization project, so he knew every screw on the shop floor. What he did, and this was very cool, is that the people on the platform who work with the application and deal with the machines, he called them ‘Heroes’, and that shows the attitude you have when you approach it.
So, when the system produces an alert, it goes to a hero, who then deals with the message and fixes the problem. I found this a very nice detail that really shows how important it is to combine people’s creativity and domain knowledge, their know-how of how to make things better, with a tool in their hands that just does the routine work automatically and in real-time, so they can focus on the issues that will unavoidably come their way in every single production run.
You’ve hit on a theme that just comes up over and over again, which is the organizational challenge of how to get the best out of people, and that’s the biggest challenge of digital transformation.
I just want to ask one final question which is to ask you to reflect on what you see first of all as your biggest challenges, but then what are you most optimistic about in your vision for Crate, and the value you’re focusing on creating?
From a challenges point of view, I think for us as a company, you know, we’re not that huge, so for us it’s still about growing very fast, and revenues, but also getting to the next round of financing and getting the right investors in. A precondition for that is having successful products and successful customers, and for us the key challenge now is to decide which opportunities, which customers we should really engage and focus on, because we know it’s a long journey. The reward, however, is that you start with a customer at maybe 100k to 200k per year in the first year, and then it grows very fast; then you’re at half a million, and then you’re at 1 million a year with just one account. So, to line up a couple of those and decide which ones, and where we focus our energy to help them, that’s a key challenge for us in getting to the next level.
In terms of what we want to achieve at Crate: most of our company is developers, we’re very developer-heavy, and every day when we’re coding it’s really about digging deeper into machine data use cases. So, with every customer and every requirement we learn more and more about how we can improve the functionality of the database, and we want to devote enough resources and energy to continue to innovate there. And because we’re at the very front of what’s happening here, this is also the leverage we have against the huge companies like the big cloud service providers and their data solutions, which are probably addressing a much bigger market in terms of numbers of companies, like all the SMEs for example. But the big-scale industrial customers really need a specialized solution, and we think that’s also the future; it’s not about one database ruling it all, it’s about very distinct data-management solutions for particular use cases, and one big use case is industrial machine data handling and industrial time series. That’s where we want to get better and better every day, and with every customer we listen to we get more credibility in the market, so it’s a super-exciting time looking ahead.
Awesome. It’s great getting a deeper view into the work that you guys are doing. As you know, Momenta is a big believer in and supporter of the work that Crate is doing, as well as the team.
So, again, this has been Ed Maguire, Insights Partner at Momenta, with Christian Lutz, co-founder and CEO of Crate. Thank you once again, Christian, it’s always super-interesting hearing you tell the story.
Thanks for having me Ed, and all the best.
[End]