Oxygen Deprivation: FerretDB with Peter Farkas

January 31st, 2024

33 mins 55 secs

About this Episode

FerretDB enables users to run MongoDB applications on existing Postgres infrastructure. Peter Farkas (@FarkasP), co-founder and CEO of FerretDB, explains the need for an open source interface for document databases. Peter also discusses the licensing change of MongoDB and the uncertainty it created for users. He emphasizes the importance of open standards and collaboration among MongoDB alternatives to provide users with choice and interoperability. 

Contributor is looking for a community manager! If you want to know more, shoot us an email at eric@scalevp.com.

In this episode we discuss:

  • The epic mountain adventure that inspired FerretDB

  • Why commercial open-source can be additive rather than extractive

  • How compatibility and open standards drive innovation and competition

  • PDFs as an example of corporation-supported standards

  • Three tenets for building a successful open source project

Peter Farkas:
Open-source is about getting rid of vendor lock-in, about giving choice to the user. And we want to make that happen by creating the open standard and by collaborating.

Eric Anderson:
This is Contributor, a podcast telling the stories behind the best open-source projects and the communities that make them. I'm Eric Anderson. Peter Farkas is with us today. Peter is the co-founder and CEO at FerretDB and the creator of the project by the same name. Peter, welcome.

Peter Farkas:
Pleasure to be here, Eric. Thank you so much for the invitation.

Eric Anderson:
Before we went on air here, we were just recollecting that you have quite a history in databases, running database companies.

Peter Farkas:
Yeah. I think the reason I like databases, especially open-source databases, is because they're needed for everything. You name a technology, and you will need a database for it. And it's just so great to work with these technologies to enable bigger things to happen. And the reason why open-source is important in my life is because I strongly believe the databases you use should be open-source, and should be based on an open standard like SQL, MySQL, Postgres, and all the other derivatives. And this is why we started FerretDB as well. Before FerretDB, I worked for Percona, which is probably the most well-known open-source database consultancy firm. I learned a lot there. Actually, my co-founder, Peter Zaitsev, is the founder of Percona as well. And we have some other Percona people at FerretDB. Then I went on to found Altinity, which was a much different company than it is today. And worked at Cloudera as well, a bit with Big Data and Hadoop. And then here I am at FerretDB now.

Eric Anderson:
FerretDB, as I understand it, is Mongo on Postgres. How do you describe it?

Peter Farkas:
I think that's a very good summary, a one-liner summary. We basically turn Postgres into MongoDB, a MongoDB compatible database. So how you can imagine that is if you have an existing Postgres infrastructure and you have a MongoDB application, MongoDB is no longer open-source. And with FerretDB you can use your existing Postgres solution to run your MongoDB applications as well. And it's not just Postgres, we also support SQLite, SAP HANA and other backends. So it's possible to turn other databases into a MongoDB compatible database as well.

Eric Anderson:
I wasn't aware of that last part. And you already alluded to a couple of reasons why you might be interested in doing this. One was MongoDB is no longer open-source. And the second was you may have an existing investment, you could interpret investment in many ways there. But an existing focus on either Postgres, SQLite, or HANA, is that right?

Peter Farkas:
So before MongoDB went proprietary in 2018, it had been adopted by a number of companies and even governments who had a policy that they could only use open-source software in their technology stack. And with the license change on MongoDB's side, they found themselves in this impossible situation where they were already using MongoDB. And it was no longer open-source, but there was no alternative to it. And these users, these companies, these governments were looking to find alternatives and they were looking at Postgres. They were looking at some other solutions. But what they found is that there's no solution which would not require them to rewrite their entire application. And with FerretDB, you can skip all of that because you can just turn these relational databases into a MongoDB compatible database. And so we started with Postgres because we do believe in Postgres. We think this is where users are gravitating towards for a reason.
And we ended up supporting other database backends like SQLite and SAP HANA as well because there was demand from the community. Turns out that MongoDB is no longer able to serve use cases where there is an embedded application. For example, on a networking appliance, which uses MongoDB, but it's not practical to turn that into Postgres because of resource constraints. So that's why we decided to support SQLite as well. And now in some network appliances, as I mentioned, we are able to replace the last open-source version of MongoDB, which they still run because they still have to do that. Now, we can turn those appliances into a fully open-source solution again with FerretDB and SQLite. And with SAP HANA, so SAP HANA is a very interesting example because SAP just decided to build compatibility into FerretDB. So they contributed as open-source contributors. And they're still building out the compatibility for SAP HANA into FerretDB, which is a great thing to see because that confirms our suspicion that there's a need for an open-source interface for many database backends.
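
As an illustration of the general idea described here (a document-style interface layered on a relational backend), the following is a minimal Python sketch. It is not FerretDB's implementation, storage layout, or API: the table layout and the helper names `open_store`, `insert_one`, and `find` are hypothetical, and it assumes an SQLite build with the JSON functions available (the default in recent SQLite releases). It stores each document as JSON text in SQLite and translates a flat, MongoDB-style equality filter into a SQL `WHERE` clause.

```python
import json
import sqlite3


def open_store():
    """Create an in-memory SQLite database with one table for all collections.

    Hypothetical layout for illustration only: one row per document,
    with the document serialized as JSON text.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE docs (collection TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def insert_one(conn, collection, doc):
    """Store a single document (a dict) in the named collection."""
    conn.execute(
        "INSERT INTO docs (collection, body) VALUES (?, ?)",
        (collection, json.dumps(doc)),
    )


def find(conn, collection, flt):
    """Return documents matching a flat equality filter such as {"city": "London"}.

    Each filter key becomes a json_extract() comparison. Only top-level
    string and integer values are handled in this sketch; operators such
    as $gt or $in are out of scope.
    """
    clauses = ["collection = ?"]
    params = [collection]
    for field, value in flt.items():
        clauses.append("json_extract(body, ?) = ?")
        params.extend(["$." + field, value])
    sql = (
        "SELECT body FROM docs WHERE "
        + " AND ".join(clauses)
        + " ORDER BY rowid"
    )
    return [json.loads(row[0]) for row in conn.execute(sql, params)]


if __name__ == "__main__":
    conn = open_store()
    insert_one(conn, "users", {"name": "Ada", "city": "London", "age": 36})
    insert_one(conn, "users", {"name": "Linus", "city": "Helsinki", "age": 54})
    insert_one(conn, "users", {"name": "Grace", "city": "London", "age": 45})
    for doc in find(conn, "users", {"city": "London"}):
        print(doc["name"])
```

An application written against a document interface only ever calls something like `insert_one` and `find`; which relational engine sits underneath is an implementation detail, which is the property that lets a project of this kind support more than one backend.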

Eric Anderson:
Is there a community fork of MongoDB since the licensing change? We had the folks behind OpenTofu on the podcast a month or two ago. That was a licensing change followed by a very quick community fork that seemed to get enough critical mass. In some ways, FerretDB represents that community fork from Mongo?

Peter Farkas:
Well, it represents the community's desire to have an alternative to MongoDB, but it's not a fork of MongoDB. So we are not using any of the code from the last MongoDB open-source release, simply because it would be a massive undertaking. Also, it would be pretty late. So we started FerretDB three years after the license changed. So much of that code is still. But it's an interesting example that you brought with OpenTofu. So the HashiCorp story, it created a much louder uproar in the open-source community. Partially because when MongoDB came out with the Server Side Public License, they stated that it's an open-source license. They even attempted to get the Open Source Initiative to certify the SSPL as an open-source license. So there was a large amount of confusion and there's a large amount of confusion even today, whether the SSPL is an open-source license or not.
So I think MongoDB, by introducing that confusion, managed to avoid the forking of MongoDB because it was not clear whether the SSPL license was going to be regarded as an open-source license or not. With HashiCorp, this was much clearer from the get-go. It was clear that the community needed to do something and that an alternative was needed. Back then, with MongoDB, it was not as clear. And by the time SSPL was really, I guess, considered not an open-source license, it was already late for the community to get the right amount of momentum to do a fork. That's just my private opinion on the matter. It's rather interesting how different the two events were.

Eric Anderson:
I think it's a good opinion. HashiCorp is now the 10th or something notable project of late to do this. And we've had some practice on how to respond maybe to these as a community. Whereas with Mongo, I think it was like, "What's going on? What is this?" And maybe the ambiguity not only affected developers. But you've talked, Peter, I believe, about how even today some legal and large corporate entities feel like it's unclear how much liability they're exposed to operating Mongo. Tell us more about that.

Peter Farkas:
That's right. So we talked to large enterprise users on SSPL before and after founding FerretDB. We tried to understand where enterprise companies are in terms of their perception on SSPL. And the overwhelming feedback was that their legal teams are unsure where the boundaries are when it comes to what the SSPL allows and what the restrictions are. So the ambiguity of the SSPL license confuses large enterprises as well, which we believe, I mean, I don't think it's a resource problem on their side. It's more like yes, the license itself is so ambiguous that it's indeed hard to tell what is allowed and what is not allowed.

Eric Anderson:
Yeah. And so part of that's the language. There's certain lines in there that are just maybe ambiguous as to how they should be interpreted. And then two would be the amount of history on judges and cases clarifying or interpreting that language. I would imagine there's just not a lot of times that those things have been challenged.

Peter Farkas:
Yeah. And just to give an example here. So the SSPL license... And I'm not going to quote the legalese verbatim, I'm not a lawyer. But essentially, what it says is that you are allowed to provide MongoDB as a service if you added enough value on top of it. That it's fundamentally different from just a database which walks and talks like MongoDB. Now, how do you define the value there, the amount of value which would be enough for you to add to be able to run MongoDB as a service? What does providing MongoDB as a service really mean? This is not defined in the license and that's where most of the problem is.

Eric Anderson:
So three years after the license changed, you woke up and decided it was time something happened.

Peter Farkas:
Well, it's a rather crazy story. We went on this epic adventure to the Himalayas, to K2 base camp. And I think the idea of FerretDB was a result of the right amount of oxygen deprivation and cold, I guess. We talked a lot about MongoDB, taking the fact that MongoDB is one of the default databases one would use next to Postgres, next to MySQL and next to some other mainstream databases, but the only one which is not open-source. And that's rather weird because usually open-source databases are favored by users. Not just because of the need to avoid risks or license fees, but also because it's much easier to learn an open-source technology compared to a proprietary tech. So we talked about how MongoDB was still able to avoid being forked after all these years. And we tried to understand why. And FerretDB was started with the mission that this needs to be changed, that things need to go back to how they started, which is open-source.
And we also think that the world of document databases needs an open standard similar to what SQL has. It's the same or very similar story as when IBM came up with the concept of the relational database. Then IBM came up with the concept of SQL, then it dominated that market for a decade until alternatives started popping up and SQL became an open standard. And we all see that today, SQL is the definition of commodity because it's everywhere. It's taken for granted that yes, if there's a database you can interface with it using SQL. But that is a result of work and vendors coming together and the creation of the open standard. And this needs to happen with document databases and particularly with MongoDB as well. So that is the mission of FerretDB. That's why we exist because we want to change the industry and expand the market the same way as how SQL did back then in the '80s and '90s.

Eric Anderson:
It wasn't clear that after the NoSQL enthusiasm that we would be back to being excited about Postgres and other SQL databases today.

Peter Farkas:
I remember the NoSQL craze, and I think part of it was due to a big misunderstanding. I still worked at Percona. It was 2014 when MongoDB really started coming up on our radar. We were a MySQL company, so we had nothing to do with MongoDB. But I do remember that most voices were all about NoSQL is going to kill relational. And that's a huge misunderstanding because it's not about that. NoSQL is a good tool for many use cases. It makes certain things easier in certain situations, but there's no such thing as the one database which is good for everything. Not even Postgres. There's a good reason why there are things outside of Postgres. There is a good reason why there are many flavors of Postgres because they are all better at something which the user particularly cares about in that specific use case. So hearing that NoSQL or MongoDB is going to change everything by killing relational, I think that was a bit of a nonsense and a result of some misunderstanding.
What is happening today is that the two approaches are converging. So you see a lot of relational databases such as Postgres implementing document related capabilities. And at the same time document databases such as MongoDB started implementing or implemented SQL interface for BI workloads. And there's Yugabyte, for example, which provides database as a service which is capable of running document and relational workloads. And that is where NoSQL belongs right next to relational, right next to what we already had because there's a need for both.

Eric Anderson:
And Ferret fits in that vision because I can have my SQL, my Postgres and then run Ferret right alongside it for my document DB use cases.

Peter Farkas:
Right on. I just sold it to you.

Eric Anderson:
Done. The history of your career that we talked about at the beginning, you described two different models. One was the consultancy and then I don't know if a product company is the right other one. But you mentioned how Percona was different and then Altinity started out a certain way and then kind of changed. Help us understand because I think in the world of open-source database companies, to an outsider, it might not be clear that Percona or Altinity or these other models exist.

Peter Farkas:
The easiest, not saying it's easy, but the easiest way to monetize open-source, first of all, why do you need to monetize open-source? Just to go back to the complete root of the problem here. The reason you want to monetize open-source is because at least my belief is that an open-source project has a much better chance to survive if there is a company behind at least some of it. A large contributor, which nurtures the open-source project and executes on a business strategy which provides the resources. Not just for the business itself, but for the open-source project as well to thrive and grow. I think that's a good thing.

Eric Anderson:
So commercial open-source isn't merely extractive. It's not just taxing the open-source system. It is actually additive in the sense that it brings life and energy to the open-source ecosystem.

Peter Farkas:
I believe so. Because if you take a look at some examples where there were two or three open-source contributors keeping a project alive and suddenly there was a critical bug and there was no one around to fix it. And that resulted in losing trust in the project itself. That's a great example of how expecting everyone to work for free indefinitely and also provide 24/7 support for said technology and project is probably not realistic.

Eric Anderson:
We've had a guest on here describing how even after his open-source project was successful, he wasn't sure of the end game. He was like, "There's no real way to hand this off to somebody else and everybody just expects me to maintain it. And I can't do this forever." So commercial open-source gives perpetuity to open-source. That's the first fundamental premise.

Peter Farkas:
Yeah, I think it's always confusing because open-source is regarded as something which can be used for free by everyone. But in reality, for a large user, let's take Apple. If you pick up your iPhone and go to the legal section in the about menu, you will see that your iPhone or iOS is based on 100 different open-source projects and open standards. And someone needs to maintain those. And probably Apple needs assurances that that technology is going to exist even a couple of years later as well. Which brings up the question, how can you increase the level of trust in your open-source project? And that is, ironically, through monetizing, through providing services for it. So if you provide services for your open-source project with your company, then you can provide the necessary amount of assurance to your users that they will have someone who is going to be able to come and fix if something happens. If there's a bug or a missing feature or simply just ongoing maintenance of the code.
And this is what companies like Percona or Databricks or Cloudera or others recognized. If you take Cloudera, they are a company built on Hadoop, which is another free and open-source technology. But most of the users of Hadoop would not be able to take the risk of just using Hadoop without a company like Cloudera, which provides 24/7 support for said software. So it's mutually beneficial for everyone involved. And then there are the hobbyists and the smaller users who also benefit from this relationship because they get a strong open-source project as a result, which stands on a firm foundation.

Eric Anderson:
And in that context, where does FerretDB fit, Peter?

Peter Farkas:
So FerretDB was not started with having a business in mind. We wanted to solve a problem. The problem was, "Hey, what should we do with this situation where we care about databases, we care about open-source?" And most users using document databases still believe that MongoDB is open-source and still use proprietary software probably without even knowing it. So we started FerretDB with the intention that we are going to disrupt the current situation where MongoDB is the only company which can provide and can develop MongoDB itself. We were pretty successful with catching the attention of the community. We were pretty successful with making people interested in the problem. And what we need to do now is we need to step up as a company as well, which provides services and as-a-service solutions to make sure that the project itself is going to be sustainable.
So that's where we are now. What is more important is there are other alternatives to MongoDB: AWS's DocumentDB or Microsoft's Azure Cosmos DB for MongoDB. We are actually working really hard on bringing all of the alternatives together to work on creating an open standard. To make sure that these products will not be merely alternatives to MongoDB. That they will be MongoDB compatible but not as alternatives for MongoDB, but as implementations of the open standard or the eventual open standard. Meaning that MongoDB alternatives will not have to run after MongoDB or be driven by MongoDB Inc's priorities. That's what we are working on.

Eric Anderson:
That's curious. I was at Google working on BigQuery and related Big Data SQL things at one point. And originally, BigQuery was not ANSI standard SQL. It was a variant which suited the kind of workloads we expected in BigQuery. But that was always a request of the community, of users. And over time Google has now supported some kind of ANSI standard of SQL. And so you would like to work with Microsoft and AWS and others and agree on what does Mongo compatible mean? And compatible may not even be the word in the future. Maybe like OSI standard, ANSI standard, some kind of standard.

Peter Farkas:
Exactly, exactly.

Eric Anderson:
And maybe even the MongoDB company variant may or may not live up to that standard.

Peter Farkas:
Exactly. This is our big goal. This is our desire. And this is something we put a lot of effort in to make sure that this cooperation is going to be a successful one because this is the key to innovation. This is the key to competition, healthy competition in this market. Right now, there's no competition whatsoever because the solutions, MongoDB itself and its alternatives, are not compatible with each other. So you can't just look at your MongoDB Atlas invoice, be unhappy with the cost and go elsewhere. You can't. You're locked in. There's no opportunity for you to remedy that situation without having to touch your application. Of course, you may be lucky, you may be able to migrate. But as soon as you used even one of the advanced MongoDB features, you're locked into MongoDB Atlas. This is not what open-source is about.
So MongoDB still calls itself open-source in many of its documentation and marketing materials. But open-source is about getting rid of vendor lock-in, about giving choice to the user. And we want to make that happen by creating the open standard and by collaborating not just with the big cloud providers, but we are hoping to collaborate with MongoDB as well. They are also needed in this discussion. It's their interest as well.

Eric Anderson:
And maybe I can develop the value of a standard further because I think there's avoiding lock-in and there's some economic reasons. But there's also this whole tooling world where, once you support SQL, suddenly there's like... In the world of at least Big Data, there's visualization solutions that are all SQL centric. There's code writing, clients. As long as you stick to a standard, there's a whole plethora of interoperability that emerges. And you're saying we could bring that to the document universe?

Peter Farkas:
Yeah. It does not exist today. And with the open standard and with the collaboration between all of these different alternatives and MongoDB itself, hopefully, we are looking at a massive expansion of opportunities and increase in interoperability and basically the same thing we have with SQL. But just to depart a little bit from the world of databases, there is a very simple example of how an open standard can help and expand a given technology. Adobe PDF, we take it so much for granted that if you go to your airline and you want to get your boarding pass, you just click on the download button, you get a PDF or anywhere else with any vendor across many different unrelated software. There are things like PDF, which are common. And you can just print it, you can export it, you can edit it. No one used PDF before 2006 when it was not an open standard.
It was actually a technology with a dwindling popularity, it was non-existent. And only when vendors came together and Adobe also allowed that to happen to a certain degree, PDF became this universal tool. And the use of the technology itself skyrocketed. And it's just unbelievable how popular it is today. Same with if we go back to databases again, SQL. The moment it became an open standard, MySQL, Postgres, and all the others could implement the open standard. And we just take it so much for granted, and there's such a vibrant and amazing amount of innovation in that area. And we believe that the same exact thing is going to happen with document databases if we succeed and we will succeed with creating an open standard out of it. That's the essence of our vision.

Eric Anderson:
That's awesome. And history has shown you can have some success. You've got 8,000 GitHub stars in this big growing community. How did that come about? Any tips you can give for us, Peter, if I were to start an open-source project to make it as successful as yours?

Peter Farkas:
Well, I don't want to make it look like all the success we had is a result of a precisely calculated set of actions. But if I want to reflect on the success we had so far, I think the most important is to address a real problem. And for that, you probably need to sell your vision even if you don't have your product yet. In our case, we created a tech demo and explained why the world needs FerretDB. And we got a massive amount of positive response on that. We also got a lot of feedback, we needed to address that. And as soon as the community saw that we are a team which works with the community and which listens to the feedback and addresses the questions, they were more and more likely to work with us. And I think that is what we did in order to get this amount of attention from the community.
And it's not just about stars. What we are most proud of is contributions. So I think that community contributions are a lot more important than stars. I think that's the real tool or real metric you can use to measure your success and whether you are doing the right things. Because as soon as you see external contributions, that's when you can be sure that someone is trying to scratch their own itch by improving your code. And that is the sign of real interest. It's very easy to star a project. It takes no commitment whatsoever. But contributing, that's a whole other level, and that's what we are really, really taking seriously in terms of a metric.

Eric Anderson:
What's the state of the project today? You've described that you support Postgres and SQLite. This HANA project is happening. As people dive in, what should they expect out of FerretDB today and what should they expect in the future? What are you working towards?

Peter Farkas:
Yeah, so we've been working on FerretDB for two years now. The expectations towards FerretDB being a MongoDB alternative, most users expect that it is going to be the exact same thing. Unfortunately, that's not the case. So MongoDB has a lot of advanced features, a lot of features, which none of the alternatives implemented. Simply because we have this saying that 85% of MongoDB workloads use maybe 25% of MongoDB features. So that's what we are aiming for, to provide this core set of features, which most MongoDB users can utilize to migrate away from MongoDB Atlas if they want to. On the other hand, we also need to address the question of performance. So right now, we are about half as performant in most cases compared to MongoDB Atlas. This is true of other MongoDB alternatives as well. And while most use cases are not really affected by this, this difference is not sustainable.
So we want to create our own Postgres extension, which addresses some of these performance issues. But we also need to introduce other tweaks as well to get to where we want in terms of performance. So all in all, I can't say that we would be setting expectations right if we said, "Hey, whatever workload you have, just migrate to FerretDB because you're going to have the same experience." That's far from reality. But we are onboarding more and more users and their use cases. And we are developing FerretDB along the way.

Eric Anderson:
The FerretDB project's actually got even more going for it than I realized coming into the show. This idea of a document standard is really interesting. And you've convinced me that your efforts here not only can build an interesting business but can really help advance the open-source community. So thank you for what you're doing for all of us. It's a gift to humanity.

Peter Farkas:
Well, that's probably an exaggeration. But we'd like to think that we are changing the database space for the better by re-enabling the users of document databases to have a choice. I think that's what we do. It's far from revolutionizing healthcare or AI. But as I said earlier, I like the database space because it serves as the foundation of amazing tech such as AI, such as anything else you can think of. And it would make me proud if we could see five years from now that FerretDB disrupted this space in a way that users became better off than they were with MongoDB. That would make me very happy, and that's what we are marching towards.

Eric Anderson:
You can subscribe to the podcast and check out our community Slack and newsletter at contributor.fyi. If you like the show, please leave a rating and review on Apple Podcasts, Spotify, or wherever you get your podcasts. Until next time, I'm Eric Anderson and this has been Contributor.