Episode: 3639
Title: HPR3639: Linux Inlaws S01E60: The Job Interview
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr3639/hpr3639.mp3
Transcribed: 2025-10-25 02:39:20
---
This is Hacker Public Radio Episode 3,639 for Thursday the 14th of July 2022.
Today's show is entitled, Linux Inlaws S01E60: The Job Interview.
It is part of the series Linux Inlaws.
It is the 60th show of monochromec and is about 54 minutes long.
It carries an explicit flag.
The summary is an interview with Chris Jenkins from Confluent.
This is Linux Inlaws, a podcast on topics around free and open source software,
any associated contraband, communism, the revolution in general, and whatever else
fancies your tickle.
Please note that this and other episodes may contain strong language, offensive humor,
and other certainly not politically correct language.
You have been warned.
Our parents insisted on this disclaimer.
Happy, Mum?
Thus the content is not suitable for consumption in the workplace, especially when
played back on a speaker in an open-plan office or similar environments,
any minors under the age of 35, or any pets including fluffy little killer bunnies,
your trusted guide dog, unless on speed, and cute T-rexes or other associated dinosaurs.
This is Linux Inlaws.
Season 1 episode 6X.
To use a proper regular expression.
Season 1 episode.
6.
Exactly.
6.
Square bracket opened.
1-9.
Square bracket closed.
Martin, how are things?
Yeah, things are fine.
Thank you.
We have a special guest tonight.
Yes, but before we go into today's topic, I think a little bit of bashing the United Kingdom
is in order.
I think they're pretty good at that themselves.
We have an entire government dedicated here.
We could do some Germany-bashing for a change.
You can go right ahead.
And then dodgy gas practices and stuff like that.
What do we mean gas practices?
Like we keep buying gas from other countries?
Yes, yes.
And not reducing their showers, more to the point, as we discussed last time.
Oh, no.
Here we go again.
Martin.
Martin.
I hope you enjoyed the show.
Yeah, Martin talked with the idea of stopping by at some stage.
But I think he dismisses the light on the concentration.
Off to a great start.
But this is not the shower podcast, never mind personal hygiene.
So without further ado, I would like to introduce our guest.
Chris Jenkins.
For those three people in the audience who do not know Chris Jenkins,
the other one, Kafka, because that is actually the episode,
the topic of the episode today.
Chris, why don't you introduce yourself?
Yeah, hi.
Hello, folks.
I'm Chris Jenkins.
I am.
Oh, God, I'm a geek of old.
I started.
I got my first computer when I was seven and I haven't looked back since.
It was.
Okay.
Okay.
So this is very local or specific, but you may remember a company called Tandy,
who were Radio Shack in the States.
Absolutely.
Martin, and do you think you or I or we are old, apparently not?
I was seven.
Come on.
Okay.
Don't do the maths.
I'm sorry.
You didn't ask on that.
Could you have had a round for quite a while, still?
They were.
They were.
They will only put me in a decade.
Anyway.
I mean, it literally had like a four character, seven segment display and a microprocessor and
some buttons you held down to make binary chords.
But it came with a good manual and you could load instruction sets onto it.
And it taught you how to add and subtract in binary, which to a seven year old was cool.
To this seven year old.
That makes it.
That makes it in the sixties, doesn't it?
Like I'd be able to play 60 or whatever.
I mean, I think it was that technology, but by the time it got into the affordable retail
hand of a seven year old, it was.
Yeah.
It was probably late eighties.
Let's say early nineties.
So I sound younger.
Let's go.
Very often.
You heard I am.
I've reached the age where I lie about my age.
No, no.
There's a point where you stop caring, isn't there, Chris?
There's a spot.
You know, a lot of people will have this.
There's a point at which you have to calculate it from scratch every.
My last.
You go.
Okay.
So I was born in 83, carry the one.
Okay.
So my ages.
Yeah.
So we get.
So anyway, whatever age I am, because I'm not going to calculate it from scratch now.
Well, you see, it used to be important calculus, but I think these days,
these things are called smartphones.
Yeah.
That can help.
That can give you a helping hand here.
But then you have to remember as like, were you born?
Have you passed your birthday?
What time of year is it?
What time of day is it?
Have I taken my medication this morning?
Normally they have a built-in calendar.
These days.
Maybe I'm wrong.
Yeah.
Yeah.
Do you know it's getting easier to just ask my wife.
So that makes another case for a younger wife.
I see.
I know I, my wife is slightly older than me and I have a rule.
I do not want to be the mature one in the relationship.
Okay.
Fair enough.
He's not going to work.
Martin, take note.
No, we're going to scrap this.
We're going to edit this out.
Yeah.
You're better off with older women.
That's my family.
Okay.
Chris, I'm.
Which is to say.
Yes.
I'm a developer advocate for Confluent.
I'm just about to ask you why we have you.
That's a bit of a hobby.
Is it?
Yeah.
Okay.
For those.
Why don't you.
Why don't you give.
Why don't you give a short overview of what Kafka is and how
Confluent comes into play.
And then we're going to grill you with a couple of questions.
Maybe not.
Maybe I'll preload some questions on that.
So you've got Apache Kafka, which is an open source event
streaming.
Platform.
Or is it a.
Is it a queue?
Is an event streaming platform?
Is it a database?
And that question occupies a lot of my thoughts, like, what
is it?
But it's a.
It's a.
It's a fantastic tool for building event-based systems that work
in real time.
Confluent.
Sorry.
I didn't want to interrupt.
Go ahead.
Sorry, I said it again. Sorry, with cloud, were you talking about public cloud or any cloud, or everything that is called cloud? Oh, yeah, so
That's going to get me grinding my axe on the block chain
Let's not go down that road
So you come and you can choose whether you're running it on AWS or GCP or Azure, you know
We've got different underlying service providers, but we're just one place that will get Kafka running for you
No, no, no, I'd like to get it running on one of those microprocessors they've built in Minecraft, but I think we'll save that for Q4
I see, okay
Um, correct me if I'm wrong, but I thought Confluent would also do an enterprise version of the software
Um, you can do like licensed on premises stuff
Um, that's definitely something we do
Yeah, so if you want to host it yourself, but get proper support, we can do that too
That sounds very much like Mongo, Couchbase,
Redis
It's another open core company. Yeah, yeah, I think it is. It's like it's not a business model that needs radical innovation
You need to just do it well, right? You need to say here's a great product. We're working on it. It's open source
And now here's a great place for someone else to worry about running it for you
Okay, Chris you mentioned one important fact
Apparently Kafka can be many things. Yeah, how do you differentiate yourself from the likes of, say, RabbitMQ,
Redis
What else comes to mind and maybe even
Databases. Yeah, yeah
Well, um, so let me start with
Databases like what's your definition of a database? That's a big question because different people have different answers to that
Databases, yeah, specifically I'm referring to NoSQL databases and Martin will correct me
Of course in a minute because normally I'm getting it quite wrong
But databases, especially NoSQL databases, are the stuff that hipster, um, folk, in the abstract,
Who hang out in coffee shops all day long and program in Rust and Python and other cool languages and awarded
But the next big thing in terms of they need some fancy persistence layer that is close enough to the data
Models that the applications use
And they don't have and they don't want to use a non-semitical file system. Let's put it this way
Oh, now I've got to decide whether I want to alienate the hipsters or
The more traditional programmers
I'm gonna try and bridge Martin does it all the time, so don't worry about it. Yeah, okay. We don't make friends here
So it only alienated exactly
I'm a bridge builder. What can I say? I welcome all hipsters and traditional programmers
Um, so it's
um
It's
If you start with the idea of you know how you know how most databases these days when they do replication
They have a replication log, right?
So
Postgres is about to update a table. It writes what it's going to do to a log
And then updates the table
Sorry, Martin. It's great. It's great database
Yeah, I like, yeah, Martin is our resident Postgres expert
I'm feeling tested. This is something like a job interview where two people are interviewing
Martin, what did he what did give it away? I wonder
I don't even know what job I'm interviewing for
Sorry, Chris. No, no, I'm done. I hope going right that the stakes have changed. Okay. I need to bring my egg in
Right, so so you've got this idea that
You know, you could see Postgres in a different way and you could say its first job is to write to this append-only
log of stuff it's going to do and then its second job is to do it
And then when you want to have a backup database you just ship this append only log as fast as you can
And because it's append only it doesn't actually matter if you get behind because you can always catch up nothing's changed
Right, so you ship that off to another machine and then that machine replays that log of events
And it's able to reconstruct a relational database just from that log of what it was going to do, right
So if you take that big idea and say why don't we start with that if if we've got this append only log
Which actually works great for high availability and replication
And you can if you want to you can build up a whole relational database on it
Why don't we start there make that append only log really good and see what happens
And that's conceptually a way to see Kafka right it's a system for recording facts of what happens
And then replaying over that list of facts in an interesting way
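As a minimal sketch of the idea described here, assuming a toy in-memory log rather than anything resembling Postgres or Kafka internals: every change is first appended to a log and then applied, and a second copy can be rebuilt purely by replaying that log. The names `AppendOnlyLog`, `apply_change` and the record format are hypothetical and chosen only for illustration.

```python
class AppendOnlyLog:
    """A toy append-only log: records are only ever added at the end."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the record just written

    def read_from(self, offset):
        return self._records[offset:]


def apply_change(table, record):
    """Apply one logged change to a dict that stands in for a table."""
    op, key, value = record
    if op == "set":
        table[key] = value
    elif op == "delete":
        table.pop(key, None)


log = AppendOnlyLog()
primary = {}
changes = [
    ("set", "alice", 10),
    ("set", "bob", 5),
    ("delete", "alice", None),
    ("set", "bob", 7),
]
for record in changes:
    log.append(record)             # first job: write down what is about to happen
    apply_change(primary, record)  # second job: do it

# A replica that is behind only needs the log to catch up, in order.
replica = {}
replica_offset = 0
for record in log.read_from(replica_offset):
    apply_change(replica, record)
    replica_offset += 1

print(replica == primary)
```

Because nothing already written ever changes, the replica can be arbitrarily far behind and still arrive at the same state once it has replayed every record.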
Which is very that's very abstract. Let me try and pin that back down. So
You've got a company and people are buying stuff and every time someone buys something you just record an event on this append only log
They bought X, they bought Y, somebody else bought Z, right, and you just record that
And then someone comes along and says what are our sales this month? Well, they
Build their state machine which maybe looks a bit like a mini relational database and just runs over that log of events and comes up with the answer what we sold right
Meanwhile someone else wants to know activity by continent right so somebody else runs their different state machine
over the exact same
Uh list of events and they end up with uh, we got 30,000
clicks from Europe this quarter
Right
So you're rolling different state machines, that might be as complex as a relational database, might be as simple as a
rolling sum,
over this stream of recorded facts
And that's the fundamental idea of event driven architectures record what happened process it later
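A small sketch may make the "different state machines over the same events" idea concrete. It assumes a plain Python list as the event log and invented event fields (`type`, `amount`, `continent`); none of this is Kafka API, it only illustrates two independent readers folding over one shared list of facts.

```python
from collections import Counter

# The shared, recorded facts (hypothetical fields for illustration).
events = [
    {"type": "purchase", "item": "x", "amount": 20, "continent": "Europe"},
    {"type": "purchase", "item": "y", "amount": 35, "continent": "Asia"},
    {"type": "click", "continent": "Europe"},
    {"type": "purchase", "item": "z", "amount": 5, "continent": "Europe"},
]


def sales_total(state, event):
    """State machine 1: a rolling sum of purchase amounts."""
    if event["type"] == "purchase":
        return state + event["amount"]
    return state


def activity_by_continent(state, event):
    """State machine 2: count every event per continent."""
    state[event["continent"]] += 1
    return state


def run(step, initial, log):
    """Fold a step function over the log, one event at a time."""
    state = initial
    for event in log:
        state = step(state, event)
    return state


total = run(sales_total, 0, events)
activity = run(activity_by_continent, Counter(), events)
print(total, dict(activity))
```

Neither reader needs to know the other exists; both are derived from the same record of what happened.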
And then you get into this really interesting territory where
You because it's con just like in the replicated database
It's constantly this state machine that's expecting new facts to come in you very easily pop out a
A summary that not only captures all those facts, but reacts to new facts in real time updating the running total of what we sold the running analytics
Of where people have been clicking around the world
Okay, yeah, no, that's a good description. That's clearly the definition of the event streaming
Um, what you call it paradigm or something like that
But um, so i mean it's a big thing, right?
That's um, that's i've tried to cram in what Kafka is and what event streaming is in in one answer
But they're very intimately related and that's what separates it from something like RabbitMQ where
You put a fact on the queue and by the time it gets to the front
It's processed and then maybe it's thrown away
Well, yes, I know we'll come
Follow me just go ahead
I was gonna just pick up on one other thing, um, you mentioned the use of the replication log, or write-ahead log as it's called in Postgres, but
For database replication
This can also happen synchronously, right? So it doesn't have to um, it can wait for the acknowledgement to
That the other side has processed this as well. So it's
I think you described an asynchronous approach to replication, but there are a lot of databases
that have synchronous replication
Yeah, and Kafka does, you can say
This thing isn't written until it's been written to three replicas, for instance
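The "not written until three replicas have it" rule can be sketched as a toy quorum check. This is an illustration only, with hypothetical `Replica` and `replicated_write` names, and it does not use any Kafka client library. In Kafka itself this behaviour is governed by settings such as the producer's `acks` and the topic's `min.insync.replicas`.

```python
class Replica:
    """A toy replica holding its own copy of the log."""

    def __init__(self, name, online=True):
        self.name = name
        self.online = online
        self.log = []

    def append(self, record):
        if not self.online:
            return False  # an offline replica cannot acknowledge
        self.log.append(record)
        return True


def replicated_write(replicas, record, required_acks):
    """Report success only if enough replicas stored the record.

    Simplification: a failed write is not rolled back on the replicas
    that did accept it; a real system has to handle that case.
    """
    acks = sum(1 for replica in replicas if replica.append(record))
    return acks >= required_acks


replicas = [Replica("a"), Replica("b"), Replica("c")]
ok_all_up = replicated_write(replicas, "sale-1", required_acks=3)

replicas[2].online = False
ok_one_down = replicated_write(replicas, "sale-2", required_acks=3)

print(ok_all_up, ok_one_down)
```

Lowering `required_acks` trades durability for availability, which is the same trade-off Martin raises between synchronous and asynchronous replication.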
Yep, so I think we probably
Do you want to go to the debate of the event streaming
Approach versus the let's say the the current state approach in a database
We could discuss that for a bit. Yeah, we have another two hours
Your editor must love you guys
He does he does
Post-production is our favorite department
Um, so okay, let's just, uh, I mean you mentioned RabbitMQ, you described the messaging
Part of RabbitMQ. I don't know if you're familiar with the latest developments, but
I'm a little bit out of date on Rabbit. Yeah
Okay, well, so Rabbit just introduced streaming data as well. Just like Redis has
A couple of years ago three years ago come on what it is now. So there's definitely a
Uh, a need for
Streaming data, append-only data, whatever you want to call it, right, in data stores
So as you can see from the likes of Redis and then Rabbit adding those as well
And I guess Kafka is really the front runner there by
Implementing that
Yeah, the event streaming process uh paradigm in the first place
I was really at least a market on that
Yeah, so so the argument that uh, you get with them
With customers is then why do I not just land my data in a database, run a query over it and, um
Instead of doing a um having to roll up through uh, you know, however many events I have to find what I need to know right
Um, so you can argue that there is every time you ask the same let's say okay
I want to ask the sales for my current state of the number of sales
I need to go back every time to calculate over my event stream. What is the current state of sales right which is means picking out all those
um transactions in my event log whereas you know
Uh, in a database table that would be a current state kind of a distuster
amount of sales um
That have happened so far. I think I think there are kind of two answers to that the first is um
First is you don't
I think that's more likely to happen in a relational database right say
Say you're recording purchase through this
You've got to, whether you're using event systems or relational databases
You are going to record every single purchase
And then what you get in often in a in a relational database you say okay
I want to know the sales totals
And that starts out being a really fast real-time query because you're a brand new business with not many sales
And by year five it's a full table scan
That's running over every possible sale and taking ages and you and I've been in companies like that and you work to make it
Faster you maybe just reproduce you put an index on time and you just have the last 24 hours of sales and then there's a batch
For the longer stuff
But you end up trying rolling over that large set of facts you had to record
Whereas in an event streaming database
It's kind of like that replication maintaining the state over the log. So
You run your state machine over each new sale and you end up with a running total
And when the next event the next sale comes in you don't re run over every single event that ever happened
You just add that total in
So it's uh, it's an order one operation for each new fact
Um, and people who worked on Kafka have put a lot of effort into making sure
That the sensible set of operations you can do with it is also the very fast set of things to update in real-time
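The difference between re-scanning every sale and maintaining a running total can be sketched in a few lines. This is a toy comparison with hypothetical names (`RunningTotal`, `full_scan_total`); it says nothing about how Kafka or any database implements aggregation internally.

```python
class RunningTotal:
    """Keeps the current sales total up to date, one event at a time."""

    def __init__(self):
        self.total = 0

    def on_sale(self, amount):
        # Constant work per new event, however long the history is.
        self.total += amount
        return self.total


def full_scan_total(all_sales):
    """The 'year five' query: walks every sale ever recorded."""
    return sum(all_sales)


sales = [20, 35, 5]
running = RunningTotal()
for amount in sales:
    running.on_sale(amount)

# A new sale arrives: the running total does one addition,
# while the scan has to revisit the whole history.
sales.append(40)
new_total = running.on_sale(40)
scanned_total = full_scan_total(sales)

print(new_total, scanned_total)
```

Both approaches give the same answer; the point is that the cost of the first does not grow with the number of past sales.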
Welcome to the complexity podcast
For this chance even more
Yeah
And at the moment this turned into a job interview I had to bring in big O notation
That's a reply
H.I. Would sort of the package don't worry
Uh, changing, we have that child
Unless you file them but yes
Changing tack just slightly, you mentioned scalability here and real time
I get the scalability bit because the whole thing is written in Java, but what I don't get is the real-time aspect, because it is written in Java
Yeah
So um
About mentally so
Well, then we've got to get into the discussion of we're talking about hard real-time or soft real-time, right?
Soft real-time means next year hard real-time it's now, right?
Yeah
I think a soft real-time is the real-time you can argue about whether it's real-time enough
Exactly
Whereas hard real-time is like
Here's a picture tell me if it contains
Stop sign on traffic light that's red and you've got 50 milliseconds to answer that question
And if you don't answer it in 50 milliseconds your answer is useless because I have a new picture by then
Oh, you're dead or something exactly yeah, yeah
So hard real-time is like if you cannot answer the question by now
Then your value is useless to me, go away
And so Kafka is definitely a soft real-time system
It's like as fast as possible and being faster than batch is very useful
But there's no drop dead date
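One way to pin down the distinction drawn here is to look at what happens to a late answer. The sketch below is an assumption-laden toy: the function names and the 50 millisecond figure (borrowed from the traffic light example) are illustrative only, and elapsed time is passed in as a number so the behaviour is deterministic.

```python
def hard_real_time(answer, elapsed_ms, deadline_ms=50):
    """Hard real-time: an answer that misses the deadline is discarded."""
    if elapsed_ms <= deadline_ms:
        return answer
    return None


def soft_real_time(answer, elapsed_ms, target_ms=50):
    """Soft real-time: a late answer is still used, just noted as late."""
    on_time = elapsed_ms <= target_ms
    return answer, on_time


print(hard_real_time("red light", 30))
print(hard_real_time("red light", 80))
print(soft_real_time("sales total", 80))
```

In the soft case there is a target to aim for but no point at which the result stops being worth having.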
The way I understand it because if you take a look at the architecture you do a lot of
persistence on disk and stuff
Hmm
But appending so
Random access right to a file you could argue is expensive
Appending to an existing pointer is one of the cheapest things the kernel can do, right?
That depends on the implementation in the underlying operating system I suppose
I'm gonna I'm gonna say that if your database is
primary operation is appending to an existing file that's already open
That's pretty cheap that's not gonna be a bottleneck
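For readers who want to see what "appending to a file that's already open" looks like in practice, here is a minimal sketch using only the Python standard library. The file name and record contents are made up, and this makes no claim about how Kafka lays out its log segments on disk.

```python
import os
import tempfile

# A throwaway location for the example log file.
path = os.path.join(tempfile.mkdtemp(), "events.log")

# Open once in append mode and keep writing at the end of the file.
with open(path, "ab") as log_file:
    for record in (b"bought x\n", b"bought y\n", b"bought z\n"):
        log_file.write(record)  # always lands after the previous record
    log_file.flush()

# Reading the file back returns the records in the order they were appended.
with open(path, "rb") as log_file:
    lines = log_file.read().splitlines()

print(lines)
```

No existing bytes are ever rewritten, which is the property being contrasted with in-place mutation of blocks.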
I mean most file systems would support journaling and that plays right into that game for example
Hmm, yeah, I mean
However, you slice it
Append is, you get into more problems with file systems when you are trying to do
competing mutation of blocks, I would say, and Kafka does not do that
Okay, and that again comes back to this how do we do data
replication in relational databases thing, because years ago I worked at a company where they had a system
That replicated a hot standby Oracle database, that worked on replicating individual blocks on disk
And I swear that hot standby system caused more downtime than
actual downtime in the system
You know the the database went down more often because of trying to keep a hot standby going than anything else
If replicating individual blocks on disk doesn't work
No, understood, but how do you see yourself against hard real-time systems like, say, Redis for example
Um, you know, you're gonna have to tell me more about Redis because I haven't really used Redis
How is it a hard real-time system?
It's a key value store that does all its processing in main memory, hence this hard real-time fact
Mm-hmm
Martin correct me if I'm wrong you joined after me, but you left before
I leave or I left or whatever it's it's um
Memory first, isn't it there are always yes
Um, yes, persistence is optional, yeah exactly, but the only persistence
That's right. No, I mean the idea could be redis is slogan right?
the idea is and
And and and
If you're listening the email russis
Sponsor at it. It's a little sort of you, but never mind
Hey, no, the idea behind you can't you can't solicit for sponsorship during my job interview
Well, we we're actually we can because we own the podcast. Oh, yeah
Sorry. Well, it's really hard. Yes indeed. Sorry about this. No jokes aside. No, the thing is basically
When Salvatore Sanfilippo designed it, about
20 years ago, maybe something like this, he had this
rule, he had this notion in mind that each and every data is actually kept in memory
Every piece of data rather and that all the processing would be done in memory
I mean hence this kind of real-time notion of Redis
Some people called the in-memory data grid and I think they have a point
Where if I take a look at Kafka the architecture is different
Yeah, it's it's more designed. I mean, so if you think about its origin
It's coming out of the early days of LinkedIn
When they're trying to deal with a tidal wave of incoming data that they want to process as fast as possible
That was before Microsoft acquired the company, or after?
I think it was before, don't quote me on that
But I think it was before, and some of the people who were working on it spun out into Confluent before the Microsoft acquisition
Uh, LinkedIn provided the first couple of seed funding rounds, right, or at least one of them anyway
Oh, you'll have to I'll transfer you to our business department who will happily answer financial questions
Exactly. We're going to do the next episode with him. Don't worry. So just
I think Confluent's financial people would love the tone of the show, they'd be well up for it
No worries. Okay, get back to the original question now. Sorry
Yeah, but so so they're dealing with this floodgate of incoming data and and now you've got this system
Which is capable of processing comfortably like two million events a second coming in to a cluster of nodes
Which is a very different I think
different use case of
This tsunami of data coming in, than maybe a key value store would be ideally suited to
Hmm
I mean Martin already mentioned this but Redis introduced the notion of streams about
Seven years ago
Maybe
So the idea is basically to have something I wouldn't say similar to Kafka, but something
remotely resembling
scalable
A scalable message bus, but in memory. So you have consumer groups
You have time sensor and and all the rest of it
Mm-hmm. So on the technical level it doesn't seem to be too far off
Of something called Kafka
Yeah, I can see that I think this that sort of brings
The other thing I was going to mention when we were talking about
Rolling up this long stream of facts, right the other thing is
One thing we can concern ourselves with is you persist this log of, I don't know, sales transactions for
five years and it still stays fast because a new purchase coming in is just a
big O notation order-one operation rolling on your sales total
But then and this is a big part of it then the auditors come in your accountants come in and they say we need to
Roll over your entire history of purchases for a legal taxation reason for instance
And then you've got then you're concerning yourself with okay, so now we do need to reprocess all that data
But we have all that data we captured all those facts
They've been persisted in one form of long-term storage or another
So we can reprocess that entire history for a different use case
Capturing things in a way that you can efficiently maintain the current state
And maintain how you got there because you never know how you're going to have to reprocess that data in the future
You know marketing comes in with a different historical requirement
You hire some new analytics people have some new ideas about how we could plug that data into Facebook to optimize our adverts things like that
All right, that's when you want to be able to reprocess the historical stream as easily as you are dealing with the current stream
Interesting
And of course once the auditors come in if you just run a
Redis instance without persistence, never mind streams or not
And that
Instance then crashes and persistence isn't there you're pretty much screwed
Yes, I get that angle yes
That's quite fun like you see you see this
Playing out in things like the idea of data mesh where people are
Taking this persistent log of events and using it for their own purposes
But they're making it read only available to other teams and they find new use cases for the same stream of events that you weren't even thinking of
If you make that data persistent and available
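A rough sketch of why a persisted log lends itself to this kind of reuse: each reader keeps its own position, so a team that joins later can replay the whole history without disturbing anyone who is already caught up. The `Reader` class and its `poll` method are hypothetical stand-ins, loosely inspired by the idea of consumer offsets, and are not a real client API.

```python
class Reader:
    """A toy read-only consumer that remembers how far it has read."""

    def __init__(self, name, offset=0):
        self.name = name
        self.offset = offset

    def poll(self, log):
        """Return everything not yet seen and advance the position."""
        new_records = log[self.offset:]
        self.offset = len(log)
        return new_records


log = ["sale:20", "sale:35", "sale:5"]

sales_team = Reader("sales")
first_batch = sales_team.poll(log)  # catches up on the three existing records

log.append("sale:40")
live_batch = sales_team.poll(log)   # sees only the newly appended record

# Auditors arrive later and start from the beginning of the same log.
auditors = Reader("auditors")
history = auditors.poll(log)

print(live_batch, history)
```

The log itself is never modified by the readers, which is what makes it safe to hand out read-only to other teams.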
Yeah, I wanted to pick up on one thing that we often hear about Kafka, which is the ZooKeeper scenario and there was a
pull request or a plan to remove this
How is that going?
It's going quite well actually
I was at
The Kafka summit in London a couple of weeks ago and there was a talk about how it's going
So let me see how well I can reconstruct that
They
Summarizing check the release notes for details they have
Have done the work to replace
ZooKeeper with something called KRaft, Raft in Kafka
Which interestingly uses Kafka as its persistent state store
So it's all native Kafka
Throw away ZooKeeper and have that kind of
load balancing leader election protocol baked in
It's working. It's in beta
I think I got the impression that the main thing stopping it being general like put this in production availability was tooling around it
You know, there are some there are some tooling issues that they want to iron out before they say you should go into production with this
But the core of it's there
Just one enquiry. What's wrong with ZooKeeper?
It takes about five minutes to install if you have the right operating system and only takes two days to configure correctly
And then the end of the day
So what's wrong with that?
I think one thing people are going to miss is the joy of running two completely separate and complex services just to get one running
I think
I think there's a whole market of people that like to compile their own kernels that will miss ZooKeeper
Indeed
But leaving those to one side, fewer moving parts is better, I think
I get the notion that, I reckon this has been one of the most, I wouldn't say criticisms in the past of Kafka, but certainly quite a few people that I've met
Weren't too happy with the fact that actually ZooKeeper is a prereq for this whole architecture
I think there's a lot of sympathy internally for we're all looking forward to the day when that's no longer a requirement
Um, it was definitely if you again if you go back to the millions of years ago in internet years in what 2013 2014
It probably was the right choice they made at the time for um getting a reliable
leader election system up and running um, I'm not sure
I'm not sure the smart move in those early days would have been to roll their own version of Raft when they had other things to get going
So I think ZooKeeper was a good choice
And it's a good choice now to retire it and make it built in
And like all software projects do we wish it had gone into production last year? Of course we do
Interesting perspective, so people, you heard it here first: the next version of Kafka will be without ZooKeeper, maybe
Did you are you signing that in my blood?
No, I'm not
And no, this is not part of the evaluation process of the interviews. So don't worry. Oh, okay
Uh jokes aside, where do you see Kafka going though that we have tech of the major technical bitch?
Um, I think we're gonna see
I think there's um, there's a big push generally towards real-time systems
I think if you think of the number of companies that are still running on batch systems
There is a huge market for just solving those right for just going from
You get your reports overnight or at the end of the week
To you can see live data and everyone in yep everyone in the business can see the live data. They're interested in
You know, I would like to see
marketing people getting live
analytics dashboards that's bespoke to the company while the sales people get live sales figures by region while I as a developer get
live usage statistics for
Different parts different features of the system
I'm getting those live and not ad hoc in batch, but everybody can see it right now
I think that that market is just so huge
I can't imagine when we'll get to the end of that task
Interesting, but wouldn't that mean turning some of the aspects from soft to hard real-time in certain certain areas?
I don't think so. I think I mean
I think hard real-time's not quite something we're doing. We're I so I host our podcast
Uh, on which I will interview you sometime and grill you if you like
by all means
And we had someone
We had someone on who was dealing with like data coming in from shops around the country, right?
And he was apologizing half apologizing that they were getting data in seconds
Now to me that's real-time
To a hard real-time
Robotics engineer. It's probably not but to me
Seconds instead of end of the week is a complete revolution, right?
I get that notion, but if marketing screams at you
Because the conversion isn't going quite well because some middle tier
Is choking the website because because a million users are just hitting the website
These requests then basically are poured into a middleware system and that doesn't act fast enough in that case
Soft real-time pretty much quickly turns into hard real-time because okay
We're not talking about lives being on the line, but rather cash being lost in terms of revenue not being able to be recognized
Yeah, I think I mean so to go back to the example with there are systems out there doing like two million events a second
um
If you've got so many customers
That you're breaking those kinds of limits
You've got really interesting problems we'd like to help with
But that's a lot of capacity for dealing with real-time
When you're contrasting it with the actual reality on the ground at the moment which is people getting
hourly reports at best
And that's and degrading from there when things are busy
And by the way, yeah, I'm not making this up because
The previous marketing you pop and before Martin fired it once again
exactly had this problem
Of course, I'm joking
I mean, so, nice problem to have. Years ago I used to work at a startup I co-founded and
A colleague of mine was always worrying about what happened if we get ten times as many customers
Which is a question I respected but wanted that problem right that'd be great
We're a startup no one knows we exist if we can get ten times as many customers. I will gladly stay up late dealing with that problem
How does the
The open source project itself
Operate let's put it that way. Obviously there's lots of different models in in open source projects
And I was curious if you could share a few words
So um, as I gather it's a fairly standard kind of Apache-ish structure
We have PMCs
Who, what, no, sorry, take that back. We don't have PMCs, Apache Kafka has PMCs who
Oversee the project, some of them work for Confluent, some of them work for Red Hat
Some of them work for other companies I can't name off the top of my head
But they're like any Apache project. They're kind of
Maybe sponsored
Independent people building this thing
Anyone can become a committer anyone can open a PR anyone can submit stuff
They are actively soliciting new committers
And if you're a committer for a certain number of months maybe pushing on to years and you want to be one of the PMCs
You can become so
There's, um, a feature request or design suggestion process called, uh, the
Kafka Improvement Proposal, or KIP
Where you can suggest something prototype it say we should do this as part of the project
Get assigned a unique KIP number
um and
On it goes, and it's been very successful. And then Confluent's role in that is
We're actively engaged in it um, sometimes we
Create features that are part of the cloud product that we then roll back it that we
Then try and get committed back into the open source project
um
What else can I tell you about it? Uh, there was a nice talk, which will be going on YouTube soonish, at the Kafka Summit recently
Talking about how you can become an open source contributor to Kafka
And it actually didn't seem too scary
honestly
Okay, so as the overall kind of
The project committee you know is is there such a thing um
You know, I mean, I don't know if you're familiar with the Postgres model, but they have um
Like a committee of five, or, I think it's more now, but there's never, you know, one company or um
Controlling the majority and all that kind of stuff, so
Is this something that the Kafka model it mainly it has a bit of a
Is it not just confluent
Oh, no, it's absolutely not just confluent um it's
It's individuals who have been working on Kafka for years in most cases
some of them are
Paid by their company to free up their time so they can contribute, some are independent, um, I mean like
Working on it entirely financially
Um, but yeah, basically it's an open source project with lots of people putting their time and effort in
And then
Kafka, sorry, Confluent, having a lot of the original builders of Kafka
Is actively supporting that where it can
It's not unlike other open source projects being backed by a company
Well, Redis of course comes to mind, but other companies work exactly the same way
Yeah, and it's kind of mistaken
It's kind of tricky because you I mean I as a developer want the open source project to remain independent
Um, and I think Kafka has has that sure Kafka has that independence and it guards it fiercely and which I totally respect
Um at the same time you do want these big open source projects to
Survive on more than goodwill and developers working in their spare time, so every time a company like
mine or Red Hat
Contributes puts cash on the line to make sure someone has the time to do it. I think that's a good thing
As long as those conflicts of interest are
policed
Right, you mentioned the Apache Software Foundation. You're licensed under the Apache license
If I'm not completely mistaken. Um, so I believe so, and uh, I'm not a lawyer, um, there are
So there's Apache Kafka core, check this with someone who is a lawyer
But Apache Kafka is an open source project, I believe under the Apache license, um
Then
Confluent has built some features that you can
Also use that complement
Core open source Kafka as well, and they're under a Confluent Community License
Uh, and then there's some proprietary stuff that's just in the cloud product
And there's a lot of kind of
Reintegration along that stream from confluent trying to get useful features back into the open source project
For depending on which part of it you want to use there's probably an open source license for you
You left the incubator status sorry for those people who do not know
Apache projects are classified in certain
categories stages stages. Thank you very much Martin
An incubator project is basically something that is just budding
But I reckon Kafka left this
About what
Six and years ago um, I um
Trying to google uh, I'm literally googling this now because I didn't have those dates in my head
It entered incubation in
2011 and graduated October 2012, apparently. Wow, okay. That's 10 years, almost
What's that in dog years long time long time exactly and now the status is
Oh
There's another status. Martin, you are our resident Apache expert, no?
I
Thought it was graduated for some
No, I think I know I'm not the original
You're the licensed guy
There are a lot more legal questions in this job interview than I expected
Martin, did you send out the wrong job description once again? Wow
I've got to do legal. I've got to say I was a marketing
I think the description probably also as an opening
No, jokes aside, it doesn't really matter that much
But I thought it would be interesting to see, from the open source perspective, where the foundation wants to take this
In terms of where the committee that is backed by the Apache foundation,
Correct me if I'm wrong, is headed
um
I wouldn't like to speak for that, except I know that um getting rid of ZooKeeper is a big priority
um, I know that's a headline change that's coming in
Uh
One thing: a colleague of mine, Danica Fine, she has a regular um podcast announcing what changes have been made
So if you search for Confluent Danica Fine, I'm sure you'll find it. She does a great job of bringing um release notes to life
Because those can be pretty dry
Okay, so if you want to keep up with the release notes, catch her
We will do
And with that, I think we're almost done Martin. No
Did I get the job?
Of course
But but before but before I tried
I think I might offer
But before I try to make an offer, there's of course the boxes to discuss
But and so let's explain the concept first exactly why don't you why don't you explain the concept
Uh, are we just doing boxes are also anti-boxes uh, we're just well actually we can do anti-boxes too
If we have time. Yeah, so in in short
Uh every every recording we do a
Pick of the the week or the choice of something that stood out of interest to you. Oh, it's a book
that a vision article
Anything else that you
Want to find memorable and it could also be an anti-boxes and you know
Microsoft have done so they were all again for examples to find out
Um
You know, I'm gonna go for a wild card because it's honestly burning in my heart now
Have you heard of a game called Monster Hunter?
Not myself, but why don't you explain it to us?
It is it's completely ridiculous. It was one of these things that was a Japanese phenomenon
For years and finally broke into the West, and if you've heard of Capcom, you probably have
Um, the people who make like Mega Man, Street Fighter and all that stuff
It became their best selling game of all time a couple of years back
And it works like this you live in a land of dinosaurs
You pick up a weapon and you beat the dinosaur to death and then you use the dinosaur parts to make a slightly larger weapon
And go and beat a slightly larger dinosaur to death. A family-friendly game
Yeah, and and the there's a plot and it's completely paper thin
It's just an excuse to hurt dinosaurs
But it is honestly one of the most fun games I've ever played and I'm a bit obsessed with it at the moment
Okay, interesting what's your box?
I don't have one this week
Do you have an anti-apox at least?
Uh
No, no, not nothing nothing unusual. I mean I think I think I think I'm not even too good
I haven't
Okay, well, actually, no, I have had to use weak people most recently as well
So there you go
Okay, definitely sounds good. Okay, cool. You want to go into a little bit of detail or should we skip that?
No, let's not bore the listeners with
It's history anyway, right? So it's moving on
My pox of the week would be a TV series. I think season one is out now
And, sorry, season two is out now, and season three is just going to appear on Netflix
I may be wrong about the season numbering. It's called Love, Death and Robots
Yeah, yeah, yeah, what seasons one and two. Yeah, yes. Okay, so I got the numbering right. Yes
And Chris why don't you explain why this is a great show?
Um, so I may be biased
Okay, so it's um, it's kind of sci-fi anthology series animated um
And it's just a series of self-contained stories about the future. Maybe some cyberpunk stuff robots
They're usually very funny. Sometimes they're very weird
They're always interesting and they are about 15 minutes long
So I tend to consume that I just catch consume them on train journeys and plane journeys
That's my binge snack watching
I couldn't have put it better just to add to this
I've just discovered that recently and the writing alone of these episodes because
The devil is of course in the details, but the but the crafting of the of the episodes is just it's just awesome
Let's put it this way. Whenever I watch that episode. Yeah, something that you would enjoy
Exactly. I've yet to see an episode that is not
Well written. Let's put it this way. It reminds me but I've I've watched
I'm thinking all of the episodes of the of the two seasons so far
And all of the episode most of the episodes reminds me remind me of a well crafted
As an excellent short story
Hmm. Yeah. Yeah, that's
Because, yeah, sorry, go ahead. And one thing I'd say is, if you're listening and interested in
Picking it up
Don't watch them in order, you don't need to. Just look at the descriptions, find one that takes your fancy, and that's the way in
Yes, they're all yeah, they're not interrelated. There's no storyline
It's just short stories in an animated fashion
Mm-hmm
But before we close off the show, Martin, there's a little bit of feedback that we should go through. Oh, yes
Go through
Should I read this and then you comment? Yeah, you go
Okay, yes, it's by cyber grew and it's on the Unix philosophy
Your understanding of the Unix philosophy is missing what many consider its most important tenet
If Wikipedia follows by the describing or outlining the unit the Unix philosophy details. Maybe tissue and maybe in the show notes
As summarized by Salus, and I reckon that's in the Wikipedia article: Unix is a collection of programs
That each do one thing only and do it well. systemd is a grab bag of lots of functionality
And does not do any of them particularly well, hence why people say that systemd is not the Unix philosophy
I agree that the old-style init system had a lot of issues and needs to be replaced. However,
I do not agree that systemd is the solution
I would have preferred a properly designed, layered and modular init system instead of the all-in-one solution of systemd
i.e. a bare-metal server used to run containers would have the same root-level
module but different application-specific modules than a GUI-based tablet. systemd was designed for GUI-based systems
(I do not necessarily concur) and is overkill and inappropriate for backend servers running Docker anyways
Apart from that, another good show, and stop selling yourself short. Mom, take note
I think you are up to a double digit number of listeners by now. I think we've surpassed that
Martin, would you have a comment on this?
Uh, it's yeah, I mean
systemd has its place. It does things reasonably well. Other things that
Most people don't use it for we probably want to avoid
um and
Fine, there is a Linux philosophy, but yeah, my philosophy is always: as long as it works,
I'm happy
If it works don't break it exactly
No, I mean it's I think I just my two cents very very shortly on this
I think I really I explained that in the episode already you
Is that a group?
Some of the argumentation is valid, yes, but at the end of the day
It's still that little kind of Swiss army knife in terms of independent code bases, modules that basically make up a system
And if you take a look, and this is of course totally biased, what I'm supposed to say now
The amount of innovation that systemd has brought to the table in terms of Linux over the last 10 years, maybe less,
Has yet to be matched by
A low-level system component in the system called Linux, but that's just my personal opinion
Needless to say, there are some caveats. For example, GNOME is relying more and more on systemd
And this is something that I do not like about this architecture, but that's just my personal two cents. Any thoughts on systemd, Chris?
No, that's way too controversial for me. I'm gonna go to something. I'm gonna talk about a safe topic like blockchain
It's cool
So we must invite you again, Chris. It has been wonderful to have you on the show. This has been a great pleasure
And thank you very much for your time, and do expect an invite on blockchain topics, or for another job interview,
Anytime soon. Oh, yeah, let's do it. Let's we can chew to the world on that
Excellent. Thank you. This is Linux Inlaws. You come for the knowledge
But stay for the madness
Thank you for listening. This podcast is licensed under the latest version of the Creative Commons license,
type Attribution-ShareAlike
Credits for the intro music go to Bluesy Roosters for the song Salut Margot,
To Twin Flames for their piece called The Flow, used for the segment intros,
And finally to The Lesser Ground for the song Sweet Justice, used by the dark side
You find these and other ditties licensed under CC on Jamendo, a website dedicated to liberate the music industry
From choking copyright legislation and other crap concepts
Oh
No, which episode did you listen to?
And on the sticks in mind is russ there was nothing to the russed episode
That boils it down to about 50% of the episode
You have been listening to Hacker Public Radio at HackerPublicRadio.org. Today's show was contributed by an HPR
listener like yourself. If you ever thought of recording a
Podcast, then click on our contribute link to find out how easy it really is
Hosting for HPR has been kindly provided by AnHonestHost.com,
The Internet Archive and rsync.net
Unless otherwise stated, today's show is released under a Creative Commons
Attribution 4.0 International license