Episode: 773
Title: HPR0773: Gabriel Weinberg of DuckDuckGo
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr0773/hpr0773.mp3
Transcribed: 2025-10-08 02:11:47

---

Hello ladies and gentlemen, my name is Ken Fallon, and today I'm very pleased to bring you an interview with the founder of DuckDuckGo, Mr. Gabriel Weinberg. Hi Gabriel, did I pronounce that correctly?

Gabriel, but thank you for having me. Hi.

So you're the founder of DuckDuckGo. Just in case any of our listeners don't know, can you tell us a little bit about the site?

Sure. DuckDuckGo is a general-purpose search engine, so it's designed to replace your general Google use. It does a number of things differently, and better, but generally it works the same as a regular search engine. We've tried to focus a lot more on instant answers above the results, and we also focus on getting rid of a lot of spam and irrelevant results. The third thing we've focused on is privacy. We try to say that we have real privacy, which means that we really don't know who you are when you search, and there's no way for us to track you.

So aside from IP addresses, I guess, and cookies?

We don't store IP addresses, actually, and we don't set cookies by default. We also don't store user agents, which have been known to be able to track your sessions even if you don't store those other things.

Yeah, and they're doing stuff now with fonts on your PC to be able to identify your browser.

Yeah, the EFF put out a tool, which is what you're referring to, where you can do a unique fingerprinting of your browser, which is pretty interesting. But yeah, we obviously don't store any of that stuff either.

Okay. Just to give the listeners a little bit of background, tell me how you ended up here. Did you just wake up one day and decide, okay, I want to go up against the most powerful company in the world and do a better search engine? How did this come about?

It came about more out of a sort
of dissatisfaction with Google, and then a recognition. I was messing around with Wikipedia and Delicious and often finding better external links in those sources, so I was actually going there to search for things instead of Google. The thought occurred to me: okay, what if you take this to the logical conclusion? Google's getting more and more spam in it, and there are more and more of these external sources, crowdsourced APIs, that have good results. So what if you mashed those all together, would it create a decent search engine? So I built a prototype on a weekend, and I messed around with it and liked it, and it just grew organically from there.

Okay, fantastic. I must say I've been using it for a while now, and it gives different results to Google. I guess people are now trained that the Google results are the correct search results. Can you tell me why I am getting different results in your search engine? If I type in "Hacker Public Radio", for instance, into Google, why do I get two different results?

Right. So the short answer is that, obviously, Google and DuckDuckGo are somewhat black boxes, and we don't know each other's algorithms, so they're naturally going to lead to somewhat different results. But there's more of a philosophical answer, I guess, which is that each search engine tries to concentrate on different things. What we've tried to concentrate on more is this notion of zero-click info, these more conceptual results. We also try to do things where we're different from Google: we'll really respect your query more and not try to change it around and give you results for things that are slightly different from what you searched for. And we're way more aggressive at removing the SEO type of sites, so you'll see a lot less of those. But ultimately it depends on the given query what's going on, so these are somewhat generalizations. One thing I'd say is that, yeah, we definitely give different results than Google. So if you're having trouble with doing deep searches, there's really a reason to go to other search engines and use multiple ones, because you will get different results.

I just saw on your web page this whole concept of a bubble. I wonder, could you explain that to the listeners?

Yeah. So there's this concept that was most recently nicely laid out in a book called The Filter Bubble, and the concept goes like this. When you search at Google now, and most other search engines, and even use other sites like Facebook, they're using what you've previously done on those sites, i.e. for Google, what you've clicked on and what you've searched for, to tailor your results on different searches. So when you search for something like climate change, it may be impacted by what you previously searched for and where you live and things like that, even though you may not think those things are related. What that means is, because you often search and click for things that you like, you end up seeing
more and more of the things that Google thinks you like, and that may leave out some opposing viewpoints and things that you're less likely to click on but that otherwise contain information that's valid. They call this a bubble because you're living in a bubble that Google is presenting to you, and you're missing things that are outside of that bubble, which over time may contain information opposing your core beliefs.

Okay. I actually did the example on your website, and although I don't search that much in Google, I did end up getting very different results from what was on the example page for other people. So they're obviously taking Google Reader into account as well. It's kind of scary stuff. I wonder, could you tell us a little bit about your own background, what your educational background is, how you got into doing these servers and putting this stuff together, what it runs on, that sort of thing?

Sure. I'll start with background, and you can hit me up for the server questions.

Yeah.

So I grew up in Atlanta. I spent a couple of years in the Philippines before that. Then I went to college in Boston, Massachusetts, at MIT. I got a degree in physics and then a graduate degree in technology and policy. Basically right out of school I started doing startup stuff. I started an educational software company that was about increasing parental involvement in schools, and that ultimately didn't really go very far. Then I started another company that was about finding old friends and classmates, pre-Facebook. That was one that did well, and I sold that in 2006, which enabled me to take on this bigger problem, DuckDuckGo. About a year after that I started messing around with DuckDuckGo, and I've been doing it for about three years now.

I'm sure you have a massive team of people working behind the scenes there. Can you tell me how many people are employed
full-time on DuckDuckGo?

You would be surprised, or maybe you already know, but it's just me full-time still at the moment. I don't want to sell other people's contributions short, because there are many people who have contributed significantly, but I'm still the only full-time person.

Okay. And how in the name of all can you attempt to best the likes and resources of Google or Yahoo or Bing or anybody else for that matter?

This is the beauty of the external API age. I did initially start out doing my own crawling and building everything from the ground up, but quickly realized, of course, that Microsoft and Google basically spend hundreds of millions of dollars a year on that component alone, so obviously I'm not doing that.

Right.

What happened was that Yahoo BOSS came out, which was them exposing their search feed, and I decided I could use that and concentrate on the value-adds I identified, which are adding better results on top of theirs and getting rid of spam. Over time it also turned into changing their results a lot, which is why you see our results are pretty different from Bing and Yahoo even though we use their feeds. Then we also started using a bunch of other external APIs, like Wolfram Alpha, and there are about 40 sources. So what you get is an amalgamation of our code plus everyone else's code, which is what I call a hybrid search engine. What that really enables you to do is, say I want to get good music results, right? Then I'm going to use a music API from a company that just concentrates on music. It's as if we had employees who were doing that, but we can use it just by calling their API. So if you think about it, we have tons and tons of people working for us, but they're not working directly for DuckDuckGo.

You don't have to pay.

Exactly. I mean, it's similar to the argument for using
open source, right?

Yeah, exactly. Just when I typed in "Hacker Public Radio" there, for instance, I get some pages obviously that are from Wikipedia, because it says Wikipedia, and I see the Facebook link there. But "results brought to you by Bing, built on Yahoo" is also there. What does this mean? Does that refer to a single search, or does it refer to all of them?

So when you use the Bing or Yahoo APIs, they require attribution, so that's how I've decided to be good to them and display it, which I think is fair.

Yeah.

But what it's not doing is using their feeds exactly, which is why it doesn't look the same. So really all that means is that on some calls we use their services for some things, but it's giving them proper attribution.

Yeah, only right and proper, I guess. I must say, once people start using DuckDuckGo... and okay, I'll ask the question: why DuckDuckGo.com?

I wish I had a decent answer for you. It popped into my head one day and I really liked it, and my wife liked it, so I went with it. I'm generally super bad at names, and it seemed like a good name, so I just went with it. You're not from the U.S., so: there is a childhood game in the U.S.
called Duck Duck Goose.

Yeah.

Which is probably where it was derived from in my head, but it was...

They're going to be coming after you for copyright infringement.

Right, it's nothing related to that, absolutely. But you probably have a different name for the game or something. It's where children chase each other around in a circle after they tap each other on the head.

And we call it tig.

What do you call it?

Tig. Tig, yeah, or tag. But I must say, coming from Ireland, you should never let the truth stand in the way of a good story, so I'll have the listeners here send in better stories of where the name came from. Now I'm completely lost, sorry.

No, that's fine.

What I wanted to say was, once you get over the "it's not Google" thing, I was very comfortable putting it as my home page, especially because when you go into the search preferences you can specify that it's HTTPS, that your local country is the Netherlands, that your language is going to be this, that you want the links on the side, that you do or don't want the ads. All that sort of stuff comes up, and you can have it as a URL parameter on the bottom of your page. So you seem to be really serious about this whole no-tracking privacy thing. Why do you feel so strongly about that?

I got into it not really thinking about that stuff at all, and it wasn't a core thing when I started. What really triggered it was two things. One, there were some comments on Reddit that were like, "Why are you storing this stuff?", and I had never really thought about it. It's really the default when you turn on your web server that it prints out IP addresses. And then Google came out with a report saying how many requests they had gotten from governments across the world and law enforcement, and that they have to deal with these in court and whatnot. I looked at it and I thought, (a) it's actually pretty creepy that I would know what people are searching for, you know.

Yeah.
And (b), I don't want to deal with any law enforcement requests whatsoever. So the first one is more of a privacy, user-protection, creepiness thing. The second is more of a personal preference: I just don't want to deal with it, you know.

Yeah.

So that's where it came from. And once it became that, then you have the mode of thinking about it, and once I started thinking about it, over time, incrementally, some from user feedback, some on my own, I realized there's a whole bunch of other things you can do and other leaks that were going on that we could close, and so I've done that over time.

Yeah. I must say, what I like about it is it looks like Google, you know, 10 years ago when I started off.

Yeah.

It was just: you type in your question, you get the results, that's it, thank you very much, ma'am. And just to let the listeners know, if you've run a search on DuckDuckGo and it doesn't work out, you can just go back up to the page, put in an exclamation mark, "g" and a space, and then it'll take that whole string and send it over to Google for you. That's a stroke of genius. Can you tell us more about that functionality?

Yeah. I called it bang syntax, after the Unix bang. There are different commands for different search engines, but the basic idea is it'll send your query anywhere you want it to. You can even do that on the home page: you can bypass us altogether and send it right to Google or Amazon or wherever you want to go. There are a thousand of the commands at this point, so a lot of sites are covered. !w is Wikipedia, for instance. The genesis was really that I built this feature for myself and didn't have any intention of exposing it. Originally, I'm constantly searching CPAN, because it's written mainly in Perl, and CPAN is where all the Perl modules are stored.

Yeah.

And going to search.cpan.org is a pain, and then you've got to search, and CPAN's search site is already so slow.

Yes.

So skipping the first page just saved me a lot of time, and I just built this in. Over time I realized every time I showed it to someone they were like, "What are you doing?", and then they thought it was pretty interesting. So next I exposed it, and over time people who understand the syntax seem to really like it.

Yeah, I think it's fantastic. I now have DuckDuckGo on all my own machines as the home page, and it's right there, it's up in two seconds. There's no JavaScript going on in the background, there are no notifications. Really, since Google Plus has come on, it's turned the Google home page into an application, and I just can't be dealing with that first thing in the morning. I just said no. To go back to the question about the servers and stuff, can you give us a background of your massive data center that's out there in Philly?

So I started using an ISP local here and running our own servers, because that's what I had been used to. When that, somewhat recently, really reached capacity, I switched to Amazon EC2, so that's where all the front-end stuff is right now. Unfortunately, in doing that I had to move from FreeBSD to Ubuntu, to Linux, but it's worked out well so far. There are things I like and don't like about EC2, but it generally is a great alternative.

Okay, so it's stuff that we all know and love. Why is it a problem going from BSD to Linux?

There's no real problem with it. It's just that I had been using BSD for a long time and was less familiar with Linux. Besides, I had all these scripts and things that were pretty BSD-specific that I had to port over, and that was just a one-time pain.

Yeah, I got you. So, you mentioned Perl, and it runs on Ubuntu, previously BSD,
so what else is there?

We're using nginx mainly for the web server. We use a bunch of different data stores: Solr, Postgres, flat files, CDB, which is a weird read-only database format. We use memcached for caching. Then there's a bunch of other side components that some other people have written. There's some Python stuff, and we have a Jabber client that answers things over IM, and that's written in Node.

Okay.

But yeah, the bulk of it is in Perl and JavaScript. There's some front-end stuff that helps do all these external API calls, and most of that's in JavaScript.

Looking at your Wikipedia page, which is cool, to have a Wikipedia page, it says that you started off the site as a community site with a view to open sourcing. Is that something that's going to happen, or is it practical?

Yeah, so I am very much focused on doing more of that. We have a GitHub account with a bunch of repositories now. People write all the time asking if they can help out, and I'm trying to make it so that any time someone helps, that'll be open source, if they want. So more and more of it is becoming open source. Some of the core stuff I'm not prepared yet to open source, for various business reasons, but I definitely would like to open source as much as possible.

Okay. Speaking of business, where are you going to make your money from on this?

So as you noted before, there are a few ads on it, but you can turn them off. That turns out to be enough for the moment to break even, and hopefully over time it'll become more profitable.

Yeah. Surprisingly, or unsurprisingly, I was very impressed that the option was there to turn them off, and I did for a second, to see if, yes, in fact it turns them off. And then I turned them back on, because, well, hey, why not give you a few shekels if it
works. You also do affiliate programs via Amazon and the like?

Yep, that's right. We're somewhat limited in that approach. We're still exploring, but basically I'm wary of going through third parties, which can do some tracking, you know.

Yeah.

Although the good thing about Amazon and eBay is that they run their own affiliate programs, so you don't have to do that.

Okay, very good. The one thing that would prevent me from moving from Google, of course, would be the logo changing from time to time. Until you get that fixed, I'm afraid I'm going to have to stay with Google.

Okay. We're actually working on setting a custom logo...

But I actually see here... I'm being a bit facetious here.

Yeah, I know, I know.

We see that you do have...

You're being facetious, but you'd be surprised that I actually get requests, usually, I'd say, from the UK more often than not, that the duck is too unprofessional: "I can't use your search engine, it's not professional enough." I am serious, this comes up a lot, actually, and it seems a bit crazy to me. But because we have all these cool logos that people made, I wanted to make the option in any case for you to be able to set one of these alternative logos. But at that point you might as well just let people set their own logo.

Yeah. Well, just to let people know, if you do look in, the duck logo does change from time to time for different celebrations. It is kind of cool. You should name the other one DuckDuckPro. We'd better go register that before this airs. Was there anything else? I'm just looking down through the list of questions that I had. Was there anything else that I missed or that you'd like to bring up?

No, but if you'd like, I don't know how technical your audience is, but I'm happy to answer any other weird technical questions you have.

Our audience varies from the novice right up to the geekiest of the geeks, so if there's anything
you want to tell us, feel free to do so.

Sorry, go on.

No, I don't have anything. I would just say, after listening to this whole podcast, you should give it a try.

Yes, you should definitely do that.

The way to do that is to set it as your default for a week, because you have to give it a little bit of time.

Yeah, and there is that phase where it isn't Google, and once you get over that it actually turns out to be quite nice. Especially the red box gives you a lot of interesting information, and because you can see the sources, that it comes from Wikipedia or it comes from archive.org, I find myself trusting it more, you know. The stuff in the red box I know is going to be an actual result. Okay, just one last thing here. I see on your website that you give 10% of your income to free and open source projects.

Yes.

What prompted you to do that?

Well, like we talked about before, I rely on these external APIs, and usually those are businesses, and in some sense that's a win-win, or we will even pay. But in the case of open source, DuckDuckGo is essentially built on open source software, and we're not paying for that, obviously. So I wanted to encapsulate that by giving something back. I'd used it also at my previous company, and we didn't really do that, but I thought this would be a good way. And honestly, I hope to inspire other people to do the same. That hasn't happened too much yet, but either way I enjoy doing it.

Okay. So how does Yahoo make money on your search results, if they are put into your page and they have no tracking or any other information?

They charge us per call.

So by searching on DuckDuckGo, the money is eventually going to go to Yahoo, to Microsoft?

Yes.

At least you know where it's going. Okay, how much is a request, or can I not ask that question?

Oh, you know, it actually varies by type of call, and I don't have
it off the top of my head, but if you just search "Yahoo BOSS",

Yeah.

their pricing is public and everything.

Okay.

Just to be clear, if you want to mess around with the Bing API, there is a free Bing API, but you're limited: you can't commercialize it.

Yeah, I understand. No, I must say I'm very, very happy with the results that I'm getting back. I like the exclamation mark, or the bang command, it's pretty cool. And I look forward to the surge of interest coming from the HPR community once this show is aired.

Well, thank you very much.

Okay. Shall we call it a day there?

Sounds good.

Okay. I'd just like to thank you very, very much for coming on and recording this. I can't tell you when it's going to be up, it'll probably be next week, but I'll send you a link to it when it gets posted.

Okay, thank you again for having me.

No problem. Bye.

Thank you for listening to Hacker Public Radio. For more information on the show and how to contribute your own shows, visit HackerPublicRadio.org.
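
Note on the bang syntax discussed in the interview: the idea of typing an exclamation mark, a short command, a space, and then the query can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not DuckDuckGo's implementation (which the interview says is mainly Perl). The `BANGS` table, the function name `resolve_bang`, and the URL templates are all hypothetical choices made for the example.

```python
from urllib.parse import quote_plus

# Hypothetical bang table. The command names and URL templates are
# illustrative assumptions, not DuckDuckGo's actual configuration.
BANGS = {
    "g": "https://www.google.com/search?q={}",
    "w": "https://en.wikipedia.org/wiki/Special:Search?search={}",
    "cpan": "https://metacpan.org/search?q={}",
}


def resolve_bang(query):
    """Return a redirect URL if the query starts with a known bang, else None."""
    text = query.strip()
    if not text.startswith("!"):
        return None  # ordinary query, no redirect
    # Split "!g some words" into the command ("g") and the rest of the query.
    command, _, rest = text[1:].partition(" ")
    template = BANGS.get(command.lower())
    rest = rest.strip()
    if template is None or not rest:
        return None  # unknown command or nothing to search for
    # URL-encode the remaining query and drop it into the target's template.
    return template.format(quote_plus(rest))


if __name__ == "__main__":
    print(resolve_bang("!g hacker public radio"))
```

A query without a leading bang, or with a command missing from the table, returns `None` here, so a caller could fall back to a normal search in that case.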