Episode: 3045
Title: HPR3045: OSS compliance with privacy by default and design
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr3045/hpr3045.mp3
Transcribed: 2025-10-24 15:38:42
---
This is Hacker Public Radio episode 3,045 for Friday 3 April 2020.
Today's show is entitled, OSS Compliance with Privacy by Default and Design,
and as part of the series Social Media, it is hosted by Ahuka,
and is about 16 minutes long, and carries a clean flag. The summary is as
follows: How can open source software manage the mandates of regulations like the GDPR?
This episode of HPR is brought to you by archive.org.
Support universal access to all knowledge by heading over to archive.org forward slash donate.
[Music]
Hello, this is Ahuka, welcoming you to Hacker Public Radio and another exciting episode.
And this is going to continue our march through the ActivityPub Conference of 2019.
This time I'm going to take a look at a talk called OSS Compliance with Privacy by Default and Design.
This was a talk presented by Cristina DeLisle, and Cristina is the Data Protection Officer at a company called XWiki SAS.
Now this is the company set up to manage the open source project XWiki, which you may have heard of.
There is a link in the show notes if you want to find out a little more about all of that.
But basically we're talking about someone who is responsible for protecting data in the European community.
And that's one of the things that has become very important with the passage of the General Data Protection Regulation, usually abbreviated as GDPR.
And the GDPR was passed by the European Union in 2016, and then it began to be enforced in 2018.
I guess they had built in a little bit of a time lag to allow people to read the regulations and move into compliance.
Not a bad thing to do. One of the effects of that was, as she pointed out, GDPR was searched for on Google in 2018 more often than Beyonce and Kim Kardashian.
Well, that's something. Now, what does all of that mean? Well, it meant that companies were forced to review their privacy strategies, because the burden of proof for accountability was shifted to the companies.
They could no longer assume everything was okay as long as no one complained. Now they had an affirmative burden to demonstrate that they were in compliance.
And the default was shifted in many cases from opt-out to opt-in, and the companies would need to have the policies and auditing to show that they had complied.
The data collection and the life cycle of data was affected. You are now limited in the purposes for which you can collect data, and you cannot take that data and use it for other purposes.
And you have to be completely transparent about what data you collect, for what allowable purpose you are collecting it, how long it is retained, and if it is shared outside of the European community.
Breaches that exposed data must be reported within 72 hours, and any company or organization that regularly collects such data must have a data protection officer, which is of course the position that our speaker holds in her company.
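The opt-in default described above can be illustrated with a small sketch. This is a hypothetical example, not code from the talk or from XWiki: the class and purpose names are invented. The idea is simply that every optional processing purpose starts disabled (privacy by default), consent requires an explicit affirmative action, and consent can be freely revoked.

```python
from dataclasses import dataclass, field

@dataclass
class PrivacySettings:
    """Hypothetical privacy-by-default settings for a user account.

    All optional processing purposes default to False (opt-in, not opt-out).
    """
    purposes: dict = field(default_factory=lambda: {
        "analytics": False,
        "marketing_email": False,
        "third_party_sharing": False,
    })

    def opt_in(self, purpose: str) -> None:
        """Record an explicit, affirmative grant of consent."""
        if purpose not in self.purposes:
            raise KeyError(f"unknown purpose: {purpose}")
        self.purposes[purpose] = True

    def revoke(self, purpose: str) -> None:
        """Consent must be freely revocable at any time."""
        if purpose not in self.purposes:
            raise KeyError(f"unknown purpose: {purpose}")
        self.purposes[purpose] = False

    def allowed(self, purpose: str) -> bool:
        """Processing is allowed only after an explicit opt-in."""
        return self.purposes.get(purpose, False)

settings = PrivacySettings()
assert not settings.allowed("analytics")   # off by default
settings.opt_in("analytics")
assert settings.allowed("analytics")       # explicit opt-in
settings.revoke("analytics")
assert not settings.allowed("analytics")   # freely revocable
```

Nothing here is specific to the GDPR text itself; it just makes concrete what "opt-in by default" and "revocable consent" mean as a design constraint rather than a checkbox buried in a settings page.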
Now, although the GDPR directly applies only to the European Union, it has been used as a model in other countries as well. I can tell you that here in the United States there are some people talking about that as a model that should be adopted, probably not immediately by the entire US government, but perhaps on a state-by-state basis, because that's often how things work in this country.
And of course other countries outside of the EU are certainly looking at similar kinds of regulations, and I think it is fair to say you would be wise to treat this as something that is likely to spread.
Now, all of the GDPR requirements have implications for sysadmins, including in the Fediverse, and that's why this was an important talk to have at the ActivityPub Conference.
You know, we may think that somehow just because what we're doing is open source, and we're the good guys, we don't have to worry about all this other stuff. Well, not so.
So I think it's worth seeing how people are preparing for all of this.
Now, so far the biggest fines under the GDPR have been for things like telemarketing, spam, data breaches, and surveillance.
Now, the GDPR creates two roles, the data controller and the data processor.
Now, the data controller is responsible for determining the purpose for which the data is collected, how it will be collected, and how it will be processed.
In other words, this is the policy side.
The data processor is in essence a contractor that actually does the processing on behalf of the data controller, and this is specified in the data processor agreement that both parties enter into.
Now, a single organization can perform both roles depending on how the data is handled, but it is the data controller that bears most responsibility for GDPR compliance.
Because the controller is the one setting the policy there, so they're determining what data they're collecting and what's going to be done with it.
Now, with this background, we need to consider how the open source community, and particularly the Fediverse, can operate and still be compliant.
Now, as individual members of the community, the open source community, each of us is a data subject, and therefore each of us has enforceable rights under the GDPR.
At the same time, organizations in the open source community can be data controllers or data processors.
Now, Christina used the example of GitHub, which is the controller of the personal data from your free private user account, and may also be processing invoices for your account.
In the Fediverse, such as Mastodon, a good example, the users are all data subjects, but what about the instances?
Are the admins of a Mastodon instance data controllers? Are they data processors?
And she brought up the example of Google in the milestone case that I'm sure you've all heard about, the right to be forgotten.
Google essentially argued in court that they were only a processor, but the court said, no, you're a controller, and Google was ultimately ordered to remove the links from their results.
Well, that's one thing with the giant corporation with thousands of employees. Google can certainly appoint a data protection officer and put in place procedures to make sure that they're compliant, but a lot of the Fediverse is not on that scale.
So what if an admin is asked to install a Mastodon instance and does so without necessarily knowing how it works internally? Is that person a processor?
And if they decide to do any data analytics, does that make them a controller?
Well, this then leads to a discussion of legitimate reasons for collecting data under the GDPR. The first one is compliance with the law.
If there's a specific legal obligation, such as for invoicing, that's a reason to collect certain data.
If there's a legitimate interest, for example, a bank will collect certain data as part of deciding whether or not to make a loan, but this does need to be carefully assessed.
Another legitimate reason is if you have consent from the data subject, but this is important: this requires that you have specific and informed consent.
The data subject must take an affirmative action to grant this right, and it must be freely revocable.
This can get pretty interesting in the details. If you have uploaded a photo to your profile, you have consented to it being used, and you're the one who uploaded it, but you have to be able to later delete that photo if you wish.
If you contribute code to an open source project, they can refuse to remove that code because they have a legitimate interest.
Removing the code might be very damaging to the whole project. Now, this is manifest in the Developer Certificate of Origin introduced by the Linux Foundation in 2004, and I quote:
I understand and agree that this project and the contribution are public, and that a record of the contribution, including all personal information I submit with it, including my sign-off, is maintained indefinitely, and may be redistributed consistent with this project or the open source licenses involved.
Okay, that's paragraph (d) of the Developer Certificate of Origin. I think it makes that pretty clear.
Now, on the other hand, a comment on a Fediverse post could be removed under the right to be forgotten, or perhaps anonymized with any identification removed.
Now, for open source software, there are certain advantages, ways open source can deal with the GDPR. Number one, transparency.
Okay, the code is freely available to be inspected and is developed out in the open.
So there's no way, or at least there should be no way, if this really is open, for any funny business to be going on with your data in how the program operates.
And also, open source software, generally, and we want to be careful about this, is privacy oriented. The code is developed by people, for people, not to advance a corporate objective.
Now, that's the ideal case. I think we have to take note of the fact that a lot of corporations are now starting to incorporate open source software,
and they may not share any of our ideas about privacy and being people-centered. I think in a lot of cases, open source is just a way of getting their hands on some software that they don't have to spend much money to develop.
So, you know, that's why I frequently distinguish between open source software and free software. Free software, I maintain, is something that respects the four freedoms of the Free Software Foundation.
And open source software does not necessarily do that. And so it is not quite as good as free software, in my view.
OK, but back to open source software and the GDPR. Challenges. OK, we just talked about the corporate environment.
The GDPR was written to basically address companies. When they wrote the GDPR, they weren't thinking about your little side project, you know, where you've got 10 people working outside of their regular job and their off hours to build some piece of often awesome software.
The GDPR was written on the assumption that they were dealing with Facebook and Google and Twitter and people like that. So, you know, if you're looking at big companies, they will have data protection officers, they will have the procedures in place.
They can hire lots and lots of people. Now, whether privacy will be designed in from the beginning is another interesting question. One of the things we see, and I think this is sometimes as much an issue with open source software as with any corporate software, is that things like privacy and security wait until
the end, if they are addressed at all. They're not the main purpose here. And then when vulnerabilities are discovered, is fixing them going to be a high priority? That's an open question here.
Now, we know that code being open means it can be audited. There is that saying from Linus Torvalds that, you know, with many eyeballs, all bugs are shallow, etc. But a lot of projects never get the many eyeballs.
You know, what about auditing and reviews of the code? We know that many projects run based on what each individual feels like doing, you know, scratching an itch, but compliance with the GDPR may not be anyone's particular itch to scratch.
So Cristina then wraps up by basically suggesting a kind of privacy pledge to be added to the ActivityPub protocol to cover at least some of these problems.
So anyway, I thought this was an interesting talk, and I hope you got something out of this little report about it. This is Ahuka signing off and reminding you, as always, to support free software. Bye-bye.
Thank you very much.