Episode: 4063
Title: HPR4063: Re: ChatGPT Output is not compatible with CC-BY-SA
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr4063/hpr4063.mp3
Transcribed: 2025-10-25 19:09:54
---
This is Hacker Public Radio, episode 4,063 for Wednesday, the 28th of February, 2024. Today's show is entitled Re: ChatGPT Output is not compatible with the Creative Commons Attribution-ShareAlike 4.0 International License. It is hosted by DNT and is about 27 minutes long. It carries a clean flag. The summary is: a response to Hacker Public Radio 3983, in which Ken argued that shows using ChatGPT output can't be posted to Hacker Public Radio.
Hello and welcome to another exciting episode of Hacker Public Radio. I am your host today. My name is DNT. And this is a response show to the one that Ken did a while back. It's been a while now. I did a little write-up about this after spending some time doing some research and thinking about it, but then I ended up never recording the show. But anyway, I picked it up today and here it is. So I'm just going to read this stuff that I wrote. I think maybe I was going to email the list, or maybe I was always thinking about reading this for a show. So I'm just going to read it, because I don't want to take too much time on that. So this is about the show where Ken said that content written by ChatGPT, or in other words ChatGPT output, couldn't be in a Hacker Public Radio show because it's incompatible, because the ChatGPT license or terms are incompatible with the Creative Commons Attribution-ShareAlike license that we use. So I want to offer a different opinion on this here. I mean, I shouldn't even call it an opinion. You know, I'm not a lawyer, of course, I am just interested in things about copyright and free culture in general. So I spent some time thinking about this, and I think there are some interesting things about it that hackers may be interested in hearing about.
Anyway, on whether ChatGPT output can be licensed under a Creative Commons license, I think it's true that it can't, but that doesn't mean that ChatGPT output can't be in a Hacker Public Radio show, I would argue. It just means that that portion of the show would not be eligible for any kind of protection under the Creative Commons license, because the license would be unenforceable there. So what that means is, if I make a show and I license it under the Creative Commons Attribution-ShareAlike license, and I paste in my ChatGPT output, someone else could copy the output, not the rest of the show, just the output, and they would not have to give me attribution. They would be able to ignore those terms of the Creative Commons license, because the ChatGPT output is not eligible for that kind of protection. And the reason I think that's the case is, it's not that the terms of ChatGPT are incompatible, it's that ChatGPT output itself is incompatible with any kind of copyright protection, and the Creative Commons license relies on the fact that there is copyright protection. Through the Creative Commons license, you're precisely giving up some of the protections that by default you get from copyright law. So I have an article that discusses some of this that I'm going to put in the show notes. It's from December 2022. It discusses decisions that have been made by the courts in the US about ChatGPT and other generative AI. And then here are a couple of things. So first of all, one of the concepts about copyright in the US is that what can be eligible for copyright protection are expressions of ideas, but not ideas themselves. So if you express an idea, someone else can basically absorb that idea, learn it from your expression of it, and re-express the same idea. As long as they did not copy your expression of the idea, or as long as theirs is sufficiently different, you would not be able to claim infringement on their part, right?
Against your previous expression of the idea. So if ideas are used but not copied, then the use would not mean copyright infringement, right? Because you don't own the idea, you own your expression of the idea. So another important concept about copyright is that, to be copyrightable, a work must be the result of original and creative authorship by a human author. So that's the first thing. ChatGPT output is not the result of original and creative authorship by a human author. So right there, it can't be eligible for copyright protection. There is a famous thing about this, it was like a drawing that was made by a monkey, or something like that, and then an artist who had staged this situation for the monkey to draw on a piece of paper wanted to claim ownership over the piece. But, yeah, I think that didn't fly. I'm maybe a little bit wrong about the details of this, but there have been legal decisions about this in cases where animals had produced a piece, like a physical work, that someone then wanted to claim ownership of and get copyright protection for. But anyway, first going back to the ideas thing. So ideas are not copyrightable, only expressions of ideas are copyrightable. And, you know, instead of copyrightable, I should really say eligible for copyright protection, right? So if ChatGPT is going around the internet gathering ideas, not copying them, it is not infringing on copyright. It is learning stuff about the probabilities of what words come after what other words, right? And with that information, it's producing some new output, right? So, I mean, yeah, there's some strange stuff about what an idea is. Is ChatGPT really working with an idea here, right? What is actually going on? That, I think, hasn't been settled, but there are some things that there have already been some decisions about.
So in general, it's thought that only human expression can get copyright protection. ChatGPT output is not seen as human expression. That's definite, right? So it cannot get copyright protection. Thus, ChatGPT actually shouldn't even go so far as to say that they assign ownership of the output to you, as they do in their terms, because they don't have ownership to assign in the first place, right? Instead, it should say in their terms that no one can own the output, because it is not eligible for copyright protection. So essentially, if it's public, it's in the public domain, right? For any ChatGPT output, as far as some court decisions have gone, I think those would be the terms that would actually reflect the understanding of the courts in the US, right? However, within any CC BY-SA, Creative Commons Attribution-ShareAlike licensed piece, or any other piece, you can use content that is not eligible for copyright protection, because no one can challenge you for using it, right? That's the essential reason why. It's not eligible for copyright protection, so no one can tell you that you can't do it. So by the same token, if someone reused your ChatGPT output that they copied from your show, you wouldn't be able to challenge them either on the basis of your Creative Commons license, because it's not eligible for protection. However, other portions of your show would be eligible for protection, such as your inputs, I mean, your prompts, right, that you wrote for ChatGPT to generate this output. Those prompts would be eligible for protection, I would think, along with whatever else you put in your show that's not the ChatGPT output. So I think just because portions of a show aren't eligible for protection on the basis of a Creative Commons Attribution-ShareAlike license, it doesn't mean the show can't be published under that license, right?
So of course, if you posted a show that was entirely ChatGPT output and you tried to license it under a Creative Commons license, that license would just be unenforceable, right? Sure, you could do it, but based on recent court decisions in the US, the license that you're trying to put on it would not be enforceable. And then furthermore, as for the ChatGPT terms of use, I think the terms of use apply to using ChatGPT, not to using the output of ChatGPT, right, after you have used ChatGPT. So since the output is not eligible for copyright protection, no one has any say over how it can or can't be used. So on whether ChatGPT output can be licensed under the Creative Commons license, again, as Ken discussed in HPR 3983, I would voice the opinion that the terms of use written by OpenAI don't place any restrictions on the nature of ChatGPT output use, only on the nature of ChatGPT use itself. The output comes after you have used ChatGPT, right? So the distinction is that a lot of times people use ChatGPT to obtain information instead of to obtain content, and actually that's the primary purpose of ChatGPT. I don't think their purpose in creating it is to generate text that you're going to then reuse somewhere else, right? The purpose is supposed to be to get information. And of course, as we know, sometimes you get wrong information. So the terms, I think, are more concerned with the information than with the content. And here I'm talking about how you use ChatGPT to learn how to craft a bomb. That's what I'm talking about here. The terms of use talk about why you are using ChatGPT. Are you using it to learn how to commit crimes or whatever? That's what the terms of use are talking about. Not about how you are going to use the output of ChatGPT. No, it's why are you, how are you using ChatGPT?
So let's read this little bit here from the ChatGPT terms, which Ken provided in the show notes for 3983. "Subject to your compliance with these terms, OpenAI hereby assigns to you all its right, title and interest in and to the output. This means you can use the content for any purpose, including commercial purposes such as sale or publication, if you comply with these terms." So unfortunately, it doesn't explicitly mention what rights you can choose not to reserve in the output. And I haven't been able to find any legal opinions on this, but I don't think I have ever seen any cases where you're forbidden from relinquishing rights over intellectual property. So anyway, the way I see it, this section that I just read supports the idea that the terms are about using ChatGPT and not about using the output of ChatGPT use. So as long as you observe the terms while using ChatGPT, the output is yours, is what the terms of use say. So subsequent use or licensing of the content is not subject to these terms, or it shouldn't be assumed to be. And then the terms that appear in the usage policies document, which I'll link in the show notes, in my view support this idea even more, because they're quite clearly about ChatGPT usage as opposed to usage of ChatGPT outputs. So let's look at section three of the ChatGPT terms, titled Content. Here's what it says, paragraph B, Similarity of Content. "Due to the nature of machine learning, output may not be unique across users and the services may generate the same or similar output for OpenAI or a third party. For example, you may provide input to a model such as 'What color is the sky?' and receive output such as 'The sky is blue.' Other users may also ask similar questions and receive the same response. Responses that are requested by and generated for other users are not considered your content."
So this is very similar to the way trademarks that happen to be a word with a common meaning, such as Apple, are described as "weak trademarks" and are entitled to much less protection. So it is questionable whether OpenAI even could restrict anyone's use of ChatGPT content. In fact, I think there have been court rulings in the US, as I said earlier, declaring that generative AI content is not eligible for protection at all. So perhaps a bigger question is whether even the rights reserved under a Creative Commons license can actually be reserved by anyone on ChatGPT content, as I said earlier, meaning that any direct use of ChatGPT content would have to be put in the public domain. If anything, re-licensing ChatGPT content could come to be seen as copyright infringement upon those who hold copyright over the training data. But I think the fault here is with OpenAI, in assigning to you the rights over it, rights that OpenAI could not have over the content themselves. If anything, what I'm trying to say here is, I think it's pretty clear that OpenAI does not have the authority to assign rights over ChatGPT output to anyone. Maybe we, as a society, or our courts, will decide that they actually infringed upon other people's copyright when they used copyrighted material in their training data. I think there is a very strong case there. It may very well happen. Even recently, Sam Altman admitted that it would be impossible to train ChatGPT without infringing on copyright, and we hope some very strong opinion will come out of that eventually. In my view, it should be: no, we are not going to support you breaking our rules so that your company can be very successful. That's just not how democracy works. So anyway, now we're coming back to Hacker Public Radio. Please don't take it to mean that I am a champion of generative AI. My greatest hope for AI is that it will kill the web a little more.
Basically, companies have been treating people as generative AI, getting them to write this really crappy copy and copying stuff around back and forth, so that now you search for anything online and all you find is all these ads and stuff. So I think there's some good potential for all these companies to start using generative AI, and for things to get even worse, and to maybe finally lead to the collapse of this web that we have today. And then we'll get to rebuild an internet that we maybe like better, like the ones that many of us here on Hacker Public Radio mostly live in. So right now I think that's a good thing that could come from it, at least from this kind of mass-appeal form that ChatGPT represents. But I am a technical writer at work, so I do see that there is some potential for some applications in the world of tech companies that could be really interesting, and I am definitely not closing my eyes to it. But as far as Hacker Public Radio goes, I don't think we should have any considerable amount of ChatGPT-generated content, of course not. I think it would be wise to have a generative AI policy anyway. Maybe I would suggest that it could be: we are open to generative AI content as long as the subject of the show is in some way related to generative AI itself, right? We're not going to be interested in a show that uses generative AI output as just a means to talk about something else, right? We would prefer for you to just speak in your own words, basically, is what I'm saying. But anyway, it's pretty obvious that we don't have a problem with AI content creeping into Hacker Public Radio, it doesn't seem so, so I wouldn't worry about it for now anyway. So anyway, that's my view on all that.
I invite you to think about that and see what you think, see if you agree with me, or if you have another opinion I would love to hear about it, so please record your response and let's keep the conversation going. This is like an important topic of our time, I think, so it seems like it's a good idea to discuss it. I'll say a couple more things here. Here's what else I've got written down. Let's see. Yeah, so in a way I think generative AI hacks copyright a little bit, by sort of making it easier to convert copyrightable content into non-copyrightable ideas, right? Of course it doesn't do that directly. As we know, it just figures out the probabilities of words coming after other words, but when we look at it we make ideas out of it, right, by accident, sort of. So that's kind of why I think it would be a mistake for free culture advocates to oppose the use of generative AI. So let me read this other bit here. As with fair use, and this is kind of the basis of the fair use doctrine, the eligibility for copyright protection hinges on how substantially you added your own authorship to the portion of something that is not eligible for copyright protection. So then here's a quote from this article from the Federal Register, a document, I mean, about copyright registration guidance. This is from the office of the US government where you register your copyright on stuff. You know, you can go and register the copyright to kind of strengthen your claim if there is an infringement later. So they wrote: the Ninth Circuit has held that a book containing words "authored by non-human spiritual beings" can only qualify for copyright protection if there is human selection and arrangement of the revelations.
I'm not sure what context this was about here, but the article is specifically about copyright registration guidance for works containing material generated by artificial intelligence. So, in the fair use doctrine, the degree of copyright protection to which such a work would be entitled is based on the degree of human selection and arrangement of the non-copyrightable source material, is what that says. So something that is not eligible for copyright protection, such as ChatGPT output, can only qualify for copyright protection if there is human selection and arrangement, right? And how strongly it can get protection is based on the degree of human selection and arrangement of the non-copyrightable source material. So what that means is, you know, there's always a fair degree of judgment that's made by the courts whenever they judge copyright infringement, to determine whether something would fall under fair use or not. The fair use doctrine in the US allows for a lot of nuance, you know. So what this means for ChatGPT is that the janitors would have to make this kind of judgment case by case on whether or not to reject a show. And then they may very well decide they don't want to do that, and that's fair enough, and decide that we'd rather just not accept any shows that have any ChatGPT output at all. That would be completely fine, of course.
So okay, I'll shut up now. Sorry that this was a little bit messy in the end. I kind of read my notes and kept adding as I went, which just ended up making some things a little bit redundant. I just hope at least some of you understood what I'm trying to say, and again, let me know what you think. These are just some things that I have come to think about all this stuff, from the perspective of someone who is just interested in how copyright works in our society and someone who is very much a free culture advocate. So, yeah, don't take too much time, just get a microphone and record your response show, and then post it to Hacker Public Radio. Don't edit it much either. All right, so thank you for listening, and come back tomorrow for another excellent episode of Hacker Public Radio. Bye.
You have been listening to Hacker Public Radio at Hacker Public Radio dot org. Today's show was contributed by an HPR listener like yourself. If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is. Hosting for HPR has been kindly provided by AnHonestHost.com, the Internet Archive and rsync.net. Unless otherwise stated, today's show is released under a Creative Commons Attribution 4.0 International license.